Extend ScanBenchmark report with per-phase durations #83

Merged
RasputinKaiser merged 1 commit into main from extend-scanbenchmark-per-phase-durations
Jun 22, 2026

@RasputinKaiser

Owner

Summary

  • Add enumerateDuration, verifyDuration, and persistDuration to ScanBenchmarkReport, mirroring the P1 signpost phases (enumerate / verify / persist).
  • StorageScan now tracks enumerateDuration (startedAt → duplicate-verification start); duplicateVerificationDuration is kept for backward compatibility and surfaced as verifyDuration on the report.
  • ScanBenchmarkRunner accepts an optional DuplicateHashCache, calls persist() after the scan, and records the elapsed time as persistDuration. With the default (no cache), persistDuration stays at 0, so the existing benchmark path is unchanged.
  • Add a totalDuration computed property that returns the sum of the three phases.
  • Two new tests cover field surfaces, alias semantics, the cache-hit re-scan timing, and on-disk cache file placement.
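
For readers skimming the summary, the report shape described above can be sketched roughly as follows. This is an illustration only: `ScanBenchmarkReportSketch` is a hypothetical stand-in, and the real ScanBenchmarkReport carries additional fields and may declare these differently. Only the three duration names and `totalDuration` come from this PR.

```swift
import Foundation

// Hypothetical stand-in for ScanBenchmarkReport; the real type has more fields.
struct ScanBenchmarkReportSketch {
    // Time spent walking the tree, up to the start of duplicate verification.
    var enumerateDuration: TimeInterval
    // Time spent on duplicate hashing (mirrors duplicateVerificationDuration).
    var verifyDuration: TimeInterval
    // Time spent in persist(); stays 0 when no cache is supplied.
    var persistDuration: TimeInterval = 0

    // Sum of the three phases, for quick comparison with the aggregate duration.
    var totalDuration: TimeInterval {
        enumerateDuration + verifyDuration + persistDuration
    }
}

// Example values are made up for illustration.
let report = ScanBenchmarkReportSketch(enumerateDuration: 1.5, verifyDuration: 0.5)
print("Phase total: \(report.totalDuration)s")
```

Defaulting `persistDuration` to 0 in the sketch reflects the stated behavior that a run without a cache leaves the persist phase out of the total.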

P4 of the v0.5.0 Performance Upgrade.

Test plan

  • swift build
  • swift test (56 tests, including the 2 new benchmark tests)
  • ./script/public_upload_audit.sh
  • Sanity-run StorageScopeBenchmark --synthetic and confirm the new Enumerate/Verify/Persist/Phase total lines print.

🤖 Generated with Code

Add enumerateDuration, verifyDuration, and persistDuration fields to
ScanBenchmarkReport so the benchmark executable and tests surface where
the time goes during a scan. Mirrors the P1 signpost phases:
enumerate (recursive walk + concurrentPerform), verify (duplicate
hashing), and persist (hashCache.persist + on-disk writes).

- StorageScan gains enumerateDuration (time from startedAt to the
  duplicate verification start). duplicateVerificationDuration is
  retained for backward compat and mirrored as verifyDuration on the
  report.
- ScanBenchmarkRunner can now hold a DuplicateHashCache and times the
  persist() call against it. Default behavior (no cache) keeps
  persistDuration at 0 so the existing benchmark path is unchanged.
- totalDuration is a computed property summing the three phases so
  quick comparisons against the existing aggregate duration are easy.
- Two new tests assert the field surfaces, the alias semantics, the
  cache-hit re-scan timing, and that the on-disk cache file lands at
  the expected cache URL.
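
One way the optional persist timing could look is sketched below. The protocol `PersistableCacheSketch`, the class `CountingCacheSketch`, and the function `measurePersistDuration(of:)` are hypothetical names invented for this example. The actual DuplicateHashCache API and the runner's timing code are not shown in this PR description, and the choice of `ProcessInfo.processInfo.systemUptime` as the clock is an assumption.

```swift
import Foundation

// Hypothetical stand-in for the part of DuplicateHashCache used here.
protocol PersistableCacheSketch {
    func persist() throws
}

// Returns the seconds spent in persist(), or 0 when no cache is supplied,
// matching the "no cache keeps persistDuration at 0" behavior described above.
func measurePersistDuration(of cache: PersistableCacheSketch?) throws -> TimeInterval {
    guard let cache = cache else { return 0 }
    // systemUptime is monotonic, so the difference cannot go negative
    // if the wall clock is adjusted mid-run.
    let start = ProcessInfo.processInfo.systemUptime
    try cache.persist()
    return ProcessInfo.processInfo.systemUptime - start
}

// Minimal in-memory cache used only to exercise the sketch.
final class CountingCacheSketch: PersistableCacheSketch {
    private(set) var persistCount = 0
    func persist() throws { persistCount += 1 }
}

let cache = CountingCacheSketch()
let persistDuration = try measurePersistDuration(of: cache)
print("Persist: \(persistDuration)s after \(cache.persistCount) call(s)")
```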

Co-Authored-By: NCode <noreply@noumena.com>
@RasputinKaiser RasputinKaiser merged commit c7a21b1 into main Jun 22, 2026
6 checks passed
@RasputinKaiser RasputinKaiser deleted the extend-scanbenchmark-per-phase-durations branch June 22, 2026 01:37
