feat: benchmark infrastructure (per-structure tasks, fast mode, JSON persistence, regression compare) #14
Merged
bluuewhale merged 9 commits into main on Mar 21, 2026
Add design spec for benchmark improvements: independent per-structure Gradle tasks via JavaExec, fast/precise mode split, JSON result storage, and regression comparison script. Update CLAUDE.md to require before/after benchmark comparison after every optimization change.
- Combine duplicate `-jvmArgs` into a single string to prevent flag overwrite
- Remove `SimdEq` from the `jmhSwissMap` regex to eliminate overlap with `jmhSimd`
- Unify `SimdEqBenchmark` to canonical `@Fork(2)`/`@Warmup`/`@Measurement` values
- Set warmup time (`-w 3s`) and measurement time (`-r 5s`) in fast-mode tasks
- Ignore `docs/superpowers/` in `.gitignore`
Summary
- Remove the `*GetHitTest` benchmark classes; retain `MapBenchmark`, `SetBenchmark`, `ConcurrentSwissMapGetTest`, and `SimdEqBenchmark`
- Unify `@Fork`/`@Warmup`/`@Measurement` annotations to canonical precise-mode values (`@Fork(2)`, `@Warmup(5×1s)`, `@Measurement(5×2s)`); fast-mode tasks override these via CLI flags
- Add `jmhFast`, `jmhSwissMap`, `jmhConcurrent`, and `jmhSimd` `JavaExec` tasks (fast mode: fork=1, warmup=2, iter=3) that auto-write timestamped JSON to `benchmark-results/`
- Add a `jmhCompare` task invoking `scripts/jmh_compare.py` to detect regressions between two JSON result files (configurable threshold, exit 1 on regression)
- Track the `benchmark-results/` directory via `.gitkeep`; gitignore `*.json` result files

Test Plan
- `./gradlew test apacheTest googleTest`: all green
- `./gradlew jmhSwissMap` runs twice and writes timestamped JSON files
- `./gradlew jmhCompare` produces tabular output and exits non-zero on regression
- `-Pjmh.include=MapBenchmark` override narrows benchmark scope correctly
- `./gradlew tasks --group verification` lists all new tasks
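
For readers unfamiliar with the regression check named in the Summary, here is a minimal sketch of how a comparison between two JMH JSON result files could work. It is not the contents of `scripts/jmh_compare.py`. The function names (`load_scores`, `percent_worse`, `compare`, `main`) and the 5% default threshold are assumptions made for illustration. The only thing it relies on from JMH is the standard JSON result layout (`benchmark`, `mode`, `params`, `primaryMetric.score`, `primaryMetric.scoreUnit`).

```python
import json
from pathlib import Path


def load_scores(path):
    """Map (benchmark, sorted params) -> (score, unit, mode) for one JMH JSON file."""
    scores = {}
    for entry in json.loads(Path(path).read_text()):
        # "params" is absent for benchmarks without @Param fields.
        params = tuple(sorted(entry.get("params", {}).items()))
        metric = entry["primaryMetric"]
        scores[(entry["benchmark"], params)] = (
            metric["score"], metric["scoreUnit"], entry["mode"])
    return scores


def percent_worse(mode, before, after):
    """Percent by which `after` is worse than `before` (positive means regression)."""
    if before == 0:
        return 0.0
    change = (after - before) / before * 100.0
    # Throughput ("thrpt") is higher-is-better; time-based modes are lower-is-better.
    return -change if mode == "thrpt" else change


def compare(before_path, after_path, threshold_pct=5.0):
    """Print a table for benchmarks present in both files; return regressed keys."""
    before, after = load_scores(before_path), load_scores(after_path)
    regressions = []
    for key in sorted(before.keys() & after.keys()):
        b_score, unit, mode = before[key]
        a_score = after[key][0]
        worse = percent_worse(mode, b_score, a_score)
        regressed = worse > threshold_pct
        label = key[0] + "".join(f" {k}={v}" for k, v in key[1])
        print(f"{label:<60} {b_score:>12.3f} {a_score:>12.3f} {unit:<8} "
              f"{worse:>+8.2f}% {'REGRESSION' if regressed else ''}")
        if regressed:
            regressions.append(key)
    return regressions


def main(argv):
    """argv: [before.json, after.json, optional threshold percent]. Returns exit code."""
    threshold = float(argv[2]) if len(argv) > 2 else 5.0  # assumed default
    return 1 if compare(argv[0], argv[1], threshold) else 0
```

A wrapper script would call `sys.exit(main(sys.argv[1:]))` so that a regression yields a non-zero exit code, matching the behaviour listed in the test plan. Benchmarks that appear in only one of the two files are skipped here; a real script might report them separately.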