#292 Add sequential stopping rule for benchmark iterations by jathavaan · Pull Request #308 · kartAI/doppa

jathavaan · 2026-05-22T16:09:01Z

Summary

Sequential stopping rule for high-frequency single-machine queries (PIP, kNN, bbox): bootstrapped 95% CI on the mean elapsed time, floors at 10 iterations and 60s, ceiling at the existing BenchmarkIteration value, hard timeout at 1 hr. Stops when the relative CI half-width drops below 5%. Deterministic bootstrap RNG seeded from (run_id, query_id) so reruns reproduce.
National-scale spatial joins (Databricks, DuckDB, PostGIS) opt out via use_sequential_stopping=False: fixed 5 iterations, no wall-clock timeout, plus warmup_iterations=1 to keep the warmup cost from dominating long per-iteration runtimes.
New StopReason enum (precision, timeout, ceiling, fixed, failed) and metadata columns persisted on every run: achieved_iterations, stop_reason, ci_half_width_seconds, ci_half_width_relative, mean_elapsed_seconds, median_elapsed_seconds.
Closes Create lower and upper bound for queries #292.

Test plan

Run a fast benchmark (PIP small) end-to-end on ACI; verify stop_reason="precision" and achieved_iterations < 2500 in benchmark_metadata.parquet.
Run a slow benchmark (national-scale-spatial-join-duckdb-small); verify stop_reason="fixed", achieved_iterations=5, and only 1 warmup line in the logs.
Verify Databricks national-scale records stop_reason="fixed" and the new columns populate alongside the existing Spark-phase metrics.
Confirm BENCHMARK_MAX_TIMED_WINDOW_SECONDS does not fire on Databricks or single-machine national-scale runs even when total wall-clock exceeds 1 hr.
Inspect a TIMEOUT run (force-throttle a query) and confirm stop_reason="timeout" plus the cost-metric window still contains data.

Copilot

Pull request overview

Adds a warmup_iterations override to the @monitor benchmarking decorator and applies it to national-scale spatial join entrypoints so long-running benchmarks can run with a single warmup iteration (instead of the default warmup count), reducing wasted wall-clock time on expensive workloads.

Changes:

Extend monitor() with an optional warmup_iterations parameter and use it to drive the warmup loop.
Set warmup_iterations=1 for national-scale spatial join entrypoints (DuckDB, PostGIS, Databricks runner) that already opt out of sequential stopping.
Document the warmup override behavior in README.md and CLAUDE.md.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
src/presentation/entrypoints/national_scale_spatial_join_postgis.py	Set `warmup_iterations=1` for the PostGIS national-scale join benchmark.
src/presentation/entrypoints/national_scale_spatial_join_duckdb.py	Set `warmup_iterations=1` for the DuckDB national-scale join benchmark.
src/presentation/entrypoints/_databricks_benchmark_runner.py	Set `warmup_iterations=1` for the Databricks national-scale join runner.
src/application/common/monitor.py	Add `warmup_iterations` parameter and apply it to the warmup loop/logging.
README.md	Document that national-scale entrypoints override warmup to 1 iteration.
CLAUDE.md	Update contributor notes to include `warmup_iterations=1` guidance for long-running benchmarks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+            effective_warmup_iterations = (
+                warmup_iterations
+                if warmup_iterations is not None
+                else Config.BENCHMARK_WARMUP_ITERATIONS
+            )
+


…-queries

jathavaan added 2 commits May 22, 2026 18:06

#292 Reduce warmup to 1 for national-scale spatial joins

750b90c

#292 Document warmup_iterations override for national-scale joins

3a93297

Copilot AI review requested due to automatic review settings May 22, 2026 16:09

Copilot started reviewing on behalf of jathavaan May 22, 2026 16:09 View session

jathavaan mentioned this pull request May 22, 2026

Prune Sedona default strategy matrix to large-tier only #309

Closed

6 tasks

Copilot AI reviewed May 22, 2026

View reviewed changes

Comment thread src/application/common/monitor.py

Comment on lines +82 to +87

effective_warmup_iterations = (

warmup_iterations

if warmup_iterations is not None

else Config.BENCHMARK_WARMUP_ITERATIONS

)

Merge branch 'main' into feature/292-create-lower-and-upper-bound-for…

e3848fa

…-queries

jathavaan merged commit d29ae1d into main May 22, 2026

jathavaan deleted the feature/292-create-lower-and-upper-bound-for-queries branch May 22, 2026 16:13

This was referenced May 23, 2026

stop_reason=fixed reported when warmup raises in non-sequential @monitor branch #310

Closed

Create lower and upper bound for queries #292

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#292 Add sequential stopping rule for benchmark iterations#308

#292 Add sequential stopping rule for benchmark iterations#308
jathavaan merged 3 commits into
mainfrom
feature/292-create-lower-and-upper-bound-for-queries

jathavaan commented May 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jathavaan commented May 22, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants