#275 Wire RQ1/RQ2 benchmark matrix #280

Merged: jathavaan merged 1 commit into main from feataure/275-wire-up-the-rq1rq2-benchmark-matrix-in-benchmarksyml-docker-composeyml-and-ci-workflows on May 19, 2026

Conversation

@jathavaan
Collaborator

Rewrites benchmarks.yml to the 52 active experiments from Table 4.2.1 (RQ1: 4 query types × 7 engine/size variants = 28; RQ2: 6 single-node + 18 Sedona = 24). RQ1 pairs three engines at small and two at medium and large via related_script_ids; Sedona variants run unpaired.
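
As a quick sanity check, the counts in this paragraph can be reproduced with a short sketch. The query-type labels are placeholders (the description does not name them), the engines are modelled only as anonymous slots, and the RQ2 totals are copied as stated rather than derived.

```python
# Placeholder labels: the PR description does not name the four query types.
QUERY_TYPES = ["query-type-1", "query-type-2", "query-type-3", "query-type-4"]

# Engines compared per dataset size in RQ1 (three at small, two at medium/large).
ENGINES_PER_SIZE = {"small": 3, "medium": 2, "large": 2}

# RQ1: one experiment per (query type, size, engine slot).
rq1_experiments = [
    (query_type, size, slot)
    for query_type in QUERY_TYPES
    for size, engine_count in ENGINES_PER_SIZE.items()
    for slot in range(engine_count)
]

# RQ1 pair groups: engines sharing a (query type, size) launch together.
rq1_pair_groups = {(query_type, size) for query_type, size, _ in rq1_experiments}

# RQ2 totals as stated in the description (not derived here).
RQ2_SINGLE_NODE = 6
RQ2_SEDONA = 18
rq2_total = RQ2_SINGLE_NODE + RQ2_SEDONA

total_experiments = len(rq1_experiments) + rq2_total
print(len(rq1_experiments), rq2_total, total_experiments, len(rq1_pair_groups))
# prints: 28 24 52 12
```

The 12 RQ1 pair groups fall out of 4 query types × 3 sizes, since every engine at a given size joins the same group.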

Adds _strip_dataset_size_suffix in benchmark_runner.py so that the 52 unique ids in benchmarks.yml (each carrying a -small, -medium, or -large suffix) dispatch to the shared base entrypoint, while the dataset size is passed separately via --dataset-size.

Collapses docker-compose.yml to 20 active services, one per (query_type, engine) pair, and trims both CI build matrices to match. Out-of-scope benchmarks stay in the tree but are commented out at the bottom of benchmarks.yml, docker-compose.yml, and the workflow matrices, so re-enabling them is a single search-and-uncomment.

Adds a Test matrix table to the README documenting which experiments launch as a parallel pair group and which run alone.

@jathavaan jathavaan self-assigned this May 19, 2026
Copilot AI review requested due to automatic review settings May 19, 2026 09:26
Contributor

Copilot AI left a comment

Pull request overview

Reorganizes the benchmark configuration to align with Table 4.2.1 (52 active experiments: 28 for RQ1, 24 for RQ2) by introducing per-size experiment ids while keeping a single entrypoint per (query_type, engine) pair. The orchestrator already forwards --dataset-size separately, so a small suffix-stripping helper in benchmark_runner.py keeps the dispatch table compact. Out-of-scope items are commented out (not deleted) in all four affected files so re-enabling is uniform.

Changes:

  • Rewrites benchmarks.yml to 52 size-suffixed experiments with correctly scoped related_script_ids (3-way at small, 2-way at medium/large, Sedona unpaired); preserves out-of-scope entries as comments.
  • Adds _strip_dataset_size_suffix in benchmark_runner.py so size-suffixed ids dispatch to the existing base cases; the match table now correctly covers every active base id (verified all 12 RQ1 + 8 RQ2 base ids exist as cases).
  • Trims docker-compose.yml and the two GitHub Actions matrices to the 20 active services, and extends the README with a Test matrix section documenting pair groups (12 RQ1 + 21 RQ2 = 33, consistent with 52 experiments).
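
The pairing rules in the list above (3-way at small, 2-way at medium/large, Sedona unpaired) lend themselves to a mechanical check. The sketch below assumes the related_script_ids entries have already been loaded into a plain dict keyed by experiment id; the ids shown and the function name find_asymmetric_links are hypothetical, and the actual layout of benchmarks.yml may differ.

```python
def find_asymmetric_links(related: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (a, b) pairs where a lists b but b does not list a."""
    problems = []
    for script_id, peers in related.items():
        for peer in peers:
            if script_id not in related.get(peer, []):
                problems.append((script_id, peer))
    return problems


# Hypothetical ids: a 3-way group at small, a 2-way group at medium,
# and one unpaired Sedona-style entry.
related_script_ids = {
    "q-engine-a-small": ["q-engine-b-small", "q-engine-c-small"],
    "q-engine-b-small": ["q-engine-a-small", "q-engine-c-small"],
    "q-engine-c-small": ["q-engine-a-small", "q-engine-b-small"],
    "q-engine-a-medium": ["q-engine-b-medium"],
    "q-engine-b-medium": ["q-engine-a-medium"],
    "sedona-variant-small": [],
}

print(find_asymmetric_links(related_script_ids))  # prints: []
```

An empty result means every link is mutual; any returned pair points at an entry whose peer does not link back.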

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| benchmarks.yml | Replaces flat experiment list with 52 size-suffixed entries; mutual related_script_ids are symmetric at each size; commented out-of-scope block. |
| benchmark_runner.py | Adds _strip_dataset_size_suffix and wires it into the dispatch match; iterates DatasetSize enum so small/medium/large suffixes are stripped before lookup. |
| docker-compose.yml | Reduces to 20 active services (one per base id) plus commented out-of-scope mirror. |
| .github/workflows/push-containers-to-acr.yml | Trims build matrix to the 20 active images; commented mirror at the bottom. |
| .github/workflows/pull-request-tests.yml | Mirrors the trimmed service matrix for PR tests. |
| README.md | Adds TOC entry and "Test matrix" subsection documenting RQ1/RQ2 pair groupings and Shapefile-only-at-small policy. |

@jathavaan jathavaan enabled auto-merge May 19, 2026 09:28
@jathavaan jathavaan disabled auto-merge May 19, 2026 09:40
@jathavaan jathavaan merged commit 6f05dd3 into main May 19, 2026
28 checks passed
@jathavaan jathavaan deleted the feataure/275-wire-up-the-rq1rq2-benchmark-matrix-in-benchmarksyml-docker-composeyml-and-ci-workflows branch May 19, 2026 09:40
Development

Successfully merging this pull request may close these issues.

Wire up the RQ1/RQ2 benchmark matrix in benchmarks.yml, docker-compose.yml, and CI workflows

2 participants