Skip to content

feat: default to sort-merge join#1651

Draft
andygrove wants to merge 6 commits intoapache:mainfrom
andygrove:feat/default-sort-merge-join
Draft

feat: default to sort-merge join#1651
andygrove wants to merge 6 commits intoapache:mainfrom
andygrove:feat/default-sort-merge-join

Conversation

@andygrove
Copy link
Copy Markdown
Member

Which issue does this PR close?

Closes #1648.

Rationale for this change

DataFusion's hash join has no spill support, so the build side must fit in memory per task. Ballista executors run multiple tasks per host, so per-task build sides aggregate and OOM the executor under realistic load — for example, the integration test suite (./dev/integration-tests.sh) does not complete on a typical machine without disabling hash joins.

DataFusion sets datafusion.optimizer.prefer_hash_join = true, which is fine for a single-node engine but a poor fit for a distributed multi-task one. This PR flips the Ballista default to sort-merge join, which spills, and leaves users a session-level knob to opt back into hash join when they know the build side fits.

What changes are included in this PR?

  • Set datafusion.optimizer.prefer_hash_join = false inside ballista_restricted_configuration (ballista/core/src/extension.rs), alongside the other soft DataFusion-default overrides. This is a soft default — users can override per session.
  • Update the existing should_support_sort_merge_join integration test to assert the default picks SortMergeJoinExec via EXPLAIN, instead of relying on a result-row check that would pass under either join algorithm.
  • Add should_support_hash_join_when_opted_in to verify users can still get HashJoinExec after SET datafusion.optimizer.prefer_hash_join = true.
  • Document the new default and the opt-in path in docs/source/user-guide/tuning-guide.md under a new "Join Strategy" subsection.

Are there any user-facing changes?

Yes — behaviour change. Queries that previously planned a HashJoinExec will now plan a SortMergeJoinExec by default. The old behaviour is one SET away. Documented in the tuning guide. No public Rust API change, so no api change label.

andygrove added 6 commits May 2, 2026 17:32
Replace fragile positional `.column(1)` with schema-based `index_of("plan")`
and use `col.iter().flatten()` instead of an index loop over non-null values.
Drop the now-unneeded `Array` trait import.
DataFusion's hash join has no spill support, so each parallel task on
an executor must hold the full build side in memory. Default Ballista
sessions to sort-merge join (which spills) until DataFusion gains a
spilling hash join. Users can opt back in with
`SET datafusion.optimizer.prefer_hash_join = true`.

See apache#1648
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Should Ballista use sort-merge join rather than hash join by default?

1 participant