fix(rust): Flatten late merge_sorted unions #26962
pablogsal wants to merge 1 commit into pola-rs:main
The merge_sorted crash in 1.37 comes from a bad plan shape produced by
the optimizer, not from merge_sorted itself returning the wrong data.
A repeated reduction like:
functools.reduce(lambda a, b: a.merge_sorted(b, key="ts"), inputs)
builds a left-deep MergeSorted tree. When the query later ends in an
order-destroying or order-reestablishing operator such as a final sort,
the order-observability pass is allowed to rewrite each MergeSorted node
into an unordered Union because the exact intermediate merge order is no
longer semantically observable.
That rewrite is fine in isolation, but in 1.37 the order pass runs after
the main FlattenUnionRule. The result is that the optimizer can
manufacture a left-deep Union(Union(Union(...))) tree after the only
flattening pass has already finished. The memory engine then executes
that nested union tree recursively. With enough inputs, the recursive
execution depth is large enough to overflow a rayon worker stack and
segfault.
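The shape problem described above can be reproduced on a toy plan type. This is only a sketch: `Plan`, `left_deep_merge`, `merge_to_binary_union` and `union_depth` are hypothetical names invented for illustration and are not the Polars IR or optimizer API. It shows that a node-by-node MergeSorted -> Union rewrite keeps the left-deep shape, so the nesting depth grows linearly with the number of inputs.

```rust
// Toy stand-in for optimizer plan nodes; NOT the real Polars IR.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(usize),
    MergeSorted(Box<Plan>, Box<Plan>),
    Union(Vec<Plan>),
}

// Mirrors the reduce over merge_sorted: every step wraps the
// previous tree as the left child, producing a left-deep chain.
fn left_deep_merge(n: usize) -> Plan {
    let mut plan = Plan::Scan(0);
    for i in 1..n {
        plan = Plan::MergeSorted(Box::new(plan), Box::new(Plan::Scan(i)));
    }
    plan
}

// Node-by-node rewrite: each MergeSorted becomes a *binary* Union,
// so the left-deep shape of the input tree is preserved.
fn merge_to_binary_union(plan: Plan) -> Plan {
    match plan {
        Plan::MergeSorted(left, right) => Plan::Union(vec![
            merge_to_binary_union(*left),
            merge_to_binary_union(*right),
        ]),
        other => other,
    }
}

// Number of Union levels on the deepest path of the tree.
fn union_depth(plan: &Plan) -> usize {
    match plan {
        Plan::Union(inputs) => 1 + inputs.iter().map(union_depth).max().unwrap_or(0),
        Plan::MergeSorted(left, right) => union_depth(left).max(union_depth(right)),
        Plan::Scan(_) => 0,
    }
}
```

With five inputs the rewritten tree has four nested Union levels; an executor that recurses once per level would need a stack proportional to the input count. The toy itself is recursive, so it should only be exercised with small chains.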
This patch keeps the existing MergeSorted -> Union optimization but
immediately runs FlattenUnionRule again afterwards. That is enough to
collapse the late-created nested unions back into one flat Union node
before execution planning continues.
In other words, this does not disable the optimization. It preserves the
cheaper unordered execution strategy while preventing the pathological
recursive executor shape that was causing the crash.
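As a rough illustration of what a flattening pass does to a late-created nested union, here is a minimal sketch on a toy plan type. `Plan`, `left_deep_union` and `flatten_unions` are hypothetical names for this example only; the real FlattenUnionRule operates on the Polars IR arena and may differ in detail.

```rust
// Toy stand-in for optimizer plan nodes; NOT the real Polars IR.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(usize),
    Union(Vec<Plan>),
}

// Builds Union(Union(Union(s0, s1), s2), s3) and so on.
fn left_deep_union(n: usize) -> Plan {
    let mut plan = Plan::Scan(0);
    for i in 1..n {
        plan = Plan::Union(vec![plan, Plan::Scan(i)]);
    }
    plan
}

// Splices the inputs of every child Union into its parent,
// preserving the left-to-right order of the leaves.
fn flatten_unions(plan: Plan) -> Plan {
    match plan {
        Plan::Union(inputs) => {
            let mut flat = Vec::new();
            for input in inputs {
                match flatten_unions(input) {
                    Plan::Union(nested) => flat.extend(nested),
                    other => flat.push(other),
                }
            }
            Plan::Union(flat)
        }
        other => other,
    }
}
```

After flattening, a four-input chain is a single Union with four direct inputs, so execution depth no longer depends on how many inputs were merged.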
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main   #26962    +/-   ##
========================================
  Coverage   81.73%   81.74%
========================================
  Files        1808     1809      +1
  Lines      249000   249151    +151
  Branches     3139     3139
========================================
+ Hits       203531   203674    +143
- Misses      44664    44672      +8
  Partials      805      805
See #26960 (comment) for context. A more "correct" fix that doesn't involve a second pass might be to fix it in the IR, since right now it does:

    *ir = IR::Union {
        inputs: vec![*input_left, *input_right],
        ...
    };

That creates a binary Union, and because this pass visits parents before children, a left-deep MergeSorted chain turns into a left-deep Union chain. Then you need late flattening. A better version can be, conceptually:

    fn collect_merge_sorted_inputs(node: Node, ir_arena: &Arena<IR>, out: &mut Vec<Node>) {
        match ir_arena.get(node) {
            IR::MergeSorted { input_left, input_right, .. } => {
                collect_merge_sorted_inputs(*input_left, ir_arena, out);
                collect_merge_sorted_inputs(*input_right, ir_arena, out);
            },
            _ => out.push(node),
        }
    }

Then rewrite the current node to a flat union from those collected inputs, as that avoids the extra optimize_loop entirely. I can go this route if people agree.

Edit: I dug a bit deeper into this. The more direct place to address the issue is indeed the

    *ir = IR::Union {
        inputs: vec![*input_left, *input_right],
        ...
    };

That preserves the original binary shape, so a left-deep MergeSorted chain becomes a left-deep Union chain, which is why we need a late flatten afterwards. However, after tracing So I think there are two options:
Given that, I no longer think “flatten directly in
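For reference, the collect_merge_sorted_inputs idea from this comment can be sketched on a toy arena. Everything here is an assumption made for illustration: `Node` is a plain index, `IR` is a two-variant stand-in, the arena is a `Vec`, and `build_chain` is a hypothetical helper. The explicit stack is a variation on the recursive version quoted above, chosen so that collecting inputs does not itself recurse once per chain level.

```rust
// Hypothetical arena index and IR; the names echo the comment above,
// but this is NOT the Polars Arena or IR type.
type Node = usize;

enum IR {
    Scan,
    MergeSorted { input_left: Node, input_right: Node },
}

// Collects the non-MergeSorted leaves under `root` in left-to-right
// order, using an explicit stack instead of recursion.
fn collect_merge_sorted_inputs(root: Node, arena: &[IR]) -> Vec<Node> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        match &arena[node] {
            IR::MergeSorted { input_left, input_right } => {
                // Push right first so the left input is visited first.
                stack.push(*input_right);
                stack.push(*input_left);
            }
            _ => out.push(node),
        }
    }
    out
}

// Builds a left-deep MergeSorted chain over `n` scans (n >= 1) and
// returns the arena together with the root node of the chain.
fn build_chain(n: usize) -> (Vec<IR>, Node) {
    let mut arena: Vec<IR> = (0..n).map(|_| IR::Scan).collect();
    let mut root: Node = 0;
    for leaf in 1..n {
        arena.push(IR::MergeSorted { input_left: root, input_right: leaf });
        root = arena.len() - 1;
    }
    (arena, root)
}
```

The collected nodes could then be used as the inputs of one flat union. Whether that is the right place for the fix is exactly the open question in the comment; the sketch only shows the traversal.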
Fixes: #26960
The regression test added here intentionally uses a much smaller merge_sorted chain than the real crash repro. The production repro uses thousands of inputs; the test only needs enough inputs to prove the optimizer emits a single flat UNION block instead of a chain of nested UNION nodes.
As requested by the first contribution guidelines: