Add projection pushdown to binary expression #691
Open
yeya24 wants to merge 3 commits into thanos-io:main from
Conversation
Signed-off-by: yeya24 <benye@amazon.com>
Fixes #689
The idea is to reuse the existing projection logical optimizer to push projections down to binary expressions. The binary expression vector operator can then use the projection information to skip non-projected labels when materializing labels in the join table.
Added comprehensive tests, including correctness tests.
My local benchmarks showed that this helps mainly when label string interning is disabled. We are still using
slicelabels in Cortex, so it helps with our use case. Users can choose whether to enable or disable this functionality. Here is the AI-generated benchmark report based on the benchmarks I ran locally.
Binary Projection Pushdown - Benchmark Results
Test Configuration
-tags slicelabels (label interning disabled)
Results
Small Dataset (1K series, 10 labels)
Without Projection:
With Projection:
Savings:
Large Dataset (10K series, 20 labels)
Without Projection:
With Projection:
Savings:
Key Findings
Scaling Behavior
The optimization's benefits scale linearly with dataset size:
Why It Works (with slicelabels)
Memory Breakdown (Large Dataset)
Without projection (22.3 MB):
With projection (4.1 MB):
Savings: 18.2 MB (82%)
Impact of Label Interning
With Default Build (Label Interning Enabled)
The optimization provides minimal benefit because:
With slicelabels Build (Label Interning Disabled)
The optimization provides massive benefit because:
Real-World Implications
For Prometheus (Default Build with Interning)
Not recommended: the optimization adds CPU overhead without meaningful memory savings.
For Systems Without Label Interning
Highly recommended: provides:
When to Enable
Enable this optimization when:
Conclusion
The optimization is correctly implemented and provides dramatic benefits for systems without label interning: