Skip to content

fix(binary): Detect duplicate output series in group_left/group_right#693

Draft
harry671003 wants to merge 1 commit intothanos-io:mainfrom
harry671003:fix_bug
Draft

fix(binary): Detect duplicate output series in group_left/group_right#693
harry671003 wants to merge 1 commit intothanos-io:mainfrom
harry671003:fix_bug

Conversation

@harry671003
Copy link
Contributor

Notes

When group_left or group_right produces multiple result series with identical labels, the engine should error with "grouping labels must ensure unique matches" to match Prometheus behavior. Previously,
joinHelper.append() silently deduplicated output series by hash, causing multiple matched pairs to merge into a single series with duplicate timestamps instead of returning an error.

Make joinHelper.append() report whether a label set was already seen, and check for duplicates in initJoinTables for many-to-one/one-to-many operations.

Same behavior in Prometheus engine: https://github.com/prometheus/prometheus/blob/1906c9877fbe7d33d4bd220e9851c719e1b43156/promql/engine.go#L3049

@harry671003 harry671003 requested a review from Copilot March 10, 2026 17:08
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts binary vector matching to error (Prometheus-compatible) when group_left/group_right produces duplicate output label sets, instead of silently deduplicating/merging series.

Changes:

  • Make initJoinTables return an error and propagate it during operator initialization.
  • Extend joinHelper.append() to report whether a label set was already seen, and use it to detect duplicates for non-one-to-one matching.
  • Add a new engine test case and ignore a new local folder in .gitignore.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File Description
execution/binary/vector.go Detect duplicate output series during join table initialization; propagate errors; extend join helper API.
engine/engine_test.go Adds a scenario around group_right with missing matching labels.
.gitignore Ignores .kiro directory.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +585 to 592
// append adds a label set to the join helper and returns its index.
// If the label set was already seen (by hash), it returns the existing index
// and true. Otherwise it appends a new entry and returns false.
func (h *joinHelper) append(ls labels.Labels) (int, bool) {
hash := ls.Hash()
if n, ok := h.seen[hash]; ok {
return n
return n, true
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

joinHelper.append() treats a matching labels.Hash() as an identical label set. Hash collisions (even if rare) would incorrectly mark distinct series as duplicates (triggering a false Prometheus error) or incorrectly deduplicate to the wrong index. Consider storing and comparing actual label sets on hash match (e.g., map hash -> list of indices, then labels.Equal(h.ls[idx], ls)), and only returning duplicate=true when labels are truly identical. If you keep hash-only semantics, the doc comment should also make it explicit that this is hash-based and collision-prone.

Copilot uses AI. Check for mistakes.
Comment on lines +2340 to +2345
num_cpus{pod="nginx-1"} 4+0x40
memory_usage_bytes{pod="nginx-1", type="buffers"} 100+10x40
memory_usage_bytes{pod="nginx-1", type="cached"} 200+20x40
memory_usage_bytes{pod="nginx-1", type="free"} 300+30x40
memory_usage_bytes{pod="nginx-1", type="used"} 400+40x40`,
query: `num_cpus * on(instance, job) group_right(type) memory_usage_bytes`,
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new test case doesn't appear to exercise the new behavior described in the PR (erroring when group_left/group_right yields multiple output series with identical labels). Consider adding a dedicated case that produces duplicate output labels and asserts the engine returns multiple matches for labels: grouping labels must ensure unique matches (e.g., use group_right without including a distinguishing label like type, or craft inputs where the grouping labels still collide).

Suggested change
num_cpus{pod="nginx-1"} 4+0x40
memory_usage_bytes{pod="nginx-1", type="buffers"} 100+10x40
memory_usage_bytes{pod="nginx-1", type="cached"} 200+20x40
memory_usage_bytes{pod="nginx-1", type="free"} 300+30x40
memory_usage_bytes{pod="nginx-1", type="used"} 400+40x40`,
query: `num_cpus * on(instance, job) group_right(type) memory_usage_bytes`,
num_cpus{pod="nginx-1",instance="localhost:9100",job="node"} 4+0x40
memory_usage_bytes{pod="nginx-1",instance="localhost:9100",job="node",type="buffers"} 100+10x40
memory_usage_bytes{pod="nginx-1",instance="localhost:9100",job="node",type="cached"} 200+20x40
memory_usage_bytes{pod="nginx-1",instance="localhost:9100",job="node",type="free"} 300+30x40
memory_usage_bytes{pod="nginx-1",instance="localhost:9100",job="node",type="used"} 400+40x40`,
// Multiple right-hand series share the same (instance, job) labels; omitting a
// distinguishing label like "type" in group_right causes multiple matches and
// should trigger: "multiple matches for labels: grouping labels must ensure unique matches".
query: `num_cpus * on(instance, job) group_right memory_usage_bytes`,

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants