Fix HAVING clause with time_bucket_gapfill returning wrong rows #9624

Open
svenklemm wants to merge 1 commit into main from sven/gapfill_having

@svenklemm
Member

HAVING quals were attached to the GroupAggregate below the GapFill
node, so filtered-out groups disappeared before gap rows could be
generated. When HAVING eliminated every group the query returned zero
rows; when it eliminated some, GapFill still synthesized NULL rows for
the filtered buckets.
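The ordering problem for the partially filtered case can be illustrated with a small model. This is only a sketch under stated assumptions, not the TimescaleDB code: `Row`, `gapfill`, `apply_having`, `old_plan` and `new_plan` are hypothetical names, buckets are plain integers in `[0, n_buckets)`, input rows are sorted by bucket, and the only qual is `count(*) > threshold`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_ROWS = 64 };

/* Hypothetical model of one output row: a bucket and a nullable count. */
typedef struct {
    int bucket;     /* bucket index in [0, n_buckets) */
    bool isnull;    /* true when the count is NULL (gap row) */
    int64_t count;
} Row;

/* HAVING count(*) > threshold; a NULL count never evaluates to TRUE. */
static bool having_passes(const Row *r, int64_t threshold) {
    return !r->isnull && r->count > threshold;
}

/* Copy the rows that pass the qual into out and return how many were kept. */
static size_t apply_having(const Row *in, size_t n_in, int64_t threshold, Row *out) {
    size_t n_out = 0;
    for (size_t i = 0; i < n_in; i++)
        if (having_passes(&in[i], threshold))
            out[n_out++] = in[i];
    return n_out;
}

/* Emit one row per bucket; buckets missing from the sorted input get a NULL count. */
static size_t gapfill(const Row *in, size_t n_in, int n_buckets, Row *out) {
    size_t j = 0, n_out = 0;
    for (int b = 0; b < n_buckets; b++) {
        if (j < n_in && in[j].bucket == b)
            out[n_out++] = in[j++];
        else
            out[n_out++] = (Row){ b, true, 0 };
    }
    return n_out;
}

/* Old ordering: HAVING runs below gapfill, so filtered buckets look like gaps. */
static size_t old_plan(const Row *groups, size_t n, int n_buckets, int64_t threshold, Row *out) {
    Row tmp[MAX_ROWS];
    assert(n <= MAX_ROWS && n_buckets >= 0 && n_buckets <= MAX_ROWS);
    size_t kept = apply_having(groups, n, threshold, tmp);
    return gapfill(tmp, kept, n_buckets, out);
}

/* New ordering: gapfill first, then HAVING on every row it produced. */
static size_t new_plan(const Row *groups, size_t n, int n_buckets, int64_t threshold, Row *out) {
    Row tmp[MAX_ROWS];
    assert(n <= MAX_ROWS && n_buckets >= 0 && n_buckets <= MAX_ROWS);
    size_t filled = gapfill(groups, n, n_buckets, tmp);
    return apply_having(tmp, filled, threshold, out);
}
```

With groups for buckets 0, 1 and 3 and a threshold that only bucket 1 fails, `old_plan` in this model still emits a NULL row for bucket 1, while `new_plan` returns only the two real rows that pass the qual.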

Lift the quals off the aggregate subpath and attach them to the
CustomScan plan's scan.plan.qual, then evaluate them on every tuple
(real and gap-filled) returned from gapfill_exec. Gap rows have NULL
aggregates, so standard SQL HAVING semantics apply: count(*) > N drops
them, count(*) IS NULL keeps them.
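The NULL handling described above follows from SQL three-valued logic: a comparison against NULL yields unknown, and a row survives a qual only when the result is true. A minimal sketch of that rule, assuming a hypothetical `NullableInt` stand-in for an aggregate column (none of these names come from the PostgreSQL or TimescaleDB sources):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Result of evaluating a qual under SQL three-valued logic. */
typedef enum { TV_FALSE, TV_TRUE, TV_UNKNOWN } TriBool;

/* Hypothetical stand-in for one aggregate column of a gapfill output row. */
typedef struct {
    bool isnull;    /* true for rows synthesized by gapfill */
    int64_t value;  /* only meaningful when isnull is false */
} NullableInt;

/* agg > n: comparing NULL with anything yields UNKNOWN. */
static TriBool qual_gt(NullableInt agg, int64_t n) {
    if (agg.isnull)
        return TV_UNKNOWN;
    return agg.value > n ? TV_TRUE : TV_FALSE;
}

/* agg IS NULL: always TRUE or FALSE, never UNKNOWN. */
static TriBool qual_is_null(NullableInt agg) {
    return agg.isnull ? TV_TRUE : TV_FALSE;
}

/* A row passes a qual only when the result is TRUE. */
static bool row_passes(TriBool result) {
    return result == TV_TRUE;
}

/* Keep the rows satisfying agg > n, compacting in place; returns the number kept. */
static size_t filter_gt(NullableInt *rows, size_t n_rows, int64_t n) {
    assert(rows != NULL || n_rows == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n_rows; i++)
        if (row_passes(qual_gt(rows[i], n)))
            rows[kept++] = rows[i];
    return kept;
}
```

In this model a gap row fails `qual_gt` for every `n` and passes `qual_is_null`, which matches the count(*) > N and count(*) IS NULL cases in the description.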

Fall back to the old behaviour for queries where HAVING references an
Aggref that is not a top-level expression of the GapFill pathtarget
(e.g. HAVING sum(x) > 4 with locf(sum(x)) in the target list), since
set_customscan_references cannot resolve the bare Aggref against
custom_scan_tlist in that case.
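The fallback condition can be pictured as a check over the target list. The sketch below is a simplified model, not the planner code in this PR: it identifies aggregates by their text and records an optional wrapper name, whereas the real check works on Aggref nodes in the GapFill pathtarget. `TargetEntryModel`, `agg_is_top_level` and `can_lift_having` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one target list entry. */
typedef struct {
    const char *agg;      /* aggregate text, e.g. "sum(x)" */
    const char *wrapper;  /* NULL when the aggregate is top-level, else e.g. "locf" */
} TargetEntryModel;

/* True when agg appears unwrapped as a top-level entry of the target list. */
static bool agg_is_top_level(const TargetEntryModel *tlist, size_t n_tlist, const char *agg) {
    for (size_t i = 0; i < n_tlist; i++)
        if (tlist[i].wrapper == NULL && strcmp(tlist[i].agg, agg) == 0)
            return true;
    return false;
}

/* HAVING can move onto the gapfill node only if every aggregate it references is top-level. */
static bool can_lift_having(const TargetEntryModel *tlist, size_t n_tlist,
                            const char *const *having_aggs, size_t n_having) {
    assert(tlist != NULL || n_tlist == 0);
    assert(having_aggs != NULL || n_having == 0);
    for (size_t i = 0; i < n_having; i++)
        if (!agg_is_top_level(tlist, n_tlist, having_aggs[i]))
            return false;
    return true;
}
```

For the example in the description, a target list containing only locf(sum(x)) with HAVING sum(x) > 4 makes `can_lift_having` return false in this model, so the query would keep the old plan shape.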

Fixes #5202

@svenklemm svenklemm requested a review from a team April 18, 2026 17:35
@github-actions github-actions Bot requested review from antekresic and kpan2034 April 18, 2026 17:35
@github-actions

@antekresic, @kpan2034: please review this pull request.

Powered by pull-review

@svenklemm svenklemm force-pushed the sven/gapfill_having branch from 8539f75 to 88783de Compare April 18, 2026 17:36
Comment thread tsl/src/nodes/gapfill/gapfill_exec.c
@svenklemm svenklemm added this to the v2.27.0 milestone Apr 29, 2026
@svenklemm svenklemm force-pushed the sven/gapfill_having branch 2 times, most recently from 3b66e77 to 0b88bc5 Compare April 29, 2026 05:39
When a HAVING clause was used with gapfill, groups it filtered out
disappeared before gapfill could extend them. Queries that filtered
out every group returned zero rows, and queries that filtered out
some still got gapfilled rows for the filtered buckets.

Move the HAVING quals onto the gapfill node so they run on every
tuple it produces, real or gapfilled. Gap rows carry NULL aggregates,
so normal SQL semantics apply: count(*) > N drops them and
count(*) IS NULL keeps them.

This only works when each aggregate in HAVING is a top-level entry
of the gapfill output. Queries that wrap aggregates with locf or
interpolate keep the old behaviour, which is correct as long as
HAVING does not eliminate every real group.

Fixes #5202
@svenklemm svenklemm force-pushed the sven/gapfill_having branch from 0b88bc5 to 74e17dc Compare April 29, 2026 06:03
Comment on lines +482 to +484
* disappear before gapfill can extend them (#5202). Skipped when an
* Aggref is not a top-level pathtarget expression (locf/interpolate
* wraps it), since setrefs cannot resolve it then.
Member

What are the implications of this? Does HAVING still not work if you use locf/interpolate?

Member

Should locf/interpolate be at the Agg node at all? Maybe we should remove them?

Successfully merging this pull request may close these issues.

[Bug]: time_bucket_gapfill with HAVING clause behaving weirdly
