Cagg try #9629
Draft
gayyappan wants to merge 23 commits into timescale:main from
Conversation
Force-pushed from 3705055 to 85adf5a
Codecov Report ❌ Patch coverage is
Force-pushed from 9263fd4 to 09047a5
Add a new catalog table to track which devices have backfilled data into old chunks. This will be used by the cagg refresh to only re-materialize data for devices that actually backfilled, rather than refreshing the entire time range. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a nullable tenant_column_name column to the continuous_agg catalog table. When set, it identifies the column used for backfill-aware refresh. NULL means tracking is disabled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add support for setting the tenant column on a continuous aggregate via ALTER MATERIALIZED VIEW ... SET (timescaledb.tenant_column). Validates that the column exists on the raw hypertable, that all sibling caggs agree on the same tenant column, and errors if a tenant column is already set on the cagg. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
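The three validation rules above (column must exist on the raw hypertable, sibling caggs must agree, and re-setting is an error) can be sketched as follows. This is a hypothetical, simplified Python model of the checks; the actual implementation is C inside TimescaleDB and reports PostgreSQL errors, and the function and parameter names here are invented.

```python
# Hypothetical sketch of the tenant-column validation done by
# ALTER MATERIALIZED VIEW ... SET (timescaledb.tenant_column).
# All names below are invented for illustration.
def validate_tenant_column(new_col, raw_hypertable_columns,
                           sibling_tenant_cols, current_tenant_col):
    """Return new_col if it passes all checks, else raise ValueError."""
    # Error if a tenant column is already set on this cagg.
    if current_tenant_col is not None:
        raise ValueError("tenant column is already set on this cagg")
    # The column must exist on the raw hypertable.
    if new_col not in raw_hypertable_columns:
        raise ValueError(
            f'column "{new_col}" does not exist on the raw hypertable')
    # All sibling caggs on the same hypertable must agree (NULL = unset).
    for col in sibling_tenant_cols:
        if col is not None and col != new_col:
            raise ValueError(
                "all caggs on this hypertable must use the same tenant column")
    return new_col
```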
Store the chunk's time range end on ChunkInsertState during chunk routing. We will use this to determine the watermark and check whether an insert is late-arriving data for a tenant. 1) Instrument the INSERT and COPY paths to detect when data is inserted into old chunks (below the low watermark). When backfill is detected, the device value and time range are buffered in a transaction-local hash table and flushed to the backfill_tracker catalog table at commit. The watermark is derived automatically: now - max(chunk_interval, 1 day). The per-row cost for non-backfill inserts is a single bool check (cached per chunk). Backfill rows pay slot_getattr + hash lookup + min/max comparisons. Text conversion only happens at flush time. 2) Cache GetCurrentTimestamp once per transaction for the backfill watermark. Avoid calling GetCurrentTimestamp for every new chunk seen during backfill detection. The timestamp is now computed once in backfill_tracker_init and reused for all watermark checks within the transaction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
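The buffering scheme above (cached per-transaction timestamp, derived watermark, per-device min/max accumulation, conversion deferred to flush) can be sketched as a minimal Python model. This is an illustration only, assuming a single text tenant column; the real implementation is a C hash table inside TimescaleDB, and the class and method names here are invented.

```python
# Hypothetical sketch of the transaction-local backfill buffer.
# Real code: C hash table flushed to the backfill_tracker catalog at commit.
from datetime import datetime, timedelta


class BackfillTracker:
    """Transaction-local buffer of per-device backfilled time ranges."""

    def __init__(self, chunk_interval: timedelta, now: datetime = None):
        # GetCurrentTimestamp is cached once per transaction, not per chunk.
        self.now = now or datetime.utcnow()
        # Watermark derived automatically: now - max(chunk_interval, 1 day).
        self.watermark = self.now - max(chunk_interval, timedelta(days=1))
        self.buffer = {}  # device -> (min_ts, max_ts)

    def is_backfill_chunk(self, chunk_range_end: datetime) -> bool:
        # Cached per chunk in the real code: one bool check per row.
        return chunk_range_end < self.watermark

    def record(self, device: str, ts: datetime) -> None:
        # Backfill rows pay a hash lookup + min/max comparisons only.
        lo, hi = self.buffer.get(device, (ts, ts))
        self.buffer[device] = (min(lo, ts), max(hi, ts))

    def flush(self):
        # At commit: one (device, lowest, greatest) row per device.
        rows = [(dev, lo, hi) for dev, (lo, hi) in sorted(self.buffer.items())]
        self.buffer.clear()
        return rows
```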
continuous_agg_backfill_check now returns bool — true means the row's invalidation has been recorded in the per-device backfill tracker and the coarse hypertable_invalidation_log entry is not needed. The INSERT and COPY callers are re-ordered to invoke the tracker check first and only fall through to continuous_agg_dml_invalidate when it returns false (recent chunks, or hypertables without a configured tenant column). This ensures backfill-chunk inserts with a tenant column produce exactly one invalidation record. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
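The re-ordered decision can be sketched as a small routing function: tracker check first, coarse log only on fall-through, so each row produces exactly one invalidation record. A hypothetical, self-contained Python sketch (the function signature and log representations are invented; the real callers are the C INSERT and COPY paths):

```python
# Hypothetical sketch of the re-ordered invalidation path.
from datetime import datetime


def route_invalidation(has_tenant_column: bool, chunk_range_end: datetime,
                       watermark: datetime, tracker_log: list, coarse_log: list,
                       device: str, ts: datetime) -> str:
    """Record exactly one invalidation; return which log received it."""
    # Tracker check first: backfill chunk + configured tenant column
    # means a fine-grained per-device entry, and the coarse
    # hypertable_invalidation_log entry is skipped entirely.
    if has_tenant_column and chunk_range_end < watermark:
        tracker_log.append((device, ts))
        return "tracker"
    # Fall through for recent chunks, or hypertables without a tenant column.
    coarse_log.append(ts)
    return "coarse"
```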
We had `=` earlier; we need to compare against a list of tenants. Convert equality with a single value to equality with ANY (array of values).
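The rewrite amounts to building a `tenant = ANY (ARRAY[...])` qualifier instead of `tenant = 'x'`. A minimal illustrative helper, assuming a text tenant column (the helper name is invented, and real code would use bound parameters, not string interpolation):

```python
# Hypothetical helper: build the ANY-array form of the tenant qualifier
# described above. Illustration only -- real code binds the array as a
# parameter (e.g. tenant = ANY($3)) rather than splicing literals.
def tenant_qual(tenants):
    array = ", ".join(f"'{t}'" for t in tenants)
    return f"tenant = ANY (ARRAY[{array}])"
```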
Group backfill invalidations by bucket + tenant.
Manual refresh now consumes the continuous_aggs_backfill_tracker in addition to the coarse cagg invalidation log. Entries in the window are bucket-grouped and re-materialised per-tenant via DELETE+INSERT scoped by tenant = ANY($3), bypassing a full time-range rewrite of buckets that only one tenant backfilled into. Adds two pieces plumbed into the existing T3 of continuous_agg_refresh_internal:
- collect_and_delete_tracker_entries_in_window (invalidation.c) expands each tracker row [lowest, greatest] under the cagg's bucket function into per-bucket (tenant) pairs, sorts + dedups, and deletes the source rows.
- continuous_agg_refresh_with_tracker (refresh.c) walks the resulting groups and calls continuous_agg_update_materialization_for_tenant once per bucket with a constructed tenant ArrayType.
process_cagg_invalidations_and_refresh now returns true when either the cagg log or the tracker produced work, so the "already up-to-date" notice is correctly suppressed after a tracker-only refresh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
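The expand/sort/dedup/group step can be sketched in miniature. This is a hypothetical model assuming integer times and a fixed-width bucket function (the function name is invented, and deletion of the source tracker rows is omitted); the real collect_and_delete_tracker_entries_in_window works on the cagg's actual bucket function in C.

```python
# Hypothetical sketch: expand tracker rows [lowest, greatest] into
# per-(bucket, tenant) pairs, dedup, and group tenants per bucket so each
# bucket is re-materialised once with tenant = ANY(<tenant array>).
def group_tracker_entries(rows, bucket_width):
    """rows: iterable of (tenant, lowest, greatest) with integer times.
    Returns {bucket_start: sorted list of tenants}."""
    pairs = set()
    for tenant, lo, hi in rows:
        # Fixed-width stand-in for the cagg's bucket function.
        bucket = (lo // bucket_width) * bucket_width
        while bucket <= hi:
            pairs.add((bucket, tenant))   # dedup across overlapping rows
            bucket += bucket_width
    groups = {}
    for bucket, tenant in sorted(pairs):  # sorted -> deterministic order
        groups.setdefault(bucket, []).append(tenant)
    return groups
```

Each resulting group corresponds to one DELETE+INSERT for that bucket, scoped to the tenants that actually backfilled into it.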
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cover backfill that spans multiple buckets for one tenant and backfill that touches multiple tenants across buckets. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r region NOT HANDLED well today. Need to fix.
Force-pushed from 09047a5 to 087fed1
We need this for the case where multiple caggs are defined on the hypertable. Fix this later: add it only if there are multiple caggs.
No description provided.