Conversation
This will help do quicker mass deletions in a column
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new batch-deletion API to the column store so callers can delete many rows in one operation (with new tests to validate behavior across value types and null handling).
Changes:
- Introduce
DeleteBatch(ReadOnlySpan<int> targets)onIColumn/IDataColumnand implement it across column/data-column types. - Add
Column.DeleteBatchthat updates null accounting and delegates to validity/data storage. - Add extensive core tests covering deletes across multiple types (ints/strings/bools/doubles/unions/lists/maps/structs) and null scenarios.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/FlowtideDotNet.Core.Tests/ColumnStore/ColumnDeleteBatchTests.cs | New test suite validating Column.DeleteBatch across types, nulls, and large-scale deletes |
| src/FlowtideDotNet.Core/ColumnStore/IColumn.cs | Adds DeleteBatch to the public column interface |
| src/FlowtideDotNet.Core/ColumnStore/Column.cs | Implements DeleteBatch with null counter + validity/data updates |
| src/FlowtideDotNet.Core/ColumnStore/ColumnWithOffset.cs | Explicitly does not support DeleteBatch |
| src/FlowtideDotNet.Core/ColumnStore/AlwaysNullColumn.cs | Explicitly does not support DeleteBatch |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/IDataColumn.cs | Adds DeleteBatch to the public data-column interface |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/IntegerColumn.cs | Delegates batch delete to underlying primitive storage |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/StringColumn.cs | Delegates batch delete to underlying BinaryList |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/BinaryColumn.cs | Delegates batch delete to underlying BinaryList (also adds an unused using) |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/BoolColumn.cs | Delegates batch delete to underlying storage |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/DoubleColumn.cs | Delegates batch delete to underlying storage |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/DecimalColumn.cs | Delegates batch delete to underlying storage |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/TimestampTzColumn.cs | Delegates batch delete to underlying storage |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/NullColumn.cs | Implements batch delete by decrementing count |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/UnionColumn.cs | Implements batch delete via repeated RemoveAt (reverse order) |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/ListColumn.cs | Implements batch delete via repeated RemoveAt (reverse order) |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/MapColumn.cs | Implements batch delete via repeated RemoveAt (reverse order) |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/StructColumn.cs | Implements batch delete via repeated RemoveAt (reverse order) |
| src/FlowtideDotNet.Core/ColumnStore/DataColumns/Int64Column.cs | Adds DeleteBatch but throws NotImplementedException |
Comments suppressed due to low confidence (1)
src/FlowtideDotNet.Core/ColumnStore/DataColumns/BinaryColumn.cs:30
- The newly added
using SqlParser.Ast;appears unused in this file (and introduces an unnecessary dependency/compile warning). Please remove it (and consider removing the existing unused static imports as well if they aren’t needed).
using FlowtideDotNet.Substrait.Expressions;
using SqlParser.Ast;
using System.Buffers;
using System.Diagnostics;
using System.IO.Hashing;
using System.Text.Json;
using static SqlParser.Ast.FetchDirection;
using static SqlParser.Ast.MatchRecognizeSymbol;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Contributor
There was a problem hiding this comment.
Benchmark
Details
| Benchmark suite | Current: 53224d3 | Previous: 859330d | Ratio |
|---|---|---|---|
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.InnerJoin |
201850350 ns (± 8514685.996591745) |
307785430 ns (± 9897277.576973714) |
0.66 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.LeftJoin |
292859461.1111111 ns (± 10769895.874780364) |
461337240 ns (± 19940786.10567074) |
0.63 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.ProjectionAndNormalization |
64862430 ns (± 6505969.787484927) |
125533570 ns (± 10748358.533159066) |
0.52 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.SumAggregation |
71991690 ns (± 5339178.33305005) |
138528520 ns (± 8580649.062577441) |
0.52 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.ListAggWithMapAggregation |
1384550350 ns (± 122618112.33690152) |
1823440280 ns (± 56615189.59691922) |
0.76 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.WindowSum |
262595220 ns (± 30493988.94907359) |
359591925 ns (± 5946342.432057349) |
0.73 |
FlowtideDotNet.Benchmarks.Stream.StreamBenchmark.ListAggWithStructAggregation |
1079525070 ns (± 73140501.10509604) |
1507727890 ns (± 61734781.22500134) |
0.72 |
This comment was automatically generated by workflow using github-action-benchmark.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This will help do quicker mass deletions in a column