[Variant] Take top-level nulls into consideration when extracting perfectly shredded children#9702

Open
AdamGS wants to merge 2 commits into apache:main from AdamGS:adamg/perfect-shredded-array-nulls

@AdamGS AdamGS commented Apr 13, 2026

Which issue does this PR close?

Rationale for this change

Fixes a correctness issue where top-level nullability is dropped when extracting perfectly shredded children.
It's important to note that, due to the current canonicalization behavior, some types (like Binary) already behave correctly. This will be fully addressed in #9610, where we can support more underlying types, which simplifies the logic significantly.

What changes are included in this PR?

Union the nullability buffers of perfectly shredded variant children with the array's top-level nullability.
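
To illustrate what "union" means here, below is a minimal, std-only sketch. It is not the code from this PR: the `Validity` alias and `union_nulls` function are hypothetical names, and plain `Vec<bool>` masks stand in for Arrow null buffers. The idea is that a row in the extracted child is null if it is null in either the child or the top-level array.

```rust
/// Validity mask: `true` = valid (non-null), `false` = null.
/// `None` means "no null buffer", i.e. every row is valid.
/// (Hypothetical stand-in for an Arrow null buffer.)
type Validity = Option<Vec<bool>>;

/// Combine the top-level (parent) validity with the child's validity so that
/// a row is null in the result if it is null in either input.
fn union_nulls(parent: &Validity, child: &Validity) -> Validity {
    match (parent, child) {
        // Neither side has nulls: the result has no null buffer either.
        (None, None) => None,
        // Only one side has nulls: its mask carries over unchanged.
        (Some(v), None) | (None, Some(v)) => Some(v.clone()),
        // Both sides have nulls: a row is valid only if valid in both.
        (Some(p), Some(c)) => {
            assert_eq!(p.len(), c.len(), "validity masks must have equal length");
            Some(p.iter().zip(c).map(|(a, b)| *a && *b).collect())
        }
    }
}
```

With a parent mask of `[true, false, true]`, a child without nulls ends up with the parent's mask, and a child mask of `[true, true, false]` ends up as `[true, false, false]`. Without this step, the child without nulls would be returned as fully valid, which is the bug described above.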

Are these changes tested?

In addition to the existing tests, this adds tests that verify the top-level nulls are applied, both when the child has no nulls and when it does.

Are there any user-facing changes?

Fixes incorrect behavior

@AdamGS AdamGS force-pushed the adamg/perfect-shredded-array-nulls branch from 9397795 to 9792bd9 Compare April 13, 2026 17:13
@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Apr 13, 2026
@AdamGS AdamGS changed the title Variant: Take top-level nulls into consideration when extracting perfectly shredded children [Variant] Take top-level nulls into consideration when extracting perfectly shredded children Apr 13, 2026

@scovich scovich left a comment

LGTM, thanks for the find+fix!

make_array(data)
};

return Some(target_array.clone());

I don't think we need the clone any more?

AdamGS (author) replied:

100% right, IDK when clippy catches these and when it doesn't

Development

Successfully merging this pull request may close these issues.

Perfectly shredded arrays with top-level null values loss nullability when typed_value is extracted