Fix deprecate MutableArrayData::extend and MutableArrayData::extend_nulls in favour of fallible try_extend / try_extend_nulls#9710
Open
HawaiianSpork wants to merge 3 commits intoapache:mainfrom
Conversation
Contributor
|
Have you run any benchmarks against this PR? Also, are there any public API changes? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
MutableArrayData::extendandMutableArrayData::extend_nullsin favour of fallibletry_extend/try_extend_nulls#9709Rationale for this change
MutableArrayData::extendandextend_nullspanic at runtime when offset arithmetic overflows the underlying integer type (e.g. accumulating more than 2 GiB of data in aStringArray) or when the run-end counter overflows in aRunEndEncodedarray. Because these methods return()there is no way for callers to recover from or even detect the failure. This makes it impossible to build robust, panic-free pipelines on top ofMutableArrayData.What changes are included in this PR?
Adds two new methods to
MutableArrayDatainarrow-data:try_extend(index, start, end) -> Result<(), ArrowError>— returns an error instead of panicking on offset overflow. On error, internal buffers are rolled back to their pre-call lengths so the builder is left in a consistent, valid state.try_extend_nulls(len) -> Result<(), ArrowError>— returns an error instead of panicking when the run-end counter overflows in aRunEndEncodedarray.The original
extendandextend_nullsmethods are deprecated with a note pointing to the new alternatives. Their implementations delegate to the new methods andexpecton the result, preserving existing behaviour for code that has not yet migrated.All deprecated call sites within the workspace are updated:
arrow-cast/src/cast/list.rsextend→try_extend,extend_nulls→try_extend_nulls; errors mapped toArrowError::CastErrorarrow-json/src/reader/run_end_array.rsextend→try_extend; errors mapped toArrowError::JsonErrorarrow-select/src/concat.rsextend→try_extendarrow-select/src/filter.rsextend→try_extend;for_eachclosures converted toforloops to allow?arrow-select/src/interleave.rsextend→try_extendarrow-select/src/merge.rsextend→try_extend,extend_nulls→try_extend_nulls;for_eachclosure converted toforlooparrow-select/src/zip.rsextend→try_extend;for_eachclosure converted toforloopparquet/src/arrow/array_reader/fixed_size_list_array.rsextend→try_extend,extend_nulls→try_extend_nulls; errors mapped viageneral_err!parquet/src/arrow/array_reader/list_array.rsextend→try_extend; errors mapped viageneral_err!At each call site errors are surfaced in the type idiomatic for that crate rather than leaking an
InvalidArgumentErrorfrom the transform layer — cast functions returnCastError, JSON reader functions returnJsonError, and parquet reader functions convert toParquetError.Are these changes tested?
The changes are covered by the existing test suites for each affected crate. The new
try_extendandtry_extend_nullsmethods are exercised indirectly through all existing tests that exerciseextendandextend_nulls(since the deprecated methods now delegate to them), as well as through all the call sites updated in this PR.Are there any user-facing changes?
MutableArrayData::extendandMutableArrayData::extend_nullsare deprecated. Callers should migrate totry_extendandtry_extend_nullsrespectively and handle the returnedResult. The deprecated methods continue to compile and behave identically to before (panicking on overflow), so there are no breaking changes.Notes
An LLM was used to make some of the code modifications. All code has been reviewed by a human.