Fix deprecate MutableArrayData::extend and MutableArrayData::extend_nulls in favour of fallible try_extend / try_extend_nulls #9710

Open
HawaiianSpork wants to merge 3 commits into apache:main from relativityone:safe_array_extends
@HawaiianSpork
Contributor

Which issue does this PR close?

Rationale for this change

MutableArrayData::extend and extend_nulls panic at runtime when offset arithmetic overflows the underlying integer type (e.g. accumulating more than 2 GiB of data in a StringArray) or when the run-end counter overflows in a RunEndEncoded array. Because these methods return () there is no way for callers to recover from or even detect the failure. This makes it impossible to build robust, panic-free pipelines on top of MutableArrayData.
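The overflow described above comes down to plain integer arithmetic on the offset type. A minimal sketch, assuming 32-bit offsets as in StringArray; `next_offset` is a hypothetical helper for illustration and is not part of arrow-data:

```rust
/// Hypothetical helper, not part of arrow-data: computes the next i32 offset
/// after appending `value_len` bytes, or `None` if it would not fit.
pub fn next_offset(last: i32, value_len: usize) -> Option<i32> {
    // A single value longer than i32::MAX bytes can never be represented.
    let len = i32::try_from(value_len).ok()?;
    // checked_add yields None on overflow instead of panicking or wrapping.
    last.checked_add(len)
}
```

A method that returns () can only unwrap or expect on such a `None`, which is what turns the overflow into a panic.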

What changes are included in this PR?

Adds two new methods to MutableArrayData in arrow-data:

  • try_extend(index, start, end) -> Result<(), ArrowError> — returns an error instead of panicking on offset overflow. On error, internal buffers are rolled back to their pre-call lengths so the builder is left in a consistent, valid state.
  • try_extend_nulls(len) -> Result<(), ArrowError> — returns an error instead of panicking when the run-end counter overflows in a RunEndEncoded array.
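The rollback behaviour of try_extend can be illustrated with a toy builder. This is only a sketch of the general technique (record buffer lengths, attempt the write, truncate on error), not the code in this PR. `ToyStringBuilder`, its fields, and the i8 offset type are assumptions chosen so that overflow can be reached with about 128 bytes; the `String` error stands in for ArrowError.

```rust
/// Toy stand-in for a variable-length builder; NOT the real MutableArrayData.
/// Offsets are i8 (an assumption for the demo) so overflow needs only ~128 bytes.
pub struct ToyStringBuilder {
    pub offsets: Vec<i8>,
    pub values: Vec<u8>,
}

impl ToyStringBuilder {
    pub fn new() -> Self {
        Self { offsets: vec![0], values: Vec::new() }
    }

    /// Appends source[start..end]; on error, restores the pre-call buffer lengths.
    /// Like slice indexing, this still panics if start..end is out of bounds.
    pub fn try_extend(&mut self, source: &[&str], start: usize, end: usize) -> Result<(), String> {
        let (offsets_len, values_len) = (self.offsets.len(), self.values.len());
        let result = self.extend_inner(&source[start..end]);
        if result.is_err() {
            // Undo any partial writes so the builder stays consistent.
            self.offsets.truncate(offsets_len);
            self.values.truncate(values_len);
        }
        result
    }

    fn extend_inner(&mut self, items: &[&str]) -> Result<(), String> {
        for item in items {
            let last = *self.offsets.last().expect("offsets starts with 0");
            let next = i8::try_from(item.len())
                .ok()
                .and_then(|len| last.checked_add(len))
                .ok_or_else(|| format!("offset overflow after {last}"))?;
            self.values.extend_from_slice(item.as_bytes());
            self.offsets.push(next);
        }
        Ok(())
    }
}
```

In this sketch a failed call may write some items before hitting the overflow; the truncation step is what makes the failure invisible to later calls.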

The original extend and extend_nulls methods are deprecated with a note pointing to the new alternatives. Their implementations delegate to the new methods and call expect on the result, preserving the existing panicking behaviour for code that has not yet migrated.
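The deprecate-and-delegate pattern looks roughly like the following. This is a sketch on a toy run-end counter, not the PR's implementation: `ToyRunEnds`, the i16 run-end type, the note text, and `legacy_extend_nulls` are all hypothetical.

```rust
/// Toy run-end counter; NOT the real RunEndEncoded handling in arrow-data.
pub struct ToyRunEnds {
    pub run_ends: Vec<i16>,
}

impl ToyRunEnds {
    /// Fallible variant: reports run-end overflow as an error.
    pub fn try_extend_nulls(&mut self, len: usize) -> Result<(), String> {
        if len == 0 {
            return Ok(());
        }
        let last = self.run_ends.last().copied().unwrap_or(0);
        let next = i16::try_from(len)
            .ok()
            .and_then(|len| last.checked_add(len))
            .ok_or_else(|| format!("run-end overflow: {last} + {len}"))?;
        self.run_ends.push(next);
        Ok(())
    }

    /// Deprecated variant: delegates and panics on error, as before.
    #[deprecated(note = "use `try_extend_nulls` and handle the returned `Result`")]
    pub fn extend_nulls(&mut self, len: usize) {
        self.try_extend_nulls(len).expect("run-end counter overflowed");
    }
}

/// Un-migrated caller: still compiles, with the warning silenced here.
#[allow(deprecated)]
pub fn legacy_extend_nulls(r: &mut ToyRunEnds, len: usize) {
    r.extend_nulls(len)
}
```

Callers that have not migrated get a compiler warning carrying the note, while the runtime behaviour stays the same.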

All deprecated call sites within the workspace are updated:

| File | Changes |
| --- | --- |
| arrow-cast/src/cast/list.rs | extend → try_extend, extend_nulls → try_extend_nulls; errors mapped to ArrowError::CastError |
| arrow-json/src/reader/run_end_array.rs | extend → try_extend; errors mapped to ArrowError::JsonError |
| arrow-select/src/concat.rs | extend → try_extend |
| arrow-select/src/filter.rs | extend → try_extend; for_each closures converted to for loops to allow ? |
| arrow-select/src/interleave.rs | extend → try_extend |
| arrow-select/src/merge.rs | extend → try_extend, extend_nulls → try_extend_nulls; for_each closure converted to for loop |
| arrow-select/src/zip.rs | extend → try_extend; for_each closure converted to for loop |
| parquet/src/arrow/array_reader/fixed_size_list_array.rs | extend → try_extend, extend_nulls → try_extend_nulls; errors mapped via general_err! |
| parquet/src/arrow/array_reader/list_array.rs | extend → try_extend; errors mapped via general_err! |

At each call site errors are surfaced in the type idiomatic for that crate rather than leaking an InvalidArgumentError from the transform layer — cast functions return CastError, JSON reader functions return JsonError, and parquet reader functions convert to ParquetError.
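The two call-site changes described here, replacing for_each with a for loop so that ? can be used and remapping the error variant, can be sketched as follows. `ToyError`, `try_extend`, and `copy_ranges` are hypothetical stand-ins rather than code from the PR, and the u8 counter takes the place of a real offset buffer.

```rust
/// Hypothetical error type standing in for ArrowError.
#[derive(Debug, PartialEq)]
pub enum ToyError {
    InvalidArgument(String),
    Cast(String),
}

/// Toy transform-layer function: tracks a running length in a u8.
pub fn try_extend(total: &mut u8, start: usize, end: usize) -> Result<(), ToyError> {
    let len = end
        .checked_sub(start)
        .and_then(|n| u8::try_from(n).ok())
        .ok_or_else(|| ToyError::InvalidArgument(format!("bad range {start}..{end}")))?;
    *total = total
        .checked_add(len)
        .ok_or_else(|| ToyError::InvalidArgument("offset overflow".to_string()))?;
    Ok(())
}

/// Cast-layer caller. A `for_each` closure returns (), so `?` cannot be
/// used inside it; a plain `for` loop lets the error propagate.
pub fn copy_ranges(total: &mut u8, ranges: &[(usize, usize)]) -> Result<(), ToyError> {
    for &(start, end) in ranges {
        try_extend(total, start, end).map_err(|e| match e {
            // Surface the failure in the caller's idiomatic variant.
            ToyError::InvalidArgument(msg) => ToyError::Cast(msg),
            other => other,
        })?;
    }
    Ok(())
}
```

The mapping is done at the call site so that the transform layer does not need to know which crate is calling it.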

Are these changes tested?

The changes are covered by the existing test suites for each affected crate. The new try_extend and try_extend_nulls methods are exercised indirectly through all existing tests that exercise extend and extend_nulls (since the deprecated methods now delegate to them), as well as through all the call sites updated in this PR.

Are there any user-facing changes?

MutableArrayData::extend and MutableArrayData::extend_nulls are deprecated. Callers should migrate to try_extend and try_extend_nulls respectively and handle the returned Result. The deprecated methods continue to compile and behave identically to before (panicking on overflow), so there are no breaking changes.

Notes

An LLM was used to make some of the code modifications. All code has been reviewed by a human.

@github-actions bot added the parquet (Changes to the parquet crate) and arrow (Changes to the arrow crate) labels on Apr 14, 2026
@alamb
Contributor

alamb commented Apr 14, 2026

Have you run any benchmarks against this PR?

Also, are there any public API changes?

Development

Successfully merging this pull request may close these issues.

Deprecate MutableArrayData::extend and MutableArrayData::extend_nulls in favour of fallible try_extend / try_extend_nulls