fix: correct accounting in DictEncoder::estimated_memory_size #9720

Open
mzabaluev wants to merge 1 commit into apache:main from mzabaluev:fix-estimated-memory-size-on-dict-encoder

@mzabaluev
Contributor

Which issue does this PR close?

Rationale for this change

The returned value should estimate the actual memory usage. Instead, it reports the encoded size of the dictionary data and omits the memory used by the hash table in the Interner member. The Storage::estimated_memory_size implementation for the unique key storage was also incorrect, but it was unused.

What changes are included in this PR?

Fix both problems: KeyStorage's implementation of estimated_memory_size now returns the size of the allocated uniques vector, and DictEncoder::estimated_memory_size delegates to the interner, which calls the KeyStorage method and adds the accounting for its own data structure.
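
To illustrate the delegation chain described above, here is a minimal sketch. It is not the code from this PR: the struct layouts are simplified stand-ins, the value type is fixed to `i64`, a `std::collections::HashMap` replaces the interner's real hash table, and `dict_encoded_size` is a hypothetical helper added only to contrast encoded size with allocated memory.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Simplified stand-in for the unique key storage (assumption: i64 values only).
struct KeyStorage {
    uniques: Vec<i64>,
}

impl KeyStorage {
    /// Stores a new unique value and returns its dictionary index.
    fn push(&mut self, value: i64) -> usize {
        self.uniques.push(value);
        self.uniques.len() - 1
    }

    /// Memory actually allocated by the vector: capacity, not length.
    fn estimated_memory_size(&self) -> usize {
        self.uniques.capacity() * size_of::<i64>()
    }
}

/// Simplified stand-in for the interner: a dedup table plus the storage.
struct Interner {
    dedup: HashMap<i64, usize>,
    storage: KeyStorage,
}

impl Interner {
    /// Returns the index of `value`, storing it on first sight.
    fn intern(&mut self, value: i64) -> usize {
        if let Some(&key) = self.dedup.get(&value) {
            return key;
        }
        let key = self.storage.push(value);
        self.dedup.insert(value, key);
        key
    }

    /// Storage estimate plus a rough lower bound for the hash table itself.
    fn estimated_memory_size(&self) -> usize {
        let table = self.dedup.capacity() * size_of::<(i64, usize)>();
        self.storage.estimated_memory_size() + table
    }
}

/// Simplified stand-in for the dictionary encoder (buffered indices omitted).
struct DictEncoder {
    interner: Interner,
}

impl DictEncoder {
    fn new() -> Self {
        let storage = KeyStorage { uniques: Vec::new() };
        DictEncoder { interner: Interner { dedup: HashMap::new(), storage } }
    }

    fn put(&mut self, value: i64) {
        self.interner.intern(value);
    }

    fn num_entries(&self) -> usize {
        self.interner.storage.uniques.len()
    }

    /// Hypothetical helper: size of the dictionary once encoded.
    fn dict_encoded_size(&self) -> usize {
        self.num_entries() * size_of::<i64>()
    }

    /// Delegates to the interner instead of reporting the encoded size.
    fn estimated_memory_size(&self) -> usize {
        self.interner.estimated_memory_size()
    }
}
```

In this sketch, after a few distinct values are inserted, `estimated_memory_size` is at least as large as `dict_encoded_size`, because it counts the vector's spare capacity and the hash table entries. A property like this could serve as a starting point for a regression test.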

Are these changes tested?

No. I found no existing tests exercising this method either.

Are there any user-facing changes?

No.

The returned value should estimate the actual memory usage, but
instead it used the evaluation of the encoded size of the dictionary
data, and bypassed the hash table memory usage added by the Interner.
The implementation of Storage::estimated_memory_size for the
unique key storage was also incorrect, but it was unused.
Correct both problems.
@github-actions github-actions bot added the parquet Changes to the parquet crate label Apr 14, 2026
Labels

parquet Changes to the parquet crate

Development

Successfully merging this pull request may close these issues.

Incorrect accounting in DictEncoder::estimated_memory_size