perf: optimize scan hot path — reduce reflection, struct copies, and atomic checks#868
Draft
mykaul wants to merge 1 commit into scylladb:master
…atomic checks

Three optimizations targeting the per-row/per-column overhead in Iter.Scan and iterScanner:

1. isNullableValue() fast path (marshal.go): Add a type-switch for common single-pointer destination types before falling back to reflect.ValueOf + Kind checks. Benchmarks show ~4x speedup (6ns → 1.5ns) for the common case (*string, *int64, etc.) which covers 99%+ of real scan destinations.

2. scanColumn() by pointer + index-based iteration (session.go): Change scanColumn signature from ColumnInfo by-value to *ColumnInfo, and switch from 'for _, col := range' to index-based iteration. Avoids copying the ColumnInfo struct (contains TypeInfo interface, strings) per column per row.

3. Move iter.closed atomic check from per-column to per-row (session.go): The atomic.LoadInt32 in readColumn() ran N times per row (once per column). Since Iter is documented as not safe for concurrent use, checking once at the start of Scan()/Next() is sufficient.

Also fixes a pre-existing bug where iterScanner.Scan used the expanded destination offset to index is.cols[] instead of the raw column index, which would panic when scanning tuple columns through the Scanner API.

Benchmark results (isNullableValue, A/B on same process):
  *string: old=6.1ns new=1.5ns (4.0x faster)
  *int64:  old=5.9ns new=1.5ns (3.9x faster)
  *UUID:   old=6.0ns new=1.5ns (3.9x faster)
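To make item 2 and the tuple bug fix concrete, here is a minimal sketch, not the driver's code. The `column` type, its `Width` field, and `scanRow` are hypothetical stand-ins; the sketch assumes a tuple column expands into one destination per element. It iterates by index, takes a pointer to each element instead of copying it, and keeps the raw column index separate from the expanded destination offset.

```go
package main

import "fmt"

// column is a hypothetical stand-in for per-column metadata.
// Width is the number of destination slots the column fills:
// 1 for a scalar, the element count for an expanded tuple.
type column struct {
	Name  string
	Width int
}

// scanRow copies each column's raw values into dest and returns the
// number of destination slots filled.
func scanRow(cols []column, raw [][]string, dest []*string) (int, error) {
	if len(raw) != len(cols) {
		return 0, fmt.Errorf("got %d raw columns, want %d", len(raw), len(cols))
	}
	offset := 0 // expanded destination offset; grows by Width per column
	for i := range cols {
		col := &cols[i] // pointer into the slice: no struct copy per iteration
		if len(raw[i]) != col.Width {
			return offset, fmt.Errorf("column %q: got %d values, want %d",
				col.Name, len(raw[i]), col.Width)
		}
		if offset+col.Width > len(dest) {
			return offset, fmt.Errorf("column %q: not enough destinations", col.Name)
		}
		for j, v := range raw[i] {
			*dest[offset+j] = v
		}
		// Metadata is always looked up with i. Using cols[offset] instead
		// would run past the end of cols once a tuple has widened offset.
		offset += col.Width
	}
	return offset, nil
}

func main() {
	cols := []column{{"id", 1}, {"pair", 2}, {"name", 1}}
	raw := [][]string{{"7"}, {"a", "b"}, {"bob"}}
	var id, first, second, name string
	n, err := scanRow(cols, raw, []*string{&id, &first, &second, &name})
	fmt.Println(n, err, id, first, second, name) // prints "4 <nil> 7 a b bob"
}
```

In this example the third column is read at raw index 2 while its value lands at destination offset 3; conflating the two is the kind of mistake the fix addresses.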
b68e4a5 to 1fc9b90
Summary
- Add a type-switch fast path to `isNullableValue()` to avoid reflection for common scan destination types (~4x faster)
- Change `scanColumn()` to accept `*ColumnInfo` and use index-based iteration to avoid struct copies per column per row
- Move the `iter.closed` atomic check from per-column (`readColumn`) to per-row (`Scan`/`Next`)
- Fix a pre-existing bug where `iterScanner.Scan` used the wrong index for `is.cols[]` with tuple columns (same fix as "fix: scanner Scan panics with tuple columns due to wrong index" #867)

Benchmark results (A/B comparison, same process)
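For readers who want to run a comparison of this kind themselves, the following is a rough sketch of an in-process A/B harness built on `testing.Benchmark`. The functions `nullableReflect` and `nullableFast` are hypothetical stand-ins, not the code in marshal.go. They assume "nullable" means a pointer-to-pointer destination, and the list of fast-path types is illustrative only. Timings depend on the machine, so no expected numbers are shown.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// nullableReflect is the reflection-only variant (assumed semantics:
// a destination is nullable when it is a pointer to a pointer).
func nullableReflect(v interface{}) bool {
	rv := reflect.ValueOf(v)
	return rv.Kind() == reflect.Ptr && rv.Type().Elem().Kind() == reflect.Ptr
}

// nullableFast answers for common single-pointer types with a type
// switch and falls back to reflection for everything else.
func nullableFast(v interface{}) bool {
	switch v.(type) {
	case *string, *int, *int32, *int64, *float32, *float64, *bool, *[]byte:
		return false // single pointers are never pointer-to-pointer
	}
	return nullableReflect(v)
}

// sink keeps the compiler from discarding the benchmarked calls.
var sink bool

// nsPerOp reports fractional nanoseconds per operation.
func nsPerOp(r testing.BenchmarkResult) float64 {
	return float64(r.T.Nanoseconds()) / float64(r.N)
}

func main() {
	var dest interface{} = new(string)

	oldRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = nullableReflect(dest)
		}
	})
	newRes := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = nullableFast(dest)
		}
	})

	fmt.Printf("*string: old=%.1fns new=%.1fns\n", nsPerOp(oldRes), nsPerOp(newRes))
}
```

Running both variants in one process keeps CPU frequency, cache state, and build flags the same for each side, which is presumably why the PR reports an A/B comparison in the same process.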
Notes
- `Iter` is documented as not safe for concurrent use, so the per-row closed check is semantically equivalent to the previous per-column check
- The `isNullableValue` fast path covers the types returned by `NativeType.NewWithError()`, which represent 99%+ of real scan destinations
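The first note can be illustrated with a small sketch. `rowIter` and its fields are hypothetical and only mimic the shape of an iterator with an atomic closed flag; the `loads` counter exists purely to show how many atomic loads happen. The closed flag is checked once per `Scan` call, however many columns the row has.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rowIter is a hypothetical single-goroutine iterator over string rows.
type rowIter struct {
	closed int32 // set to 1 by Close
	rows   [][]string
	pos    int
	loads  int // number of atomic loads performed, for illustration
}

// isClosed performs the atomic load and counts it.
func (it *rowIter) isClosed() bool {
	it.loads++
	return atomic.LoadInt32(&it.closed) != 0
}

// Scan copies the next row into dest. The closed flag is read once per
// row rather than once per column; because the iterator is not shared
// between goroutines, nothing can close it in the middle of a row.
func (it *rowIter) Scan(dest ...*string) bool {
	if it.isClosed() || it.pos >= len(it.rows) {
		return false
	}
	row := it.rows[it.pos]
	if len(dest) != len(row) {
		return false
	}
	for i := range row {
		*dest[i] = row[i] // no closed check inside the column loop
	}
	it.pos++
	return true
}

// Close marks the iterator as closed.
func (it *rowIter) Close() {
	atomic.StoreInt32(&it.closed, 1)
}

func main() {
	it := &rowIter{rows: [][]string{{"a", "b", "c"}, {"d", "e", "f"}}}
	var x, y, z string
	for it.Scan(&x, &y, &z) {
		fmt.Println(x, y, z)
	}
	// Two successful scans plus the final end-of-rows call.
	fmt.Println("atomic loads:", it.loads) // prints "atomic loads: 3"
}
```

With a per-column check, the same two three-column rows would have cost six loads during scanning instead of two.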