
perf: optimize scan hot path — reduce reflection, struct copies, and atomic checks #868

Draft
mykaul wants to merge 1 commit into scylladb:master from mykaul:mykaul/perf/scan-hot-path-optimize


@mykaul mykaul commented May 2, 2026

Summary

  • Add type-switch fast path in isNullableValue() to avoid reflection for common scan destination types (~4x faster)
  • Change scanColumn() to accept *ColumnInfo and use index-based iteration to avoid struct copies per column per row
  • Move iter.closed atomic check from per-column (readColumn) to per-row (Scan/Next)
  • Fix a pre-existing bug: iterScanner.Scan used the wrong index into is.cols[] when the result contains tuple columns (the same fix as #867, "fix: scanner Scan panics with tuple columns due to wrong index")
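
To make the first bullet concrete, here is a minimal sketch of a type-switch fast path in front of a reflection fallback. The function names, the "pointer to pointer means nullable" rule, and the list of types in the switch are assumptions made for illustration; they are not taken from the diff.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNullableSlow is a reflection-only baseline. It reports whether dest is a
// pointer to a pointer (for example **string). This rule is an assumption
// made for the sketch.
func isNullableSlow(dest interface{}) bool {
	v := reflect.ValueOf(dest)
	return v.Kind() == reflect.Ptr && v.Type().Elem().Kind() == reflect.Ptr
}

// isNullableFast answers the common single-pointer cases with a type switch
// and only falls back to reflection for everything else.
func isNullableFast(dest interface{}) bool {
	switch dest.(type) {
	case *string, *int, *int32, *int64, *float64, *bool, *[]byte:
		return false // a single pointer is never a pointer to a pointer
	}
	return isNullableSlow(dest)
}

func main() {
	var s string
	var ps *string
	fmt.Println(isNullableFast(&s))  // false: *string hits the fast path
	fmt.Println(isNullableFast(&ps)) // true: **string goes through reflection
}
```

The switch must only list types whose answer is known without reflection, so both functions stay interchangeable for every input.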

Benchmark results (A/B comparison, same process)

isNullableValue (common *T path):
  *string:  old=6.1ns  new=1.5ns  (4.0x faster)
  *int64:   old=5.9ns  new=1.5ns  (3.9x faster)
  *UUID:    old=6.0ns  new=1.5ns  (3.9x faster)
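
One way to run such an A/B comparison inside a single process is testing.Benchmark from the standard library. The sketch below is not the benchmark used for the numbers above; oldCheck, newCheck, and bench are hypothetical names, and calling through a function value adds a small constant overhead to both sides.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// sink keeps the compiler from discarding the measured calls.
var sink bool

// oldCheck is a reflection-only stand-in for the pre-change code path.
func oldCheck(dest interface{}) bool {
	v := reflect.ValueOf(dest)
	return v.Kind() == reflect.Ptr && v.Type().Elem().Kind() == reflect.Ptr
}

// newCheck puts a type switch in front of the reflection fallback.
func newCheck(dest interface{}) bool {
	switch dest.(type) {
	case *string, *int64:
		return false
	}
	return oldCheck(dest)
}

// bench measures f(dest) with the standard benchmark driver.
func bench(f func(interface{}) bool, dest interface{}) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = f(dest)
		}
	})
}

func main() {
	var s string
	fmt.Println("old:", bench(oldCheck, &s))
	fmt.Println("new:", bench(newCheck, &s))
}
```

Absolute timings depend on the machine and Go version, so only the ratio between the two lines is meaningful.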

Notes

  • Depends on and includes the fix from #867 ("fix: scanner Scan panics with tuple columns due to wrong index")
  • Iter is documented as not safe for concurrent use, so for correct single-goroutine callers the per-row closed check is semantically equivalent to the per-column one
  • The isNullableValue fast path covers the types returned by NativeType.NewWithError(), which are estimated to account for 99%+ of real scan destinations
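
The effect of moving the closed check can be illustrated with a small sketch that counts atomic loads. The iter type, its fields, and both scan methods are hypothetical stand-ins, not the driver's real code.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errClosed = errors.New("iterator closed")

// iter is a hypothetical stand-in for the driver's iterator.
type iter struct {
	closed int32 // 0 = open, 1 = closed; read with atomic.LoadInt32
	checks int   // number of atomic loads performed, for illustration only
}

func (it *iter) isClosed() bool {
	it.checks++
	return atomic.LoadInt32(&it.closed) != 0
}

// scanPerColumn checks the flag before every column: N loads per row.
func (it *iter) scanPerColumn(cols int) error {
	for i := 0; i < cols; i++ {
		if it.isClosed() {
			return errClosed
		}
		// column i would be decoded here
	}
	return nil
}

// scanPerRow checks the flag once per row, then decodes every column.
func (it *iter) scanPerRow(cols int) error {
	if it.isClosed() {
		return errClosed
	}
	for i := 0; i < cols; i++ {
		// column i would be decoded here
	}
	return nil
}

func main() {
	a, b := &iter{}, &iter{}
	_ = a.scanPerColumn(3)
	_ = b.scanPerRow(3)
	fmt.Println(a.checks, b.checks) // 3 1
}
```

The two variants differ only if another goroutine closes the iterator in the middle of a row, which the documented single-goroutine contract rules out.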

…atomic checks

Three optimizations targeting the per-row/per-column overhead in Iter.Scan
and iterScanner:

1. isNullableValue() fast path (marshal.go):
   Add a type-switch for common single-pointer destination types before
   falling back to reflect.ValueOf + Kind checks. Benchmarks show ~4x
   speedup (6ns → 1.5ns) for the common case (*string, *int64, etc.)
   which covers 99%+ of real scan destinations.

2. scanColumn() by pointer + index-based iteration (session.go):
   Change scanColumn signature from ColumnInfo by-value to *ColumnInfo,
   and switch from 'for _, col := range' to index-based iteration.
   Avoids copying the ColumnInfo struct (contains TypeInfo interface,
   strings) per column per row.

3. Move iter.closed atomic check from per-column to per-row (session.go):
   The atomic.LoadInt32 in readColumn() ran N times per row (once per
   column). Since Iter is documented as not safe for concurrent use,
   checking once at the start of Scan()/Next() is sufficient.
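
A minimal sketch of point 2, assuming a simplified columnInfo struct (the real ColumnInfo fields may differ): ranging by value copies each element into the loop variable and again into the callee, while indexing and passing a pointer moves only an address.

```go
package main

import "fmt"

// columnInfo is a simplified stand-in; field names are assumptions.
type columnInfo struct {
	Keyspace, Table, Name string
	TypeInfo              interface{} // stand-in for an interface-typed field
}

// scanByValue receives a full copy of the struct on every call.
func scanByValue(col columnInfo) int { return len(col.Name) }

// scanByPointer receives only the address of the slice element.
func scanByPointer(col *columnInfo) int { return len(col.Name) }

func totalByValue(cols []columnInfo) int {
	n := 0
	for _, col := range cols { // copies the element into col
		n += scanByValue(col) // and copies it again for the call
	}
	return n
}

func totalByIndex(cols []columnInfo) int {
	n := 0
	for i := range cols {
		n += scanByPointer(&cols[i]) // no struct copy
	}
	return n
}

func main() {
	cols := []columnInfo{{Name: "id"}, {Name: "email"}}
	fmt.Println(totalByValue(cols), totalByIndex(cols)) // 7 7
}
```

The compiler can sometimes remove such copies through inlining, so the gain in practice is best confirmed with a benchmark.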

Also fixes a pre-existing bug where iterScanner.Scan used the expanded
destination offset to index is.cols[] instead of the raw column index,
which would panic when scanning tuple columns through the Scanner API.
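
The distinction between the two indices can be sketched as follows. The column type and the assign function are hypothetical; the point is only that a tuple column consumes several destination slots, so the destination offset runs ahead of the raw column index.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a hypothetical description of one result column.
type column struct {
	Name     string
	TupleLen int // 0 for a scalar column, else the number of tuple elements
}

// width is the number of destination slots the column consumes.
func (c *column) width() int {
	if c.TupleLen > 0 {
		return c.TupleLen
	}
	return 1
}

// assign maps every column to the destination slots it fills.
func assign(cols []column, dest []string) (map[string][]string, error) {
	out := make(map[string][]string, len(cols))
	off := 0 // offset into dest; grows faster than i after a tuple
	for i := range cols {
		col := &cols[i] // index with i; cols[off] can run past the end
		w := col.width()
		if off+w > len(dest) {
			return nil, errors.New("not enough destinations")
		}
		out[col.Name] = dest[off : off+w]
		off += w
	}
	return out, nil
}

func main() {
	cols := []column{{Name: "id"}, {Name: "pair", TupleLen: 2}, {Name: "name"}}
	m, err := assign(cols, []string{"d0", "d1", "d2", "d3"})
	fmt.Println(m["pair"], m["name"], err) // [d1 d2] [d3] <nil>
}
```

In this example the last column has raw index 2 but destination offset 3, so indexing cols with the offset would go out of range.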

Benchmark results (isNullableValue, A/B on same process):
  *string:  old=6.1ns  new=1.5ns  (4.0x faster)
  *int64:   old=5.9ns  new=1.5ns  (3.9x faster)
  *UUID:    old=6.0ns  new=1.5ns  (3.9x faster)
@mykaul mykaul force-pushed the mykaul/perf/scan-hot-path-optimize branch from b68e4a5 to 1fc9b90 May 2, 2026 10:17
@mykaul mykaul marked this pull request as draft May 2, 2026 10:51