perf: optimize scan hot path — reduce reflection, struct copies, and atomic checks by mykaul · Pull Request #868 · scylladb/gocql

mykaul · 2026-05-02T10:15:46Z

Summary

Add type-switch fast path in isNullableValue() to avoid reflection for common scan destination types (~4x faster)
Change scanColumn() to accept *ColumnInfo and use index-based iteration to avoid struct copies per column per row
Move iter.closed atomic check from per-column (readColumn) to per-row (Scan/Next)
Fix pre-existing bug: iterScanner.Scan used the wrong index for is.cols[] with tuple columns (same fix as #867, "fix: scanner Scan panics with tuple columns due to wrong index")
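
For readers unfamiliar with the technique in the first item, here is a minimal sketch of a type-switch fast path in front of a reflection fallback. It is not the code from this PR: the function names are hypothetical, the list of fast-path types is illustrative, and it assumes "nullable" means a pointer-to-pointer destination such as **string.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNullableSlow is a hypothetical reflection-only check. It assumes a
// destination counts as nullable when it is a pointer to a pointer.
func isNullableSlow(dest interface{}) bool {
	v := reflect.ValueOf(dest)
	return v.Kind() == reflect.Ptr && v.Type().Elem().Kind() == reflect.Ptr
}

// isNullableFast answers for common single-pointer destinations with a
// type switch and only falls back to reflection for everything else.
// The case list is illustrative, not the list used by the PR.
func isNullableFast(dest interface{}) bool {
	switch dest.(type) {
	case *string, *int, *int32, *int64, *float32, *float64, *bool, *[]byte:
		return false // single pointer, so never pointer-to-pointer
	}
	return isNullableSlow(dest)
}

func main() {
	var s string
	var ps *string
	fmt.Println(isNullableFast(&s), isNullableFast(&ps)) // false true
}
```

The fast path must return exactly what the fallback would return for the listed types, so both functions stay interchangeable for callers.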

Benchmark results (A/B comparison, same process)

isNullableValue (common *T path):
  *string:  old=6.1ns  new=1.5ns  (4.0x faster)
  *int64:   old=5.9ns  new=1.5ns  (3.9x faster)
  *UUID:    old=6.0ns  new=1.5ns  (3.9x faster)

Notes

Depends on / includes the fix from #867 ("fix: scanner Scan panics with tuple columns due to wrong index")
Iter is documented as not safe for concurrent use, so the per-row closed check is semantically equivalent
The isNullableValue fast path covers the types returned by NativeType.NewWithError(), which represent 99%+ of real scan destinations

…atomic checks

Three optimizations targeting the per-row/per-column overhead in Iter.Scan and iterScanner:

1. isNullableValue() fast path (marshal.go): Add a type-switch for common single-pointer destination types before falling back to reflect.ValueOf + Kind checks. Benchmarks show ~4x speedup (6ns → 1.5ns) for the common case (*string, *int64, etc.) which covers 99%+ of real scan destinations.

2. scanColumn() by pointer + index-based iteration (session.go): Change scanColumn signature from ColumnInfo by-value to *ColumnInfo, and switch from 'for _, col := range' to index-based iteration. Avoids copying the ColumnInfo struct (contains TypeInfo interface, strings) per column per row.

3. Move iter.closed atomic check from per-column to per-row (session.go): The atomic.LoadInt32 in readColumn() ran N times per row (once per column). Since Iter is documented as not safe for concurrent use, checking once at the start of Scan()/Next() is sufficient.

Also fixes a pre-existing bug where iterScanner.Scan used the expanded destination offset to index is.cols[] instead of the raw column index, which would panic when scanning tuple columns through the Scanner API.

Benchmark results (isNullableValue, A/B on same process):
  *string:  old=6.1ns  new=1.5ns  (4.0x faster)
  *int64:   old=5.9ns  new=1.5ns  (3.9x faster)
  *UUID:    old=6.0ns  new=1.5ns  (3.9x faster)

mykaul force-pushed the mykaul/perf/scan-hot-path-optimize branch from b68e4a5 to 1fc9b90 on May 2, 2026 10:17

mykaul marked this pull request as draft May 2, 2026 10:51

mykaul wants to merge 1 commit into scylladb:master from mykaul:mykaul/perf/scan-hot-path-optimize
