Specialize sample for sparse weights
#943
base: master
Conversation
nalimilan
left a comment
Thanks!
Bump.
My gut feeling is that we should address #885 first and then add a specialisation to the SparseArrays extension. By keeping a hard dependency on SparseArrays, StatsBase is holding back large parts of the Julia ecosystem.
Yeah but this method is easy to move to an extension as soon as we create it, and it doesn't make things worse until then. |
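For reference, a rough sketch of how this specialization could later live in such an extension (the extension module name is hypothetical and nothing here is decided in this PR): StatsBase's Project.toml would declare SparseArrays under [weakdeps] and map an extension module to it under [extensions], and the extension file could look roughly like this:

# ext/StatsBaseSparseArraysExt.jl (hypothetical name and location)
module StatsBaseSparseArraysExt

using StatsBase, SparseArrays
using Random: AbstractRNG

# Essentially the method from this PR, just loaded only when SparseArrays is.
function StatsBase.sample(rng::AbstractRNG, wv::Weights{<:Real,<:Real,<:SparseVector})
    i = sample(rng, Weights(nonzeros(wv.values), sum(wv)))
    return SparseArrays.nonzeroinds(wv.values)[i]
end

end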
Co-authored-by: Milan Bouchet-Valat <[email protected]>
d852edf to db126d1
i = sample(rng, Weights(nonzeros(wv.values), sum(wv)))
return rowvals(wv.values)[i]
The code is unsafe - in general AbstractWeights are not required to have a values field. It's just a few AbstractWeights subtypes in StatsBase that have an (undocumented and internal) values field.
So it would actually be better to define this method only for the weights types defined in StatsBase. Probably using:
for W in (AnalyticWeights, FrequencyWeights, ProbabilityWeights, Weights)
    @eval function sample(rng::AbstractRNG, wv::$W{<:Real,<:Real,<:SparseVector})
        ...
(I'm saying this because AFAICT there's no public API which allows accessing the backing array. And anyway I'm not aware of custom AbstractWeights types defined elsewhere, so we don't really care about applying this optimization to them.)
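Spelled out, that suggestion might look roughly like the sketch below (the function name is qualified so the snippet is runnable outside StatsBase; the $W interpolation splices the loop variable into each @eval'd definition, and the body just reuses the logic from the diff above):

using Random, SparseArrays, StatsBase

for W in (AnalyticWeights, FrequencyWeights, ProbabilityWeights, Weights)
    # One method per concrete weights type, restricted to sparse backing vectors.
    @eval function StatsBase.sample(rng::AbstractRNG, wv::$W{<:Real,<:Real,<:SparseVector})
        i = sample(rng, Weights(nonzeros(wv.values), sum(wv)))
        return SparseArrays.nonzeroinds(wv.values)[i]
    end
end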
The tests are insufficient - since the method is implemented for AbstractWeights, to be sure it works not only for Weights, we should test all subtypes implemented in StatsBase as well as a custom subtype of AbstractWeights.
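A hedged sketch of what such tests could look like; MyWeights is a made-up custom subtype mirroring the values/sum field layout of the StatsBase weights types, and the test only checks observable sampling behaviour, so it passes whether a call hits the sparse specialization or the generic fallback:

using Random, SparseArrays, StatsBase, Test

# Made-up custom AbstractWeights subtype used to exercise the non-StatsBase path.
struct MyWeights{S<:Real,T<:Real,V<:AbstractVector{T}} <: AbstractWeights{S,T,V}
    values::V
    sum::S
end
MyWeights(vs::AbstractVector{T}, s::S) where {S<:Real,T<:Real} = MyWeights{S,T,typeof(vs)}(vs, s)
MyWeights(vs::AbstractVector{<:Real}) = MyWeights(vs, sum(vs))

@testset "sample with sparse weights" begin
    w = sparsevec([2, 5, 9], [0.2, 0.5, 0.3], 10)
    rng = MersenneTwister(1234)
    for WT in (Weights, AnalyticWeights, FrequencyWeights, ProbabilityWeights, MyWeights)
        wv = WT(w)
        draws = [sample(rng, wv) for _ in 1:10_000]
        @test all(in((2, 5, 9)), draws)  # only indices with non-zero weight are drawn
        @test isapprox(count(==(5), draws) / length(draws), 0.5; atol = 0.05)
    end
end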
This PR adds a new sample method for sparse weights, as well as tests. It brings the time complexity from O(n) to O(n_nonzero). This would be useful for e.g. top-p sampling, where one might have on the order of 100k tokens to sample from, but only a few are considered.
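As a rough usage sketch of that scenario (the vocabulary size, surviving indices, and probabilities below are made up): put the truncated distribution into a SparseVector, wrap it in Weights, and sample from it.

using SparseArrays, StatsBase

n_tokens = 100_000                     # hypothetical vocabulary size
kept     = [17, 4_203, 58_917]         # indices surviving the top-p cutoff (made up)
probs    = [0.6, 0.3, 0.1]

wv = Weights(sparsevec(kept, probs, n_tokens))

# With the specialization, the draw scales with the 3 stored entries
# rather than with all 100_000 candidate tokens.
token = sample(wv)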
Benchmarks across different sizes and densities
Results
This shows the dense baseline, and the relative performance increase over invoking sample with the generic method for sparse weights.
Benchmark setup
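A hedged sketch of how such a benchmark could be put together with BenchmarkTools; the sizes, densities, and seed below are only examples, not necessarily the setup used for the results above:

using BenchmarkTools, Random, SparseArrays, StatsBase

rng = MersenneTwister(42)
for n in (10, 1_000, 100_000), density in (0.001, 0.01, 0.1)
    sp = sprand(rng, n, density)
    isempty(nonzeros(sp)) && continue        # skip draws that happen to be all zero
    dense_wv  = Weights(Vector(sp))          # dense baseline
    sparse_wv = Weights(sp)                  # dispatches to the sparse specialization
    println("n = $n, density = $density")
    @btime sample($rng, $dense_wv)
    @btime sample($rng, $sparse_wv)
end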
Note: For small vector lengths (~10) and low densities (~0.2) the performance difference becomes noisy and less meaningful. The generic method can sometimes be faster in these cases due to less overhead when it happens to find the target probability mass early in the vector. However, for these small cases the absolute timing differences are negligible (few nanoseconds) and sparse storage isn't really beneficial anyway.
Note: The implementation uses SparseArrays.nonzeroinds, which is not public.