perf(sort): use lookup table to speed up sequence conversion to bitpacked representation #1649
corneliusroemer wants to merge 20 commits into master
Conversation
Looking into sort performance (see #1647), I noticed that the various string containment operations were performance limiting. Instead of doing multiple string containment calls, this PR uses a compile-time lookup table to turn a sequence into a bitpacked representation. A quick benchmark on 240MB of SC2 sequences shows this reduces core sort algorithm runtime (excluding minimizer download) by 42%.
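The lookup-table idea described above can be sketched roughly as follows. This is a minimal illustration only: the 2-bit codes, the names (`LUT`, `pack`), and the invalid-character handling are assumptions for this sketch, not the PR's actual implementation.

```rust
// Minimal sketch of the lookup-table approach. The exact 2-bit codes, names,
// and invalid-character handling are illustrative assumptions, not the PR's
// actual code.

const INVALID: u8 = u8::MAX;

// 256-entry table built at compile time: ASCII byte -> 2-bit nucleotide code.
const LUT: [u8; 256] = {
    let mut t = [INVALID; 256];
    t[b'A' as usize] = 0b00;
    t[b'C' as usize] = 0b01;
    t[b'G' as usize] = 0b10;
    t[b'T' as usize] = 0b11;
    t
};

/// Pack up to 32 nucleotides into a u64, 2 bits per base.
/// A single table lookup per byte replaces repeated string-containment checks.
fn pack(seq: &[u8]) -> Option<u64> {
    let mut packed: u64 = 0;
    for &b in seq {
        let code = LUT[b as usize];
        if code == INVALID {
            return None; // ambiguous base, gap, etc.
        }
        packed = (packed << 2) | code as u64;
    }
    Some(packed)
}

fn main() {
    assert_eq!(pack(b"ACGT"), Some(0b00_01_10_11));
    assert_eq!(pack(b"AN"), None); // 'N' is not in the table
    println!("ok");
}
```

The point of the table is that classifying each byte becomes one array index instead of several substring searches, which is where the claimed speedup would come from.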
```rust
return cutoff + 1; // break out of loop, return hash above cutoff
}

// A=11=3, C=10=2, G=00=0, T=01=1
```
This comment was actually wrong - T and C should have been interchanged.
That's why the initial preview finds no matches (I used the comment rather than the code to implement the lookup table)
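To see why a table derived from the wrong comment yields zero matches, consider this toy comparison. The codes follow the comment above and its T/C-interchanged correction; everything else (names, structure) is illustrative, not the PR's code.

```rust
// Toy comparison, not the PR's code: the same sequence packed with the table
// implied by the (incorrect) comment vs. the T/C-interchanged values the
// review says the code actually uses.
fn pack_with(table: impl Fn(u8) -> u64, seq: &[u8]) -> u64 {
    seq.iter().fold(0u64, |acc, &b| (acc << 2) | table(b))
}

fn main() {
    // As written in the comment: A=3, C=2, G=0, T=1
    let per_comment = |b: u8| match b {
        b'A' => 3,
        b'C' => 2,
        b'G' => 0,
        _ => 1, // T
    };
    // With T and C interchanged: A=3, C=1, G=0, T=2
    let per_code = |b: u8| match b {
        b'A' => 3,
        b'C' => 1,
        b'G' => 0,
        _ => 2, // T
    };

    let seq = b"ACGT";
    // Any sequence containing C or T encodes differently under the two
    // tables, so hashes computed from one never match those from the other.
    assert_ne!(pack_with(per_comment, seq), pack_with(per_code, seq));
    println!("ok");
}
```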
I was requested to review this. Perf improvement sounds nice! (haven't checked numbers though)

I am having difficulties understanding the diff and commit messages. Looks like the work is ongoing? Agent went sentient? :) The PR goes well beyond the "bitpacked representation". Some replaced pieces seem unrelated and don't look equivalent. Will need some more explanation to understand.

It's unclear how to verify correctness. We don't have tests here, so we will have to "big brain" it, and I will need all the help available to do that. Ideally, I'd like bit-twiddling tricks to be isolated rather than spread all over the place, and if it's a well-known algo (most of them are), to be named explicitly.

If some kind of bit container is used, perhaps it could be made into a reusable class? Or perhaps an existing library can be used? (sparing us from edge cases, like all kinds of over-/under-flows, and from maintenance)

Oftentimes bits don't need to be moved directly. The compiler can usually transform well-written, human-readable code into bit hackery itself, usually even better than humans can. This looks like over-optimization and "not invented here" code (typical for LLM code). Though if performance gains are there and correctness is preserved, I would not mind. But I might be a bit grumpy :)

Will wait for some complete work and some comments before re-reviewing. Feel free to ping me or re-request a review when ready.
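The reviewer's suggestion of letting the compiler do the bit work could look something like the sketch below: a plain, readable mapping that modern compilers typically lower to a jump table or branchless arithmetic on their own. The codes and names here are illustrative, not a concrete proposal for this PR.

```rust
// Sketch of the reviewer's alternative: write the mapping as readable Rust
// and let the compiler choose the lowering (often a table or branchless
// arithmetic). Codes are illustrative, not the PR's encoding.
fn nuc_code(b: u8) -> Option<u8> {
    match b {
        b'A' => Some(0),
        b'C' => Some(1),
        b'G' => Some(2),
        b'T' => Some(3),
        _ => None, // ambiguous bases, gaps, etc.
    }
}

fn main() {
    // try_fold stops early at the first non-ACGT byte.
    let packed: Option<u64> = b"GATT"
        .iter()
        .try_fold(0u64, |acc, &b| nuc_code(b).map(|c| (acc << 2) | c as u64));
    assert_eq!(packed, Some(0b10_00_11_11));
    println!("{packed:?}");
}
```

Whether this optimizes as well as a hand-built table would need to be checked with a benchmark, which is presumably the reviewer's point about verifying the gains.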