
perf(castore): generate metadata from bytes in in-mem cache #614

Open
sambhav-jain-16 wants to merge 3 commits into uber:master from sambhav-jain-16:generate-metadata-from-bytes


@sambhav-jain-16
Collaborator

What?

When downloading blobs into the in-memory cache, we generate metadata for each downloaded blob with the same method used for the disk cache.
core.NewMetaInfo creates a 32KB scratch buffer for each piece of the blob, which results in unnecessary allocations even though the blob is already in memory. The allocation shows up in the profile under io.CopyN, which allocates its own copy buffer internally.
Although this accounts for only 0.5% of total allocations, it is a quick win, and further improvements are already planned.
[Image: Screenshot 2026-05-04 at 13 41 05]

This change adds core.NewMetaInfoFromBytes, a byte-slice variant of core.NewMetaInfo that avoids the per-piece 32KB allocation. castore.generateMetadataFromBytes is updated to use it.

Testing

Added a new benchmark, BenchmarkNewMetaInfo, which was run first against core.NewMetaInfo and then against core.NewMetaInfoFromBytes.
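
For readers unfamiliar with how the sec/op, B/s, B/op and allocs/op columns are produced, here is a rough sketch of a benchmark of this kind. It is not the PR's BenchmarkNewMetaInfo: `checksumPieces` and `runBench` are hypothetical stand-ins, and it uses testing.Benchmark so it can run outside `go test`. Calling b.SetBytes yields the throughput figure, and b.ReportAllocs yields the allocation figures.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"testing"
)

// checksumPieces is a hypothetical stand-in for the function under test.
func checksumPieces(data []byte, pieceLength int) []uint32 {
	var sums []uint32
	for off := 0; off < len(data); off += pieceLength {
		end := off + pieceLength
		if end > len(data) {
			end = len(data)
		}
		sums = append(sums, crc32.ChecksumIEEE(data[off:end]))
	}
	return sums
}

// runBench (hypothetical) benchmarks checksumPieces for one blob size
// and piece length, recording throughput and allocation statistics.
func runBench(size, pieceLength int) testing.BenchmarkResult {
	data := make([]byte, size)
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // enables the throughput (B/s) figure
		b.ReportAllocs()        // enables B/op and allocs/op
		for i := 0; i < b.N; i++ {
			checksumPieces(data, pieceLength)
		}
	})
}

func main() {
	r := runBench(1<<20, 256<<10) // 1 MiB blob, 4 pieces
	fmt.Printf("%s\t%s\n", r.String(), r.MemString())
}
```

In a real `_test.go` file the same body would sit inside sub-benchmarks run with `b.Run`, and before/after outputs would then be compared with benchstat, as in the tables below.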

goos: linux
goarch: amd64
pkg: github.com/uber/kraken/core
cpu: AMD EPYC 7B13
                                  │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/before.txt │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/after.txt │
                                  │                                sec/op                                 │                    sec/op                     vs base                │
NewMetaInfo/1MB_4pc-96                                                                       159.95µ ± 2%                                    91.44µ ± 2%  -42.83% (p=0.000 n=50)
NewMetaInfo/16MB_64pc-96                                                                      2.343m ± 1%                                    1.424m ± 1%  -39.22% (p=0.000 n=50)
NewMetaInfo/64MB_256pc-96                                                                    10.920m ± 1%                                    6.080m ± 1%  -44.32% (p=0.000 n=50)
NewMetaInfo/256MB_1024pc-96                                                                   40.96m ± 1%                                    24.20m ± 1%  -40.93% (p=0.000 n=50)
NewMetaInfo/16MB_4pc_4MBpc-96                                                                 2.035m ± 1%                                    1.428m ± 1%  -29.84% (p=0.000 n=50)
NewMetaInfo/16MB_16pc_1MBpc-96                                                                2.121m ± 1%                                    1.397m ± 1%  -34.11% (p=0.000 n=50)
NewMetaInfo/16MB_1024pc_16KBpc-96                                                             5.688m ± 2%                                    1.526m ± 1%  -73.17% (p=0.000 n=50)
geomean                                                                                       3.284m                                         1.788m       -45.56%

                                  │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/before.txt │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/after.txt │
                                  │                                  B/s                                  │                     B/s                      vs base                 │
NewMetaInfo/1MB_4pc-96                                                                       6.105Gi ± 2%                                 10.680Gi ± 2%   +74.93% (p=0.000 n=50)
NewMetaInfo/16MB_64pc-96                                                                     6.668Gi ± 1%                                 10.971Gi ± 1%   +64.53% (p=0.000 n=50)
NewMetaInfo/64MB_256pc-96                                                                    5.723Gi ± 2%                                 10.279Gi ± 1%   +79.59% (p=0.000 n=50)
NewMetaInfo/256MB_1024pc-96                                                                  6.103Gi ± 1%                                 10.332Gi ± 2%   +69.30% (p=0.000 n=50)
NewMetaInfo/16MB_4pc_4MBpc-96                                                                7.679Gi ± 1%                                 10.945Gi ± 1%   +42.53% (p=0.000 n=50)
NewMetaInfo/16MB_16pc_1MBpc-96                                                               7.367Gi ± 1%                                 11.182Gi ± 2%   +51.78% (p=0.000 n=50)
NewMetaInfo/16MB_1024pc_16KBpc-96                                                            2.747Gi ± 1%                                 10.238Gi ± 0%  +272.70% (p=0.000 n=50)
geomean                                                                                      5.801Gi                                       10.65Gi        +83.68%

                                  │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/before.txt │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/after.txt │
                                  │                                 B/op                                  │                         B/op                          vs base        │
NewMetaInfo/1MB_4pc-96                                                                     162.213Ki ± 0%                                           1.053Ki ± 0%  -99.35% (n=50)
NewMetaInfo/16MB_64pc-96                                                                  2087.698Ki ± 0%                                           3.250Ki ± 0%  -99.84% (n=50)
NewMetaInfo/64MB_256pc-96                                                                  8248.18Ki ± 0%                                           11.50Ki ± 0%  -99.86% (n=50)
NewMetaInfo/256MB_1024pc-96                                                               32894.65Ki ± 0%                                           44.50Ki ± 0%  -99.86% (n=50)
NewMetaInfo/16MB_4pc_4MBpc-96                                                              161.412Ki ± 0%                                           1.047Ki ± 0%  -99.35% (n=50)
NewMetaInfo/16MB_16pc_1MBpc-96                                                             546.874Ki ± 0%                                           1.688Ki ± 0%  -99.69% (n=50)
NewMetaInfo/16MB_1024pc_16KBpc-96                                                         16505.61Ki ± 0%                                           44.53Ki ± 0%  -99.73% (n=50)
geomean                                                                                      1.966Mi                                                5.422Ki       -99.73%

                                  │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/before.txt │ /home/user/kraken/bench-results/run-a1-20260504-104049-n50/after.txt │
                                  │                               allocs/op                               │                      allocs/op                        vs base        │
NewMetaInfo/1MB_4pc-96                                                                         38.00 ± 0%                                             21.00 ± 0%  -44.74% (n=50)
NewMetaInfo/16MB_64pc-96                                                                      284.00 ± 0%                                             83.00 ± 0%  -70.77% (n=50)
NewMetaInfo/64MB_256pc-96                                                                     1056.0 ± 0%                                             277.0 ± 0%  -73.77% (n=50)
NewMetaInfo/256MB_1024pc-96                                                                   4.133k ± 0%                                            1.047k ± 0%  -74.67% (n=50)
NewMetaInfo/16MB_4pc_4MBpc-96                                                                  38.00 ± 0%                                             21.00 ± 0%  -44.74% (n=50)
NewMetaInfo/16MB_16pc_1MBpc-96                                                                 89.00 ± 0%                                             34.00 ± 0%  -61.80% (n=50)
NewMetaInfo/16MB_1024pc_16KBpc-96                                                             4.137k ± 0%                                            1.047k ± 0%  -74.69% (n=50)
geomean                                                                                        351.2                                                  120.9       -65.57%
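
A rough sanity check on the B/op table above, under the assumption (not stated in the PR) that the "before" allocations are dominated by one copy buffer per piece. Dividing the "before" B/op figures for the two 1024-piece cases by the piece count gives:

```latex
\frac{32894.65\ \text{KiB}}{1024\ \text{pieces}} \approx 32.12\ \text{KiB per piece}
\qquad
\frac{16505.61\ \text{KiB}}{1024\ \text{pieces}} \approx 16.12\ \text{KiB per piece}
```

The first figure (256MB_1024pc) matches the 32KB scratch buffer described in the PR description. The second (16MB_1024pc_16KBpc) would be consistent with Go's io.Copy shrinking its buffer to the limited reader's remaining byte count when that is smaller than 32KB, which here is the 16KB piece length.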

Copilot AI review requested due to automatic review settings May 4, 2026 11:43
Contributor

Copilot AI left a comment


Pull request overview

This PR optimizes metainfo generation for blobs already resident in the in-memory CA store by adding a byte-slice-specific path in core and switching the memory-cache codepath to use it. It fits into the existing torrent/metainfo pipeline by preserving the same serialized MetaInfo shape while reducing allocation overhead during cache population.

Changes:

  • Add core.NewMetaInfoFromBytes and calcPieceSumsFromBytes to compute piece checksums directly from []byte.
  • Update lib/store/ca_store.go to use the new byte-slice path when generating metainfo for in-memory cache entries.
  • Add correctness tests comparing reader-based vs byte-slice metainfo generation, plus a benchmark for the new path.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| lib/store/ca_store.go | Switches in-memory metainfo generation from bytes.NewReader to the new byte-slice API. |
| core/metainfo.go | Introduces the new []byte metainfo constructor and byte-slice piece checksum helper. |
| core/metainfo_test.go | Adds equivalence tests for the new constructor and benchmarks its performance. |


Comment thread core/metainfo.go Outdated
Comment thread core/metainfo.go Outdated
Comment thread core/metainfo_test.go Outdated
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.



Comment thread core/metainfo.go Outdated
Comment thread core/metainfo_test.go
Comment thread core/metainfo.go
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



sambhav-jain-16 marked this pull request as ready for review May 4, 2026 12:41
Comment thread core/metainfo.go
infoHash: h,
digest: d,
}, nil
return assembleMetaInfo(d, length, pieceSums, pieceLength)
Collaborator


Is a 0.5% performance improvement worth the extra complexity we are introducing from extra code?

Collaborator Author


I think so: the code addition is very small, and it sits in the critical path of downloading the blob and generating its metainfo, so it should benefit us.

Comment thread lib/store/ca_store.go
return nil, fmt.Errorf("new digest from hex: %s", err)
}
- metaInfo, err := core.NewMetaInfo(digest, bytes.NewReader(data), pieceLength)
+ metaInfo, err := core.NewMetaInfoFromBytes(digest, data, pieceLength)
Collaborator


Should we also use it for metainfogen.GenerateFromBuffer, which currently also creates a new reader?

Collaborator Author

sambhav-jain-16 May 7, 2026


I'm planning to remove that function; nothing calls it.

Comment thread core/metainfo_test.go
Comment on lines +164 to +166
if err != nil {
t.Fatal(err)
}
Collaborator


nit: Can we simply use require.NoError(err) instead?

4 participants