perf(castore): generate metadata from bytes in in-mem cache #614
sambhav-jain-16 wants to merge 3 commits into uber:master
Conversation
Pull request overview
This PR optimizes metainfo generation for blobs already resident in the in-memory CA store by adding a byte-slice-specific path in core and switching the memory-cache codepath to use it. It fits into the existing torrent/metainfo pipeline by preserving the same serialized MetaInfo shape while reducing allocation overhead during cache population.
Changes:
- Add `core.NewMetaInfoFromBytes` and `calcPieceSumsFromBytes` to compute piece checksums directly from `[]byte`.
- Update `lib/store/ca_store.go` to use the new byte-slice path when generating metainfo for in-memory cache entries.
- Add correctness tests comparing reader-based vs byte-slice metainfo generation, plus a benchmark for the new path.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `lib/store/ca_store.go` | Switches in-memory metainfo generation from `bytes.NewReader` to the new byte-slice API. |
| `core/metainfo.go` | Introduces the new `[]byte` metainfo constructor and byte-slice piece checksum helper. |
| `core/metainfo_test.go` | Adds equivalence tests for the new constructor and benchmarks its performance. |
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Force-pushed 0db0de4 to 0a384be
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
```diff
 		infoHash: h,
 		digest:   d,
 	}, nil
+	return assembleMetaInfo(d, length, pieceSums, pieceLength)
```
Is a 0.5% performance improvement worth the extra complexity we are introducing from extra code?
I think since the code addition is small and it sits in the critical path of downloading and generating the metainfo, it should benefit us.
```diff
 	return nil, fmt.Errorf("new digest from hex: %s", err)
 	}
-	metaInfo, err := core.NewMetaInfo(digest, bytes.NewReader(data), pieceLength)
+	metaInfo, err := core.NewMetaInfoFromBytes(digest, data, pieceLength)
```
Should we also use it for metainfogen.GenerateFromBuffer, which currently also creates a new reader?
I'm planning to remove that function; nothing calls it.
```go
	if err != nil {
		t.Fatal(err)
	}
```
nit: Can we simply use `require.NoError(t, err)` instead?
What?
While downloading blobs into the in-memory cache, we use the same method as the disk cache to generate metadata for the downloaded blob.

`core.NewMetaInfo` creates a 32KB scratch buffer for each piece of the blob, resulting in unnecessary allocations even though the blob is already in memory. This shows up in `io.CopyN`, which internally copies through that buffer. Although this is only 0.5% of total allocations, it is a quick win, and further improvements are already planned.
This change adds `core.NewMetaInfoFromBytes`, a byte-slice variant of `core.NewMetaInfo` that avoids the per-piece 32KB allocation. `castore.generateMetadataFromBytes` is updated to use it.

Testing
Added a new benchmark, `BenchmarkNewMetaInfo`, which was run first against `core.NewMetaInfo` and then against `core.NewMetaInfoFromBytes`.