feat(rust): add serialize/deserialize support for CAGRA index #1840
Open
zbennett10 wants to merge 3 commits into rapidsai:main from
Add idiomatic Rust bindings for CAGRA index serialization and deserialization, wrapping the existing C API functions:
- `Index::serialize()` - Save index to file with optional dataset
- `Index::serialize_to_hnswlib()` - Save in hnswlib-compatible format
- `Index::deserialize()` - Load index from file

Includes comprehensive test suite:
- Round-trip serialize/deserialize with search verification
- Serialization with and without dataset
- hnswlib format serialization

Closes rapidsai#1479
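To make the wrapping pattern concrete, here is a minimal sketch of how a safe method can sit on top of a C-style serialize call. Everything in it is an assumption for illustration: `mock_c_serialize` is a hypothetical stand-in for a C entry point such as `cuvsCagraSerialize`, and `Index`, `Error`, and `check_status` are toy types that only mimic the shape of the real bindings (the resources handle is omitted). It does not reproduce the code in this PR.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

/// Hypothetical stand-in for a C API function (not the real cuVS symbol).
/// Returns 0 on success and 1 on failure; it fails only for an empty path.
unsafe extern "C" fn mock_c_serialize(filename: *const c_char, _include_dataset: bool) -> c_int {
    // The caller guarantees `filename` points to a NUL-terminated string.
    let path = unsafe { CStr::from_ptr(filename) };
    if path.to_bytes().is_empty() {
        1
    } else {
        0
    }
}

/// Toy error type for this sketch.
#[derive(Debug, PartialEq)]
enum Error {
    /// The filename contained an interior NUL byte.
    InvalidPath,
    /// The C call returned a non-zero status.
    Status(c_int),
}

/// Convert a C status code into a `Result` (assumed convention: 0 means success).
fn check_status(status: c_int) -> Result<(), Error> {
    if status == 0 {
        Ok(())
    } else {
        Err(Error::Status(status))
    }
}

/// Toy index handle; a real binding would own a pointer to the C index.
struct Index;

impl Index {
    /// Borrows the index (`&self`), so the caller can keep using it afterwards.
    fn serialize(&self, filename: &str, include_dataset: bool) -> Result<(), Error> {
        // `CString::new` rejects interior NUL bytes instead of silently truncating.
        let c_filename = CString::new(filename).map_err(|_| Error::InvalidPath)?;
        // `c_filename` outlives the call, so the pointer stays valid throughout.
        let status = unsafe { mock_c_serialize(c_filename.as_ptr(), include_dataset) };
        check_status(status)
    }
}
```

Because `serialize` takes `&self`, a caller can serialize the same `Index` value more than once, or serialize it and then continue using it, without moving it.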
Member: /ok to test b66d204
Member: /ok to test f530475
Member: /ok to test 2c5f58b
Closes #1479
Description
Adds serialization and deserialization support to the Rust CAGRA index bindings. The underlying C API functions (`cuvsCagraSerialize`, `cuvsCagraDeserialize`, `cuvsCagraSerializeToHnswlib`) already exist but were not exposed through the Rust crate.

Methods added to `cagra::Index`:
- `serialize(&self, res, filename, include_dataset)`: Save the CAGRA index to file, with option to include or exclude the dataset
- `serialize_to_hnswlib(&self, res, filename)`: Save the CAGRA index in hnswlib-compatible format
- `deserialize(res, filename) -> Result<Index>`: Load a CAGRA index from file

Design decisions:
- `&self` (not `self`) for serialize methods so the index remains usable after serialization. This differs from the Vamana pattern but is more ergonomic for the common case of serializing a hot index.
- `CString` for safe C string interop with the FFI layer.
- `check_cuvs`.

Tests:
- `test_cagra_serialize_deserialize`: Round-trip test: build → serialize (with dataset) → deserialize → search → verify nearest neighbors match
- `test_cagra_serialize_without_dataset`: Verifies serialization works with `include_dataset=false`
- `test_cagra_serialize_to_hnswlib`: Verifies hnswlib format export produces a non-empty file

Documentation:
- `cagra` module docs in `mod.rs`

Checklist
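For readers unfamiliar with the round-trip test shape described under Tests, the following is a rough sketch of the build → serialize → deserialize → search → compare flow using only the standard library. `ToyIndex`, its file layout, and its brute-force `nearest` search are all hypothetical and stand in for the GPU-backed CAGRA index; they are not part of the cuVS crate and do not reflect its on-disk format.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for an index: it only stores row-major vectors.
#[derive(Debug, PartialEq)]
struct ToyIndex {
    dim: usize,
    data: Vec<f32>,
}

impl ToyIndex {
    /// Write `dim` as a little-endian u64 header followed by little-endian f32 values.
    fn serialize(&self, path: &Path) -> io::Result<()> {
        let mut bytes = (self.dim as u64).to_le_bytes().to_vec();
        for value in &self.data {
            bytes.extend_from_slice(&value.to_le_bytes());
        }
        fs::write(path, bytes)
    }

    /// Read back a file produced by `serialize`, rejecting malformed lengths.
    fn deserialize(path: &Path) -> io::Result<ToyIndex> {
        let bytes = fs::read(path)?;
        if bytes.len() < 8 || (bytes.len() - 8) % 4 != 0 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "malformed index file"));
        }
        let (header, body) = bytes.split_at(8);
        let mut dim_bytes = [0u8; 8];
        dim_bytes.copy_from_slice(header);
        let dim = u64::from_le_bytes(dim_bytes) as usize;
        let data = body
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Ok(ToyIndex { dim, data })
    }

    /// Exact nearest neighbor by squared L2 distance; returns the row index.
    fn nearest(&self, query: &[f32]) -> Option<usize> {
        if self.dim == 0 {
            return None;
        }
        self.data
            .chunks_exact(self.dim)
            .enumerate()
            .map(|(i, row)| {
                let dist: f32 = row.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
                (i, dist)
            })
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(i, _)| i)
    }
}
```

A round-trip check then builds a `ToyIndex`, serializes it to a temporary path, deserializes it, and asserts that `nearest` returns the same row for the same query on both the original and the loaded index.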