raft 26.02.00 (4 Feb 2026)

🚨 Breaking Changes

Use CCCL's mdspan implementation by @bdice in #2836
Default to static linking of libcudart by @bdice in #2890
Remove neighbors/, cluster/, distance/, spatial/, sparse/neighbors/ apis by @aamijar in #2885
Remove cutlass and cuco dependencies by @divyegala in #2916

🐛 Bug Fixes

Include <thrust/for_each.h> where it is used by @bdice in #2883
Include CTest module in CMakeLists.txt by @bdice in #2895
Fix Lanczos Determinism by @aamijar in #2894
Change compile-time assertion to runtime assertion on is_strided by @bdice in #2909
Set memory pool through RMM by @viclafargue in #2866

📖 Documentation

New readme image by @aamijar in #2907
Readme improvements by @aamijar in #2906

🚀 New Features

Tile Policy for Uint8 Input (Pairwise) by @tarang-jain in #2770
Add copy_vectorized to RAFT by @lowener in #2900

🛠️ Improvements

Use strict priority in CI conda tests by @bdice in #2879
Use strict priority in CI conda tests by @bdice in #2884
Remove alpha specs from non-RAPIDS dependencies by @bdice in #2886
Enable merge barriers by @KyleFromNVIDIA in #2889
Fix is_exhaustive, no longer constexpr by @bdice in #2888
Add devcontainer fallback for C++ test location by @bdice in #2893
eigsh optional seed by @aamijar in #2899
Empty commit to trigger a build by @bdice in #2904
Update to C++20 by @divyegala in #2908
Use SPDX license identifiers in pyproject.toml, bump build dependency floors by @jameslamb in #2910
Remove neighbors/detail/faiss_select by @aamijar in #2902
Remove sparse/distance by @aamijar in #2905
Add CUDA 13.1 support by @bdice in #2896
Fix CCCL 3.2 mdspan constexpr issues by @bdice in #2911
build and test against CUDA 13.1.0 by @jameslamb in #2912
Laplacian Kernel for COO inputs by @aamijar in #2891
Empty commit to trigger a build by @jameslamb in #2919
Use main shared-workflows branch by @jameslamb in #2921
Fix update-version.sh incorrectly replacing main() function names by @AyodeAwe in #2923
Lanczos remove dead code by @aamijar in #2918
wheel builds: react to changes in pip's handling of build constraints by @mmccarty in #2927
fix(build): build package on merge to release/* branch by @gforsyth in #2929

New Contributors

@mmccarty made their first contribution in #2927

Full Changelog: https://github.com/rapidsai/raft/compare/v26.02.00a...release/26.02

raft 25.12.00 (10 Dec 2025)

🚨 Breaking Changes

More consistent container policies & host memory resource by @achirkin in #2835
Require CUDA 12.2+ by @jakirkham in #2850

🐛 Bug Fixes

Correct tagging in the irecv function of the STD communicator by @viclafargue in #2829
Fix copyright hook file exclusion by @KyleFromNVIDIA in #2840
Properly guard usage of openmp function calls by @robertmaynard in #2839
Fix reduce mdspan API by @lowener in #2853
Fix for STD comm waitall function by @viclafargue in #2852
Pin Cython pre-3.2.0 and PyTest pre-9 by @jakirkham in #2864
refactored update-version.sh to handle new branching strategy by @rockhowse in #2863
Fix laplacian scaling coefficients by @aamijar in #2871
Revert "Remove Deprecated API (#2813)" by @csadorf in #2881

📖 Documentation

Use current system architecture in conda environment creation command by @bdice in #2862

🚀 New Features

BENCH_PRIMS: convenience reporting of benchmark parameters and read throughput by @achirkin in #2824

🛠️ Improvements

Update to rapids-logger 0.2 by @bdice in #2828
Enable sccache-dist connection pool by @trxcllnt in #2837
Use main in RAPIDS_BRANCH by @bdice in #2842
Use main shared-workflows branch by @bdice in #2844
Use SPDX for all copyright headers by @KyleFromNVIDIA in #2845
Use ruff-check, ruff-format instead of black, flake8, isort by @KyleFromNVIDIA in #2855
Remove shims for CCCL < 3.1 compatibility by @bdice in #2858
Always convert warnings to errors by @jakirkham in #2857
Lanczos Solver with COO input and cusparse wrappers by @aamijar in #2851
COO support in sparse matrix utilities by @aamijar in #2861
Update RMM includes from <rmm/mr/device/*> to <rmm/mr/*> by @bdice in #2867
Use sccache-dist build cluster for conda and wheel builds by @trxcllnt in #2859
Remove Deprecated API by @jnke2016 in #2813

New Contributors

@rockhowse made their first contribution in #2863

Full Changelog: https://github.com/rapidsai/raft/compare/v25.12.00a...release/25.12

raft 25.10.00 (8 Oct 2025)

🐛 Bug Fixes

Workaround for an illegal memory access on SM 120 devices (#2821) @achirkin
Fix sparse select_k: don't write beyond min(input_len, k) (#2814) @achirkin
[BUG] Fix compilation error in matrix/detail/gather.cuh (#2811) @enp1s0
Fix select_k for negative bfloat16 (#2799) @apivovarov
Fix index types for coo kernels (#2793) @aamijar
Fix the GEMM pointer mode setting (#2777) @achirkin
Fix host_vector_policy issue (#2739) @viclafargue

📖 Documentation

Fix UCX-Py mention to UCXX in docstring (#2804) @pentschev

🚀 New Features

Update cutlass to a version that supports CUDA 13 (#2774) @robertmaynard

🛠️ Improvements

Fix missed deps in update-version.sh (#2826) @AyodeAwe
Empty commit to trigger a build (#2816) @msarahan
Make warpsort kernels use the IEEE 754 bit representation for ordering (#2807) @achirkin
Configure repo for automatic release notes generation (#2806) @AyodeAwe
Support < 2 element arrays in rand_index/adjusted_rand_index (#2805) @jcrist
update dependencies: use cuda-toolkit wheels (#2802) @jameslamb
Use branch-25.10 again (#2800) @jameslamb
Remove CMake find UCX package (#2798) @pentschev
use dask-cuda[cu12, cu13] extras for wheel dependencies (#2797) @jameslamb
Remove UCX-Py (#2791) @pentschev
Update rapids-dependency-file-generator (#2790) @KyleFromNVIDIA
Build and test with CUDA 13.0.0 (#2787) @jameslamb
Fix template arg passing in adjusted_rand_index (#2785) @jinsolp
Use build cluster in devcontainers (#2781) @trxcllnt
Use rapids_cuda_enable_fatbin_compression (#2780) @robertmaynard
Increase Dask tests verbosity in CI (#2779) @pentschev
Update rapids_config to handle user defined branch name (#2778) @robertmaynard
[REVIEW] Fix: skip default_allocation_limit() if unnecessary (#2775) @i-Pear
Update rapids-build-backend to 0.4.1 (#2773) @KyleFromNVIDIA
ci(labeler): update labeler action to @v5 (#2772) @gforsyth
Register bfloat16/bfloat162 in util/vectorized.cuh (#2769) @apivovarov
Use mdspan::index_type to Only Instantiate Specific Kernels (#2767) @tarang-jain
Allow latest OS in devcontainers (#2759) @bdice
Update build infra to support new branching strategy (#2751) @robertmaynard
Use GCC 14 in conda builds. (#2708) @vyasr
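
Note on #2807 above: the entry refers to ordering floating-point values by their IEEE 754 bit representation. The sketch below is a generic Python illustration of that well-known idea, not RAFT's warpsort code. The function name `float32_to_ordered_bits` is hypothetical, and the exact mapping used in the PR is an assumption. The mapping turns a float32 bit pattern into an unsigned integer whose integer order matches the numeric order, with -0.0 placed just before +0.0.

```python
import struct


def float32_to_ordered_bits(x):
    """Map a float32 value to an unsigned 32-bit key that sorts in numeric order."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    if bits & 0x80000000:
        # Negative values: flip every bit so larger magnitudes sort first.
        return bits ^ 0xFFFFFFFF
    # Non-negative values: set the sign bit so they sort after all negatives.
    return bits | 0x80000000


values = [3.5, -1.0, 0.0, -0.0, float("inf"), float("-inf"), 1e-40, -2.25]
ordered = sorted(values, key=float32_to_ordered_bits)
```

Sorting by the integer key gives the same order as a numeric sort for these inputs, and it gives -0.0 and +0.0 distinct, deterministic positions, which a plain floating-point comparison does not.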

raft 25.08.00 (6 Aug 2025)

🚨 Breaking Changes

MatrixLinewiseOp compile-time-invocation (#2701) @aamijar
Remove CUDA 11 from dependencies.yaml (#2695) @KyleFromNVIDIA
stop uploading packages to downloads.rapids.ai (#2688) @jameslamb
Reduce instantiations of Reduction kernels (#2679) @divyegala

🐛 Bug Fixes

Fix stream sync for Copy2DAsync test (#2744) @lowener
Several small fixes to make Raft compile with LLVM. (#2735) @vitor1001
Add missing header (#2734) @vitor1001
[REVIEW] Fix static initialization order fiasco in lanczos.cu (#2733) @legrosbuffle
[REVIEW] Fix assertion in fill_indices_by_rows_kernel. (#2732) @legrosbuffle
libucx: consider post-releases in wheel builds (#2729) @jameslamb
Fix laplacian cast (#2725) @aamijar
Fix excess_subsample (#2723) @mfoerste4
Fix the constructor for coordinate_structure for non-zero nnz. (#2717) @legrosbuffle
[REVIEW] Fix compile error when using mdbuffer with all-static extents. (#2716) @legrosbuffle
Fix unsafe cast coo_remove_scalar (#2713) @aamijar
Fix laplacian self-loops (#2712) @aamijar
[REVIEW] Fix a few memory leaks. (#2710) @legrosbuffle
Fix MST bug for graph with identical edge weights (#2707) @jnke2016
Missed update accounting for reduction related APIs (#2704) @divyegala
Adding GH_TOKEN pass-through to summarize job (#2702) @msarahan
Work around Cython ctypedef bug (#2686) @vyasr

📖 Documentation

add docs on CI workflow inputs (#2728) @jameslamb

🛠️ Improvements

An additional small change to remove cuda 11 stuff (#2763) @cjnolet
Removing CUDA 11 from docs and code (#2757) @cjnolet
fix(docker): use versioned -latest tag for all rapidsai images (#2745) @gforsyth
Update protocol name for UCX-Py tests (#2743) @pentschev
Remove sphinx upper bound (#2742) @bdice
remove cuspatial references, avoid triggering tests on clang-format config changes (#2740) @jameslamb
MST Edge Case (#2736) @tarang-jain
Add missing #include <cassert> in cpp/include/raft/core/math.hpp (#2730) @trxcllnt
Update leftover CUDA 12.8 to 12.9 in docs (#2724) @jakirkham
Fix docs lanczos solver (#2722) @aamijar
Use CUDA 12.9 in Conda, Devcontainers, Spark, GHA, etc. (#2721) @jakirkham
Remove nvidia and dask channels (#2720) @vyasr
[REVIEW] Fix compile error of abs_op when compiling with clang (#2718) @legrosbuffle
Avoid using internal method std::experimental::details::alignTo(). (#2714) @vitor1001
refactor(shellcheck): fix all remaining warnings/errors (#2703) @gforsyth
MatrixLinewiseOp compile-time-invocation (#2701) @aamijar
Remove pytest pin (#2699) @vyasr
Fix several issues that break LLVM (#2698) @vitor1001
Remove CUDA 11 from dependencies.yaml (#2695) @KyleFromNVIDIA
Remove CUDA 11 devcontainers and update CI scripts (#2690) @bdice
refactor(rattler): remove cuda11 options and general cleanup (#2689) @gforsyth
stop uploading packages to downloads.rapids.ai (#2688) @jameslamb
fix(devcontainers): typo in container name (#2687) @gforsyth
Reduce instantiations of Reduction kernels (#2679) @divyegala
Forward-merge branch-25.06 into branch-25.08 (#2675) @divyegala
Add support for F16 in linalg::transpose (#2672) @enp1s0
Forward-merge branch-25.06 into branch-25.08 (#2664) @gforsyth
Support coo_matrix in coo_symmetrize and coo_remove_scalar (#2662) @aamijar
Lanczos Solver which=SA,SM,LA,LM argument (#2628) @aamijar

raft 25.06.00 (5 Jun 2025)

🚨 Breaking Changes

Decoupling multi gpu resources from nccl usage (#2647) @jinsolp

🐛 Bug Fixes

NCCL comm resource fix (#2692) @viclafargue
Fix the launch bounds for nn-descent kernel for 1210 and remove nn-descent tests (#2691) @viclafargue
Prefer host gather when dataset is available both on host and device (#2671) @tfeher
Fix warnings treated as errors downstream in cuVS (#2644) @achirkin
Fix nccl_comm.hpp warning: #83-D: type qualifier specified more than once (#2643) @achirkin
NVTX: null destination pointer warning-treated-as-error (#2639) @achirkin
Add UCXX and NCCL to libraft conda recipe (#2636) @divyegala
Fix building cutlass (#2619) @miscco
Fix COO symmetrization (#2582) @viclafargue

🚀 New Features

[Feat] add cudaMemcpy2DAsync wrapper (#2674) @rhdong
Python wrapper for device_resources_snmg (#2666) @jinsolp
Laplacian normalization primitives (#2648) @aamijar
[FEA] Matrix shift rows and columns (#2634) @jinsolp
Use NCCL wheels from PyPI for CUDA 12 builds (#2629) @divyegala
Support strided matrix view as an input to matrix::samples_rows (#2626) @enp1s0
[Feat] add support for bm25 and tfidf (#2567) @jperez999

🛠️ Improvements

use 'rapids-init-pip' in wheel CI, other CI changes (#2677) @jameslamb
Dask 2025.4.1 compatibility (#2673) @TomAugspurger
Finish CUDA 12.9 migration and use branch-25.06 workflows (#2669) @bdice
Update to clang 20 (#2665) @bdice
Quote head_rev in conda recipes (#2660) @bdice
CUDA 12.9 use updated compression flags (#2657) @robertmaynard
Build and test with CUDA 12.9.0 (#2655) @bdice
Exclude librmm.so from auditwheel (#2654) @bdice
Fix cub include in normalize.cuh (#2652) @lowener
Add support for Python 3.13 (#2649) @gforsyth
Decoupling multi gpu resources from nccl usage (#2647) @jinsolp
[BUGFIX] Fixed quoting in wheel paths in pylibraft and raft_dask wheel tests (#2645) @VenkateshJaya
Download build artifacts from Github for CI (#2640) @VenkateshJaya
Limit allowed wheel sizes (#2638) @divyegala
Remove CUDA whole compilation ODR violations (#2633) @divyegala
refactor(rattler): enable strict channel priority for builds (#2632) @gforsyth
Vendor RAPIDS.cmake (#2631) @bdice
Replace Thrust iterator facilities with libcu++ ones (#2627) @miscco
Port all conda recipes to rattler-build (#2623) @gforsyth
Add missing thrust include (#2618) @miscco
Moving wheel builds to specified location and uploading build artifacts to Github (#2617) @VenkateshJaya
Fixed pytest marker warnings by removing unused pytest.ini (#2591) @TomAugspurger
Introduction of the raft::device_resources_snmg type (#2549) @viclafargue
Create a NCCL sub-communicator using ncclCommSplit (#2495) @seunghwak

raft 25.04.00 (9 Apr 2025)

🚨 Breaking Changes

Account for cugraph API breakage (#2581) @divyegala
Use new rapids-logger library (#2566) @vyasr

🐛 Bug Fixes

Backport build patch fix (#2620) @KyleFromNVIDIA
Revert "Temporarily increase max_days_without_success (#2602)" (#2613) @divyegala
Relax max duplicates in batched NN Descent (#2610) @jinsolp
[Fix] Lanczos solver gemv fix (#2607) @aamijar
[Fix] select-k-csr failure on CUDA11.x + H100 (#2604) @rhdong
Temporarily increase max_days_without_success (#2602) @divyegala
Swap blocks and threads_per_block in compute_graph_laplacian (#2597) @jcrist
[BUG] Fix illegal memory access in linalg::reduction (#2592) @enp1s0
Require sphinx<8.2.0 (#2590) @KyleFromNVIDIA
Account for cugraph API breakage (#2581) @divyegala
#include <numeric> for std::iota (#2578) @benfred
Fix Laplacian calculation in spectral partitioning (#2568) @wphicks
Take argument by const& as the input range is const (#2558) @miscco
Allow some of the sparse utility functions to handle larger matrices (#2541) @viclafargue

🛠️ Improvements

ci: pre-filter 11.4 jobs before they are enabled in shared workflows (#2608) @gforsyth
Use conda-build instead of conda-mambabuild (#2595) @bdice
Replace cub::Sum and cub::Max with cuda::std::plus and cuda::maximum (#2594) @miscco
Update all conda_build_config.yamls RAPIDS UCX version (#2589) @jakirkham
Drop cub::TransformInputIterator in favor of thrust::transform_iterator (#2588) @miscco
Consolidate more Conda solves in CI (#2587) @KyleFromNVIDIA
Fix duplicate indices in batch NN Descent (#2586) @jinsolp
Require CMake 3.30.4 (#2584) @robertmaynard
Create Conda CI test env in one step (#2580) @KyleFromNVIDIA
Use shared-workflows branch-25.04 (#2576) @bdice
Add shellcheck to pre-commit and fix warnings (#2575) @gforsyth
Add build_type input field for test.yaml (#2573) @gforsyth
Use rapids-pip-retry in CI jobs that might need retries (#2571) @gforsyth
Avoid limited memory adaptor issue in balanced KMeans (#2570) @csadorf
update telemetry and retarget 25.04 (#2569) @msarahan
Use new rapids-logger library (#2566) @vyasr
disallow fallback to Make in Python builds (#2563) @jameslamb
Forward-merge branch-25.02 into branch-25.04 (#2561) @bdice
Migrate to NVKS for amd64 CI runners (#2559) @bdice
Add verify-codeowners hook (#2557) @KyleFromNVIDIA

raft 25.02.00 (13 Feb 2025)

🚨 Breaking Changes

Update pip devcontainers to UCX 1.18 (#2550) @jameslamb
Switch over to rapids-logger (#2530) @vyasr
Adapt to rmm logger changes (#2513) @vyasr

🐛 Bug Fixes

Rename test to tests. (#2546) @bdice
Fix bit order of RMAT Rectangular Generator to match expectation (#2542) @mfoerste4
Fix broken link to python doc (#2537) @lowener
Fix lanczos solver integer overflow (#2536) @viclafargue
Fix rnd bit generation in rmat_rectangular_kernel (#2524) @tfeher

📖 Documentation

Fix docs builds (#2562) @bdice
[DOC] Fix sample codes (#2518) @enp1s0

🚀 New Features

Add cuda 12.8 support (#2551) @robertmaynard
Add support for different data type of bitset (#2535) @lowener
[Feat] Support bitset_to_csr (#2523) @rhdong
Remove upper bounds on cuda-python to allow 12.6.2 and 11.8.5 (#2517) @bdice

🛠️ Improvements

Revert CUDA 12.8 shared workflow branch changes (#2560) @vyasr
Build and test with CUDA 12.8.0 (#2555) @bdice
Update pip devcontainers to UCX 1.18 (#2550) @jameslamb
use dynamic CUDA wheels on CUDA 11 (#2548) @jameslamb
Normalize whitespace (#2547) @bdice
Use cuda.bindings layout. (#2545) @bdice
Revert "Introduction of the raft::device_resources_snmg type (#2487)" (#2543) @cjnolet
Add missing #include <cstdint> (#2540) @jakirkham
Use GCC 13 in CUDA 12 conda builds. (#2539) @bdice
Use rapids-cmake for the logger (#2534) @vyasr
Check if nightlies have succeeded recently enough (#2533) @vyasr
remove unused 'joblib' and 'numba' dependencies, other packaging cleanup (#2532) @jameslamb
introduce libraft wheels (#2531) @jameslamb
Switch over to rapids-logger (#2530) @vyasr
reduce duplication, removed unused things in dependencies.yaml (#2529) @jameslamb
Update cuda-python lower bounds to 12.6.2 / 11.8.5 (#2522) @bdice
[Opt] Optimizing the performance of bitmap_to_csr (#2516) @rhdong
prefer system install of UCX in devcontainers, update outdated RAPIDS references (#2514) @jameslamb
Adapt to rmm logger changes (#2513) @vyasr
Require approval to run CI on draft PRs (#2512) @bdice
Shrink wheel size limit following removal of vector search APIs. (#2509) @bdice
Forward-merge branch-24.12 to branch-25.02 (#2508) @bdice
Introduction of the raft::device_resources_snmg type (#2487) @viclafargue
Add breaking change workflow trigger (#2482) @AyodeAwe
Remove 'sample' parameter from stats::mean API (#2389) @mfoerste4

raft 24.12.00 (11 Dec 2024)

🚨 Breaking Changes

Do not initialize the pinned mdarray at construction time (#2478) @achirkin

🐛 Bug Fixes

Skip gtests for new lanczos solver when CUDA version is 11.4 or below. (#2520) @cjnolet
Switch assert to static_assert (#2510) @divyegala
Revert use of new Lanczos solver in spectral clustering (#2507) @lowener
Put a ceiling on cuda-python (#2486) @bdice
Don't presume pointers location infers usability. (#2480) @robertmaynard
Use Python for sccache hit rate computation. (#2474) @bdice
Allow compilation with CUDA 12.6.1 (#2469) @robertmaynard

🚀 New Features

[FEA] Lanczos solver v2 (#2481) @lowener

🛠️ Improvements

Skip gtests for Rmat Lanczos tests with cuda <= 11.4 (#2525) @benfred
Upgrade to latest cutlass version (#2503) @vyasr
Removing some left over places where implicit instantiations were being ignored in headers (#2501) @cjnolet
Remove leftover template project code. (#2500) @bdice
2412 remove libraft vss instantiations (#2498) @cjnolet
Remove raft-ann-bench (#2497) @cjnolet
Pin FAISS Version for raft-ann-bench (#2496) @tarang-jain
enforce wheel size limits and README formatting in CI, put a ceiling on Cython dependency (#2490) @jameslamb
Do not initialize the pinned mdarray at construction time (#2478) @achirkin
Use environment variables in cache hit rate computation. (#2475) @bdice
devcontainer: replace VAULT_HOST with AWS_ROLE_ARN (#2472) @jjacobelli
print sccache stats in builds (#2470) @jameslamb
make package installations in CI stricter (#2467) @jameslamb
Prune workflows based on changed files (#2466) @KyleFromNVIDIA
Merge branch-24.10 into branch-24.12 (#2461) @jameslamb
Update all rmm imports to use pylibrmm/librmm (#2451) @Matt711

raft 24.10.00 (9 Oct 2024)

🚨 Breaking Changes

[Feat] add repeat, sparsity, eval_n_elements APIs to bitset (#2439) @rhdong

🐛 Bug Fixes

Disable NN Descent Batch tests temporarily (#2453) @divyegala
Fix sed syntax in update-version.sh (#2441) @raydouglass
Use runtime check of cudart version for eig (#2430) @lowener
[BUG] Fix bitset function visibility (#2429) @lowener
Exclude any kernel symbol that uses cutlass (#2425) @robertmaynard

🚀 New Features

[Feat] add repeat, sparsity, eval_n_elements APIs to bitset (#2439) @rhdong
[Opt] Enforce the UT Coverity and add benchmark for transpose (#2438) @rhdong
[FEA] Support for half-float mixed precision in brute-force (#2382) @rhdong

🛠️ Improvements

bump NCCL floor to 2.19 (#2458) @jameslamb
Deprecating vector search APIs and updating README accordingly (#2448) @cjnolet
Update update-version.sh to use packaging lib (#2447) @AyodeAwe
Switch traceback to native (#2446) @galipremsagar
bump NCCL floor to 2.18.1.1 (#2443) @jameslamb
Add missing cuda_suffixed: true (#2440) @trxcllnt
Use CI workflow branch 'branch-24.10' again (#2437) @jameslamb
Update to flake8 7.1.1. (#2435) @bdice
Update fmt (to 11.0.2) and spdlog (to 1.14.1). (#2433) @jameslamb
Allow coo_sort to work on int64_t indices (#2432) @benfred
Adding NCCL clique to the RAFT handle (#2431) @viclafargue
Add support for Python 3.12 (#2428) @jameslamb
Update rapidsai/pre-commit-hooks (#2420) @KyleFromNVIDIA
Drop Python 3.9 support (#2417) @jameslamb
Use CUDA math wheels (#2415) @KyleFromNVIDIA
Remove NumPy <2 pin (#2414) @seberg
Update pre-commit hooks (#2409) @KyleFromNVIDIA
Improve update-version.sh (#2408) @bdice
Use tool.scikit-build.cmake.version, set scikit-build-core minimum-version (#2406) @jameslamb
[FEA] Batching NN Descent (#2403) @jinsolp
Update pip devcontainers to UCX v1.17.0 (#2401) @jameslamb
Merge branch-24.08 into branch-24.10 (#2397) @jameslamb

raft 24.08.00 (7 Aug 2024)

🚨 Breaking Changes

[Refactor] move popc to under util (#2394) @rhdong
[Opt] Expose the detail::popc as public API (#2346) @rhdong

🐛 Bug Fixes

Add timeout to UCXX generic operations (#2398) @pentschev
[Fix] bitmap set/test issue (#2371) @rhdong
Fix 0 recall issue in raft_cagra_hnswlib ANN benchmark (#2369) @divyegala
Fix ef setting in HNSW wrapper (#2367) @divyegala
Fix cagra graph opt bug (#2365) @enp1s0
Fix a bug where the wrong API is used to free the memory (#2361) @PointKernel
Allow anonymous user in devcontainer name (#2355) @bdice
Fix compilation error when _CLK_BREAKDOWN is defined in cagra. (#2350) @jiangyinzuo
ensure raft-dask wheel tests install pylibraft wheel from the same CI run, fix wheel dependencies (#2349) @jameslamb
Change --config-setting to --config-settings (#2342) @KyleFromNVIDIA
Add workaround for syevd in CUDA 12.0 (#2332) @lowener

🚀 New Features

[FEA] add the support of masked_matmul (#2362) @rhdong
[FEA] Dice Distance for Dense Inputs (#2359) @aamijar
[Opt] Expose the detail::popc as public API (#2346) @rhdong
Enable distance return for NN Descent (#2345) @jinsolp

🛠️ Improvements

[Refactor] move popc to under util (#2394) @rhdong
split up CUDA-suffixed dependencies in dependencies.yaml (#2388) @jameslamb
Use workflow branch 24.08 again (#2385) @KyleFromNVIDIA
Add cusparseSpMV_preprocess to cusparse wrapper (#2384) @Kh4ster
Consolidate SUM reductions (#2381) @mfoerste4
Use slicing kernel to copy distances inside NN Descent (#2380) @jinsolp
Build and test with CUDA 12.5.1 (#2378) @KyleFromNVIDIA
Add CUDA_STATIC_MATH_LIBRARIES (#2376) @KyleFromNVIDIA
skip CMake 3.30.0 (#2375) @jameslamb
Use verify-alpha-spec hook (#2373) @KyleFromNVIDIA
Binarize Dice Distance for Dense Inputs (#2370) @aamijar
[FEA] Add distance epilogue for NN Descent (#2364) @jinsolp
resolve dependency-file-generator warning, other rapids-build-backend followup (#2360) @jameslamb
Remove text builds of documentation (#2354) @vyasr
Use default init in reduction (#2351) @akifcorduk
ensure update-version.sh preserves alpha spec, add tests on version constants (#2344) @jameslamb
remove unnecessary 'setuptools' dependencies (#2343) @jameslamb
Use rapids-build-backend (#2331) @KyleFromNVIDIA
Add FAISS with RAFT enabled Benchmarking to raft-ann-bench (#2026) @tarang-jain

raft 24.06.00 (5 Jun 2024)

🚨 Breaking Changes

Rename raft-ann-bench module to raft_ann_bench (#2333) @KyleFromNVIDIA
Scaling workspace resources (#2322) @achirkin
[REVIEW] Adjust UCX dependencies (#2304) @pentschev
Convert device_memory_resource* to device_async_resource_ref (#2269) @harrism

🐛 Bug Fixes

Fix import of VERSION file in raft-ann-bench (#2338) @KyleFromNVIDIA
Rename raft-ann-bench module to raft_ann_bench (#2333) @KyleFromNVIDIA
Support building faiss main statically (#2323) @robertmaynard
Refactor spectral scale_obs to use existing normalization function (#2319) @ChuckHastings
Correct initializer list order found by cuvs (#2317) @robertmaynard
ANN_BENCH: enable move semantics for configured_raft_resources (#2311) @achirkin
Revert "Build C++ wheel (#2264)" (#2305) @vyasr
Revert "Add compile-library by default on pylibraft build" (#2300) @vyasr
Add VERSION to raft-ann-bench package (#2299) @KyleFromNVIDIA
Remove nonexistent job from workflow (#2298) @vyasr
libucx should be run dependency of raft-dask (#2296) @divyegala
Fix clang intrinsic warning (#2292) @aaronmondal
Replace too long index file name with hash in ANN bench (#2280) @tfeher
Fix build command for C++ compilation (#2270) @lowener
Fix a compilation error in CAGRA when enabling log output (#2262) @enp1s0
Correct member initialization order (#2254) @robertmaynard
Fix time computation in CAGRA notebook (#2231) @lowener

📖 Documentation

Fix citation info (#2318) @enp1s0

🚀 New Features

Scaling workspace resources (#2322) @achirkin
ANN_BENCH: AnnGPU::uses_stream() for optional algo GPU sync (#2314) @achirkin
[FEA] Split Bitset code (#2295) @lowener
[FEA] support of prefiltered brute force (#2294) @rhdong
Always use a static gtest and gbench (#2265) @robertmaynard
Build C++ wheel (#2264) @vyasr
InnerProduct Distance Metric for CAGRA search (#2260) @tarang-jain
[FEA] Add support for select_k on CSR matrix (#2140) @rhdong

🛠️ Improvements

ANN_BENCH: common AnnBase::index_type (#2315) @achirkin
ANN_BENCH: split instances of RaftCagra into multiple files (#2313) @achirkin
ANN_BENCH: a global pool of result buffers across benchmark cases (#2312) @achirkin
Remove the shared state and the mutex from NVTX internals (#2310) @achirkin
docs: update README.md (#2308) @eltociear
[REVIEW] Reenable raft-dask wheel tests requiring UCX-Py (#2307) @pentschev
[REVIEW] Adjust UCX dependencies (#2304) @pentschev
Overhaul ops-codeowners (#2303) @raydouglass
Make thrust nosync execution policy the default thrust policy (#2302) @abc99lr
InnerProduct testing for CAGRA+HNSW (#2297) @divyegala
Enable warnings as errors for Python tests (#2288) @mroeschke
Normalize dataset vectors in the CAGRA InnerProduct tests (#2287) @enp1s0
Use dynamic version for raft-ann-bench (#2285) @KyleFromNVIDIA
Make 'librmm' a 'host' dependency for conda packages (#2284) @jameslamb
Fix comments in cpp/include/raft/neighbors/cagra_serialize.cuh (#2283) @jiangyinzuo
Only use functions in the limited API (#2282) @vyasr
define 'ucx' pytest marker (#2281) @jameslamb
Migrate to {{ stdlib("c") }} (#2278) @hcho3
add --rm and --name to devcontainer run args (#2275) @trxcllnt
Update pip devcontainers to UCX v1.15.0 (#2274) @trxcllnt
#ifdef out pragma deprecation warning messages (#2271) @trxcllnt
Convert device_memory_resource* to device_async_resource_ref (#2269) @harrism
Update the developer's guide with new copyright hook (#2266) @KyleFromNVIDIA
Improve coalesced reduction performance for tall and thin matrices (up to 2.6x faster) (#2259) @Nyrio
Adds missing files to update-version.sh (#2255) @AyodeAwe
Enable all tests for arm64 jobs (#2248) @galipremsagar
Update nvtx3 link in cmake (#2246) @lowener
Add CAGRA-Q subspace dim = 4 support (#2244) @enp1s0
Get rid of cuco::sentinel namespace (#2243) @PointKernel
Replace usages of raw get_upstream with get_upstream_resource() (#2207) @miscco
Set the import mode for dask tests (#2142) @vyasr
Add UCXX support (#1983) @pentschev

raft 24.04.00 (10 Apr 2024)

🐛 Bug Fixes

Update pre-commit-hooks to v0.0.3 (#2239) @KyleFromNVIDIA
MAINT: Simplify NCCL worker rank identification (#2228) @VibhuJawa
Fix bug in blockRankedReduce (#2226) @akifcorduk
Fix illegal access in mean/stdev, sum; add Kahan Summation (#2223) @mfoerste4
Batch cutlass distance kernels along N matrix dim (#2215) @mdoijade
Fix out of bounds access in sum kernel (#2183) @tfeher
Fix ANN bench ground truth generation for k>1024 (#2180) @tfeher
Fixing cusparse aligned address issue and adding note (#2179) @cjnolet
Launch neighborhood_recall kernel on CUDA stream (#2156) @divyegala
Add compile-library by default on pylibraft build (#2090) @lowener
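
Note on #2223 above: the entry mentions Kahan summation. As a minimal sketch of the general technique (plain Python, not RAFT's CUDA kernels; the function names and the test data are illustrative assumptions), compensated summation carries a running estimate of the rounding error and feeds it back into the next addition:

```python
import math


def kahan_sum(values):
    """Sum floats with Kahan (compensated) summation."""
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    for x in values:
        y = x - compensation            # correct the next term by the known error
        t = total + y                   # low-order bits of y can be lost here
        compensation = (t - total) - y  # what was actually added minus what was intended
        total = t
    return total


def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for x in values:
        total += x
    return total


# One large term followed by many terms too small to change it individually.
values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
compensated = kahan_sum(values)
reference = math.fsum(values)  # correctly rounded sum from the standard library
```

With this input the naive loop never moves off 1.0, because each 1e-16 term is below half the spacing between adjacent doubles near 1.0, while the compensated sum stays close to the `math.fsum` reference. The naive loop is written out by hand because the built-in `sum` may itself apply compensation in recent Python versions.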

📖 Documentation

Adding cuVS notice to README and front page of docs. (#2224) @cjnolet

🚀 New Features

Add CAGRA-Q to ANN benchmarks (#2233) @achirkin
Add CAGRA-Q build (compression) (#2213) @achirkin
CAGRA-Q search (#2206) @enp1s0
Demangle backtrace symbols on raft error (#2188) @achirkin
Reapply: Support for fp16 in CAGRA and IVF-PQ (#2172) @achirkin
Remove supports_streams from custom RAFT memory resources (#2121) @harrism
[FEA] Add support for bitmap_view & the API of bitmap_to_csr (#2109) @rhdong

🛠️ Improvements

Use conda env create --yes instead of --force (#2247) @bdice
Align ucx version pinning with ucx-py/ucxx. (#2227) @bdice
Add upper bound to prevent usage of NumPy 2 (#2222) @bdice
Performance optimization of IVF-flat / select_k (#2221) @mfoerste4
Replace local copyright check with pre-commit-hooks verify-copyright (#2220) @KyleFromNVIDIA
Remove hard-coding of RAPIDS version where possible (#2219) @KyleFromNVIDIA
Fix style. (#2214) @bdice
Add explicit instantiations for IVF-PQ search kernels used in tests (#2212) @tfeher
Improve RBC eps-neighborhood query performance (#2211) @mfoerste4
Add test for spmm (#2210) @mfoerste4
Only install necessary components in conda packages. (#2209) @bdice
Automate C++ include file grouping and ordering using clang-format (#2202) @harrism
Add support for Python 3.11, require NumPy 1.23+ (#2200) @jameslamb
Pass std::optional instead of thrust::optional to RMM (#2199) @trxcllnt
Update devcontainers to CUDA Toolkit 12.2 (#2192) @trxcllnt
target branch-24.04 for GitHub Actions workflows (#2189) @jameslamb
Fixing workaround for cuSPARSE bug with correct copy dimensions (#2185) @mfoerste4
Allow topk larger than 1024 in CAGRA (#2181) @benfred
IVF-FLAT support k > 256 (#2169) @mfoerste4
Add environment-agnostic scripts for running ctests and pytests (#2165) @trxcllnt
Ensure that ctest is called with --no-tests=error. (#2163) @bdice
Update ops-bot.yaml (#2158) @AyodeAwe
random sampling of dataset rows with improved memory utilization (#2155) @tfeher
[FIX] Ensure hnswlib can be found from RAFT's build dir (#2145) @trxcllnt
Improve analysis experience for ANN benchmarks (#2139) @achirkin
Enable CAGRA index building without adding dataset to the index (#2126) @tfeher
Add fused cosine 1-NN cutlass based kernel (#2125) @mdoijade
Update raft for compatibility with the latest cuco (#2118) @PointKernel
Support CUDA 12.2 (#2092) @jameslamb
Cache IVF-PQ and select-warpsort kernel launch parameters to reduce latency (#1786) @achirkin

raft 24.02.00 (12 Feb 2024)

🚨 Breaking Changes

Switch to scikit-build-core (#2051) @vyasr
Update to CCCL 2.2.0. (#2049) @bdice
Update raft-ann-bench output filenames and add features to plotting (#2043) @divyegala
Remove selection_faiss (#2027) @benfred

🐛 Bug Fixes

fix is_row/col_order for strided layouts (#2173) @mfoerste4
Fix failing C++ tests and revert #2097, #2085. (#2168) @cjnolet
Exclude tests from builds (#2162) @vyasr
[HOTFIX] 24.02 Revert Random Sampling (#2144) @cjnolet
Pin to pytest 7. (#2137) @bdice
Conditionally include hnsw wrapper source in CMake (#2135) @divyegala
[BUG] Fix SPMM strided view (#2124) @lowener
Fixing small bug in CUSPARSE spmm w/ CUDA 12.2 (#2117) @cjnolet
[BUG] Fix num_cta_per_query div (#2107) @lowener
Remove extraneous host pinnings from libraft-headers-only. (#2102) @bdice
Remove unneeded CI symbol excludes (#2098) @robertmaynard
Properly taking ownership of nccl subcomm (and destroying it) (#2094) @cjnolet
Fix max_queries for CAGRA (#2081) @lowener
Fix compile failure on RTX 4090 (#2076) @JieFengWang
Fix a crash in FAISS benchmark wrapper introduced in #2021 (#2062) @achirkin
Correct function that wasn't returning a value (#2045) @robertmaynard
Fixing small bug in raft-ann-bench (#2041) @cjnolet
Make device_resources accessed from device_resources_manager thread-safe (#2030) @wphicks
Fix ann-bench multithreading (#2021) @achirkin
Fix ci/checks/copyright.py to mirror RAPIDS reference (#2008) @divyegala
Fix pyproject versions (#2002) @vyasr

📖 Documentation

Adding license info for wiki-all dataset (#2129) @cjnolet
[DOC] Documentation updates for release 24.02 (#2093) @cjnolet
Fix errors with ingroup exposed by doxygen 1.10 (#2079) @wphicks
Fix a typo (#2070) @narangvivek10
Add usage example for brute_force::build (#2029) @benfred
Add filtering to vector search tutorial (#1996) @lowener

🚀 New Features

Update to use rapids-cmake for all deps (#2096) @robertmaynard
Add IVF-PQ example into the template project (#2091) @achirkin
Support for fp16 in CAGRA and IVF-PQ (#2085) @achirkin
Add random subsampling for IVF methods (#2077) @tfeher
Update raft-ann-bench output filenames and add features to plotting (#2043) @divyegala
Add brute_force index serialization (#2036) @wphicks
Add eps-neighbor search via RBC (#2028) @mfoerste4
libraft and pylibraft API for CAGRA build and HNSW search (#2022) @divyegala
Export Pareto frontier in raft-ann-bench.data_export (#2009) @divyegala
Implement maybe-owning multi-dimensional container (mdbuffer) (#1999) @wphicks
Add support for 1024+ dim vectors in CAGRA search (#1994) @enp1s0
Replace GEMM backend: cublas.gemm -> cublaslt.matmul (#1736) @achirkin

🛠️ Improvements

Remove get_mem_info functions from RAFT custom memory resources (#2108) @harrism
Replace call to mr::get_mem_info() (#2099) @harrism
Allow topk larger than 1024 in CAGRA (#2097) @benfred
Remove usages of rapids-env-update (#2095) @KyleFromNVIDIA
Provide explicit pool size for pool_memory_resources and clean up includes (#2088) @harrism
refactor CUDA versions in dependencies.yaml (#2086) @jameslamb
ANN bench fix latency measurement overhead (#2084) @tfeher
Remove hardcoded limit in print_results function (#2080) @narangvivek10
[FEA] Add support for SDDMM by wrapping the cusparseSDDMM (#2067) @rhdong
Benchmark brute force knn (#2063) @benfred
[BUG] fix empty initialization of device_ndarray in pylibraft (#2061) @mfoerste4
Improve parallelism of refine host (#2059) @anaruse
Subsampling for IVF-PQ codebook generation (#2052) @abc99lr
Switch to scikit-build-core (#2051) @vyasr
Update to CCCL 2.2.0. (#2049) @bdice
Use cuda::proclaim_return_type on device lambda. (#2048) @bdice
Removing code that explicitly compares equality of rmm memory resources (#2047) @cjnolet
Add public enum for select-k algorithm selection (#2046) @benfred
Update dependencies.yaml to new pip index (#2042) @vyasr
Remove RAFT_BUILD_WHEELS and standardize Python builds (#2040) @vyasr
Fix ucx-py version pinning in dependencies.yaml. (#2035) @bdice
[REVIEW] Fix typos in parameter tuning guide (#2034) @abc99lr
Add AIR-Top-k reference (#2031) @tfeher
Remove selection_faiss (#2027) @benfred
Fixing json parse error in raft-ann-bench.data_export (#2025) @cjnolet
Updating cagra build constraint (#2016) @cjnolet
Update to fmt 10.1.1 and spdlog 1.12.0. (#1957) @bdice
Enable host dataset for IVF-Flat (#1635) @tfeher
add half/bfloat support to myInf and abs (#1592) @Kh4ster

raft 23.12.00 (6 Dec 2023)

🐛 Bug Fixes

Update actions/labeler to v4 (#2037) @raydouglass
pylibraft only depends on numpy at runtime, not build time. (#2013) @bdice
Fixes to update-version.sh (#1991) @raydouglass
Adjusting end-to-end start time so it doesn't include stream creation time (#1989) @cjnolet
CAGRA graph optimizer: clamp rev_graph_count (#1987) @tfeher
Catching conversion errors in data_export instead of fully failing (#1979) @cjnolet
Fix syncing mechanism in raft-ann-bench C++ search (#1961) @divyegala
Fixing hnswlib in latency mode (#1959) @cjnolet
Fix ucx-py alpha version update for raft-dask (#1953) @divyegala
Reduce NN Descent test threshold (#1946) @divyegala
Fixes to new YAML config raft-bench-ann (#1945) @divyegala
Set RNG seeds in NN Descent to diagnose flaky tests (#1931) @divyegala
Fix FAISS CPU algorithm names in raft-ann-bench (#1916) @divyegala
Increase iterations in NN Descent tests to avoid flakiness (#1915) @divyegala
Fix filepath in raft-ann-bench/split_groundtruth module (#1911) @divyegala
Remove dynamic entry-points from raft-ann-bench (#1910) @benfred
Remove unnecessary dataset path check in ANN bench (#1908) @tfeher
Fixing Googletests and re-enabling in CI (#1904) @cjnolet
Fix NN Descent overflows (#1875) @divyegala
Build fix for CUDA 12.2 (#1870) @benfred
[BUG] Fix a bug in NN descent (#1869) @enp1s0

📖 Documentation

Brute Force Index documentation fix (#1944) @lowener
Add wiki_all dataset config and documentation. (#1918) @cjnolet
Updates to raft-ann-bench docs (#1905) @cjnolet
End-to-end vector search tutorial in docs (#1776) @cjnolet

🚀 New Features

Adding dry-run option to raft-ann-bench (#1970) @cjnolet
Add ANN bench scripts to generate ground truth (#1967) @tfeher
CAGRA build + HNSW search (#1956) @divyegala
Verify conda-cpp-post-build-checks (#1935) @robertmaynard
Make all cuda kernels have hidden visibility (#1898) @robertmaynard
Update rapids-cmake functions to non-deprecated signatures (#1884) @robertmaynard
[FEA] Helpers for identifying contiguous layouts. (#1861) @trivialfis
Add raft::stats::neighborhood_recall (#1860) @divyegala
[FEA] Helpers and CodePacker for IVF-PQ (#1826) @tarang-jain

🛠️ Improvements

Pinning fmt and spdlog for raft-ann-bench-cpu (#2018) @cjnolet
Build concurrency for nightly and merge triggers (#2011) @bdice
Using EXPORT_SET in rapids_find_package_root (#2006) @cjnolet
Remove static checks for serialization size (#1997) @cjnolet
Skipping bad json parse (#1990) @cjnolet
Update select-k heuristic (#1985) @benfred
ANN bench: use different offset for each thread (#1981) @tfeher
Allow raft-ann-bench/run to continue after encountering bad YAML configs (#1980) @divyegala
Add build and search params to raft-ann-bench.data_export CSVs (#1971) @divyegala
Use new rapids-dask-dependency metapackage for managing dask versions (#1968) @galipremsagar
Remove unused header (#1960) @wphicks
Adding pool back in and fixing cagra benchmark params (#1951) @cjnolet
Add constraints to hnswlib in raft-bench-ann (#1949) @divyegala
Add support for iterating over batches in bfknn (#1947) @benfred
Fix ANN bench latency (#1940) @tfeher
Add YAML config files to run parameter sweeps for ANN benchmarks (#1929) @divyegala
Relax ucx pinning (#1927) @vyasr
Try using contiguous rank to fix cuda_visible_devices (#1926) @VibhuJawa
Unpin dask and distributed for 23.12 development (#1925) @galipremsagar
Adding throughput and latency modes to raft-ann-bench (#1920) @cjnolet
Providing aarch64 yaml environment files (#1914) @cjnolet
CAGRA ANN bench: parse build options for IVF-PQ build algo (#1912) @tfeher
Fix python script location in ANN bench description (#1906) @tfeher
Refactor install/build guide. (#1899) @cjnolet
Check return values of raft-ann-bench subprocess calls (#1897) @benfred
ANN bench options to specify CAGRA graph and dataset locations (#1896) @cjnolet
Add check-json to pre-commit linters, and fix invalid ann-bench JSON config (#1894) @benfred
Use branch-23.12 workflows. (#1886) @bdice
Setup Consistent Nightly Versions for Pip and Conda (#1880) @divyegala
Fix and improve one-block radix select (#1878) @yong-wang
[FEA] Improvements on bitset class (#1877) @lowener
Branch 23.12 merge 23.10 (#1873) @AyodeAwe
Branch 23.12 merge 23.10 (#1868) @cjnolet
Replace raft::random calls to not use deprecated API (#1867) @lowener
raft: Build CUDA 12.0 ARM conda packages. (#1853) @bdice
Documentation for raft ANN benchmark containers. (#1833) @dantegd
[FEA] Support vector deletion in ANN IVF (#1831) @lowener
Provide a raft::copy overload for mdspan-to-mdspan copies (#1818) @wphicks
Adding FAISS cpu to raft-ann-bench (#1814) @cjnolet

raft 23.10.00 (11 Oct 2023)

🚨 Breaking Changes

Change CAGRA auto mode selection (#1830) @enp1s0
Update CAGRA serialization (#1755) @benfred
Improvements to ANN Benchmark Python scripts and docs (#1734) @divyegala
Update to Cython 3.0.0 (#1688) @vyasr
ANN-benchmarks: switch to use gbench (#1661) @achirkin

🐛 Bug Fixes

[BUG] Fix a bug in the filtering operation in CAGRA multi-kernel (#1862) @enp1s0
Fix conf file for benchmarking glove datasets (#1846) @dantegd
raft-ann-bench package fixes for plotting and conf files (#1844) @dantegd
Fix update-version.sh for all pyproject.toml files (#1839) @raydouglass
Make RMM a run dependency of the raft-ann-bench conda package (#1838) @dantegd
Printing actual exception in require base set (#1816) @cjnolet
Adding rmm to raft-ann-bench dependencies (#1815) @cjnolet
Use conda mambabuild not mamba mambabuild (#1812) @bdice
Fix raft-dask naming in wheel builds (#1805) @divyegala
neighbors::refine_host: check the dataset bounds (#1793) @achirkin
[BUG] Fix search parameter check in CAGRA (#1784) @enp1s0
IVF-Flat: fix search batching (#1764) @achirkin
Using expanded distance computations in pylibraft (#1759) @cjnolet
Fix ann-bench Documentation (#1754) @divyegala
Make get_cache_idx a weak symbol with dummy template (#1733) @ahendriksen
Fix IVF-PQ fused kernel performance problems (#1726) @achirkin
Fix build.sh to enable NEIGHBORS_ANN_CAGRA_TEST (#1724) @enp1s0
Fix template types for create_descriptor function. (#1680) @csadorf

📖 Documentation

Fix the CAGRA paper citation (#1788) @enp1s0
Add citation info for the CAGRA paper preprint (#1787) @enp1s0
[DOC] Fix grouping for ANN in C++ doxygen (#1782) @lowener
Update RAFT documentation (#1717) @lowener
Additional polishing of README and docs (#1713) @cjnolet

🚀 New Features

[FEA] Add bitset_filter for CAGRA indices removal (#1837) @lowener
ann-bench: miscellaneous improvements (#1808) @achirkin
[FEA] Add bitset for ANN pre-filtering and deletion (#1803) @lowener
Adding config files for remaining (relevant) ann-benchmarks million-scale datasets (#1761) @cjnolet
Port NN-descent algorithm to use in cagra::build() (#1748) @divyegala
Adding conda build for libraft static (#1746) @cjnolet
[FEA] Provide device_resources_manager for easy generation of device_resources (#1716) @wphicks

🛠️ Improvements

Add option to brute_force index to store non-owning reference to norms (#1865) @benfred
Pin dask and distributed for 23.10 release (#1864) @galipremsagar
Update image names (#1835) @AyodeAwe
Fixes for OOM during CAGRA benchmarks (#1832) @benfred
Change CAGRA auto mode selection (#1830) @enp1s0
Update to clang 16.0.6. (#1829) @bdice
Add IVF-Flat C++ example (#1828) @tfeher
matrix::select_k: extra tests and benchmarks (#1821) @achirkin
Add index class for brute_force knn (#1817) @benfred
[FEA] Add pre-filtering to CAGRA (#1811) @enp1s0
More updates to ann-bench docs (#1810) @cjnolet
Add best deep-100M configs for IVF-PQ to ANN benchmarks (#1807) @tfeher
A few fixes to raft-ann-bench recipe and docs (#1806) @cjnolet
Simplify wheel build scripts and allow alphas of RAPIDS dependencies (#1804) @divyegala
Various fixes to reproducible benchmarks (#1800) @cjnolet
ANN-bench: more flexible cuda_stub.hpp (#1792) @achirkin
Add RAFT devcontainers (#1791) @trxcllnt
Cagra memory optimizations (#1790) @benfred
Fixing a couple security concerns in raft-dask nccl unique id generation (#1785) @cjnolet
Don't serialize dataset with CAGRA bench (#1781) @benfred
Use copy-pr-bot (#1774) @ajschmidt8
Add GPU and CPU packages for ANN benchmarks (#1773) @dantegd
Improvements to raft-ann-bench scripts, docs, and benchmarking implementations. (#1769) @cjnolet
[REVIEW] Introducing host API for PCG (#1767) @vinaydes
Unpin dask and distributed for 23.10 development (#1760) @galipremsagar
Add ivf-flat notebook (#1758) @tfeher
Update CAGRA serialization (#1755) @benfred
Remove block size template parameter from CAGRA search (#1740) @enp1s0
Add NVTX ranges for cagra search/serialize functions (#1737) @benfred
Improvements to ANN Benchmark Python scripts and docs (#1734) @divyegala
Fixing forward merger for 23.08 -> 23.10 (#1731) @cjnolet
[FEA] Use CAGRA in C++ template (#1730) @lowener
fixed box around raft image (#1710) @nwstephens
Enable CUTLASS-based distance kernels on CTK 12 (#1702) @ahendriksen
Update bench-ann configuration (#1696) @lowener
Update to Cython 3.0.0 (#1688) @vyasr
Update CMake version (#1677) @vyasr
Branch 23.10 merge 23.08 (#1672) @vyasr
ANN-benchmarks: switch to use gbench (#1661) @achirkin

raft 23.08.00 (9 Aug 2023)

🚨 Breaking Changes

Separate CAGRA index type from internal idx type (#1664) @tfeher
Stop using setup.py in build.sh (#1645) @vyasr
CAGRA max_queries auto configuration (#1613) @enp1s0
Rename the CAGRA prune function to optimize (#1588) @enp1s0
CAGRA pad dataset for 128bit vectorized load (#1505) @tfeher
Sparse Pairwise Distances API Updates (#1502) @divyegala
Cagra index construction without copying device mdarrays (#1494) @tfeher
[FEA] Masked NN for connect_components (#1445) @tarang-jain
Limiting workspace memory resource (#1356) @achirkin

🐛 Bug Fixes

Remove push condition on docs-build (#1693) @raydouglass
IVF-PQ: Fix illegal memory access with large max_samples (#1685) @achirkin
Fix missing parameter for select_k (#1682) @ucassjy
Separate CAGRA index type from internal idx type (#1664) @tfeher
Add rmm to pylibraft run dependencies, since it is used by Cython. (#1656) @bdice
Hotfix: wrong constant in IVF-PQ fp_8bit2half (#1654) @achirkin
Fix sparse KNN for large batches (#1640) @viclafargue
Fix uploading of RAFT nightly packages (#1638) @dantegd
Fix cagra multi CTA bug (#1628) @enp1s0
pass correct stream to cutlass kernel launch of L2/cosine pairwise distance kernels (#1597) @mdoijade
Fix launchconfig y-gridsize too large in epilogue kernel (#1586) @mfoerste4
Fix update version and pinnings for 23.08. (#1556) @bdice
Fix for function exposing KNN merge (#1418) @viclafargue

📖 Documentation

Critical doc fixes and updates for 23.08 (#1705) @cjnolet
Fix the documentation about changing the logging level (#1596) @enp1s0
Fix raft::bitonic_sort small usage example (#1580) @enp1s0

🚀 New Features

Use rapids-cmake new parallel testing feature (#1623) @robertmaynard
Add support for row-major slice (#1591) @lowener
IVF-PQ tutorial notebook (#1544) @achirkin
[FEA] Masked NN for connect_components (#1445) @tarang-jain
raft: Build CUDA 12 packages (#1388) @vyasr
Limiting workspace memory resource (#1356) @achirkin

🛠️ Improvements

Pin dask and distributed for 23.08 release (#1711) @galipremsagar
Add algo parameter for CAGRA ANN bench (#1687) @tfeher
ANN benchmarks python wrapper for splitting billion-scale dataset groundtruth (#1679) @divyegala
Rename CAGRA parameter num_parents to search_width (#1676) @tfeher
Renaming namespaces to promote CAGRA from experimental (#1666) @cjnolet
CAGRA Python wrappers (#1665) @dantegd
Add notebook for Vector Search - Question Retrieval (#1662) @lowener
Fix CMake CUDA support for pylibraft when raft is found. (#1659) @bdice
Cagra ANN benchmark improvements (#1658) @tfeher
ANN-benchmarks: avoid using the dataset during search when possible (#1657) @achirkin
Revert CUDA 12.0 CI workflows to branch-23.08. (#1652) @bdice
ANN: Optimize host-side refine (#1651) @achirkin
Cagra template instantiations (#1650) @tfeher
Modify comm_split to avoid ucp (#1649) @ChuckHastings
Stop using setup.py in build.sh (#1645) @vyasr
IVF-PQ: Add a (faster) direct conversion fp8->half (#1644) @achirkin
Simplify bench/ann scripts to Python based module (#1642) @divyegala
Further removal of uses-setup-env-vars (#1639) @dantegd
Drop blank line in raft-dask/meta.yaml (#1637) @jakirkham
Enable conservative memory allocations for RAFT IVF-Flat benchmarks. (#1634) @tfeher
[FEA] Codepacking for IVF-flat (#1632) @tarang-jain
Fixing ann bench cmake (and docs) (#1630) @cjnolet
[WIP] Test CI issues (#1626) @VibhuJawa
Set pool memory resource for raft IVF ANN benchmarks (#1625) @tfeher
Adding sort option to matrix::select_k api (#1615) @cjnolet
CAGRA max_queries auto configuration (#1613) @enp1s0
Use exceptions instead of exit(-1) (#1594) @benfred
[REVIEW] Add scheduler_file argument to support MNMG setup (#1593) @VibhuJawa
Rename the CAGRA prune function to optimize (#1588) @enp1s0
This PR adds support to __half and nv_bfloat16 to myAtomicReduce (#1585) @Kh4ster
[IMP] move core CUDA RT macros to cuda_rt_essentials.hpp (#1584) @MatthiasKohl
preprocessor syntax fix (#1582) @AyodeAwe
use rapids-upload-docs script (#1578) @AyodeAwe
Unpin dask and distributed for development and fix merge_labels test (#1574) @galipremsagar
Remove documentation build scripts for Jenkins (#1570) @ajschmidt8
Add support to __half and nv_bfloat16 to most math functions (#1554) @Kh4ster
Add RAFT ANN benchmark for CAGRA (#1552) @enp1s0
Update CAGRA knn_graph_sort to use raft::bitonic_sort (#1550) @enp1s0
Add identity matrix function (#1548) @lowener
Unpin scikit-build upper bound (#1547) @vyasr
Migrate wheel workflow scripts locally (#1546) @divyegala
Add sample filtering for ivf_flat. Filtering code refactoring and cleanup (#1541) @alexanderguzhva
CAGRA pad dataset for 128bit vectorized load (#1505) @tfeher
Sparse Pairwise Distances API Updates (#1502) @divyegala
Add CAGRA gbench (#1496) @tfeher
Cagra index construction without copying device mdarrays (#1494) @tfeher

raft 23.06.00 (7 Jun 2023)

🚨 Breaking Changes

ivf-pq::search: fix the indexing type of the query-related mdspan arguments (#1539) @achirkin
Dropping Python 3.8 (#1454) @divyegala

🐛 Bug Fixes

[HOTFIX] Fix distance metrics L2/cosine/correlation when X & Y are same buffer but with different shape and add unit test for such case. (#1571) @mdoijade
Using raft::resources in rsvd (#1543) @cjnolet
ivf-pq::search: fix the indexing type of the query-related mdspan arguments (#1539) @achirkin
Check python brute-force knn inputs (#1537) @benfred
Fix failing TiledKNNTest unittest (#1533) @benfred
ivf-flat: fix incorrect recomputed size of the index (#1525) @achirkin
ivf-flat: limit the workspace size of the search via batching (#1515) @achirkin
Support uint64_t in CAGRA index data type (#1514) @enp1s0
Workaround for cuda 12 issue in cusparse (#1508) @cjnolet
Un-scale output distances (#1499) @achirkin
Inline get_cache_idx (#1492) @ahendriksen
Pin to scikit-build<17.2 (#1487) @vyasr
Remove pool_size() calls from debug printouts (#1484) @tfeher
Add missing ext declaration for log detail::format (#1482) @tfeher
Remove include statements from inside namespace (#1467) @robertmaynard
Use pin_compatible to ensure that lower CTKs can be used (#1462) @vyasr
fix ivf_pq n_probes (#1456) @benfred
The glog project root CMakeLists.txt is where we should build from (#1442) @robertmaynard
Add missing resource factory virtual destructor (#1433) @cjnolet
Removing cuda stream view include from mdarray (#1429) @cjnolet
Fix dim param for IVF-PQ wrapper in ANN bench (#1427) @tfeher
Remove MetricProcessor code from brute_force::knn (#1426) @benfred
Fix is_min_close (#1419) @benfred
Have consistent compile lines between BUILD_TESTS enabled or not (#1401) @robertmaynard
Fix ucx-py pin in raft-dask recipe (#1396) @vyasr

📖 Documentation

Various updates to the docs for 23.06 release (#1538) @cjnolet
Rename kernel arch finding function for dispatch (#1536) @mdoijade
Adding bfknn and ivf-pq python api to docs (#1507) @cjnolet
Add RAPIDS cuDF as a library that supports cuda_array_interface (#1444) @miguelusque

🚀 New Features

IVF-PQ: manipulating individual lists (#1298) @achirkin
Gram matrix support for sparse input (#1296) @mfoerste4
[FEA] Add randomized svd from cusolver (#1000) @lowener

🛠️ Improvements

Require Numba 0.57.0+ (#1559) @jakirkham
remove device_resources include from linalg::map (#1540) @benfred
Learn heuristic to pick fastest select_k algorithm (#1523) @benfred
[REVIEW] make raft::cache::Cache protected to allow overrides (#1522) @mfoerste4
[REVIEW] Fix padding assertion in sparse Gram evaluation (#1521) @mfoerste4
run docs nightly too (#1520) @AyodeAwe
Switch back to using primary shared-action-workflows branch (#1519) @vyasr
Python API for IVF-Flat serialization (#1516) @tfeher
Introduce sample filtering to IVFPQ index search (#1513) @alexanderguzhva
Migrate from raft::device_resources -> raft::resources (#1510) @benfred
Use rmm allocator in CAGRA prune (#1503) @enp1s0
Update recipes to GTest version >=1.13.0 (#1501) @bdice
Remove raft/matrix/matrix.cuh includes (#1498) @benfred
Generate dataset of select_k times (#1497) @benfred
Re-use memory pool between benchmark runs (#1495) @benfred
Support CUDA 12.0 for pip wheels (#1489) @divyegala
Update cupy dependency (#1488) @vyasr
Enable sccache hits from local builds (#1478) @AyodeAwe
Build wheels using new single image workflow (#1477) @vyasr
Revert shared-action-workflows pin (#1475) @divyegala
CAGRA: Separate graph index sorting functionality from prune function (#1471) @enp1s0
Add generic reduction functions and separate reductions/warp_primitives (#1470) @akifcorduk
[ENH] [FINAL] Header structure: combine all PRs into one (#1469) @ahendriksen
use matrix::select_k in brute_force::knn call (#1463) @benfred
Dropping Python 3.8 (#1454) @divyegala
Fix linalg::map to work with non-power-of-2-sized types again (#1453) @ahendriksen
[ENH] Enable building with clang (limit strict error checking to GCC) (#1452) @ahendriksen
Remove usage of rapids-get-rapids-version-from-git (#1436) @jjacobelli
Minor Updates to Sparse Structures (#1432) @divyegala
Use nvtx3 includes. (#1431) @bdice
Remove wheel pytest verbosity (#1424) @sevagh
Add python bindings for matrix::select_k (#1422) @benfred
Using raft::resources across raft::random (#1420) @cjnolet
Generate build metrics report for test and benchmarks (#1414) @divyegala
Update clang-format to 16.0.1. (#1412) @bdice
Use ARC V2 self-hosted runners for GPU jobs (#1410) @jjacobelli
Remove uses-setup-env-vars (#1406) @vyasr
Resolve conflicts in auto-merger of branch-23.06 and branch-23.04 (#1403) @galipremsagar
Adding base header-only conda package without cuda math libs (#1386) @cjnolet
Fix IVF-PQ API to use device_vector_view (#1384) @lowener
Branch 23.06 merge 23.04 (#1379) @vyasr
Forward merge branch 23.04 into 23.06 (#1350) @cjnolet
Fused L2 1-NN based on cutlass 3xTF32 / DMMA (#1118) @mdoijade

raft 23.04.00 (6 Apr 2023)

🚨 Breaking Changes

Pin dask and distributed for release (#1399) @galipremsagar
Remove faiss_mr.hpp (#1351) @benfred
Removing FAISS from build (#1340) @cjnolet
Generic linalg::map (#1337) @achirkin
Consolidate pre-compiled specializations into single libraft binary (#1333) @cjnolet
Generic linalg::map (#1329) @achirkin
Update and standardize IVF indexes API (#1328) @viclafargue
IVF-Flat index splitting (#1271) @lowener
IVF-PQ: store cluster data in individual lists and reduce templates (#1249) @achirkin
Fix for svd API (#1190) @lowener
Remove deprecated headers (#1145) @lowener

🐛 Bug Fixes

Fix primitives benchmarks (#1389) @ahendriksen
Fixing index-url link on pip install docs (#1378) @cjnolet
Adding some functions back in that seem to be a copy/paste error (#1373) @cjnolet
Remove usage of Dask's get_worker (#1365) @pentschev
Remove MANIFEST.in use auto-generated one for sdists and package_data for wheels (#1348) @vyasr
Revert "Generic linalg::map (#1329)" (#1336) @cjnolet
Small follow-up to specializations cleanup (#1332) @cjnolet
Fixing select_k specializations (#1330) @cjnolet
Fixing remaining bug in ann_quantized (#1327) @cjnolet
Fixing a couple small kmeans bugs (#1274) @cjnolet
Remove no longer instantiated templates from list of extern template declarations (#1272) @vyasr
Bump pinned deps to 23.4 (#1266) @vyasr
Fix the destruction of interruptible token registry (#1229) @achirkin
Expose raft::handle_t in the public header (#1192) @vyasr
Fix for svd API (#1190) @lowener

📖 Documentation

Adding architecture diagram to README.md (#1370) @cjnolet
Adding small readme image (#1354) @cjnolet
Fix serialize documentation of ivf_flat (#1347) @lowener
Small updates to docs (#1339) @cjnolet

🚀 New Features

Add Options to Generate Build Metrics Report (#1369) @divyegala
Generic linalg::map (#1337) @achirkin
Generic linalg::map (#1329) @achirkin
matrix::select_k specializations (#1268) @achirkin
Use rapids-cmake new COMPONENT exporting feature (#1154) @robertmaynard

🛠️ Improvements

Pin dask and distributed for release (#1399) @galipremsagar
Pin cupy in wheel tests to supported versions (#1383) @vyasr
CAGRA (#1375) @tfeher
add a distance epilogue function to the bfknn call (#1371) @benfred
Relax UCX pin to allow 1.14 (#1366) @pentschev
Generate pyproject dependencies with dfg (#1364) @vyasr
Add nccl to dependencies.yaml (#1361) @benfred
Add extern template for ivfflat_interleaved_scan (#1360) @ahendriksen
Stop setting package version attribute in wheels (#1359) @vyasr
Fix ivf flat specialization header IdxT from uint64_t -> int64_t (#1358) @ahendriksen
Remove faiss_mr.hpp (#1351) @benfred
Rename optional helper function (#1345) @viclafargue
Pass minimum target compile options through raft::raft (#1341) @cjnolet
Removing FAISS from build (#1340) @cjnolet
Add dispatch based on compute architecture (#1335) @ahendriksen
Consolidate pre-compiled specializations into single libraft binary (#1333) @cjnolet
Update and standardize IVF indexes API (#1328) @viclafargue
Using int64_t specializations for ivf_pq and refine (#1325) @cjnolet
Migrate as much as possible to pyproject.toml (#1324) @vyasr
Pass AWS_SESSION_TOKEN and SCCACHE_S3_USE_SSL vars to conda build (#1321) @ajschmidt8
Numerical stability fixes for l2 pairwise distance (#1319) @benfred
Consolidate linter configuration into pyproject.toml (#1317) @vyasr
IVF-Flat Python wrappers (#1316) @tfeher
Add stream overloads to ivf_pq serialize/deserialize methods (#1315) @divyegala
Temporary buffer to view host or device memory in device (#1313) @divyegala
RAFT skeleton project template (#1312) @cjnolet
Fix docs build to be pydata-sphinx-theme=0.13.0 compatible (#1311) @galipremsagar
Update to GCC 11 (#1309) @bdice
Reduce compile times of distance specializations (#1307) @ahendriksen
Fix docs upload path (#1305) @AyodeAwe
Add end-to-end CUDA ann-benchmarks to raft (#1304) @cjnolet
Make docs builds less verbose (#1302) @AyodeAwe
Stop using versioneer to manage versions (#1301) @vyasr
Adding util to get the device id for a pointer address (#1297) @cjnolet
Enable dfg in pre-commit. (#1293) @vyasr
Python API for brute-force KNN (#1292) @cjnolet
support k up to 2048 in faiss select (#1287) @benfred
CI: Remove specification of manual stage for check_style.sh script. (#1283) @csadorf
New Sparse Matrix APIs (#1279) @cjnolet
fix build on cuda 11.5 (#1277) @benfred
IVF-Flat index splitting (#1271) @lowener
Remove duplicate librmm runtime dependency (#1264) @ajschmidt8
build.sh: Add option to log nvcc compile times (#1262) @ahendriksen
Reduce error handling verbosity in CI tests scripts (#1259) @AjayThorve
Update shared workflow branches (#1256) @ajschmidt8
Keeping only compute similarity specializations for uint64_t for now (#1255) @cjnolet
Fix compile time explosion for minkowski distance (#1254) @ahendriksen
Unpin dask and distributed for development (#1253) @galipremsagar
Remove gpuCI scripts. (#1252) @bdice
IVF-PQ: store cluster data in individual lists and reduce templates (#1249) @achirkin
Fix inconsistency between the building doc and CMakeLists.txt (#1248) @yong-wang
Consolidating ANN benchmarks and tests (#1243) @cjnolet
mdspan view for IVF-PQ API (#1236) @viclafargue
Remove uint32 distance idx specializations (#1235) @cjnolet
Add innerproduct to the pairwise distance api (#1226) @benfred
Move date to build string in conda recipe (#1223) @ajschmidt8
Replace faiss bfKnn (#1202) @benfred
Expose KMeans init_plus_plus in pylibraft (#1198) @betatim
Fix ucx-py version (#1184) @ajschmidt8
Improve the performance of radix top-k (#1175) @yong-wang
Add docs build job (#1168) @AyodeAwe
Remove deprecated headers (#1145) @lowener
Simplify distance/detail to make it easier to dispatch to different kernel implementations (#1142) @ahendriksen
Initial port of auto-find-k (#1070) @cjnolet

raft 23.02.00 (9 Feb 2023)

🚨 Breaking Changes

Remove faiss ANN code from knnIndex (#1121) @benfred
Use GenPC (Permuted Congruential) as the default random number generator everywhere (#1099) @Nyrio

🐛 Bug Fixes

Reverting a few commits from 23.02 and speeding up end-to-end build time (#1232) @cjnolet
Update README.md: fix a missing word (#1185) @achirkin
balanced-k-means: fix a too large initial memory pool size (#1148) @achirkin
Catch signal handler change error (#1147) @tfeher
Squared norm fix follow-up (change was lost in merge conflict) (#1144) @Nyrio
IVF-Flat bug fix: the squared norm is required for expanded distance calculations (#1141) @Nyrio
build.sh switch to use RAPIDS magic value (#1132) @robertmaynard
Fix euclidean_dist in IVF-Flat search (#1122) @Nyrio
Update handle docstring (#1103) @dantegd
Pin libcusparse and libcusolver to avoid CUDA 12 (#1095) @wphicks
Fix race condition in raft::random::discrete (#1094) @Nyrio
Fixing libraft conda recipes (#1084) @cjnolet
Ensure that we get the cuda version of faiss. (#1078) @vyasr
Fix double definition error in ANN refinement header (#1067) @tfeher
Specify correct global targets names to raft_export (#1054) @robertmaynard
Fix concurrency issues in k-means++ initialization (#1048) @Nyrio

📖 Documentation

Adding small comms tutorial to docs (#1204) @cjnolet
Separating more namespaces into easier-to-consume sections (#1091) @cjnolet
Paying down some tech debt on docs, runtime API, and cython (#1055) @cjnolet

🚀 New Features

Add function to convert mdspan to a const view (#1188) @lowener
Internal library to share headers between test and bench (#1162) @achirkin
Add public API and tests for hierarchical balanced k-means (#1113) @Nyrio
Export NCCL dependency as part of raft::distributed. (#1077) @vyasr
Serialization of IVF Flat and IVF PQ (#919) @tfeher

🛠️ Improvements

Pin dask and distributed for release (#1242) @galipremsagar
Update shared workflow branches (#1241) @ajschmidt8
Removing interruptible from basic handle sync. (#1224) @cjnolet
pre-commit: Update isort version to 5.12.0 (#1215) @wence-
Pin wheel dependencies to same RAPIDS release (#1200) @sevagh
Serializer for mdspans (#1173) @hcho3
Use CTK 118/cp310 branch of wheel workflows (#1169) @sevagh
Enable shallow copy of handle_t's resources with different workspace_resource (#1165) @cjnolet
Protect balanced k-means out-of-memory in some cases (#1161) @achirkin
Use sqeuclidean for metric name in ivf_pq python bindings (#1160) @benfred
ANN tests: make the min_recall check strict (#1156) @achirkin
Make cutlass use static ctk (#1155) @sevagh
Fix various build errors (#1152) @hcho3
Remove faiss bfKnn call from fused_l2_knn unittest (#1150) @benfred
Fix unary_op docs and add map_offset as an improved version of write_only_unary_op (#1149) @Nyrio
Improvement of the math API wrappers (#1146) @Nyrio
Changing handle_t to device_resources everywhere (#1140) @cjnolet
Add L2SqrtExpanded support to ivf_pq (#1138) @benfred
Adding workspace resource (#1137) @cjnolet
Add raft::void_op functor (#1136) @ahendriksen
IVF-PQ: tighten the test criteria (#1135) @achirkin
Fix documentation author (#1134) @bdice
Add L2SqrtExpanded support to ivf_flat ANN indices (#1133) @benfred
Improvements in matrix::gather: test coverage, compilation errors, performance (#1126) @Nyrio
Adding ability to use an existing stream in the pylibraft Handle (#1125) @cjnolet
Remove faiss ANN code from knnIndex (#1121) @benfred
Update builds for CUDA 11.8 and Python 3.10 (#1120) @ajschmidt8
Update workflows for nightly tests (#1119) @ajschmidt8
Enable Recently Updated Check (#1117) @ajschmidt8
Build wheels alongside conda CI (#1116) @sevagh
Allow host dataset for IVF-PQ (#1114) @tfeher
Decoupling raft handle from underlying resources (#1111) @cjnolet
Fixing an index error introduced in PR #1109 (#1110) @vinaydes
Fixing the sample-without-replacement test failures (#1109) @vinaydes
Remove faiss dependency from fused_l2_knn.cuh, selection_faiss.cuh, ball_cover.cuh and haversine_distance.cuh (#1108) @benfred
Remove redundant operators in sparse/distance and move others to raft/core (#1105) @Nyrio
Speedup make_blobs by up to 2x by fixing inefficient kernel launch configuration (#1100) @Nyrio
Use GenPC (Permuted Congruential) as the default random number generator everywhere (#1099) @Nyrio
Cleanup faiss includes (#1098) @benfred
matrix::select_k: move selection and warp-sort primitives (#1085) @achirkin
Exclude changelog from pre-commit spellcheck (#1083) @benfred
Add GitHub Actions Workflows. (#1076) @bdice
Adding uninstall option to build.sh (#1075) @cjnolet
Use doctest for testing python example docstrings (#1073) @benfred
Minor cython fixes / cleanup (#1072) @benfred
IVF-PQ: tweak launch configuration (#1069) @achirkin
Unpin dask and distributed for development (#1068) @galipremsagar
Bifurcate Dependency Lists (#1065) @ajschmidt8
Add support for 64bit svdeig (#1060) @lowener
switch mma instruction shape to 1684 from current 1688 for 3xTF32 L2/cosine kernel (#1057) @mdoijade
Make IVF-PQ build index in batches when necessary (#1056) @achirkin
Remove unused setuputils modules (#1053) @vyasr
Branch 23.02 merge 22.12 (#1051) @benfred
Shared-memory-cached kernel for reduce_cols_by_key to limit atomic conflicts (#1050) @Nyrio
Unify use of common functors (#1049) @Nyrio
Replace k-means++ CPU bottleneck with a random::discrete prim (#1039) @Nyrio
Add python bindings for kmeans fit (#1016) @benfred
Add MaskedL2NN (#838) @ahendriksen
Move contractions tiling logic outside of Contractions_NT (#837) @ahendriksen

raft 22.12.00 (8 Dec 2022)

🚨 Breaking Changes

Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
Remove make_mdspan template for memory_type enum (#1005) @wphicks
ivf-pq performance tweaks (#926) @achirkin

🐛 Bug Fixes

fusedL2NN: Add input alignment checks (#1045) @achirkin
Fix fusedL2NN bug that can happen when the same point appears in both x and y (#1040) @Nyrio
Fix trivial deprecated header includes (#1034) @achirkin
Suppress ptxas stack size warning in Debug mode (#1033) @tfeher
Don't use CMake 3.25.0 as it has a FindCUDAToolkit show stopping bug (#1029) @robertmaynard
Fix for gemmi deprecation (#1020) @lowener
Remove make_mdspan template for memory_type enum (#1005) @wphicks
Add except + to cython extern cdef declarations (#1001) @benfred
Changing Overloads for GCC 11/12 bug (#995) @divyegala
Changing Overloads for GCC 11/12 bugs (#992) @divyegala
Fix pylibraft docstring example code (#980) @benfred
Update raft tests to compile with C++17 features enabled (#973) @robertmaynard
Making ivf flat gtest invoke mdspanified APIs (#955) @cjnolet
Updates to kmeans public API to fix cuml (#932) @cjnolet
Fix logger (vsnprintf consumes args) (#917) @Nyrio
Adding missing include for device mdspan in mean_squared_error.cuh (#906) @cjnolet

📖 Documentation

Add links to the docs site in the README (#1042) @benfred
Moving contributing and developer guides to main docs (#1006) @cjnolet
Update compiler flags in build docs (#999) @cjnolet
Updating minimum required gcc version (#993) @cjnolet
important doc updates for core, cluster, and neighbors (#933) @cjnolet

🚀 New Features

ANN refinement Python wrapper (#1052) @tfeher
Add ANN refinement method (#1038) @tfeher
IVF-Flat: make adaptive-centers behavior optional (#1019) @achirkin
Add wheel builds (#1013) @vyasr
Update cuSparse wrappers to avoid deprecated functions (#989) @wphicks
Provide memory_type enum (#984) @wphicks
Add Tests for kmeans API (#982) @lowener
mdspanifying weighted_mean and add raft::stats tests (#910) @lowener
Implement raft::stats API with mdspan (#802) @lowener

🛠️ Improvements

Pin dask and distributed for release (#1062) @galipremsagar
IVF-PQ: use device properties helper (#1035) @achirkin
Make ucx linkage explicit and add a new CMake target for it (#1032) @vyasr
Fixing broken doc functions and improving coverage (#1030) @cjnolet
Expose cluster_cost to python (#1028) @benfred
Adding lightweight cai_wrapper to reduce boilerplate (#1027) @cjnolet
Change raft docs theme to pydata-sphinx-theme (#1026) @galipremsagar
Revert "Pin dask and distributed for release" (#1023) @galipremsagar
Pin dask and distributed for release (#1022) @galipremsagar
Replace dots_along_rows with rowNorm and improve coalescedReduction performance (#1011) @Nyrio
Moving TestDeviceBuffer to pylibraft.common.device_ndarray (#1008) @cjnolet
Add codespell as a linter (#1007) @benfred
Fix environment channels (#996) @bdice
Automatically sync handle when not passed to pylibraft functions (#987) @benfred
Replace normalize_rows in ann_utils.cuh by a new rowNormalize prim and improve performance for thin matrices (small n_cols) (#979) @Nyrio
Forward merge 22.10 into 22.12 (#978) @vyasr
Use new rapids-cmake functionality for rpath handling. (#976) @vyasr
Update cuda-python dependency to 11.7.1 (#975) @galipremsagar
IVF-PQ Python wrappers (#970) @tfeher
Remove unnecessary requirements for raft-dask. (#969) @vyasr
Expose linalg::dot in public API (#968) @benfred
Fix kmeans cluster templates (#966) @lowener
Run linters using pre-commit (#965) @benfred
linewiseop padded span test (#964) @mfoerste4
Add unittest for linalg::mean_squared_error (#961) @benfred
Exposing fused l2 knn to public APIs (#959) @cjnolet
Remove a left over print statement from pylibraft (#958) @betatim
Switch to using rapids-cmake for gbench. (#954) @vyasr
Some cleanup of k-means internals (#953) @cjnolet
Remove stale labeler (#951) @raydouglass
Adding optional handle to each public API function (along with example) (#947) @cjnolet
Improving documentation across the board. Adding quick-start to breathe docs. (#943) @cjnolet
Add unittest for linalg::axpy (#942) @benfred
Add cutlass 3xTF32,DMMA based L2/cosine distance kernels for SM 8.0 or higher (#939) @mdoijade
Calculate max cluster size correctly for IVF-PQ (#938) @tfeher
Add tests for raft::matrix (#937) @lowener
Add fusedL2NN benchmark (#936) @Nyrio
ivf-pq performance tweaks (#926) @achirkin
Adding fused_l2_nn_argmin wrapper to Pylibraft (#924) @cjnolet
Moving kernel gramm primitives to raft::distance::kernels (#920) @cjnolet
kmeans improvements: random initialization on GPU, NVTX markers, no batching when using fusedL2NN (#918) @Nyrio
Moving raft::spatial::knn -> raft::neighbors (#914) @cjnolet
Create cub-based argmin primitive and replace argmin_along_rows in ANN kmeans (#912) @Nyrio
Replace map_along_rows with matrixVectorOp (#911) @Nyrio
Integrate accumulate_into_selected from ANN utils into linalg::reduce_rows_by_keys (#909) @Nyrio
Re-enabling Fused L2 NN specializations and renaming cub::KeyValuePair -> raft::KeyValuePair (#905) @cjnolet
Unpin dask and distributed for development (#886) @galipremsagar
Adding padded layout 'layout_padded_general' (#725) @mfoerste4

raft 22.10.00 (12 Oct 2022)

🚨 Breaking Changes

Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
Remove type punning from TxN_t (#781) @wphicks
ivf_flat::index: hide implementation details (#747) @achirkin

🐛 Bug Fixes

ivf-pq integration: hotfixes (#891) @achirkin
Removing cub symbol from libraft-distance instantiation. (#887) @cjnolet
ivf-pq post integration hotfixes (#878) @achirkin
Fixing a few compile errors in new APIs (#874) @cjnolet
Include knn.cuh in knn.cu benchmark source for finding brute_force_knn (#855) @teju85
Do not use strcpy to copy 2 char (#848) @mhoemmen
rng_state not including necessary cstdint (#839) @MatthiasKohl
Fix integer overflow in ANN kmeans (#835) @Nyrio
Add alignment to the TxN_t vectorized type (#792) @achirkin
Fix adj_to_csr_kernel (#785) @ahendriksen
Use rapids-cmake 22.10 best practice for RAPIDS.cmake location (#784) @robertmaynard
Remove type punning from TxN_t (#781) @wphicks
Various fixes for build.sh (#771) @vyasr

📖 Documentation

Fix target names in build.sh help text (#879) @Nyrio
Document that minimum required CMake version is now 3.23.1 (#841) @robertmaynard

🚀 New Features

mdspanify raft::random functions uniformInt, normalTable, fill, bernoulli, and scaled_bernoulli (#897) @mhoemmen
mdspan-ify several raft::random rng functions (#857) @mhoemmen
Develop new mdspan-ified multi_variable_gaussian interface (#845) @mhoemmen
Mdspanify permute (#834) @mhoemmen
mdspan-ify rmat_rectangular_gen (#833) @mhoemmen
mdspanify sampleWithoutReplacement (#830) @mhoemmen
mdspan-ify make_regression (#811) @mhoemmen
Updating raft::linalg APIs to use mdspan (#809) @divyegala
Integrate KNN implementation: ivf-pq (#789) @achirkin

🛠️ Improvements

Some fixes for build.sh (#901) @cjnolet
Revert recent fused l2 nn instantiations (#899) @cjnolet
Update Python build instructions (#898) @betatim
Adding ninja and cxx compilers to conda dev dependencies (#893) @cjnolet
Output non-normalized distances in IVF-PQ and brute-force KNN (#892) @Nyrio
Readme updates for 22.10 (#884) @cjnolet
Breaking apart benchmarks into individual binaries (#883) @cjnolet
Pin dask and distributed for release (#858) @galipremsagar
Mdspanifying (currently tested) raft::matrix (#846) @cjnolet
Separating _RAFT_HOST and _RAFT_DEVICE macros (#836) @cjnolet
Updating cpu job in hopes it speeds up python cpu builds (#828) @cjnolet
Mdspan-ifying raft::spatial (#827) @cjnolet
Fixing init.py for handle and stream (#826) @cjnolet
Moving a few more things around (#822) @cjnolet
Use fusedL2NN in ANN kmeans (#821) @Nyrio
Separating test executables (#820) @cjnolet
Separating mdspan/mdarray infra into host_* and device_* variants (#810) @cjnolet
Fix malloc/delete mismatch (#808) @mhoemmen
Renaming pyraft -> raft-dask (#801) @cjnolet
Branch 22.10 merge 22.08 (#800) @cjnolet
Statically link all CUDA toolkit libraries (#797) @trxcllnt
Minor follow-up fixes for ivf-flat (#796) @achirkin
KMeans benchmarks (cuML + ANN implementations) and fix for IndexT=int64_t (#795) @Nyrio
Optimize fusedL2NN when data is skinny (#794) @ahendriksen
Complete the deprecation of duplicated hpp headers (#793) @ahendriksen
Prepare parts of the balanced kmeans for ivf-pq (#788) @achirkin
Unpin dask and distributed for development (#783) @galipremsagar
Exposing python wrapper for the RMAT generator logic (#778) @teju85
Device, Host, Managed Accessor Types for mdspan (#776) @divyegala
Fix Forward-Merger Conflicts (#768) @ajschmidt8
Fea 2208 kmeans use specializations (#760) @cjnolet
ivf_flat::index: hide implementation details (#747) @achirkin

raft 22.08.00 (17 Aug 2022)

🚨 Breaking Changes

Update mdspan to account for changes to extents (#751) @divyegala
Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
Integrate KNN implementation: ivf-flat (#652) @achirkin
Moving kmeans from cuml to Raft (#605) @lowener

🐛 Bug Fixes

Relax ivf-flat test recall thresholds (#766) @achirkin
Restrict the use of [[unlikely]] to CXX 20 only. (#764) @trivialfis
Update rapids-cmake version for pyraft in update-version.sh (#749) @vyasr

📖 Documentation

Use documented header template for doxygen (#773) @galipremsagar
Switch language from None to "en" in docs build (#721) @galipremsagar

🚀 New Features

Update mdspan to account for changes to extents (#751) @divyegala
Implement matrix transpose with mdspan. (#739) @trivialfis
Implement unravel_index for row-major array. (#723) @trivialfis
Integrate KNN implementation: ivf-flat (#652) @achirkin

🛠️ Improvements

Use common js and css code (#779) @galipremsagar
Pin dask & distributed for release (#772) @galipremsagar
Move cmake to the build section. (#763) @vyasr
Adding old kmeans impl back in (as kmeans_deprecated) (#761) @cjnolet
Fix for KMeans raw pointers API (#758) @lowener
Fix KMeans (#756) @divyegala
Add inline to nccl_sync_stream() (#750) @seunghwak
Replace csr_adj_graph functions with faster equivalent (#746) @ahendriksen
Add wrapper functions for ncclGroupStart() and ncclGroupEnd() (#742) @seunghwak
Fix variadic template type check for mdarrays (#741) @hlinsen
RMAT rectangular graph generator (#738) @teju85
Update conda recipes to UCX 1.13.0 (#736) @pentschev
Add warp-aggregated atomic increment (#735) @ahendriksen
fix logic bug in include_checker.py utility (#734) @grlee77
Support 32bit and unsigned indices in bruteforce KNN (#730) @achirkin
Ability to use ccache to speedup local builds (#729) @teju85
Pin max version of cuda-python to 11.7.0 (#728) @Ethyling
Always add raft::raft_nn_lib and raft::raft_distance_lib aliases (#727) @trxcllnt
Add several type aliases and helpers for creating mdarrays (#726) @achirkin
fix nans in naive kl divergence kernel introduced by div by 0. (#724) @mdoijade
Use rapids-cmake for cuco (#722) @vyasr
Update Python classifiers. (#719) @bdice
Fix sccache (#718) @Ethyling
Introducing raft::mdspan as an alias (#715) @divyegala
Update cuco version (#714) @vyasr
Update conda environment pinnings and update-versions.sh. (#713) @bdice
Branch 22.08 merge branch 22.06 (#712) @cjnolet
Testing conda compilers (#705) @cjnolet
Unpin dask & distributed for development (#704) @galipremsagar
Avoid shadowing CMAKE_ARGS variable in build.sh (#701) @vyasr
Use unique ptr in print_device_vector (#695) @lowener
Add missing Thrust includes (#678) @bdice
Consolidate C++ conda recipes and add libraft-tests package (#641) @Ethyling
Moving kmeans from cuml to Raft (#605) @lowener

raft 22.06.00 (7 Jun 2022)

🚨 Breaking Changes

Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl

🐛 Bug Fixes

For fixing the cuGraph test failures with PCG (#690) @vinaydes
Fix excessive memory used in selection test (#689) @achirkin
Revert print vector changes because of std::vector<bool> (#681) @lowener
fix race in fusedL2knn smem read/write by adding a syncwarp (#679) @mdoijade
gemm: fix parameter C mistakenly set as const (#664) @achirkin
Fix SelectionTest: allow different indices when keys are equal. (#659) @achirkin
Revert recent cmake updates (#657) @cjnolet
Don't install component dependency files in raft-header only mode (#655) @robertmaynard
Rng: removed cyclic dependency creating hard-to-debug compiler errors (#639) @MatthiasKohl
Fixing raft compile bug w/ RNG changes (#634) @cjnolet
Get libcudacxx from cuco (#632) @trxcllnt
RNG API fixes (#630) @MatthiasKohl
Fix mdspan accessor mixin offset policy. (#628) @trivialfis
Branch 22.06 merge 22.04 (#625) @cjnolet
fix issue in fusedL2knn which happens when rows are multiple of 256 (#604) @mdoijade

🚀 New Features

Restore changes from #653 and #655 and correct cmake component dependencies (#686) @robertmaynard
Adding handle and stream to pylibraft (#683) @cjnolet
Map CMake install components to conda library packages (#653) @robertmaynard
Rng: expose host-rng-state in host-only API (#609) @MatthiasKohl
mdspan/mdarray template functions and utilities (#601) @divyegala

🛠️ Improvements

Change build.sh to find C++ library by default (#697) @vyasr
Pin dask and distributed for release (#693) @galipremsagar
Pin dask & distributed for release (#680) @galipremsagar
Improve logging (#673) @achirkin
Fix minor errors in CMake configuration (#662) @vyasr
Pulling mdspan fork (from official rapids repo) into raft to remove dependency (#649) @cjnolet
Fixing the unit test issue(s) in RAFT (#646) @vinaydes
Build pyraft with scikit-build (#644) @vyasr
Some fixes to pairwise distances for cupy integration (#643) @cjnolet
Require UCX 1.12.1+ (#638) @jakirkham
Updating raft rng host public API and adding docs (#636) @cjnolet
Build pylibraft with scikit-build (#633) @vyasr
Add cuda_lib_dir to library_dirs, allow changing UCX/RMM/Thrust/spdlog locations via envvars in setup.py (#624) @trxcllnt
Remove perf prints from MST (#623) @divyegala
Enable components installation using CMake (#621) @Ethyling
Allow nullptr as input-indices argument of select_k (#618) @achirkin
Update CMake pinning to allow newer CMake versions (#617) @vyasr
Unpin dask & distributed for development (#616) @galipremsagar
Improve performance of select-top-k RADIX implementation (#615) @achirkin
Moving more prims benchmarks to RAFT (#613) @cjnolet
Allow enabling NVTX markers by downstream projects after install (#610) @achirkin
Improve performance of select-top-k WARP_SORT implementation (#606) @achirkin
Enable building static libs (#602) @trxcllnt
Update ucx-py version (#596) @ajschmidt8
Fix merge conflicts (#587) @ajschmidt8
Making cuco, thrust, and mdspan optional dependencies. (#585) @cjnolet
Some RBC3D fixes (#530) @cjnolet

raft 22.04.00 (6 Apr 2022)

🚨 Breaking Changes

Moving some of the remaining linalg prims from cuml (#502) @cjnolet
Fix badly merged cublas wrappers (#492) @achirkin
Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet
Cleaning up cusparse_wrappers (#441) @cjnolet
Improvements to RNG (#434) @vinaydes
Remove RAFT memory management (#400) @viclafargue
LinAlg impl in detail (#383) @divyegala

🐛 Bug Fixes

Pin cmake in conda recipe to <3.23 (#600) @dantegd
Fix make_device_vector_view (#595) @lowener
Update cuco version. (#592) @vyasr
Fixing raft headers dir (#574) @cjnolet
Update update-version.sh (#560) @raydouglass
find_package(raft) can now be called multiple times safely (#532) @robertmaynard
Allocate sufficient memory for Hungarian if number of batches > 1 (#531) @ChuckHastings
Adding lap.hpp back (with deprecation) (#529) @cjnolet
raft-config is idempotent no matter RAFT_COMPILE_LIBRARIES value (#516) @robertmaynard
Call initialize() in mpi_comms_t constructor. (#506) @seunghwak
Improve row-major meanvar kernel via minimizing atomicCAS locks (#489) @achirkin
Adding destructor for std comms and using nccl allreduce for barrier in mpi comms (#473) @cjnolet

📖 Documentation

Updating docs for 22.04 (#566) @cjnolet

🚀 New Features

Add benchmarks (#549) @achirkin
Unify weighted mean code (#514) @lowener
single-pass raft::stats::meanvar (#472) @achirkin
Move random package of cuML to RAFT (#449) @divyegala
mdspan integration. (#437) @trivialfis
Interruptible execution (#433) @achirkin
make raft sources compilable with clang (#424) @MatthiasKohl
Span implementation. (#399) @trivialfis

🛠️ Improvements

Adding build script for docs (#589) @cjnolet
Temporarily disable new ops-bot functionality (#586) @ajschmidt8
Fix commands to get conda output files (#584) @Ethyling
Link to cuco and add faiss EXCLUDE_FROM_ALL option (#583) @trxcllnt
exposing faiss::faiss (#582) @cjnolet
Pin dask and distributed version (#581) @galipremsagar
removing exclude_from_all from cuco (#580) @cjnolet
Adding INSTALL_EXPORT_SET for cuco, rmm, thrust (#579) @cjnolet
Thrust package name case (#576) @trxcllnt
Add missing thrust includes to transpose.cuh (#575) @zbjornson
Use unanchored clang-format version check (#573) @zbjornson
Fixing accidental removal of thrust target from cmakelists (#571) @cjnolet
Don't add gtest to build export set or generate a gtest-config.cmake (#565) @trxcllnt
Set main label by default (#559) @galipremsagar
Add local conda channel while looking for conda outputs (#558) @Ethyling
Updated dask and distributed to >=2022.02.1 (#557) @rlratzel
Upload packages using testing label for nightlies (#556) @Ethyling
Add .github/ops-bot.yaml config file (#554) @ajschmidt8
Disabling benchmarks building by default. (#553) @cjnolet
KNN select-top-k variants (#551) @achirkin
Adding logger (#550) @cjnolet
clang-tidy support: improved clang run scripts with latest changes (see cugraph-ops) (#548) @MatthiasKohl
Pylibraft for pairwise distances (#540) @cjnolet
mdspan PoC for distance make_blobs (#538) @cjnolet
Include thrust/sort.h in ball_cover.cuh (#526) @akifcorduk
Increase parallelism in allgatherv (#525) @seunghwak
Moving device functions to cuh files and deprecating hpp (#524) @cjnolet
Use dynamic_extent from stdex. (#523) @trivialfis
Updating some of the ci check scripts (#522) @cjnolet
Use shfl_xor in warpReduce for broadcast (#521) @akifcorduk
Fixing Python conda package and installation (#520) @cjnolet
Adding instructions to install from conda and build using CPM (#519) @cjnolet
Implement span storage optimization. (#515) @trivialfis
RNG test fixes and improvements (#513) @vinaydes
Moving scores and metrics over to raft::stats (#512) @cjnolet
Random ball cover in 3d (#510) @cjnolet
Initializing memory in RBC (#509) @cjnolet
Adjusting conda packaging to remove duplicate dependencies (#508) @cjnolet
Moving remaining stats prims from cuml (#507) @cjnolet
Correcting the namespace (#505) @vinaydes
Passing stream through commsplit (#503) @cjnolet
Moving some of the remaining linalg prims from cuml (#502) @cjnolet
Fixing spectral APIs (#496) @cjnolet
Fix badly merged cublas wrappers (#492) @achirkin
Fix integer overflow in distances (#490) @RAMitchell
Reusing shared libs in gpu ci builds (#487) @cjnolet
Adding fatbin to shared libs and fixing conda paths in cpu build (#485) @cjnolet
Add CMake install rule for tests (#483) @ajschmidt8
Adding cpu ci for conda build (#482) @cjnolet
Updating codeowners to use new raft codeowners (#480) @cjnolet
Hiding implementation details for lap, clustering, spectral, and label (#477) @cjnolet
Define PTDS via -D to fix cache misses in sccache (#476) @trxcllnt
Unpin dask and distributed (#474) @galipremsagar
Replace ccache with sccache (#471) @ajschmidt8
More README updates (#467) @cjnolet
CUBLAS wrappers with switchable host/device pointer mode (#453) @achirkin
Cleaning up cusparse_wrappers (#441) @cjnolet
Adding conda packaging for libraft and pyraft (#439) @cjnolet
Improvements to RNG (#434) @vinaydes
Hiding implementation details for comms (#409) @cjnolet
Remove RAFT memory management (#400) @viclafargue
LinAlg impl in detail (#383) @divyegala

raft 22.02.00 (2 Feb 2022)

🚨 Breaking Changes

Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
One cudaStream_t instance per raft::handle_t (#291) @divyegala

🐛 Bug Fixes

Removing extra logging from faiss mr (#463) @cjnolet
Pin dask & distributed versions (#455) @galipremsagar
Replace RMM CUDA Python bindings with those provided by CUDA-Python (#451) @shwina
Fix comms memory leak (#436) @seunghwak
Fix C++ doxygen documentation (#426) @achirkin
Fix clang-format style errors (#425) @achirkin
Fix using incorrect macro RAFT_CHECK_CUDA in place of RAFT_CUDA_TRY (#415) @achirkin
Fix CUDA_CHECK_NO_THROW compatibility define (#414) @zbjornson
Disabling fused l2 knn from bfknn (#407) @cjnolet
Disabling expanded fused l2 knn to unblock cuml CI (#404) @cjnolet
Reverting default knn distance to L2Unexpanded for now. (#403) @cjnolet

📖 Documentation

README and build fixes before release (#459) @cjnolet
Updates to Python and C++ Docs (#442) @cjnolet

🚀 New Features

error macros: determining buffer size instead of fixed 2048 chars (#420) @MatthiasKohl
NVTX range helpers (#416) @achirkin

🛠️ Improvements

Splitting fused l2 knn specializations (#461) @cjnolet
Update cuCollection git tag (#447) @seunghwak
Remove libcudacxx patch needed for nvcc 11.4 (#446) @robertmaynard
Unpin dask and distributed (#440) @galipremsagar
Public apis for remainder of matrix and stats (#438) @divyegala
Fix bug in producer-consumer buffer exchange which occurs in UMAP test on GV100 (#429) @mdoijade
Simplify raft component CMake logic, and allow compilation without FAISS (#428) @robertmaynard
Update ucx-py version on release using rvc (#422) @Ethyling
Disabling fused l2 knn again. Not sure how this got added back. (#421) @cjnolet
Adding no throw macro variants (#417) @cjnolet
Remove IncludeCategories from .clang-format (#412) @codereport
fix nan issues in L2 expanded sqrt KNN distances (#411) @mdoijade
Consistent renaming of CHECK_CUDA and *_TRY macros (#410) @cjnolet
Faster matrix-vector-ops (#401) @achirkin
Adding dev conda environment files. (#397) @cjnolet
Update to UCX-Py 0.24 (#392) @pentschev
Branch 21.12 merge 22.02 (#386) @cjnolet
Hiding implementation details for sparse API (#381) @cjnolet
Adding distance specializations (#376) @cjnolet
Use FAISS with RMM (#363) @viclafargue
Add Fused L2 Expanded KNN kernel (#339) @mdoijade
Update .clang-format to be consistent with all other RAPIDS repos (#300) @codereport
One cudaStream_t instance per raft::handle_t (#291) @divyegala

raft 21.12.00 (9 Dec 2021)

🚨 Breaking Changes

Use 64 bit CuSolver API for Eigen decomposition (#349) @lowener

🐛 Bug Fixes

Fixing bad host->device copy (#375) @cjnolet
Fix coalesced access checks in matrix_vector_op (#372) @achirkin
Port libcudacxx patch from cudf (#370) @dantegd
Fixing overflow in expanded distances (#365) @cjnolet

📖 Documentation

Getting doxygen to run (#371) @cjnolet

🛠️ Improvements

Upgrade clang to 11.1.0 (#394) @galipremsagar
Fix Changelog Merge Conflicts for branch-21.12 (#390) @ajschmidt8
Pin max dask & distributed (#388) @galipremsagar
Removing conflict w/ CUDA_CHECK (#378) @cjnolet
Update RAFT test directory (#359) @viclafargue
Update to UCX-Py 0.23 (#358) @pentschev
Hiding implementation details for random, stats, and matrix (#356) @divyegala
README updates (#351) @cjnolet
Use 64 bit CuSolver API for Eigen decomposition (#349) @lowener
Hiding implementation details for distance primitives (dense + sparse) (#344) @cjnolet
Unpin dask & distributed in CI (#338) @galipremsagar

raft 21.10.00 (7 Oct 2021)

🚨 Breaking Changes

Miscellaneous tech debts/cleanups (#286) @viclafargue

🐛 Bug Fixes

Accounting for rmm::cuda_stream_pool not having a constructor for 0 streams (#329) @divyegala
Fix wrong lda parameter in gemv (#327) @achirkin
Fix matrixVectorOp to verify promoted pointer type is still aligned to vectorized load boundary (#325) @viclafargue
Pin rmm to branch-21.10 and remove warnings from kmeans.hpp (#322) @dantegd
Temporarily pin RMM while refactor removes deprecated calls (#315) @dantegd
Fix more warnings (#311) @harrism

📖 Documentation

Fix build doc (#316) @lowener

🚀 New Features

Add Hamming, Jensen-Shannon, KL-Divergence, Russell rao and Correlation distance metrics support (#306) @mdoijade

🛠️ Improvements

Pin max dask and distributed versions to 2021.09.1 (#334) @galipremsagar
Make sure we keep the rapids-cmake and raft cal version in sync (#331) @robertmaynard
Add broadcast with const input iterator (#328) @seunghwak
Fused L2 (unexpanded) kNN kernel for NN <= 64, without using temporary gmem to store intermediate distances (#324) @mdoijade
Update with rapids cmake new features (#320) @robertmaynard
Update to UCX-Py 0.22 (#319) @pentschev
Fix Forward-Merge Conflicts (#318) @ajschmidt8
Enable CUDA device code warnings as errors (#307) @harrism
Remove max version pin for dask & distributed on development branch (#303) @galipremsagar
Warnings are errors (#299) @harrism
Use the new RAPIDS.cmake to fetch rapids-cmake (#298) @robertmaynard
ENH Replace gpuci_conda_retry with gpuci_mamba_retry (#295) @dillon-cullinan
Miscellaneous tech debts/cleanups (#286) @viclafargue
Random Ball Cover Algorithm for 2D Haversine/Euclidean (#213) @cjnolet

raft 21.08.00 (4 Aug 2021)

🚨 Breaking Changes

expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings

🐛 Bug Fixes

Fix support for different input and output types in linalg::reduce (#296) @Nyrio
Const raft handle in sparse bfknn (#280) @cjnolet
Add cuco::cuco to list of linked libraries (#279) @trxcllnt
Use nested include in destination of install headers to avoid docker permission issues (#263) @dantegd
Update UCX-Py version to 0.21 (#255) @pentschev
Fix mst knn test build failure due to RMM device_buffer change (#253) @mdoijade

🚀 New Features

Add chebyshev, canberra, minkowski and hellinger distance metrics (#276) @mdoijade
Move FAISS ANN wrappers to RAFT (#265) @cjnolet
Remaining sparse semiring distances (#261) @cjnolet
removing divye from codeowners (#257) @divyegala

🛠️ Improvements

Pinning cuco to a specific commit hash for release (#304) @rlratzel
Pin max dask & distributed versions (#301) @galipremsagar
Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels (#292) @mdoijade
Always add faiss library alias if it's missing (#287) @trxcllnt
Use NVIDIA/cuCollections repo again (#284) @trxcllnt
Use the 21.08 branch of rapids-cmake as rmm requires it (#278) @robertmaynard
expose epsilon parameter to allow precision to be specified (#275) @ChuckHastings
Fix 21.08 forward-merge conflicts (#274) @ajschmidt8
Add lds and sts inline ptx instructions to force vector instruction generation (#273) @mdoijade
Move ANN to RAFT (additional updates) (#270) @cjnolet
Sparse semirings cleanup + hash table & batching strategies (#269) @divyegala
Revert "pin dask versions in CI (#260)" (#264) @ajschmidt8
Pass stream to device_scalar::value() calls. (#259) @harrism
Update get_rmm.cmake to better support CalVer (#258) @harrism
Add Grid stride pairwise dist and fused L2 NN kernels (#250) @mdoijade
Fix merge conflicts (#236) @ajschmidt8

raft 21.06.00 (9 Jun 2021)

🐛 Bug Fixes

Update UCX-Py version to 0.20 (#254) @pentschev
cuco git tag update (again) (#248) @seunghwak
Revert PR #232 for 21.06 release (#246) @dantegd
Python comms to hold onto server endpoints (#241) @cjnolet
Fix Thrust 1.12 compile errors (#231) @trxcllnt
Make sure we use CalVer when checking out rapids-cmake (#230) @robertmaynard
Loss of Precision in MST weight alteration (#223) @divyegala

🛠️ Improvements

cuco git tag update (#243) @seunghwak
Update CHANGELOG.md links for calver (#233) @ajschmidt8
Add Grid stride pairwise dist and fused L2 NN kernels (#232) @mdoijade
Updates to enable HDBSCAN (#208) @cjnolet

raft 0.19.0 (21 Apr 2021)

🐛 Bug Fixes

Exposing spectral random seed property (#193) @cjnolet
Fix pointer arithmetic in spmv smem kernel (#183) @lowener
Modify default value for rowMajorIndex and rowMajorQuery in bf-knn (#173) @viclafargue
Remove setCudaMallocWarning() call for libfaiss@v1.7.0 (#167) @trxcllnt
Add const to KNN handle (#157) @hlinsen

🚀 New Features

Moving optimized L2 1-nearest neighbors implementation from cuml (#158) @cjnolet

🛠️ Improvements

Fixing codeowners (#194) @cjnolet
Adjust Hellinger pairwise distance to avoid NaNs (#189) @lowener
Add column major input support in contractions_nt kernels with new kernel policy for it (#188) @mdoijade
Dice formula correction (#186) @lowener
Scaling knn graph fix connectivities algorithm (#181) @cjnolet
Fixing RAFT CI & a few small updates for SLHC Python wrapper (#178) @cjnolet
Add Precomputed to the DistanceType enum (for cuML DBSCAN) (#177) @Nyrio
Enable matrix::copyRows for row major input (#176) @tfeher
Add Dice distance to distancetype enum (#174) @lowener
Porting over recent updates to distance prim from cuml (#172) @cjnolet
Update KNN (#171) @viclafargue
Adding translations parameter to brute_force_knn (#170) @viclafargue
Update Changelog Link (#169) @ajschmidt8
Map operation (#168) @viclafargue
Updating sparse prims based on recent changes (#166) @cjnolet
Prepare Changelog for Automation (#164) @ajschmidt8
Update 0.18 changelog entry (#163) @ajschmidt8
MST symmetric/non-symmetric output for SLHC (#162) @divyegala
Pass pre-computed colors to MST (#154) @divyegala
Streams upgrade in RAFT handle (RMM backend + create handle from parent's pool) (#148) @afender
Merge branch-0.18 into 0.19 (#146) @dantegd
Add device_send, device_recv, device_sendrecv, device_multicast_sendrecv (#144) @seunghwak
Adding SLHC prims. (#140) @cjnolet
Moving cuml sparse prims to raft (#139) @cjnolet

raft 0.18.0 (24 Feb 2021)

Breaking Changes 🚨

Make NCCL root initialization configurable. (#120) @drobison00

Bug Fixes 🐛

Add idx_t template parameter to matrix helper routines (#131) @tfeher
Eliminate CUDA 10.2 as valid for large svd solving (#129) @wphicks
Update check to allow svd solver on CUDA>=10.2 (#125) @wphicks
Updating gpu build.sh and debugging threads CI issue (#123) @dantegd

New Features 🚀

Adding additional distances (#116) @cjnolet

Improvements 🛠️

Update stale GHA with exemptions & new labels (#152) @mike-wendt
Add GHA to mark issues/prs as stale/rotten (#150) @Ethyling
Prepare Changelog for Automation (#135) @ajschmidt8
Adding Jensen-Shannon and BrayCurtis to DistanceType for Nearest Neighbors (#132) @lowener
Add brute force KNN (#126) @hlinsen
Make NCCL root initialization configurable. (#120) @drobison00
Auto-label PRs based on their content (#117) @jolorunyomi
Add gather & gatherv to raft::comms::comms_t (#114) @seunghwak
Adding canberra and chebyshev to distance types (#99) @cjnolet
Gpuciscripts clean and update (#92) @msadang

RAFT 0.17.0 (10 Dec 2020)

New Features

PR #65: Adding cuml prims that break circular dependency between cuml and cumlprims projects
PR #101: MST core solver
PR #93: Incorporate Date/Nagi implementation of Hungarian Algorithm
PR #94: Allow generic reductions for the map then reduce op
PR #95: Cholesky rank one update prim

Improvements

PR #108: Remove unused old-gpubuild.sh
PR #73: Move DistanceType enum from cuML to RAFT
PR #92: Cleanup gpuCI scripts
PR #98: Adding InnerProduct to DistanceType
PR #103: Epsilon parameter for Cholesky rank one update
PR #100: Add divyegala as codeowner
PR #111: Cleanup gpuCI scripts
PR #120: Update NCCL init process to support root node placement.

Bug Fixes

PR #106: Specify dependency branches to avoid pip resolver failure
PR #77: Fixing CUB include for CUDA < 11
PR #86: Missing headers for newly moved prims
PR #102: Check alignment before binaryOp dispatch
PR #104: Fix update-version.sh
PR #109: Fixing Incorrect Deallocation Size and Count Bugs

RAFT 0.16.0 (Date TBD)

New Features

PR #63: Adding MPI comms implementation
PR #70: Adding CUB to RAFT cmake

Improvements

PR #59: Adding csrgemm2 to cusparse_wrappers.h
PR #61: Add cusparsecsr2dense to cusparse_wrappers.h
PR #62: Adding get_device_allocator to handle.pxd
PR #67: Remove dependence on run-time type info

Bug Fixes

PR #56: Fix compiler warnings.
PR #64: Remove cublas_try from cusolver_wrappers.h
PR #66: Fixing typo get_stream to getStream in handle.pyx
PR #68: Change the type of recvcounts & displs in allgatherv from size_t[] to size_t* and int[] to size_t*, respectively.
PR #69: Updates for RMM being header only
PR #74: Fix std_comms::comm_split bug
PR #79: remove debug print statements
PR #81: temporarily expose internal NCCL communicator

RAFT 0.15.0 (Date TBD)

New Features

PR #12: Spectral clustering.
PR #7: Migrating cuml comms -> raft comms_t
PR #18: Adding commsplit to cuml communicator
PR #15: add exception based error handling macros
PR #29: Add ceildiv functionality
PR #44: Add get_subcomm and set_subcomm to handle_t

Improvements

PR #13: Add RMM_INCLUDE and RMM_LIBRARY options to allow linking to non-conda RMM
PR #22: Preserve order in comms workers for rank initialization
PR #38: Remove #include <cudart_utils.h> from raft/mr/
PR #39: Adding a virtual destructor to raft::handle_t and raft::comms::comms_t
PR #37: Clean-up CUDA related utilities
PR #41: Upgrade to cusparseSpMV(), alg selection, and rectangular matrices.
PR #45: Add Ampere target to cuda11 cmake
PR #47: Use gtest conda package in CMake/build.sh by default

Bug Fixes

PR #17: Make destructor inline to avoid redeclaration error
PR #25: Fix bug in handle_t::get_internal_streams
PR #26: Fix bug in RAFT_EXPECTS (add parentheses surrounding cond)
PR #34: Fix issue with incorrect docker image being used in local build script
PR #35: Remove #include <nccl.h> from raft/error.hpp
PR #40: Preemptively fixed future CUDA 11 related errors.
PR #43: Fixed CUDA version selection mechanism for SpMV.
PR #46: Fix for cpp file extension issue (nvcc-enforced).
PR #48: Fix gtest target names in cmake build gtest option.
PR #49: Skip raft comms test if raft module doesn't exist
