Tune HNSW Vector Search parameters for optimal performance #39

@kornelrabczak

Summary

Configure and tune the HNSW (Hierarchical Navigable Small World) index parameters for our vector search implementation to optimize accuracy, speed, and memory usage.

Background

Our current setup uses LadybugDB with HNSW indexing for vector similarity search. We have approximately 10,000 records with 768-dimensional vectors (all-mpnet-base-v2, our current baseline model).

HNSW Parameter Overview

| Parameter | Description | Current | Recommended |
|---|---|---|---|
| `mu` | Max degree, upper graph (highway connections) | 30 (default) | 24 |
| `ml` | Max degree, lower graph (street-level connections) | 60 (default) | 48 |
| `pu` | Sampling rate (fraction of nodes in upper graph) | 0.05 (5%) | 0.1 (10%) |
| `efc` | Construction effort during indexing | 200 (default) | 300-400 |
| `metric` | Distance measurement | Cosine | Cosine |

Rationale

  • mu=24: At 10k records, a dense upper graph isn't needed for speed
  • ml=48: Provides high connectivity without over-linking
  • pu=0.1: With 10k nodes, a 5% sample is only 500 nodes; 10% (1,000 nodes) creates a more robust "highway"
  • efc=300-400: A higher value ensures the best neighbors are found during construction, and indexing will still complete quickly at this scale
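For concreteness, the recommended values can be captured in a single configuration mapping. The key names below mirror the parameter table; LadybugDB's actual configuration keys may differ, so treat this as a sketch rather than its real API.

```python
# Recommended HNSW settings from the table above, expressed as a plain
# mapping. Key names are taken from the parameter table; LadybugDB's
# real configuration keys may differ.
HNSW_CONFIG = {
    "mu": 24,            # max degree, upper (highway) graph
    "ml": 48,            # max degree, lower (street-level) graph
    "pu": 0.10,          # fraction of nodes sampled into the upper graph
    "efc": 300,          # construction-time search effort (use 300-400)
    "metric": "cosine",  # distance measurement
}

# Sanity checks matching the rules of thumb in the References section:
assert HNSW_CONFIG["efc"] >= HNSW_CONFIG["ml"] * 2   # efc >= ml * 2
assert HNSW_CONFIG["ml"] == HNSW_CONFIG["mu"] * 2    # standard 1:2 ratio
```

The 1:2 `mu:ml` ratio also matches the standard profiles listed under References (16/32, 30/60, 48/96, 64/128).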

Resource Impact

| Resource | Estimate |
|---|---|
| Vector storage | ~30.7 MB |
| Index overhead | ~3.8 MB |
| Total RAM | ~35 MB |
| Search latency | 2-5 ms |
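These estimates can be checked with back-of-envelope arithmetic, assuming float32 vectors from the 768-dimensional all-mpnet-base-v2 baseline and (an assumption on our part) 8-byte neighbor ids for the `ml=48` links per node:

```python
# Back-of-envelope check of the resource estimates above.
NUM_RECORDS = 10_000
DIM = 768             # all-mpnet-base-v2 output size
BYTES_PER_FLOAT = 4   # float32

vector_mb = NUM_RECORDS * DIM * BYTES_PER_FLOAT / 1e6
print(f"Vector storage: ~{vector_mb:.1f} MB")   # ~30.7 MB

# Index overhead: ml=48 neighbor links per node in the lower graph,
# assuming 8-byte node ids (an assumption; the actual id width in
# LadybugDB may differ).
ML = 48
BYTES_PER_ID = 8
index_mb = NUM_RECORDS * ML * BYTES_PER_ID / 1e6
print(f"Index overhead: ~{index_mb:.1f} MB")    # ~3.8 MB

print(f"Total: ~{vector_mb + index_mb:.0f} MB") # ~35 MB
```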

Tuning Trade-offs Reference

| Goal | Adjustment |
|---|---|
| Higher accuracy (recall) | Increase `ml` (e.g., 80) and `efc` (e.g., 400) |
| Faster search speed | Decrease `ml` and `mu` |
| Lower memory/RAM usage | Decrease `ml` and `mu` |
| Faster data import | Decrease `efc` |

Score Calculation Note

Using the Cosine metric, scores are computed as `Score = 1 - Distance`.
If scores cluster too closely (e.g., all in 0.85-0.90), consider temperature scaling:
`Adjusted Score = e^(-k * Distance)`, where k = 2.0 or 3.0
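A minimal sketch of the scaling, showing how k = 3.0 roughly doubles the gap between two scores that the linear formula leaves 0.05 apart:

```python
import math

def adjusted_score(distance: float, k: float = 2.0) -> float:
    """Temperature scaling: e^(-k * distance) spreads scores that the
    linear 1 - distance formula leaves clustered together."""
    return math.exp(-k * distance)

# Two cosine distances that the linear formula maps to 0.90 and 0.85:
distances = (0.10, 0.15)
linear = [1 - d for d in distances]                  # gap: 0.05
scaled = [adjusted_score(d, k=3.0) for d in distances]
# e^-0.3 ~= 0.741, e^-0.45 ~= 0.638 -> gap ~0.10, twice the linear gap
```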

Embedding Model Evaluation

We should benchmark alternative embedding models to ensure we're using the best option for our use case:

Candidates to Test

| Model | Dimensions | Provider | Notes |
|---|---|---|---|
| all-mpnet-base-v2 | 768 | Sentence Transformers | Current baseline |
| all-MiniLM-L6-v2 | 384 | Sentence Transformers | Faster, smaller, slightly lower quality |
| bge-large-en-v1.5 | 1024 | BAAI | High quality, open source |
| bge-small-en-v1.5 | 384 | BAAI | Compact alternative |

Evaluation Criteria

  • Recall@K: Measure retrieval accuracy with ground truth queries
  • Latency: Embedding generation time + search time
  • Memory footprint: Vector storage requirements

Benchmark Methodology

  1. Create evaluation dataset with known relevant document pairs
  2. Generate embeddings with each candidate model
  3. Measure recall at K=5, K=10, K=20
  4. Measure average query latency

Tasks

  • Update HNSW index configuration with recommended values
  • Benchmark search accuracy/recall before and after
  • Measure search latency impact
  • Set up embedding model evaluation framework
  • Test at least 3 alternative embedding models
  • Document final configuration in codebase

References

  • Standard profiles: Light & Fast (16/32/100), Standard (30/60/200), High Recall (48/96/400), Billion-Scale (64/128/800)
  • Rule of thumb: Always ensure efc >= ml * 2
  • MTEB Leaderboard for embedding model benchmarks

Labels: 0.2.x (Issues for the 0.2 release), enhancement (New feature or request)