Tune HNSW Vector Search parameters for optimal performance #39

@kornelrabczak

Summary

Configure and tune the HNSW (Hierarchical Navigable Small World) index parameters for our vector search implementation to optimize accuracy, speed, and memory usage.

Background

Our current setup uses LadybugDB with HNSW indexing for vector similarity search. We have approximately 10,000 records with 768-dimensional vectors (all-mpnet-base-v2, our current baseline model).

HNSW Parameter Overview

| Parameter | Description | Current | Recommended |
|---|---|---|---|
| `mu` | Max degree, upper graph (highway connections) | 30 (default) | 24 |
| `ml` | Max degree, lower graph (street-level connections) | 60 (default) | 48 |
| `pu` | Sampling rate (fraction of nodes in upper graph) | 0.05 (5%) | 0.1 (10%) |
| `efc` | Construction effort during indexing | 200 (default) | 300-400 |
| `metric` | Distance measurement | Cosine | Cosine |

Rationale

  • mu=24: At 10k records, a dense upper graph isn't needed for speed
  • ml=48: Provides high connectivity without over-linking
  • pu=0.1: With 10k nodes, a 5% sample is only 500 nodes; 10% (1,000 nodes) creates a more robust "highway"
  • efc=300-400: A higher value ensures the best neighbors are found during construction, and indexing will still complete quickly at this scale
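For concreteness, the recommended values can be captured in a single configuration mapping. The key names below mirror the parameter table; LadybugDB's actual configuration keys may differ, so treat this as a sketch rather than its real API.

```python
# Recommended HNSW settings from the table above, expressed as a plain
# mapping. Key names are taken from the parameter table; LadybugDB's
# real configuration keys may differ.
HNSW_CONFIG = {
    "mu": 24,            # max degree, upper (highway) graph
    "ml": 48,            # max degree, lower (street-level) graph
    "pu": 0.10,          # fraction of nodes sampled into the upper graph
    "efc": 300,          # construction-time search effort (use 300-400)
    "metric": "cosine",  # distance measurement
}

# Sanity checks matching the rules of thumb in the References section:
assert HNSW_CONFIG["efc"] >= HNSW_CONFIG["ml"] * 2   # efc >= ml * 2
assert HNSW_CONFIG["ml"] == HNSW_CONFIG["mu"] * 2    # standard 1:2 ratio
```

The 1:2 `mu:ml` ratio also matches the standard profiles listed under References (16/32, 30/60, 48/96, 64/128).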

Resource Impact

| Resource | Estimate |
|---|---|
| Vector storage | ~30.7 MB |
| Index overhead | ~3.8 MB |
| Total RAM | ~35 MB |
| Search latency | 2-5 ms |
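These estimates can be checked with back-of-envelope arithmetic, assuming float32 vectors from the 768-dimensional all-mpnet-base-v2 baseline and (an assumption on our part) 8-byte neighbor ids for the `ml=48` links per node:

```python
# Back-of-envelope check of the resource estimates above.
NUM_RECORDS = 10_000
DIM = 768             # all-mpnet-base-v2 output size
BYTES_PER_FLOAT = 4   # float32

vector_mb = NUM_RECORDS * DIM * BYTES_PER_FLOAT / 1e6
print(f"Vector storage: ~{vector_mb:.1f} MB")   # ~30.7 MB

# Index overhead: ml=48 neighbor links per node in the lower graph,
# assuming 8-byte node ids (an assumption; the actual id width in
# LadybugDB may differ).
ML = 48
BYTES_PER_ID = 8
index_mb = NUM_RECORDS * ML * BYTES_PER_ID / 1e6
print(f"Index overhead: ~{index_mb:.1f} MB")    # ~3.8 MB

print(f"Total: ~{vector_mb + index_mb:.0f} MB") # ~35 MB
```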

Tuning Trade-offs Reference

| Goal | Adjustment |
|---|---|
| Higher accuracy (recall) | Increase `ml` (e.g., 80) and `efc` (e.g., 400) |
| Faster search speed | Decrease `ml` and `mu` |
| Lower memory/RAM usage | Decrease `ml` and `mu` |
| Faster data import | Decrease `efc` |

Score Calculation Note

Using the Cosine metric, scores are computed as `Score = 1 - Distance`.
If scores cluster too closely (e.g., all in 0.85-0.90), consider temperature scaling:
`Adjusted Score = e^(-k * Distance)`, where k = 2.0 or 3.0
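A minimal sketch of the scaling, showing how k = 3.0 roughly doubles the gap between two scores that the linear formula leaves 0.05 apart:

```python
import math

def adjusted_score(distance: float, k: float = 2.0) -> float:
    """Temperature scaling: e^(-k * distance) spreads scores that the
    linear 1 - distance formula leaves clustered together."""
    return math.exp(-k * distance)

# Two cosine distances that the linear formula maps to 0.90 and 0.85:
distances = (0.10, 0.15)
linear = [1 - d for d in distances]                  # gap: 0.05
scaled = [adjusted_score(d, k=3.0) for d in distances]
# e^-0.3 ~= 0.741, e^-0.45 ~= 0.638 -> gap ~0.10, twice the linear gap
```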

Embedding Model Evaluation

We should benchmark alternative embedding models to ensure we're using the best option for our use case:

Candidates to Test

| Model | Dimensions | Provider | Notes |
|---|---|---|---|
| all-mpnet-base-v2 | 768 | Sentence Transformers | Current baseline |
| all-MiniLM-L6-v2 | 384 | Sentence Transformers | Faster, smaller, slightly lower quality |
| bge-large-en-v1.5 | 1024 | BAAI | High quality, open source |
| bge-small-en-v1.5 | 384 | BAAI | Compact alternative |

Evaluation Criteria

  • Recall@K: Measure retrieval accuracy with ground truth queries
  • Latency: Embedding generation time + search time
  • Memory footprint: Vector storage requirements

Benchmark Methodology

  1. Create evaluation dataset with known relevant document pairs
  2. Generate embeddings with each candidate model
  3. Measure recall at K=5, K=10, K=20
  4. Measure average query latency

Tasks

  • Update HNSW index configuration with recommended values
  • Benchmark search accuracy/recall before and after
  • Measure search latency impact
  • Set up embedding model evaluation framework
  • Test at least 3 alternative embedding models
  • Document final configuration in codebase

References

  • Standard profiles: Light & Fast (16/32/100), Standard (30/60/200), High Recall (48/96/400), Billion-Scale (64/128/800)
  • Rule of thumb: Always ensure efc >= ml * 2
  • MTEB Leaderboard for embedding model benchmarks

Labels: 0.2.x (Issues for the 0.2 release), enhancement (New feature or request)