Building a Sub-Millisecond Vector Database in Rust/WASM
I recently built EdgeVec, a high-performance vector database that runs entirely in the browser. Here’s how I achieved sub-millisecond search times with WebAssembly.
The Challenge
Vector databases are everywhere in AI applications – they power semantic search, RAG systems, and recommendation engines. But they typically require a server. I wanted to explore: can we get comparable performance entirely in the browser?
Performance Results
| Index size | Float32 search | Quantized (SQ8) search |
|---|---|---|
| 10k vectors | 203 µs | 88 µs |
| 50k vectors | 480 µs | 167 µs |
| 100k vectors | 572 µs | 329 µs |
This is comparable to server-side solutions, but running entirely client-side.
Technical Approach
1. HNSW Algorithm
I implemented Hierarchical Navigable Small World (HNSW) graphs – the same indexing algorithm used by production vector databases like Weaviate and Qdrant.
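To give a feel for how a query traverses the graph, here is a heavily simplified sketch of the greedy step in Rust. The names (`Graph`, `greedy_search`) are illustrative, not EdgeVec's actual types, and real HNSW additionally maintains multiple layers and a dynamic candidate list (efSearch):

```rust
// Simplified single-layer greedy search: hop to the closest neighbor until
// no neighbor improves on the current distance (a local minimum).
struct Node {
    vector: Vec<f32>,
    neighbors: Vec<usize>, // indices into Graph::nodes
}

struct Graph {
    nodes: Vec<Node>,
    entry_point: usize,
}

fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

impl Graph {
    fn greedy_search(&self, query: &[f32]) -> usize {
        let mut current = self.entry_point;
        let mut best = l2(query, &self.nodes[current].vector);
        loop {
            let mut improved = false;
            for &n in &self.nodes[current].neighbors {
                let d = l2(query, &self.nodes[n].vector);
                if d < best {
                    best = d;
                    current = n;
                    improved = true;
                }
            }
            if !improved {
                return current;
            }
        }
    }
}
```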
2. Scalar Quantization
Instead of storing 32-bit floats (768 dimensions × 4 bytes = 3 KB per vector), I compress each dimension to an 8-bit integer (768 bytes per vector). This gives 3.6x memory savings with minimal accuracy loss.
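As an illustration of the idea (not EdgeVec's exact scheme), here is a minimal SQ8 encode/decode sketch that maps each dimension to a byte using a per-vector min/max range:

```rust
// Illustrative scalar quantization: one u8 code per dimension plus a small
// per-vector header (min and scale) needed to approximately reconstruct it.
struct Sq8Vector {
    codes: Vec<u8>,
    min: f32,
    scale: f32, // (max - min) / 255
}

fn quantize(v: &[f32]) -> Sq8Vector {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let scale = (max - min).max(f32::EPSILON) / 255.0;
    let codes = v.iter().map(|&x| ((x - min) / scale).round() as u8).collect();
    Sq8Vector { codes, min, scale }
}

fn dequantize(q: &Sq8Vector) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + c as f32 * q.scale).collect()
}
```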
3. SIMD Optimization
Using Rust’s portable SIMD, I vectorize distance calculations (sketched below):
- AVX2 on native (x86_64)
- simd128 on WASM (where available)
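Here is a minimal sketch of what such a kernel can look like with nightly `std::simd`; EdgeVec's actual kernel, lane width, and remainder handling may differ:

```rust
// Squared-L2 distance using Rust's portable SIMD (nightly-only std::simd;
// exact module paths have shifted between nightly versions).
#![feature(portable_simd)]
use std::simd::prelude::*;

fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let lanes = 8;
    let chunks = a.len() / lanes;
    let mut acc = f32x8::splat(0.0);
    for i in 0..chunks {
        // Load 8 dimensions from each vector and accumulate (a - b)^2 per lane.
        let va = f32x8::from_slice(&a[i * lanes..]);
        let vb = f32x8::from_slice(&b[i * lanes..]);
        let d = va - vb;
        acc += d * d;
    }
    // Sum the 8 lanes, then handle any leftover dimensions with scalar code.
    let mut sum = acc.reduce_sum();
    for i in chunks * lanes..a.len() {
        let d = a[i] - b[i];
        sum += d * d;
    }
    sum
}
```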
4. WASM Compilation
Built with wasm-pack, the final bundle is just 148 KB gzipped – small enough for any web app.
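For illustration, this is roughly how a struct like the one in the usage example below can be exposed to JavaScript via `wasm-bindgen` and built with wasm-pack. The names mirror the usage example, but the real EdgeVec bindings may differ, and the search method is omitted here:

```rust
// Hypothetical wasm-bindgen exports; requires the wasm-bindgen crate.
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub struct EdgeVecConfig {
    dim: usize,
}

#[wasm_bindgen]
impl EdgeVecConfig {
    #[wasm_bindgen(constructor)]
    pub fn new(dim: usize) -> EdgeVecConfig {
        EdgeVecConfig { dim }
    }
}

#[wasm_bindgen]
pub struct EdgeVec {
    dim: usize,
    vectors: Vec<Vec<f32>>, // stand-in for the real HNSW index
}

#[wasm_bindgen]
impl EdgeVec {
    #[wasm_bindgen(constructor)]
    pub fn new(config: &EdgeVecConfig) -> EdgeVec {
        EdgeVec { dim: config.dim, vectors: Vec::new() }
    }

    /// Insert a vector (a Float32Array on the JS side) and return its id.
    pub fn insert(&mut self, vector: &[f32]) -> usize {
        assert_eq!(vector.len(), self.dim);
        self.vectors.push(vector.to_vec());
        self.vectors.len() - 1
    }
}
```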
Use Cases
Where does client-side vector search make sense?
- Privacy: Embeddings never leave the device
- Latency: Zero network round-trip
- Offline: Works without internet
- Cost: No server bills
Perfect for browser extensions, local-first apps, and privacy-preserving RAG.
Try It
```js
import init, { EdgeVec, EdgeVecConfig } from 'edgevec';

await init();
const config = new EdgeVecConfig(768);
const index = new EdgeVec(config);
index.insert(new Float32Array(768).fill(0.1));

// Query with a vector of the same dimensionality; returns the top-10 matches.
const query = new Float32Array(768).fill(0.1);
const results = index.search(query, 10);
// results: [{ id: 0, score: 0.0 }, ...]
```
GitHub: https://github.com/matte1782/edgevec
npm: npm install edgevec
This is an alpha release – feedback welcome!