Building a Sub-Millisecond Vector Database in Rust/WASM

I recently built EdgeVec, a high-performance vector database that runs entirely in the browser. Here’s how I achieved sub-millisecond search times with WebAssembly.

The Challenge

Vector databases are everywhere in AI applications – they power semantic search, RAG systems, and recommendation engines. But they typically require a server. I wanted to explore: can we get comparable performance entirely in the browser?

Performance Results

Scale          Float32    Quantized (SQ8)
10k vectors    203 µs     88 µs
50k vectors    480 µs     167 µs
100k vectors   572 µs     329 µs

These search latencies are comparable to server-side solutions, yet everything runs client-side.

Technical Approach

1. HNSW Algorithm

I implemented Hierarchical Navigable Small World graphs – the same algorithm used by production vector databases like Weaviate and Qdrant.
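To give a feel for how an HNSW query moves through the graph, here is a minimal sketch of the greedy walk performed on a single layer. It is not EdgeVec's actual code: the function names, the adjacency-list layout, and the squared Euclidean metric are all assumptions made for illustration. A full HNSW search repeats a walk like this from the top layer downward and then widens it into a candidate-list search on the bottom layer.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn dist_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Greedy walk on one graph layer (hypothetical helper, not EdgeVec's API).
/// `neighbors[i]` lists the node ids linked to node `i`.
/// Starting at `entry`, keep moving to the closest neighbor of the current
/// node until no neighbor is closer to the query than the current node.
fn greedy_search_layer(
    vectors: &[Vec<f32>],
    neighbors: &[Vec<usize>],
    entry: usize,
    query: &[f32],
) -> usize {
    let mut current = entry;
    let mut best = dist_sq(&vectors[current], query);
    loop {
        let mut improved = false;
        // Scan the neighbors of the node we started this round on
        // and remember the closest one seen so far.
        for &n in &neighbors[current] {
            let d = dist_sq(&vectors[n], query);
            if d < best {
                best = d;
                current = n;
                improved = true;
            }
        }
        if !improved {
            // Local minimum: no linked node is closer to the query.
            return current;
        }
    }
}
```

The walk only ever touches the neighbors of the nodes it visits, which is why search cost grows much more slowly than the number of stored vectors.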

2. Scalar Quantization

Instead of storing each component as a 32-bit float (768 dimensions × 4 bytes = 3,072 bytes, or 3 KB per vector), I compress each component to an 8-bit integer. This gives 3.6x memory savings with minimal accuracy loss.
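A minimal sketch of 8-bit scalar quantization follows, assuming a simple min/max range fitted over the values being encoded. The names `Sq8Params`, `sq8_fit`, `sq8_encode`, and `sq8_decode` are hypothetical, and EdgeVec's real scheme may choose its range or rounding differently.

```rust
/// Parameters mapping f32 values in [min, min + 255 * scale] onto codes 0..=255.
/// (Hypothetical type for illustration, not EdgeVec's API.)
struct Sq8Params {
    min: f32,
    scale: f32,
}

/// Fit the quantization range to the observed minimum and maximum.
fn sq8_fit(values: &[f32]) -> Sq8Params {
    let min = values.iter().copied().fold(f32::INFINITY, f32::min);
    let max = values.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Avoid a zero-width range when all values are equal.
    let range = (max - min).max(f32::EPSILON);
    Sq8Params { min, scale: range / 255.0 }
}

/// Encode each f32 as the nearest of 256 evenly spaced levels.
fn sq8_encode(p: &Sq8Params, values: &[f32]) -> Vec<u8> {
    values
        .iter()
        .map(|&v| ((v - p.min) / p.scale).round().clamp(0.0, 255.0) as u8)
        .collect()
}

/// Reconstruct approximate f32 values from the 8-bit codes.
fn sq8_decode(p: &Sq8Params, codes: &[u8]) -> Vec<f32> {
    codes.iter().map(|&c| p.min + c as f32 * p.scale).collect()
}
```

With this scheme the reconstruction error per component is at most half a quantization step (`scale / 2`). For a 768-dimensional vector the codes occupy 768 bytes instead of 3,072, before any per-vector metadata or graph links are counted.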

3. SIMD Optimization

Using Rust’s portable SIMD, I vectorize distance calculations:

  • AVX2 on native (x86_64)
  • simd128 on WASM (where available)
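The shape of a vectorized distance kernel can be sketched on stable Rust without any SIMD API. The version below is a stand-in, not EdgeVec's implementation: Rust's portable SIMD (`std::simd`) currently requires a nightly toolchain, so this sketch uses plain arrays as "lanes" and leaves the actual vectorization to the compiler. The function name and the lane width of 8 (the number of f32 values in a 256-bit AVX2 register; simd128 holds 4) are assumptions for illustration.

```rust
/// Number of f32 lanes processed per step (8 matches a 256-bit register).
const LANES: usize = 8;

/// Squared L2 distance computed in fixed-width lanes.
/// Each lane keeps its own accumulator, mirroring how a SIMD register
/// would hold LANES partial sums that are reduced once at the end.
fn l2_squared_lanes(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");

    let mut acc = [0.0f32; LANES];
    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);
    // Elements left over when the length is not a multiple of LANES.
    let a_tail = a_chunks.remainder();
    let b_tail = b_chunks.remainder();

    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            let d = ca[i] - cb[i];
            acc[i] += d * d;
        }
    }

    // Horizontal reduction of the lane accumulators, then the scalar tail.
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in a_tail.iter().zip(b_tail) {
        let d = x - y;
        sum += d * d;
    }
    sum
}
```

Because the lanes are summed separately, the floating-point result can differ in the last bits from a strictly sequential loop; the same is true of real SIMD kernels.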

4. WASM Compilation

Built with wasm-pack, the final bundle is just 148 KB gzipped – small enough for any web app.

Use Cases

Where does client-side vector search make sense?

  • Privacy: Embeddings never leave the device
  • Latency: Zero network round-trip
  • Offline: Works without internet
  • Cost: No server bills

Perfect for browser extensions, local-first apps, and privacy-preserving RAG.

Try It

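import init, { EdgeVec, EdgeVecConfig } from 'edgevec';

// Load and instantiate the WASM module before using the API
await init();

// Create an index for 768-dimensional vectors
const config = new EdgeVecConfig(768);
const index = new EdgeVec(config);

// Insert one vector, then ask for the 10 nearest matches to a query
index.insert(new Float32Array(768).fill(0.1));
const query = new Float32Array(768).fill(0.1); // placeholder; use a real embedding
const results = index.search(query, 10);
// results: [{ id: 0, score: 0.0 }, ...]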

GitHub: https://github.com/matte1782/edgevec
npm: npm install edgevec

This is an alpha release – feedback welcome!
