Roadmap

quantvec is built in waves with a doer / verifier / devil’s-advocate subagent workflow; each wave merges only once its gate is green (typecheck, lint, tests, coverage) and the change has been reviewed.

Shipped

Core math — seeded RNG, orthonormal rotation (Householder QR), Beta pdf/cdf/quantile, Lloyd-Max codebooks validated against a scipy reference and the paper’s distortion bounds.
Encode pipeline — normalize → rotate → quantize → RaBitQ scale.
Flat search — per-query nibble-LUT scan, three metrics, masking, bounded-heap top-k.
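The bounded-heap top-k step of the flat search can be sketched as follows. This is a minimal illustration rather than quantvec's own code: the names `Hit` and `topK`, the 0/1 byte mask, and the "higher score is better" convention are all assumptions made for the example.

```typescript
// Hypothetical helper (not quantvec's API): keep the k highest scores using a
// size-k min-heap. The heap root is the worst retained hit, so most candidates
// are rejected with a single comparison.
interface Hit {
  index: number;
  score: number;
}

function topK(scores: Float32Array, k: number, mask?: Uint8Array): Hit[] {
  const heap: Hit[] = []; // min-heap ordered by score

  const siftUp = (start: number): void => {
    let i = start;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (heap[parent].score <= heap[i].score) break;
      [heap[parent], heap[i]] = [heap[i], heap[parent]];
      i = parent;
    }
  };

  const siftDown = (start: number): void => {
    let i = start;
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < heap.length && heap[left].score < heap[smallest].score) smallest = left;
      if (right < heap.length && heap[right].score < heap[smallest].score) smallest = right;
      if (smallest === i) return;
      [heap[smallest], heap[i]] = [heap[i], heap[smallest]];
      i = smallest;
    }
  };

  for (let i = 0; i < scores.length; i++) {
    if (mask && mask[i] === 0) continue; // assumed convention: 0 means masked out
    const s = scores[i];
    if (heap.length < k) {
      heap.push({ index: i, score: s });
      siftUp(heap.length - 1);
    } else if (k > 0 && s > heap[0].score) {
      heap[0] = { index: i, score: s }; // replace the current worst hit
      siftDown(0);
    }
  }
  // Best first.
  return heap.sort((a, b) => b.score - a.score);
}
```

Memory stays at O(k) regardless of corpus size, and the scan is a single pass, which is why a bounded heap pairs naturally with a linear LUT scan.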
Index classes — TurboQuantIndex (positional) and IdMapIndex (stable number/string/bigint ids), both at 100% test coverage.
Persistence — single versioned QVEC format with untrusted-input-hardened loading; Node fs helpers.
Bit-packed serialization — codes stored at true 2/3/4 bits on disk (7.9–15.7× compression, on par with native TurboQuant implementations).
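Storing codes at their true bit width means several codes share a byte and, at 3 bits, some codes straddle a byte boundary. The sketch below shows one way to do this with LSB-first packing. The function names and the bit order are assumptions for illustration; the actual QVEC layout is not specified here.

```typescript
// Hypothetical helpers (not the QVEC on-disk layout): pack codes of width
// `bits` (1..8) into bytes, least-significant bit first.
function packCodes(codes: Uint8Array, bits: number): Uint8Array {
  const out = new Uint8Array(Math.ceil((codes.length * bits) / 8));
  const valueMask = (1 << bits) - 1;
  let bitPos = 0;
  for (let i = 0; i < codes.length; i++) {
    const code = codes[i] & valueMask;
    const byte = bitPos >> 3;
    const shift = bitPos & 7;
    out[byte] |= (code << shift) & 0xff;
    // A code that crosses a byte boundary spills its high bits into the next byte.
    if (shift + bits > 8) out[byte + 1] |= code >> (8 - shift);
    bitPos += bits;
  }
  return out;
}

function unpackCodes(packed: Uint8Array, bits: number, count: number): Uint8Array {
  const out = new Uint8Array(count);
  const valueMask = (1 << bits) - 1;
  let bitPos = 0;
  for (let i = 0; i < count; i++) {
    const byte = bitPos >> 3;
    const shift = bitPos & 7;
    let v = packed[byte] >> shift;
    if (shift + bits > 8) v |= packed[byte + 1] << (8 - shift);
    out[i] = v & valueMask;
    bitPos += bits;
  }
  return out;
}
```

With this scheme n codes occupy ceil(n·bits / 8) bytes, so 8 three-bit codes fit in exactly 3 bytes.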
FWHT rotation — for power-of-two dims, an exact Randomized Hadamard Transform (O(d·log d) build and apply; ~25× faster encode than the dense Householder rotation, which costs O(d²) to apply and O(d³) to build) with equal-or-better recall. createRotation picks it automatically; non-power-of-two dims keep the dense rotation (truncation would degrade recall). The choice is a deterministic function of dim, so the serialization format is unchanged.
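To make the O(d·log d) claim concrete, here is a minimal sketch of a randomized Hadamard rotation: multiply by a diagonal of random ±1 signs, then apply an orthonormal fast Walsh-Hadamard transform. The single sign-flip round, the `mulberry32` stand-in generator, and all function names are assumptions for illustration; quantvec's createRotation may differ in each of these.

```typescript
// Stand-in seeded PRNG (mulberry32); NOT necessarily quantvec's RNG.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Deterministic ±1 diagonal derived from (d, seed).
function randomSigns(d: number, seed: number): Int8Array {
  const rand = mulberry32(seed);
  const signs = new Int8Array(d);
  for (let i = 0; i < d; i++) signs[i] = rand() < 0.5 ? -1 : 1;
  return signs;
}

// In-place fast Walsh-Hadamard transform with orthonormal 1/sqrt(d) scaling.
// log2(d) butterfly passes of d/2 add/subtract pairs each: O(d log d).
function fwht(x: Float64Array): void {
  const d = x.length;
  if (d === 0 || (d & (d - 1)) !== 0) {
    throw new RangeError("length must be a power of two");
  }
  for (let h = 1; h < d; h <<= 1) {
    for (let i = 0; i < d; i += h << 1) {
      for (let j = i; j < i + h; j++) {
        const a = x[j];
        const b = x[j + h];
        x[j] = a + b;
        x[j + h] = a - b;
      }
    }
  }
  const s = 1 / Math.sqrt(d);
  for (let i = 0; i < d; i++) x[i] *= s;
}

// y = H · D · x, where D = diag(signs). Orthonormal, so norms are preserved.
function rotate(x: Float64Array, signs: Int8Array): Float64Array {
  const y = new Float64Array(x.length);
  for (let i = 0; i < x.length; i++) y[i] = x[i] * signs[i];
  fwht(y);
  return y;
}
```

Because the sign diagonal is regenerated from the dimension and seed, nothing beyond those two values needs to be stored, which is consistent with the rotation requiring no serialization change.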
TQ+ per-coordinate calibration — opt-in (calibrate: true), auto-fit from the first ≥1000-vector batch and frozen. Remaps each rotated coordinate onto the canonical marginal; data-dependent (helps real embeddings, neutral-to-negative on well-conditioned synthetic data), hence off by default.
WASM scoring kernel — an AssemblyScript kernel (assembly/index.ts) with the index’s codes resident in linear memory (uploaded once per mutation, not per query). It accumulates in f64 over the same order as the scalar oracle, so results are bit-identical (an exact acceleration, ~1.3× faster query), and it is base64-inlined with runtime feature-detection and an automatic pure-TS fallback — the kernel is an optimization, never a dependency.
v128 FastScan — a SIMD swizzle kernel (assembly/index.ts + wasm/kernel.ts): codes stored in a 16-vector blocked layout; per-query u8 LUT built with a global affine map so dim·max ≤ 65535 (no u16 overflow); v128.swizzle performs 16 simultaneous table lookups; u16 accumulators rank a candidate pool; the pool is rescored exactly via the exact WASM kernel. Enabled with fastscan: true (4-bit only; falls back to the exact kernel otherwise). Measured speedup: 1.8× on 10k vectors, 5.7× on 50k vectors (gain scales with n; the rescore-pool overhead is constant).
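The overflow guarantee in the FastScan description comes from capping the largest u8 LUT entry so that a sum of dim entries cannot exceed 65535. The sketch below shows one plausible way to build such a table with a single global scale and offset, plus a scalar stand-in for the SIMD lookups. The layout (dim tables of 16 floats), the names, and the rounding rule are assumptions for illustration, not the kernel's actual code.

```typescript
// Illustrative only: names and layout are assumptions, not quantvec's API.
// `lut` holds dim tables of 16 float entries (one table per 4-bit code position).
interface QuantizedLut {
  table: Uint8Array; // same layout as the input, entries in [0, qmax]
  scale: number; // q ≈ (v - offset) * scale
  offset: number;
  qmax: number; // largest u8 value used; dim * qmax <= 65535
}

function quantizeLut(lut: Float32Array, dim: number): QuantizedLut {
  if (dim < 1 || dim > 65535 || lut.length !== dim * 16) {
    throw new RangeError("expected dim in [1, 65535] and dim * 16 LUT entries");
  }
  // Cap the per-entry maximum so a sum of dim entries always fits in a u16.
  const qmax = Math.min(255, Math.floor(65535 / dim));
  let lo = Infinity;
  let hi = -Infinity;
  for (let i = 0; i < lut.length; i++) {
    if (lut[i] < lo) lo = lut[i];
    if (lut[i] > hi) hi = lut[i];
  }
  // One scale and offset for every table, so integer sums rank vectors consistently.
  const scale = hi > lo ? qmax / (hi - lo) : 0;
  const table = new Uint8Array(lut.length);
  for (let i = 0; i < lut.length; i++) {
    table[i] = Math.min(qmax, Math.round((lut[i] - lo) * scale));
  }
  return { table, scale, offset: lo, qmax };
}

// Scalar equivalent of the table lookups: add one u8 entry per code position.
function accumulate(q: QuantizedLut, codes: Uint8Array): number {
  const acc = new Uint16Array(1); // u16 accumulator, matching the description above
  for (let j = 0; j < codes.length; j++) {
    acc[0] += q.table[j * 16 + (codes[j] & 15)];
  }
  return acc[0];
}
```

For dim ≤ 257 the full u8 range is available; beyond that the cap tightens (for example 127 at dim = 512), trading LUT resolution for overflow safety. That loss of resolution is one reason an exact rescore of the candidate pool is useful.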
Ergonomic createCollection — Qdrant-style Collection<P> with typed payloads, upsert, delete, get, and a must/should/must_not filter DSL compiled to slot masks. Zero overhead: a thin wrapper over IdMapIndex.
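As an illustration of what "compiled to slot masks" can mean, the sketch below evaluates a must/should/must_not filter once per slot and emits a 0/1 byte mask that a scan could consult. The condition shape (`key` plus exact `match`), the type names, and the rule that an empty `should` list imposes no constraint are assumptions for this example, not the documented Collection API.

```typescript
// Hypothetical types: the real Collection<P> filter DSL may look different.
type Payload = Record<string, string | number | boolean>;

interface Condition {
  key: string;
  match: string | number | boolean; // exact-equality match only, for brevity
}

interface Filter {
  must?: Condition[]; // every condition must hold
  should?: Condition[]; // at least one must hold, if any are given
  must_not?: Condition[]; // no condition may hold
}

// Evaluate the filter once per slot and return a 0/1 mask aligned with slots.
function compileMask(payloads: Array<Payload | undefined>, filter: Filter): Uint8Array {
  const must = filter.must || [];
  const should = filter.should || [];
  const mustNot = filter.must_not || [];
  const mask = new Uint8Array(payloads.length);

  for (let slot = 0; slot < payloads.length; slot++) {
    const p = payloads[slot];
    if (p === undefined) continue; // empty or deleted slot stays masked out
    const holds = (c: Condition): boolean => p[c.key] === c.match;
    const ok =
      must.every(holds) &&
      (should.length === 0 || should.some(holds)) &&
      !mustNot.some(holds);
    mask[slot] = ok ? 1 : 0;
  }
  return mask;
}
```

Evaluating payload predicates up front keeps the scoring loop free of payload access: the scan only reads one byte per slot.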
Real-dataset benchmarks — SIFT-small (10k × 128-d, L2, dataset ground truth), GloVe-200 (100k of 1.18M × 200-d, cosine, brute-force sub-sample ground truth), and dbpedia-OpenAI-100k (1536-d OpenAI text-embedding-ada-002, cosine, brute-force sub-sample ground truth). Results in benchmarks/results/. SIFT-small (dim=128 = 2⁷) exercises the FWHT path; GloVe-200 (dim=200) and dbpedia-OpenAI-100k (dim=1536 = 3·2⁹, not a power of two) exercise the dense Householder rotation.
IVF / coarse quantizer — opt-in (ivf: { nlist, nprobe? }) sublinear search for large corpora: seeded k-means++ cells (spherical for cosine/dot, L2 for euclidean) trained from the first batch of ≥ nlist vectors and then frozen (same contract as calibration), posting lists kept in lockstep with swap-remove (full remove parity), a per-query nprobe knob, and serialization in format v2. The probed-cell scan reuses the exact scalar kernel, so nprobe = nlist reproduces the flat scan bit-for-bit. Measured (20k × 768-d clustered, 4-bit): 11.4× QPS at the flat scan’s recall (nprobe = 8 of nlist = 128), 22.8× at nprobe = 1. Full parity through IdMapIndex and Collection (ivf config + nprobe search param).
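Two mechanics from the IVF description are easy to show in isolation: choosing which cells to probe, and removing an entry from a posting list in O(1) by swap-remove. The sketch below uses squared L2 distance to centroids and plain number arrays as posting lists. Both helper names and the data shapes are assumptions for illustration, not quantvec's internals.

```typescript
// Hypothetical helper: pick the nprobe cells whose centroids are closest
// (squared L2) to the query. Ties break on cell index for determinism.
function selectCells(query: Float32Array, centroids: Float32Array[], nprobe: number): number[] {
  const ranked = centroids.map((c, cell) => {
    let d = 0;
    for (let i = 0; i < query.length; i++) {
      const diff = query[i] - c[i];
      d += diff * diff;
    }
    return { cell, d };
  });
  ranked.sort((a, b) => a.d - b.d || a.cell - b.cell);
  return ranked.slice(0, Math.min(nprobe, ranked.length)).map((e) => e.cell);
}

// Hypothetical helper: O(1) removal from a posting list. The last entry is
// moved into the vacated position; its id is returned so the caller can update
// whatever bookkeeping records that entry's position. Returns undefined when
// the removed entry was itself the last one.
function swapRemove(list: number[], pos: number): number | undefined {
  if (pos < 0 || pos >= list.length) throw new RangeError("position out of range");
  const last = list.pop() as number;
  if (pos === list.length) return undefined; // removed the tail, nothing moved
  list[pos] = last;
  return last;
}
```

With nprobe equal to the number of cells, `selectCells` returns every cell, so the probed scan visits the same vectors as a flat scan.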

Planned

(none — all planned waves have shipped)

Non-goals (for now)

An HNSW graph index — quantvec is deliberately a flat quantized index; IVF (shipped) is the path to scale.
A trained/learned codebook — the data-oblivious, zero-training property is the point. (The IVF coarse quantizer trains only the cell partition, never the per-coordinate codebook — codes stay data-oblivious.)