slice B · efficiency frontier
recall vs latency, per engine
Slice A fixes the effort knob and compares engines at one operating point. Slice B does the opposite: it sweeps each engine's effort knob across its whole range, at N = 500K, and plots the recall/latency frontier. Where's the knee? How much latency do you pay for the last point of recall? Is the default knob value right?
data provenance
Where the numbers come from. Same source, same generator, same ground truth for every engine in the comparison.
corpus: Simple English Wikipedia, passages ≥ 500 chars truncated to ~400 chars. The public dump was preprocessed once and frozen for the bench.
embeddings: mxbai-embed-large-v1, 1024 dimensions, computed with sentence-transformers on Apple Metal (MPS). The same model embeds both corpus and queries, so the two never come from different models.
queries: 1 000 hold-out passages (the same set as slice A), re-used at this scale.
ground truth: top-100 nearest neighbours from exact brute-force cosine similarity over float32 vectors. Computed once per scale, reused by every engine, and frozen as a parquet file next to the corpus.
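To pin down what "exact brute-force cosine" and recall@k mean here, the following is a minimal sketch, not the bench's actual code. The function names `exact_top_k` and `recall_at_k` are hypothetical, and the recall definition (overlap between an engine's top-k and the true top-k, averaged over queries) is an assumption about how the table's recall columns are computed.

```python
import numpy as np

def exact_top_k(corpus: np.ndarray, queries: np.ndarray, k: int = 100) -> np.ndarray:
    """Indices of the k most cosine-similar corpus rows per query, best first."""
    corpus = corpus.astype(np.float32)
    queries = queries.astype(np.float32)
    # Normalise rows so a plain dot product equals cosine similarity.
    corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    queries_n = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    sims = queries_n @ corpus_n.T  # shape: (n_queries, n_corpus)
    # argpartition selects the top k in arbitrary order; then sort just those k.
    top = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    order = np.argsort(-np.take_along_axis(sims, top, axis=1), axis=1)
    return np.take_along_axis(top, order, axis=1)

def recall_at_k(found: np.ndarray, truth: np.ndarray, k: int) -> float:
    """Assumed definition: mean share of the true top-k present in the engine's top-k."""
    hits = [len(set(f[:k].tolist()) & set(t[:k].tolist())) for f, t in zip(found, truth)]
    return float(np.mean(hits)) / k
```

At 500K × 1024 the full similarity matrix for 1 000 queries would likely be computed in query batches to bound memory; the sketch leaves batching out for clarity.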
recall @ 10 vs p99 latency
Each dot is a (knob-value, recall, latency) point. The line connects the dots in sweep order, tracing one engine's frontier. Lower-right is better: high recall at low latency. The exact knob value behind each dot is listed under "all numbers"; the steepness of the line tells you how aggressive the recall/latency trade is on that engine.
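Because the line follows sweep order, it can double back when a larger knob value happens to measure slightly worse. One way to read past that noise is to keep only the non-dominated points. The sketch below is an illustration, not part of the bench: `Point` and `pareto_frontier` are hypothetical names, and the data is the qdrant-hnsw sweep from the results table.

```python
from typing import NamedTuple

class Point(NamedTuple):
    knob: int
    recall: float   # recall@10
    p99_us: int     # p99 latency in microseconds

def pareto_frontier(points: list[Point]) -> list[Point]:
    """Keep points that no other point beats on both recall and latency."""
    frontier: list[Point] = []
    best_recall = -1.0
    # Walk from fastest to slowest; a point survives only if it improves recall.
    for p in sorted(points, key=lambda p: (p.p99_us, -p.recall)):
        if p.recall > best_recall:
            frontier.append(p)
            best_recall = p.recall
    return frontier

qdrant_hnsw = [
    Point(32, 0.9846, 3588),
    Point(64, 0.9822, 3297),
    Point(128, 0.9888, 3533),
    Point(256, 0.9963, 4717),
    Point(512, 0.9995, 6482),
]

print([p.knob for p in pareto_frontier(qdrant_hnsw)])  # [64, 128, 256, 512]
```

Here ef = 32 drops out because ef = 128 measured both higher recall and lower p99, which is more likely run-to-run noise than a real property of the knob.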
QPS as the knob opens
Throughput at single-client concurrency as the effort knob grows.
Each engine has a different knob (skeg l_search 50–800; chroma and qdrant ef 32–512; qdrant-pq adds nprobes), so the lines start at different X positions. That is the knob's valid range, not a render glitch. Compare lines vertically at the same X for a fair "at this effort level" view, and use the frontier above for the apples-to-apples recall-vs-latency picture.
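With a single client, throughput is roughly the inverse of per-query latency, which gives a quick consistency check on the QPS column. The sketch below uses p50 as a stand-in for mean latency; that substitution is an assumption, and `qps_from_p50` is a hypothetical helper. The sample rows are copied from the results table.

```python
def qps_from_p50(p50_us: float) -> float:
    """Estimate single-client QPS as if every query took the median latency."""
    return 1_000_000 / p50_us

# (engine, knob value, p50 in µs, reported qps), taken from the results table.
rows = [
    ("chroma-hnsw", 32, 3859, 255),
    ("qdrant-hnsw", 512, 4700, 212),
    ("skeg-int8", 50, 754, 1265),
]

for engine, knob, p50_us, reported in rows:
    estimate = qps_from_p50(p50_us)
    print(f"{engine:12s} knob={knob:<4d} estimated={estimate:7.1f} reported={reported}")
```

The estimates land a few percent above the reported values (about 259 against 255 for chroma-hnsw at ef = 32), which is what a latency distribution with a mean slightly above its median would produce.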
all numbers
| engine | scale | knob | value | recall@10 | recall@100 | p50 µs | p99 µs | qps | rss MiB |
|---|---|---|---|---|---|---|---|---|---|
| chroma-hnsw | 500k | ef | 32 | 0.9788 | 0.9145 | 3859 | 6565 | 255 | 2272.9 |
| chroma-hnsw | 500k | ef | 64 | 0.9805 | 0.9153 | 3871 | 6609 | 255 | 2273.9 |
| chroma-hnsw | 500k | ef | 128 | 0.9854 | 0.9382 | 4346 | 7365 | 227 | 2275.9 |
| chroma-hnsw | 500k | ef | 256 | 0.9958 | 0.9770 | 6478 | 11113 | 155 | 2287.3 |
| chroma-hnsw | 500k | ef | 512 | 0.9992 | 0.9925 | 10046 | 15256 | 102 | 2292.9 |
| qdrant-hnsw | 500k | ef | 32 | 0.9846 | 0.9321 | 2651 | 3588 | 370 | 2347.9 |
| qdrant-hnsw | 500k | ef | 64 | 0.9822 | 0.9245 | 2580 | 3297 | 384 | 2331.8 |
| qdrant-hnsw | 500k | ef | 128 | 0.9888 | 0.9480 | 2712 | 3533 | 367 | 2334.0 |
| qdrant-hnsw | 500k | ef | 256 | 0.9963 | 0.9809 | 3454 | 4717 | 287 | 2369.0 |
| qdrant-hnsw | 500k | ef | 512 | 0.9995 | 0.9939 | 4700 | 6482 | 212 | 2376.0 |
| qdrant-pq | 500k | ef | 32 | 0.7770 | 0.8257 | 2285 | 3103 | 424 | 2380.0 |
| qdrant-pq | 500k | ef | 64 | 0.7817 | 0.8324 | 2252 | 2867 | 435 | 2346.8 |
| qdrant-pq | 500k | ef | 128 | 0.7813 | 0.8406 | 2494 | 3183 | 397 | 2386.0 |
| qdrant-pq | 500k | ef | 256 | 0.7830 | 0.8421 | 2855 | 3730 | 346 | 2392.9 |
| qdrant-pq | 500k | ef | 512 | 0.7815 | 0.8426 | 3507 | 4704 | 282 | 2379.8 |
| qdrant-sq | 500k | ef | 32 | 0.9503 | 0.9242 | 1997 | 3240 | 484 | 2677.1 |
| qdrant-sq | 500k | ef | 64 | 0.9506 | 0.9317 | 2011 | 3058 | 487 | 2885.0 |
| qdrant-sq | 500k | ef | 128 | 0.9547 | 0.9547 | 2191 | 3066 | 447 | 2363.6 |
| qdrant-sq | 500k | ef | 256 | 0.9557 | 0.9679 | 2629 | 3513 | 372 | 2779.7 |
| qdrant-sq | 500k | ef | 512 | 0.9613 | 0.9703 | 3031 | 4269 | 326 | 2789.5 |
| skeg-int8 | 500k | l_search | 50 | 0.9949 | 0.9652 | 754 | 1443 | 1265 | 630.7 |
| skeg-int8 | 500k | l_search | 100 | 0.9950 | 0.9652 | 737 | 1508 | 1309 | 623.3 |
| skeg-int8 | 500k | l_search | 150 | 0.9974 | 0.9881 | 994 | 1797 | 984 | 629.0 |
| skeg-int8 | 500k | l_search | 200 | 0.9987 | 0.9927 | 1188 | 2110 | 825 | 632.1 |
| skeg-int8 | 500k | l_search | 300 | 0.9994 | 0.9963 | 1643 | 2865 | 597 | 636.3 |
| skeg-pq128 | 500k | l_search | 50 | 0.9909 | 0.7294 | 880 | 1357 | 1113 | 219.7 |
| skeg-pq128 | 500k | l_search | 100 | 0.9910 | 0.7291 | 892 | 1376 | 1100 | 229.1 |
| skeg-pq128 | 500k | l_search | 200 | 0.9979 | 0.9202 | 1456 | 2284 | 685 | 221.2 |
| skeg-pq128 | 500k | l_search | 300 | 0.9994 | 0.9683 | 1829 | 3008 | 540 | 217.7 |
| skeg-pq128 | 500k | l_search | 500 | 0.9999 | 0.9861 | 2620 | 4438 | 379 | 218.7 |
| skeg-pq128 | 500k | l_search | 800 | 0.9999 | 0.9867 | 3543 | 5771 | 279 | 224.9 |
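To put a number on "how much latency do you pay for the last point of recall", one option is to divide the p99 increase between consecutive knob values by the recall@10 gained. This is a sketch under that assumed definition; `marginal_cost` is a hypothetical helper and the data is the skeg-int8 sweep from the table above.

```python
from typing import Optional

def marginal_cost(sweep: list[tuple[int, float, int]]) -> list[tuple[int, int, Optional[float]]]:
    """Extra p99 µs per +0.01 recall@10 between consecutive knob values.

    Each sweep entry is (knob, recall_at_10, p99_us). Steps where recall does
    not improve yield None, since no finite price buys a non-gain.
    """
    out = []
    for (k0, r0, l0), (k1, r1, l1) in zip(sweep, sweep[1:]):
        gain_points = (r1 - r0) * 100  # 0.01 recall = one point
        cost_us = l1 - l0
        out.append((k0, k1, cost_us / gain_points if gain_points > 0 else None))
    return out

skeg_int8 = [
    (50, 0.9949, 1443),
    (100, 0.9950, 1508),
    (150, 0.9974, 1797),
    (200, 0.9987, 2110),
    (300, 0.9994, 2865),
]

for k0, k1, cost in marginal_cost(skeg_int8):
    label = "n/a" if cost is None else f"{cost:,.0f} µs per recall point"
    print(f"l_search {k0} -> {k1}: {label}")
```

On this sweep the price rises from roughly 1 200 µs per point for the 100 to 150 step to roughly 10 800 µs per point for the 200 to 300 step. Steps with tiny recall deltas, such as 50 to 100, are dominated by measurement noise and should be read with caution.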