
HNSW Tuning

Added in vectlite 0.10.0.

vectlite stores its approximate nearest-neighbour graph using HNSW (Hierarchical Navigable Small World). The defaults are good for most workloads, but four knobs let you trade recall, latency, and ingest throughput.

| Knob | What it controls | When to raise it | When to lower it |
| --- | --- | --- | --- |
| m | Out-degree per node. | Higher recall on hard datasets. | Smaller graph / lower memory. |
| ef_construction | Build-time search width. | Higher recall (slower ingest). | Faster ingest (lower recall). |
| ef_search | Query-time search width. | Higher recall (slower query). | Faster query (lower recall). |
| parallel_insert_threshold | Dataset size at which Rayon-backed parallel HNSW insertion kicks in during bulk_ingest. Default 256. | Raise it if you only ingest small batches and want determinism. | Lower it to parallelise smaller batches. |

ef_search is the first knob to try. Changing it is free: it only affects the query path, so no rebuild is needed.
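Because ef_search can be changed without a rebuild, it is cheap to sweep. The sketch below walks a list of candidate values in ascending order and stops at the first one that meets a recall target. It is a minimal illustration, not part of vectlite: `smallest_ef_search` and the `measure_recall` callback are hypothetical names, and the only vectlite call it assumes is `db.set_ef_search`, passed in as `set_ef_search`.

```python
from typing import Callable, Iterable, Optional


def smallest_ef_search(
    set_ef_search: Callable[[int], None],
    measure_recall: Callable[[], float],
    candidates: Iterable[int],
    target_recall: float,
) -> Optional[int]:
    """Return the smallest candidate whose measured recall meets the target.

    set_ef_search  -- applies a value, e.g. db.set_ef_search
    measure_recall -- hypothetical callback that runs your held-out queries
                      and returns recall in [0, 1]
    """
    for ef in sorted(candidates):
        set_ef_search(ef)  # query-path only, so no index rebuild
        if measure_recall() >= target_recall:
            return ef
    # No candidate reached the target; the last value tried stays applied.
    return None
```

A `None` result suggests that ef_search alone is not enough and that the build-time knobs may need attention.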

Tuning an existing database

import vectlite

with vectlite.open("knowledge.vdb") as db:
    db.set_ef_search(128)  # higher recall, slightly slower query

    # or set multiple knobs at once
    db.set_index_config(m=32, ef_construction=200, ef_search=128)

    print(db.index_config())
    # {"m": 32, "ef_construction": 200, "ef_search": 128,
    #  "parallel_insert_threshold": 256, "tombstone_rebuild_pct": 30}

const { open } = require('vectlite')

const db = open('knowledge.vdb')

db.setEfSearch(128)

db.setIndexConfig({ m: 32, efConstruction: 200, efSearch: 128 })

console.log(db.indexConfig())
// { m: 32, efConstruction: 200, efSearch: 128,
//   parallelInsertThreshold: 256, tombstoneRebuildPct: 30 }

Rebuild cost

Changing m or ef_construction triggers a full HNSW rebuild on the next operation. Changing only ef_search (or parallel_insert_threshold / tombstone_rebuild_pct) is free.
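To check ahead of time whether a planned change will be expensive, you can compare it against the current configuration. The helper below is a hypothetical sketch, not a vectlite API. It takes a dict shaped like the output of `index_config()` and the keyword arguments you intend to pass to `set_index_config`, and it assumes that setting a knob to its current value does not count as a change.

```python
# Knobs the rebuild note above names as forcing a full HNSW rebuild.
REBUILD_KNOBS = frozenset({"m", "ef_construction"})


def rebuild_knobs_changed(current: dict, proposed: dict) -> list:
    """List the proposed knobs that differ from `current` and force a rebuild."""
    return sorted(
        knob
        for knob in REBUILD_KNOBS
        if knob in proposed and proposed[knob] != current.get(knob)
    )


current = {
    "m": 32,
    "ef_construction": 200,
    "ef_search": 128,
    "parallel_insert_threshold": 256,
    "tombstone_rebuild_pct": 30,
}

print(rebuild_knobs_changed(current, {"ef_search": 256}))        # []
print(rebuild_knobs_changed(current, {"m": 48, "ef_search": 64}))  # ['m']
```

An empty list means the change only touches the free knobs.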

Tuning at ingest time

bulk_ingest accepts the same knobs, applied for the duration of the call.

db.bulk_ingest(
    records,
    batch_size=10_000,
    m=32,
    ef_construction=200,
    ef_search=128,
    parallel_insert_threshold=512,
)

db.bulkIngest(records, {
  batchSize: 10000,
  m: 32,
  efConstruction: 200,
  efSearch: 128,
  parallelInsertThreshold: 512,
})

Async variants (bulkIngestAsync in Node) accept the same options.

Benchmark reference

Synthetic 5,000 × 384 cosine benchmark on M-class macOS (from the upstream CHANGELOG):

| Config | Ingest throughput |
| --- | --- |
| Pre-0.10 baseline | ~47 vec/s |
| 0.10.0 defaults | ~917 vec/s |
| 0.10.0 fast preset | ~1782 vec/s |

The default settings already capture most of the gain. The fast preset trades a few recall points for the highest throughput; the high_recall preset goes the other way.
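To put those figures in proportion, the short calculation below derives the speedup over the pre-0.10 baseline and the implied wall-clock time for the 5,000-vector benchmark. It uses only the approximate throughput numbers quoted above, so the results are rough; `summarize` is a hypothetical helper written for this illustration.

```python
# Approximate ingest throughput (vectors per second) from the benchmark above.
THROUGHPUT = {
    "Pre-0.10 baseline": 47,
    "0.10.0 defaults": 917,
    "0.10.0 fast preset": 1782,
}
N_VECTORS = 5_000


def summarize(throughput: dict, n_vectors: int, baseline_key: str) -> dict:
    """Speedup over the baseline and implied ingest time for each config."""
    base = throughput[baseline_key]
    return {
        name: {"speedup": rate / base, "seconds": n_vectors / rate}
        for name, rate in throughput.items()
    }


summary = summarize(THROUGHPUT, N_VECTORS, "Pre-0.10 baseline")
for name, stats in summary.items():
    print(f"{name}: {stats['speedup']:.1f}x, {stats['seconds']:.1f} s")
```

On these numbers the defaults are roughly 20x faster than the baseline and the fast preset roughly 38x, which cuts the 5,000-vector ingest from about 106 s to about 5.5 s and 2.8 s respectively.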

What the presets do

The Rust core exposes two named presets that the bindings mirror through set_index_config:

| Preset | m | ef_construction | ef_search | Use when |
| --- | --- | --- | --- | --- |
| fast | small | small | small | Ingest-heavy workloads, recall ≥ 0.9 is acceptable. |
| high_recall | large | large | large | Recall-critical workloads, query latency budget is generous. |

How to choose

  1. Start at defaults. Measure recall against an exact (brute-force) baseline on a held-out query set.
  2. If recall is too low, raise ef_search first.
  3. If recall is still too low at high ef_search, raise m and/or ef_construction — these trigger a rebuild but offer the largest gains.
  4. If ingest is too slow, lower ef_construction, lower parallel_insert_threshold if you want the parallel path to kick in for smaller batches, and see the throughput tuning guide for the WAL-side knobs.
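Step 1 depends on an exact baseline to compare against. The sketch below shows one way to compute it in plain Python: a brute-force cosine top-k over the stored vectors, plus a recall@k helper that compares those ids with the ids your approximate search returned. All function names here are hypothetical, and the code does not call vectlite. Cosine is assumed because the benchmark above uses it; swap in your own metric if the database uses a different one.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_top_k(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Brute-force baseline: score every stored vector and keep the best k ids."""
    ranked = sorted(
        vectors,
        key=lambda vid: cosine_similarity(query, vectors[vid]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the exact top-k ids that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
truth = exact_top_k([1.0, 0.0], vectors, k=2)
print(truth)                           # ['a', 'b']
print(recall_at_k(["a", "c"], truth))  # 0.5
```

Average `recall_at_k` over the whole held-out query set before deciding whether to move a knob. The brute-force pass is O(N) per query, so run it on a sample if the dataset is large.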

See also