HNSW Tuning

Added in vectlite 0.10.0.

vectlite stores its approximate nearest-neighbour index as an HNSW (Hierarchical Navigable Small World) graph. The defaults suit most workloads, but four knobs let you trade off recall, latency, and ingest throughput.

Knob	What it controls	When to raise it	When to lower it
`m`	Out-degree per node.	Higher recall on hard datasets.	Smaller graph / lower memory.
`ef_construction`	Build-time search width.	Higher recall (slower ingest).	Faster ingest (lower recall).
`ef_search`	Query-time search width.	Higher recall (slower query).	Faster query (lower recall).
`parallel_insert_threshold`	Dataset size at which Rayon-backed parallel HNSW insertion kicks in during `bulk_ingest`. Default `256`.	Raise it if you only ingest small batches and want determinism.	Lower it to parallelise smaller batches.

ef_search is the first knob to try. Changing it is free: it only affects the query path, so no rebuild is needed.
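
Because ef_search can be changed without a rebuild, one way to tune it is to sweep a few values and keep the smallest one that meets a recall target. The sketch below is illustrative only: `pick_ef_search`, `set_ef`, and `measure_recall` are hypothetical names, not part of vectlite, and the recall numbers are made up. In a real setup `set_ef` would presumably wrap `db.set_ef_search` and `measure_recall` would run a held-out query set.

```python
from typing import Callable, Iterable, Optional


def pick_ef_search(
    set_ef: Callable[[int], None],
    measure_recall: Callable[[], float],
    candidates: Iterable[int],
    target_recall: float,
) -> Optional[int]:
    """Return the smallest candidate whose measured recall meets the target.

    Hypothetical helper: set_ef would wrap db.set_ef_search, and
    measure_recall would run a held-out query set against the database.
    """
    for ef in sorted(candidates):
        set_ef(ef)  # query-path only, so no rebuild between steps
        if measure_recall() >= target_recall:
            return ef
    return None  # no candidate reached the target


# Stand-in for a real database: recall rises with ef_search (made-up numbers).
fake_recall = {32: 0.88, 64: 0.93, 128: 0.97, 256: 0.99}
state = {"ef": 0}

chosen = pick_ef_search(
    set_ef=lambda ef: state.update(ef=ef),
    measure_recall=lambda: fake_recall[state["ef"]],
    candidates=fake_recall,
    target_recall=0.95,
)
print(chosen)  # 128
```

Trying candidates in ascending order means the first value that passes is also the cheapest one at query time.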

Tuning an existing database

import vectlite

with vectlite.open("knowledge.vdb") as db:
    db.set_ef_search(128)   # higher recall, slightly slower query

    # or set multiple knobs at once
    db.set_index_config(m=32, ef_construction=200, ef_search=128)

    print(db.index_config())
    # {"m": 32, "ef_construction": 200, "ef_search": 128,
    #  "parallel_insert_threshold": 256, "tombstone_rebuild_pct": 30}

const { open } = require('vectlite')

const db = open('knowledge.vdb')

db.setEfSearch(128)

db.setIndexConfig({ m: 32, efConstruction: 200, efSearch: 128 })

console.log(db.indexConfig())
// { m: 32, efConstruction: 200, efSearch: 128,
//   parallelInsertThreshold: 256, tombstoneRebuildPct: 30 }

Rebuild cost

Changing m or ef_construction triggers a full HNSW rebuild on the next operation. Changing only ef_search (or parallel_insert_threshold / tombstone_rebuild_pct) is free.
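
A small guard can make the rebuild cost visible before a config change is applied. The following is a minimal sketch that works on plain dicts shaped like the `index_config()` output shown earlier. `needs_rebuild` and `REBUILD_KNOBS` are hypothetical names, not vectlite API, and the sketch assumes that re-setting a knob to its current value does not count as a change, which the source does not confirm.

```python
# Knobs that, per the text above, force a full HNSW rebuild when changed.
REBUILD_KNOBS = frozenset({"m", "ef_construction"})


def needs_rebuild(current: dict, proposed: dict) -> bool:
    """Return True if `proposed` changes a rebuild-triggering knob.

    Hypothetical helper, not part of vectlite; it only compares plain dicts.
    Assumes that setting a knob to its existing value is not a change.
    """
    return any(
        key in proposed and proposed[key] != current.get(key)
        for key in REBUILD_KNOBS
    )


current = {"m": 32, "ef_construction": 200, "ef_search": 128,
           "parallel_insert_threshold": 256, "tombstone_rebuild_pct": 30}

print(needs_rebuild(current, {"ef_search": 256}))           # False
print(needs_rebuild(current, {"m": 32, "ef_search": 256}))  # False, m is unchanged
print(needs_rebuild(current, {"ef_construction": 400}))     # True
```

A check like this could run before `set_index_config` so that an unexpectedly slow next operation does not come as a surprise.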

Tuning at ingest time

bulk_ingest accepts the same knobs, applied for the duration of the call.

db.bulk_ingest(
    records,
    batch_size=10_000,
    m=32,
    ef_construction=200,
    ef_search=128,
    parallel_insert_threshold=512,
)

db.bulkIngest(records, {
  batchSize: 10000,
  m: 32,
  efConstruction: 200,
  efSearch: 128,
  parallelInsertThreshold: 512,
})

Async variants (bulkIngestAsync in Node) accept the same options.

Benchmark reference

Synthetic 5,000 × 384 cosine benchmark on M-class macOS (from the upstream CHANGELOG):

Config	Ingest throughput
Pre-0.10 baseline	~47 vec/s
0.10.0 defaults	~917 vec/s
0.10.0 `fast` preset	~1782 vec/s

The default settings already capture most of the gain. The fast preset trades a few recall points for the highest throughput; the high_recall preset goes the other way.
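
To put the table in wall-clock terms, the arithmetic below converts each throughput figure into the time needed to ingest the 5,000 benchmark vectors. It is a rough sketch that assumes throughput is a steady average over the whole run, so elapsed time is simply vectors divided by rate. The only inputs are the numbers from the table.

```python
# Ingest throughput in vectors per second, copied from the table above.
throughput = {
    "pre-0.10 baseline": 47,
    "0.10.0 defaults": 917,
    "0.10.0 fast preset": 1782,
}
n_vectors = 5_000  # size of the synthetic benchmark

baseline = throughput["pre-0.10 baseline"]
seconds = {name: n_vectors / rate for name, rate in throughput.items()}
speedup = {name: rate / baseline for name, rate in throughput.items()}

for name in throughput:
    print(f"{name}: {seconds[name]:.1f} s, {speedup[name]:.1f}x baseline")

# Share of the baseline-to-fast time saving that the defaults already deliver.
saved_by_defaults = seconds["pre-0.10 baseline"] - seconds["0.10.0 defaults"]
saved_by_fast = seconds["pre-0.10 baseline"] - seconds["0.10.0 fast preset"]
print(f"defaults capture {saved_by_defaults / saved_by_fast:.0%} of the saving")
```

Under that assumption the run drops from roughly 106 s to about 5.5 s with defaults and about 2.8 s with the fast preset, so the defaults account for around 97% of the time saved.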

What the presets do

The Rust core exposes two named presets that the bindings mirror through set_index_config:

Preset	`m`	`ef_construction`	`ef_search`	Use when
`fast`	small	small	small	Ingest-heavy workloads where recall ≥ 0.9 is acceptable.
`high_recall`	large	large	large	Recall-critical workloads with a generous query latency budget.

How to choose

Start at defaults. Measure recall against an exact (brute-force) baseline on a held-out query set.
If recall is too low, raise ef_search first.
If recall is still too low at high ef_search, raise m and/or ef_construction — these trigger a rebuild but offer the largest gains.
If ingest is too slow, lower ef_construction, lower parallel_insert_threshold if you want the parallel path to kick in earlier, and see the throughput tuning guide for the WAL-side knobs.
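
The recall measurement in the first step can be sketched without any database at all. The example below computes an exact cosine top-k by brute force and scores an approximate result against it. All function names are hypothetical, the vectors are toy data, and `approx` stands in for the ids an HNSW query would return. It is a pure-Python illustration, not vectlite's own evaluation method.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_top_k(query: Sequence[float],
                vectors: Dict[str, Sequence[float]],
                k: int) -> List[str]:
    """Brute-force baseline: score every vector and keep the k most similar ids."""
    ranked = sorted(
        vectors,
        key=lambda vid: cosine_similarity(query, vectors[vid]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy data set and query.
vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
query = [1.0, 0.0]

exact = exact_top_k(query, vectors, k=2)
approx = ["a", "c"]  # pretend the approximate index returned these ids
print(exact, recall_at_k(approx, exact))  # ['a', 'b'] 0.5
```

Averaging `recall_at_k` over the whole held-out query set gives the single number to compare before and after each knob change.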
