Zero-Copy Bulk Ingest
Added in vectlite 0.12.0.
For large ingestion jobs, the dict-based bulk_ingest(list_of_dicts) / bulkIngest(records) path is no longer the fastest option. It still works and is kept for compatibility, but every record pays for Python → Rust object marshalling (Python) or a JSON serialise / deserialise round-trip (Node).
0.12.0 adds bulk_ingest_array (Python) and bulkIngestArray (Node), which take the vector data as a single contiguous native array (NumPy float32 for Python, Float32Array for Node) and hand it to the Rust core by reference. This is 10–30× faster than the dict path on large batches.
Python (NumPy)
import numpy as np
import vectlite
N, D = 50_000, 384
ids = [f"doc-{i}" for i in range(N)]
# embeddings: your precomputed vectors, N rows of D floats (not defined in this snippet)
vectors = np.asarray(embeddings, dtype=np.float32) # shape (N, D), C-contiguous
metadata = [{"source": "blog", "lang": "en"} for _ in range(N)]
with vectlite.open("knowledge.vdb", dimension=D) as db:
    db.bulk_ingest_array(ids, vectors, metadata)
Things to know:
- vectors must be dtype=np.float32 and shape (N, D) where D == db.dimension. Pass it through np.ascontiguousarray(...) if your array came from a slice that isn't C-contiguous.
- metadata is optional. Skip it if you only need ids + vectors.
- The GIL is released during the Rust call (since 0.12.0 on all write paths), so other Python threads make progress.
- numpy >= 1.23 is now a runtime dependency of the Python package, so there is no extra install step.
The same HNSW knobs as on bulk_ingest (m, ef_construction, ef_search, parallel_insert_threshold, tombstone_rebuild_pct, segment_size_threshold) are accepted as kwargs.
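The array requirements listed above can be checked before the call. The following is a minimal sketch using only NumPy; prepare_vectors is a hypothetical helper, not part of vectlite, and its dimension argument stands in for db.dimension.

```python
import numpy as np


def prepare_vectors(embeddings, dimension):
    """Return a float32, C-contiguous array of shape (N, dimension).

    Hypothetical helper for illustration; not part of vectlite.
    """
    # Converts the dtype and copies only when the input does not already
    # satisfy both requirements.
    vectors = np.ascontiguousarray(embeddings, dtype=np.float32)
    if vectors.ndim != 2 or vectors.shape[1] != dimension:
        raise ValueError(
            f"expected shape (N, {dimension}), got {vectors.shape}"
        )
    return vectors


# A strided slice of a float64 array is neither float32 nor C-contiguous.
raw = np.arange(24, dtype=np.float64).reshape(4, 6)[:, ::2]
vectors = prepare_vectors(raw, dimension=3)
```

The result could then be passed as the second argument to db.bulk_ingest_array in the example above.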
Node (Float32Array)
const { open } = require('vectlite')
const N = 50_000
const D = 384
const ids = []
const flat = new Float32Array(N * D) // row-major
for (let i = 0; i < N; i++) {
  ids.push(`doc-${i}`)
  flat.set(embeddings[i], i * D) // embeddings[i]: precomputed vector of length D
}
const db = open('knowledge.vdb', { dimension: D })
db.bulkIngestArray(ids, flat, D, {
  metadata: ids.map(() => ({ source: 'blog', lang: 'en' })),
})
Things to know:
- flat must be a Float32Array of length N * D, row-major (one record per D slots).
- napi-rs gives the Rust side a direct reference into the underlying ArrayBuffer, so there's no JSON.stringify of the vector data.
- The third argument is the vector dimension, not the record count.
- options.metadata is an array parallel to ids; pass undefined to skip.
All HNSW knobs available on bulkIngest (m, efConstruction, efSearch, parallelInsertThreshold, tombstoneRebuildPct, segmentSizeThreshold) are accepted in options.
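The row-major layout, and why the third argument is the dimension rather than the record count, can be illustrated with NumPy, whose C-contiguous arrays use the same layout. This is only a sketch of the indexing arithmetic; record is a hypothetical helper and nothing here calls vectlite.

```python
import numpy as np

N, D = 4, 3
rows = np.arange(N * D, dtype=np.float32).reshape(N, D)

# Row-major flattening: record i occupies flat[i * D : (i + 1) * D].
flat = rows.ravel()  # C (row-major) order is the default


def record(flat, i, dim):
    """Slice record i out of a flat row-major buffer (hypothetical helper)."""
    return flat[i * dim:(i + 1) * dim]


# The record count is implied by the buffer length and the dimension,
# which is why only the dimension has to be passed explicitly.
count, remainder = divmod(flat.size, D)
```

A non-zero remainder would mean the buffer length is not a multiple of the dimension, so the buffer cannot hold a whole number of records.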
When NOT to use it
- Small batches (a few hundred records). The marshalling cost is negligible there; use insert / upsert directly. Since 0.12.0, single-record db.insert() and db.upsert() route to the incremental WAL path in both bindings (no full HNSW rebuild per insert), which fixed a long-standing 150 vec/s ceiling.
- Records without a precomputed vector, e.g. when you're using upsert_text with a built-in embedder. The array path is for vectors that already exist as floats.
- Heterogeneous vector lengths. All rows in the same call must match db.dimension.
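The size and shape criteria above could be folded into a small pre-flight check. The sketch below is plain Python; choose_ingest_path is a hypothetical helper, and the small_batch=500 cutoff is an assumed reading of "a few hundred", not a value taken from vectlite. It does not cover the text-only case, where there is no vector to inspect.

```python
def choose_ingest_path(vectors, dimension, small_batch=500):
    """Suggest an ingest path for a batch of precomputed vectors.

    Hypothetical helper; small_batch is an assumed cutoff.
    Returns "insert" or "bulk_ingest_array".
    """
    # Every row must have exactly `dimension` floats.
    bad = [i for i, v in enumerate(vectors) if len(v) != dimension]
    if bad:
        raise ValueError(f"rows with the wrong length (first few): {bad[:5]}")
    # Below the cutoff the marshalling cost is negligible.
    if len(vectors) < small_batch:
        return "insert"
    return "bulk_ingest_array"
```

For example, choose_ingest_path([[0.0] * 384] * 10, 384) suggests the per-record path, while a batch of 50,000 such rows suggests the array path.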
HNSW segments
0.12.0 also segments the HNSW index LSM-tree style: new inserts land in the active segment until it hits segment_size_threshold (default 50_000), then a new segment is started. This caps per-insert HNSW cost at O(log segment_size) instead of O(log total) — streaming throughput stays flat as the corpus grows.
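As rough arithmetic for the segment rollover described above, the sketch below uses an idealized model: segments fill to the threshold one after another, with no merging, tombstone rebuilds, or deletions. Both functions are hypothetical and only mirror the description; the real count is whatever the index reports.

```python
import math


def expected_segments(total, segment_size_threshold=50_000):
    """Idealized segment count after `total` inserts (at least one segment)."""
    return max(1, math.ceil(total / segment_size_threshold))


def log_terms(total, segment_size_threshold=50_000):
    """Return (log2 of the active-segment bound, log2 of the total corpus).

    These are the two quantities behind O(log segment_size) vs O(log total).
    Requires total >= 1.
    """
    bound = min(total, segment_size_threshold)
    return math.log2(bound), math.log2(total)


for total in (50_000, 120_000, 5_000_000):
    segmented, unsegmented = log_terms(total)
    print(total, expected_segments(total), round(segmented, 2), round(unsegmented, 2))
```

In this model the first log term stops growing once the corpus passes the threshold, while the second keeps growing with the corpus.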
Inspect the segment count:
print(db.ann_segment_count()) # default namespace
print(db.ann_segment_count(vector_name="colbert"))
console.log(db.annSegmentCount())
console.log(db.annSegmentCount(undefined, 'colbert'))
The manifest format bumped to ANN3. Old ANN1 / ANN2 databases still load — they report as 1-segment indexes and re-segment naturally as new inserts land.
See also
- HNSW tuning: pick the right m, ef_construction, ef_search.
- Throughput tuning: relax WAL sync mode and tombstoning for streaming workloads.
- Changelog 0.12.0: full release notes.
- Changelog 0.13.0: startup latency improvements that complement the ingest path.