Python API Reference
Complete reference for the vectlite Python package.
import vectlite
Module Functions
open
vectlite.open(
    path: str,
    dimension: int | None = None,
    *,
    metric: str = "cosine",
    read_only: bool = False,
    lock_timeout: float | None = None,
) -> Database
Open or create a vectlite database.
| Parameter | Type | Description |
|---|---|---|
path | str | Path to the .vdb file. Created if it does not exist. |
dimension | int | None | Vector dimension. Required when creating a new database. Omit when opening an existing one (the stored dimension is used). |
metric | str | Distance metric used at creation time. One of "cosine" (default), "euclidean" (alias "l2"), "dotproduct" (aliases "dot", "ip", "inner_product", "dot_product"), or "manhattan" (alias "l1"). Persisted in the database file. Ignored when opening an existing database. Scores are always oriented so higher is better. |
read_only | bool | Open in read-only mode. Uses shared file locks so multiple readers can access the same file. Write operations raise VectLiteError. |
lock_timeout | float | None | Maximum seconds to wait when acquiring the advisory file lock. None fails immediately on contention. |
Returns: Database
Use as a context manager to release the file lock deterministically:
with vectlite.open("knowledge.vdb", dimension=384) as db:
    db.upsert("doc1", embedding, {"source": "blog"})
open_store
vectlite.open_store(root: str) -> Store
Open or create a collection store (a directory of independent databases).
| Parameter | Type | Description |
|---|---|---|
root | str | Path to the directory that holds the collections. Created if it does not exist. |
Returns: Store
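For example, grouping independent per-tenant collections under one directory (the names and dimension are illustrative):
store = vectlite.open_store("./collections")
db = store.open_or_create_collection("tenant_a", dimension=384)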
restore
vectlite.restore(source: str, dest: str) -> Database
Restore a backup to a new database path.
| Parameter | Type | Description |
|---|---|---|
source | str | Path to the backup directory (created by Database.backup()). |
dest | str | Path where the restored .vdb file will be written. |
Returns: Database -- the restored database, opened for read-write.
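For example (the paths are illustrative):
restored = vectlite.restore("backups/2024-06-01", "restored.vdb")
print(restored.count())  # immediately usable for reads and writes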
sparse_terms
vectlite.sparse_terms(text: str) -> dict[str, float]
Tokenize and weight text into a sparse term vector suitable for BM25 search.
| Parameter | Type | Description |
|---|---|---|
text | str | Input text to analyse. |
Returns: dict[str, float] -- mapping of terms to their weights.
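For example, pairing sparse_terms() with upsert() to make a record keyword-searchable (embedding stands in for a dense vector):
terms = vectlite.sparse_terms("How to reset a forgotten password")
db.upsert("doc1", embedding, {"source": "faq"}, sparse=terms)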
upsert_text
vectlite.upsert_text(
    db: Database,
    id: str,
    text: str,
    embed: Callable[[str], list[float]],
    metadata: Metadata | None = None,
    namespace: str | None = None,
) -> None
High-level helper that generates a dense embedding and sparse terms from text, then upserts the record.
| Parameter | Type | Description |
|---|---|---|
db | Database | Target database. |
id | str | Record identifier. |
text | str | Text to embed and index. |
embed | Callable[[str], list[float]] | Function that converts text to a dense vector. |
metadata | Metadata | None | Optional metadata dict. |
namespace | str | None | Optional namespace. |
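A minimal sketch using one of the bundled embedders documented under vectlite.embedders below:
from vectlite import embedders
embed = embedders.fastembed("BAAI/bge-small-en-v1.5")  # local ONNX inference
vectlite.upsert_text(db, "doc1", "How to authenticate with the API", embed, metadata={"source": "docs"})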
search_text
vectlite.search_text(
    db: Database,
    query: str,
    embed: Callable[[str], list[float]],
    *,
    k: int = 10,
    filter: Filter | None = None,
    namespace: str | None = None,
    all_namespaces: bool = False,
    dense_weight: float = 1.0,
    sparse_weight: float = 1.0,
    fetch_k: int = 0,
    mmr_lambda: float | None = None,
    vector_name: str | None = None,
    fusion: str = "linear",
    rrf_k: int = 60,
    explain: bool = False,
    rerank: RerankHook | None = None,
    rerank_k: int = 0,
) -> list[SearchResult]
High-level hybrid search. Generates a dense embedding and sparse terms from query, then runs a fused search.
| Parameter | Type | Default | Description |
|---|---|---|---|
db | Database | -- | Target database. |
query | str | -- | Natural-language query. |
embed | Callable[[str], list[float]] | -- | Function that converts text to a dense vector. |
k | int | 10 | Number of results to return. |
filter | Filter | None | None | MongoDB-style metadata filter. |
namespace | str | None | None | Restrict to a single namespace. |
all_namespaces | bool | False | Search across all namespaces. |
dense_weight | float | 1.0 | Weight for the dense score component. |
sparse_weight | float | 1.0 | Weight for the sparse (BM25) score component. |
fetch_k | int | 0 | Number of candidates to fetch before re-ranking. 0 uses the engine default. |
mmr_lambda | float | None | None | Maximal Marginal Relevance diversity parameter (0 = max diversity, 1 = max relevance). None disables MMR. |
vector_name | str | None | None | Search a specific named vector space. |
fusion | str | "linear" | Fusion strategy: "linear" or "rrf". |
rrf_k | int | 60 | RRF smoothing constant (only used when fusion="rrf"). |
explain | bool | False | Include scoring breakdown in results. |
rerank | RerankHook | None | None | Optional reranker function. See vectlite.rerankers. |
rerank_k | int | 0 | Number of candidates to pass to the reranker. 0 uses fetch_k. |
Returns: list[SearchResult]
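For example, a filtered hybrid query with RRF fusion, reusing the embed function from the upsert_text example:
results = vectlite.search_text(db, "password reset flow", embed, k=5, fusion="rrf", filter={"source": "docs"})
for r in results:
    print(r["id"], round(r["score"], 3))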
search_text_with_stats
vectlite.search_text_with_stats(
    db: Database,
    query: str,
    embed: Callable[[str], list[float]],
    *,
    # same parameters as search_text
) -> SearchResponse
Same as search_text but returns a SearchResponse containing both results and query statistics.
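For example, checking whether the ANN index served the query and the total latency in microseconds:
response = vectlite.search_text_with_stats(db, "greeting", embed, k=5)
print(response["stats"]["used_ann"], response["stats"]["timings"]["total_us"])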
configure_opentelemetry
vectlite.configure_opentelemetry(
    config: bool | dict | None = None,
) -> object | None
Enable, disable, or configure optional OpenTelemetry tracing for search operations. When active, every search_text() and search_text_with_stats() call is wrapped in a vectlite.search span carrying db.system, db.operation.name, and search-specific attributes (vectlite.search.k, .namespace, .has_dense, .has_sparse, .fusion, .used_ann, .result_count, .total_us).
opentelemetry-api is imported lazily and is not a runtime dependency. If a search raises, the span records the exception and sets an error status before re-raising.
| Form | Behaviour |
|---|---|
configure_opentelemetry() | Auto-detect: resolves a tracer from opentelemetry.trace if installed. |
configure_opentelemetry({"tracer": my_tracer}) | Use a user-supplied tracer instance. |
configure_opentelemetry({"tracer_name": "my-app"}) | Override the tracer name (default "vectlite"). |
configure_opentelemetry(False) | Disable tracing. |
Returns: the active tracer (or None when disabled).
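A sketch of a development setup that prints spans to the console; it assumes the optional opentelemetry-sdk package is installed:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
vectlite.configure_opentelemetry()  # auto-detects the tracer provider registered above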
Database
Returned by open() and restore().
Properties
| Property | Type | Description |
|---|---|---|
path | str | Absolute path to the .vdb file. |
wal_path | str | Path to the write-ahead log file. |
dimension | int | Vector dimension for this database. |
metric | str | Distance metric used at creation time: "cosine", "euclidean", "dotproduct", or "manhattan". |
read_only | bool | Whether the database was opened in read-only mode. |
quantization_method | str | None | Active quantization method ("scalar", "binary", "product") or None. |
quantization_method is a property (no parentheses). Use db.is_quantized() as a method to check whether quantization is enabled — is_quantized was changed from a property to a method in 0.9.1 for consistency with the rest of the quantization helper API.
count
db.count(
    *,
    namespace: str | None = None,
    filter: Filter | None = None,
) -> int
Return the number of records in the database. Use len(db) as a shortcut for the unfiltered count.
| Parameter | Type | Description |
|---|---|---|
namespace | str | None | Restrict the count to a single namespace. |
filter | Filter | None | MongoDB-style metadata filter to scope the count. |
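For example (the status field is illustrative):
total = len(db)  # shortcut for the unfiltered count
drafts = db.count(namespace="blog", filter={"status": "draft"})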
namespaces
db.namespaces() -> list[str]
Return a list of all namespaces present in the database.
transaction
db.transaction() -> Transaction
Begin a new transaction. Use as a context manager for automatic commit/rollback:
with db.transaction() as tx:
    tx.upsert("id", vector, metadata)
Returns: Transaction
insert
db.insert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
    ttl: float | None = None,
) -> None
Insert a new record. Raises VectLiteError if a record with the same id (and namespace) already exists.
| Parameter | Type | Description |
|---|---|---|
id | str | Record identifier. |
vector | list[float] | Dense embedding vector. |
metadata | Metadata | None | Optional metadata dict. |
namespace | str | None | Target namespace. |
sparse | dict[str, float] | None | Sparse term vector for BM25 search. |
vectors | dict[str, list[float]] | None | Additional named vectors. |
ttl | float | None | Time-to-live in seconds. Expired records are filtered from reads and garbage-collected on compact(). |
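For example, inserting a record with sparse terms and a one-hour TTL (the id and metadata are illustrative; embedding stands in for a dense vector):
db.insert(
    "session-42",
    embedding,
    {"user": "alice"},
    sparse=vectlite.sparse_terms("temporary session notes"),
    ttl=3600.0,  # expire after one hour
)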
upsert
db.upsert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
    ttl: float | None = None,
) -> None
Insert or update a record. If the id already exists the record is replaced.
Parameters are identical to insert().
insert_many
db.insert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int
Batch insert multiple records. Raises on duplicate IDs.
| Parameter | Type | Description |
|---|---|---|
records | list[Record] | List of record dicts with keys id, vector, and optionally metadata, sparse, vectors. |
namespace | str | None | Target namespace for all records. |
Returns: int -- number of records inserted.
upsert_many
db.upsert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int
Batch upsert multiple records.
Returns: int -- number of records upserted.
bulk_ingest
db.bulk_ingest(
    records: list[Record],
    *,
    namespace: str | None = None,
    batch_size: int = 10000,
) -> int
Ingest a batch of records efficiently. This is the fastest bulk-write path exposed by the current Python package.
| Parameter | Type | Description |
|---|---|---|
records | list[Record] | List of record dicts. |
namespace | str | None | Target namespace. |
batch_size | int | Internal WAL batch size. |
Returns: int -- total number of records ingested.
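For example, building record dicts (see the Record type below) from an illustrative embeddings list and ingesting them in one call:
records = [
    {"id": f"doc{i}", "vector": vec, "metadata": {"batch": 1}}
    for i, vec in enumerate(embeddings)
]
n = db.bulk_ingest(records, namespace="docs", batch_size=5000)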
get
db.get(
    id: str,
    *,
    namespace: str | None = None,
) -> Record | None
Retrieve a record by ID. Returns None if not found.
delete
db.delete(
    id: str,
    *,
    namespace: str | None = None,
) -> bool
Delete a record by ID. Returns True if the record existed and was deleted.
delete_many
db.delete_many(
    ids: list[str],
    *,
    namespace: str | None = None,
) -> int
Delete multiple records by ID. Returns the number of records actually deleted.
delete_by_filter
db.delete_by_filter(
    filter: Filter,
    *,
    namespace: str | None = None,
) -> int
Delete every record that matches filter in a single pass. Returns the number of records deleted.
deleted = db.delete_by_filter({"stale": True}, namespace="docs")
update_metadata
db.update_metadata(
    id: str,
    metadata: Metadata,
    *,
    namespace: str | None = None,
) -> bool
Merge a metadata patch into an existing record without re-writing the vector. Keys present in metadata overwrite existing keys; other keys are left untouched. Skips ANN, sparse, quantized, and multi-vector index rebuilds when a WAL batch contains only metadata updates.
Returns True if the record was found and updated, False if the id does not exist.
db.update_metadata("doc1", {"status": "reviewed", "score": 0.95})
list
db.list(
    *,
    namespace: str | None = None,
    filter: Filter | None = None,
    limit: int = 0,
    offset: int = 0,
) -> list[Record]
List records without issuing a vector query. limit=0 returns every match.
records = db.list(namespace="docs", filter={"stale": False}, limit=20)
list_cursor
db.list_cursor(
    *,
    namespace: str | None = None,
    filter: Filter | None = None,
    limit: int = 100,
    cursor: str | None = None,
) -> tuple[list[Record], str | None]
Cursor-based pagination for efficient iteration over large collections. Returns the next page of records and an opaque continuation cursor (None when exhausted).
cursor = None
while True:
    page, cursor = db.list_cursor(limit=100, cursor=cursor)
    for record in page:
        process(record)
    if cursor is None:
        break
set_ttl
db.set_ttl(
    id: str,
    ttl_secs: float,
    *,
    namespace: str | None = None,
) -> bool
Set a time-to-live (in seconds from now) on an existing record. Expired records are transparently filtered from get(), list(), count(), and search(), and are permanently removed on compact(). The expires_at field on Record exposes the absolute expiry timestamp (epoch seconds).
clear_ttl
db.clear_ttl(
    id: str,
    *,
    namespace: str | None = None,
) -> bool
Remove the expiry from a record so it lives indefinitely.
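For example, giving a cached record a five-minute lifetime, then making it permanent again (the record id is illustrative):
db.set_ttl("cache-entry-1", 300.0)            # expire five minutes from now
print(db.get("cache-entry-1")["expires_at"])  # absolute expiry, epoch seconds
db.clear_ttl("cache-entry-1")                 # keep the record indefinitely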
create_index
db.create_index(field: str, index_type: str) -> None
Create a payload index on a metadata field to accelerate filtered queries 10-100x on large collections. Indexes are automatically used by search(), count(), and list() to narrow candidates before full filter evaluation. AND filters intersect index results; OR filters union when all sub-filters are indexed. Index definitions persist across close/reopen in a .vdb.pidx sidecar file.
| Parameter | Type | Description |
|---|---|---|
field | str | Metadata field to index. |
index_type | str | "keyword" for string equality / $in, or "numeric" for range queries ($gt, $gte, $lt, $lte). |
db.create_index("source", "keyword")
db.create_index("score", "numeric")
drop_index
db.drop_index(field: str) -> bool
Remove a payload index. Returns True if the index existed.
list_indexes
db.list_indexes() -> list[tuple[str, str]]
Return all active payload indexes as [(field, index_type), ...].
enable_quantization
db.enable_quantization(
    method: str,
    *,
    rescore_multiplier: int = 4,
    num_sub_vectors: int | None = None,
    num_centroids: int = 256,
) -> None
Enable vector quantization to reduce memory and accelerate search. All methods use a 2-stage pipeline: fast quantized candidate selection followed by exact float32 rescoring. Parameters persist in a .vdb.quant sidecar file and reload automatically on open. Indexes rebuild automatically on inserts, upserts, and bulk ingestion.
| Parameter | Type | Description |
|---|---|---|
method | str | "scalar" (int8, ~4x compression), "binary" (~32x compression, best for normalized embeddings), or "product" / "pq" (configurable PQ with k-means). Method names are parsed case-insensitively. |
rescore_multiplier | int | Rescoring budget — the engine fetches k × rescore_multiplier candidates from the quantized scan and rescores them with exact float32 distances. Higher = better recall, slower search. |
num_sub_vectors | int | None | (Product quantization only) Number of sub-vector splits. When None, vectlite picks a dimension-compatible default. Use valid_num_sub_vectors() to see the legal values for your dimension. |
num_centroids | int | (Product quantization only) Centroids per sub-vector codebook. |
db.enable_quantization("scalar")
db.enable_quantization("binary", rescore_multiplier=10)
db.enable_quantization("pq") # dimension-aware default for num_sub_vectors
db.enable_quantization("product", num_sub_vectors=16, num_centroids=256)
disable_quantization
db.disable_quantization() -> None
Disable quantization and remove the persisted .vdb.quant sidecar.
is_quantized
db.is_quantized() -> bool
Whether vector quantization is currently enabled. Method, not a property since 0.9.1.
valid_num_sub_vectors
db.valid_num_sub_vectors() -> list[int]
Return the list of valid num_sub_vectors values for this database's dimension (i.e. the divisors that allow an even split). Useful before calling enable_quantization("product", num_sub_vectors=...).
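For example (the printed divisors depend on your database's dimension; the values shown are illustrative):
print(db.valid_num_sub_vectors())  # e.g. [2, 3, 4, 6, 8, ...] for a 384-dim database
db.enable_quantization("product", num_sub_vectors=16)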
upsert_multi_vectors
db.upsert_multi_vectors(
    id: str,
    vector: list[float],
    multi_vectors: dict[str, list[list[float]]],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
) -> None
Insert or replace a record carrying per-token (ColBERT, ColPali) embeddings for late-interaction search. vector is the usual dense embedding; multi_vectors maps space names (e.g. "colbert", "colpali") to a list of token vectors.
db.upsert_multi_vectors(
    "doc1",
    dense_vector,
    {"colbert": [token_vec_1, token_vec_2, ...]},
    metadata={"source": "paper"},
)
search_multi_vector
db.search_multi_vector(
    space: str,
    query_tokens: list[list[float]],
    *,
    k: int = 10,
    namespace: str | None = None,
    filter: Filter | None = None,
) -> list[SearchResult]
MaxSim late-interaction search: for each query token, take the max cosine similarity against all document tokens, then sum across query tokens.
results = db.search_multi_vector("colbert", query_token_vectors, k=10)
enable_multi_vector_quantization
db.enable_multi_vector_quantization(space: str) -> None
Enable ColBERTv2-style 2-bit quantization (~16x compression) for a multi-vector space. Quantized search uses a 2-stage pipeline: fast approximate MaxSim candidate selection followed by exact float32 rescoring. Parameters persist in a .vdb.mvquant.<space> sidecar file.
disable_multi_vector_quantization
db.disable_multi_vector_quantization(space: str) -> None
Disable 2-bit quantization for a multi-vector space.
is_multi_vector_quantized
db.is_multi_vector_quantized(space: str) -> bool
Whether a multi-vector space currently has 2-bit quantization enabled.
close
db.close() -> None
Flush pending state, release the file lock, and invalidate the handle. Any subsequent data operation raises VectLiteError. Prefer the context-manager form (with vectlite.open(...) as db:) when possible.
flush
db.flush() -> None
Flush pending writes so they are durable on disk. In the current package, flush() is an alias for compact(), so pending WAL entries are also folded into the main .vdb file.
compact
db.compact() -> None
Merge the WAL into the main .vdb file and rebuild ANN indexes if necessary. Call this periodically or after large batch writes.
snapshot
db.snapshot(dest: str) -> None
Create a self-contained copy of the database at dest. Includes all committed data (call compact() first to include WAL entries).
backup
db.backup(dest: str) -> None
Full backup: copies the .vdb file and all ANN sidecar files to the dest directory.
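A typical routine compacts before backing up so pending WAL entries land in the main file (the path is illustrative):
db.compact()                  # fold pending WAL entries into the main .vdb file
db.backup("backups/latest")   # copies the .vdb file plus ANN sidecars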
search
db.search(
    vector: list[float] | None = None,
    *,
    k: int = 10,
    filter: Filter | None = None,
    namespace: str | None = None,
    all_namespaces: bool = False,
    sparse: dict[str, float] | None = None,
    dense_weight: float = 1.0,
    sparse_weight: float = 1.0,
    fusion: str = "linear",
    rrf_k: int = 60,
    fetch_k: int = 0,
    mmr_lambda: float | None = None,
    vector_name: str | None = None,
    query_vectors: dict[str, list[float]] | None = None,
    vector_weights: dict[str, float] | None = None,
    explain: bool = False,
    rerank: RerankHook | None = None,
    rerank_k: int = 0,
    truncate_dim: int | None = None,
) -> list[SearchResult]
Run a search query. Supports dense, sparse, hybrid, and multi-vector search modes.
| Parameter | Type | Default | Description |
|---|---|---|---|
vector | list[float] | None | None | Dense query vector. Pass None for sparse-only or multi-vector search. Must match the database dimension unless truncate_dim is set. An all-zeros vector raises VectLiteError for the cosine and dot-product metrics (a zero vector has no direction, so these similarities are undefined). |
k | int | 10 | Number of results to return. |
filter | Filter | None | None | MongoDB-style metadata filter. |
namespace | str | None | None | Restrict to a namespace. |
all_namespaces | bool | False | Search all namespaces. |
sparse | dict[str, float] | None | None | Sparse term vector for keyword search. |
dense_weight | float | 1.0 | Weight for the dense component in hybrid search. |
sparse_weight | float | 1.0 | Weight for the sparse component in hybrid search. |
fusion | str | "linear" | Fusion strategy: "linear" or "rrf". |
rrf_k | int | 60 | RRF smoothing constant. |
fetch_k | int | 0 | Number of candidates to retrieve before reranking. 0 uses the engine default. |
mmr_lambda | float | None | None | MMR diversity parameter. None disables MMR. |
vector_name | str | None | None | Search a specific named vector space. |
query_vectors | dict[str, list[float]] | None | None | Named query vectors for multi-vector search. |
vector_weights | dict[str, float] | None | None | Weights for multi-vector search. |
explain | bool | False | Include scoring breakdown in results. |
rerank | RerankHook | None | None | Reranker function. |
rerank_k | int | 0 | Candidates to pass to the reranker. |
truncate_dim | int | None | None | Matryoshka prefix search. When set, query and database vectors are truncated to the first truncate_dim dimensions before scoring. Required when the query vector is shorter than the database dimension; without it, such a query raises a dimension-mismatch VectLiteError. |
Returns: list[SearchResult]
Search now strictly rejects query vectors whose length does not match the database dimension. Previously, undersized queries were silently truncated via Matryoshka logic. To opt into prefix search, pass truncate_dim explicitly.
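For example, a hybrid dense plus sparse query with an explain breakdown (query_vec stands in for a query embedding):
results = db.search(
    query_vec,
    k=5,
    sparse=vectlite.sparse_terms("reset password"),
    fusion="rrf",
    explain=True,
)
for r in results:
    print(r["id"], r["score"], r["explain"].get("matched_terms", []))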
search_with_stats
db.search_with_stats(
    # same parameters as search()
) -> SearchResponse
Same as search() but returns a SearchResponse with results and query statistics.
Store
Returned by open_store(). Manages a directory of independent database collections.
Properties
| Property | Type | Description |
|---|---|---|
root | str | Absolute path to the store directory. |
collections
store.collections() -> list[str]
List all collection names in the store.
create_collection
store.create_collection(name: str, dimension: int) -> Database
Create a new collection. Raises VectLiteError if it already exists.
open_or_create_collection
store.open_or_create_collection(name: str, dimension: int) -> Database
Open an existing collection or create a new one.
open_collection
store.open_collection(name: str) -> Database
Open an existing collection. Raises VectLiteError if it does not exist.
drop_collection
store.drop_collection(name: str) -> bool
Delete a collection and all its data from disk. Returns True if the collection existed.
close
store.close() -> None
Provided for symmetry with Database.close(). The store holds no open file handles, so this is currently a no-op, but calling it future-proofs your code and avoids AttributeError surprises across binding versions. Available in Python, Node, Swift, and Kotlin since 0.9.1.
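For example, iterating over every collection in a store:
store = vectlite.open_store("./collections")
for name in store.collections():
    with store.open_collection(name) as db:
        print(name, len(db))
store.close()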
Transaction
Returned by Database.transaction(). Supports use as a context manager (with statement) for automatic commit on success and rollback on exception.
Context Manager
with db.transaction() as tx:
    tx.upsert("id", vector, metadata)
# auto-commits here; rolls back on exception
insert
tx.insert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
    ttl: float | None = None,
) -> None
Queue an insert within the transaction. ttl (seconds from now) sets an expiry on the record.
upsert
tx.upsert(
    id: str,
    vector: list[float],
    metadata: Metadata | None = None,
    *,
    namespace: str | None = None,
    sparse: dict[str, float] | None = None,
    vectors: dict[str, list[float]] | None = None,
    ttl: float | None = None,
) -> None
Queue an upsert within the transaction. ttl (seconds from now) sets an expiry on the record.
insert_many
tx.insert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int
Queue a batch insert. Returns the number of records queued.
upsert_many
tx.upsert_many(
    records: list[Record],
    *,
    namespace: str | None = None,
) -> int
Queue a batch upsert. Returns the number of records queued.
delete
tx.delete(
    id: str,
    *,
    namespace: str | None = None,
) -> bool
Queue a delete within the transaction.
commit
tx.commit() -> None
Commit all queued operations atomically.
rollback
tx.rollback() -> None
Discard all queued operations.
__len__
len(tx) -> int
Return the number of queued operations in the transaction.
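When the with-statement form does not fit, commit or roll back explicitly; a minimal sketch (records is an illustrative batch):
tx = db.transaction()
try:
    tx.upsert_many(records)
    print(len(tx), "operations queued")
    tx.commit()
except Exception:
    tx.rollback()
    raise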
Types
MetadataValue
MetadataValue = str | int | float | bool | None | list | dict
A single metadata field value.
Metadata
Metadata = dict[str, MetadataValue]
A metadata dictionary attached to a record.
Filter
Filter = dict[str, Any]
MongoDB-style filter expression. See Metadata Filters for the full query syntax.
RerankHook
RerankHook = Callable[[dict[str, Any], list[dict[str, Any]]], list[dict[str, Any]]]
A function that receives a query payload dict and a list of result dicts, and returns a reordered list. Used with the rerank parameter.
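A minimal custom hook sketch; score is a documented SearchResult field, while the updated_at metadata key and query_vec are hypothetical:
def newest_first(query, results):
    # Reorder by fused score, breaking ties with a (hypothetical) metadata timestamp.
    return sorted(
        results,
        key=lambda r: (r["score"], r.get("metadata", {}).get("updated_at", 0)),
        reverse=True,
    )

results = db.search(query_vec, k=10, rerank=newest_first, rerank_k=50)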
Record
class Record(TypedDict, total=False):
    namespace: str
    id: str  # Required
    vector: list[float]  # Required
    vectors: dict[str, list[float]]
    multi_vectors: dict[str, list[list[float]]]
    sparse: dict[str, float]
    metadata: Metadata
    expires_at: float | None  # epoch seconds (TTL)
A record dictionary. Used for batch operations and returned by get(). expires_at is the absolute expiry timestamp in epoch seconds when a TTL is set (otherwise None).
SearchResult
class SearchResult(TypedDict):
    namespace: str
    id: str
    score: float
    dense_score: float
    sparse_score: float
    vector_name: str | None
    matched_terms: list[str]
    dense_rank: int | None
    sparse_rank: int | None
    bm25_term_scores: dict[str, float]
    rerank_score: float
    metadata: Metadata
    explain: ExplainDetails
A single search result.
ExplainDetails
class ExplainDetails(TypedDict, total=False):
    fusion: str
    dense_score: float
    sparse_score: float
    matched_terms: list[str]
    vector_name: str | None
    dense_rank: int | None
    sparse_rank: int | None
    bm25_term_scores: dict[str, float]
    rerankers: list[dict[str, Any]]
Scoring breakdown when explain=True.
SearchStats
class SearchStats(TypedDict):
    used_ann: bool
    ann_candidate_count: int
    exact_fallback: bool
    considered_count: int
    fetch_k: int
    mmr_applied: bool
    sparse_candidate_count: int
    ann_loaded_from_disk: bool
    wal_entries_replayed: int
    fusion: str
    rerank_applied: bool
    rerank_count: int
    timings: SearchTimings
Engine statistics for a search query.
SearchTimings
class SearchTimings(TypedDict):
    dense_us: int
    sparse_us: int
    fusion_us: int
    total_us: int
Timing breakdown in microseconds.
SearchResponse
class SearchResponse(TypedDict):
    results: list[SearchResult]
    stats: SearchStats
Returned by search_with_stats() and search_text_with_stats().
Exceptions
VectLiteError
class VectLiteError(Exception): ...
Base exception for all vectlite errors. Raised for:
- Write operations on a read-only database
- Duplicate ID on insert()
- Dimension mismatch
- Corrupt database file
- File lock contention
- I/O errors
Sub-modules
vectlite.analyzers
Text analysis utilities for customizing sparse tokenization.
Analyzer
class Analyzer:
    def __init__(self) -> None: ...
    def tokenizer(self, fn: Callable[[str], list[str]]) -> Analyzer: ...
    def lowercase(self) -> Analyzer: ...
    def stopwords(self, lang_or_set: str | frozenset[str] | set[str]) -> Analyzer: ...
    def stemmer(self, lang: str = "english") -> Analyzer: ...
    def ngrams(self, n: int = 2) -> Analyzer: ...
    def filter(self, fn: Callable[[list[str]], list[str]]) -> Analyzer: ...
    def tokenize(self, text: str) -> list[str]: ...
    def sparse_terms(self, text: str) -> dict[str, float]: ...
    def sparse_terms_weighted(
        self,
        fields: dict[str, str],
        weights: dict[str, float] | None = None,
    ) -> dict[str, float]: ...
Methods:
- tokenizer(fn) -- replace the default tokenizer.
- lowercase() -- add a lowercase filter.
- stopwords(lang_or_set) -- remove stopwords by language code or custom set.
- stemmer(lang) -- add a Snowball stemmer. Requires PyStemmer.
- ngrams(n) -- add character n-gram generation.
- filter(fn) -- add a custom token filter.
- tokenize(text) -- return the processed token list.
- sparse_terms(text) -- return a sparse term-frequency vector.
- sparse_terms_weighted(fields, weights) -- combine multiple text fields with per-field weights.
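For example, a chained analyzer feeding upsert(); stemmer() requires PyStemmer, and embedding stands in for a dense vector:
from vectlite.analyzers import Analyzer, ENGLISH_STOPWORDS

analyzer = Analyzer().lowercase().stopwords(ENGLISH_STOPWORDS).stemmer("english")
terms = analyzer.sparse_terms("The quick brown foxes were jumping")
db.upsert("doc1", embedding, sparse=terms)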
Constants
vectlite.analyzers.ENGLISH_STOPWORDS: frozenset[str]
vectlite.analyzers.FRENCH_STOPWORDS: frozenset[str]
Pre-built stopword sets.
vectlite.rerankers
Composable reranking functions for search post-processing.
text_match
vectlite.rerankers.text_match(
    *,
    text_key: str = "text",
    title_key: str | None = "title",
    text_weight: float = 1.0,
    title_weight: float = 1.5,
    matched_term_weight: float = 0.25,
    phrase_boost: float = 1.0,
) -> RerankHook
Boost results according to term overlap between the query payload and metadata text/title fields.
metadata_boost
vectlite.rerankers.metadata_boost(
    field: str,
    boosts: Mapping[Any, float],
    *,
    default: float = 0.0,
) -> RerankHook
Adjust scores by adding the configured boost for a metadata field value.
cross_encoder
vectlite.rerankers.cross_encoder(
    model_name_or_path: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    *,
    text_key: str = "text",
    batch_size: int = 32,
    device: str | None = None,
) -> RerankHook
Rerank using a sentence-transformers cross-encoder loaded from a model name or local path.
| Parameter | Type | Description |
|---|---|---|
model_name_or_path | str | HuggingFace model name or local path. |
text_key | str | Metadata field containing document text. |
batch_size | int | Batch size for inference. |
device | str | None | Device such as "cpu" or "cuda". |
bi_encoder
vectlite.rerankers.bi_encoder(
    model_name_or_path: str = "sentence-transformers/all-MiniLM-L6-v2",
    *,
    text_key: str = "text",
    batch_size: int = 64,
    device: str | None = None,
) -> RerankHook
Rerank using a sentence-transformers bi-encoder loaded from a model name or local path.
onnx_cross_encoder
vectlite.rerankers.onnx_cross_encoder(
    model_name_or_path: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    *,
    text_key: str = "text",
    batch_size: int = 32,
) -> RerankHook
Zero-PyTorch cross-encoder reranking via onnxruntime + tokenizers. Auto-downloads models from the HuggingFace Hub. Same RerankHook interface as cross_encoder().
Requires: pip install onnxruntime tokenizers huggingface-hub.
reranker = vectlite.rerankers.onnx_cross_encoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
results = db.search(query, k=20, rerank=reranker)
compose
vectlite.rerankers.compose(
    *hooks: RerankHook,
    strategy: str = "sequential",
    rank_constant: int = 60,
) -> RerankHook
Compose multiple rerankers either sequentially or with reciprocal rank fusion.
reranker = vectlite.rerankers.compose(
    vectlite.rerankers.text_match(text_weight=0.5),
    vectlite.rerankers.metadata_boost("priority", {"high": 2.0, "low": 0.5}),
)
results = db.search(query_emb, k=10, rerank=reranker, rerank_k=50)
vectlite.embedders
Ready-to-use embedding functions for upsert_text() and search_text(). Every provider lazy-imports its SDK, so none of them are runtime dependencies.
from vectlite import embedders
embed = embedders.openai("text-embedding-3-small")
embed = embedders.cohere("embed-english-v3.0")
embed = embedders.voyage("voyage-3")
embed = embedders.fastembed("BAAI/bge-small-en-v1.5") # local ONNX
embed = embedders.sentence_transformer("sentence-transformers/all-MiniLM-L6-v2")
embed = embedders.ollama("nomic-embed-text")
vectlite.upsert_text(db, "doc1", "Hello world", embed)
results = vectlite.search_text(db, "greeting", embed, k=5)
| Factory | Required extra | Description |
|---|---|---|
embedders.openai(model) | openai | OpenAI hosted embeddings (reads OPENAI_API_KEY). |
embedders.cohere(model) | cohere | Cohere hosted embeddings. |
embedders.voyage(model) | voyageai | Voyage AI hosted embeddings. |
embedders.fastembed(model) | fastembed | Local ONNX inference, no API calls. |
embedders.sentence_transformer(model) | sentence-transformers | Local PyTorch inference. |
embedders.ollama(model) | -- | Calls a local Ollama server over HTTP. |
Each factory returns a Callable[[str], list[float]].
vectlite.schema
Optional typed metadata schemas with clear error messages on type mismatch.
from vectlite import schema
s = schema.Schema({
    "price": "number",
    "title": "string",
    "tags": "array<string>",
    "author": {
        "name": "string",
        "age": "number",
    },
}, strict=True)  # strict=True rejects unknown fields
s.validate({"price": 9.99, "title": "Hello"}) # OK
s.validate({"price": "free"}) # raises SchemaError
# Wrap a database to auto-validate on every write
validated_db = schema.validated(db, s)
validated_db.upsert("doc1", vector, {"price": 9.99}) # OK
validated_db.upsert("doc2", vector, {"price": "free"}) # raises SchemaError
# Persist alongside the database in a .vdb.schema.json sidecar
s.save(db)
loaded = schema.load(db)
Supported types: string, number, integer, boolean, null, any, array, array<string>, array<number>, object, plus nested objects.
SchemaError
class schema.SchemaError(VectLiteError): ...
Raised by Schema.validate() and schema.validated() writes when metadata fails type or strict-mode checks.
vectlite.langchain
LangChain VectorStore implementation. Requires pip install langchain-core.
from vectlite.langchain import VectLiteVectorStore
from langchain_openai import OpenAIEmbeddings
store = VectLiteVectorStore(
    path="my.vdb",
    embedding=OpenAIEmbeddings(),
    dimension=1536,
)
store.add_texts(["Hello world", "How to authenticate"])
results = store.similarity_search("greeting", k=3)
results_with_scores = store.similarity_search_with_score("greeting", k=3)
Implements add_texts, add_documents, similarity_search, similarity_search_with_score, similarity_search_by_vector, delete, and the from_texts class method. Compatible with VectorStoreIndex, RetrievalQA, and other LangChain consumers.
vectlite.llamaindex
LlamaIndex VectorStore implementation. Requires pip install llama-index-core.
from vectlite.llamaindex import VectLiteVectorStore
from llama_index.core import StorageContext, VectorStoreIndex
store = VectLiteVectorStore(path="my.vdb", dimension=1536)
storage_ctx = StorageContext.from_defaults(vector_store=store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_ctx)
query_engine = index.as_query_engine()
response = query_engine.query("How do I authenticate?")
Implements add, delete, and query against LlamaIndex's VectorStore protocol.
Command-Line Interface
Installed as the vectlite console script (also reachable via python -m vectlite).
| Command | Purpose |
|---|---|
vectlite stats <path> | Dimension, metric, record counts, file sizes, active indexes. |
vectlite count <path> [--namespace ns] [--filter JSON] | Filtered or namespace-scoped count. |
vectlite list <path> [--limit N] [--filter JSON] [--namespace ns] | List records as JSON. |
vectlite dump <path> | Export every record as JSONL via cursor pagination. |
vectlite search <path> --query JSON [--k N] [--filter JSON] | Run a dense search and print results. |
vectlite compact <path> | Fold the WAL into the snapshot and persist ANN sidecars. |
vectlite verify <path> | Check WAL + snapshot integrity. |
vectlite bench <path> --queries N [--k N] | Search benchmark with QPS and latency. |
vectlite import-jsonl <path> <file> --dimension N | Bulk import from a JSONL file. |
vectlite import-csv <path> <file> --dimension N --vector-col embedding | Bulk import from a CSV file. |
vectlite stats my.vdb
vectlite count my.vdb --namespace blog
vectlite list my.vdb --limit 10 --filter '{"source": "blog"}'
vectlite dump my.vdb > backup.jsonl
vectlite import-jsonl my.vdb data.jsonl --dimension 384
vectlite bench my.vdb --queries 1000 --k 10