Vector Search
Uni provides native vector search over embedding properties with 8 single-vector ANN index algorithms (Flat, IVF-Flat/SQ/PQ/RQ, HNSW-Flat/SQ/PQ) featuring scalar, product, and RaBitQ quantization — plus MUVERA for multi-vector (ColBERT / late-interaction) columns. The default algorithm is IVF-PQ with cosine distance. Use it for semantic search, RAG, and similarity-based retrieval.
What It Provides
- Vector properties stored alongside graph data.
- ANN indexes with cosine, L2, or dot distance — scores are automatically converted to [0, 1] similarity regardless of metric.
- CALL uni.vector.query(...) for KNN retrieval.
- similar_to() expression function for point-scoring bound nodes in WHERE, RETURN, and Locy rules.
- Auto-embedding: Pass text directly and let Uni embed it using the index's configured embedding model.
- Multi-vector (ColBERT / late-interaction): store many vectors per row and rank by exact MaxSim, optionally accelerated by a MUVERA first-stage index.
Example
use uni_db::{DataType, IndexType, Uni, VectorAlgo, VectorIndexCfg, VectorMetric};

async fn demo() -> Result<(), uni_db::UniError> {
    let db = Uni::open("./my_db").build().await?;

    // Declare a Document label with a 384-dimensional vector property and an HNSW-SQ index.
    db.schema()
        .label("Document")
        .property("title", DataType::String)
        .property("embedding", DataType::Vector { dimensions: 384 })
        .index("embedding", IndexType::Vector(VectorIndexCfg {
            algorithm: VectorAlgo::HnswSq { m: 16, ef_construction: 200, partitions: None },
            metric: VectorMetric::Cosine,
            embedding: None, // Or configure auto-embed
        }))
        .apply()
        .await?;

    // Top-10 KNN query. The 3-element vector is a shortened placeholder;
    // a real query vector would have 384 dimensions to match the property.
    let session = db.session();
    let rows = session.query_with(
        "CALL uni.vector.query('Document', 'embedding', $q, 10) YIELD node, score RETURN node, score"
    )
    .param("q", vec![0.1_f32, 0.2, 0.3])
    .fetch_all()
    .await?;
    println!("{:?}", rows);
    Ok(())
}
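# Python equivalent
import uni_db

db = uni_db.Uni.open("./my_db")

# Declare a Document label with a 384-dimensional vector property and a vector index.
db.schema() \
    .label("Document") \
    .property("title", "string") \
    .vector("embedding", 384) \
    .index("embedding", "vector") \
    .done() \
    .apply()

# Top-10 KNN query. The 3-element list is a shortened placeholder;
# a real query vector would have 384 dimensions to match the property.
session = db.session()
rows = session.query(
    "CALL uni.vector.query('Document', 'embedding', $q, 10) YIELD node, score RETURN node, score",
    {"q": [0.1, 0.2, 0.3]}
)
print(rows)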
Auto-Embedding Queries
With an embedding configuration on your index, you can query with text directly:
-- Create index with embedding config
CREATE VECTOR INDEX doc_embed FOR (d:Document) ON (d.embedding)
OPTIONS {
  metric: 'cosine',
  embedding: {
    alias: 'embed/default',
    source: ['title'],
    batch_size: 32
  }
}

-- Query with text - Uni auto-embeds it
CALL uni.vector.query('Document', 'embedding', 'machine learning tutorial', 10)
YIELD node, score
RETURN node.title, score
The ~= Operator
The ~= (approximate equality) operator is shorthand for a top-K vector index scan — it desugars to uni.vector.query under the hood:
-- ~= operator: top-K scan against the vector index
MATCH (p:Paper) WHERE p.embedding ~= $query_vector
RETURN p.title, p._score AS score
ORDER BY score DESC LIMIT 10
~= is vector-only — it cannot do FTS or hybrid search. For hybrid search, use similar_to() with multi-source arrays.
=~ is regex, ~= is vector similarity
These are unrelated operators that look similar. n.name =~ '(?i)john' is a regex match on strings. n.embedding ~= $vec is vector similarity search.
Expression-Based Scoring: similar_to
For scoring already-bound nodes rather than top-K retrieval, use similar_to():
MATCH (a:Paper)-[:CITES]->(b:Paper)
WHERE similar_to(b.embedding, 'attention mechanisms') > 0.7
RETURN b.title, similar_to(b.embedding, 'attention mechanisms') AS score
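The difference between top-K retrieval and point-scoring can be sketched in plain Python. This is an illustrative sketch, not Uni's implementation: `cosine`, `top_k`, and `score_bound` are hypothetical helpers, the vectors are made up, and raw cosine is used here rather than Uni's [0, 1] similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(nodes, query, k):
    """Top-K retrieval: rank every node under the label and keep the k best."""
    scored = [(name, cosine(vec, query)) for name, vec in nodes.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def score_bound(nodes, bound_names, query, threshold):
    """Point-scoring: score only the nodes a pattern already bound, then filter."""
    scored = [(name, cosine(nodes[name], query)) for name in bound_names]
    return [(name, s) for name, s in scored if s > threshold]


# Made-up 2-dimensional embeddings, for illustration only.
papers = {"A": [1.0, 0.0], "B": [0.8, 0.6], "C": [0.0, 1.0]}
query = [1.0, 0.0]

top = top_k(papers, query, 2)                        # considers A, B, and C
bound = score_bound(papers, ["B", "C"], query, 0.7)  # considers only B and C
```

Top-K ranks the whole label, so `A` wins there; point-scoring never sees `A` because the traversal did not bind it, which is why the two forms answer different questions.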
For hybrid search (vector + FTS combined), use multi-source arrays with fusion:
-- Correct hybrid: single similar_to with multi-source arrays
MATCH (d:Doc)
RETURN d.title,
similar_to([d.embedding, d.content], [$query_vector, $query_text]) AS score
ORDER BY score DESC
similar_to supports metric-aware vector scoring (Cosine, L2, Dot Product), FTS scoring, and multi-source hybrid fusion with RRF or weighted algorithms. It automatically uses the distance metric configured on the vector index. It works in WHERE, RETURN, ORDER BY, and Locy rule bodies. See the Vector Search Guide for full details.
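As a rough illustration of what RRF (Reciprocal Rank Fusion) does when it combines a vector ranking with an FTS ranking, here is a minimal sketch. The function name `rrf_fuse`, the document IDs, and the constant `k=60` (a conventional default) are assumptions for illustration; the source does not state which constant or weighting Uni uses.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank), ranks start at 1."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings from the two sources, best match first.
vector_ranking = ["d1", "d2", "d3"]
fts_ranking = ["d2", "d4", "d1"]

fused = rrf_fuse([vector_ranking, fts_ranking])
```

RRF works on ranks rather than raw scores, so it needs no calibration between a vector similarity and an FTS relevance score; documents that appear in both lists (`d2`, `d1`) rise above those found by only one source.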
Uni-Xervo Runtime
Beyond auto-embedding, the Uni::xervo() facade gives direct access to embedding and generation models:
use uni_db::xervo::{Message, GenerationOptions};

let xervo = db.xervo();

// Structured messages with roles
let result = xervo.generate("llm/default", &[
    Message::system("You are a helpful assistant."),
    Message::user("Summarize this document."),
], GenerationOptions::default()).await?;

// Or plain strings (convenience)
let result = xervo.generate_text("llm/default",
    &["Summarize this."],
    GenerationOptions::default(),
).await?;
Uni-Xervo supports local providers (Candle, mistral.rs, ONNX Runtime — embed/rerank/raw) and remote providers (OpenAI, Gemini, Anthropic, Cohere, Vertex AI, Mistral, Voyage AI, Azure OpenAI). See the Vector Search Guide for the full provider table and configuration details.
Use Cases
- Semantic search for documents or products.
- RAG retrieval over knowledge graphs.
- Similarity search over embeddings generated in-app.
- Scoring graph-traversed nodes with similar_to() in WHERE and Locy rules.
- LLM generation with context from graph queries.
Cross-Encoder Reranking
For higher-precision results, add a cross-encoder reranking stage to uni.vector.query. The reranker re-scores over-fetched candidates using a (query, document) cross-encoder model:
CALL uni.vector.query('Document', 'embedding', 'graph databases', 10,
null, null,
{reranker: 'rerank/minilm', reranker_property: 'content'})
YIELD node, score, rerank_score
RETURN node.title, score, rerank_score
The reranker supports local ONNX models (local/onnx) and remote APIs (Cohere, Voyage AI). See Hybrid Search — Reranking for full details.
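The over-fetch-then-rerank pattern can be sketched independently of any model. In this sketch `toy_pair_score` is a deliberately crude stand-in for a cross-encoder (word-set overlap, not a neural model), and `rerank`, the candidate tuples, and their scores are all hypothetical.

```python
def toy_pair_score(query, text):
    """Stand-in for a cross-encoder: Jaccard overlap of lowercase word sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def rerank(query, candidates, k, pair_score=toy_pair_score):
    """Re-score over-fetched (doc_id, first_stage_score, text) rows and keep the top k."""
    rescored = [
        (doc_id, score, pair_score(query, text))
        for doc_id, score, text in candidates
    ]
    rescored.sort(key=lambda row: row[2], reverse=True)
    return rescored[:k]


# Three over-fetched candidates from a hypothetical first stage, best first.
candidates = [
    ("d1", 0.91, "relational databases and SQL"),
    ("d2", 0.88, "graph databases store nodes and edges"),
    ("d3", 0.85, "cooking with cast iron"),
]

results = rerank("graph databases", candidates, k=2)
```

Each result row keeps both the first-stage score and the rerank score, mirroring the `score` and `rerank_score` columns yielded above. Over-fetching matters because the reranker can only promote documents the first stage returned.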
Multi-Vector Search (ColBERT / Late-Interaction)
Late-interaction models (ColBERT, ColQwen2) represent each document and query as a set of per-token vectors rather than a single dense vector, then score with MaxSim: for each query token, take its best match across the document's tokens, and sum over the query tokens:
$$\mathrm{MaxSim}(Q, D) = \sum_{q \in Q} \max_{d \in D} \mathrm{sim}(q, d)$$
This gives token-level matching precision and is the strongest known approach for visual/layout-rich and long-document retrieval.
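The MaxSim rule described above is short enough to write out directly. This is a minimal sketch with made-up 2-dimensional token vectors; `maxsim` and `dot` are illustrative helpers, and the dot product stands in for the similarity function (for cosine, normalize the token vectors first).

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens, doc_tokens, sim=dot):
    """For each query token take its best match among the document tokens, then sum.

    Assumes doc_tokens is non-empty (max() over an empty sequence raises ValueError).
    """
    return sum(max(sim(q, d) for d in doc_tokens) for q in query_tokens)


query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has an exact match for each query token
doc_b = [[0.5, 0.5]]                          # one token that partly matches both

score_a = maxsim(query_tokens, doc_a)
score_b = maxsim(query_tokens, doc_b)
```

`doc_a` scores higher because every query token finds its own best-matching document token, which a single pooled vector such as `doc_b` cannot provide.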
Declaring a multi-vector property
A multi-vector property is a LIST<VECTOR(dim)> — a variable-length list of fixed-size token vectors. It must be schema-declared (multi-vectors cannot be stored on a schemaless property; a CypherValue/JSON column is the flexible-dimension alternative).
The Pydantic OGM maps a list[Vector[96]] field to the same list:vector:96 type — see Pydantic OGM reference.
Querying with MaxSim
Retrieve candidates with a fast first stage (dense ANN, or a MUVERA index over the tokens), then re-rank them by exact MaxSim. Pass the per-token query via maxsim_query:
CALL uni.vector.query(
  'Document',
  'embedding',            -- dense property for first-stage ANN
  $dense_query_vector,
  50,                     -- over-fetch candidates to re-rank
  null,
  null,
  {
    reranker: 'maxsim',
    reranker_property: 'tokens',             -- the LIST<VECTOR> property
    maxsim_query: [[0.1, 0.2], [0.3, 0.4]],  -- per-token query embeddings
    maxsim_metric: 'cosine'                  -- optional; default 'cosine'
  }
)
YIELD node, score, rerank_score
RETURN node.title, rerank_score
ORDER BY rerank_score DESC
You can also query a multi-vector property directly (no separate dense stage) by passing a list of vectors as the query — score is then the exact MaxSim similarity:
CALL uni.vector.query('Document', 'tokens', [[0.1, 0.2], [0.3, 0.4]], 10)
YIELD node, score
RETURN node.title, score
ORDER BY score DESC
MUVERA first-stage index
For large corpora, add a MUVERA index on the multi-vector column. It encodes each row's token set into a single fixed-dimensional vector (FDE), indexes that with a normal single-vector ANN for fast candidate generation, then re-ranks with exact MaxSim:
CREATE VECTOR INDEX doc_tokens FOR (d:Document) ON d.tokens
OPTIONS { type: 'muvera', k_sim: 4, reps: 20, d_proj: 16 }
Because the final stage is always an exact MaxSim re-rank, a weak FDE only costs recall, never precision. See Indexing — MUVERA Multi-Vector Indexes for parameters and tuning, and Hybrid Search — Reranking for using MaxSim inside uni.fts.query / uni.search.
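To make the FDE idea concrete, here is a simplified sketch of how a token set can be folded into one fixed-length vector. It assumes the `k_sim`, `reps`, and `d_proj` options correspond to the usual MUVERA construction (`k_sim` random hyperplanes giving 2^k_sim buckets, `reps` independent repetitions, and a random projection to `d_proj` dimensions); that mapping, the helper `make_fde_encoder`, and the omission of empty-bucket filling and any final projection are assumptions of this sketch, not a description of Uni's implementation.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_fde_encoder(dim, k_sim, reps, d_proj, seed=0):
    """Build an encoder that maps a list of token vectors to one fixed-length vector."""
    rng = random.Random(seed)
    # Per repetition: k_sim Gaussian hyperplanes (bucketing) and a +/-1 projection matrix.
    planes = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(k_sim)]
              for _ in range(reps)]
    projs = [[[rng.choice((-1.0, 1.0)) / d_proj ** 0.5 for _ in range(dim)]
              for _ in range(d_proj)] for _ in range(reps)]
    buckets = 2 ** k_sim

    def encode(tokens, is_query):
        fde = []
        for r in range(reps):
            sums = [[0.0] * d_proj for _ in range(buckets)]
            counts = [0] * buckets
            for tok in tokens:
                # Bucket id = sign pattern of the token against the hyperplanes.
                b = 0
                for plane in planes[r]:
                    b = (b << 1) | int(dot(plane, tok) > 0)
                for i, row in enumerate(projs[r]):
                    sums[b][i] += dot(row, tok)
                counts[b] += 1
            for b in range(buckets):
                # Queries sum per bucket; documents average per bucket.
                scale = 1.0 if is_query or counts[b] == 0 else 1.0 / counts[b]
                fde.extend(v * scale for v in sums[b])
        return fde

    return encode


# Parameters copied from the index options above; dim=8 is an arbitrary toy size.
encode = make_fde_encoder(dim=8, k_sim=4, reps=20, d_proj=16)
token = [0.3, -1.2, 0.5, 0.0, 2.0, -0.7, 0.1, 0.9]
doc_fde = encode([token], is_query=False)
query_fde = encode([token], is_query=True)
fde_dim = len(doc_fde)
self_score = dot(query_fde, doc_fde)
```

Under these assumptions the FDE length is `reps * 2**k_sim * d_proj`, so the options shown above would give 20 × 16 × 16 = 5120 dimensions. The dot product of a query FDE and a document FDE serves only as a cheap first-stage proxy for MaxSim, which is why the exact re-rank still follows.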
When To Use
Choose vector search when you need semantic similarity rather than exact matching. Pair it with graph traversal for contextual results.
- Use CALL uni.vector.query(...) to find top-K candidates from a full label.
- Use similar_to() to score nodes already bound by MATCH.
- Add a reranker option for higher-precision results on the final candidate set.
See also: Full-Text Search | Hybrid Search