Quick Start¶
Get Uni running with real data in under 5 minutes. This guide walks you through building the project, importing a sample dataset, and executing your first graph queries.
Prerequisites¶
Before starting, ensure you have: - Uni installed (Installation Guide) - At least 1 GB free disk space for sample data - Terminal access
Step 1: Build the Project¶
If you haven't already built Uni, compile it in release mode for optimal performance:
The binary will be available at target/release/uni. For convenience, you can add it to your PATH:
Step 2: Explore the Demo Dataset¶
Uni includes a sample dataset based on academic papers and citations, perfect for learning graph queries.
Dataset Overview¶
| Entity | Description | Sample Size |
|---|---|---|
| Papers | Academic publications with titles, years, and embeddings | ~1,000 |
| Citations | Directed edges from citing to cited papers | ~5,000 |
Data Format¶
Vertices (papers.jsonl):
{"id": "paper_001", "title": "Attention Is All You Need", "year": 2017, "venue": "NeurIPS", "embedding": [0.12, -0.34, ...]}
{"id": "paper_002", "title": "BERT: Pre-training of Deep Bidirectional Transformers", "year": 2018, "venue": "NAACL", "embedding": [0.08, -0.21, ...]}
Edges (citations.jsonl):
The src paper cites the dst paper. In graph terms: (paper_002)-[:CITES]->(paper_001).
Step 3: Import Data¶
Use the import command to ingest the sample dataset. This creates the graph structure, property storage, and vector indexes.
uni import semantic-scholar \
--papers demos/demo01/data/papers.jsonl \
--citations demos/demo01/data/citations.jsonl \
--output ./my-first-graph
What Happens During Import¶
[1/5] Validating schema...
[2/5] Parsing vertex data (1,000 papers)...
[3/5] Allocating vertex IDs...
[4/5] Building adjacency index (5,000 edges)...
[5/5] Writing to Lance storage...
Import complete!
Vertices: 1,000
Edges: 5,000
Storage: ./my-first-graph (12.4 MB)
The import process: 1. Schema Validation — Ensures all properties match expected types 2. ID Allocation — Maps string IDs to internal 64-bit integers (VIDs) 3. Graph Construction — Builds CSR-format adjacency lists 4. Property Storage — Writes columnar data to Lance datasets 5. Vector Indexing — Creates HNSW indexes for embedding search
Step 4: Use the Interactive REPL¶
The easiest way to explore Uni is via the interactive shell (REPL). It supports syntax highlighting, command history, and formatted table output.
Start the REPL by running uni without any arguments (or with the repl command):
You should see the welcome message:
Try These Queries in the REPL:¶
1. Count all papers:
2. Find papers cited by "Attention Is All You Need":
MATCH (source:Paper {title: 'Attention Is All You Need'})-[:CITES]->(cited)
RETURN cited.title, cited.year
ORDER BY cited.year DESC
3. Clear the screen:
4. Exit the shell:
Step 5: Run One-off Queries¶
You can also run queries directly from your terminal using the query command:
Step 6: Create New Data¶
Add new papers and citations with CREATE:
# Create a new paper
uni query "
CREATE (p:Paper {
id: 'new_paper_001',
title: 'My Research Paper',
year: 2024,
venue: 'ArXiv'
})
RETURN p.title
" --path ./my-first-graph
# Create a citation edge
uni query "
MATCH (citing:Paper), (cited:Paper)
WHERE citing.id = 'new_paper_001'
AND cited.title = 'Attention Is All You Need'
CREATE (citing)-[:CITES]->(cited)
RETURN citing.title, cited.title
" --path ./my-first-graph
Understanding Query Execution¶
When you run a query, Uni:
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ 1. Parse │ ──▶ │ 2. Plan │ ──▶ │ 3. Execute │
│ Cypher │ │ Optimize │ │ Vectorized │
└────────────────┘ └────────────────┘ └────────────────┘
│ │ │
▼ ▼ ▼
"MATCH (n)..." LogicalPlan with Process batches
becomes AST filter pushdown of 1024 rows
- Parse: Cypher text → Abstract Syntax Tree
- Plan: AST → Optimized Logical Plan (with predicate pushdown)
- Execute: Vectorized operators process Arrow batches
What's Next?¶
You've successfully: - Imported graph data from JSONL files - Traversed graph relationships with MATCH - Filtered results with WHERE clauses - Aggregated data with COUNT and ORDER BY - Searched vectors with semantic similarity
Continue Learning¶
| Topic | Description |
|---|---|
| CLI Reference | All CLI commands and options |
| Cypher Querying | Complete query language guide |
| Vector Search | Deep dive into semantic search |
| Data Ingestion | Import strategies and formats |
| Architecture | Understand Uni's internals |
Troubleshooting¶
"No vertices found"¶
Ensure your JSONL files have the correct format:
"Property not found"¶
Check that query property names match your schema exactly (case-sensitive):
"Path does not exist"¶
Provide the full path to your storage directory:
Slow queries¶
For large datasets, ensure vector indexes are built: