Skip to content

Storage Engine Internals

Uni's storage engine is architected for high-throughput ingestion and low-latency analytics, leveraging a tiered LSM-tree-like structure backed by Lance columnar format. This design enables both OLTP-style writes and OLAP-style analytical queries on the same data.

Architecture Overview

flowchart TB Writes["Writes"] subgraph L0["L0: Memory"] SG["SimpleGraph<br/>(graph)"] WAL["WAL<br/>(log)"] PropBuf["Property Maps<br/>(HashMap)"] end subgraph Delta["Delta Layer"] DeltaDS["DeltaDataset<br/>(Lance)"] end subgraph Base["Base Layer"] Compacted["Compacted Lance Dataset<br/>โ€ข Full indexes (Vector, Scalar)<br/>โ€ข Optimized row groups<br/>โ€ข Statistics per column"] end Reads["Reads โ†’ Merge View:<br/>L0 โˆช Delta โˆช Base<br/>(VID-based deduplication)"] Writes --> L0 L0 -->|flush| Delta Delta -->|compact| Base Base --> Reads

Tiered Storage Model

L0: Memory Buffer

The L0 layer handles all incoming writes with maximum throughput.

pub struct L0Buffer {
    /// In-memory graph structure (SimpleGraph)
    pub graph: SimpleGraph,

    /// Edge tombstones with version tracking
    pub tombstones: HashMap<Eid, TombstoneEntry>,

    /// Vertex tombstones
    pub vertex_tombstones: HashSet<Vid>,

    /// Edge version tracking
    pub edge_versions: HashMap<Eid, u64>,

    /// Vertex version tracking
    pub vertex_versions: HashMap<Vid, u64>,

    /// Edge properties (separate from graph structure)
    pub edge_properties: HashMap<Eid, Properties>,

    /// Vertex properties (separate from graph structure)
    pub vertex_properties: HashMap<Vid, Properties>,

    /// Edge endpoint mapping (eid -> (src, dst, type_id))
    pub edge_endpoints: HashMap<Eid, (Vid, Vid, u16)>,

    /// Current version number
    pub current_version: u64,

    /// Mutation counter for flush triggering
    pub mutation_count: usize,

    /// Write-ahead log reference
    pub wal: Option<Arc<WriteAheadLog>>,

    /// WAL LSN at last flush
    pub wal_lsn_at_flush: u64,
}

Characteristics:

Property Value Notes
Format SimpleGraph + HashMap Row-oriented for inserts
Durability WAL-backed Survives crashes
Latency ~550ยตs per 1K writes Memory-speed
Capacity Configurable (default 128MB) Auto-flush when full

Write Path:

1. Acquire write lock (single-writer)
2. Append to WAL (sync or async based on config)
3. Insert into SimpleGraph (vertex/edge)
4. Store properties in HashMap
5. Increment mutation counter
6. If threshold reached โ†’ trigger async flush

Auto-Flush Triggers

L0 buffer is automatically flushed to L1 (Lance storage) based on two configurable triggers:

Trigger Default Description
Mutation Count 10,000 Flush when mutations exceed threshold
Time Interval 5 seconds Flush after time elapsed (if mutations > 0)

Configuration:

let config = UniConfig {
    // Flush on mutation count (high-transaction systems)
    auto_flush_threshold: 10_000,

    // Flush on time interval (low-transaction systems)
    auto_flush_interval: Some(Duration::from_secs(5)),

    // Minimum mutations before time-based flush triggers
    auto_flush_min_mutations: 1,
    ..Default::default()
};

Flush Decision Logic:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                         AUTO-FLUSH DECISION                                  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                             โ”‚
โ”‚   After each write:                                                         โ”‚
โ”‚                                                                             โ”‚
โ”‚   mutations >= 10,000?  โ”€โ”€โ”€โ”€โ”€Yesโ”€โ”€โ”€โ”€โ–บ  FLUSH IMMEDIATELY                   โ”‚
โ”‚           โ”‚                                                                 โ”‚
โ”‚           No                                                                โ”‚
โ”‚           โ”‚                                                                 โ”‚
โ”‚           โ–ผ                                                                 โ”‚
โ”‚   Background timer (every 5s):                                              โ”‚
โ”‚                                                                             โ”‚
โ”‚   time_since_last_flush >= 5s                                               โ”‚
โ”‚   AND mutations >= min_mutations?  โ”€โ”€โ”€โ”€โ”€Yesโ”€โ”€โ”€โ”€โ–บ  FLUSH                    โ”‚
โ”‚           โ”‚                                                                 โ”‚
โ”‚           No                                                                โ”‚
โ”‚           โ”‚                                                                 โ”‚
โ”‚           โ–ผ                                                                 โ”‚
โ”‚   Continue buffering                                                        โ”‚
โ”‚                                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Interval Tradeoffs:

Interval Min Mutations Data at Risk I/O Overhead Use Case
1s 1 ~1s High Critical data, local storage
5s 1 ~5s Moderate Default
30s 100 ~30s Low Cost-sensitive cloud workloads
None - All unflushed None Legacy behavior, WAL-only

WAL Still Provides Crash Recovery

Regardless of flush interval, the Write-Ahead Log ensures no committed data is lost on crash. Time-based flush determines when data reaches L1/cloud storage for query visibility and durability.

Delta Layer

When L0 fills up, it flushes to the Delta layer as Lance datasets.

pub struct DeltaDataset {
    /// Lance dataset for deltas
    dataset: Arc<Dataset>,

    /// Edge type this delta is for
    edge_type: EdgeTypeId,

    /// Direction (outgoing or incoming)
    direction: Direction,
}

pub struct L0Manager {
    /// Current L0 buffer
    buffer: Arc<RwLock<L0Buffer>>,

    /// Storage backend
    store: Arc<dyn ObjectStore>,

    /// Schema manager reference
    schema: Arc<SchemaManager>,

    /// Configuration
    config: L0ManagerConfig,
}

Flush Process:

flowchart TB S1["1. SNAPSHOT L0<br/>Create consistent snapshot<br/>of graph + properties"] S2["2. CONVERT TO COLUMNAR<br/>โ€ข Sort vertices by VID<br/>โ€ข Build Arrow RecordBatches<br/>โ€ข Compute column statistics"] S3["3. WRITE LANCE DATASET<br/>โ€ข Write vertex dataset (per label)<br/>โ€ข Write edge dataset (per type)<br/>โ€ข Write adjacency dataset"] S4["4. UPDATE MANIFEST<br/>Add new L1 run to manifest"] S5["5. CLEAR L0<br/>โ€ข Truncate WAL<br/>โ€ข Reset in-memory buffers"] S1 --> S2 --> S3 --> S4 --> S5

L2: Base Layer

The L2 layer contains fully compacted, indexed data.

pub struct L2Layer {
    /// Main vertex dataset (per label)
    vertex_datasets: HashMap<LabelId, Dataset>,

    /// Main edge dataset (per type)
    edge_datasets: HashMap<EdgeTypeId, Dataset>,

    /// Adjacency datasets (per edge type + direction)
    adjacency_datasets: HashMap<(EdgeTypeId, Direction), Dataset>,

    /// Vector indexes
    vector_indexes: HashMap<IndexId, VectorIndex>,

    /// Scalar indexes
    scalar_indexes: HashMap<IndexId, ScalarIndex>,
}

Compaction Process:

flowchart TB C1["1. SELECT RUNS<br/>Choose L1 runs for compaction<br/>(size/age based)"] C2["2. MERGE-SORT<br/>โ€ข Open all selected runs<br/>โ€ข Merge by VID (newest wins)<br/>โ€ข Apply tombstones"] C3["3. REBUILD INDEXES<br/>โ€ข Vector index: Rebuild HNSW/IVF_PQ<br/>โ€ข Scalar indexes: Rebuild BTree<br/>โ€ข Compute new statistics"] C4["4. WRITE NEW L2<br/>Write as single optimized<br/>Lance dataset"] C5["5. CLEANUP<br/>Delete compacted L1 runs"] C1 --> C2 --> C3 --> C4 --> C5

Lance Integration

Uni uses Lance as its core columnar format.

Why Lance?

Feature Benefit
Native Vector Support Built-in HNSW, IVF_PQ indexes
Versioning Time-travel, ACID transactions
Fast Random Access O(1) row lookup by index
Columnar Scans Efficient analytical queries
Object Store Native S3/GCS support built-in
Zero-Copy Arrow-compatible memory layout

Lance File Format

Lance Dataset Structure:

data/
โ”œโ”€โ”€ _versions/                    # Version metadata
โ”‚   โ”œโ”€โ”€ 1.manifest               # Version 1 manifest
โ”‚   โ”œโ”€โ”€ 2.manifest               # Version 2 manifest
โ”‚   โ””โ”€โ”€ ...
โ”œโ”€โ”€ _indices/                     # Index data
โ”‚   โ”œโ”€โ”€ vector_idx_001/          # Vector index
โ”‚   โ”‚   โ”œโ”€โ”€ index.idx
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ””โ”€โ”€ scalar_idx_002/          # Scalar index
โ””โ”€โ”€ data/                         # Column data
    โ”œโ”€โ”€ part-0.lance              # Data fragment
    โ”œโ”€โ”€ part-1.lance
    โ””โ”€โ”€ ...

Data Fragment Structure

pub struct LanceFragment {
    /// Fragment ID
    id: u64,

    /// Row range in this fragment
    row_range: Range<u64>,

    /// Physical files for each column
    columns: HashMap<String, ColumnFiles>,

    /// Fragment-level statistics
    stats: FragmentStatistics,
}

pub struct FragmentStatistics {
    /// Row count
    num_rows: u64,

    /// Per-column statistics
    column_stats: HashMap<String, ColumnStats>,
}

pub struct ColumnStats {
    null_count: u64,
    min_value: Option<ScalarValue>,
    max_value: Option<ScalarValue>,
    distinct_count: Option<u64>,
}

Dataset Organization

Vertex Datasets

One Lance dataset per vertex label:

storage/
โ”œโ”€โ”€ vertices/
โ”‚   โ”œโ”€โ”€ Paper/                    # :Paper vertices
โ”‚   โ”‚   โ”œโ”€โ”€ _versions/
โ”‚   โ”‚   โ”œโ”€โ”€ _indices/
โ”‚   โ”‚   โ”‚   โ”œโ”€โ”€ embedding_hnsw/   # Vector index
โ”‚   โ”‚   โ”‚   โ””โ”€โ”€ year_btree/       # Scalar index
โ”‚   โ”‚   โ””โ”€โ”€ data/
โ”‚   โ”œโ”€โ”€ Author/                   # :Author vertices
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ””โ”€โ”€ Venue/                    # :Venue vertices
โ”‚       โ””โ”€โ”€ ...

Vertex Schema:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                           VERTEX DATASET SCHEMA                              โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                             โ”‚
โ”‚   System Columns:                                                           โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”โ”‚
โ”‚   โ”‚ Column        โ”‚ Type         โ”‚ Description                            โ”‚โ”‚
โ”‚   โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”คโ”‚
โ”‚   โ”‚ _vid          โ”‚ UInt64       โ”‚ Internal vertex ID (label bits + ID)  โ”‚โ”‚
โ”‚   โ”‚ _uid          โ”‚ Binary(32)   โ”‚ UniId (32-byte SHA3-256) - optional   โ”‚โ”‚
โ”‚   โ”‚ _deleted      โ”‚ Bool         โ”‚ Tombstone marker (soft delete)        โ”‚โ”‚
โ”‚   โ”‚ _version      โ”‚ UInt64       โ”‚ Last modified version                 โ”‚โ”‚
โ”‚   โ”‚ ext_id        โ”‚ String       โ”‚ External ID (extracted from props)    โ”‚โ”‚
โ”‚   โ”‚ _created_at   โ”‚ Timestamp    โ”‚ Creation timestamp                    โ”‚โ”‚
โ”‚   โ”‚ _updated_at   โ”‚ Timestamp    โ”‚ Last update timestamp                 โ”‚โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜โ”‚
โ”‚                                                                             โ”‚
โ”‚   User Properties (schema-defined):                                         โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚   โ”‚ title    โ”‚ String   โ”‚ Paper title                                    โ”‚ โ”‚
โ”‚   โ”‚ year     โ”‚ Int32    โ”‚ Publication year                               โ”‚ โ”‚
โ”‚   โ”‚ abstract โ”‚ String   โ”‚ Paper abstract (nullable)                      โ”‚ โ”‚
โ”‚   โ”‚ embeddingโ”‚ Vector   โ”‚ 768-dimensional embedding                      โ”‚ โ”‚
โ”‚   โ”‚ _doc     โ”‚ Json     โ”‚ Document mode flexible fields (deprecated)     โ”‚ โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚                                                                             โ”‚
โ”‚   Schemaless Properties:                                                    โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚   โ”‚ overflow_json  โ”‚ LargeBinary  โ”‚ JSONB binary for non-schema props   โ”‚ โ”‚
โ”‚   โ”‚                โ”‚ (JSONB)      โ”‚ - Automatically queryable           โ”‚ โ”‚
โ”‚   โ”‚                โ”‚              โ”‚ - Query rewriting to JSON functions โ”‚ โ”‚
โ”‚   โ”‚                โ”‚              โ”‚ - PostgreSQL-compatible format      โ”‚ โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚                                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Edge Datasets

One Lance dataset per edge type:

storage/
โ”œโ”€โ”€ edges/
โ”‚   โ”œโ”€โ”€ CITES/                    # :CITES edges
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ”œโ”€โ”€ AUTHORED_BY/              # :AUTHORED_BY edges
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ””โ”€โ”€ PUBLISHED_IN/
โ”‚       โ””โ”€โ”€ ...

Edge Schema:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                            EDGE DATASET SCHEMA                               โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                             โ”‚
โ”‚   System Columns:                                                           โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”โ”‚
โ”‚   โ”‚ Column        โ”‚ Type         โ”‚ Description                            โ”‚โ”‚
โ”‚   โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”คโ”‚
โ”‚   โ”‚ _eid          โ”‚ UInt64       โ”‚ Internal edge ID (type bits + ID)     โ”‚โ”‚
โ”‚   โ”‚ _src_vid      โ”‚ UInt64       โ”‚ Source vertex VID                      โ”‚โ”‚
โ”‚   โ”‚ _dst_vid      โ”‚ UInt64       โ”‚ Destination vertex VID                 โ”‚โ”‚
โ”‚   โ”‚ _deleted      โ”‚ Bool         โ”‚ Tombstone marker                       โ”‚โ”‚
โ”‚   โ”‚ _version      โ”‚ UInt64       โ”‚ Last modified version                  โ”‚โ”‚
โ”‚   โ”‚ _created_at   โ”‚ Timestamp    โ”‚ Creation timestamp                     โ”‚โ”‚
โ”‚   โ”‚ _updated_at   โ”‚ Timestamp    โ”‚ Last update timestamp                  โ”‚โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜โ”‚
โ”‚                                                                             โ”‚
โ”‚   Edge Properties (schema-defined):                                         โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚   โ”‚ weight   โ”‚ Float64  โ”‚ Edge weight/score                              โ”‚ โ”‚
โ”‚   โ”‚ position โ”‚ Int32    โ”‚ Author position (for AUTHORED_BY)              โ”‚ โ”‚
โ”‚   โ”‚ timestampโ”‚ Timestampโ”‚ When the edge was created                      โ”‚ โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚                                                                             โ”‚
โ”‚   Schemaless Properties:                                                    โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚   โ”‚ overflow_json  โ”‚ LargeBinary  โ”‚ JSONB binary for non-schema props   โ”‚ โ”‚
โ”‚   โ”‚                โ”‚ (JSONB)      โ”‚ - Same as vertex overflow support   โ”‚ โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚                                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Adjacency Datasets

Optimized for graph traversal (CSR-style):

storage/
โ”œโ”€โ”€ adjacency/
โ”‚   โ”œโ”€โ”€ CITES_OUT/                # Outgoing CITES edges
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ”œโ”€โ”€ CITES_IN/                 # Incoming CITES edges (reverse)
โ”‚   โ”‚   โ””โ”€โ”€ ...
โ”‚   โ””โ”€โ”€ ...

Adjacency Schema:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                         ADJACENCY DATASET SCHEMA                             โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                             โ”‚
โ”‚   Chunked CSR Format (one row per chunk of vertices):                       โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”‚
โ”‚   โ”‚ Column      โ”‚ Type         โ”‚ Description                            โ”‚  โ”‚
โ”‚   โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค  โ”‚
โ”‚   โ”‚ chunk_id    โ”‚ UInt64       โ”‚ Chunk identifier                       โ”‚  โ”‚
โ”‚   โ”‚ vid_start   โ”‚ UInt64       โ”‚ First VID in chunk                     โ”‚  โ”‚
โ”‚   โ”‚ vid_end     โ”‚ UInt64       โ”‚ Last VID in chunk (exclusive)          โ”‚  โ”‚
โ”‚   โ”‚ offsets     โ”‚ List<UInt64> โ”‚ CSR offsets (chunk_size + 1 elements)  โ”‚  โ”‚
โ”‚   โ”‚ neighbors   โ”‚ List<UInt64> โ”‚ Neighbor VIDs (flattened)              โ”‚  โ”‚
โ”‚   โ”‚ edge_ids    โ”‚ List<UInt64> โ”‚ Edge IDs (parallel to neighbors)       โ”‚  โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ”‚
โ”‚                                                                             โ”‚
โ”‚   Example (chunk_size=1000):                                                โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”   โ”‚
โ”‚   โ”‚ chunk_id: 0                                                         โ”‚   โ”‚
โ”‚   โ”‚ vid_start: 0, vid_end: 1000                                         โ”‚   โ”‚
โ”‚   โ”‚ offsets: [0, 3, 3, 7, 10, ...]  (1001 elements)                     โ”‚   โ”‚
โ”‚   โ”‚ neighbors: [v5, v12, v99, v4, v6, v8, v42, ...]                     โ”‚   โ”‚
โ”‚   โ”‚ edge_ids: [e1, e2, e3, e4, e5, e6, e7, ...]                         โ”‚   โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜   โ”‚
โ”‚                                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Write-Ahead Log (WAL)

The WAL ensures durability for uncommitted L0 data.

WAL Structure

pub struct WriteAheadLog {
    /// Object store-backed WAL
    store: Arc<dyn ObjectStore>,

    /// WAL prefix/path
    prefix: Path,

    /// In-memory state (buffer + LSN tracking)
    state: Mutex<WalState>,
}

struct WalState {
    buffer: Vec<Mutation>,
    next_lsn: u64,
    flushed_lsn: u64,
}

/// WAL segment with LSN for idempotent recovery
pub struct WalSegment {
    pub lsn: u64,
    pub mutations: Vec<Mutation>,
}

WAL Mutation Types

The WAL records mutations using the following enum:

pub enum Mutation {
    /// Insert a new edge
    InsertEdge {
        src_vid: Vid,
        dst_vid: Vid,
        edge_type: u16,
        eid: Eid,
        version: u64,
        properties: Properties,
    },

    /// Delete an existing edge
    DeleteEdge {
        eid: Eid,
        src_vid: Vid,
        dst_vid: Vid,
        edge_type: u16,
        version: u64,
    },

    /// Insert a new vertex
    InsertVertex {
        vid: Vid,
        properties: Properties,
    },

    /// Delete an existing vertex
    DeleteVertex {
        vid: Vid,
    },
}

Note: The label_id is encoded in the VID itself, so InsertVertex doesn't need a separate label_id field.

Recovery Process

impl Wal {
    pub async fn recover(&self, l0: &mut L0Buffer) -> Result<()> {
        // Find all WAL segments
        let segments = self.list_segments()?;

        for segment in segments {
            let reader = WalReader::open(&segment)?;

            while let Some(entry) = reader.next_entry()? {
                // Verify CRC
                if !entry.verify_crc() {
                    // Truncate at corruption point
                    break;
                }

                // Replay entry
                match entry.entry_type {
                    EntryType::InsertVertex { vid, label_id, props } => {
                        l0.insert_vertex(vid, label_id, props)?;
                    }
                    EntryType::InsertEdge { eid, src, dst, type_id, props } => {
                        l0.insert_edge(eid, src, dst, type_id, props)?;
                    }
                    // ... handle other types
                }
            }
        }

        Ok(())
    }
}

Snapshot Management

Snapshots provide consistent point-in-time views.

Manifest Structure

{
  "version": 42,
  "timestamp": "2024-01-15T10:30:00Z",
  "schema_version": 1,

  "vertex_datasets": {
    "Paper": {
      "lance_version": 15,
      "row_count": 1000000,
      "size_bytes": 524288000
    },
    "Author": {
      "lance_version": 8,
      "row_count": 250000,
      "size_bytes": 62500000
    }
  },

  "edge_datasets": {
    "CITES": {
      "lance_version": 12,
      "row_count": 5000000,
      "size_bytes": 200000000
    }
  },

  "adjacency_datasets": {
    "CITES_OUT": { "lance_version": 12 },
    "CITES_IN": { "lance_version": 12 }
  },

  "l1_runs": [
    { "sequence": 100, "created_at": "2024-01-15T10:25:00Z" },
    { "sequence": 101, "created_at": "2024-01-15T10:28:00Z" }
  ],

  "indexes": {
    "paper_embeddings": {
      "type": "hnsw",
      "version": 5,
      "row_count": 1000000
    },
    "paper_year": {
      "type": "btree",
      "version": 3
    }
  }
}

Snapshot Operations

impl StorageManager {
    /// Create a new snapshot (after flush)
    pub async fn create_snapshot(&self) -> Result<Snapshot> {
        let manifest = Manifest {
            version: self.next_version(),
            timestamp: Utc::now(),
            vertex_datasets: self.collect_vertex_versions(),
            edge_datasets: self.collect_edge_versions(),
            adjacency_datasets: self.collect_adjacency_versions(),
            l1_runs: self.l1_manager.list_runs(),
            indexes: self.collect_index_versions(),
        };

        // Write manifest atomically
        self.write_manifest(&manifest).await?;

        Ok(Snapshot::new(manifest))
    }

    /// Open a specific snapshot for reading
    pub async fn open_snapshot(&self, version: u64) -> Result<SnapshotReader> {
        let manifest = self.read_manifest(version).await?;

        Ok(SnapshotReader {
            manifest,
            vertex_readers: self.open_vertex_readers(&manifest).await?,
            edge_readers: self.open_edge_readers(&manifest).await?,
            adjacency_cache: self.load_adjacency(&manifest).await?,
        })
    }
}

Index Storage

Vector Index Storage

Vector indexes (HNSW, IVF_PQ) are stored within Lance datasets:

pub struct VectorIndexStorage {
    /// Lance dataset with index
    dataset: Dataset,

    /// Index configuration
    config: VectorIndexConfig,
}

impl VectorIndexStorage {
    pub async fn create_index(
        dataset: &mut Dataset,
        column: &str,
        config: VectorIndexConfig,
    ) -> Result<()> {
        match config.index_type {
            IndexType::Hnsw => {
                dataset.create_index()
                    .column(column)
                    .index_type("IVF_HNSW_SQ")
                    .metric_type(config.metric.to_lance())
                    .build()
                    .await?;
            }
            IndexType::IvfPq => {
                dataset.create_index()
                    .column(column)
                    .index_type("IVF_PQ")
                    .metric_type(config.metric.to_lance())
                    .num_partitions(config.num_partitions)
                    .num_sub_vectors(config.num_sub_vectors)
                    .build()
                    .await?;
            }
        }

        Ok(())
    }
}

Scalar Index Storage

Scalar indexes use Lance's built-in index support:

pub struct ScalarIndexStorage {
    /// Index metadata
    metadata: ScalarIndexMetadata,
}

impl ScalarIndexStorage {
    pub async fn create_index(
        dataset: &mut Dataset,
        column: &str,
        index_type: ScalarIndexType,
    ) -> Result<()> {
        dataset.create_index()
            .column(column)
            .index_type(match index_type {
                ScalarIndexType::BTree => "BTREE",
                ScalarIndexType::Hash => "HASH",
                ScalarIndexType::Bitmap => "BITMAP",
            })
            .build()
            .await?;

        Ok(())
    }
}

Schemaless Properties and Query Rewriting

Overview

Uni supports schemaless properties - properties not defined in the schema that can be stored and queried dynamically. This enables flexible data modeling without requiring schema migrations.

Storage Mechanism

Properties are stored differently based on whether they're in the schema:

Property Type Storage Location Format Performance
Schema-defined Typed Arrow columns Native type (Int32, String, etc.) โšก Fast (columnar)
Non-schema overflow_json column JSONB binary ๐ŸŸก Good (parsed)

JSONB Binary Format

The overflow_json column uses JSONB binary format (PostgreSQL-compatible):

// Schema definition
overflow_json: LargeBinary  // Not Utf8!

// Encoding (on write)
let overflow_props = build_overflow_json_column(vertices, schema)?;
// Returns JSONB binary, not JSON string

// Decoding (on read)
let raw_jsonb = jsonb::RawJsonb::new(bytes);
let json_string = raw_jsonb.to_string();
let props: HashMap<String, Value> = serde_json::from_str(&json_string)?;

Benefits: - ๐Ÿš€ Faster than JSON strings (binary representation) - ๐Ÿ”— PostgreSQL-compatible (same JSONB format) - โš™๏ธ Enables Lance's optimized built-in JSON UDFs

Automatic Query Rewriting

When queries access overflow properties, the planner automatically rewrites them to use Lance's JSON functions:

Original Cypher:

MATCH (p:Person) WHERE p.city = 'NYC' RETURN p.name, p.age

If city and age are not in schema, rewritten to:

WHERE json_get_string(p.overflow_json, 'city') = 'NYC'
RETURN p.name, json_get_string(p.overflow_json, 'age')

Supported JSON Functions: - json_get_string(overflow_json, key) - Extract string value - json_get_int(overflow_json, key) - Extract integer value - json_get_float(overflow_json, key) - Extract float value - json_get_bool(overflow_json, key) - Extract boolean value

Rewriting Algorithm

flowchart TB Q["Query: WHERE p.city = 'NYC'"] C{"Is 'city' in<br/>Person schema?"} S["Use typed column:<br/>WHERE p.city = 'NYC'"] R["Rewrite to JSONB:<br/>WHERE json_get_string(<br/>p.overflow_json, 'city') = 'NYC'"] E["Execute"] Q --> C C -->|Yes| S C -->|No| R S --> E R --> E

Mixed Schema + Overflow Queries

Queries seamlessly mix both types:

// Schema: Person has 'name' property
// Overflow: 'city' and 'verified' not in schema

MATCH (p:Person)
WHERE p.name = 'Alice'        -- Typed column (fast)
  AND p.city = 'NYC'          -- overflow_json (rewritten)
  AND p.verified = true       -- overflow_json (rewritten)
RETURN p.name, p.city, p.age  -- Mixed access

Planner automatically: 1. Identifies schema vs overflow properties 2. Uses typed columns for schema properties 3. Rewrites overflow property access to JSON functions 4. Ensures overflow_json column is materialized in scan

Performance Characteristics

Metric Schema Properties Overflow Properties
Filtering โšก Fast (columnar predicate pushdown) ๐ŸŸก Good (JSONB parsing)
Sorting โšก Fast (native type sorting) ๐ŸŸก Slower (extract then sort)
Compression โšก Type-specific (5-20x) ๐Ÿ”ด Limited (binary blob)
Indexing โœ… Supported (BTree, Vector, etc.) โŒ Not indexed
Schema Evolution โš ๏ธ Requires migration โœ… No migration needed

Usage Guidelines

Use Schema Properties for: - โœ… Frequently queried fields (filters, sorts, aggregations) - โœ… Core data model properties - โœ… Properties requiring indexes - โœ… Performance-critical fields

Use Overflow Properties for: - โœ… Optional/rare properties - โœ… User-defined metadata - โœ… Rapidly evolving schemas - โœ… Prototyping and experimentation

Example: Schemaless Label

// Create label without property definitions
db.schema().label("Document").apply().await?;

// Create with arbitrary properties
db.execute("CREATE (:Document {
    title: 'Article',
    author: 'Alice',
    tags: ['tech', 'ai'],
    year: 2024
})").await?;

// All properties stored in overflow_json
db.flush().await?;

// Query works transparently (automatic rewriting)
let results = db.query("
    MATCH (d:Document)
    WHERE d.author = 'Alice' AND d.year > 2020
    RETURN d.title, d.tags
").await?;

Example: Mixed Schema + Overflow

// Define core properties in schema
db.schema()
    .label("Person")
    .property("name", DataType::String)  // Schema property
    .property("age", DataType::Int)      // Schema property
    .apply().await?;

// Create with schema + overflow properties
db.execute("CREATE (:Person {
    name: 'Bob',       -- Schema (typed column)
    age: 25,           -- Schema (typed column)
    city: 'NYC',       -- Overflow (overflow_json)
    verified: true     -- Overflow (overflow_json)
})").await?;

db.flush().await?;

// Query mixing both (transparent to user)
let results = db.query("
    MATCH (p:Person)
    WHERE p.name = 'Bob'      -- Fast: typed column
      AND p.city = 'NYC'      -- Rewritten: json_get_string(...)
    RETURN p.age, p.verified  -- Mixed: typed + overflow
").await?;

Implementation Details

Write Path: 1. Properties split into schema vs non-schema 2. Schema properties โ†’ Typed Arrow columns 3. Non-schema properties โ†’ Serialized to JSONB binary 4. overflow_json column written to Lance dataset

Read Path: 1. Query planner identifies overflow properties 2. Rewrites expressions to json_get_* functions 3. Ensures overflow_json in scan projection 4. PropertyManager decodes JSONB binary 5. Results merged with typed properties

Flush Behavior: - L0 buffer stores all properties equally (no distinction) - During flush, properties partitioned by schema - build_overflow_json_column() serializes non-schema props - Both typed columns and overflow_json written to Lance

Compaction: - Overflow properties preserved through L1 โ†’ L2 compaction - Main vertices table includes all properties in props_json - Per-label tables maintain separation (typed + overflow)


Object Store Integration

Uni uses the object_store crate for storage abstraction, supporting local filesystems and major cloud providers.

Supported Backends

Backend URI Scheme Status
Local filesystem file:// or path Stable
Amazon S3 s3://bucket/path Stable
Google Cloud Storage gs://bucket/path Stable
Azure Blob Storage az://container/path Stable
Memory (in-memory) Stable (testing)

Using Local Storage

// Standard local storage
let db = Uni::open("./my-database").build().await?;

// Explicit file:// URI
let db = Uni::open("file:///var/data/uni").build().await?;

// In-memory for testing
let db = Uni::in_memory().build().await?;

Cloud Storage

Open databases directly from cloud object stores:

// Amazon S3
let db = Uni::open("s3://my-bucket/graph-data").build().await?;

// Google Cloud Storage
let db = Uni::open("gs://my-bucket/graph-data").build().await?;

// Azure Blob Storage
let db = Uni::open("az://my-container/graph-data").build().await?;

Credential Resolution:

Cloud credentials are resolved automatically using standard environment variables and configuration files:

Provider Environment Variables Config Files
AWS S3 AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ENDPOINT_URL ~/.aws/credentials
GCS GOOGLE_APPLICATION_CREDENTIALS Application Default Credentials
Azure AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_ACCESS_KEY, AZURE_STORAGE_SAS_TOKEN Azure CLI credentials

Hybrid Mode (Local + Cloud)

For optimal performance with cloud storage, use hybrid mode to maintain a local write cache:

use uni_common::CloudStorageConfig;

let cloud = CloudStorageConfig::s3_from_env("my-bucket");

let db = Uni::open("./local-cache")
    .hybrid("./local-cache", "s3://my-bucket/graph-data")
    .cloud_config(cloud)
    .build()
    .await?;

Hybrid Mode Operation:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                            HYBRID MODE                                       โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                             โ”‚
โ”‚   WRITES                           READS                                    โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                      โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”‚
โ”‚   โ”‚ Client  โ”‚                      โ”‚ Query                           โ”‚     โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”˜                      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ”‚
โ”‚        โ”‚                                          โ”‚                         โ”‚
โ”‚        โ–ผ                                          โ–ผ                         โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”‚
โ”‚   โ”‚ Local L0    โ”‚โ—„โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–บโ”‚ Merge View                      โ”‚     โ”‚
โ”‚   โ”‚ (WAL+Buffer)โ”‚                  โ”‚ (Local L0 โˆช Cloud Storage)      โ”‚     โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”˜                  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ”‚
โ”‚          โ”‚                                        โ–ฒ                         โ”‚
โ”‚          โ”‚ flush                                  โ”‚                         โ”‚
โ”‚          โ–ผ                                        โ”‚                         โ”‚
โ”‚   โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”                                 โ”‚                         โ”‚
โ”‚   โ”‚ Cloud       โ”‚โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜                         โ”‚
โ”‚   โ”‚ (S3/GCS)    โ”‚                                                           โ”‚
โ”‚   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜                                                           โ”‚
โ”‚                                                                             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Benefits: - Low-latency writes: All mutations go to local WAL + L0 buffer - Durable storage: Data flushed to cloud storage on interval/threshold - Cost efficiency: Minimize cloud API calls through batching - Crash recovery: WAL provides local durability before cloud sync


Performance Characteristics

Write Performance

Operation Latency Throughput Notes
L0 insert (vertex) ~50ยตs ~20K/sec Memory only
L0 insert (edge) ~30ยตs ~33K/sec Memory only
L0 batch insert (1K) ~550ยตs ~1.8M/sec Amortized
L0 โ†’ L1 flush ~6ms/1K - Lance write
L1 โ†’ L2 compact ~1s/100K - Background

Read Performance

Operation Latency Notes
Point lookup (indexed) ~2.9ms BTree index
Range scan (indexed) ~5ms + 0.1ms/row B-tree index
Full scan ~50ms/100K rows Columnar
Vector KNN (k=10) ~1.8ms HNSW index

Storage Efficiency

Data Type Compression Ratio
Integers Dictionary + RLE 5-20x
Strings Dictionary + LZ4 3-10x
Vectors No compression 1x
Booleans Bitmap 8x

Configuration Reference

Storage behavior is configured via UniConfig (see the Configuration reference). Key storage-related fields include:

  • auto_flush_threshold, auto_flush_interval, auto_flush_min_mutations
  • wal_enabled
  • compaction.* (e.g., max_l1_runs, max_l1_size_bytes)
  • object_store.*
  • index_rebuild.*

Example:

use std::time::Duration;
use uni_db::UniConfig;

let mut config = UniConfig::default();
config.auto_flush_threshold = 10_000;
config.auto_flush_interval = Some(Duration::from_secs(5));
config.compaction.max_l1_runs = 4;
config.compaction.max_l1_size_bytes = 256 * 1024 * 1024;

Next Steps