Crate uni_xervo

Unified Rust runtime for local and remote embedding, reranking, and generation models.

Uni-Xervo provides a single, provider-agnostic API for loading and running ML models across a wide range of backends — from local inference engines (Candle, FastEmbed, mistral.rs) to remote API services (OpenAI, Gemini, Anthropic, Cohere, Mistral, Voyage AI, Vertex AI, Azure OpenAI).

§Key concepts

ModelRuntime — the central runtime that owns providers and manages a catalog of model aliases.
ModelAliasSpec — a declarative specification that maps a human-readable alias (e.g. "embed/default") to a concrete provider + model pair.
Providers — pluggable backends that implement ModelProvider. Each provider advertises the tasks it supports and knows how to load models.
Traits — EmbeddingModel, RerankerModel, and GeneratorModel are the task-specific interfaces returned by the runtime.
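The relationship between aliases, providers, and tasks can be illustrated with a small standalone sketch. Everything in it (`Task`, `AliasSpec`, `Provider`, `EmbedOnlyProvider`, `Registry`, and the `"local/demo"` provider id) is a hypothetical stand-in written for illustration using only the standard library; it is not the crate's API and makes no claim about how `ModelRuntime` is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical task kind; a stand-in, not the crate's `ModelTask`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    Embed,
    Rerank,
    Generate,
}

/// Hypothetical alias specification: alias -> (provider, model).
#[derive(Debug, Clone)]
pub struct AliasSpec {
    pub alias: String,
    pub task: Task,
    pub provider_id: String,
    pub model_id: String,
}

/// Hypothetical provider interface: each provider advertises its tasks.
pub trait Provider {
    fn id(&self) -> &str;
    fn supports(&self, task: Task) -> bool;
}

/// Toy provider that only supports embedding.
pub struct EmbedOnlyProvider;

impl Provider for EmbedOnlyProvider {
    fn id(&self) -> &str {
        "local/demo"
    }
    fn supports(&self, task: Task) -> bool {
        task == Task::Embed
    }
}

/// Toy registry that owns providers and a catalog of aliases.
#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn Provider>>,
    catalog: HashMap<String, AliasSpec>,
}

impl Registry {
    pub fn register_provider(&mut self, provider: Box<dyn Provider>) {
        self.providers.insert(provider.id().to_string(), provider);
    }

    /// Accepts an alias only if its provider is registered and
    /// advertises support for the requested task.
    pub fn add_alias(&mut self, spec: AliasSpec) -> Result<(), String> {
        let provider = self
            .providers
            .get(&spec.provider_id)
            .ok_or_else(|| format!("unknown provider: {}", spec.provider_id))?;
        if !provider.supports(spec.task) {
            return Err(format!(
                "provider {} does not support {:?}",
                spec.provider_id, spec.task
            ));
        }
        self.catalog.insert(spec.alias.clone(), spec);
        Ok(())
    }

    /// Resolves an alias to its (provider id, model id) pair.
    pub fn resolve(&self, alias: &str) -> Option<(&str, &str)> {
        self.catalog
            .get(alias)
            .map(|s| (s.provider_id.as_str(), s.model_id.as_str()))
    }
}
```

In this sketch, callers refer only to the alias string; swapping the backend behind "embed/default" means changing the catalog entry, not the call sites.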

§Quick start

// Note: the `.await?` calls below must run inside an async function
// whose return type can carry the crate's errors.
use uni_xervo::api::{ModelAliasSpec, ModelTask};
use uni_xervo::runtime::ModelRuntime;
use uni_xervo::provider::candle::LocalCandleProvider;

// Map the alias "embed/local" to a concrete provider + model pair.
let spec = ModelAliasSpec {
    alias: "embed/local".into(),
    task: ModelTask::Embed,
    provider_id: "local/candle".into(),
    model_id: "sentence-transformers/all-MiniLM-L6-v2".into(),
    revision: None,
    warmup: Default::default(),
    required: true,
    timeout: None,
    load_timeout: None,
    retry: None,
    options: serde_json::Value::Null,
};

// Build a runtime that owns the provider and the alias catalog.
let runtime = ModelRuntime::builder()
    .register_provider(LocalCandleProvider::new())
    .catalog(vec![spec])
    .build()
    .await?;

// Look up the embedding model by alias and embed a batch of texts.
let model = runtime.embedding("embed/local").await?;
let embeddings = model.embed(vec!["Hello, world!"]).await?;

§Modules

api: Public API types for configuring models, catalogs, and runtime behavior.
cache: Model and weight cache directory resolution.
error: Error types for the Uni-Xervo runtime.
provider: Provider implementations for local and remote model backends.
reliability: Reliability primitives: circuit breaker, instrumented model wrappers with timeout and retry support, and metrics emission.
runtime: The core runtime that manages providers, catalogs, and loaded model instances.
traits: Core traits that every provider and model implementation must satisfy.
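The circuit breaker mentioned under the reliability module is a general pattern, and a minimal sketch may help readers unfamiliar with it. The version below is an assumption-laden illustration: the names `CircuitBreaker` and `State`, the consecutive-failure threshold, and the fixed cooldown are all hypothetical and do not describe the actual types or policy in `uni_xervo::reliability`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical breaker states; not the crate's own types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    /// Calls pass through normally.
    Closed,
    /// Calls are rejected until the cooldown elapses.
    Open,
    /// Cooldown elapsed; a trial call is allowed.
    HalfOpen,
}

/// Toy breaker that opens after N consecutive failures.
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// Derives the state from the time the breaker last opened.
    pub fn state(&self, now: Instant) -> State {
        match self.opened_at {
            None => State::Closed,
            Some(t) if now.duration_since(t) >= self.cooldown => State::HalfOpen,
            Some(_) => State::Open,
        }
    }

    /// Returns true when a call may be attempted.
    pub fn allow(&self, now: Instant) -> bool {
        self.state(now) != State::Open
    }

    /// A success resets the failure count and closes the breaker.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure at or past the threshold (re)opens the breaker.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

Passing `now` explicitly instead of calling `Instant::now()` inside the breaker keeps the sketch deterministic to test; a wrapper around a model call would check `allow` before the call and record the outcome afterwards.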
