Unified Rust runtime for local and remote embedding, reranking, and generation models.
Uni-Xervo provides a single, provider-agnostic API for loading and running ML models across a wide range of backends — from local inference engines (Candle, FastEmbed, mistral.rs) to remote API services (OpenAI, Gemini, Anthropic, Cohere, Mistral, Voyage AI, Vertex AI, Azure OpenAI).
§Key concepts
- ModelRuntime: the central runtime that owns providers and manages a catalog of model aliases.
- ModelAliasSpec: a declarative specification that maps a human-readable alias (e.g. "embed/default") to a concrete provider + model pair.
- Providers: pluggable backends that implement ModelProvider. Each provider advertises the tasks it supports and knows how to load models.
- Traits: EmbeddingModel, RerankerModel, and GeneratorModel are the task-specific interfaces returned by the runtime.
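The relationship between aliases, providers, and tasks can be illustrated with a small standalone sketch. Everything in it (Task, AliasSpec, Provider, Runtime, LocalEmbedder) is a hypothetical, simplified stand-in written for illustration only. It is not the crate's actual API or implementation, and it assumes only that an alias is validated against a registered provider that supports the requested task.

```rust
use std::collections::HashMap;

/// Hypothetical task kinds, loosely mirroring the three task-specific traits.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Task {
    Embed,
    Rerank,
    Generate,
}

/// Hypothetical, simplified alias spec: alias -> (provider, model).
#[derive(Debug, Clone)]
struct AliasSpec {
    alias: String,
    task: Task,
    provider_id: String,
    model_id: String,
}

/// Hypothetical provider interface: an id plus the tasks it advertises.
trait Provider {
    fn id(&self) -> &str;
    fn supports(&self, task: Task) -> bool;
}

/// Illustrative provider that only supports embedding.
struct LocalEmbedder;

impl Provider for LocalEmbedder {
    fn id(&self) -> &str {
        "local/candle"
    }
    fn supports(&self, task: Task) -> bool {
        task == Task::Embed
    }
}

/// Hypothetical runtime: owns providers and a catalog of aliases.
struct Runtime {
    providers: HashMap<String, Box<dyn Provider>>,
    catalog: HashMap<String, AliasSpec>,
}

impl Runtime {
    fn new() -> Self {
        Runtime { providers: HashMap::new(), catalog: HashMap::new() }
    }

    /// Registers a provider under its own id.
    fn register_provider(&mut self, provider: Box<dyn Provider>) {
        self.providers.insert(provider.id().to_string(), provider);
    }

    /// Adds an alias only if its provider exists and supports the task.
    fn add_alias(&mut self, spec: AliasSpec) -> Result<(), String> {
        let provider = self
            .providers
            .get(&spec.provider_id)
            .ok_or_else(|| format!("unknown provider: {}", spec.provider_id))?;
        if !provider.supports(spec.task) {
            return Err(format!(
                "provider {} does not support {:?}",
                spec.provider_id, spec.task
            ));
        }
        self.catalog.insert(spec.alias.clone(), spec);
        Ok(())
    }

    /// Resolves an alias to (provider_id, model_id) for the given task.
    fn resolve(&self, alias: &str, task: Task) -> Option<(&str, &str)> {
        let spec = self.catalog.get(alias)?;
        if spec.task != task {
            return None;
        }
        Some((spec.provider_id.as_str(), spec.model_id.as_str()))
    }
}
```

In this sketch, callers refer only to the alias string, so the provider or model behind an alias can change without touching call sites.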
§Quick start
use uni_xervo::api::{ModelAliasSpec, ModelTask};
use uni_xervo::runtime::ModelRuntime;
use uni_xervo::provider::candle::LocalCandleProvider;

// Note: the `.await?` calls below must run inside an async context
// whose return type is a `Result`.

// Map the alias "embed/local" to a concrete provider + model pair.
let spec = ModelAliasSpec {
    alias: "embed/local".into(),
    task: ModelTask::Embed,
    provider_id: "local/candle".into(),
    model_id: "sentence-transformers/all-MiniLM-L6-v2".into(),
    revision: None,
    warmup: Default::default(),
    required: true,
    timeout: None,
    load_timeout: None,
    retry: None,
    options: serde_json::Value::Null,
};

// Build a runtime with the Candle provider and a one-entry catalog.
let runtime = ModelRuntime::builder()
    .register_provider(LocalCandleProvider::new())
    .catalog(vec![spec])
    .build()
    .await?;

// Look up the embedding model by alias and embed a single string.
let model = runtime.embedding("embed/local").await?;
let embeddings = model.embed(vec!["Hello, world!"]).await?;

Modules§
- api
- Public API types for configuring models, catalogs, and runtime behavior.
- cache
- Model and weight cache directory resolution.
- error
- Error types for the Uni-Xervo runtime.
- provider
- Provider implementations for local and remote model backends.
- reliability
- Reliability primitives: circuit breaker, instrumented model wrappers with timeout and retry support, and metrics emission.
- runtime
- The core runtime that manages providers, catalogs, and loaded model instances.
- traits
- Core traits that every provider and model implementation must satisfy.
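The reliability module listed above mentions a circuit breaker. As a rough illustration of the general pattern, here is a minimal standalone sketch. The CircuitBreaker type, its fields, and its methods are hypothetical names chosen for this example. The crate's real implementation, thresholds, and API may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical minimal circuit breaker: opens after N consecutive
/// failures and allows a probe call once a cooldown has elapsed.
struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        CircuitBreaker {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// Returns true if a call may proceed at time `now`.
    /// While open, calls are rejected until the cooldown has passed.
    fn allow(&self, now: Instant) -> bool {
        match self.opened_at {
            None => true,
            Some(opened) => now.duration_since(opened) >= self.cooldown,
        }
    }

    /// A success closes the breaker and resets the failure count.
    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure increments the count and (re)opens the breaker once
    /// the threshold is reached.
    fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

Passing `now` explicitly, rather than calling `Instant::now()` inside the methods, keeps the sketch deterministic and easy to test.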