pub struct ModelRuntime { /* private fields */ }
The central runtime that owns registered providers and a catalog of model aliases.
Obtain an instance via ModelRuntime::builder() and the ModelRuntimeBuilder. Once built, use embedding, reranker, or generator to obtain typed, instrumented model handles.
Models are loaded lazily on first access (unless configured for eager or background warmup) and cached in an internal registry, so subsequent requests for the same model are served from the cache without reloading.
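The load-once-then-cache behavior described above can be illustrated with a small standard-library sketch. It is not this crate's implementation: ToyModel, LazyRegistry, and load_count are hypothetical names invented for the example, the sketch is synchronous where the real methods are async, and the "load" is simulated.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a loaded model (not a type from this crate).
#[derive(Debug)]
pub struct ToyModel {
    pub alias: String,
}

/// Hypothetical registry: loads each alias at most once, then serves
/// the cached handle on every later request.
#[derive(Default)]
pub struct LazyRegistry {
    cache: Mutex<HashMap<String, Arc<ToyModel>>>,
    loads: AtomicUsize, // counts simulated loads, for demonstration only
}

impl LazyRegistry {
    /// Return the cached handle for `alias`, loading it on first access.
    pub fn get(&self, alias: &str) -> Arc<ToyModel> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(model) = cache.get(alias) {
            return Arc::clone(model); // cache hit: no load happens
        }
        // Cache miss: an expensive load would happen here in a real system.
        self.loads.fetch_add(1, Ordering::SeqCst);
        let model = Arc::new(ToyModel { alias: alias.to_string() });
        cache.insert(alias.to_string(), Arc::clone(&model));
        model
    }

    /// Number of simulated loads performed so far.
    pub fn load_count(&self) -> usize {
        self.loads.load(Ordering::SeqCst)
    }
}
```

Calling get twice with the same alias returns two Arc handles to the same allocation and increments the load counter only once. Holding the lock across the load keeps the sketch short; a real registry would likely avoid blocking unrelated aliases while one model loads.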
Implementations§
impl ModelRuntime
pub fn builder() -> ModelRuntimeBuilder
Create a new ModelRuntimeBuilder for configuring and constructing a runtime.
pub async fn register(&self, spec: ModelAliasSpec) -> Result<()>
Register a new model alias at runtime.
pub async fn contains_alias(&self, alias: &str) -> bool
Check if an alias exists in the catalog.
pub async fn prefetch_all(&self) -> Result<()>
Pre-load and cache every model in the catalog.
Models already loaded are skipped. Fails fast on the first error. Call this during application startup to avoid cold-start latency on first inference.
pub async fn prefetch(&self, aliases: &[&str]) -> Result<()>
Pre-load and cache specific aliases.
Returns an error immediately if an alias is not found in the catalog or if any model fails to load. Models already loaded are skipped.
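The documented prefetch semantics (unknown alias is an error, a failed load is an error, already-loaded models are skipped) can be sketched with the standard library alone. Everything here is hypothetical and for illustration: ToyCatalog, PrefetchError, and the boolean "load succeeds" flag are not part of this crate. The sketch is synchronous, and it checks aliases one at a time; whether the real method validates all aliases before loading any is not stated in the documentation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type for the sketch (not this crate's error type).
#[derive(Debug, PartialEq)]
pub enum PrefetchError {
    UnknownAlias(String),
    LoadFailed(String),
}

/// Hypothetical catalog: `loadable` maps an alias to whether its
/// simulated load succeeds; `loaded` records aliases already cached.
#[derive(Default)]
pub struct ToyCatalog {
    loadable: HashMap<String, bool>,
    loaded: HashSet<String>,
}

impl ToyCatalog {
    /// Add an alias to the catalog, stating whether its load will succeed.
    pub fn register(&mut self, alias: &str, load_succeeds: bool) {
        self.loadable.insert(alias.to_string(), load_succeeds);
    }

    /// Load the given aliases in order, stopping at the first error.
    /// Returns how many models this call newly loaded.
    pub fn prefetch(&mut self, aliases: &[&str]) -> Result<usize, PrefetchError> {
        let mut newly_loaded = 0;
        for &alias in aliases {
            // An alias missing from the catalog fails immediately.
            let load_succeeds = *self
                .loadable
                .get(alias)
                .ok_or_else(|| PrefetchError::UnknownAlias(alias.to_string()))?;
            // Models that are already loaded are skipped.
            if self.loaded.contains(alias) {
                continue;
            }
            // A failed load aborts the remaining aliases (fail fast).
            if !load_succeeds {
                return Err(PrefetchError::LoadFailed(alias.to_string()));
            }
            self.loaded.insert(alias.to_string());
            newly_loaded += 1;
        }
        Ok(newly_loaded)
    }
}
```

In this sketch, a second prefetch of the same alias returns Ok(0) because nothing new is loaded, and an error on a later alias leaves earlier aliases from the same call loaded.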
pub async fn embedding(&self, alias: &str) -> Result<Arc<dyn EmbeddingModel>>
Resolve, load (if necessary), and return an instrumented EmbeddingModel
handle for the given alias.
pub async fn reranker(&self, alias: &str) -> Result<Arc<dyn RerankerModel>>
Resolve, load (if necessary), and return an instrumented RerankerModel
handle for the given alias.
pub async fn generator(&self, alias: &str) -> Result<Arc<dyn GeneratorModel>>
Resolve, load (if necessary), and return an instrumented GeneratorModel
handle for the given alias.
Auto Trait Implementations§
impl !Freeze for ModelRuntime
impl !RefUnwindSafe for ModelRuntime
impl Send for ModelRuntime
impl Sync for ModelRuntime
impl Unpin for ModelRuntime
impl !UnwindSafe for ModelRuntime
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.