Architecture
```mermaid
flowchart LR
    A[App code] --> B[ModelRuntime]
    B --> C[Alias lookup]
    C --> D[ModelAliasSpec]
    D --> E[ModelRuntimeKey]
    E --> F{Registry hit?}
    F -->|Yes| G[Reuse loaded handle]
    F -->|No| H[Per-key load mutex]
    H --> I["Provider.load(spec)"]
    I --> J[Model warmup]
    J --> K[Cache handle in registry]
    K --> L[Typed resolver]
    G --> L
    L --> M[Instrumented wrapper timeout/retry/metrics]
    M --> N[embed/rerank/generate call]
```
Build-time flow
- Register providers in builder.
- Ingest catalog and validate each `ModelAliasSpec`.
- Enforce provider existence and options validation.
- Apply provider warmup policy.
- Apply per-alias model warmup policy.
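
The registration and validation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual builder: `RuntimeBuilder`, `BuildError`, the fields on `ModelAliasSpec`, and the idea of validating options against a list of allowed names are all assumptions made for the example. The two warmup steps are left out.

```rust
use std::collections::{BTreeMap, HashMap};

/// Simplified stand-in for `ModelAliasSpec` (the fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct ModelAliasSpec {
    pub alias: String,
    pub provider: String,
    pub options: BTreeMap<String, String>,
}

/// Hypothetical validation errors surfaced at build time.
#[derive(Debug, PartialEq)]
pub enum BuildError {
    UnknownProvider { alias: String, provider: String },
    UnknownOption { alias: String, option: String },
}

/// Hypothetical builder: each registered provider id maps to the option names it accepts.
#[derive(Default)]
pub struct RuntimeBuilder {
    providers: HashMap<String, Vec<String>>,
    catalog: Vec<ModelAliasSpec>,
}

impl RuntimeBuilder {
    /// Step 1: register a provider together with its accepted option names.
    pub fn register_provider(mut self, id: &str, allowed_options: &[&str]) -> Self {
        let allowed = allowed_options.iter().map(|s| s.to_string()).collect();
        self.providers.insert(id.to_string(), allowed);
        self
    }

    /// Step 2: ingest the alias catalog.
    pub fn catalog(mut self, specs: Vec<ModelAliasSpec>) -> Self {
        self.catalog = specs;
        self
    }

    /// Step 3: reject specs that name an unregistered provider or an unknown option.
    pub fn build(self) -> Result<HashMap<String, ModelAliasSpec>, BuildError> {
        let RuntimeBuilder { providers, catalog } = self;
        let mut aliases = HashMap::new();
        for spec in catalog {
            let allowed = providers.get(&spec.provider).ok_or_else(|| {
                BuildError::UnknownProvider {
                    alias: spec.alias.clone(),
                    provider: spec.provider.clone(),
                }
            })?;
            if let Some(bad) = spec.options.keys().find(|k| !allowed.contains(*k)) {
                return Err(BuildError::UnknownOption {
                    alias: spec.alias.clone(),
                    option: bad.clone(),
                });
            }
            aliases.insert(spec.alias.clone(), spec);
        }
        Ok(aliases)
    }
}
```

Failing at build time means a misconfigured alias is reported before any request tries to resolve it.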
Runtime flow
- Resolve alias via catalog.
- Compute runtime key (includes normalized options hash).
- Return cached instance if already loaded.
- Otherwise load under key-level mutex and cache.
- Downcast to task trait and wrap in instrumented model.
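
A minimal sketch of the key computation and the load-once path, assuming a simplified key and an infallible loader. The fields of `ModelRuntimeKey`, the `Registry` type, and `get_or_load` are illustrative names rather than the project's real API. Normalization here just means storing options in a `BTreeMap` so they hash in sorted order; the real normalization may do more.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};

/// Simplified stand-in for `ModelRuntimeKey` (the fields are assumptions).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ModelRuntimeKey {
    pub provider: String,
    pub model: String,
    pub options_hash: u64,
}

impl ModelRuntimeKey {
    /// A `BTreeMap` iterates in sorted key order, so the hash does not
    /// depend on the order in which the options were written.
    pub fn new(provider: &str, model: &str, options: &BTreeMap<String, String>) -> Self {
        let mut hasher = DefaultHasher::new();
        options.hash(&mut hasher);
        Self {
            provider: provider.to_string(),
            model: model.to_string(),
            options_hash: hasher.finish(),
        }
    }
}

/// One slot per key; the slot's own mutex serializes loads for that key only.
type Slot<H> = Arc<Mutex<Option<Arc<H>>>>;

/// Hypothetical registry of loaded handles.
pub struct Registry<H> {
    slots: Mutex<HashMap<ModelRuntimeKey, Slot<H>>>,
}

impl<H> Registry<H> {
    pub fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()) }
    }

    pub fn get_or_load(&self, key: &ModelRuntimeKey, load: impl FnOnce() -> H) -> Arc<H> {
        // Hold the map lock only long enough to find or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            Arc::clone(slots.entry(key.clone()).or_default())
        };
        // Callers for the same key wait here; callers for other keys do not.
        let mut guard = slot.lock().unwrap();
        if let Some(handle) = guard.as_ref() {
            return Arc::clone(handle); // registry hit: reuse the loaded handle
        }
        let handle = Arc::new(load());
        *guard = Some(Arc::clone(&handle));
        handle
    }
}
```

Releasing the map-wide lock before loading keeps a slow load for one model from blocking lookups of models that are already cached.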
Design notes for contributors
- `options_validation` should be updated whenever new provider options are added.
- Provider option schemas under `schemas/provider-options/` should mirror runtime validation.
- Remote providers should use shared helpers in `provider/remote_common.rs` for consistent auth resolution, HTTP status mapping, and circuit breaker behavior.
- New providers must declare accurate `capabilities()` so runtime task resolution remains correct.
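
To illustrate why accurate `capabilities()` matter, here is a rough sketch of a capability check that could run before a handle is downcast to a task trait. Only the method name `capabilities()` comes from the notes above; the `Task` enum, the trait shape, `UnsupportedTask`, and `ensure_supported` are assumptions made for the example.

```rust
/// Hypothetical task kinds, mirroring the embed/rerank/generate calls.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    Embed,
    Rerank,
    Generate,
}

/// Assumed shape of the provider trait; the real one likely has more methods.
pub trait Provider {
    fn id(&self) -> &str;
    /// Tasks this provider can actually serve.
    fn capabilities(&self) -> &[Task];
}

/// Hypothetical error returned when a task is requested from the wrong provider.
#[derive(Debug, PartialEq)]
pub struct UnsupportedTask {
    pub provider: String,
    pub task: Task,
}

/// Refuse a task the provider did not declare, before any downcast is attempted.
pub fn ensure_supported(provider: &dyn Provider, task: Task) -> Result<(), UnsupportedTask> {
    if provider.capabilities().contains(&task) {
        Ok(())
    } else {
        Err(UnsupportedTask { provider: provider.id().to_string(), task })
    }
}

/// Example provider that only produces embeddings.
pub struct EmbedOnly;

impl Provider for EmbedOnly {
    fn id(&self) -> &str {
        "embed-only"
    }

    fn capabilities(&self) -> &[Task] {
        &[Task::Embed]
    }
}
```

Under this sketch, a provider that over-declares would pass the check and fail later at the downcast, while one that under-declares would be refused for a task it could have served.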