Model Catalog¶
A catalog is a JSON array of ModelAliasSpec objects.
Each entry maps a stable alias (for example embed/default) to a provider model and reliability/warmup configuration.
Field reference¶
| Field | Type | Required | Default | Meaning |
|---|---|---|---|---|
alias |
string |
Yes | - | Must be non-empty and contain / in task/name style. |
task |
embed \| rerank \| generate |
Yes | - | Declares intended capability for this alias. |
provider_id |
string |
Yes | - | Provider ID (for example local/candle, remote/openai). |
model_id |
string |
Yes | - | Provider-specific model identifier. |
revision |
string \| null |
No | null |
Optional model revision/version selector. |
warmup |
eager \| lazy \| background |
No | lazy |
Alias-specific load strategy. |
required |
bool |
No | false |
If true, eager warmup failures fail runtime startup. |
timeout |
u64 seconds |
No | unset | Per-inference timeout for wrapper calls. |
load_timeout |
u64 seconds |
No | 600 |
Max provider load + model warmup duration. |
retry |
object | No | unset | Retry config with attempts and backoff. |
options |
object \| null |
No | null |
Strict provider-specific options. |
Retry object¶
Backoff doubles each retry attempt.
Example catalog¶
[
{
"alias": "embed/local",
"task": "embed",
"provider_id": "local/candle",
"model_id": "sentence-transformers/all-MiniLM-L6-v2",
"warmup": "eager",
"required": true,
"options": {
"cache_dir": ".uni_cache/candle"
}
},
{
"alias": "rerank/semantic",
"task": "rerank",
"provider_id": "remote/cohere",
"model_id": "rerank-v3.5",
"timeout": 20,
"retry": {
"max_attempts": 3,
"initial_backoff_ms": 200
},
"options": {
"api_key_env": "CO_API_KEY"
}
},
{
"alias": "generate/chat",
"task": "generate",
"provider_id": "remote/azure-openai",
"model_id": "gpt-4o-mini",
"load_timeout": 120,
"options": {
"resource_name": "my-azure-openai",
"api_version": "2024-10-21"
}
}
]
Validation behavior¶
At builder/register time Uni-Xervo rejects:
- invalid alias format,
- duplicate aliases,
- unknown providers,
- provider option type/key violations,
- zero-valued
timeoutorload_timeout.
See Config Validation for schema-based CI checks.