Routing
Routing in Keel is explicit, auditable, and non-autonomous. Each routing-aware request produces a recorded selection — a requested target, a selected target, an explicit reason for any change, and a fallback history when fallback ran. There is no fleet-wide automatic provider selection, and routing never bypasses governance.
Different surfaces expose routing differently. POST /v1/execute resolves a routing-policy alias to a concrete target when provider is omitted. POST /v1/executions and select proxy routes accept a public routing envelope on the request that drives initial selection, same-provider reselection, and explicit fallback. The capability matrix below shows which behavior each surface supports.
Per-surface routing capability matrix
| Surface | Initial target | Same-provider reselection | Routing-fitness reranking | Cross-provider fallback | Explicit fallback chain | Priority hint |
|---|---|---|---|---|---|---|
| POST /v1/executions | routing.provider / routing.model | Active | Active | Opt-in via routing.allow_cross_provider_fallback | Active | Active (cheap / quality) |
| POST /v1/execute | Resolved via aliases / health / routing budgets when provider is omitted | n/a | n/a | n/a | Not exposed | n/a |
| POST /v1/proxy/openai | Route-bound provider | Recorded only | Recorded only | Opt-in (non-stream text subset) | Active | Recorded only |
| POST /v1/proxy/anthropic | Route-bound provider | Recorded only | Recorded only | Opt-in (non-stream text subset) | Active | Recorded only |
| POST /v1/proxy/google | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/proxy/xai | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/proxy/meta | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/permits | n/a | n/a | n/a | n/a | n/a | Metadata only |
| POST /v1/jobs | n/a | n/a | n/a | n/a | n/a | Metadata only |
Reading the matrix:
- Active — the field changes selection or dispatch behavior on this surface.
- Recorded only — the field is accepted on the request, persisted on the routing record, and visible in audit evidence, but does not currently influence the dispatched target on that surface.
- Rejected — the surface refuses requests that carry the field.
- Metadata only — the surface stores routing intent on the permit or job record but does not run the request-scoped routing contract.
- n/a — the field has no meaning on this surface.
Target resolution on /v1/execute
This behavior runs only on POST /v1/execute, and only when the request omits provider. Its job is to resolve an alias, a model name, or a unique capability-registry entry into a concrete provider/model target. The selected target then proceeds through provider-bound execution.
Inputs Keel can use
- A routing-policy alias from the project, organization, or global scope.
- A model alias defined in the capability registry.
- A unique provider match for a model name.
- Project-scoped provider-health ranking.
- Active routing budgets.
The selected target is exposed in the resolved block on the /v1/execute response. This selection path does not surface through the public routing envelope used by POST /v1/executions.
Routing policies
Routing policies are control-plane alias rules. They are administered through the dashboard and exposed via dashboard-session authenticated routes:
GET /v1/control-plane/routing-policies
POST /v1/control-plane/routing-policies

A routing-policy row carries an alias, an optional operation, an optional priority hint, a provider, a model identifier, a rank, an active flag, and a scope (project, organization, or global). Each row tells Keel “when callers ask for this alias, consider this provider+model as a candidate.”
{
  "project_id": "11111111-1111-1111-1111-111111111111",
  "alias": "smart-default",
  "operation": "generate.text",
  "priority_hint": "cheap",
  "provider": "openai",
  "model_id": "gpt-4.1-mini",
  "rank": 0,
  "is_active": true
}

Project and organization scopes are mutable through the API. Global routing policies are not exposed for API mutation.
Resolution order
When POST /v1/execute resolves an alias, Keel walks scopes in this order:
- Project scope.
- Organization scope (when the project belongs to an organization).
- Global scope.
Keel stops at the first scope that yields any active candidates. Within the winning scope, candidates are ordered by:
- Most specific rule shape: operation + priority_hint, then operation only, then priority_hint only, then a generic alias.
- The configured rank value.
- Authoring order, oldest first.
The first candidate after that ordering is the initial primary target.
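
The ordering above can be sketched in a few lines of Python. This is a minimal illustration, not Keel's implementation: the `PolicyRow` class, the `created_seq` authoring counter, and the function names are hypothetical, and the sketch does not model how a row's operation or priority hint is matched against the incoming request.

```python
from dataclasses import dataclass
from typing import Optional

SCOPE_ORDER = ("project", "organization", "global")


@dataclass(frozen=True)
class PolicyRow:
    # Hypothetical in-memory shape of a routing-policy row.
    alias: str
    scope: str
    provider: str
    model_id: str
    rank: int
    created_seq: int  # assumed authoring-order counter, lower = older
    operation: Optional[str] = None
    priority_hint: Optional[str] = None
    is_active: bool = True


def specificity(row: PolicyRow) -> int:
    """Lower is more specific: both fields, operation only, hint only, generic."""
    if row.operation and row.priority_hint:
        return 0
    if row.operation:
        return 1
    if row.priority_hint:
        return 2
    return 3


def order_candidates(rows, alias):
    """Return ordered candidates from the first scope that has active rows."""
    for scope in SCOPE_ORDER:
        active = [
            r for r in rows
            if r.alias == alias and r.scope == scope and r.is_active
        ]
        if active:
            return sorted(
                active, key=lambda r: (specificity(r), r.rank, r.created_seq)
            )
    return []
```

The first element of the returned list corresponds to the initial primary target; an empty list means no scope produced an active candidate.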
Reordering after alias resolution
Routing-policy resolution is the first step, not the last. Once an alias produces candidates, Keel can further reorder the candidate set using:
- Project-scoped provider-health ranking.
- Evidence-based global health override (only when an external provider failure is confirmed across independent signals; see Security › Beta-period security boundaries).
- Active routing budgets.
So the final /v1/execute target can differ from the raw top-ranked alias row when health or budget controls apply. Each input that influenced the selection is recorded so the change is auditable.
The resolved block
POST /v1/execute responses include a resolved block summarizing how the target was resolved:
- resolved.alias: the alias that was matched, when an alias drove the selection.
- resolved.policy: the routing-policy row that won, including its scope.
- resolved.health: the health signal that influenced ordering, when present.
- resolved.budget: the active routing budget that influenced ordering, when present.
The resolved block is the customer-facing audit surface for /v1/execute target resolution. Routing policies do not change POST /v1/executions behavior, do not change proxy-route routing, and are not policy rows in the permit policy engine.
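
A caller that logs how each /v1/execute target was chosen might reduce the resolved block to the list of inputs that were present. The sketch below assumes the response has already been parsed into a dict and that absent inputs are either missing or null; the function name is hypothetical and the inner shape of each key is not modeled.

```python
RESOLVED_KEYS = ("alias", "policy", "health", "budget")


def resolution_influences(response: dict) -> list[str]:
    """List which resolved.* entries are present on a parsed response."""
    resolved = response.get("resolved") or {}
    return [key for key in RESOLVED_KEYS if resolved.get(key) is not None]
```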
Request-scoped routing on /v1/executions and select proxies
This behavior runs on POST /v1/executions and on the OpenAI and Anthropic proxy routes. It accepts a public routing envelope on the request that can change the selected provider, change the selected model within the same provider, and carry an explicit fallback chain into dispatch.
The routing envelope
{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "priority": "cheap",
    "task_type": "summarization",
    "latency_target_ms": 8000,
    "allow_cross_provider_fallback": false,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"},
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

| Field | POST /v1/executions | POST /v1/proxy/openai and anthropic | Meaning |
|---|---|---|---|
| provider | Active | n/a (route-bound) | Initial provider target. |
| model | Active | n/a (route-bound) | Initial model target. |
| priority | Active | Recorded only | One of cheap, balanced, quality. Drives tie-breaks among scored candidates. |
| task_type | Active | Recorded only | Routing-fitness key for same-provider model reselection. |
| latency_target_ms | Recorded only | Recorded only | Audit metadata only today; does not alter selection. |
| allow_cross_provider_fallback | Active | Active | Opt-in gate for cross-provider entries in fallback_chain. |
| fallback_chain | Active | Active | Explicit ordered fallback targets. |
The Google, xAI, and Meta proxy routes reject public routing entirely; sending the envelope produces a request error.
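
Because several envelope fields are accepted but do not change dispatch on some surfaces, a client may want to flag them before sending. The sketch below encodes the field table as a lookup; the surface keys `"executions"` and `"proxy"` (meaning the OpenAI and Anthropic proxy routes) and the function name are illustrative assumptions, not part of Keel's API.

```python
# Behavior per field, transcribed from the field table above.
FIELD_BEHAVIOR = {
    "provider": {"executions": "active", "proxy": "n/a"},
    "model": {"executions": "active", "proxy": "n/a"},
    "priority": {"executions": "active", "proxy": "recorded_only"},
    "task_type": {"executions": "active", "proxy": "recorded_only"},
    "latency_target_ms": {"executions": "recorded_only", "proxy": "recorded_only"},
    "allow_cross_provider_fallback": {"executions": "active", "proxy": "active"},
    "fallback_chain": {"executions": "active", "proxy": "active"},
}


def inert_fields(envelope: dict, surface: str) -> list[str]:
    """Return known routing fields that will not influence dispatch on `surface`."""
    routing = envelope.get("routing", {})
    return sorted(
        name for name in routing
        if name in FIELD_BEHAVIOR and FIELD_BEHAVIOR[name][surface] != "active"
    )
```

For the sample envelope above, `inert_fields(envelope, "executions")` returns only `latency_target_ms`, while the same envelope on a proxy route also lists `provider`, `model`, `priority`, and `task_type`.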
Initial target selection
On POST /v1/executions, Keel resolves the initial provider/model from the request inputs in this order:
- Use routing.provider and routing.model when supplied.
- Otherwise, fall back to the request’s defaults and capability-registry inference.
On the OpenAI and Anthropic proxy routes, the route itself binds the provider — the initial provider is the provider named in the route path. routing.provider and routing.model are not applicable on these routes; the initial target is the route-bound provider plus whatever model the proxy payload names.
Hard eligibility versus soft ranking
Once an initial target is selected, Keel computes the eligible same-provider candidate set, applies hard filters first, and only then applies soft ranking.
Hard eligibility — a candidate must pass all of these to remain in consideration:
- The selected provider boundary (request-scoped reselection, referred to as Layer 2 on this page, stays inside one provider).
- Any active text-policy allow-list.
- Registered operation support for the candidate.
- Execution-mode support (sync, async, streaming where applicable).
- Pricing availability for the selected target and any fallback target.
Soft ranking — applied to candidates that survive hard eligibility:
- Routing-fitness score for the supplied task_type.
- priority=cheap selects the cheapest eligible model when fitness does not decide.
- priority=quality selects the highest-cost eligible model in the same situation.
If no scored candidates remain, Keel falls back to the requested model or to existing per-provider price-based selection.
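
The two-stage shape described above (hard filters first, soft ranking second) can be sketched as follows. This is one plausible reading under stated assumptions, not Keel's actual algorithm: the `Candidate` class, the numeric `price` field, the per-candidate `fitness` mapping, and the exact tie-break behavior are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical view of a capability-registry entry.
    provider: str
    model: str
    price: Optional[float]  # None models "pricing not configured"
    operations: frozenset = frozenset({"generate.text"})
    modes: frozenset = frozenset({"sync"})
    fitness: dict = field(default_factory=dict)  # task_type -> score


def select_model(requested, candidates, *, operation, mode,
                 task_type=None, priority="balanced", allow_list=None):
    # Stage 1: hard eligibility. A candidate must pass every check.
    eligible = [
        c for c in candidates
        if c.provider == requested.provider
        and (allow_list is None or c.model in allow_list)
        and operation in c.operations
        and mode in c.modes
        and c.price is not None
    ]
    if not eligible:
        return requested

    # Stage 2: soft ranking. Fitness decides first when it has a unique best.
    pool = eligible
    scored = [c for c in eligible if task_type in c.fitness]
    if scored:
        best = max(c.fitness[task_type] for c in scored)
        top = [c for c in scored if c.fitness[task_type] == best]
        if len(top) == 1:
            return top[0]
        pool = top

    # Priority breaks ties; "balanced" keeps the requested model.
    if priority == "cheap":
        return min(pool, key=lambda c: c.price)
    if priority == "quality":
        return max(pool, key=lambda c: c.price)
    return requested
```

Keeping the hard filters in a single list comprehension makes it obvious that ranking never sees a candidate that failed eligibility.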
Routing Fitness Registry V1
Routing Fitness Registry V1 is the named registry of per-model task-type scores Keel uses for same-provider model reselection. The “V1” marker signals the current capability boundary: the registry is live and customer-influenceable through task_type, but its scope is intentionally narrow today and may broaden in later versions.
Current capability boundary:
- Scores are manual hints attached to capability descriptors per provider/model.
- Reranking applies to same-provider candidates only — never to cross-provider selection.
- Coverage today is generate.text. Other operations may carry routing-fitness scores in the future, but generate.text is the supported case in V1.
- Reranking runs on surfaces that consult the request-scoped routing envelope: POST /v1/executions today, not the OpenAI or Anthropic proxy routes.
When a task_type is supplied and Routing Fitness V1 has scores for the eligible candidates, the highest-scoring candidate is selected. When fitness does not produce a clear winner, priority=cheap or priority=quality breaks the tie. When neither fitness nor priority decides, the requested model is preserved.
Priority hints
The priority field carries the caller’s intent for tie-breaking among eligible candidates:
- priority=cheap: choose the cheapest eligible candidate when fitness does not decide.
- priority=quality: choose the highest-cost eligible candidate in the same situation.
- priority=balanced: record the hint and keep the requested model unless routing fitness changes it. priority=balanced does not by itself trigger price-based reselection.
priority is recorded on the permit and the routing record regardless of which surface received it. On the OpenAI and Anthropic proxy routes the field is recorded but does not currently change selection.
task_type
task_type is the public key into Routing Fitness V1. Supplying a task_type enables fitness-driven same-provider model reselection on POST /v1/executions. Common values include summarization, classification, and similar narrow operation categories. Unknown task_type values are accepted and recorded; they do not produce errors, but they do not match any registry score.
latency_target_ms
latency_target_ms is recorded on the routing record as audit metadata only. It does not currently influence selection on any surface. Future capability boundaries may add latency-aware selection; the field is reserved so callers can begin recording intent today.
Fallback chains
Fallback in Keel is explicit, ordered, and surface-specific. It is not an implicit “try any provider until one works” mechanism, and it never bypasses governance.
Configuring a fallback chain
{
  "routing": {
    "allow_cross_provider_fallback": true,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"},
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

Rules that apply to every supported surface:
- The chain is ordered. Keel attempts targets in the order provided.
- Duplicate targets are deduped across the initial target and the chain.
- Cross-provider entries require the caller to set routing.allow_cross_provider_fallback: true. Without that opt-in, cross-provider entries are rejected at request time.
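
The three rules can be illustrated with a small normalization helper. This is a sketch under assumptions: targets are modeled as `(provider, model)` tuples, "cross-provider" is taken to mean a provider different from the initial target's, and the function name and error type are hypothetical.

```python
def normalize_fallback_chain(initial, chain, allow_cross_provider=False):
    """Return the ordered, deduplicated fallback targets for a request.

    `initial` and each chain entry are (provider, model) tuples.
    Raises ValueError when a cross-provider entry lacks the opt-in.
    """
    seen = {initial}
    ordered = []
    for target in chain:
        provider, _model = target
        if provider != initial[0] and not allow_cross_provider:
            raise ValueError(
                f"cross-provider entry {target} requires "
                "routing.allow_cross_provider_fallback: true"
            )
        if target in seen:
            continue  # drop duplicates of the initial target or earlier entries
        seen.add(target)
        ordered.append(target)
    return ordered
```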
Eligibility filtering before dispatch
On POST /v1/executions, every fallback target is filtered before it is attempted. A target can be present in the request and still be dropped before dispatch when it does not survive:
- Adapter execution-mode support for the request.
- Request-translation support (the request can be reshaped for the target provider).
- Pricing availability.
When all chain targets fail eligibility, the request denies with the original target’s failure rather than silently invoking an ineligible target.
When fallback advances
Fallback advances after eligible execution failures. Typical retryable cases:
- Upstream provider 5xx responses.
- Upstream timeouts.
- Transport or network failure between Keel and the provider.
Some proxy fallback candidates can also be skipped before dispatch when Keel detects unsupported modality or translation failure while building the next attempt.
When fallback does not advance
Fallback is execution-retry logic, not a mechanism to reroute around governance decisions. Keel does not advance fallback on:
- Prompt-firewall blocks. The request denies with prompt_firewall_blocked and the chain is not attempted.
- Permit or policy denials. A denied permit does not retry through fallback.
- Invalid request errors.
- Authentication failures.
- Missing provider configuration.
- Pricing-not-configured failures.
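
The split between retryable execution failures and governance or configuration failures can be sketched as a simple classifier plus an attempt loop. Only `prompt_firewall_blocked` is a name taken from this page; the other enum members, the `dispatch` callback, and the loop structure are illustrative assumptions. In practice some of these denials occur before any dispatch; the sketch folds them into one callback for brevity.

```python
from enum import Enum


class Failure(Enum):
    # Hypothetical failure categories mirroring the two lists above.
    UPSTREAM_5XX = "upstream_5xx"
    UPSTREAM_TIMEOUT = "upstream_timeout"
    TRANSPORT_ERROR = "transport_error"
    PROMPT_FIREWALL_BLOCKED = "prompt_firewall_blocked"
    PERMIT_DENIED = "permit_denied"
    INVALID_REQUEST = "invalid_request"
    AUTH_FAILED = "auth_failed"
    PROVIDER_NOT_CONFIGURED = "provider_not_configured"
    PRICING_NOT_CONFIGURED = "pricing_not_configured"


RETRYABLE = frozenset(
    {Failure.UPSTREAM_5XX, Failure.UPSTREAM_TIMEOUT, Failure.TRANSPORT_ERROR}
)


def should_advance(failure: Failure) -> bool:
    """Fallback advances only after retryable execution failures."""
    return failure in RETRYABLE


def run_with_fallback(targets, dispatch):
    """Try targets in order; `dispatch(target)` returns None on success
    or a Failure. Returns (selected_target_or_None, attempt_history)."""
    history = []
    for target in targets:
        failure = dispatch(target)
        if failure is None:
            return target, history
        history.append((target, failure))
        if not should_advance(failure):
            break  # governance and configuration failures stop the request
    return None, history
```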
Per-surface fallback support
| Surface | fallback_chain accepted | Same-provider fallback | Cross-provider fallback |
|---|---|---|---|
| POST /v1/executions | Yes | Yes | Opt-in via allow_cross_provider_fallback; eligible targets filtered before dispatch |
| POST /v1/proxy/openai | Yes | Yes | Opt-in for the implemented non-stream OpenAI/Anthropic text subset |
| POST /v1/proxy/anthropic | Yes | Yes | Opt-in for the implemented non-stream OpenAI/Anthropic text subset |
| POST /v1/proxy/google | Rejected | n/a | n/a |
| POST /v1/proxy/xai | Rejected | n/a | n/a |
| POST /v1/proxy/meta | Rejected | n/a | n/a |
| POST /v1/execute | Not exposed | n/a | n/a |
When a requested cross-provider fallback target on a proxy route cannot be translated cleanly, the route fails closed with the unsupported_cross_provider_failover error rather than attempting an unsafe substitution.
Streaming proxy boundary
Streaming proxy paths keep the direct provider path and do not expose the cross-provider fallback contract. A streaming request that asks for cross-provider fallback proceeds against the originally selected provider; cross-provider reroute requires the non-stream OpenAI/Anthropic text path.
Common routing patterns
The previous sections covered each routing control in isolation. This section shows three worked scenarios that combine controls into typical routing strategies.
Pattern 1 — Same-provider fitness with cheap fallback
Use this pattern when you have a preferred model on a single provider but want Keel to choose a cheaper same-provider model when the task allows, and to fall back to a same-provider alternative on transient failures.
Request to POST /v1/executions:
{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "task_type": "summarization",
    "priority": "cheap",
    "fallback_chain": [
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

What Keel does:
- Initial target is openai/gpt-4o.
- Routing Fitness V1 looks up scores for summarization on the eligible OpenAI text models. If gpt-4o-mini scores well and priority=cheap ties resolve in its favor, Keel selects gpt-4o-mini instead.
- If the dispatched call fails with a retryable upstream error, Keel advances to the next chain target (when it differs from the already-selected target).
- The routing record carries requested_provider=openai, requested_model=gpt-4o, selected_provider=openai, selected_model=gpt-4o-mini, and reason_code=routing_fitness or cost_optimization depending on what drove the change.
This pattern is available on every plan that supports POST /v1/executions. Same-provider fallback does not require the cross-provider opt-in.
Pattern 2 — Cross-provider failover
Use this pattern when you want a resilient backup on a different provider in case the primary provider is unavailable.
Request to POST /v1/executions (Business plan or above required for cross-provider entries):
{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "allow_cross_provider_fallback": true,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"}
    ]
  }
}

What Keel does:
- Initial target is openai/gpt-4o. The cross-provider fallback target is filtered before dispatch: translation support, capability, mode, and pricing are all checked.
- If the OpenAI call fails with a retryable error, Keel translates the original request shape to Anthropic’s contract and dispatches to claude-sonnet.
- The routing record carries requested_provider=openai, requested_model=gpt-4o, selected_provider=anthropic, selected_model=claude-sonnet, fallback_occurred=true, and reason_code=fallback_after_error.
- A routing.fallback_triggered governance event records the fallback advancement.
Without allow_cross_provider_fallback: true, this same request is rejected because the chain contains a cross-provider entry.
Pattern 3 — alias resolution on /v1/execute
Use this pattern when callers want to reference a provider/model through a stable alias rather than naming a specific provider/model on every call. The alias is resolved by /v1/execute’s routing-policy resolution.
Setup — a project-scoped routing policy administered through the dashboard:
{
  "alias": "summarizer",
  "operation": "generate.text",
  "priority_hint": "cheap",
  "provider": "openai",
  "model_id": "gpt-4o-mini",
  "rank": 0,
  "is_active": true
}

Request to POST /v1/execute (omits provider, sends the alias as model):
{
  "model": "summarizer",
  "input": {
    "messages": [
      {"role": "user", "content": "Summarize this document in one sentence."}
    ],
    "max_tokens": 80
  }
}

What Keel does:
- Layer 1 walks scopes (project, then organization, then global) looking for active routing-policy rows whose alias matches summarizer. The project-scoped row wins.
- Project-scoped health and active routing budgets can reorder the candidate set when more than one candidate is in play.
- The resolved block on the response carries resolved.alias="summarizer", the winning policy row, and any active health or budget influence.
- Execution proceeds against the selected target.
The Layer-1 alias path does not produce the public routing.reason_code set used by Layer 2 — Layer 1’s audit surface is the resolved block on the /v1/execute response.
Selection auditability
Every routing-aware request produces a routing record alongside the permit and request-state records. The routing record is the customer-facing audit surface for selection.
Requested versus selected target
The routing record distinguishes:
- requested_provider: the provider the caller named (or that defaults produced).
- requested_model: the model the caller named.
- selected_provider: the provider Keel actually dispatched to.
- selected_model: the model Keel actually dispatched.
The two pairs match when nothing changed the target between request and dispatch. They differ when:
- Same-provider model reselection chose a different model on the same provider.
- Fallback advanced after an execution error and chose a different target.
- Routing-policy resolution on POST /v1/execute produced a target that differed from the raw top-ranked alias row because of health or budget reordering.
Routing reason codes
Routing decisions surface a routing.reason_code on every routing record. The current public codes are:
| Code | Meaning |
|---|---|
| explicit_request | The request named a target and Keel used it without modification. |
| routing_fitness | Same-provider model reselection chose a different model based on Routing Fitness V1 scores for the supplied task_type. |
| cost_optimization | A priority=cheap tie-break drove selection. |
| quality_preference | A priority=quality tie-break drove selection. |
| fallback_after_error | Fallback advanced after an upstream execution failure. |
The routing.reason_code and the permit reason_code are different fields. The permit reason_code describes a governance decision; the routing reason_code describes a selection decision. Both can carry their own values on the same request.
Proxy headers
Proxy responses expose terminal selection in response headers so callers without access to the structured routing record still see how the request was routed:
- x-keel-provider: the provider Keel dispatched to.
- x-keel-model: the model Keel dispatched to.
- x-keel-requested-provider: the provider the caller named.
- x-keel-requested-model: the model the caller named.
- x-keel-routing-reason: the routing reason code.
- x-keel-routing-fallback: true when fallback advanced.
- x-keel-routing-fallback-attempt-count: the number of fallback attempts before a target succeeded.
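
A client that only sees HTTP responses could collect these headers into a typed value. The sketch below assumes the headers arrive as a plain string mapping, that the fallback flag is the literal text `true`, and that the attempt count is a decimal integer; the `ProxyRouting` class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProxyRouting:
    provider: Optional[str]
    model: Optional[str]
    requested_provider: Optional[str]
    requested_model: Optional[str]
    reason: Optional[str]
    fallback: bool
    fallback_attempts: Optional[int]


def parse_proxy_routing(headers: Mapping[str, str]) -> ProxyRouting:
    # Header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    attempts = h.get("x-keel-routing-fallback-attempt-count", "").strip()
    return ProxyRouting(
        provider=h.get("x-keel-provider"),
        model=h.get("x-keel-model"),
        requested_provider=h.get("x-keel-requested-provider"),
        requested_model=h.get("x-keel-requested-model"),
        reason=h.get("x-keel-routing-reason"),
        fallback=h.get("x-keel-routing-fallback", "").strip().lower() == "true",
        fallback_attempts=int(attempts) if attempts.isdigit() else None,
    )
```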
Fallback history
When fallback advances, the routing record carries:
- routing.fallback_occurred: true.
- routing.reason_code: "fallback_after_error".
- A fallback history sequence inside reason_metadata describing each attempted target and the failure that caused the next attempt.
- A governance event entry for routing.fallback_triggered so fallback advancement is part of the project’s governance record.
reason_metadata boundary
reason_metadata is audit and diagnostic context, not a stable public schema. Its keys can change between Keel releases as Keel adds finer-grained diagnostic context. Two rules apply:
- Public responses can redact internal scoring or pricing details from reason_metadata.
- Callers should not treat individual reason_metadata keys as a long-term integration contract.
For stable audit and integration, parse the typed fields above (requested_*, selected_*, reason_code, fallback_occurred). Use reason_metadata for diagnostic display, not for programmatic selection logic.
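
As an illustration of integrating against the typed fields only, the sketch below derives a small summary from a routing record and never reads reason_metadata. It assumes the record has been parsed into a flat dict keyed by the field names above; the function name and the summary keys are hypothetical.

```python
def summarize_routing(record: dict) -> dict:
    """Summarize a routing record using only the typed fields."""
    requested = (record.get("requested_provider"), record.get("requested_model"))
    selected = (record.get("selected_provider"), record.get("selected_model"))
    return {
        "target_changed": requested != selected,
        "provider_changed": requested[0] != selected[0],
        "reason_code": record.get("reason_code"),
        "fallback_occurred": bool(record.get("fallback_occurred", False)),
    }
```

Applied to the record from Pattern 2, the summary reports a changed target, a changed provider, and `fallback_after_error` as the reason code.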
Plan tier availability
Routing capability is gated by plan in two places:
| Capability | Gate | Behavior below the gate |
|---|---|---|
| Cross-provider fallback (allow_cross_provider_fallback: true with cross-provider entries in fallback_chain) | Available on Business plans and above (entitlement: routing_cross_provider_enabled) | Cross-provider entries are rejected at request time. |
| Length of fallback_chain | Quota-bounded by plan (entitlement: routing_fallback_chain_max_length) | Starter is bounded to one fallback target. Growth, Business, and Enterprise are unbounded. |
Same-provider fallback, same-provider model reselection, Routing Fitness Registry V1, and routing.reason_code recording are available on every plan on the surfaces that support them. Plan-tier gating affects cross-provider behavior and chain length, not selection auditability.
For the full per-tier feature matrix, including plan-quota responses for over-limit requests, see Plans & Entitlements.
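
A client-side pre-check against the two gates might look like the sketch below. The entitlement names come from the table above, but how their values are represented (a boolean and an integer, with `None` standing for unbounded) is an assumption, as are the function name and the `(provider, model)` tuple shape.

```python
from typing import Optional


def check_chain_entitlements(initial_provider, chain, *,
                             cross_provider_enabled: bool,
                             max_chain_length: Optional[int]):
    """Return a list of plan-gate problems for a fallback chain (empty = OK).

    cross_provider_enabled models routing_cross_provider_enabled;
    max_chain_length models routing_fallback_chain_max_length.
    """
    problems = []
    if max_chain_length is not None and len(chain) > max_chain_length:
        problems.append(
            f"fallback_chain has {len(chain)} targets; plan allows {max_chain_length}"
        )
    if not cross_provider_enabled and any(
        provider != initial_provider for provider, _model in chain
    ):
        problems.append("cross-provider fallback is not enabled on this plan")
    return problems
```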
What this surface does and does not claim
- Layer 1 runs only on POST /v1/execute and only when provider is omitted. POST /v1/executions does not consult routing-policy aliases.
- Layer 2 is the public routing envelope on POST /v1/executions and on the OpenAI and Anthropic proxy routes. The Google, xAI, and Meta proxy routes reject public routing.
- Routing Fitness Registry V1 is same-provider only and currently covers generate.text. It is not a cross-provider scoring system.
- Fallback never bypasses governance. Permit denials, policy denials, prompt-firewall blocks, authentication failures, and configuration failures stop the request rather than advance through fallback_chain.
- Streaming proxy paths do not expose the cross-provider fallback contract. Streaming requests that ask for cross-provider fallback proceed against the originally selected provider.
- reason_metadata is diagnostic state, not a stable public schema. Parse the typed routing fields for integration; treat reason_metadata keys as informational.
- There is no autonomous fleet-wide provider selection on any public surface today.