Routing

Routing in Keel is explicit, auditable, and non-autonomous. Each routing-aware request produces a recorded selection — a requested target, a selected target, an explicit reason for any change, and a fallback history when fallback ran. There is no fleet-wide automatic provider selection, and routing never bypasses governance.

Different surfaces expose routing differently. POST /v1/execute resolves a routing-policy alias to a concrete target when provider is omitted. POST /v1/executions and the OpenAI and Anthropic proxy routes accept a public routing envelope on the request that drives initial selection, same-provider reselection, and explicit fallback. The capability matrix below shows which behavior each surface supports.

Per-surface routing capability matrix

| Surface | Initial target | Same-provider reselection | Routing-fitness reranking | Cross-provider fallback | Explicit fallback chain | Priority hint |
| --- | --- | --- | --- | --- | --- | --- |
| POST /v1/executions | routing.provider / routing.model | Active | Active | Opt-in via routing.allow_cross_provider_fallback | Active | Active (cheap / quality) |
| POST /v1/execute | Resolved via aliases / health / routing budgets when provider is omitted | n/a | n/a | n/a | Not exposed | n/a |
| POST /v1/proxy/openai | Route-bound provider | Recorded only | Recorded only | Opt-in (non-stream text subset) | Active | Recorded only |
| POST /v1/proxy/anthropic | Route-bound provider | Recorded only | Recorded only | Opt-in (non-stream text subset) | Active | Recorded only |
| POST /v1/proxy/google | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/proxy/xai | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/proxy/meta | Route-bound provider | n/a | n/a | Rejected | Rejected | n/a |
| POST /v1/permits | n/a | n/a | n/a | n/a | n/a | Metadata only |
| POST /v1/jobs | n/a | n/a | n/a | n/a | n/a | Metadata only |

Reading the matrix:

  • Active — the field changes selection or dispatch behavior on this surface.
  • Recorded only — the field is accepted on the request, persisted on the routing record, and visible in audit evidence, but does not currently influence the dispatched target on that surface.
  • Rejected — the surface refuses requests that carry the field.
  • Metadata only — the surface stores routing intent on the permit or job record but does not run the request-scoped routing contract.
  • n/a — the field has no meaning on this surface.

Target resolution on /v1/execute

This behavior, called Layer 1 in the rest of this page, runs only on POST /v1/execute, and only when the request omits provider. Its job is to resolve an alias, a model name, or a unique capability-registry entry into a concrete provider/model target. The selected target then proceeds through provider-bound execution.

Inputs Keel can use

  • A routing-policy alias from the project, organization, or global scope.
  • A model alias defined in the capability registry.
  • A unique provider match for a model name.
  • Project-scoped provider-health ranking.
  • Active routing budgets.

The selected target is exposed in the resolved block on the /v1/execute response. This selection path does not surface through the public routing envelope used by POST /v1/executions.

Routing policies

Routing policies are control-plane alias rules. They are administered through the dashboard and exposed via dashboard-session authenticated routes:

GET /v1/control-plane/routing-policies
POST /v1/control-plane/routing-policies

A routing-policy row carries an alias, an optional operation, an optional priority hint, a provider, a model identifier, a rank, an active flag, and a scope (project, organization, or global). Each row tells Keel “when callers ask for this alias, consider this provider+model as a candidate.”

{
  "project_id": "11111111-1111-1111-1111-111111111111",
  "alias": "smart-default",
  "operation": "generate.text",
  "priority_hint": "cheap",
  "provider": "openai",
  "model_id": "gpt-4.1-mini",
  "rank": 0,
  "is_active": true
}

Project and organization scopes are mutable through the API. Global routing policies are not exposed for API mutation.

Resolution order

When POST /v1/execute resolves an alias, Keel walks scopes in this order:

  1. Project scope.
  2. Organization scope (when the project belongs to an organization).
  3. Global scope.

Keel stops at the first scope that yields any active candidates. Within the winning scope, candidates are ordered by:

  1. Most specific rule shape — operation + priority_hint, then operation only, then priority_hint only, then a generic alias.
  2. The configured rank value.
  3. Authoring order, oldest first.

The first candidate after that ordering is the initial primary target.
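
The scope walk and ordering above can be sketched as a small sort. This is a minimal illustration, not Keel's implementation: the `PolicyRow` shape, the `created_at` field standing in for authoring order, and the rule that a row with a non-matching operation or priority hint is skipped are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

SCOPE_ORDER = ("project", "organization", "global")

@dataclass
class PolicyRow:
    # Hypothetical in-memory shape; field names mirror the row described above.
    alias: str
    scope: str
    provider: str
    model_id: str
    rank: int
    created_at: int  # assumed stand-in for authoring order (smaller = older)
    operation: Optional[str] = None
    priority_hint: Optional[str] = None
    is_active: bool = True

def specificity(row, operation, priority_hint):
    """Lower is more specific. None means the row is skipped (assumed matching rule)."""
    if row.operation is not None and row.operation != operation:
        return None
    if row.priority_hint is not None and row.priority_hint != priority_hint:
        return None
    if row.operation and row.priority_hint:
        return 0  # operation + priority_hint
    if row.operation:
        return 1  # operation only
    if row.priority_hint:
        return 2  # priority_hint only
    return 3      # generic alias

def resolve_alias(rows, alias, operation=None, priority_hint=None):
    """Return ordered candidates from the first scope that yields any."""
    for scope in SCOPE_ORDER:
        candidates = []
        for row in rows:
            if row.scope != scope or row.alias != alias or not row.is_active:
                continue
            spec = specificity(row, operation, priority_hint)
            if spec is not None:
                candidates.append((spec, row.rank, row.created_at, row))
        if candidates:
            # Sort by specificity, then rank, then authoring order (oldest first).
            candidates.sort(key=lambda c: c[:3])
            return [c[3] for c in candidates]
    return []
```

The first element of the returned list corresponds to the initial primary target; health and budget reordering, described next, is outside this sketch.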

Reordering after alias resolution

Routing-policy resolution is the first step, not the last. Once an alias produces candidates, Keel can further reorder the candidate set using:

  • Project-scoped provider-health ranking.
  • Evidence-based global health override (only when an external provider failure is confirmed across independent signals; see Security › Beta-period security boundaries).
  • Active routing budgets.

So the final /v1/execute target can differ from the raw top-ranked alias row when health or budget controls apply. Each input that influenced the selection is recorded so the change is auditable.

The resolved block

POST /v1/execute responses include a resolved block summarizing how the target was resolved:

  • resolved.alias — the alias that was matched, when an alias drove the selection.
  • resolved.policy — the routing-policy row that won, including its scope.
  • resolved.health — the health signal that influenced ordering, when present.
  • resolved.budget — the active routing budget that influenced ordering, when present.

The resolved block is the customer-facing audit surface for /v1/execute target resolution. Routing policies do not change POST /v1/executions behavior, do not change proxy-route routing, and are not policy rows in the permit policy engine.
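
As a rough illustration of consuming the resolved block, the helper below reports which inputs influenced resolution. It is a sketch under assumptions: the response is treated as an already-decoded JSON object, and the `scope` key inside `resolved.policy` is a guess at how the scope is exposed.

```python
def summarize_resolved(response):
    """Summarize how a /v1/execute target was resolved (assumed response shape)."""
    resolved = response.get("resolved") or {}
    policy = resolved.get("policy") or {}
    return {
        "alias": resolved.get("alias"),
        "policy_scope": policy.get("scope"),  # assumed key name
        "health_influenced": resolved.get("health") is not None,
        "budget_influenced": resolved.get("budget") is not None,
    }
```

A caller might log this summary next to each request so that health or budget reordering is visible without inspecting the full response.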

Request-scoped routing on /v1/executions and select proxies

This behavior, called Layer 2 in the rest of this page, runs on POST /v1/executions and on the OpenAI and Anthropic proxy routes. It accepts a public routing envelope on the request that can change the selected provider, change the selected model within the same provider, and carry an explicit fallback chain into dispatch.

The routing envelope

{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "priority": "cheap",
    "task_type": "summarization",
    "latency_target_ms": 8000,
    "allow_cross_provider_fallback": false,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"},
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

| Field | POST /v1/executions | POST /v1/proxy/openai and anthropic | Meaning |
| --- | --- | --- | --- |
| provider | Active | n/a (route-bound) | Initial provider target. |
| model | Active | n/a (route-bound) | Initial model target. |
| priority | Active | Recorded only | One of cheap, balanced, quality. Drives tie-breaks among scored candidates. |
| task_type | Active | Recorded only | Routing-fitness key for same-provider model reselection. |
| latency_target_ms | Recorded only | Recorded only | Audit metadata only today; does not alter selection. |
| allow_cross_provider_fallback | Active | Active | Opt-in gate for cross-provider entries in fallback_chain. |
| fallback_chain | Active | Active | Explicit ordered fallback targets. |

The Google, xAI, and Meta proxy routes reject public routing entirely; sending the envelope produces a request error.
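
A client can assemble the envelope with a small helper. The sketch below is illustrative only: `build_routing_envelope` is a hypothetical function, not part of any Keel SDK, and the only validation it performs is the documented set of priority values.

```python
ALLOWED_PRIORITIES = {"cheap", "balanced", "quality"}

def build_routing_envelope(provider=None, model=None, priority=None,
                           task_type=None, latency_target_ms=None,
                           fallback_chain=None,
                           allow_cross_provider_fallback=False):
    """Build the public routing envelope; fallback_chain is a list of (provider, model)."""
    if priority is not None and priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(ALLOWED_PRIORITIES)}")
    routing = {"allow_cross_provider_fallback": allow_cross_provider_fallback}
    optional = {
        "provider": provider,
        "model": model,
        "priority": priority,
        "task_type": task_type,
        "latency_target_ms": latency_target_ms,
    }
    # Only send fields the caller actually set.
    routing.update({k: v for k, v in optional.items() if v is not None})
    if fallback_chain:
        routing["fallback_chain"] = [
            {"provider": p, "model": m} for p, m in fallback_chain
        ]
    return {"routing": routing}
```

The returned dictionary can be merged into the request body sent to POST /v1/executions.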

Initial target selection

On POST /v1/executions, Keel resolves the initial provider/model from the request inputs in this order:

  1. Use routing.provider and routing.model when supplied.
  2. Otherwise, fall back to the request’s defaults and capability-registry inference.

On the OpenAI and Anthropic proxy routes, the route itself binds the provider — the initial provider is the provider named in the route path. routing.provider and routing.model are not applicable on these routes; the initial target is the route-bound provider plus whatever model the proxy payload names.

Hard eligibility versus soft ranking

Once an initial target is selected, Keel computes the eligible same-provider candidate set, applies hard filters first, and only then applies soft ranking.

Hard eligibility — a candidate must pass all of these to remain in consideration:

  • The selected provider boundary (same-provider reselection stays inside one provider).
  • Any active text-policy allow-list.
  • Registered operation support for the candidate.
  • Execution-mode support (sync, async, streaming where applicable).
  • Pricing availability for the selected target and any fallback target.

Soft ranking — applied to candidates that survive hard eligibility:

  • Routing-fitness score for the supplied task_type.
  • priority=cheap selects the cheapest eligible model when fitness does not decide.
  • priority=quality selects the highest-cost eligible model in the same situation.

If no scored candidates remain, Keel falls back to the requested model or to existing per-provider price-based selection.
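
The filter-then-rank order can be sketched as follows. This is a simplified model under stated assumptions: the `Candidate` shape, the single `eligible` flag standing in for all hard filters, the `price` field, and the mapping from outcome to reason code are illustrative choices, not Keel's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    provider: str
    model: str
    price: float                      # assumed comparable unit cost
    eligible: bool = True             # stand-in for the hard-eligibility checks
    fitness: dict = field(default_factory=dict)  # task_type -> score

def _result(chosen, requested, reason):
    # Report a change reason only when the model actually changed.
    return (chosen, reason if chosen != requested else "explicit_request")

def select_model(requested, candidates, provider, task_type=None, priority="balanced"):
    # Hard eligibility first: same provider, and passes every hard filter.
    pool = [c for c in candidates if c.provider == provider and c.eligible]
    if not pool:
        return requested, "explicit_request"  # assumption: nothing to rerank
    # Soft ranking: fitness decides when exactly one candidate has the top score.
    scored = [c for c in pool if task_type is not None and task_type in c.fitness]
    if scored:
        best = max(c.fitness[task_type] for c in scored)
        top = [c for c in scored if c.fitness[task_type] == best]
        if len(top) == 1:
            return _result(top[0].model, requested, "routing_fitness")
        pool = top  # fitness tied; let priority break the tie
    if priority == "cheap":
        return _result(min(pool, key=lambda c: c.price).model, requested, "cost_optimization")
    if priority == "quality":
        return _result(max(pool, key=lambda c: c.price).model, requested, "quality_preference")
    return requested, "explicit_request"
```

Candidates on other providers never enter the pool, which mirrors the rule that reselection stays inside the selected provider boundary.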

Routing Fitness Registry V1

Routing Fitness Registry V1 is the named registry of per-model task-type scores Keel uses for same-provider model reselection. The “V1” marker signals the current capability boundary: the registry is live and customer-influenceable through task_type, but its scope is intentionally narrow today and may broaden in later versions.

Current capability boundary:

  • Scores are manual hints attached to capability descriptors per provider/model.
  • Reranking applies to same-provider candidates only — never to cross-provider selection.
  • Coverage today is generate.text. Other operations may carry routing-fitness scores in the future, but generate.text is the supported case in V1.
  • Reranking runs on surfaces that consult the request-scoped routing envelope — POST /v1/executions today, not the OpenAI or Anthropic proxy routes.

When a task_type is supplied and Routing Fitness V1 has scores for the eligible candidates, the highest-scoring candidate is selected. When fitness does not produce a clear winner, priority=cheap or priority=quality breaks the tie. When neither fitness nor priority decides, the requested model is preserved.

Priority hints

The priority field carries the caller’s intent for tie-breaking among eligible candidates:

  • priority=cheap — choose the cheapest eligible candidate when fitness does not decide.
  • priority=quality — choose the highest-cost eligible candidate in the same situation.
  • priority=balanced — record the hint and keep the requested model unless routing fitness changes it. priority=balanced does not by itself trigger price-based reselection.

priority is recorded on the permit and the routing record on every surface that accepts it. On the OpenAI and Anthropic proxy routes the field is recorded but does not currently change selection.

task_type

task_type is the public key into Routing Fitness V1. Supplying a task_type enables fitness-driven same-provider model reselection on POST /v1/executions. Common values include summarization, classification, and similar narrow operation categories. Unknown task_type values are accepted and recorded; they do not produce errors, but they do not match any registry score.

latency_target_ms

latency_target_ms is recorded on the routing record as audit metadata only. It does not currently influence selection on any surface. Future capability boundaries may add latency-aware selection; the field is reserved so callers can begin recording intent today.

Fallback chains

Fallback in Keel is explicit, ordered, and surface-specific. It is not an implicit “try any provider until one works” mechanism, and it never bypasses governance.

Configuring a fallback chain

{
  "routing": {
    "allow_cross_provider_fallback": true,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"},
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

Rules that apply to every supported surface:

  • The chain is ordered. Keel attempts targets in the order provided.
  • Duplicate targets are deduplicated across the initial target and the chain, so no target is attempted twice.
  • Cross-provider entries require the caller to set routing.allow_cross_provider_fallback: true. Without that opt-in, cross-provider entries are rejected at request time.
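
The three rules above can be mirrored client-side before a request is sent. The function below is a hypothetical pre-check, assuming targets are represented as `(provider, model)` tuples; Keel applies its own validation regardless.

```python
def normalize_fallback_chain(initial, chain, allow_cross_provider=False):
    """Return the ordered, deduplicated chain; raise on an ungated cross-provider entry.

    initial and each chain entry are (provider, model) tuples.
    """
    seen = {initial}
    result = []
    for target in chain:
        provider, _model = target
        if provider != initial[0] and not allow_cross_provider:
            raise ValueError(
                f"cross-provider entry {target} requires allow_cross_provider_fallback"
            )
        if target in seen:
            continue  # drop repeats of the initial target or an earlier entry
        seen.add(target)
        result.append(target)
    return result
```

Order is preserved, so the first surviving entry is the first target attempted after the initial one.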

Eligibility filtering before dispatch

On POST /v1/executions, every fallback target is filtered before it is attempted. A target can be present in the request and still be dropped before dispatch when it does not survive:

  • Adapter execution-mode support for the request.
  • Request-translation support (the request can be reshaped for the target provider).
  • Pricing availability.

When all chain targets fail eligibility, the request is denied with the original target’s failure rather than silently invoking an ineligible target.

When fallback advances

Fallback advances after eligible execution failures. Typical retryable cases:

  • Upstream provider 5xx responses.
  • Upstream timeouts.
  • Transport or network failure between Keel and the provider.

Some proxy fallback candidates can also be skipped before dispatch when Keel detects unsupported modality or translation failure while building the next attempt.

When fallback does not advance

Fallback is execution-retry logic, not a mechanism to reroute around governance decisions. Keel does not advance fallback on:

  • Prompt-firewall blocks. The request is denied with prompt_firewall_blocked and the chain is not attempted.
  • Permit or policy denials. A denied permit does not retry through fallback.
  • Invalid request errors.
  • Authentication failures.
  • Missing provider configuration.
  • Pricing-not-configured failures.
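
The advance and stop rules can be pictured as a loop that moves to the next target only on retryable failures. This is a conceptual sketch: the failure labels in `RETRYABLE` and the `dispatch` callback are hypothetical names chosen for the example, and only `prompt_firewall_blocked` appears in this page.

```python
# Hypothetical labels for the retryable cases listed earlier.
RETRYABLE = {"upstream_5xx", "upstream_timeout", "transport_error"}

def run_with_fallback(targets, dispatch):
    """Try targets in order; dispatch(target) returns (ok, failure_kind).

    Returns (successful_target_or_None, history_of_failed_attempts).
    """
    history = []
    for target in targets:
        ok, failure = dispatch(target)
        if ok:
            return target, history
        history.append({"target": target, "failure": failure})
        if failure not in RETRYABLE:
            break  # governance, auth, config, and request errors stop the chain
    return None, history
```

A non-retryable failure on the first target leaves the rest of the chain untouched, which is the behavior the list above describes.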

Per-surface fallback support

| Surface | fallback_chain accepted | Same-provider fallback | Cross-provider fallback |
| --- | --- | --- | --- |
| POST /v1/executions | Yes | Yes | Opt-in via allow_cross_provider_fallback; eligible targets filtered before dispatch |
| POST /v1/proxy/openai | Yes | Yes | Opt-in for the implemented non-stream OpenAI/Anthropic text subset |
| POST /v1/proxy/anthropic | Yes | Yes | Opt-in for the implemented non-stream OpenAI/Anthropic text subset |
| POST /v1/proxy/google | Rejected | n/a | n/a |
| POST /v1/proxy/xai | Rejected | n/a | n/a |
| POST /v1/proxy/meta | Rejected | n/a | n/a |
| POST /v1/execute | Not exposed | n/a | n/a |

When a requested cross-provider fallback target on a proxy route cannot be translated cleanly, the route fails closed with the unsupported_cross_provider_failover error rather than attempting an unsafe substitution.

Streaming proxy boundary

Streaming proxy paths keep the direct provider path and do not expose the cross-provider fallback contract. A streaming request that asks for cross-provider fallback proceeds against the originally selected provider; cross-provider reroute requires the non-stream OpenAI/Anthropic text path.

Common routing patterns

The previous sections covered each routing control in isolation. This section shows three worked scenarios that combine controls into typical routing strategies.

Pattern 1 — Same-provider fitness with cheap fallback

Use this pattern when you have a preferred model on a single provider but want Keel to choose a cheaper same-provider model when the task allows, and to fall back to a same-provider alternative on transient failures.

Request to POST /v1/executions:

{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "task_type": "summarization",
    "priority": "cheap",
    "fallback_chain": [
      {"provider": "openai", "model": "gpt-4o-mini"}
    ]
  }
}

What Keel does:

  1. Initial target is openai/gpt-4o.
  2. Routing Fitness V1 looks up scores for summarization on the eligible OpenAI text models. If gpt-4o-mini has the highest score, or fitness does not decide and the priority=cheap tie-break favors it, Keel selects gpt-4o-mini instead.
  3. If the dispatched call fails with a retryable upstream error, Keel advances to the next chain target (when it differs from the already-selected target).
  4. The routing record carries requested_provider=openai, requested_model=gpt-4o, selected_provider=openai, selected_model=gpt-4o-mini, and reason_code=routing_fitness or cost_optimization depending on what drove the change.

This pattern is available on every plan that supports POST /v1/executions. Same-provider fallback does not require the cross-provider opt-in.

Pattern 2 — Cross-provider failover

Use this pattern when you want a resilient backup on a different provider in case the primary provider is unavailable.

Request to POST /v1/executions (Business plan or above required for cross-provider entries):

{
  "routing": {
    "provider": "openai",
    "model": "gpt-4o",
    "allow_cross_provider_fallback": true,
    "fallback_chain": [
      {"provider": "anthropic", "model": "claude-sonnet"}
    ]
  }
}

What Keel does:

  1. Initial target is openai/gpt-4o. The cross-provider fallback target is filtered before dispatch — translation support, capability, mode, and pricing are all checked.
  2. If the OpenAI call fails with a retryable error, Keel translates the original request shape to Anthropic’s contract and dispatches to claude-sonnet.
  3. The routing record carries requested_provider=openai, requested_model=gpt-4o, selected_provider=anthropic, selected_model=claude-sonnet, fallback_occurred=true, and reason_code=fallback_after_error.
  4. A routing.fallback_triggered governance event records the fallback advancement.

Without allow_cross_provider_fallback: true, this same request is rejected because the chain contains a cross-provider entry.

Pattern 3 — Alias resolution on /v1/execute

Use this pattern when callers want to reference a provider/model through a stable alias rather than naming a specific provider/model on every call. The alias is resolved by /v1/execute’s routing-policy resolution.

Setup — a project-scoped routing policy administered through the dashboard:

{
  "alias": "summarizer",
  "operation": "generate.text",
  "priority_hint": "cheap",
  "provider": "openai",
  "model_id": "gpt-4o-mini",
  "rank": 0,
  "is_active": true
}

Request to POST /v1/execute (omits provider, sends the alias as model):

{
  "model": "summarizer",
  "input": {
    "messages": [
      {"role": "user", "content": "Summarize this document in one sentence."}
    ],
    "max_tokens": 80
  }
}

What Keel does:

  1. Routing-policy resolution (Layer 1) walks scopes in order (project, then organization, then global), looking for active routing-policy rows whose alias matches summarizer. The project-scoped row wins.
  2. Project-scoped health and active routing budgets can reorder the candidate set when more than one candidate is in play.
  3. The resolved block on the response carries resolved.alias="summarizer", the winning policy row, and any active health or budget influence.
  4. Execution proceeds against the selected target.

The Layer 1 alias path on /v1/execute does not produce the public routing.reason_code set used by the Layer 2 routing envelope. Its audit surface is the resolved block on the /v1/execute response.

Selection auditability

Every routing-aware request produces a routing record alongside the permit and request-state records. The routing record is the customer-facing audit surface for selection.

Requested versus selected target

The routing record distinguishes:

  • requested_provider — the provider the caller named (or that defaults produced).
  • requested_model — the model the caller named.
  • selected_provider — the provider Keel actually dispatched to.
  • selected_model — the model Keel actually dispatched.

The two pairs match when nothing changed the target between request and dispatch. They differ when:

  • Same-provider model reselection chose a different model on the same provider.
  • Fallback advanced after an execution error and chose a different target.
  • Routing-policy resolution on POST /v1/execute produced a target that differed from the raw top-ranked alias row because of health or budget reordering.

Routing reason codes

Routing decisions surface a routing.reason_code on every routing record. The current public codes are:

| Code | Meaning |
| --- | --- |
| explicit_request | The request named a target and Keel used it without modification. |
| routing_fitness | Same-provider model reselection chose a different model based on Routing Fitness V1 scores for the supplied task_type. |
| cost_optimization | A priority=cheap tie-break drove selection. |
| quality_preference | A priority=quality tie-break drove selection. |
| fallback_after_error | Fallback advanced after an upstream execution failure. |

The routing.reason_code and the permit reason_code are different fields. The permit reason_code describes a governance decision; the routing reason_code describes a selection decision. Both can carry their own values on the same request.

Proxy headers

Proxy responses expose terminal selection in response headers so callers without access to the structured routing record still see how the request was routed:

  • x-keel-provider — the provider Keel dispatched to.
  • x-keel-model — the model Keel dispatched to.
  • x-keel-requested-provider — the provider the caller named.
  • x-keel-requested-model — the model the caller named.
  • x-keel-routing-reason — the routing reason code.
  • x-keel-routing-fallback — true when fallback advanced.
  • x-keel-routing-fallback-attempt-count — the number of fallback attempts before a target succeeded.
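
A caller can collect these headers into a plain dictionary. The sketch below assumes the headers arrive as a string-to-string mapping, that the fallback header carries the literal value `true`, and that absent headers mean no fallback and zero attempts; those defaults are assumptions, not documented behavior.

```python
def parse_keel_headers(headers):
    """Extract routing outcome from proxy response headers (case-insensitive names)."""
    h = {k.lower(): v for k, v in headers.items()}
    attempts = h.get("x-keel-routing-fallback-attempt-count")
    return {
        "provider": h.get("x-keel-provider"),
        "model": h.get("x-keel-model"),
        "requested_provider": h.get("x-keel-requested-provider"),
        "requested_model": h.get("x-keel-requested-model"),
        "reason": h.get("x-keel-routing-reason"),
        "fallback": h.get("x-keel-routing-fallback", "").lower() == "true",
        "fallback_attempts": int(attempts) if attempts is not None else 0,
    }
```

Header names are lowercased first because HTTP header field names are case-insensitive and client libraries differ in how they present them.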

Fallback history

When fallback advances, the routing record carries:

  • routing.fallback_occurred: true.
  • routing.reason_code: "fallback_after_error".
  • A fallback history sequence inside reason_metadata describing each attempted target and the failure that caused the next attempt.
  • A governance event entry for routing.fallback_triggered so fallback advancement is part of the project’s governance record.

reason_metadata boundary

reason_metadata is audit and diagnostic context, not a stable public schema. Its keys can change between Keel releases as Keel adds finer-grained diagnostic context. Two rules apply:

  • Public responses can redact internal scoring or pricing details from reason_metadata.
  • Callers should not treat individual reason_metadata keys as a long-term integration contract.

For stable audit and integration, parse the typed fields above (requested_*, selected_*, reason_code, fallback_occurred). Use reason_metadata for diagnostic display, not for programmatic selection logic.
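
One way to follow that advice is to copy only the typed fields into a small immutable structure and ignore reason_metadata entirely. The sketch assumes the routing record is available as a flat dictionary keyed by the field names above; the actual nesting of the record is not specified here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RoutingSummary:
    requested_provider: str
    requested_model: str
    selected_provider: str
    selected_model: str
    reason_code: str
    fallback_occurred: bool

    @property
    def target_changed(self):
        requested = (self.requested_provider, self.requested_model)
        selected = (self.selected_provider, self.selected_model)
        return requested != selected

def parse_routing_record(record):
    """Keep the typed routing fields; deliberately drop reason_metadata."""
    return RoutingSummary(
        requested_provider=record["requested_provider"],
        requested_model=record["requested_model"],
        selected_provider=record["selected_provider"],
        selected_model=record["selected_model"],
        reason_code=record["reason_code"],
        fallback_occurred=bool(record.get("fallback_occurred", False)),
    )
```

Because reason_metadata never enters the structure, downstream logic cannot accidentally come to depend on its keys.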

Plan tier availability

Routing capability is gated by plan in two places:

| Capability | Gate | Behavior below the gate |
| --- | --- | --- |
| Cross-provider fallback (allow_cross_provider_fallback: true with cross-provider entries in fallback_chain) | Available on Business plans and above (entitlement: routing_cross_provider_enabled) | Cross-provider entries are rejected at request time. |
| Length of fallback_chain | Quota-bounded by plan (entitlement: routing_fallback_chain_max_length) | Starter is bounded to one fallback target. Growth, Business, and Enterprise are unbounded. |

Same-provider fallback, same-provider model reselection, Routing Fitness Registry V1, and routing.reason_code recording are available on every plan on the surfaces that support them. Plan-tier gating affects cross-provider behavior and chain length, not selection auditability.
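
A client that knows its entitlements can check a chain against both gates before sending it. The sketch below is hypothetical: it assumes the entitlements are available as a dictionary keyed by the names above, with a boolean for the cross-provider gate and an integer or `None` (unbounded) for the chain length.

```python
def check_chain_against_plan(initial_provider, chain, entitlements):
    """Return a list of human-readable problems; an empty list means the chain fits."""
    problems = []
    max_len = entitlements.get("routing_fallback_chain_max_length")  # None = unbounded (assumed)
    if max_len is not None and len(chain) > max_len:
        problems.append(f"fallback_chain has {len(chain)} targets; plan allows {max_len}")
    cross = [t for t in chain if t["provider"] != initial_provider]
    if cross and not entitlements.get("routing_cross_provider_enabled", False):
        problems.append("cross-provider entries require routing_cross_provider_enabled")
    return problems
```

This only avoids a round trip for requests that would be rejected anyway; Keel remains the authority on what a plan allows.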

For the full per-tier feature matrix, including plan-quota responses for over-limit requests, see Plans & Entitlements.

What this surface does and does not claim

  • Layer 1 (routing-policy target resolution) runs only on POST /v1/execute and only when provider is omitted. POST /v1/executions does not consult routing-policy aliases.
  • Layer 2 is the public routing envelope on POST /v1/executions and on the OpenAI and Anthropic proxy routes. The Google, xAI, and Meta proxy routes reject public routing.
  • Routing Fitness Registry V1 is same-provider only and currently covers generate.text. It is not a cross-provider scoring system.
  • Fallback never bypasses governance. Permit denials, policy denials, prompt-firewall blocks, authentication failures, and configuration failures stop the request rather than advance through fallback_chain.
  • Streaming proxy paths do not expose the cross-provider fallback contract. Streaming requests that ask for cross-provider fallback proceed against the originally selected provider.
  • reason_metadata is diagnostic state, not a stable public schema. Parse the typed routing fields for integration; treat reason_metadata keys as informational.
  • There is no autonomous fleet-wide provider selection on any public surface today.