Execution Modes

Keel exposes five public integration modes. They are not interchangeable, and the right choice depends on who owns the provider call, how provider-shaped your existing code is, and whether you need the request and response synchronously. This page covers each mode in detail, compares them side-by-side, and shows one task implemented three ways so the trade-offs are concrete.

For the orienting glossary that introduces these modes, see Concepts.

At a glance

| Mode | Route | Provider call by | Provider key held by | Request shape | Response shape | Sync? |
| --- | --- | --- | --- | --- | --- | --- |
| Permit-first | POST /v1/permits (+ usage closeout) | Caller | Caller | Canonical permit body | Permit decision only | Sync |
| Unified execute | POST /v1/execute | Keel | Keel | Provider-shaped input | Normalized envelope + resolved | Sync |
| Provider-neutral execution | POST /v1/executions | Keel | Keel | Canonical operation + messages/inputs | Normalized execution envelope | Sync (stream subset) |
| Provider-specific proxy | POST /v1/proxy/{provider} | Keel | Keel | Provider-native payload | Provider-native response | Sync (stream subset) |
| Async jobs | POST /v1/jobs (+ poll or callback) | Keel (background) | Keel | Canonical permit + provider payload | Job record; result via poll or callback | Async |

Reading the table:

  • Provider call by — who issues the HTTP request to the upstream provider. Keel for managed execution; the caller for permit-first.
  • Provider key held by — where the upstream provider’s API key lives. Keel-managed modes resolve provider credentials server-side; permit-first leaves credentials with the caller.
  • Request shape — the body the caller must construct.
  • Response shape — what the route returns. Normalized envelopes are stable across providers; provider-native preserves provider-specific fields.

Permit-first

Route: POST /v1/permits (decision); POST /v1/permits/{permit_id}/usage (closeout).

Who it is for. Applications that already own a provider integration and want Keel to be the canonical decision and audit boundary without changing transport. Keel makes the policy and budget decision, the caller calls the provider, and the caller then reports closing usage back so the audit record carries observed token counts.

What Keel owns

  • Request authentication and project scoping
  • Canonical permit evaluation (policy, budget, governance event)
  • Idempotent permit persistence
  • Optional later usage closeout through the public usage route

What the caller owns

  • The actual provider call and its credentials
  • Retry and transport behavior
  • Collecting final usage and cost
  • Reporting completed usage back to Keel when desired

Trade-offs. Narrowest integration contract; strongest decoupling from provider transport; least automatic lifecycle evidence unless usage is reported back. Permit-first is the right answer when the caller’s existing provider integration is mature and ripping it out costs more than the additional governance evidence Keel-managed modes would provide.

Current limits

  • Public permit closeout requires verification material that proves execution and a positive billed cost.
  • The prompt firewall does not run on permit-only requests.
  • Caller-executed traffic is outside Keel’s outbound policy because Keel does not make the provider call.
  • Reported usage is trusted by Keel until verification material is attached. For the broader observation boundary, see Scope and Limits § Permit-first observation boundary.
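
The division of ownership above can be sketched as a small orchestration function. This is a minimal sketch, not Keel client code: the three callables are hypothetical stand-ins for the caller's own HTTP and provider clients, and the `decision` and `permit_id` response keys are assumptions about the permit response shape. The usage keys match the closeout body used in the worked example on this page.

```python
from typing import Any, Callable, Dict, Optional

Json = Dict[str, Any]


def run_permit_first(
    permit_body: Json,
    request_permit: Callable[[Json], Json],
    call_provider: Callable[[], Json],
    report_usage: Callable[[str, Json], None],
) -> Optional[Json]:
    """Ask Keel for a decision, call the provider only on allow, then close out."""
    # POST /v1/permits: Keel evaluates policy and budget and returns a decision.
    permit = request_permit(permit_body)

    # ASSUMPTION: the response exposes "decision" and "permit_id" keys.
    if permit.get("decision") != "allow":
        return None  # Not allowed: the provider is never called.

    # The caller owns the provider call, its credentials, and its retries.
    result = call_provider()

    # POST /v1/permits/{permit_id}/usage: report observed usage back to Keel.
    report_usage(
        permit["permit_id"],
        {
            "input_tokens": result["input_tokens"],
            "output_tokens": result["output_tokens"],
            "cost_usd_micros": result["cost_usd_micros"],
        },
    )
    return result
```

Passing the transport in as callables keeps the sketch independent of any HTTP library and makes the "no provider call without an allow" rule easy to unit test.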

Unified execute

Route: POST /v1/execute.

Who it is for. Applications that already have a provider-shaped input payload and want Keel to handle target resolution, permit evaluation, dispatch, and a normalized response. Unified execute is the primary public runtime surface for new integrations.

What Keel owns

  • Target resolution before execution (alias resolution, project-scoped health, active routing budgets)
  • Permit evaluation
  • Provider dispatch through adapters
  • Normalized response envelope plus a resolved metadata block
  • Usage and accounting persistence

What the caller owns

  • Constructing the provider-shaped input payload
  • Choosing whether to send provider/model explicitly or let Keel resolve

Trade-offs. Lower-friction than permit-first when the caller wants Keel to own dispatch; less provider-native than the proxy routes; not interchangeable with POST /v1/executions despite the similar name.

Current limits

  • Target resolution is explicit and rule-based, not a claim of broad autonomous routing.
  • The contract is public and stable, but it is not the same request shape as /v1/executions — see Routing for the per-surface capability matrix.
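
The caller's choice between pinning a target and deferring to Keel can be sketched as a body builder. The top-level `provider`, `model`, and `input` keys mirror the unified execute example on this page; the helper name is hypothetical, and treating omitted keys as "let Keel resolve" is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Optional


def build_execute_body(
    input_payload: Dict[str, Any],
    provider: Optional[str] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a POST /v1/execute body, pinning provider/model only when given."""
    body: Dict[str, Any] = {"input": input_payload}
    # ASSUMPTION: leaving these keys out defers to Keel's target resolution.
    if provider is not None:
        body["provider"] = provider
    if model is not None:
        body["model"] = model
    return body
```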

Provider-neutral execution

Route: POST /v1/executions.

Who it is for. Applications that want a single canonical request and response shape across providers. The execution contract is provider-neutral: the same operation and inputs produce stable, normalized outputs regardless of which provider Keel dispatches to.

What Keel owns

  • Permit formation and evaluation
  • Routing plan binding from the public routing envelope
  • Provider dispatch through adapters
  • Usage reconciliation, ledger writes, and lifecycle persistence

What the caller owns

  • Canonical input construction (operation, messages or inputs, parameters, optional routing)
  • Any client-side retry policy against the Keel API
  • Choosing sync or the current stream subset

Trade-offs. Cleaner than provider-native proxies for multi-provider integrations; narrower than provider-native surfaces — not every provider-specific feature is exposed through this contract. The provider-neutral envelope is the most portable shape Keel offers.

Current limits

  • Streaming is narrower than the non-stream path.
  • Routing and fallback behavior are bounded by the surface’s routing capability matrix.
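
Canonical input construction might look like the sketch below. The parts (operation, messages or inputs, parameters, optional routing) come from the list above, but the exact JSON key names and the "at least one of messages or inputs" check are assumptions; consult the route's schema before relying on them.

```python
from typing import Any, Dict, List, Optional


def build_execution_body(
    operation: str,
    messages: Optional[List[Dict[str, Any]]] = None,
    inputs: Optional[Dict[str, Any]] = None,
    parameters: Optional[Dict[str, Any]] = None,
    routing: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a canonical body for POST /v1/executions (key names assumed)."""
    # ASSUMPTION: a request needs at least one of messages or inputs.
    if messages is None and inputs is None:
        raise ValueError("provide messages or inputs")

    body: Dict[str, Any] = {"operation": operation}
    # Only include the optional parts the caller actually supplied.
    for key, value in (
        ("messages", messages),
        ("inputs", inputs),
        ("parameters", parameters),
        ("routing", routing),
    ):
        if value is not None:
            body[key] = value
    return body
```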

Provider-specific proxy

Routes: POST /v1/proxy/openai, POST /v1/proxy/anthropic, POST /v1/proxy/google, POST /v1/proxy/xai, POST /v1/proxy/meta.

Who it is for. Applications that want to keep provider-native payloads and provider-native responses while Keel governs the request, holds provider credentials, and persists audit evidence. Proxy mode is the right choice when the caller depends on a provider-specific feature that the provider-neutral contract does not expose.

What Keel owns

  • Stripping known transport and auth override fields from the inbound payload
  • Permit evaluation and supported prompt-firewall checks
  • Provider-key lookup and adapter dispatch
  • Idempotency replay and caching on proxy paths
  • Usage and accounting persistence and response headers

What the caller owns

  • Constructing provider-native payloads
  • Understanding provider-specific differences in supported operations and streaming behavior

Trade-offs. Closest to provider semantics, with the most route-specific behavior: proxy routes are not interchangeable with one another, and capability coverage differs materially by provider.

Current limits

  • Public routing is exposed only on the OpenAI and Anthropic proxy routes.
  • Public streaming is currently implemented primarily on the OpenAI proxy.
  • Capability coverage differs by provider — see Proxy Execution.
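
Because proxy routes are not interchangeable, a client may want to fail fast on an unknown provider and check routing support before sending a request. The sketch below encodes only what this page states (the five routes, and public routing on the OpenAI and Anthropic routes); the constant and function names are hypothetical.

```python
# Provider segments for the proxy routes listed on this page.
PROXY_PROVIDERS = ("openai", "anthropic", "google", "xai", "meta")

# Public routing is exposed only on these proxy routes, per the limits above.
ROUTING_PROVIDERS = frozenset({"openai", "anthropic"})


def proxy_path(provider: str) -> str:
    """Return the proxy route path for a provider, rejecting unknown names."""
    if provider not in PROXY_PROVIDERS:
        raise ValueError(f"no proxy route for provider: {provider!r}")
    return f"/v1/proxy/{provider}"


def supports_public_routing(provider: str) -> bool:
    """Report whether this page lists public routing for the provider's proxy."""
    return provider in ROUTING_PROVIDERS
```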

Async jobs

Routes: POST /v1/jobs, GET /v1/jobs/{job_id}.

Who it is for. Applications that want governed execution decoupled from the request/response lifetime — for example, batch processing, long-running generation, or workloads where the caller does not want to hold an HTTP connection open.

What Keel owns

  • Job persistence and queue state
  • Background execution through the shared governance and execution pipeline
  • Usage and accounting persistence
  • Optional best-effort callback delivery

What the caller owns

  • Submitting the canonical permit plus provider payload
  • Polling job status or receiving callbacks
  • Handling eventual completion instead of inline results

Trade-offs. Durable status plus optional callback support; more moving parts than sync routes; distinct job_id and request_id lifecycles.

Current limits

  • Callback URLs must pass outbound-policy validation.
  • Async uses shared governance and execution machinery, but status semantics are job-oriented rather than identical to sync routes.
  • Async job callbacks use bounded in-process retries; callers that need durable outbound delivery should use webhook subscriptions instead.
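
Polling for eventual completion can be sketched as a bounded loop. `fetch_job` is a hypothetical stand-in for a `GET /v1/jobs/{job_id}` call, and the `status` key and the terminal status values are assumptions, since this page does not list the job states.

```python
import time
from typing import Any, Callable, Dict, Tuple


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    terminal_statuses: Tuple[str, ...] = ("completed", "failed"),  # assumed values
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        # ASSUMPTION: the job record carries its state under a "status" key.
        if job.get("status") in terminal_statuses:
            return job
        # Do not sleep after the final attempt; fall through to the timeout.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")
```

The `sleep` parameter is injectable so the loop can be tested without waiting; a production client would likely add backoff and jitter.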

Worked example — one task across three modes

The same task (“summarize a customer support ticket in one sentence”) implemented three ways. The differences are not in the AI work — they are in who owns dispatch and what the caller’s code looks like.

As permit-first

# 1. Ask Keel for a decision.
curl -sS -X POST https://api.keelapi.com/v1/permits \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "<project_uuid>",
    "idempotency_key": "ticket-1234-summary",
    "subject": {"type": "user", "id": "agent_42", "attributes": {}},
    "action": {"name": "ai.generate.summary", "attributes": {}},
    "resource": {
      "type": "request",
      "id": "ticket_1234",
      "attributes": {
        "provider": "openai",
        "model": "gpt-4o-mini",
        "estimated_input_tokens": 800,
        "estimated_output_tokens": 60
      }
    }
  }'

# 2. If the decision is "allow", call OpenAI from your application.

# 3. Report observed usage back to Keel.
curl -sS -X POST https://api.keelapi.com/v1/permits/<permit_id>/usage \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input_tokens": 812,
    "output_tokens": 47,
    "cost_usd_micros": 215
  }'

As unified execute

curl -sS -X POST https://api.keelapi.com/v1/execute \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: ticket-1234-summary" \
  -d '{
    "provider": "openai",
    "model": "gpt-4o-mini",
    "input": {
      "messages": [
        {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
      ],
      "max_tokens": 80,
      "temperature": 0
    }
  }'

Keel resolves the target, evaluates the permit, calls OpenAI, and returns a normalized envelope with both the model output and a resolved block describing the selection.

As provider-specific proxy

curl -sS -X POST https://api.keelapi.com/v1/proxy/openai \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: ticket-1234-summary" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
    ],
    "max_tokens": 80
  }'

The body is the OpenAI-native chat-completion payload. Keel evaluates the permit, dispatches to OpenAI, and returns OpenAI’s native response shape with Keel headers attached for governance correlation.

What changed

  • Permit-first stays simplest if the caller already has an OpenAI client and wants Keel only at the decision boundary. The caller still writes the OpenAI call.
  • Unified execute lets the caller stop maintaining a direct OpenAI dependency and gives back a normalized response that does not change shape if Keel later routes to a different provider.
  • Provider-specific proxy keeps the caller on OpenAI’s native API surface, which is the right choice when the caller depends on an OpenAI-specific feature that the canonical envelope does not expose.
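
The structural difference between the proxy body and the unified execute body in the examples above is small enough to express as a conversion. This sketch is inferred only from those two example bodies: `model` moves to the top level next to `provider`, and the remaining provider-shaped fields move under `input`. The function name is hypothetical, and other payload fields may not map this cleanly.

```python
from typing import Any, Dict


def native_to_execute_body(native: Dict[str, Any], provider: str = "openai") -> Dict[str, Any]:
    """Wrap a provider-native chat payload in the unified execute shape."""
    # Shallow copy so the caller's dict is left untouched.
    payload = dict(native)
    # "model" sits at the top level in the execute example, not under "input".
    model = payload.pop("model")
    return {"provider": provider, "model": model, "input": payload}
```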

How to choose

| If you… | Choose |
| --- | --- |
| Already have a working provider integration and want only governance | Permit-first |
| Want Keel to dispatch and you have provider-shaped input ready | Unified execute |
| Want a single canonical request/response shape across providers | Provider-neutral execution |
| Need provider-native fields not exposed by the canonical contract | Provider-specific proxy |
| Want background execution with polling or callback delivery | Async jobs |

For new integrations starting from scratch, unified execute is the recommended default: it is the lowest-friction mode that still carries full Keel-managed evidence. Permit-first is the right choice when an existing provider integration is too valuable to rewrite. Provider-specific proxy is the escape hatch when a provider-specific feature is essential.
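
The selection guidance can be sketched as a small decision function. The mode names and the unified-execute default come from this page; the `Needs` fields and the precedence order when several needs apply at once are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical summary of what an integration requires."""
    needs_background_execution: bool = False
    keeps_existing_provider_integration: bool = False
    needs_provider_native_fields: bool = False
    wants_canonical_shape: bool = False


def choose_mode(needs: Needs) -> str:
    """Map integration needs to a mode; precedence order is an assumption."""
    if needs.needs_background_execution:
        return "Async jobs"
    if needs.keeps_existing_provider_integration:
        return "Permit-first"
    if needs.needs_provider_native_fields:
        return "Provider-specific proxy"
    if needs.wants_canonical_shape:
        return "Provider-neutral execution"
    # Recommended default for new integrations starting from scratch.
    return "Unified execute"
```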

What this surface does and does not claim

  • The five modes are not interchangeable. Each makes a different trade-off; choose by who owns dispatch, what shape your existing code uses, and whether you need sync results.
  • Permit-first does not directly observe the provider call. Reported usage is trusted; the verification track is the bridge to provider-side receipt evidence.
  • Public realtime session APIs do not exist today. Realtime session scaffolding exists internally for timeline-replay continuity; it is not a public integration mode.
  • Mode availability is not plan-gated — every plan that supports a route can use the mode. Some capabilities inside a mode are plan-gated: see Plans & Entitlements for cross-provider routing, prompt-firewall strengthening, integrity verification API, and other gated features.