Execution Modes
Keel exposes five public integration modes. They are not interchangeable, and the right choice depends on who owns the provider call, how provider-shaped your existing code is, and whether you need the request and response synchronously. This page covers each mode in detail, compares them side-by-side, and shows one task implemented three ways so the trade-offs are concrete.
For the orienting glossary that introduces these modes, see Concepts.
At a glance
| Mode | Route | Provider call by | Provider key held by | Request shape | Response shape | Sync? |
|---|---|---|---|---|---|---|
| Permit-first | POST /v1/permits (+ usage closeout) | Caller | Caller | Canonical permit body | Permit decision only | Sync |
| Unified execute | POST /v1/execute | Keel | Keel | Provider-shaped input | Normalized envelope + resolved | Sync |
| Provider-neutral execution | POST /v1/executions | Keel | Keel | Canonical operation + messages/inputs | Normalized execution envelope | Sync (stream subset) |
| Provider-specific proxy | POST /v1/proxy/{provider} | Keel | Keel | Provider-native payload | Provider-native response | Sync (stream subset) |
| Async jobs | POST /v1/jobs (+ poll or callback) | Keel (background) | Keel | Canonical permit + provider payload | Job record; result via poll or callback | Async |
Reading the table:
- Provider call by — who issues the HTTP request to the upstream provider. Keel for managed execution; the caller for permit-first.
- Provider key held by — where the upstream provider’s API key lives. Keel-managed modes resolve provider credentials server-side; permit-first leaves credentials with the caller.
- Request shape — the body the caller must construct.
- Response shape — what the route returns. Normalized envelopes are stable across providers; provider-native preserves provider-specific fields.
Permit-first
Route: POST /v1/permits (decision); POST /v1/permits/{permit_id}/usage (closeout).
Who it is for. Applications that already own a provider integration and want Keel to be the canonical decision and audit boundary without changing transport. Keel makes the policy and budget decision, the caller calls the provider, and the caller can then report final usage back so the audit record carries observed token counts.
What Keel owns
- Request authentication and project scoping
- Canonical permit evaluation (policy, budget, governance event)
- Idempotent permit persistence
- Optional later usage closeout through the public usage route
What the caller owns
- The actual provider call and its credentials
- Retry and transport behavior
- Collecting final usage and cost
- Reporting completed usage back to Keel when desired
Trade-offs. Narrowest integration contract; strongest decoupling from provider transport; least automatic lifecycle evidence unless usage is reported back. Permit-first is the right answer when the caller’s existing provider integration is mature and replacing it would cost more than the additional governance evidence from Keel-managed modes is worth.
Current limits
- Public permit closeout requires verification material that proves execution and a positive billed cost.
- The prompt firewall does not run on permit-only requests.
- Caller-executed traffic is outside Keel’s outbound policy because Keel does not make the provider call.
- Reported usage is trusted by Keel until verification material is attached. For the broader observation boundary, see Scope and Limits § Permit-first observation boundary.
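The decision boundary described in this section can be sketched in shell. This is a minimal illustration under stated assumptions, not Keel's client code: the response field name decision and the grep-based parsing are assumptions, and call_provider is a hypothetical stand-in for the caller's existing provider integration.

```shell
# Sketch: gate a caller-owned provider call on a Keel permit decision.
# ASSUMPTION: the permit response carries a "decision" string whose value is
# "allow" when the call may proceed. This page does not confirm the exact
# field name, so adjust the pattern to the real response shape.

# Returns 0 when the JSON on stdin contains "decision": "allow".
permit_allows() {
  grep -q '"decision"[[:space:]]*:[[:space:]]*"allow"'
}

# Hypothetical placeholder for the application's existing provider client.
call_provider() {
  echo "provider called for $1"
}

# $1 = permit response JSON, $2 = request identifier
run_if_allowed() {
  if printf '%s' "$1" | permit_allows; then
    call_provider "$2"
  else
    echo "denied: $2" >&2
    return 1
  fi
}
```

The point of the sketch is ordering: the provider call happens only after Keel's decision has been read, and any usage closeout would follow the provider call.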
Unified execute
Route: POST /v1/execute.
Who it is for. Applications that already have a provider-shaped input payload and want Keel to handle target resolution, permit evaluation, dispatch, and a normalized response. Unified execute is the primary public runtime surface for new integrations.
What Keel owns
- Target resolution before execution (alias resolution, project-scoped health, active routing budgets)
- Permit evaluation
- Provider dispatch through adapters
- Normalized response envelope plus a resolved metadata block
- Usage and accounting persistence
What the caller owns
- Constructing the provider-shaped input payload
- Choosing whether to send provider/model explicitly or let Keel resolve
Trade-offs. Lower-friction than permit-first when the caller wants Keel to own dispatch; less provider-native than the proxy routes; not interchangeable with POST /v1/executions despite the similar name.
Current limits
- Target resolution is explicit and rule-based, not a claim of broad autonomous routing.
- The contract is public and stable, but it is not the same request shape as /v1/executions; see Routing for the per-surface capability matrix.
Provider-neutral execution
Route: POST /v1/executions.
Who it is for. Applications that want a single canonical request and response shape across providers. The execution contract is provider-neutral: the same operation and inputs produce stable, normalized outputs regardless of which provider Keel dispatches to.
What Keel owns
- Permit formation and evaluation
- Routing plan binding from the public routing envelope
- Provider dispatch through adapters
- Usage reconciliation, ledger writes, and lifecycle persistence
What the caller owns
- Canonical input construction (operation, messages or inputs, parameters, optional routing)
- Any client-side retry policy against the Keel API
- Choosing sync or the current stream subset
Trade-offs. Cleaner than provider-native proxies for multi-provider integrations; narrower than provider-native surfaces — not every provider-specific feature is exposed through this contract. The provider-neutral envelope is the most portable shape Keel offers.
Current limits
- Streaming is narrower than the non-stream path.
- Routing and fallback behavior are bounded by the surface’s routing capability matrix.
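Because the worked example later on this page does not cover this mode, here is a hedged sketch of what a canonical request body might look like. The top-level field names (operation, messages, parameters) come from this page; the operation value is left as a placeholder, and the contents of parameters and the message shape are assumptions borrowed from the other examples. The optional routing envelope is omitted.

```shell
# Sketch: build a canonical body for POST /v1/executions.
# ASSUMPTIONS: "<operation_name>" is a placeholder (this page does not list
# valid operation values), and the keys inside "parameters" are illustrative.
# $1 = ticket text, assumed to contain no characters that need JSON escaping.
build_execution_body() {
  cat <<EOF
{
  "operation": "<operation_name>",
  "messages": [
    {"role": "user", "content": "Summarize this ticket in one sentence: $1"}
  ],
  "parameters": {"max_tokens": 80, "temperature": 0}
}
EOF
}

# Possible usage (not run here):
# build_execution_body "<ticket-body>" | curl -sS -X POST https://api.keelapi.com/v1/executions \
#   -H "Authorization: Bearer keel_sk_<project_key>" \
#   -H "Content-Type: application/json" \
#   --data-binary @-
```

Unlike unified execute, nothing in this body names a provider-shaped input; which provider handles the request is left to Keel and the routing envelope.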
Provider-specific proxy
Routes: POST /v1/proxy/openai, POST /v1/proxy/anthropic, POST /v1/proxy/google, POST /v1/proxy/xai, POST /v1/proxy/meta.
Who it is for. Applications that want to keep provider-native payloads and provider-native responses while Keel governs the request, holds provider credentials, and persists audit evidence. Proxy mode is the right choice when the caller depends on a provider-specific feature that the provider-neutral contract does not expose.
What Keel owns
- Stripping known transport and auth override fields from the inbound payload
- Permit evaluation and supported prompt-firewall checks
- Provider-key lookup and adapter dispatch
- Idempotency replay and caching on proxy paths
- Usage and accounting persistence and response headers
What the caller owns
- Constructing provider-native payloads
- Understanding provider-specific differences in supported operations and streaming behavior
Trade-offs. Closest to provider semantics; most route-specific behavior: proxy routes are not interchangeable with one another, and capability coverage differs materially by provider.
Current limits
- Public routing is exposed only on the OpenAI and Anthropic proxy routes.
- Public streaming is currently primarily implemented on the OpenAI proxy.
- Capability coverage differs by provider — see Proxy Execution.
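Since proxy routes are not interchangeable, a caller may want a small client-side guard before building a request. The sketch below encodes only what this section states (the five listed routes and the routing limit); the function names are hypothetical helpers, not part of Keel.

```shell
# Sketch: map a provider name to its proxy route, using the routes listed
# in this section. Unknown providers are rejected rather than guessed.
proxy_route() {
  case "$1" in
    openai|anthropic|google|xai|meta) echo "/v1/proxy/$1" ;;
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

# Per the current limits above, public routing is exposed only on the
# OpenAI and Anthropic proxy routes.
proxy_supports_routing() {
  case "$1" in
    openai|anthropic) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, proxy_route anthropic prints /v1/proxy/anthropic, and proxy_supports_routing google returns a non-zero status.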
Async jobs
Routes: POST /v1/jobs, GET /v1/jobs/{job_id}.
Who it is for. Applications that want governed execution decoupled from the request/response lifetime — for example, batch processing, long-running generation, or workloads where the caller does not want to hold an HTTP connection open.
What Keel owns
- Job persistence and queue state
- Background execution through the shared governance and execution pipeline
- Usage and accounting persistence
- Optional best-effort callback delivery
What the caller owns
- Submitting the canonical permit plus provider payload
- Polling job status or receiving callbacks
- Handling eventual completion instead of inline results
Trade-offs. Durable status plus optional callback support; more moving parts than sync routes; distinct job_id and request_id lifecycles.
Current limits
- Callback URLs must pass outbound-policy validation.
- Async uses shared governance and execution machinery, but status semantics are job-oriented rather than identical to sync routes.
- Async job callbacks use bounded in-process retries; callers that need durable outbound delivery should use webhook subscriptions instead.
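Handling eventual completion usually means a bounded polling loop. The following is a minimal sketch, assuming the job record exposes a status string with terminal values succeeded and failed; neither the field name nor the values are confirmed on this page. The fetch command is injected so the loop can be exercised without network access; a real one would call GET /v1/jobs/{job_id} and print the status.

```shell
# Sketch: poll a job until it reaches a terminal state or attempts run out.
# ASSUMPTION: terminal statuses are "succeeded" and "failed" (hypothetical).
# $1 = job id
# $2 = name of a command that prints the current status for a job id
# $3 = maximum attempts (default 10)
# $4 = delay between attempts in seconds (default 2)
poll_job() {
  job_id=$1
  fetch=$2
  max_attempts=${3:-10}
  delay_seconds=${4:-2}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$fetch" "$job_id")
    case "$status" in
      succeeded|failed)
        printf '%s\n' "$status"
        return 0
        ;;
    esac
    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 1
}
```

Bounding the attempts keeps the caller from polling forever if a job never reaches a terminal state. Callers that prefer push delivery would rely on the callback instead.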
Worked example — one task across three modes
The same task (“summarize a customer support ticket in one sentence”) implemented three ways. The differences are not in the AI work — they are in who owns dispatch and what the caller’s code looks like.
As permit-first

    # 1. Ask Keel for a decision.
    curl -sS -X POST https://api.keelapi.com/v1/permits \
      -H "Authorization: Bearer keel_sk_<project_key>" \
      -H "Content-Type: application/json" \
      -d '{
        "project_id": "<project_uuid>",
        "idempotency_key": "ticket-1234-summary",
        "subject": {"type": "user", "id": "agent_42", "attributes": {}},
        "action": {"name": "ai.generate.summary", "attributes": {}},
        "resource": {
          "type": "request",
          "id": "ticket_1234",
          "attributes": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "estimated_input_tokens": 800,
            "estimated_output_tokens": 60
          }
        }
      }'
    # 2. If the decision is "allow", call OpenAI from your application.
    # 3. Report observed usage back to Keel.
    curl -sS -X POST https://api.keelapi.com/v1/permits/<permit_id>/usage \
      -H "Authorization: Bearer keel_sk_<project_key>" \
      -H "Content-Type: application/json" \
      -d '{
        "input_tokens": 812,
        "output_tokens": 47,
        "cost_usd_micros": 215
      }'

As unified execute

    curl -sS -X POST https://api.keelapi.com/v1/execute \
      -H "Authorization: Bearer keel_sk_<project_key>" \
      -H "Content-Type: application/json" \
      -H "Idempotency-Key: ticket-1234-summary" \
      -d '{
        "provider": "openai",
        "model": "gpt-4o-mini",
        "input": {
          "messages": [
            {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
          ],
          "max_tokens": 80,
          "temperature": 0
        }
      }'

Keel resolves the target, evaluates the permit, calls OpenAI, and returns a normalized envelope with both the model output and a resolved block describing the selection.
As provider-specific proxy

    curl -sS -X POST https://api.keelapi.com/v1/proxy/openai \
      -H "Authorization: Bearer keel_sk_<project_key>" \
      -H "Content-Type: application/json" \
      -H "Idempotency-Key: ticket-1234-summary" \
      -d '{
        "model": "gpt-4o-mini",
        "messages": [
          {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
        ],
        "max_tokens": 80
      }'

The body is the OpenAI-native chat-completion payload. Keel evaluates the permit, dispatches to OpenAI, and returns OpenAI’s native response shape with Keel headers attached for governance correlation.
What changed
- Permit-first stays simplest if the caller already has an OpenAI client and wants Keel only at the decision boundary. The caller still writes the OpenAI call.
- Unified execute lets the caller stop maintaining a direct OpenAI dependency and gives back a normalized response that does not change shape if Keel later routes to a different provider.
- Provider-specific proxy keeps the caller on OpenAI’s native API surface, which is the right choice when the caller depends on an OpenAI-specific feature that the canonical envelope does not expose.
How to choose
| If you… | Choose |
|---|---|
| Already have a working provider integration and want only governance | Permit-first |
| Want Keel to dispatch and you have provider-shaped input ready | Unified execute |
| Want a single canonical request/response shape across providers | Provider-neutral execution |
| Need provider-native fields not exposed by the canonical contract | Provider-specific proxy |
| Want background execution with polling or callback delivery | Async jobs |
For new integrations starting from scratch, unified execute is the recommended default — it has the lowest friction with full Keel-managed evidence. Permit-first is the right choice when an existing provider integration is too valuable to rewrite. Provider-specific proxy is the escape hatch when a provider-specific feature is essential.
What this surface does and does not claim
- The five modes are not interchangeable. Each makes a different trade-off; choose by who owns dispatch, what shape your existing code uses, and whether you need sync results.
- Permit-first does not directly observe the provider call. Reported usage is trusted; the verification track is the bridge to provider-side receipt evidence.
- Public realtime session APIs do not exist today. Realtime session scaffolding exists internally for timeline-replay continuity; it is not a public integration mode.
- Mode availability is not plan-gated — every plan that supports a route can use the mode. Some capabilities inside a mode are plan-gated: see Plans & Entitlements for cross-provider routing, prompt-firewall strengthening, integrity verification API, and other gated features.
Related pages
- Concepts — orienting glossary
- Quickstart — send a first governed request
- Permits — the permit record contract
- Executions — provider-neutral execution surface
- Execute — unified execute surface
- Proxy Execution — provider-native proxy surfaces
- Routing — per-surface routing capability matrix
- Plans & Entitlements — capabilities gated within each mode