Execution Modes

Keel exposes five public integration modes. They are not interchangeable: the right choice depends on who owns the provider call, how provider-shaped your existing code is, and whether you need the result synchronously. This page covers each mode in detail, compares them side by side, and shows one task implemented three ways so the trade-offs are concrete.

For the orienting glossary that introduces these modes, see Concepts.

At a glance

Mode	Route	Provider call by	Provider key held by	Request shape	Response shape	Sync?
Permit-first	`POST /v1/permits` (+ usage closeout)	Caller	Caller	Canonical permit body	Permit decision only	Sync
Unified execute	`POST /v1/execute`	Keel	Keel	Provider-shaped `input`	Normalized envelope + `resolved`	Sync
Provider-neutral execution	`POST /v1/executions`	Keel	Keel	Canonical operation + messages/inputs	Normalized execution envelope	Sync (stream subset)
Provider-specific proxy	`POST /v1/proxy/{provider}`	Keel	Keel	Provider-native payload	Provider-native response	Sync (stream subset)
Async jobs	`POST /v1/jobs` (+ poll or callback)	Keel (background)	Keel	Canonical permit + provider payload	Job record; result via poll or callback	Async

Reading the table:

Provider call by — who issues the HTTP request to the upstream provider. Keel for managed execution; the caller for permit-first.
Provider key held by — where the upstream provider’s API key lives. Keel-managed modes resolve provider credentials server-side; permit-first leaves credentials with the caller.
Request shape — the body the caller must construct.
Response shape — what the route returns. Normalized envelopes are stable across providers; provider-native preserves provider-specific fields.

Permit-first

Route: POST /v1/permits (decision); POST /v1/permits/{permit_id}/usage (closeout).

Who it is for. Applications that already own a provider integration and want Keel to be the canonical decision and audit boundary without changing transport. Keel makes the policy and budget decision, the caller calls the provider, and the caller then reports closing usage back so the audit record carries observed token counts.

What Keel owns

Request authentication and project scoping
Canonical permit evaluation (policy, budget, governance event)
Idempotent permit persistence
Optional later usage closeout through the public usage route

What the caller owns

The actual provider call and its credentials
Retry and transport behavior
Collecting final usage and cost
Reporting completed usage back to Keel when desired

Trade-offs. Narrowest integration contract; strongest decoupling from provider transport; least automatic lifecycle evidence unless usage is reported back. Permit-first is the right answer when the caller’s existing provider integration is mature and replacing it would cost more than the additional governance evidence from Keel-managed modes is worth.

Current limits

Public permit closeout requires verification material that proves execution and a positive billed cost.
The prompt firewall does not run on permit-only requests.
Caller-executed traffic is outside Keel’s outbound policy because Keel does not make the provider call.
Reported usage is trusted by Keel until verification material is attached. For the broader observation boundary, see Scope and Limits § Permit-first observation boundary.

Unified execute

Route: POST /v1/execute.

Who it is for. Applications that already have a provider-shaped input payload and want Keel to handle target resolution, permit evaluation, dispatch, and a normalized response. Unified execute is the primary public runtime surface for new integrations.

What Keel owns

Target resolution before execution (alias resolution, project-scoped health, active routing budgets)
Permit evaluation
Provider dispatch through adapters
Normalized response envelope plus a resolved metadata block
Usage and accounting persistence

What the caller owns

Constructing the provider-shaped input payload
Choosing whether to send provider/model explicitly or let Keel resolve

Trade-offs. Lower friction than permit-first when the caller wants Keel to own dispatch; less provider-native than the proxy routes; not interchangeable with POST /v1/executions despite the similar name.

Current limits

Target resolution is explicit and rule-based, not a claim of broad autonomous routing.
The contract is public and stable, but it is not the same request shape as /v1/executions — see Routing for the per-surface capability matrix.

Provider-neutral execution

Route: POST /v1/executions.

Who it is for. Applications that want a single canonical request and response shape across providers. The execution contract is provider-neutral: the same operation and inputs produce stable, normalized outputs regardless of which provider Keel dispatches to.

What Keel owns

Permit formation and evaluation
Routing plan binding from the public routing envelope
Provider dispatch through adapters
Usage reconciliation, ledger writes, and lifecycle persistence

What the caller owns

Canonical input construction (operation, messages or inputs, parameters, optional routing)
Any client-side retry policy against the Keel API
Choosing sync or the current stream subset

Trade-offs. Cleaner than provider-native proxies for multi-provider integrations; narrower than provider-native surfaces — not every provider-specific feature is exposed through this contract. The provider-neutral envelope is the most portable shape Keel offers.

Current limits

Streaming is narrower than the non-stream path.
Routing and fallback behavior are bounded by the surface’s routing capability matrix.
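
For comparison with the worked examples later on this page, a request to this route might look like the sketch below. The key names `operation`, `messages`, and `parameters` mirror the terms used in the caller-owned list above, but their exact spelling, the `generate.text` operation value, the helper names, and the `KEEL_API_KEY` environment variable are assumptions for illustration. The Executions page is the authority on the actual contract.

```shell
# Hypothetical request body for POST /v1/executions.
# Key names and the operation value are illustrative assumptions,
# not the confirmed Keel schema.
execution_body() {
  cat <<'JSON'
{
  "operation": "generate.text",
  "messages": [
    {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
  ],
  "parameters": {"max_tokens": 80, "temperature": 0}
}
JSON
}

# Posts the body to the provider-neutral route.
# Assumes the project key is exported as KEEL_API_KEY (assumed variable name).
send_execution() {
  execution_body | curl -sS -X POST https://api.keelapi.com/v1/executions \
    -H "Authorization: Bearer ${KEEL_API_KEY}" \
    -H "Content-Type: application/json" \
    -d @-
}
```

Nothing is sent until `send_execution` is called. Note that the body names no provider: under this contract the caller describes the operation and leaves dispatch to Keel.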

Provider-specific proxy

Routes: POST /v1/proxy/openai, POST /v1/proxy/anthropic, POST /v1/proxy/google, POST /v1/proxy/xai, POST /v1/proxy/meta.

Who it is for. Applications that want to keep provider-native payloads and provider-native responses while Keel governs the request, holds provider credentials, and persists audit evidence. Proxy mode is the right choice when the caller depends on a provider-specific feature that the provider-neutral contract does not expose.

What Keel owns

Stripping known transport and auth override fields from the inbound payload
Permit evaluation and supported prompt-firewall checks
Provider-key lookup and adapter dispatch
Idempotency replay and caching on proxy paths
Usage and accounting persistence and response headers

What the caller owns

Constructing provider-native payloads
Understanding provider-specific differences in supported operations and streaming behavior

Trade-offs. Closest to provider semantics; most route-specific behavior: proxy routes are not interchangeable with one another. Capability coverage differs materially by provider.

Current limits

Public routing is exposed only on the OpenAI and Anthropic proxy routes.
Public streaming is currently implemented primarily on the OpenAI proxy.
Capability coverage differs by provider — see Proxy Execution.

Async jobs

Routes: POST /v1/jobs, GET /v1/jobs/{job_id}.

Who it is for. Applications that want governed execution decoupled from the request/response lifetime — for example, batch processing, long-running generation, or workloads where the caller does not want to hold an HTTP connection open.

What Keel owns

Job persistence and queue state
Background execution through the shared governance and execution pipeline
Usage and accounting persistence
Optional best-effort callback delivery

What the caller owns

Submitting the canonical permit plus provider payload
Polling job status or receiving callbacks
Handling eventual completion instead of inline results

Trade-offs. Durable status plus optional callback support; more moving parts than sync routes; distinct job_id and request_id lifecycles.

Current limits

Callback URLs must pass outbound-policy validation.
Async uses shared governance and execution machinery, but status semantics are job-oriented rather than identical to sync routes.
Async job callbacks use bounded in-process retries; callers that need durable outbound delivery should use webhook subscriptions instead.
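
Polling against GET /v1/jobs/{job_id} can be sketched as a bounded loop with backoff. This is a minimal sketch under stated assumptions: the job record is taken to be JSON with a string field named `status`, the terminal values `succeeded` and `failed` are hypothetical, and `KEEL_API_KEY` and the helper names are illustrative. The sed-based extraction stands in for a real JSON parser.

```shell
# Fetches the job record from GET /v1/jobs/{job_id}.
# KEEL_API_KEY is an assumed environment variable holding the project key.
fetch_job() {
  curl -sS "https://api.keelapi.com/v1/jobs/$1" \
    -H "Authorization: Bearer ${KEEL_API_KEY}"
}

# Pulls a "status" string out of the job record on stdin.
# The field name is an assumption about the response shape.
job_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage: poll_job <job_id> [max_attempts] [initial_delay_seconds]
# Prints the terminal status and returns 0, or prints "timeout" and returns 1.
poll_job() {
  job_id=$1
  max_attempts=${2:-10}
  delay=${3:-2}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(fetch_job "$job_id" | job_status)
    case "$status" in
      succeeded|failed)   # hypothetical terminal states
        echo "$status"
        return 0
        ;;
    esac
    sleep "$delay"
    # Double the wait between attempts, capped at 30 seconds.
    delay=$((delay * 2))
    if [ "$delay" -gt 30 ]; then delay=30; fi
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 1
}
```

A caller would run `poll_job "<job_id>"` after submitting to POST /v1/jobs. Bounding the attempts keeps the caller from waiting indefinitely on a job whose status never becomes terminal.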

Worked example — one task across three modes

The same task (“summarize a customer support ticket in one sentence”) implemented three ways. The differences are not in the AI work — they are in who owns dispatch and what the caller’s code looks like.

As permit-first


# 1. Ask Keel for a decision.
curl -sS -X POST https://api.keelapi.com/v1/permits \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "<project_uuid>",
    "idempotency_key": "ticket-1234-summary",
    "subject": {"type": "user", "id": "agent_42", "attributes": {}},
    "action": {"name": "ai.generate.summary", "attributes": {}},
    "resource": {
      "type": "request",
      "id": "ticket_1234",
      "attributes": {
        "provider": "openai",
        "model": "gpt-4o-mini",
        "estimated_input_tokens": 800,
        "estimated_output_tokens": 60
      }
    }
  }'
 
# 2. If the decision is "allow", call OpenAI from your application.
# 3. Report observed usage back to Keel.
curl -sS -X POST https://api.keelapi.com/v1/permits/<permit_id>/usage \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input_tokens": 812,
    "output_tokens": 47,
    "cost_usd_micros": 215
  }'
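
The branch in step 2 can be sketched in shell. This is a minimal sketch rather than Keel's client: it assumes the permit response is JSON with string fields named `decision` and `permit_id`, which this page does not specify, and the helper names `json_string_field` and `handle_permit` are hypothetical. The sed-based field extraction is a stand-in for a real JSON parser.

```shell
# Extracts a string field from a JSON document on stdin.
# Deliberately small; a real integration would use a proper JSON parser.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Decides what the caller should do with a permit response.
# Assumes (hypothetically) that the response carries "decision"
# and "permit_id" string fields.
handle_permit() {
  response=$1
  decision=$(printf '%s' "$response" | json_string_field decision)
  permit_id=$(printf '%s' "$response" | json_string_field permit_id)
  if [ "$decision" = "allow" ]; then
    # Only an explicit "allow" leads to the provider call.
    echo "call-provider $permit_id"
  else
    # Anything else, including a missing decision, skips the call.
    echo "skip-provider ${decision:-unknown}"
  fi
}
```

Passing the body returned by the first curl command to `handle_permit` yields either `call-provider <permit_id>`, after which the application calls OpenAI and reports usage against that permit, or `skip-provider <decision>`. Treating a missing decision as a skip keeps the caller from issuing an ungoverned provider call when the response cannot be read.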

As unified execute


curl -sS -X POST https://api.keelapi.com/v1/execute \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: ticket-1234-summary" \
  -d '{
    "provider": "openai",
    "model": "gpt-4o-mini",
    "input": {
      "messages": [
        {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
      ],
      "max_tokens": 80,
      "temperature": 0
    }
  }'

Keel resolves the target, evaluates the permit, calls OpenAI, and returns a normalized envelope with both the model output and a resolved block describing the selection.

As provider-specific proxy


curl -sS -X POST https://api.keelapi.com/v1/proxy/openai \
  -H "Authorization: Bearer keel_sk_<project_key>" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: ticket-1234-summary" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Summarize this ticket in one sentence: <ticket-body>"}
    ],
    "max_tokens": 80
  }'

The body is the OpenAI-native chat-completion payload. Keel evaluates the permit, dispatches to OpenAI, and returns OpenAI’s native response shape with Keel headers attached for governance correlation.

What changed

Permit-first stays simplest if the caller already has an OpenAI client and wants Keel only at the decision boundary. The caller still writes the OpenAI call.
Unified execute lets the caller stop maintaining a direct OpenAI dependency and gives back a normalized response that does not change shape if Keel later routes to a different provider.
Provider-specific proxy keeps the caller on OpenAI’s native API surface, which is the right choice when the caller depends on an OpenAI-specific feature that the canonical envelope does not expose.

How to choose

If you…	Choose
Already have a working provider integration and want only governance	Permit-first
Want Keel to dispatch and you have provider-shaped input ready	Unified execute
Want a single canonical request/response shape across providers	Provider-neutral execution
Need provider-native fields not exposed by the canonical contract	Provider-specific proxy
Want background execution with polling or callback delivery	Async jobs

For new integrations starting from scratch, unified execute is the recommended default — it has the lowest friction with full Keel-managed evidence. Permit-first is the right choice when an existing provider integration is too valuable to rewrite. Provider-specific proxy is the escape hatch when a provider-specific feature is essential.

What this surface does and does not claim

The five modes are not interchangeable. Each makes a different trade-off; choose by who owns dispatch, what shape your existing code uses, and whether you need sync results.
Permit-first does not directly observe the provider call. Reported usage is trusted; the verification track is the bridge to provider-side receipt evidence.
Public realtime session APIs do not exist today. Realtime session scaffolding exists internally for timeline-replay continuity; it is not a public integration mode.
Mode availability is not plan-gated — every plan that supports a route can use the mode. Some capabilities inside a mode are plan-gated: see Plans & Entitlements for cross-provider routing, prompt-firewall strengthening, integrity verification API, and other gated features.

Related pages

Concepts — orienting glossary
Quickstart — send a first governed request
Permits — the permit record contract
Executions — provider-neutral execution surface
Execute — unified execute surface
Proxy Execution — provider-native proxy surfaces
Routing — per-surface routing capability matrix
Plans & Entitlements — capabilities gated within each mode
