Decision Model

Every request evaluated by Keel produces a permit — a persisted decision record that captures outcome, reason, constraints, and budget state at evaluation time.

POST /v1/permits exposes this decision seam directly. All managed execution surfaces (/v1/execute, /v1/executions, /v1/proxy/*) apply the same permit evaluation before provider dispatch. The permit is the durable governance record regardless of which surface originated the request.

Outcomes

Keel’s policy engine produces four outcomes:

| Outcome | Meaning | HTTP signal |
| --- | --- | --- |
| allow | Request may proceed, possibly with accumulated constraints | 200 |
| deny | Request is rejected | 200 on permits; 403 on execution routes |
| challenge | Request requires human review or attestation | 200 on permits |
| throttle | Request is rate-throttled | 200 on permits; 429 with Retry-After on execution routes |

throttle is a first-class outcome, not a variant of deny. It carries retry_after_seconds inside reason_detail.outcome_detail, which Keel uses as the source for the HTTP Retry-After header. Unlike deny, throttled requests may be retried after the specified delay.
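As a rough illustration, the outcome-to-status mapping can be written as a small lookup. This is a sketch based only on the statuses listed on this page; the function name and the "permits" / "execution" surface labels are hypothetical, and the status for a challenge on execution routes is left undefined because this page does not state it.

```python
from typing import Optional

OUTCOMES = {"allow", "deny", "challenge", "throttle"}

# Statuses this page lists for execution routes. "challenge" is omitted
# on purpose: its execution-route status is not documented here.
EXECUTION_STATUS = {"allow": 200, "deny": 403, "throttle": 429}


def expected_http_status(outcome: str, surface: str) -> Optional[int]:
    """Return the HTTP status this page documents, or None if unstated.

    `surface` is a hypothetical label: "permits" for POST /v1/permits,
    "execution" for /v1/execute, /v1/executions, and /v1/proxy/*.
    """
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    if surface == "permits":
        # Permit decisions do not branch HTTP status.
        return 200
    if surface == "execution":
        return EXECUTION_STATUS.get(outcome)
    raise ValueError(f"unknown surface: {surface!r}")
```

A client that talks to both kinds of surface can use a table like this to decide whether to inspect the status code or the permit body.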

What Keel evaluates

For every permit request, Keel considers:

  • Policy rows — active rules authored for the project or its organization
  • Cost controls — budget caps, rate limits, spike guards, and threshold guardrails
  • Platform preconditions — whether the requested operation has pricing support and billing access
  • Platform safety nets — governed-request quota and overage enforcement

Policy rows are the primary authoring surface. Cost controls can be expressed as policy rules or as project-level configuration. Platform preconditions run independently of authored rules.

Policy scope

When a project has active policy rows, those rows are evaluated. When a project has no active rows but belongs to an organization with active organization-scoped rows, the organization rows apply. Project rows and organization rows do not stack — project rows take precedence.
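The precedence rule can be sketched as follows. The row shape (a dict with an "active" flag) and the function name are assumptions made for illustration, not Keel's storage model.

```python
def select_scope(project_rows: list, org_rows: list) -> list:
    """Pick the rows to evaluate: project rows win, otherwise org rows.

    Rows are hypothetical dicts carrying an "active" boolean.
    Project and organization rows are never combined.
    """
    active_project = [row for row in project_rows if row.get("active")]
    if active_project:
        return active_project
    return [row for row in org_rows if row.get("active")]
```

Under this sketch, a project with even one active row never sees organization rows.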

Within the evaluated scope:

  • rules are evaluated in authoring order
  • the first terminal match wins
  • allow rules are non-terminal — a matching allow rule preserves rule-level attribution but lets later rules deny, review, throttle, or emit constraints. See Policy Reference › allow (non-terminal).
  • constrain_* rules are non-terminal and continue accumulating constraints after a match
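The evaluation order above can be sketched as a single loop. This is an illustrative model, not Keel's engine: the Rule type, its fields, and the action name constrain_max_output_tokens are hypothetical, and the fall-through to allow when no terminal rule matches is an assumption this page does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    action: str                             # "allow", "constrain_*", or a terminal action
    matches: Callable[[dict], bool]
    outcome: Optional[str] = None           # terminal rules: "deny", "challenge", "throttle"
    max_output_tokens: Optional[int] = None  # set on constrain_* rules


def evaluate(rules: list, request: dict) -> dict:
    caps = []
    attributed_to = None
    for rule in rules:  # authoring order
        if not rule.matches(request):
            continue
        if rule.action == "allow":
            # Non-terminal: keep attribution, let later rules run.
            attributed_to = rule.name
            continue
        if rule.action.startswith("constrain_"):
            # Non-terminal: accumulate constraints and keep going.
            if rule.max_output_tokens is not None:
                caps.append(rule.max_output_tokens)
            continue
        # First terminal match wins.
        return {"decision": rule.outcome, "rule": rule.name}
    # ASSUMPTION: with no terminal match the request is allowed.
    constraints = {"max_output_tokens": min(caps)} if caps else {}
    return {"decision": "allow", "rule": attributed_to, "constraints": constraints}
```

Because allow is non-terminal, placing an allow rule first does not shield a request from a deny rule authored later.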

Constraints

When a policy rule emits a constraint, the constraint is carried in the permit response. The current constraint type is max_output_tokens.

Constraint merge is most-restrictive-wins: if multiple matching rules emit max_output_tokens, the lowest cap survives.

On Keel-managed execution surfaces, supported constraints are enforced before provider dispatch. On permit-first flows, your application is responsible for honoring any constraints carried in the permit decision.

Example constraint output in a permit response:

{ "schema_version": 1, "max_output_tokens": 512 }
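A minimal sketch of most-restrictive-wins merging, plus how a permit-first application might honor the result before calling a provider. Both helper names are hypothetical; only the max_output_tokens and schema_version fields come from the example above.

```python
from typing import Optional


def merge_constraints(emitted: list) -> dict:
    """Merge constraint dicts so the lowest max_output_tokens survives."""
    caps = [c["max_output_tokens"] for c in emitted if "max_output_tokens" in c]
    merged = {"schema_version": 1}
    if caps:
        merged["max_output_tokens"] = min(caps)
    return merged


def clamp_max_tokens(requested: Optional[int], constraints: dict) -> Optional[int]:
    """Apply the permit's cap to the token limit the application planned to send."""
    cap = constraints.get("max_output_tokens")
    if cap is None:
        return requested
    if requested is None:
        return cap
    return min(requested, cap)
```

In a permit-first flow, the application would pass the clamped value as the provider's output-token limit, since Keel does not sit in the dispatch path there.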

Budget snapshot

Keel builds a hierarchical budget snapshot from project configuration and live spend state. When budget caps or guardrails are configured, permits carry a budget snapshot. All monetary values are in usd_micros. Sections appear only when the relevant cap or guardrail is active.

{
  "schema_version": 1,
  "currency_unit": "usd_micros",
  "request": { "estimated_cost": 120000, "cap": 150000, "remaining": 30000 },
  "daily": { "cap": 3000000, "current_spend": 2200000, "projected_spend": 2320000, "remaining": 800000 },
  "monthly": { "cap": 10000000, "current_spend": 7800000, "projected_spend": 7920000, "remaining": 2200000, "threshold_ratio": 0.85, "threshold_amount": 8500000 },
  "rate_limit": { "window_seconds": 60, "limit": 50, "observed": 50, "retry_after_seconds": 12 }
}
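The values in this example are consistent with a few simple relationships: projected spend is current spend plus the request's estimated cost, remaining is cap minus current spend, and the threshold amount is cap times threshold ratio. These relationships are inferred from the example numbers, not from a documented formula, so the sketch below is a sanity check rather than a specification. The helper names are hypothetical.

```python
MICROS_PER_USD = 1_000_000  # usd_micros are millionths of a US dollar


def usd_micros_to_usd(micros: int) -> float:
    return micros / MICROS_PER_USD


def derive_window(cap: int, current_spend: int, estimated_cost: int) -> dict:
    """Recompute the derived fields of a daily or monthly window.

    ASSUMPTION: relationships inferred from the example snapshot.
    """
    return {
        "cap": cap,
        "current_spend": current_spend,
        "projected_spend": current_spend + estimated_cost,
        "remaining": cap - current_spend,
    }


daily = derive_window(cap=3_000_000, current_spend=2_200_000, estimated_cost=120_000)
monthly = derive_window(cap=10_000_000, current_spend=7_800_000, estimated_cost=120_000)
monthly_threshold_amount = round(10_000_000 * 0.85)
```

Read this way, the example request costs an estimated 0.12 USD against a daily cap of 3 USD.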

Structured reason codes

Every non-allow permit decision carries a machine-readable reason_code. These codes are stable and appear in permit responses, Timeline Replay, the dashboard, and governance audit events.

See Errors › Permit reason codes for the full locked vocabulary.

reason_detail and outcome_detail

In addition to reason_code, every non-allow permit decision carries a structured reason_detail object. The shape is consistent across decisions:

  • category — broad bucket of the decision driver (policy, budget, firewall)
  • kind — finer-grained discriminator within the category (for example, model_not_allowed, daily_cap_exceeded, rate_limit_throttled)
  • outcome — one of deny, challenge, or throttle
  • outcome_detail — outcome-specific evidence

The outcome_detail field is where Keel surfaces the structured evidence a client needs to react programmatically. The contents depend on the outcome and the reason code. For example, a budget.daily_cap_exceeded deny carries spend and cap fields:

{
  "category": "budget",
  "kind": "daily_cap_exceeded",
  "outcome": "deny",
  "outcome_detail": {
    "cap_usd_micros": 3000000,
    "current_spend_usd_micros": 2200000,
    "projected_spend_usd_micros": 3050000,
    "window": "daily"
  }
}

Filter on reason_code for stable categorization. Read outcome_detail for the values that drive client behavior — retry timing, threshold ratios, spend amounts, and so on. Treat unknown fields inside outcome_detail as additive — Keel may add fields to a code’s outcome_detail without changing the code itself.
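A sketch of that pattern: filter on reason_code, then read only the outcome_detail fields you need so that added fields are ignored. The function name is hypothetical, and the permit is assumed to be a dict exposing reason_code and reason_detail at the top level.

```python
from typing import Optional


def daily_overage_usd_micros(permit: dict) -> Optional[int]:
    """Return how far projected spend exceeds the daily cap, if applicable."""
    if permit.get("reason_code") != "budget.daily_cap_exceeded":
        return None
    detail = permit.get("reason_detail", {}).get("outcome_detail", {})
    cap = detail.get("cap_usd_micros")
    projected = detail.get("projected_spend_usd_micros")
    if cap is None or projected is None:
        return None
    # Unknown keys in `detail` are never inspected, so additive
    # changes to outcome_detail do not break this reader.
    return projected - cap
```

Reading fields with `.get` rather than unpacking the whole object keeps the client tolerant of additive changes.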

Throttling

Throttling is a first-class permit outcome, not a variant of deny. A permit with outcome = throttle means the request would have been allowed had it not hit a rate ceiling, and that the caller should retry after a bounded delay rather than treat the request as permanently denied.

Throttling is produced by the throttle_if_rate_exceeds policy action. See Policy Reference › throttle_if_rate_exceeds for the action contract.

Throttle versus deny

| | deny_if_rate_exceeds | throttle_if_rate_exceeds |
| --- | --- | --- |
| Outcome | deny | throttle |
| HTTP status on execution surfaces | 403 | 429 |
| Retry-After header | Not set | Set to outcome_detail.retry_after_seconds |
| Reason code | budget.rate_limit_exceeded | budget.rate_limit_throttled |
| Caller intent | "This is rejected; do not retry without changing something" | "Try again after the bounded delay" |

Use deny_if_rate_exceeds when exceeding the rate cap is itself a violation. Use throttle_if_rate_exceeds when the rate cap exists to smooth load and the caller is expected to back off and retry.

Throttle outcome_detail shape

A throttled permit’s reason_detail.outcome_detail carries four fields:

{
  "category": "budget",
  "kind": "rate_limit_throttled",
  "outcome": "throttle",
  "outcome_detail": {
    "retry_after_seconds": 12,
    "window_seconds": 60,
    "limit": 50,
    "observed": 50
  }
}
  • retry_after_seconds — how long the caller should wait before retrying. This value is the source for the HTTP Retry-After header on execution surfaces.
  • window_seconds — the size of the trailing rate window the rule evaluates.
  • limit — the configured ceiling for that window.
  • observed — the count Keel saw in the window when the rule fired.

HTTP behavior

On execution surfaces (/v1/execute, /v1/executions, /v1/proxy/*), throttle decisions return HTTP 429 with a Retry-After header. The response body is the normalized execution envelope with status: "denied" and an error block. The routing.reason_code and the permit reason_code both carry budget.rate_limit_throttled.

On /v1/permits, the throttle decision is a permit record with HTTP 200 — permit decisions do not branch HTTP status. Branch on decision === "throttle" and read reason_detail.outcome_detail.retry_after_seconds.
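For permit-first callers, the branching might look like the sketch below. It assumes the permit body has already been parsed into a dict; the function name, the returned action labels, and the top-level "constraints" key are hypothetical.

```python
def next_step(permit: dict) -> tuple:
    """Map a permit from POST /v1/permits to an application-level action.

    The HTTP status is 200 for every decision, so only the body is inspected.
    """
    decision = permit["decision"]
    if decision == "allow":
        # ASSUMPTION: constraints are exposed under a "constraints" key.
        return ("proceed", permit.get("constraints", {}))
    if decision == "throttle":
        detail = permit["reason_detail"]["outcome_detail"]
        return ("retry_after", detail["retry_after_seconds"])
    if decision == "challenge":
        return ("needs_review", permit.get("reason_code"))
    return ("rejected", permit.get("reason_code"))
```

The caller then decides what each label means in its own flow, for example queueing on "retry_after" or surfacing "needs_review" to an operator.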

SDK retry behavior

Keel’s Python and JavaScript SDKs handle throttle responses on execution surfaces:

  • The SDKs detect the 429 response and the Retry-After header.
  • They wait the indicated number of seconds and retry the request once.
  • If the retry also throttles, the SDK surfaces the second throttle to the caller — it does not retry indefinitely.

Permit-only flows do not auto-retry, because the application controls the provider call and the retry decision belongs in application code. Read reason_detail.outcome_detail.retry_after_seconds and decide whether to retry, queue, or surface the throttle to your end user.
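One way an application could implement its own bounded retry for permit-only flows is sketched below. It mirrors the single-retry policy described above but is not the SDK's implementation; `request_permit` is a hypothetical callable supplied by the application that returns a parsed permit dict, and `max_wait_seconds` is an illustrative guard.

```python
import time
from typing import Callable


def request_permit_with_one_retry(
    request_permit: Callable[[], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_wait_seconds: float = 30.0,
) -> dict:
    """Request a permit; on throttle, wait once and try one more time."""
    permit = request_permit()
    if permit.get("decision") != "throttle":
        return permit
    wait = permit["reason_detail"]["outcome_detail"]["retry_after_seconds"]
    if wait > max_wait_seconds:
        # Too long to block on: hand the throttle back to the caller.
        return permit
    sleep(wait)
    # Second attempt is final; a second throttle is returned as-is.
    return request_permit()
```

Injecting `sleep` keeps the function testable and lets a queue-based application substitute its own scheduling instead of blocking.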

When throttling appears in audit evidence

Throttled permits are stored as deny-category records with a throttle outcome tag. They appear in:

  • Permit list and detail responses (GET /v1/permits, GET /v1/permits/{permit_id})
  • Timeline Replay as a permit.denied event with reason_code: budget.rate_limit_throttled
  • The dashboard activity stream
  • Signed compliance exports

Throttled permits do not consume billable cost. They consume a request count for plan-quota purposes, the same as denied requests.

Decision artifacts on the permit

The permit record captures:

  • decision, reason_code, and structured reason_detail
  • constraints when constraint rules matched
  • budget snapshot when budget caps or guardrails are configured
  • routing metadata when the request carried routing context
  • policy ID and version when a policy row matched
  • estimated usage fields and, after closeout, actual usage fields

These fields are the canonical governance record. Execution events and usage logs add evidence around the permit; they do not replace it.

Accurate scope

  • The permit is Keel’s canonical governance record.
  • Policy rows are the primary enforcement mechanism, but the full permit decision also includes cost controls, billing gates, and platform preconditions that are not expressed as authored policy rules.
  • Permit-first mode produces a decision record before execution; it does not provide execution-bound proof. See Permits for the full permit-first trust boundary.