Decision Model
Every request evaluated by Keel produces a permit — a persisted decision record that captures outcome, reason, constraints, and budget state at evaluation time.
POST /v1/permits exposes this decision seam directly. All managed execution surfaces (/v1/execute, /v1/executions, /v1/proxy/*) apply the same permit evaluation before provider dispatch. The permit is the durable governance record regardless of which surface originated the request.
Outcomes
Keel’s policy engine produces four outcomes:
| Outcome | Meaning | HTTP signal |
|---|---|---|
| allow | Request may proceed, possibly with accumulated constraints | 200 |
| deny | Request is rejected | 200 on permits; 403 on execution routes |
| challenge | Request requires human review or attestation | 200 on permits |
| throttle | Request is rate-throttled | 200 on permits; 429 with Retry-After on execution routes |
throttle is a first-class outcome, not a variant of deny. It carries retry_after_seconds inside reason_detail.outcome_detail, which Keel uses as the source for the HTTP Retry-After header. Unlike deny, throttled requests may be retried after the specified delay.
What Keel evaluates
For every permit request, Keel considers:
- Policy rows — active rules authored for the project or its organization
- Cost controls — budget caps, rate limits, spike guards, and threshold guardrails
- Platform preconditions — whether the requested operation has pricing support and billing access
- Platform safety nets — governed-request quota and overage enforcement
Policy rows are the primary authoring surface. Cost controls can be expressed as policy rules or as project-level configuration. Platform preconditions run independently of authored rules.
Policy scope
When a project has active policy rows, those rows are evaluated. When a project has no active rows but belongs to an organization with active organization-scoped rows, the organization rows apply. Project rows and organization rows do not stack — project rows take precedence.
Within the evaluated scope:
- rules are evaluated in authoring order
- the first terminal match wins
- allow rules are non-terminal: a matching allow rule preserves rule-level attribution but lets later rules deny, review, throttle, or emit constraints. See Policy Reference › allow (non-terminal).
- constrain_* rules are non-terminal and continue accumulating constraints after a match
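The scope and ordering rules above can be sketched in a few lines of Python. This is an illustration only, not Keel's implementation: the Rule and Result shapes, the action strings, the default outcome of allow when no rule matches, and the way attribution moves to a later terminal rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical rule shape for illustration; not Keel's policy schema.
@dataclass
class Rule:
    rule_id: str
    action: str  # "allow", "deny", "challenge", "throttle", or "constrain_max_output_tokens"
    matches: Callable[[dict], bool]
    value: Optional[int] = None  # cap carried by a constrain_* rule

TERMINAL_ACTIONS = {"deny", "challenge", "throttle"}

@dataclass
class Result:
    outcome: str = "allow"  # assumed default when nothing terminal matches
    matched_rule: Optional[str] = None
    emitted_caps: List[int] = field(default_factory=list)

def select_scope(project_rows: list, org_rows: list) -> list:
    """Project rows take precedence; organization rows apply only as a fallback."""
    return project_rows if project_rows else org_rows

def evaluate(rules: List[Rule], request: dict) -> Result:
    result = Result()
    for rule in rules:  # authoring order
        if not rule.matches(request):
            continue
        if rule.action in TERMINAL_ACTIONS:
            # First terminal match wins: stop evaluating here.
            result.outcome = rule.action
            result.matched_rule = rule.rule_id
            return result
        if rule.action == "allow":
            # Non-terminal: record attribution, keep evaluating later rules.
            if result.matched_rule is None:
                result.matched_rule = rule.rule_id
        elif rule.action.startswith("constrain_"):
            # Non-terminal: collect the constraint, keep evaluating.
            result.emitted_caps.append(rule.value)
    return result
```

In this sketch, an allow rule followed by a matching deny rule yields deny, while any caps emitted before the deny are still collected.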
Constraints
When a policy rule emits a constraint, the constraint is carried in the permit response. The current constraint type is max_output_tokens.
Constraint merge is most-restrictive-wins: if multiple matching rules emit max_output_tokens, the lowest cap survives.
On Keel-managed execution surfaces, supported constraints are enforced before provider dispatch. On permit-first flows, your application is responsible for honoring any constraints carried in the permit decision.
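As a rough sketch of both halves of this section, the helpers below merge several max_output_tokens caps with most-restrictive-wins and then clamp an application's own token request to the cap carried in a permit. The function names are hypothetical, and the constraints dict is assumed to have the same shape as the example constraint output in this section.

```python
from typing import Iterable, Optional

def merge_max_output_tokens(caps: Iterable[int]) -> Optional[int]:
    """Most-restrictive-wins: the lowest emitted cap survives."""
    caps = list(caps)
    return min(caps) if caps else None

def effective_max_tokens(requested: Optional[int], constraints: dict) -> Optional[int]:
    """Clamp the application's requested output tokens to the permit's cap, if any."""
    cap = constraints.get("max_output_tokens")
    if cap is None:
        return requested  # no constraint on this permit
    if requested is None:
        return cap  # the application set no limit, so the cap becomes the limit
    return min(requested, cap)
```

In a permit-first flow, the value returned by effective_max_tokens is what the application would pass to the provider call instead of its original request.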
Example constraint output in a permit response:
{
  "schema_version": 1,
  "max_output_tokens": 512
}
Budget snapshot
Keel builds a hierarchical budget snapshot from project configuration and live spend state. When budget caps or guardrails are configured, permits carry a budget snapshot. All monetary values are in usd_micros. Sections appear only when the relevant cap or guardrail is active.
{
  "schema_version": 1,
  "currency_unit": "usd_micros",
  "request": {
    "estimated_cost": 120000,
    "cap": 150000,
    "remaining": 30000
  },
  "daily": {
    "cap": 3000000,
    "current_spend": 2200000,
    "projected_spend": 2320000,
    "remaining": 800000
  },
  "monthly": {
    "cap": 10000000,
    "current_spend": 7800000,
    "projected_spend": 7920000,
    "remaining": 2200000,
    "threshold_ratio": 0.85,
    "threshold_amount": 8500000
  },
  "rate_limit": {
    "window_seconds": 60,
    "limit": 50,
    "observed": 50,
    "retry_after_seconds": 12
  }
}
Structured reason codes
Every non-allow permit decision carries a machine-readable reason_code. These codes are stable and appear in permit responses, Timeline Replay, the dashboard, and governance audit events.
See Errors › Permit reason codes for the full locked vocabulary.
reason_detail and outcome_detail
In addition to reason_code, every non-allow permit decision carries a structured reason_detail object. The shape is consistent across decisions:
- category: broad bucket of the decision driver (policy, budget, firewall)
- kind: finer-grained discriminator within the category (for example, model_not_allowed, daily_cap_exceeded, rate_limit_throttled)
- outcome: one of deny, challenge, or throttle
- outcome_detail: outcome-specific evidence
The outcome_detail field is where Keel surfaces the structured evidence a client needs to react programmatically. The contents depend on the outcome and the reason code. For example, a budget.daily_cap_exceeded deny carries spend and cap fields:
{
  "category": "budget",
  "kind": "daily_cap_exceeded",
  "outcome": "deny",
  "outcome_detail": {
    "cap_usd_micros": 3000000,
    "current_spend_usd_micros": 2200000,
    "projected_spend_usd_micros": 3050000,
    "window": "daily"
  }
}
Filter on reason_code for stable categorization. Read outcome_detail for the values that drive client behavior: retry timing, threshold ratios, spend amounts, and so on. Treat unknown fields inside outcome_detail as additive: Keel may add fields to a code's outcome_detail without changing the code itself.
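A client following this guidance might look like the sketch below: it branches on reason_code, reads only the outcome_detail fields it needs, and ignores anything it does not recognize. The describe and usd helpers are hypothetical, and placing reason_code and reason_detail at the top level of the permit is an assumption about the response shape.

```python
def usd(micros: int) -> float:
    """Convert usd_micros to dollars (1 USD = 1,000,000 micros)."""
    return micros / 1_000_000

def describe(permit: dict) -> str:
    code = permit.get("reason_code")
    # Read only the keys we need; extra keys in outcome_detail are ignored.
    detail = permit.get("reason_detail", {}).get("outcome_detail", {})
    if code == "budget.daily_cap_exceeded":
        over = detail["projected_spend_usd_micros"] - detail["cap_usd_micros"]
        return f"daily cap exceeded by ${usd(over):.2f}"
    if code == "budget.rate_limit_throttled":
        return f"retry in {detail['retry_after_seconds']}s"
    # Unknown codes fall through to a generic message instead of raising.
    return f"unhandled reason_code: {code}"
```

Applied to the daily_cap_exceeded example above, the projected spend of 3050000 against a cap of 3000000 is reported as an overage of $0.05.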
Throttling
Throttling is a first-class permit outcome, not a variant of deny. A permit with outcome = throttle means the request would have been allowed had it not hit a rate ceiling, and that the caller should retry after a bounded delay rather than treat the request as permanently denied.
Throttling is produced by the throttle_if_rate_exceeds policy action. See Policy Reference › throttle_if_rate_exceeds for the action contract.
Throttle versus deny
| | deny_if_rate_exceeds | throttle_if_rate_exceeds |
|---|---|---|
| Outcome | deny | throttle |
| HTTP status on execution surfaces | 403 | 429 |
| Retry-After header | Not set | Set to outcome_detail.retry_after_seconds |
| Reason code | budget.rate_limit_exceeded | budget.rate_limit_throttled |
| Caller intent | "This is rejected; do not retry without changing something" | "Try again after the bounded delay" |
Use deny_if_rate_exceeds when exceeding the rate cap is itself a violation. Use throttle_if_rate_exceeds when the rate cap exists to smooth load and the caller is expected to back off and retry.
Throttle outcome_detail shape
A throttled permit’s reason_detail.outcome_detail carries four fields:
{
  "category": "budget",
  "kind": "rate_limit_throttled",
  "outcome": "throttle",
  "outcome_detail": {
    "retry_after_seconds": 12,
    "window_seconds": 60,
    "limit": 50,
    "observed": 50
  }
}
- retry_after_seconds: how long the caller should wait before retrying. This value is the source for the HTTP Retry-After header on execution surfaces.
- window_seconds: the size of the trailing rate window the rule evaluates.
- limit: the configured ceiling for that window.
- observed: the count Keel saw in the window when the rule fired.
HTTP behavior
On execution surfaces (/v1/execute, /v1/executions, /v1/proxy/*), throttle decisions return HTTP 429 with a Retry-After header. The response body is the normalized execution envelope with status: "denied" and an error block. The routing.reason_code and the permit reason_code both carry budget.rate_limit_throttled.
On /v1/permits, the throttle decision is a permit record with HTTP 200 — permit decisions do not branch HTTP status. Branch on decision === "throttle" and read reason_detail.outcome_detail.retry_after_seconds.
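The difference between the two surfaces can be shown side by side. In the sketch below, a caller of an execution surface branches on the HTTP status and the Retry-After header, while a caller of /v1/permits branches on the decision field in the body. The function names and action labels are hypothetical; the Retry-After value is assumed to be an integer number of seconds, and decision and reason_detail are assumed to sit at the top level of the permit.

```python
from typing import Mapping, Optional, Tuple

Action = Tuple[str, Optional[int]]  # (what to do, seconds to wait if any)

def from_execution_response(status: int, headers: Mapping[str, str]) -> Action:
    """Execution surfaces signal deny and throttle through the HTTP status."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 429:
        return ("retry_later", int(lowered["retry-after"]))
    if status == 403:
        return ("rejected", None)
    if status == 200:
        return ("proceed", None)
    return ("other", None)  # statuses this section does not describe

def from_permit(permit: Mapping) -> Action:
    """/v1/permits returns HTTP 200 for every decision, so branch on the body."""
    decision = permit["decision"]
    if decision == "throttle":
        detail = permit["reason_detail"]["outcome_detail"]
        return ("retry_later", detail["retry_after_seconds"])
    if decision == "deny":
        return ("rejected", None)
    if decision == "challenge":
        return ("needs_review", None)
    return ("proceed", None)
```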
SDK retry behavior
Keel’s Python and JavaScript SDKs handle throttle responses on execution surfaces:
- The SDKs detect the 429 response and the Retry-After header.
- They wait the indicated number of seconds and retry the request once.
- If the retry also throttles, the SDK surfaces the second throttle to the caller — it does not retry indefinitely.
Permit-only flows do not auto-retry, because the application controls the provider call and the retry decision belongs in application code. Read reason_detail.outcome_detail.retry_after_seconds and decide whether to retry, queue, or surface the throttle to your end user.
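An application that wants the same retry-once behavior in a permit-only flow could wrap its permit request as sketched below. This is not SDK code: request_permit stands in for whatever function the application uses to call POST /v1/permits, and max_wait_seconds is an illustrative policy knob the application would choose for itself.

```python
import time
from typing import Callable

def request_permit_with_one_retry(
    request_permit: Callable[[], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_wait_seconds: float = 30,
) -> dict:
    """Request a permit; on throttle, wait the indicated delay and retry once."""
    permit = request_permit()
    if permit.get("decision") != "throttle":
        return permit
    delay = permit["reason_detail"]["outcome_detail"]["retry_after_seconds"]
    if delay > max_wait_seconds:
        # Too long to block on: hand the throttle back to the caller to queue or surface.
        return permit
    sleep(delay)
    # Single retry only; a second throttle is returned to the caller as-is.
    return request_permit()
```

Passing sleep as a parameter keeps the helper testable without real delays. Whether to block, queue, or surface the throttle to the end user remains an application decision.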
When throttling appears in audit evidence
Throttled permits are stored as deny-category records with a throttle outcome tag. They appear in:
- Permit list and detail responses (GET /v1/permits, GET /v1/permits/{permit_id})
- Timeline Replay as a permit.denied event with reason_code: budget.rate_limit_throttled
- The dashboard activity stream
- Signed compliance exports
Throttled permits do not consume billable cost. They consume a request count for plan-quota purposes, the same as denied requests.
Decision artifacts on the permit
The permit record captures:
- decision, reason_code, and structured reason_detail
- constraints when constraint rules matched
- budget snapshot when budget caps or guardrails are configured
- routing metadata when the request carried routing context
- policy ID and version when a policy row matched
- estimated usage fields and, after closeout, actual usage fields
These fields are the canonical governance record. Execution events and usage logs add evidence around the permit; they do not replace it.
Accurate scope
- The permit is Keel’s canonical governance record.
- Policy rows are the primary enforcement mechanism, but the full permit decision also includes cost controls, billing gates, and platform preconditions that are not expressed as authored policy rules.
- Permit-first mode produces a decision record before execution; it does not provide execution-bound proof. See Permits for the full permit-first trust boundary.