
Budget Envelopes

Budget envelopes lock a request's estimated cost before provider dispatch and adjust it to the actual cost afterward. Because the cap is enforced before tokens are consumed, not after the bill arrives, a request that would exceed the budget never reaches the provider.

How envelopes work

Before provider dispatch, Keel estimates the request cost and locks that amount against the envelope’s remaining budget. The request then proceeds to the provider with the estimate held. After the provider responds, the locked amount is replaced with the actual cost; any surplus is released back to the envelope.

If the envelope does not have sufficient remaining budget to cover the estimate, the request is denied before provider dispatch.

Remaining budget calculation

remaining = total_budget - reserved - spent
  • total_budget — The configured cap for this envelope
  • reserved — Amount locked by in-flight requests (not yet reconciled)
  • spent — Amount consumed by completed requests
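The formula and the deny-on-insufficient-budget rule can be sketched as follows. This is a minimal illustration, not Keel's implementation: the `Envelope` class and `try_reserve` method are hypothetical names, and amounts are assumed to be integer microdollars.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical in-memory model; amounts are integer microdollars."""
    total_budget: int
    reserved: int = 0
    spent: int = 0

    @property
    def remaining(self) -> int:
        # remaining = total_budget - reserved - spent
        return self.total_budget - self.reserved - self.spent

    def try_reserve(self, estimate: int) -> bool:
        """Lock `estimate` if the remaining budget covers it, else deny."""
        if estimate > self.remaining:
            return False  # denied before provider dispatch
        self.reserved += estimate
        return True


envelope = Envelope(total_budget=5_000_000, reserved=1_200_000, spent=3_000_000)
print(envelope.remaining)             # 800000
print(envelope.try_reserve(900_000))  # False: estimate exceeds remaining
print(envelope.try_reserve(500_000))  # True: 500000 is now locked
print(envelope.remaining)             # 300000
```

Note that in-flight reservations reduce `remaining` just as completed spend does, so concurrent requests cannot jointly overshoot the cap.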

Amounts

All envelope amounts are denominated in microdollars (USD × 10⁶). One US dollar equals 1,000,000 microdollars, so $2.50 is 2,500,000 microdollars. Expressing amounts at microdollar precision avoids floating-point rounding in budget accounting.
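As an illustration of the unit, the sketch below converts between dollar strings and microdollars. The helper names `usd_to_micros` and `micros_to_usd` are hypothetical and not part of any Keel API; treating microdollars as Python integers is an assumption.

```python
from decimal import Decimal

MICROS_PER_USD = 1_000_000


def usd_to_micros(usd: str) -> int:
    """Convert a decimal dollar string to integer microdollars."""
    micros = Decimal(usd) * MICROS_PER_USD
    if micros != micros.to_integral_value():
        raise ValueError(f"{usd} is finer than one microdollar")
    return int(micros)


def micros_to_usd(micros: int) -> Decimal:
    """Convert integer microdollars back to an exact decimal dollar amount."""
    return Decimal(micros) / MICROS_PER_USD


# Integer microdollars add exactly.
print(usd_to_micros("0.10") + usd_to_micros("0.20"))  # 300000

# The same sum in binary floating point does not.
print(0.1 + 0.2 == 0.3)  # False
```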

Reconciliation

After provider dispatch completes, Keel adjusts the envelope state:

  • The locked amount is released
  • The actual cost is recorded as spent
  • A correction amount is calculated: actual - locked
    • Positive correction: underestimated (actual cost exceeded estimate)
    • Negative correction: overestimated (budget recovered)
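The reconciliation steps above can be sketched as a single function. The `EnvelopeState` class and `reconcile` function are hypothetical names used for illustration; only the arithmetic (release the lock, record actual spend, correction = actual - locked) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class EnvelopeState:
    """Hypothetical envelope counters, in microdollars."""
    reserved: int = 0
    spent: int = 0


def reconcile(state: EnvelopeState, locked: int, actual: int) -> int:
    """Release the lock, record actual spend, and return the correction."""
    state.reserved -= locked   # the locked amount is released
    state.spent += actual      # the actual cost is recorded as spent
    return actual - locked     # positive: underestimated; negative: overestimated


state = EnvelopeState(reserved=4_000)
correction = reconcile(state, locked=4_000, actual=3_100)
print(correction)      # -900: the estimate was 900 microdollars too high
print(state.reserved)  # 0
print(state.spent)     # 3100
```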

Envelope reconciliation is one track in Keel’s broader reconciliation surface. For the full pillar, including the verification track and the financial_reconciliation export bundle, see Reconciliation.

Envelope states

Envelopes can be paused without deletion. A paused envelope retains its records but rejects new reservations. Reactivating an envelope restores normal operation.

Failure modes

Error | Meaning
--- | ---
Envelope not found | The referenced envelope does not exist or belongs to a different project.
Envelope inactive | The envelope is paused and cannot accept new reservations.
Envelope exhausted | Remaining budget cannot cover the estimated cost. The request is denied.
Estimate required | The request did not include token estimates needed to calculate the reservation. Provide estimated_input_tokens and estimated_output_tokens in the resource attributes.

Providing estimates

Envelope reservations require cost estimates. Include token estimates in the resource attributes of your permit or execution request:

{
  "resource": {
    "attributes": {
      "provider": "openai",
      "model": "gpt-4o",
      "operation": "generate.text",
      "estimated_input_tokens": 200,
      "estimated_output_tokens": 250
    }
  }
}

If estimates are omitted and the project’s budget policy requires envelopes, the request is denied before evaluation.
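One way to avoid the "Estimate required" failure is to validate the estimates client-side while building the payload. The sketch below only constructs the JSON body shown above; `build_resource` is a hypothetical helper, and the surrounding request fields and endpoint are deliberately left out because they are not specified here.

```python
import json


def build_resource(provider: str, model: str, operation: str,
                   estimated_input_tokens: int,
                   estimated_output_tokens: int) -> dict:
    """Build the resource block, rejecting missing or negative estimates."""
    estimates = {
        "estimated_input_tokens": estimated_input_tokens,
        "estimated_output_tokens": estimated_output_tokens,
    }
    for name, value in estimates.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    return {
        "resource": {
            "attributes": {
                "provider": provider,
                "model": model,
                "operation": operation,
                **estimates,
            }
        }
    }


payload = build_resource("openai", "gpt-4o", "generate.text", 200, 250)
print(json.dumps(payload, indent=2))
```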

Budget enforcement via policies

Budget caps, rate limits, spike detection, and projected-threshold enforcement are expressed as policy rules via POST /v1/policies. Available budget actions:

  • deny_if_cost_exceeds — denies when cost exceeds a cap for the specified window (daily, monthly, or request) with cap_micros in USD micros
  • deny_if_rate_exceeds — hard-denies when permit volume in the trailing window_seconds window reaches max_requests
  • throttle_if_rate_exceeds — same param shape as deny_if_rate_exceeds, but returns HTTP 429 + Retry-After instead of a hard deny
  • deny_if_spike_detected — denies when today’s projected spend exceeds baseline × multiplier, where baseline is average daily spend over baseline_days
  • deny_if_projected_monthly_ratio_exceeds — denies when projected month-to-date spend reaches (monthly_cap_micros × ratio_pct) / 100
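The two threshold formulas above can be worked through numerically. This sketch covers only the arithmetic: the function names are hypothetical, the projected spend figures are taken as inputs because the projection method is not described here, and the integer division in the monthly threshold is an assumption.

```python
def spike_threshold_micros(daily_spend_micros: list[int], multiplier: float) -> float:
    """baseline × multiplier, where baseline is the average daily spend."""
    baseline = sum(daily_spend_micros) / len(daily_spend_micros)
    return baseline * multiplier


def spike_detected(projected_today_micros: int,
                   daily_spend_micros: list[int], multiplier: float) -> bool:
    # Denies when today's projected spend exceeds the threshold.
    return projected_today_micros > spike_threshold_micros(daily_spend_micros, multiplier)


def monthly_ratio_threshold_micros(monthly_cap_micros: int, ratio_pct: int) -> int:
    """(monthly_cap_micros × ratio_pct) / 100, using integer division (assumed)."""
    return (monthly_cap_micros * ratio_pct) // 100


def monthly_ratio_exceeded(projected_month_micros: int,
                           monthly_cap_micros: int, ratio_pct: int) -> bool:
    # Denies when projected spend reaches the threshold.
    return projected_month_micros >= monthly_ratio_threshold_micros(monthly_cap_micros, ratio_pct)


history = [2_000_000, 3_000_000, 4_000_000]           # three baseline days, average $3
print(spike_threshold_micros(history, 3))             # 9000000.0
print(spike_detected(9_500_000, history, 3))          # True
print(monthly_ratio_threshold_micros(100_000_000, 80))  # 80000000
print(monthly_ratio_exceeded(79_999_999, 100_000_000, 80))  # False
```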

Budget enforcement and budget envelope reservation work together: the policy engine decides whether to allow or deny at permit evaluation time, and the envelope system reserves estimated cost from the project’s budget allocation before provider dispatch.

Existing project overrides (PATCH /v1/projects/{id}/policy with daily_cost_usd_micros_cap, monthly_cost_usd_micros_cap, request_cost_usd_micros_cap, etc.) continue to work. Under the hood, they compile to equivalent policy rules at runtime via the synthesizer.
