Observability

Keel observability is built from the same persisted records that drive permit decisions and audit evidence. There is no separate metrics store or sampled telemetry pipeline. Every observability surface reads from the records that already exist for governance and accounting reasons.

This page is a top-level map of what’s available and where to go for each question. For per-question detail, follow the cross-links.

What’s observable

| Question | Surface | Detail page |
| --- | --- | --- |
| What happened to this specific governed request? | Public timeline | Timeline Replay |
| What did this project spend, and how was it broken down? | Cost-metrics rollup | Cost Metrics |
| What was the full phase-by-phase shape of governed requests? | Lifecycle reference | Request Lifecycle |
| What can I prove cryptographically about a permit decision? | Audit evidence | Verifying Keel Evidence |
| What's flowing through the project right now? | Activity stream (dashboard) | Activity stream below |
| What happened on a particular permit or request? | Request inspector (dashboard) | Request inspector below |

Public versus dashboard surfaces

Keel exposes two layers of observability:

  • Public — `GET /v1/requests/{request_id}/timeline` is the documented public lifecycle replay. It is reachable with a project API key and returns the full chronology of a single request.
  • Dashboard — Cost metrics rollups, the activity stream, and the request inspector are dashboard surfaces backed by the same persisted records. They require an authenticated dashboard session and serve interactive triage rather than programmatic integration.

Both layers read from the same source records. The dashboard surfaces add aggregation and presentation; they do not introduce a separate source of truth.
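As a minimal sketch of the public layer, a client only needs the request ID and a project API key. The base URL and `Authorization` header scheme below are assumptions; only the path shape, `GET /v1/requests/{request_id}/timeline`, comes from this page.

```python
# Sketch of building a public timeline-replay request.
# Base URL and auth scheme are assumptions, not documented values.
from urllib.parse import quote


def timeline_url(base_url: str, request_id: str) -> str:
    """Build the public timeline URL for a single governed request."""
    return f"{base_url}/v1/requests/{quote(request_id)}/timeline"


def auth_headers(project_api_key: str) -> dict:
    """Public timeline replay is reachable with a project API key."""
    return {"Authorization": f"Bearer {project_api_key}"}


url = timeline_url("https://api.example.com", "req_123")
headers = auth_headers("sk_test_key")
```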

Persisted, not synthesized

Every observability response is reconstructed from records that already exist for governance, accounting, or trust reasons:

  • Permits — the canonical governance record for every governed request
  • Request rows — the per-request lifecycle state (terminal status, error code, HTTP status, cache flag)
  • Usage rows — actual token counts and cost when usage has been reported
  • Execution events — chronological events that add detail to the lifecycle (routing selections, fallback, provider send/receive)
  • Async-job rows — submission, queue, processing, completion, and callback delivery for async surfaces
  • Realtime-session rows — when realtime scaffolding is present for the request

The public request timeline reads across all of these. Cost metrics read primarily from permit and usage rows. The request inspector composes per-request detail from permit, usage, and the latest request row.
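The reconstruction described above can be sketched as a merge of the persisted record types into one timestamp-ordered chronology. All field and event names here are illustrative, not the actual schema.

```python
# Sketch: reconstruct a timeline from persisted records, never from a
# separate metrics store. Field names are illustrative assumptions.
from operator import itemgetter


def build_timeline(permit, request_row, usage_row, execution_events):
    """Merge permit, request, usage, and execution records by timestamp."""
    entries = [{"ts": permit["decided_at"], "event": "permit.decision",
                "detail": permit["decision"]}]
    # Execution events add lifecycle detail: routing selections,
    # fallback, provider send/receive.
    for ev in execution_events:
        entries.append({"ts": ev["ts"], "event": ev["type"],
                        "detail": ev.get("detail")})
    if usage_row:  # present only when usage has been reported
        entries.append({"ts": usage_row["reported_at"], "event": "usage.reported",
                        "detail": {"tokens": usage_row["total_tokens"]}})
    entries.append({"ts": request_row["finished_at"], "event": "request.terminal",
                    "detail": request_row["status"]})
    return sorted(entries, key=itemgetter("ts"))
```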

Streaming accuracy caveat

When a provider does not return final usage for a stream, observability may show estimated or lower-bound output and cost data rather than exact final provider usage. This is a deliberate durability tradeoff — Keel persists a terminal record even when provider final usage is missing rather than leaving an accounting gap. Records that fall back to estimated final usage are marked as such, not presented as exact billing truth.
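The tradeoff can be expressed as a small decision rule: prefer provider-final usage, but fall back to a flagged lower-bound count rather than persist nothing. The function and field names below are hypothetical.

```python
# Sketch of the streaming durability tradeoff: persist a terminal usage
# record even when provider final usage is missing, and mark the
# fallback as an estimate rather than exact billing truth.
def finalize_stream_usage(provider_final_usage, streamed_output_tokens):
    """Return a usage record, flagged when it is a lower-bound estimate."""
    if provider_final_usage is not None:
        return {"output_tokens": provider_final_usage["output_tokens"],
                "estimated": False}
    # Provider never returned final usage for the stream: record the
    # tokens counted while streaming instead of leaving an accounting gap.
    return {"output_tokens": streamed_output_tokens, "estimated": True}
```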

Activity stream

The dashboard activity stream is a recent-feed surface, not a historical query tool. It returns the most recent governance and usage events across the projects accessible to the dashboard session, ordered by timestamp.

A typical entry includes:

  • timestamp, project, provider, and model
  • the permit decision and message (allow, deny, challenge)
  • terminal request status and error code when applicable
  • token counts and cost when reported
  • the linked permit_id and request_id

The activity stream deduplicates intentionally: when a request has both a permit row and a usage row, it appears once, anchored to the usage row. Permits without a matching usage row appear as standalone permit-decision entries.
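That deduplication rule can be sketched as follows; the row shapes and the `permit_id` join key are assumptions for illustration.

```python
# Sketch of activity-stream deduplication: a request with both a permit
# row and a usage row appears once, anchored to the usage row; permits
# without matching usage appear as standalone permit-decision entries.
def dedupe_feed(permit_rows, usage_rows):
    """Build a feed with one entry per request, newest first."""
    covered = {u["permit_id"] for u in usage_rows}
    feed = [{"kind": "usage", **u} for u in usage_rows]
    feed += [{"kind": "permit", **p} for p in permit_rows
             if p["permit_id"] not in covered]
    return sorted(feed, key=lambda e: e["ts"], reverse=True)
```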

The activity stream is dashboard-scoped. For programmatic per-request lookup, use Timeline Replay.

Request inspector

The dashboard request inspector returns one detailed record for a single governed request. Look up by permit_id (canonical, stable across permit-first and execution-backed flows) or by request_id (convenient for execution-backed flows).

Each inspector record carries the per-permit governance decision, the routing selections (requested versus served provider and model, fallback indicator, reason code), the firewall result when persisted, the actual usage and cost when reported, and the terminal execution outcome.

Fields with no persisted value return null rather than fabricated placeholders. Unpersisted data does not appear in the inspector. For example, prompt-firewall pass and disabled results are not persisted today, so the inspector shows a firewall result only when a permit recorded a firewall block.
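A sketch of that composition rule, with illustrative field names: unpersisted values come back as `null` (`None` here), never as fabricated placeholders.

```python
# Sketch: compose an inspector record from permit, usage, and the latest
# request row. Fields with no persisted value are None, not invented.
def inspector_record(permit, usage_row=None, request_row=None):
    return {
        "permit_id": permit["permit_id"],
        "decision": permit["decision"],
        # Firewall pass/disabled results are not persisted today, so a
        # firewall result appears only when a permit recorded a block.
        "firewall_result": permit.get("firewall_block"),
        "cost": usage_row["cost"] if usage_row else None,
        "terminal_status": request_row["status"] if request_row else None,
    }
```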

The inspector is dashboard-scoped; for programmatic lookup, use the public permit endpoints (GET /v1/permits/{permit_id}) and GET /v1/requests/{request_id}/timeline.

Plan tier availability

Public timeline replay is reachable on every plan with a project API key. Dashboard observability surfaces are available on every plan, but the visibility window is bounded by the plan’s log retention:

  • Starter — 7-day retention
  • Growth — 30-day retention
  • Business — 180-day retention
  • Enterprise — 365-day retention

Records older than the plan’s retention are pruned on a scheduled cadence and do not appear in dashboard rollups, the activity stream, or the request inspector.
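The retention windows above amount to a simple cutoff filter over persisted records, sketched here with the documented per-plan day counts; the record shape is an assumption.

```python
# Sketch of the plan retention window applied to dashboard surfaces.
# Retention day counts come from the plan table; record shape is assumed.
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = {"starter": 7, "growth": 30, "business": 180, "enterprise": 365}


def visible_records(records, plan, now=None):
    """Keep only records inside the plan's log-retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS[plan])
    return [r for r in records if r["ts"] >= cutoff]
```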

Some replay-evidence surfaces tied to the dashboard (project-scoped replay evidence and dashboard permit replay) require Growth or higher; see Plans & Entitlements for the full feature-gate table.

What this surface does and does not claim

  • Observability is reconstructed from persisted records, not invented. Latency, usage, and timing values that were never persisted are never returned.
  • Successful prompt-firewall passes do not always appear as explicit lifecycle events. Only firewall.blocked events are persisted as standalone firewall events.
  • Per-request latency summaries are not exposed by the dashboard rollup surface. The lifecycle event timestamps in Timeline Replay are the closest public-evidence equivalent.
  • Dashboard surfaces share the same source records as the public timeline. They do not unlock data that the public surface cannot reach — they only add aggregation, presentation, and triage workflows.