Realtime Voice
Realtime voice sessions turn a live conversation into a sequence of decisions, events, tool calls, and evidence. Keel adds the Decide and Prove layer: your app starts a session with Keel, your voice gateway or sidecar sends signed event atoms, Keel evaluates turns against a locked policy snapshot, and the session ends with a verifier-readable voice attestation artifact.
Use this guide when your application already runs voice through OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, or another provider that follows a realtime session model, and you want Keel to decide, execute, and prove the session through the same permit-centered audit model used by Execute, Permits, Signed Exports, and Stripe MPP.
Keel ingests signed voice events, transcript spans, tool-call metadata, interruption evidence, and audio content hashes. Do not send raw voice media to Keel. Your voice provider or gateway remains the system that streams and stores audio.
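As a rough illustration of hash-only audio evidence, the sketch below builds an unsigned audio_chunk atom from raw bytes. It assumes the content hash is a SHA-256 digest over the chunk bytes your gateway holds, which the sha256: prefix in this guide's examples suggests but does not spell out. The helper name audio_chunk_atom is hypothetical and not part of any Keel SDK.

```python
import hashlib
import time


def audio_chunk_atom(chunk: bytes, *, idempotency_key: str, direction: str,
                     audio_format: str, duration_ms: int) -> dict:
    """Build an unsigned audio_chunk atom that carries a hash, never media bytes."""
    # Assumption: the hash covers the raw chunk bytes exactly as your gateway stores them.
    digest = hashlib.sha256(chunk).hexdigest()
    return {
        "atom_type": "audio_chunk",
        "idempotency_key": idempotency_key,
        "captured_at_ns": time.time_ns(),  # integer nanoseconds, no float rounding
        "direction": direction,
        "payload": {
            "content_hash": f"sha256:{digest}",
            "format": audio_format,
            "duration_ms": duration_ms,
        },
        # Assumption: your sidecar appends its signature entries before sending.
        "signatures": [],
    }


atom = audio_chunk_atom(
    b"\x00\x01" * 160,
    idempotency_key="voice_evt_example_audio_local",
    direction="inbound",
    audio_format="pcm16",
    duration_ms=20,
)
print(atom["payload"]["content_hash"])
```

The audio bytes never leave the function; only the digest, format, and duration end up in the atom.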
What this integrates
Keel adjudicates realtime voice sessions across five surfaces:
| Surface | Keel role |
|---|---|
| Session lifecycle | Create a realtime_sessions row, bind the session-start permit, lock the policy snapshot, attach an optional budget envelope, and track active, ended, or failed state. |
| Voice event ingestion | Accept batched, signed voice atoms from your gateway or sidecar and add hash-linked session events. |
| Turn evaluation | Evaluate each realtime turn against the session-locked policy snapshot and budget envelope. |
| Tool calls | Issue action-child permits for voice tool calls. Tool calls default to action_verb: "ai.generate" and can dispatch as action_verb: "mpp.payment" for voice-initiated payments. |
| Session attestation | Emit a keel.voice.attestation.phase_a artifact with chain entries, signatures, timestamp receipts, policy snapshot hashes, and permit linkage. |
Compatible voice providers include OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, and provider-neutral gateways that can emit Keel voice event atoms.
Architecture
Customer app
|
| 1. Start or join a call with a voice provider
v
Voice provider or gateway
|
| 2. Stream audio and realtime provider events
v
Customer app / voice sidecar
|
| 3. POST /v1/sessions
| Create Keel realtime session, bind permit,
| lock policy snapshot, and create budget envelope
v
Keel
|
| 4. POST /v1/voice/sessions/{session_id}/events
| Ingest signed voice atoms and hash-only audio evidence
v
Keel
|
| 5. POST /v1/sessions/{session_id}/turns
| Decide each turn against the locked session policy
v
Keel
|
| 6. Optional voice tool call with action_verb=mpp.payment
v
MPP relay
|
| 7. POST /v1/sessions/{session_id}/end
| End session and emit voice attestation
v
Customer app
|
| 8. Retrieve attestation and run keel-verify
v
Auditor / reviewer
The trust domains stay separate:
| Domain | Owner | Keel role |
|---|---|---|
| Voice media transport | Your voice provider or gateway | None. Keel does not hold raw audio. |
| Event sidecar | Your app or gateway sidecar | Sign event atoms and send hashes, transcript spans, and tool-call metadata. |
| Session decision | Keel permit policy | Decide whether the session and its turns remain within policy and budget. |
| Embedded payment rail | Stripe MPP, when used | Dispatch through mpp.payment and bind the payment evidence. |
| Audit trail | Keel plus independent witnesses | Preserve tamper-evident session evidence with externally anchored, independently witnessed timestamp receipts where configured. |
The public start endpoint is POST /v1/sessions. If your internal design names this step /v1/realtime/sessions/start, map that step to the shipped public route.
Prerequisites
- A voice provider or gateway integration that can emit realtime event atoms.
- A Keel project on Production or Enterprise for the voice + MPP path and signed export review. See Plans & Entitlements.
- A client-scoped Keel API key for session, event, and turn endpoints.
- A policy that allows the realtime model and session operation your app uses.
- keel-verifier 3.0.0 or newer for verifier UX. Voice attestation compatibility is available from 2.6.0 and schema v3 support from 2.7.0.
- If you use voice-initiated payments, a Stripe MPP setup that follows the Stripe MPP guide.
Set your Keel environment:
export KEEL_BASE_URL="https://api.keelapi.com"
export KEEL_API_KEY="keel_sk_your_project_key"
export KEEL_PROJECT_ID="11111111-1111-1111-1111-111111111111"
Quick start
- Start a realtime session with POST /v1/sessions.
- Ingest signed voice events with POST /v1/voice/sessions/{session_id}/events.
- Submit turn evaluations with POST /v1/sessions/{session_id}/turns.
- Send voice tool calls with action_verb when the call should dispatch through a specific verb.
- End the session with POST /v1/sessions/{session_id}/end.
- Verify the returned attestation artifact with keel-verify.
Python
pip install keel-sdk "keel-verifier>=3.0.0"
import json
import os
from keel_sdk import KeelClient
client = KeelClient(
base_url=os.environ.get("KEEL_BASE_URL", "https://api.keelapi.com"),
api_key=os.environ["KEEL_API_KEY"],
)
project_id = os.environ["KEEL_PROJECT_ID"]
session = client.request(
"POST",
"/v1/sessions",
json={
"permit": {
"project_id": project_id,
"idempotency_key": "idem_voice_session_example_001",
"subject": {"type": "user", "id": "usr_example_voice"},
"action": {"name": "realtime.session"},
"resource": {
"type": "request",
"id": "req_example_voice_001",
"attributes": {
"provider": "openai",
"model": "gpt-4o-realtime-preview",
"operation": "realtime.session",
"execution_mode": "realtime",
"estimated_input_tokens": 0,
"estimated_output_tokens": 0,
},
},
},
"metadata": {"voice_provider": "openai_realtime"},
"risk_budget_total_micros": 250000,
},
)
session_id = session["session_id"]
client.request(
"POST",
f"/v1/voice/sessions/{session_id}/events",
json={
"batch_idempotency_key": "batch_voice_example_001",
"events": [
{
"atom_type": "lifecycle_marker",
"idempotency_key": "voice_evt_example_started",
"captured_at_ns": 1780513200000000000,
"payload": {"marker_type": "started"},
"signatures": [
{
"algorithm": "ed25519",
"key_id": "voice-sidecar-example",
"signature": "sig_example_started",
"signed_at": "2026-06-03T18:00:00Z",
}
],
},
{
"atom_type": "audio_chunk",
"idempotency_key": "voice_evt_example_audio_001",
"captured_at_ns": 1780513201500000000,
"direction": "inbound",
"payload": {
"content_hash": "sha256:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"format": "pcm16",
"duration_ms": 1500,
},
"signatures": [
{
"algorithm": "ed25519",
"key_id": "voice-sidecar-example",
"signature": "sig_example_audio_001",
"signed_at": "2026-06-03T18:00:01Z",
}
],
},
],
},
)
turn = client.request(
"POST",
f"/v1/sessions/{session_id}/turns",
json={
"input_tokens": 280,
"output_tokens": 120,
"risk_budget_used_micros": 4500,
"metadata": {"provider_response_id": "resp_example_voice_001"},
},
)
if turn["decision"] != "allow":
    raise RuntimeError(turn.get("user_facing_reason") or turn["decision"])
ended = client.request(
"POST",
f"/v1/sessions/{session_id}/end",
json={
"usage_metrics": [
{"meter": "realtime_minutes", "quantity": 3},
{"meter": "input_tokens", "quantity": 280},
{"meter": "output_tokens", "quantity": 120},
],
"metadata": {"ended_by": "voice_gateway"},
},
)
attestation = ended["attestation_artifact"]
with open("voice_session_export.json", "w", encoding="utf-8") as handle:
    json.dump(attestation, handle, indent=2)
print("session:", session_id)
print("chain head:", attestation["chain_head"]["content_hash"])
TypeScript
npm install keel-sdk
const baseUrl = process.env.KEEL_BASE_URL ?? "https://api.keelapi.com";
const apiKey = process.env.KEEL_API_KEY!;
const projectId = process.env.KEEL_PROJECT_ID!;
type Json = Record<string, unknown>;
async function keel<T>(method: string, path: string, body: Json): Promise<T> {
const response = await fetch(`${baseUrl}${path}`, {
method,
headers: {
Authorization: `Bearer ${apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify(body),
});
const payload = (await response.json()) as T & { error?: { message?: string } };
if (!response.ok) {
throw new Error(payload.error?.message ?? `Keel request failed: ${response.status}`);
}
return payload;
}
const session = await keel<{ session_id: string }>("POST", "/v1/sessions", {
permit: {
project_id: projectId,
idempotency_key: "idem_voice_session_example_002",
subject: { type: "user", id: "usr_example_voice" },
action: { name: "realtime.session" },
resource: {
type: "request",
id: "req_example_voice_002",
attributes: {
provider: "openai",
model: "gpt-4o-realtime-preview",
operation: "realtime.session",
execution_mode: "realtime",
estimated_input_tokens: 0,
estimated_output_tokens: 0,
},
},
},
metadata: { voice_provider: "openai_realtime" },
risk_budget_total_micros: 250000,
});
await keel("POST", `/v1/voice/sessions/${session.session_id}/events`, {
batch_idempotency_key: "batch_voice_example_002",
events: [
{
atom_type: "transcript_span",
idempotency_key: "voice_evt_example_transcript_001",
captured_at_ns: "1780513202000000000",
direction: "inbound",
payload: {
text: "I approve the support lookup for this account.",
language: "en",
},
signatures: [
{
algorithm: "ed25519",
key_id: "voice-sidecar-example",
signature: "sig_example_transcript_001",
signed_at: "2026-06-03T18:00:02Z",
},
],
},
],
});
const turn = await keel<{ decision: string; user_facing_reason?: string }>(
"POST",
`/v1/sessions/${session.session_id}/turns`,
{
input_tokens: 280,
output_tokens: 120,
risk_budget_used_micros: 4500,
metadata: { provider_response_id: "resp_example_voice_002" },
},
);
if (turn.decision !== "allow") {
throw new Error(turn.user_facing_reason ?? turn.decision);
}
const ended = await keel<{ attestation_artifact: Json }>(
"POST",
`/v1/sessions/${session.session_id}/end`,
{
usage_metrics: [
{ meter: "realtime_minutes", quantity: 3 },
{ meter: "input_tokens", quantity: 280 },
{ meter: "output_tokens", quantity: 120 },
],
metadata: { ended_by: "voice_gateway" },
},
);
console.log("session:", session.session_id);
console.log("schema:", ended.attestation_artifact.schema);
Some SDK versions may not yet include generated realtime session subclients. The Python SDK exposes client.request() for these routes. The TypeScript example uses the same SDK environment variables and a small REST helper until the typed realtime surface is available.
Session lifecycle reference
Realtime voice sessions follow this sequence:
| Phase | Endpoint | What Keel binds |
|---|---|---|
| Start | POST /v1/sessions | Session ID, session-start permit, project, provider, model, operation: "realtime.session", execution_mode: "realtime", policy snapshot ID and hash, principal facts, optional session budget envelope, and voice.session.started evidence when ingested. |
| Ingest events | POST /v1/voice/sessions/{session_id}/events | Batched voice atoms, event idempotency keys, captured_at_ns, direction, payload hash, sidecar signatures, session event type, governance record hash, and project chain linkage. |
| Evaluate turns | POST /v1/sessions/{session_id}/turns | Turn index, token estimates, risk budget reservation, decision outcome, authorization trace, and optional tool-call permit linkage. |
| Tool calls | tool_call on /turns or POST /v1/sessions/{session_id}/action-permits | voice.tool_call.requested, binding_session_event_hash, action-child permit, declared action_verb, target digest, and child-to-parent permit lineage. |
| Interruption evidence | POST /v1/sessions/{session_id}/interruption-events | Signed interruption state, delivered/withheld/replacement digests, cutoff offsets, provider cancel acknowledgement, and redaction or retention profile IDs. |
| End | POST /v1/sessions/{session_id}/end | Terminal status, terminal usage metrics, final counters, voice.session.ended, and the keel.voice.attestation.phase_a artifact when assembly succeeds. |
Session status is one of active, ended, or failed. Turns evaluate against the policy snapshot locked at session start. They do not create a fresh session permit. Tool calls are the path that creates child permits inside the session.
Voice event atoms
Event ingestion accepts batches up to 100 atoms:
POST /v1/voice/sessions/{session_id}/events
Authorization: Bearer <client_project_api_key>
Content-Type: application/json
{
"batch_idempotency_key": "batch_voice_example_003",
"events": [
{
"atom_type": "audio_chunk",
"idempotency_key": "voice_evt_example_audio_002",
"captured_at_ns": 1780513202500000000,
"direction": "outbound",
"payload": {
"content_hash": "sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
"format": "opus",
"duration_ms": 1200
},
"signatures": [
{
"algorithm": "ed25519",
"key_id": "voice-sidecar-example",
"signature": "sig_example_audio_002",
"signed_at": "2026-06-03T18:00:03Z"
}
]
}
]
}
Supported atom_type values:
audio_chunk, transcript_span, tool_invocation_attempt,
lifecycle_marker, interruption, provider_opaque_marker
Audio atoms should carry content_hash, format, and duration_ms, not media bytes. Transcript atoms may carry transcript text when your retention policy allows it. Sidecar signatures currently accept ed25519, sidecar-session-cert, or ml-dsa-65 algorithm labels.
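Because a batch holds at most 100 atoms, a sidecar that buffers more than that needs to split its queue before posting. A minimal sketch, assuming a hypothetical key scheme (a stable prefix plus the batch index) so that a retry of the same batch reuses the same batch_idempotency_key:

```python
from typing import Iterator

MAX_ATOMS_PER_BATCH = 100  # batch limit stated in this guide


def batch_events(events: list[dict], key_prefix: str,
                 size: int = MAX_ATOMS_PER_BATCH) -> Iterator[dict]:
    """Yield request bodies that each carry at most `size` atoms, in order."""
    for index, start in enumerate(range(0, len(events), size)):
        yield {
            # Hypothetical key scheme; any stable, unique key per batch would do.
            "batch_idempotency_key": f"{key_prefix}_{index:04d}",
            "events": events[start:start + size],
        }


events = [
    {"atom_type": "lifecycle_marker", "idempotency_key": f"voice_evt_example_{i}"}
    for i in range(250)
]
batches = list(batch_events(events, "batch_voice_example"))
print([len(batch["events"]) for batch in batches])  # [100, 100, 50]
```

Each yielded body can be sent as the JSON payload of POST /v1/voice/sessions/{session_id}/events.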
If you construct event batches in TypeScript, send epoch-nanosecond values with a bigint-aware encoder or as numeric strings that Keel can parse as integers. Plain JavaScript numbers cannot safely represent current epoch nanoseconds.
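The precision limit is easy to reproduce from Python, whose float type is the same IEEE 754 double that backs JavaScript numbers. The sketch below uses an illustrative timestamp value and shows why integer handling matters end to end:

```python
import json
import time

captured_at_ns = 1780513200000000001  # illustrative epoch-nanosecond value

# A double carries 53 bits of integer precision; current epoch nanoseconds need more.
assert captured_at_ns > 2**53

# Round-tripping through a float silently drops the trailing nanosecond.
as_double = int(float(captured_at_ns))
print(as_double)  # 1780513200000000000

# Python ints serialize exactly, so json.dumps keeps every digit.
print(json.dumps({"captured_at_ns": captured_at_ns}))

# time.time_ns() returns an int and avoids the float returned by time.time().
now_ns = time.time_ns()
```

In Python, keeping captured_at_ns as an int from capture to serialization is enough; the string or bigint workaround is only needed in runtimes whose default number type is a double.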
Tool calls via action_verb
Voice tool calls dispatch through the same action verb registry used by /v1/execute. If action_verb is omitted, Keel treats the tool call as ai.generate. If the voice assistant initiates a payment, set action_verb: "mpp.payment" and pass the Stripe MPP context.
{
"input_tokens": 410,
"output_tokens": 90,
"risk_budget_used_micros": 7000,
"tool_call": {
"tool_name": "stripe_mpp_payment",
"call_id": "call_example_voice_mpp_001",
"action_verb": "mpp.payment",
"arguments": {
"requested_amount": 1499,
"requested_currency": "usd",
"mpp_target_url": "https://merchant.example/mpp/pay",
"spend_request_payload": {
"id": "lsrq_example_voice_approved_001",
"amount": 1499,
"currency": "usd",
"status": "approved",
"credential_type": "shared_payment_token",
"shared_payment_token": {
"id": "spt_example_voice_001",
"valid_until": "2026-06-03T19:00:00Z"
},
"merchant_name": "Example Merchant",
"merchant_url": "https://merchant.example/checkout",
"context": "The user approved a one-time voice purchase for the monthly research report for no more than $14.99 USD."
},
"authority": {
"amount_max": "1499",
"currency_class": "USD_FIAT",
"cadence": "one_shot",
"ttl_seconds": 3600,
"purpose_binding": "purchase.once"
}
}
}
}
For mpp.payment, Keel uses tool_call.verb_context when present. If verb_context is omitted, Keel uses tool_call.arguments as the MPP verb context. The action-child permit records action_name: "mpp.payment", resource_provider: "stripe_mpp", resource_model: "stripe.mpp.v1", and the binding_session_event_hash for the voice tool-call event.
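The defaulting rules described in this section can be mirrored on the client side, for example to log which verb and context a tool call is expected to dispatch with. This is a sketch of the documented behavior, not Keel's implementation. The function name resolve_dispatch is hypothetical, and treating a null verb_context the same as an omitted one is an assumption.

```python
DEFAULT_ACTION_VERB = "ai.generate"  # default verb stated in this guide


def resolve_dispatch(tool_call: dict) -> tuple[str, dict]:
    """Return (action_verb, verb_context) using the fallbacks this guide describes."""
    action_verb = tool_call.get("action_verb") or DEFAULT_ACTION_VERB
    context = tool_call.get("verb_context")
    if context is None:
        # Assumption: a missing or null verb_context falls back to arguments.
        context = tool_call.get("arguments", {})
    return action_verb, context


payment_call = {
    "tool_name": "stripe_mpp_payment",
    "action_verb": "mpp.payment",
    "arguments": {"requested_amount": 1499, "requested_currency": "usd"},
}
print(resolve_dispatch(payment_call))
print(resolve_dispatch({"tool_name": "lookup_account"}))
```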
See Stripe MPP for the full SpendRequest and spend authority reference.
Voice attestation shape
When an ended session can be assembled, Keel returns an attestation artifact in the end response:
{
"artifact_version": "1.2.0",
"schema": "keel.voice.attestation.phase_a",
"schema_version": 3,
"canonicalization_profile": "keel.canonical_json.attestation_artifact.v1",
"issued_at": "2026-06-03T18:05:00Z",
"session_metadata": {
"session_id": "sess_example_voice_001",
"status": "ended",
"policy_snapshot_hash": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
},
"chain": [],
"chain_head": {
"content_hash": "sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
"event_count": 0
},
"project_chain_head": {
"content_hash": "sha256:cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"
},
"permits": {},
"policy_snapshot": {},
"signatures": [],
"rfc3161_timestamp_receipt": null,
"timestamp_receipts": [],
"tsa_status": "degraded_no_receipt",
"verifier_compatibility": {
"format_version": "1.0",
"atoms_layer": false
}
}
Important fields:
| Field | Meaning |
|---|---|
| schema | Always keel.voice.attestation.phase_a for Phase A voice attestations. |
| schema_version | 3 for the hash-only chain materialization supported by current Keel. |
| session_metadata | Session identity, lifecycle state, provider/model, policy snapshot ID/hash, parent permit, and budget context. |
| chain | Ordered session evidence entries. Schema v3 entries use payload_materialization: "hash_only" and canonicalized_payload_hash. |
| chain_head | Final session chain head and event count. |
| project_chain_head | Project-chain head covered by timestamp receipts. |
| permits | Parent session permit and child tool-call permits bound into the session. |
| policy_snapshot | Embedded snapshot whose hash must match session_metadata.policy_snapshot_hash. |
| signatures | Ed25519 issuer signatures over canonical artifact bytes. |
| rfc3161_timestamp_receipt / timestamp_receipts | RFC 3161 timestamp witness receipts over the project chain head, plus attempts and degraded status when no receipt is available. |
| verifier_compatibility | Lets keel-verify auto-detect the artifact as a voice session attestation. |
keel-verifier verifies the artifact schema, Ed25519 signature, session-chain hash linkage, embedded policy snapshot hash, and RFC 3161 timestamp receipt binding.
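To make the idea of hash linkage concrete, here is a generic hash-chain sketch. It is not Keel's chain format: the link rule, the genesis value, and the field names prev_hash, payload_hash, and entry_hash are all hypothetical, and keel-verify remains the tool for checking real artifacts. The sketch only shows why changing one entry breaks every check after it.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # hypothetical starting value


def entry_hash(prev_hash: str, payload_hash: str) -> str:
    """Hypothetical link rule: hash the previous entry hash with this payload hash."""
    material = json.dumps({"prev": prev_hash, "payload": payload_hash}, sort_keys=True)
    return "sha256:" + hashlib.sha256(material.encode("utf-8")).hexdigest()


def build_chain(payload_hashes: list[str]) -> list[dict]:
    """Link payload hashes into an ordered chain of entries."""
    chain, prev = [], GENESIS
    for payload_hash in payload_hashes:
        current = entry_hash(prev, payload_hash)
        chain.append({"prev_hash": prev, "payload_hash": payload_hash, "entry_hash": current})
        prev = current
    return chain


def chain_is_linked(chain: list[dict]) -> bool:
    """Recompute every link and confirm each entry points at its predecessor."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry_hash(entry["prev_hash"], entry["payload_hash"]) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


chain = build_chain(["sha256:" + "a" * 64, "sha256:" + "b" * 64])
print(chain_is_linked(chain))  # True
chain[0]["payload_hash"] = "sha256:" + "c" * 64  # tamper with the first entry
print(chain_is_linked(chain))  # False
```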
Verifying with keel-verify
Install the verifier and run it against the session artifact:
python -m pip install "keel-verifier>=3.0.0"
keel-verify voice_session_export.json
The verifier auto-detects artifacts with kind: "voice_session_attestation" after loading a file that contains the voice verifier_compatibility block. A successful human-readable run prints the session ID, chain head, and checks such as artifact_schema, issuer_signature, chain_integrity, policy_snapshot_hash, and rfc3161_timestamp_receipt.
The published verifier repository includes a synthetic v3 sample at:
keel-verifier/sample/voice_session_export_v3.json
You can use that sample to validate verifier installation before testing your own session exports.
Audit trail
Capture these fields from the session end response:
session_id
permit_id
status
policy_snapshot_id
policy_snapshot_hash
turn_count
cumulative_usage
risk_budget
attestation_artifact.schema
attestation_artifact.chain_head.content_hash
attestation_artifact.project_chain_head.content_hash
For immediate verification, persist attestation_artifact from POST /v1/sessions/{session_id}/end and run keel-verify on that JSON file.
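One way to capture the audit fields listed in this section is to flatten them into a single record at session end. The sketch below assumes the non-artifact fields sit at the top level of the end response, which this guide implies but does not show in full. The helper names dig and audit_record are hypothetical.

```python
TOP_LEVEL_FIELDS = [
    "session_id", "permit_id", "status", "policy_snapshot_id",
    "policy_snapshot_hash", "turn_count", "cumulative_usage", "risk_budget",
]
ARTIFACT_PATHS = [
    ("attestation_artifact", "schema"),
    ("attestation_artifact", "chain_head", "content_hash"),
    ("attestation_artifact", "project_chain_head", "content_hash"),
]


def dig(data, path):
    """Follow a key path through nested dicts; return None if any step is missing."""
    for key in path:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def audit_record(ended: dict) -> dict:
    """Flatten the audit fields into one record, using None for absent values."""
    record = {name: ended.get(name) for name in TOP_LEVEL_FIELDS}
    for path in ARTIFACT_PATHS:
        record[".".join(path)] = dig(ended, path)
    return record


example = {
    "session_id": "sess_example_voice_001",
    "status": "ended",
    "attestation_artifact": {
        "schema": "keel.voice.attestation.phase_a",
        "chain_head": {"content_hash": "sha256:" + "b" * 64, "event_count": 0},
    },
}
print(audit_record(example))
```

Recording None for a missing field, instead of raising, keeps the gap visible to a reviewer without losing the rest of the record.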
For compliance review later, request a signed export around the session. Realtime sessions appear in exports using the session ID as request_id:
curl -sS -X POST https://api.keelapi.com/v1/compliance/exports \
-H "Authorization: Bearer $KEEL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"export_type": "full_audit",
"format": "json",
"filters": {
"request_id": "sess_example_voice_001",
"include_replay_evidence": true
}
}'
Then verify the export before review. See Signed Exports, Independent Verification Overview, and Running Keel Verify.
FAQ
How is this different from Stripe MPP integration?
Realtime voice is session-attested. Stripe MPP is transaction-attested. A voice session produces a keel.voice.attestation.phase_a artifact that covers session events, turn decisions, tool-call permits, and the policy snapshot. MPP produces transaction evidence around one payment rail outcome. When voice calls MPP through action_verb: "mpp.payment", both surfaces produce verifier-readable evidence.
Does Keel hold our voice audio?
No. Keel ingests events, transcript spans when you choose to send them, hashes, signatures, and tool-call metadata. Your voice provider or gateway holds the audio stream and any media archive. For audio atoms, send content_hash, format, and duration_ms, not audio bytes.
Can I use voice and MPP payments together?
Yes. Voice tool calls can carry action_verb: "mpp.payment". Keel routes the call through the action verb registry, issues an action-child permit under the active voice session, validates the MPP spend authority, and binds the voice tool-call event hash into the permit. See Stripe MPP.
What voice providers work?
Any provider that can emit events your app or sidecar can translate into Keel voice atoms. Keel is provider-agnostic by design: it works with OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, and custom gateways that follow the realtime session model.
What happens if timestamp witnessing is unavailable?
Keel still returns a signed artifact, and the artifact records tsa_status plus timestamp attempts. Reviewers should treat degraded_no_receipt as degraded timestamp evidence, not as a silent success. See Multi-TSA Configuration.
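A review pipeline can surface the degraded case explicitly instead of relying on a reader to notice it. The sketch below only recognizes the degraded_no_receipt value shown in this guide. Other tsa_status values are not documented here, so it passes them through unclassified. The function name and the returned labels are hypothetical.

```python
DEGRADED_STATUS = "degraded_no_receipt"  # the only status value shown in this guide


def timestamp_review_flag(artifact: dict) -> str:
    """Classify an artifact's timestamp evidence for a human reviewer."""
    status = artifact.get("tsa_status")
    if status is None:
        return "missing_tsa_status"
    if status == DEGRADED_STATUS:
        return "degraded_timestamp_evidence"
    # Assumption: any other value is left for keel-verify and the reviewer to judge.
    return f"unclassified:{status}"


artifact = {
    "schema": "keel.voice.attestation.phase_a",
    "rfc3161_timestamp_receipt": None,
    "timestamp_receipts": [],
    "tsa_status": "degraded_no_receipt",
}
print(timestamp_review_flag(artifact))  # degraded_timestamp_evidence
```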