
Realtime Voice

Realtime voice sessions turn a live conversation into a sequence of decisions, events, tool calls, and evidence. Keel adds the Decide and Prove layer: your app starts a session with Keel, your voice gateway or sidecar sends signed event atoms, Keel evaluates turns against a locked policy snapshot, and the session ends with a verifier-readable voice attestation artifact.

Use this guide when your application already runs voice through OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, or another provider that follows a realtime session model, and you want Keel to decide, execute, and prove the session through the same permit-centered audit model used by Execute, Permits, Signed Exports, and Stripe MPP.

Keel ingests signed voice events, transcript spans, tool-call metadata, interruption evidence, and audio content hashes. Do not send raw voice media to Keel. Your voice provider or gateway remains the system that streams and stores audio.

What this integrates

Keel adjudicates realtime voice sessions across five surfaces:

| Surface | Keel role |
| --- | --- |
| Session lifecycle | Create a realtime_sessions row, bind the session-start permit, lock the policy snapshot, attach an optional budget envelope, and track active, ended, or failed state. |
| Voice event ingestion | Accept batched, signed voice atoms from your gateway or sidecar and add hash-linked session events. |
| Turn evaluation | Evaluate each realtime turn against the session-locked policy snapshot and budget envelope. |
| Tool calls | Issue action-child permits for voice tool calls. Tool calls default to action_verb: "ai.generate" and can dispatch as action_verb: "mpp.payment" for voice-initiated payments. |
| Session attestation | Emit a keel.voice.attestation.phase_a artifact with chain entries, signatures, timestamp receipts, policy snapshot hashes, and permit linkage. |

Compatible voice providers include OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, and provider-neutral gateways that can emit Keel voice event atoms.

Architecture

Customer app
  |
  | 1. Start or join a call with a voice provider
  v
Voice provider or gateway
  |
  | 2. Stream audio and realtime provider events
  v
Customer app / voice sidecar
  |
  | 3. POST /v1/sessions
  |    Create Keel realtime session, bind permit,
  |    lock policy snapshot, and create budget envelope
  v
Keel
  |
  | 4. POST /v1/voice/sessions/{session_id}/events
  |    Ingest signed voice atoms and hash-only audio evidence
  v
Keel
  |
  | 5. POST /v1/sessions/{session_id}/turns
  |    Decide each turn against the locked session policy
  v
Keel
  |
  | 6. Optional voice tool call with action_verb=mpp.payment
  v
MPP relay
  |
  | 7. POST /v1/sessions/{session_id}/end
  |    End session and emit voice attestation
  v
Customer app
  |
  | 8. Retrieve attestation and run keel-verify
  v
Auditor / reviewer

The trust domains stay separate:

| Domain | Owner | Keel role |
| --- | --- | --- |
| Voice media transport | Your voice provider or gateway | None. Keel does not hold raw audio. |
| Event sidecar | Your app or gateway sidecar | Sign event atoms and send hashes, transcript spans, and tool-call metadata. |
| Session decision | Keel permit policy | Decide whether the session and its turns remain within policy and budget. |
| Embedded payment rail | Stripe MPP, when used | Dispatch through mpp.payment and bind the payment evidence. |
| Audit trail | Keel plus independent witnesses | Preserve tamper-evident session evidence with externally anchored, independently witnessed timestamp receipts where configured. |
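
To illustrate why hash-linked session events are tamper-evident, here is a minimal sketch of a generic hash chain. It is not Keel's chain format: the `link_event` and `chain_head` helpers, the all-zero genesis value, and the sorted-key JSON encoding are assumptions made for the example and do not reproduce Keel's canonicalization profile.

```python
import hashlib
import json


def link_event(prev_hash: str, event: dict) -> str:
    """Return a hash that commits to the previous hash and to this event."""
    # Sorted keys and compact separators give a stable byte encoding.
    # This is a stand-in, NOT Keel's canonicalization profile.
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def chain_head(events: list[dict], genesis: str = "sha256:" + "0" * 64) -> str:
    """Fold every event into a single head hash, in order."""
    head = genesis
    for event in events:
        head = link_event(head, event)
    return head


events = [
    {"type": "voice.session.started"},
    {"type": "voice.tool_call.requested", "call_id": "call_example"},
    {"type": "voice.session.ended"},
]
original_head = chain_head(events)

# Editing an earlier event changes every later link, including the head.
tampered = [dict(events[0], type="voice.session.forged")] + events[1:]
print(original_head != chain_head(tampered))  # True
```

Because each link commits to the one before it, a reviewer who trusts only the final head can detect a change anywhere in the sequence.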

The public start endpoint is POST /v1/sessions. If your internal design names this step /v1/realtime/sessions/start, map that step to the shipped public route.

Prerequisites

  • A voice provider or gateway integration that can emit realtime event atoms.
  • A Keel project on Production or Enterprise for the voice + MPP path and signed export review. See Plans & Entitlements.
  • A client-scoped Keel API key for session, event, and turn endpoints.
  • A policy that allows the realtime model and session operation your app uses.
  • keel-verifier 3.0.0 or newer for verifier UX, with voice attestation compatibility from 2.6.0 and schema v3 support from 2.7.0.
  • If you use voice-initiated payments, a Stripe MPP setup that follows Stripe MPP.

Set your Keel environment:

export KEEL_BASE_URL="https://api.keelapi.com"
export KEEL_API_KEY="keel_sk_your_project_key"
export KEEL_PROJECT_ID="11111111-1111-1111-1111-111111111111"

Quick start

  1. Start a realtime session with POST /v1/sessions.
  2. Ingest signed voice events with POST /v1/voice/sessions/{session_id}/events.
  3. Submit turn evaluations with POST /v1/sessions/{session_id}/turns.
  4. Send voice tool calls with action_verb when the call should dispatch through a specific verb.
  5. End the session with POST /v1/sessions/{session_id}/end.
  6. Verify the returned attestation artifact with keel-verify.

Python

pip install keel-sdk "keel-verifier>=3.0.0"
import json
import os

from keel_sdk import KeelClient

client = KeelClient(
    base_url=os.environ.get("KEEL_BASE_URL", "https://api.keelapi.com"),
    api_key=os.environ["KEEL_API_KEY"],
)
project_id = os.environ["KEEL_PROJECT_ID"]

# 1. Start the realtime session and bind the session-start permit.
session = client.request(
    "POST",
    "/v1/sessions",
    json={
        "permit": {
            "project_id": project_id,
            "idempotency_key": "idem_voice_session_example_001",
            "subject": {"type": "user", "id": "usr_example_voice"},
            "action": {"name": "realtime.session"},
            "resource": {
                "type": "request",
                "id": "req_example_voice_001",
                "attributes": {
                    "provider": "openai",
                    "model": "gpt-4o-realtime-preview",
                    "operation": "realtime.session",
                    "execution_mode": "realtime",
                    "estimated_input_tokens": 0,
                    "estimated_output_tokens": 0,
                },
            },
        },
        "metadata": {"voice_provider": "openai_realtime"},
        "risk_budget_total_micros": 250000,
    },
)
session_id = session["session_id"]

# 2. Ingest signed voice atoms. Audio is sent as a hash, never as media bytes.
client.request(
    "POST",
    f"/v1/voice/sessions/{session_id}/events",
    json={
        "batch_idempotency_key": "batch_voice_example_001",
        "events": [
            {
                "atom_type": "lifecycle_marker",
                "idempotency_key": "voice_evt_example_started",
                "captured_at_ns": 1780513200000000000,
                "payload": {"marker_type": "started"},
                "signatures": [
                    {
                        "algorithm": "ed25519",
                        "key_id": "voice-sidecar-example",
                        "signature": "sig_example_started",
                        "signed_at": "2026-06-03T18:00:00Z",
                    }
                ],
            },
            {
                "atom_type": "audio_chunk",
                "idempotency_key": "voice_evt_example_audio_001",
                "captured_at_ns": 1780513201500000000,
                "direction": "inbound",
                "payload": {
                    "content_hash": "sha256:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                    "format": "pcm16",
                    "duration_ms": 1500,
                },
                "signatures": [
                    {
                        "algorithm": "ed25519",
                        "key_id": "voice-sidecar-example",
                        "signature": "sig_example_audio_001",
                        "signed_at": "2026-06-03T18:00:01Z",
                    }
                ],
            },
        ],
    },
)

# 3. Ask Keel to decide the turn against the locked policy snapshot.
turn = client.request(
    "POST",
    f"/v1/sessions/{session_id}/turns",
    json={
        "input_tokens": 280,
        "output_tokens": 120,
        "risk_budget_used_micros": 4500,
        "metadata": {"provider_response_id": "resp_example_voice_001"},
    },
)
if turn["decision"] != "allow":
    raise RuntimeError(turn.get("user_facing_reason") or turn["decision"])

# 4. End the session and persist the returned attestation artifact.
ended = client.request(
    "POST",
    f"/v1/sessions/{session_id}/end",
    json={
        "usage_metrics": [
            {"meter": "realtime_minutes", "quantity": 3},
            {"meter": "input_tokens", "quantity": 280},
            {"meter": "output_tokens", "quantity": 120},
        ],
        "metadata": {"ended_by": "voice_gateway"},
    },
)
attestation = ended["attestation_artifact"]
with open("voice_session_export.json", "w", encoding="utf-8") as handle:
    json.dump(attestation, handle, indent=2)

print("session:", session_id)
print("chain head:", attestation["chain_head"]["content_hash"])

TypeScript

npm install keel-sdk
const baseUrl = process.env.KEEL_BASE_URL ?? "https://api.keelapi.com";
const apiKey = process.env.KEEL_API_KEY!;
const projectId = process.env.KEEL_PROJECT_ID!;

type Json = Record<string, unknown>;

// Minimal REST helper: sends a JSON body and throws on non-2xx responses.
async function keel<T>(method: string, path: string, body: Json): Promise<T> {
  const response = await fetch(`${baseUrl}${path}`, {
    method,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  const payload = (await response.json()) as T & { error?: { message?: string } };
  if (!response.ok) {
    throw new Error(payload.error?.message ?? `Keel request failed: ${response.status}`);
  }
  return payload;
}

// 1. Start the realtime session and bind the session-start permit.
const session = await keel<{ session_id: string }>("POST", "/v1/sessions", {
  permit: {
    project_id: projectId,
    idempotency_key: "idem_voice_session_example_002",
    subject: { type: "user", id: "usr_example_voice" },
    action: { name: "realtime.session" },
    resource: {
      type: "request",
      id: "req_example_voice_002",
      attributes: {
        provider: "openai",
        model: "gpt-4o-realtime-preview",
        operation: "realtime.session",
        execution_mode: "realtime",
        estimated_input_tokens: 0,
        estimated_output_tokens: 0,
      },
    },
  },
  metadata: { voice_provider: "openai_realtime" },
  risk_budget_total_micros: 250000,
});

// 2. Ingest a signed transcript atom. captured_at_ns is a numeric string
//    because a JavaScript number cannot hold epoch nanoseconds exactly.
await keel("POST", `/v1/voice/sessions/${session.session_id}/events`, {
  batch_idempotency_key: "batch_voice_example_002",
  events: [
    {
      atom_type: "transcript_span",
      idempotency_key: "voice_evt_example_transcript_001",
      captured_at_ns: "1780513202000000000",
      direction: "inbound",
      payload: {
        text: "I approve the support lookup for this account.",
        language: "en",
      },
      signatures: [
        {
          algorithm: "ed25519",
          key_id: "voice-sidecar-example",
          signature: "sig_example_transcript_001",
          signed_at: "2026-06-03T18:00:02Z",
        },
      ],
    },
  ],
});

// 3. Ask Keel to decide the turn against the locked policy snapshot.
const turn = await keel<{ decision: string; user_facing_reason?: string }>(
  "POST",
  `/v1/sessions/${session.session_id}/turns`,
  {
    input_tokens: 280,
    output_tokens: 120,
    risk_budget_used_micros: 4500,
    metadata: { provider_response_id: "resp_example_voice_002" },
  },
);
if (turn.decision !== "allow") {
  throw new Error(turn.user_facing_reason ?? turn.decision);
}

// 4. End the session and read the returned attestation artifact.
const ended = await keel<{ attestation_artifact: Json }>(
  "POST",
  `/v1/sessions/${session.session_id}/end`,
  {
    usage_metrics: [
      { meter: "realtime_minutes", quantity: 3 },
      { meter: "input_tokens", quantity: 280 },
      { meter: "output_tokens", quantity: 120 },
    ],
    metadata: { ended_by: "voice_gateway" },
  },
);

console.log("session:", session.session_id);
console.log("schema:", ended.attestation_artifact.schema);

Some SDK versions may not yet include generated realtime session subclients. The Python SDK exposes client.request() for these routes. The TypeScript example uses the same SDK environment variables and a small REST helper until the typed realtime surface is available.

Session lifecycle reference

Realtime voice sessions follow this sequence:

| Phase | Endpoint | What Keel binds |
| --- | --- | --- |
| Start | POST /v1/sessions | Session ID, session-start permit, project, provider, model, operation: "realtime.session", execution_mode: "realtime", policy snapshot ID and hash, principal facts, optional session budget envelope, and voice.session.started evidence when ingested. |
| Ingest events | POST /v1/voice/sessions/{session_id}/events | Batched voice atoms, event idempotency keys, captured_at_ns, direction, payload hash, sidecar signatures, session event type, governance record hash, and project chain linkage. |
| Evaluate turns | POST /v1/sessions/{session_id}/turns | Turn index, token estimates, risk budget reservation, decision outcome, authorization trace, and optional tool-call permit linkage. |
| Tool calls | tool_call on /turns or POST /v1/sessions/{session_id}/action-permits | voice.tool_call.requested, binding_session_event_hash, action-child permit, declared action_verb, target digest, and child-to-parent permit lineage. |
| Interruption evidence | POST /v1/sessions/{session_id}/interruption-events | Signed interruption state, delivered/withheld/replacement digests, cutoff offsets, provider cancel acknowledgement, and redaction or retention profile IDs. |
| End | POST /v1/sessions/{session_id}/end | Terminal status, terminal usage metrics, final counters, voice.session.ended, and the keel.voice.attestation.phase_a artifact when assembly succeeds. |

Session status is one of active, ended, or failed. Turns evaluate against the policy snapshot locked at session start. They do not create a fresh session permit. Tool calls are the path that creates child permits inside the session.
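
Since every turn is evaluated against the budget envelope fixed at session start, a gateway may want to mirror that budget locally to avoid submitting turns that are very likely to be denied. The sketch below is a client-side estimate only. The `BudgetEnvelope` class and its methods are hypothetical names, and Keel's turn decision remains authoritative.

```python
from dataclasses import dataclass


@dataclass
class BudgetEnvelope:
    """Local mirror of a session risk budget, tracked in micros."""

    total_micros: int
    used_micros: int = 0

    @property
    def remaining_micros(self) -> int:
        return self.total_micros - self.used_micros

    def can_afford(self, cost_micros: int) -> bool:
        # Negative costs are rejected so the estimate can never grow.
        return 0 <= cost_micros <= self.remaining_micros

    def record(self, cost_micros: int) -> None:
        if not self.can_afford(cost_micros):
            raise ValueError("turn would exceed the local budget estimate")
        self.used_micros += cost_micros


# Values taken from the quick start: 250000 total, 4500 used by one turn.
envelope = BudgetEnvelope(total_micros=250_000)
envelope.record(4_500)
print(envelope.remaining_micros)  # 245500
```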

Voice event atoms

Event ingestion accepts batches up to 100 atoms:

POST /v1/voice/sessions/{session_id}/events
Authorization: Bearer <client_project_api_key>
Content-Type: application/json

{
  "batch_idempotency_key": "batch_voice_example_003",
  "events": [
    {
      "atom_type": "audio_chunk",
      "idempotency_key": "voice_evt_example_audio_002",
      "captured_at_ns": 1780513202500000000,
      "direction": "outbound",
      "payload": {
        "content_hash": "sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
        "format": "opus",
        "duration_ms": 1200
      },
      "signatures": [
        {
          "algorithm": "ed25519",
          "key_id": "voice-sidecar-example",
          "signature": "sig_example_audio_002",
          "signed_at": "2026-06-03T18:00:03Z"
        }
      ]
    }
  ]
}
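
Because a batch holds at most 100 atoms, a sidecar that buffers more than that needs to split its queue before posting. The helper below is a minimal sketch of that split. The `batch_events` name and the key scheme (a prefix plus a zero-padded batch index) are assumptions for illustration, not a Keel convention.

```python
from typing import Iterator

MAX_ATOMS_PER_BATCH = 100  # batch limit stated in this guide


def batch_events(events: list[dict], key_prefix: str) -> Iterator[dict]:
    """Yield request bodies that each carry at most MAX_ATOMS_PER_BATCH atoms."""
    for start in range(0, len(events), MAX_ATOMS_PER_BATCH):
        batch_number = start // MAX_ATOMS_PER_BATCH
        yield {
            # Deterministic key: re-batching the same queue yields the same keys.
            "batch_idempotency_key": f"{key_prefix}_{batch_number:04d}",
            "events": events[start : start + MAX_ATOMS_PER_BATCH],
        }


# 250 placeholder atoms split into batches of 100, 100, and 50.
queued = [{"idempotency_key": f"voice_evt_{i}"} for i in range(250)]
batches = list(batch_events(queued, "batch_voice_example"))
print([len(batch["events"]) for batch in batches])  # [100, 100, 50]
```

Each yielded dictionary has the same top-level shape as the request body above, so it can be passed as the JSON body of the events call.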

Supported atom_type values:

audio_chunk, transcript_span, tool_invocation_attempt, lifecycle_marker, interruption, provider_opaque_marker

Audio atoms should carry content_hash, format, and duration_ms, not media bytes. Transcript atoms may carry transcript text when your retention policy allows it. Sidecar signatures currently accept ed25519, sidecar-session-cert, or ml-dsa-65 algorithm labels.
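
A hash-only audio payload can be built with the standard library. This sketch assumes that content_hash is a SHA-256 digest of the raw chunk bytes, written with the sha256: prefix used in the examples in this guide. Confirm exactly which bytes your sidecar should hash before relying on it. The `audio_chunk_payload` helper is a hypothetical name.

```python
import hashlib


def audio_chunk_payload(audio: bytes, fmt: str, duration_ms: int) -> dict:
    """Describe an audio chunk by hash, format, and duration, without the bytes."""
    digest = hashlib.sha256(audio).hexdigest()
    return {
        "content_hash": f"sha256:{digest}",
        "format": fmt,
        "duration_ms": duration_ms,
    }


# Placeholder bytes standing in for 1.5 seconds of PCM16 audio.
chunk = b"\x00\x01" * 1000
payload = audio_chunk_payload(chunk, fmt="pcm16", duration_ms=1500)
print(sorted(payload))  # ['content_hash', 'duration_ms', 'format']
```

The media bytes never enter the payload, so only the digest leaves the gateway.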

If you construct event batches in TypeScript, send epoch-nanosecond values with a bigint-aware encoder or as numeric strings that Keel can parse as integers. Plain JavaScript numbers cannot safely represent current epoch nanoseconds.
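
The precision problem can be reproduced outside JavaScript, because a Python float is the same IEEE 754 double as a JavaScript number. The sketch below uses an illustrative nanosecond value and shows that a numeric string survives a JSON round trip where a double does not.

```python
import json

captured_at_ns = 1780513202000000001  # illustrative epoch-nanosecond value

# A double has 53 bits of integer precision, so the last digits are rounded.
round_tripped = int(float(captured_at_ns))
print(round_tripped == captured_at_ns)  # False

# A numeric string keeps every digit through JSON encoding and decoding.
encoded = json.dumps({"captured_at_ns": str(captured_at_ns)})
decoded = int(json.loads(encoded)["captured_at_ns"])
print(decoded == captured_at_ns)  # True
```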

Tool calls via action_verb

Voice tool calls dispatch through the same action verb registry used by /v1/execute. If action_verb is omitted, Keel treats the tool call as ai.generate. If the voice assistant initiates a payment, set action_verb: "mpp.payment" and pass the Stripe MPP context.

{
  "input_tokens": 410,
  "output_tokens": 90,
  "risk_budget_used_micros": 7000,
  "tool_call": {
    "tool_name": "stripe_mpp_payment",
    "call_id": "call_example_voice_mpp_001",
    "action_verb": "mpp.payment",
    "arguments": {
      "requested_amount": 1499,
      "requested_currency": "usd",
      "mpp_target_url": "https://merchant.example/mpp/pay",
      "spend_request_payload": {
        "id": "lsrq_example_voice_approved_001",
        "amount": 1499,
        "currency": "usd",
        "status": "approved",
        "credential_type": "shared_payment_token",
        "shared_payment_token": {
          "id": "spt_example_voice_001",
          "valid_until": "2026-06-03T19:00:00Z"
        },
        "merchant_name": "Example Merchant",
        "merchant_url": "https://merchant.example/checkout",
        "context": "The user approved a one-time voice purchase for the monthly research report for no more than $14.99 USD."
      },
      "authority": {
        "amount_max": "1499",
        "currency_class": "USD_FIAT",
        "cadence": "one_shot",
        "ttl_seconds": 3600,
        "purpose_binding": "purchase.once"
      }
    }
  }
}

For mpp.payment, Keel uses tool_call.verb_context when present. If verb_context is omitted, Keel uses tool_call.arguments as the MPP verb context. The action-child permit records action_name: "mpp.payment", resource_provider: "stripe_mpp", resource_model: "stripe.mpp.v1", and the binding_session_event_hash for the voice tool-call event.
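
The defaulting rules described here can be modeled client-side, for example to log which verb and context a tool call is expected to dispatch with. This is a sketch of the documented behavior, not Keel's implementation, and both helper names are hypothetical.

```python
DEFAULT_ACTION_VERB = "ai.generate"


def effective_action_verb(tool_call: dict) -> str:
    """Return the declared verb, or the default when action_verb is omitted."""
    return tool_call.get("action_verb") or DEFAULT_ACTION_VERB


def mpp_verb_context(tool_call: dict) -> dict:
    """Return verb_context when present, otherwise fall back to arguments."""
    if "verb_context" in tool_call:
        return tool_call["verb_context"]
    return tool_call.get("arguments", {})


payment_call = {
    "tool_name": "stripe_mpp_payment",
    "action_verb": "mpp.payment",
    "arguments": {"requested_amount": 1499, "requested_currency": "usd"},
}
lookup_call = {"tool_name": "support_lookup", "arguments": {"account": "acct_example"}}

print(effective_action_verb(payment_call))  # mpp.payment
print(effective_action_verb(lookup_call))   # ai.generate
print(mpp_verb_context(payment_call)["requested_amount"])  # 1499
```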

See Stripe MPP for the full SpendRequest and spend authority reference.

Voice attestation shape

When an ended session can be assembled, Keel returns an attestation artifact in the end response:

{
  "artifact_version": "1.2.0",
  "schema": "keel.voice.attestation.phase_a",
  "schema_version": 3,
  "canonicalization_profile": "keel.canonical_json.attestation_artifact.v1",
  "issued_at": "2026-06-03T18:05:00Z",
  "session_metadata": {
    "session_id": "sess_example_voice_001",
    "status": "ended",
    "policy_snapshot_hash": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
  },
  "chain": [],
  "chain_head": {
    "content_hash": "sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
    "event_count": 0
  },
  "project_chain_head": {
    "content_hash": "sha256:cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc"
  },
  "permits": {},
  "policy_snapshot": {},
  "signatures": [],
  "rfc3161_timestamp_receipt": null,
  "timestamp_receipts": [],
  "tsa_status": "degraded_no_receipt",
  "verifier_compatibility": {
    "format_version": "1.0",
    "atoms_layer": false
  }
}

Important fields:

| Field | Meaning |
| --- | --- |
| schema | Always keel.voice.attestation.phase_a for Phase A voice attestations. |
| schema_version | 3 for the hash-only chain materialization supported by current Keel. |
| session_metadata | Session identity, lifecycle state, provider/model, policy snapshot ID/hash, parent permit, and budget context. |
| chain | Ordered session evidence entries. Schema v3 entries use payload_materialization: "hash_only" and canonicalized_payload_hash. |
| chain_head | Final session chain head and event count. |
| project_chain_head | Project-chain head covered by timestamp receipts. |
| permits | Parent session permit and child tool-call permits bound into the session. |
| policy_snapshot | Embedded snapshot whose hash must match session_metadata.policy_snapshot_hash. |
| signatures | Ed25519 issuer signatures over canonical artifact bytes. |
| rfc3161_timestamp_receipt / timestamp_receipts | RFC 3161 timestamp witness receipts over the project chain head, plus attempts and degraded status when no receipt is available. |
| verifier_compatibility | Lets keel-verify auto-detect the artifact as a voice session attestation. |

keel-verifier verifies the artifact schema, Ed25519 signature, session-chain hash linkage, embedded policy snapshot hash, and RFC 3161 timestamp receipt binding.
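
Before handing an artifact to the verifier, an application may want a cheap structural check so that an obviously malformed export fails early. The sketch below inspects only field shapes described in this guide. It performs no cryptographic verification and is not a substitute for keel-verify. The `precheck_attestation` name is hypothetical.

```python
EXPECTED_SCHEMA = "keel.voice.attestation.phase_a"


def precheck_attestation(artifact: dict) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    if artifact.get("schema") != EXPECTED_SCHEMA:
        problems.append("unexpected schema")
    if artifact.get("schema_version") == 3:
        # Schema v3 chain entries are described as hash-only.
        for index, entry in enumerate(artifact.get("chain", [])):
            if entry.get("payload_materialization") != "hash_only":
                problems.append(f"chain[{index}] is not hash_only")
            if "canonicalized_payload_hash" not in entry:
                problems.append(f"chain[{index}] lacks canonicalized_payload_hash")
    if not artifact.get("signatures"):
        problems.append("no issuer signatures present")
    return problems


sample = {
    "schema": "keel.voice.attestation.phase_a",
    "schema_version": 3,
    "chain": [
        {"payload_materialization": "hash_only", "canonicalized_payload_hash": "sha256:" + "a" * 64}
    ],
    "signatures": [{"algorithm": "ed25519"}],
}
print(precheck_attestation(sample))  # []
```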

Verifying with keel-verify

Install the verifier and run it against the session artifact:

python -m pip install "keel-verifier>=3.0.0"
keel-verify voice_session_export.json

When it loads a file that contains the voice verifier_compatibility block, the verifier auto-detects the artifact as kind: "voice_session_attestation". A successful human-readable run prints the session ID, chain head, and checks such as artifact_schema, issuer_signature, chain_integrity, policy_snapshot_hash, and rfc3161_timestamp_receipt.

The published verifier repository includes a synthetic v3 sample at:

keel-verifier/sample/voice_session_export_v3.json

You can use that sample to validate verifier installation before testing your own session exports.

Audit trail

Capture these fields from the session end response:

session_id
permit_id
status
policy_snapshot_id
policy_snapshot_hash
turn_count
cumulative_usage
risk_budget
attestation_artifact.schema
attestation_artifact.chain_head.content_hash
attestation_artifact.project_chain_head.content_hash
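
One way to capture these fields is to flatten them into a single audit record keyed by dotted path. The sketch below assumes the end response is a nested dictionary in which the listed paths resolve as written, and it records None for any path that is absent. The `pick` and `audit_record` helpers are hypothetical names.

```python
AUDIT_FIELDS = [
    "session_id",
    "permit_id",
    "status",
    "policy_snapshot_id",
    "policy_snapshot_hash",
    "turn_count",
    "cumulative_usage",
    "risk_budget",
    "attestation_artifact.schema",
    "attestation_artifact.chain_head.content_hash",
    "attestation_artifact.project_chain_head.content_hash",
]


def pick(data: dict, dotted_path: str):
    """Walk a dotted path through nested dictionaries; return None if missing."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def audit_record(end_response: dict) -> dict:
    return {path: pick(end_response, path) for path in AUDIT_FIELDS}


# Trimmed, illustrative end response.
ended = {
    "session_id": "sess_example_voice_001",
    "status": "ended",
    "attestation_artifact": {
        "schema": "keel.voice.attestation.phase_a",
        "chain_head": {"content_hash": "sha256:" + "b" * 64},
    },
}
record = audit_record(ended)
print(record["attestation_artifact.schema"])  # keel.voice.attestation.phase_a
```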

For immediate verification, persist attestation_artifact from POST /v1/sessions/{session_id}/end and run keel-verify on that JSON file.

For compliance review later, request a signed export around the session. Realtime sessions appear in exports using the session ID as request_id:

curl -sS -X POST https://api.keelapi.com/v1/compliance/exports \
  -H "Authorization: Bearer $KEEL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "export_type": "full_audit",
    "format": "json",
    "filters": {
      "request_id": "sess_example_voice_001",
      "include_replay_evidence": true
    }
  }'

Then verify the export before review. See Signed Exports, Independent Verification Overview, and Running Keel Verify.

FAQ

How is this different from Stripe MPP integration?

Realtime voice is session-attested. Stripe MPP is transaction-attested. A voice session produces a keel.voice.attestation.phase_a artifact that covers session events, turn decisions, tool-call permits, and the policy snapshot. MPP produces transaction evidence around one payment rail outcome. When voice calls MPP through action_verb: "mpp.payment", both surfaces produce verifier-readable evidence.

Does Keel hold our voice audio?

No. Keel ingests events, transcript spans when you choose to send them, hashes, signatures, and tool-call metadata. Your voice provider or gateway holds the audio stream and any media archive. For audio atoms, send content_hash, format, and duration_ms, not audio bytes.

Can I use voice and MPP payments together?

Yes. Voice tool calls can carry action_verb: "mpp.payment". Keel routes the call through the action verb registry, issues an action-child permit under the active voice session, validates the MPP spend authority, and binds the voice tool-call event hash into the permit. See Stripe MPP.

What voice providers work?

Any provider that can emit events your app or sidecar can translate into Keel voice atoms. Keel is provider-agnostic by design and works with OpenAI Realtime, Twilio Voice, Vapi, Deepgram Agents, and custom gateways that follow the realtime session model.

What happens if timestamp witnessing is unavailable?

Keel still returns a signed artifact, and the artifact records tsa_status plus timestamp attempts. Reviewers should treat degraded_no_receipt as degraded timestamp evidence, not as a silent success. See Multi-TSA Configuration.
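
A review pipeline can surface degraded timestamp evidence explicitly instead of letting it pass unnoticed. The sketch below reads only tsa_status. degraded_no_receipt is the one status value named in this guide, so any other value is reported as-is for a reviewer to confirm with keel-verify. The `timestamp_review_flag` helper is a hypothetical name.

```python
def timestamp_review_flag(artifact: dict) -> str:
    """Summarize timestamp evidence for a human reviewer."""
    status = artifact.get("tsa_status")
    if status is None:
        return "missing tsa_status: treat timestamp evidence as unverified"
    if status == "degraded_no_receipt":
        return "degraded: no timestamp receipt, escalate for review"
    return f"tsa_status={status}: confirm with keel-verify"


print(timestamp_review_flag({"tsa_status": "degraded_no_receipt"}))
# degraded: no timestamp receipt, escalate for review
```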
