Egress-Controlled Deployments

An egress-controlled deployment pairs Keel managed execution with customer-owned network controls so that provider traffic is forced through Keel. It is intended for Enterprise customers who need a defensible story that workloads cannot bypass Keel to reach provider APIs directly.

This is a deployment architecture, not a runtime feature toggle. The application-layer controls Keel provides — managed-execution dispatch verification, code-level provider endpoint allowlisting, outbound binding trace metadata, and the dispatch.egress_bound governance event — are properties of Keel-managed execution. The network-layer controls that block direct provider egress are operated by the customer (Kubernetes NetworkPolicy, VPC routing, security groups, egress proxy, or equivalent). Neither layer alone is sufficient; the pattern is the combination.

For the broader security boundary, see Security.

When to use this pattern

Use an egress-controlled deployment when any of the following apply:

  • You need a defensible story that workloads cannot bypass Keel to reach provider APIs directly.
  • You are subject to auditor scrutiny on AI-spend or data-egress controls (for example SOC 2 CC6, ISO 27001 A.13, HIPAA, or FedRAMP-aligned programs).
  • You run multi-tenant or multi-team workloads where a single misconfigured service account could leak spend or data via direct provider calls.
  • Your threat model includes insider-adjacent risk: developers with legitimate cluster access who could otherwise call providers directly.

This pattern is not required for teams using Keel purely for advisory governance or for non-sensitive internal experimentation.

Architecture

┌─────────────────────────────────────────────────────────────┐
│ Customer workload namespace                                 │
│                                                             │
│  [app pods] ──egress allowed──▶ [Keel managed execution]    │
│      │                                     │                │
│      └──egress DENIED──▶ provider APIs     │                │
│                                            ▼                │
│                            [provider endpoint allowlisting] │
│                                            │                │
│                                            ▼                │
│                         provider APIs (only Keel may reach) │
└─────────────────────────────────────────────────────────────┘

Two independent layers combine:

  1. Network layer (customer-owned). Kubernetes NetworkPolicy, VPC routing, firewall rules, or egress proxy ACLs deny workload egress to provider hostnames or IPs. Only Keel managed-execution egress is permitted to reach providers.
  2. Application layer (Keel-managed). Keel’s managed dispatcher performs signed request-bound verification, enforces provider endpoint allowlisting at the code level, and emits outbound binding trace metadata plus a dispatch.egress_bound governance event for every dispatch.

The network layer stops the bypass path; the application layer proves what Keel dispatched and why.

Threat model

In scope:

  • Direct-provider bypass. A workload attempts to call api.openai.com (or equivalent) directly, skipping Keel.
  • Credential misuse via bypass. A leaked or shared provider API key is used from inside the customer network.
  • Dispatch tampering. An attacker with code-path access attempts to modify request content after permit issuance.
  • Silent endpoint substitution. An attacker attempts to redirect Keel dispatch to an unapproved provider endpoint.

Out of scope (handled by other controls or explicitly not covered here):

  • Compromise of customer provider API keys outside the network boundary.
  • Compromise of the Keel control plane itself — covered by Keel’s own trust stack: hash-chained governance records, signed exports, externally anchored checkpoints, and TSA receipts. See Verifying Keel Evidence.
  • Customer-hosted runtimes where dispatch happens outside Keel-managed execution (see § Caveats below).
  • Provider-side verification of Keel permits — providers do not verify Keel permits; outbound binding headers are traceability metadata only.

What this protects against

When both layers are correctly configured, the deployment materially reduces:

  • Workloads reaching provider APIs without going through Keel.
  • Undocumented or unapproved provider endpoints being dialed from managed execution.
  • Request content drifting between permit issuance and dispatch — detected by managed-execution signed request-bound verification.
  • Silent routing changes on fallback — recorded via the permit.binding_rebound_on_fallback governance event.

What this does not protect against

Stated plainly: if your network allows direct outbound calls to provider endpoints, Keel cannot prevent those calls. This deployment pattern assumes you restrict that path.

  • Keel does not automatically provide network-layer egress enforcement. Direct provider egress is blocked by customer network policy, not by Keel.
  • Provider endpoint allowlisting is code-level hardening inside the managed dispatcher, not a network egress control. A workload that bypasses Keel entirely is not subject to it.
  • Outbound binding trace headers are traceability metadata. Providers do not consume, verify, or enforce them.
  • Permit-first mode (permit issued, dispatch performed by caller) remains advisory unless paired with managed execution and network egress controls. See § Caveats below.
  • Compromise of the customer provider API key outside the governed path is not mitigated by this pattern.
  • Keel cannot attest to execution integrity in customer-hosted runtimes under this pattern.

Kubernetes NetworkPolicy example

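The example assumes that Keel managed execution runs in the keel-system namespace with pods labeled app=keel-managed-execution, and that workload pods run in the workloads namespace and are labeled egress-via-keel=true. The namespace selectors rely on the kubernetes.io/metadata.name label, which Kubernetes sets automatically on namespaces in version 1.21 and later.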

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-direct-provider-egress
  namespace: workloads
spec:
  podSelector:
    matchLabels:
      egress-via-keel: "true"
  policyTypes:
    - Egress
  egress:
    # Allow DNS
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
    # Allow egress only to Keel managed execution
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: keel-system
          podSelector:
            matchLabels:
              app: keel-managed-execution
      ports:
        - protocol: TCP
          port: 443
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-keel-provider-egress
  namespace: keel-system
spec:
  podSelector:
    matchLabels:
      app: keel-managed-execution
  policyTypes:
    - Egress
  egress:
    # DNS
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
    # Provider HTTPS egress (scope further with an egress proxy/FQDN policy)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.0.0/16
              - 10.0.0.0/8
      ports:
        - protocol: TCP
          port: 443

Notes:

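  • NetworkPolicy alone does not resolve FQDNs. For domain-level enforcement (api.openai.com, api.anthropic.com, etc.), layer an FQDN-aware egress control (Cilium FQDN policy, Istio ServiceEntry plus Sidecar egress, or a forward proxy such as Envoy or Squid) in front of this policy.
  • The workloads namespace policy is the load-bearing control: it denies direct provider egress from app pods. It selects only pods labeled egress-via-keel=true, so pods in the namespace without that label are not restricted by it. Pair it with a namespace-wide default-deny egress policy.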
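As one illustration of the FQDN-level enforcement mentioned in the notes above, the following is a minimal sketch that assumes the cluster runs Cilium as its CNI. The policy name and the provider hostname list are illustrative assumptions, not values taken from Keel; substitute the provider set your deployment actually uses.

```yaml
# Sketch only: assumes Cilium is the cluster CNI. The policy name and the
# hostnames below are illustrative placeholders.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: keel-provider-fqdn-egress
  namespace: keel-system
spec:
  endpointSelector:
    matchLabels:
      app: keel-managed-execution
  egress:
    # DNS must pass through Cilium's DNS proxy so that toFQDNs rules can
    # learn which IPs the allowed names resolve to.
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            "k8s:k8s-app": kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY
          rules:
            dns:
              - matchPattern: "*"
    # HTTPS egress only to the approved provider hostnames.
    - toFQDNs:
        - matchName: api.openai.com
        - matchName: api.anthropic.com
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Allow rules from different policies are additive, so a policy like this narrows nothing while the broad ipBlock rule in allow-keel-provider-egress is still in place for the same pods. In a setup like this, the FQDN rule would replace that ipBlock rule rather than sit alongside it.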

VPC and security-group pattern

For deployments outside Kubernetes, or as defense in depth around the cluster:

  1. Place Keel managed execution in a dedicated subnet with its own NAT gateway and route table.
  2. Attach a security group (or equivalent cloud firewall) that permits outbound 443 only to the provider FQDN set.
  3. Place workloads in separate subnets whose route tables and security groups permit egress only to the Keel managed-execution load balancer, not to the internet.
  4. Use a VPC endpoint or dedicated NAT with an egress filter (AWS Network Firewall, GCP Cloud NGFW, Azure Firewall) in front of the Keel subnet to constrain outbound traffic to the approved provider hostnames.
  5. Flow logs from both subnets are the audit evidence that bypass attempts did not reach providers.

This produces two independent enforcement points: the workload subnet cannot route to providers at all, and the Keel subnet cannot reach arbitrary destinations.
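The workload-side enforcement point could be sketched as follows, assuming an AWS deployment managed with CloudFormation and a Keel managed-execution load balancer that has its own security group. The parameter and resource names are hypothetical; other clouds and tooling would express the same rule differently.

```yaml
# Sketch only: hypothetical CloudFormation fragment. VpcId,
# KeelLoadBalancerSecurityGroupId, and WorkloadSecurityGroup are
# illustrative names, not part of any Keel-provided template.
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  KeelLoadBalancerSecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id
Resources:
  WorkloadSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Workloads may reach only the Keel managed-execution load balancer
      VpcId: !Ref VpcId
      SecurityGroupEgress:
        # Declaring an explicit egress rule means the default allow-all
        # egress rule is not created for this group.
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          DestinationSecurityGroupId: !Ref KeelLoadBalancerSecurityGroupId
```

Referencing the load balancer's security group instead of a CIDR keeps the rule valid if the load balancer's addresses change. The route-table restriction on the workload subnets is still needed as the second, independent control.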

Provider key custody

  • Store customer provider keys only where Keel managed execution can read them — Keel’s encrypted provider-key storage, or a secret manager scoped to the Keel managed-execution service account.
  • Do not distribute provider keys to workload pods or developer workstations. A key present on a workload pod defeats the egress-controlled pattern even when network policy is correct.
  • Rotate provider keys on any suspicion of workload-side exposure, not just Keel-side exposure.
  • For regulated environments, prefer per-project provider keys so a single compromise is bounded.

Monitoring and audit signals

Wire the following signals into your alerting pipeline or external audit tooling. All are emitted as Keel governance events or surfaced on permits and responses:

  • dispatch.egress_bound — every managed dispatch emits this. Absence for a project that normally dispatches is itself a signal.
  • permit.binding_rebound_on_fallback — the permit binding was rebound during provider fallback. Spikes warrant review.
  • provider_endpoint_not_allowed — the managed dispatcher refused to dial an endpoint outside the code-level allowlist. Should be zero in steady state.
  • permit_request_hash_mismatch — request content did not match the permit’s signed binding. A non-zero rate indicates caller drift or tampering.
  • permit_signed_constraint_exceeded — dispatch would have violated a signed constraint.
  • permit_binding_key_mismatch — binding key presented at dispatch does not match the permit’s bound key.
  • permit_signature_invalid — permit signature failed verification.

Pair these application-layer signals with network-layer signals:

  • VPC flow-log denies from workload subnets to provider IP ranges (flow logs record addresses, not hostnames).
  • Egress proxy rejections for non-Keel sources attempting provider hostnames.
  • DNS query logs for provider domains from non-Keel pods.

A bypass attempt typically shows up first at the network layer (denied connection) and should correlate with an absence of dispatch.egress_bound for that workload identity.

Operational checklist

Before declaring a deployment egress-controlled:

  • Workload namespaces have a default-deny egress NetworkPolicy.
  • Workload egress to Keel managed execution is the only permitted external path.
  • FQDN-level enforcement is in place for Keel’s own provider egress (egress proxy or cloud NGFW).
  • No provider API keys are deployed to workload pods or developer environments.
  • Provider keys are scoped per project where feasible.
  • dispatch.egress_bound presence is monitored per project.
  • provider_endpoint_not_allowed, permit_request_hash_mismatch, permit_signed_constraint_exceeded, permit_binding_key_mismatch, permit_signature_invalid, and permit.binding_rebound_on_fallback are alerted on.
  • Flow logs from workload subnets are retained and reviewed.
  • A runbook exists for investigating a denied-egress event from workloads.
  • Permit-first mode is disabled for workloads that must be egress-controlled, or is paired with managed execution.
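The first checklist item is not covered by the NetworkPolicy example earlier on this page, which selects only pods labeled egress-via-keel=true. A minimal sketch of a namespace-wide default-deny egress policy, assuming the same workloads namespace (the policy name is illustrative):

```yaml
# Sketch only: denies all egress for every pod in the namespace unless
# another NetworkPolicy explicitly allows it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: workloads
spec:
  podSelector: {}   # empty selector matches all pods in the namespace
  policyTypes:
    - Egress
```

Because NetworkPolicy allow rules are additive, labeled pods keep the DNS and Keel egress granted by deny-direct-provider-egress, while unlabeled pods lose all egress, including DNS, until a policy allows it.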

Caveats

Permit-first mode

In permit-first mode, Keel issues a permit and the caller performs the provider dispatch directly. In this mode:

  • Signed request-bound dispatch verification does not run inside Keel — Keel is not on the dispatch path.
  • The code-level provider endpoint allowlist does not apply.
  • Outbound binding trace metadata is not emitted by Keel for that call.
  • The permit remains a governance record, but enforcement is advisory: a caller that ignores the permit’s constraints will not be stopped by Keel.

Permit-first mode becomes enforcement-grade only when paired with (a) managed execution for the actual dispatch, and (b) network egress controls that prevent the caller from reaching providers directly. In an egress-controlled deployment, treat permit-first as a development or observability mode, not as an enforcement mode, unless both pairings are in place.

Customer-hosted runtimes

Customer-hosted runtimes — where dispatch happens in customer-operated infrastructure rather than Keel-managed execution — are not covered by this pattern. Specifically:

  • Keel cannot attest to the integrity of a runtime it does not operate.
  • Managed-execution signed request-bound verification, code-level provider endpoint allowlisting, and dispatch.egress_bound emission are properties of Keel-managed execution. A customer-hosted runtime would need to implement its own equivalent, including a separate attestation and verifier design, before any egress-controlled claim applies to it.
  • Customer-hosted deployments should be described as governed (permits, policy decisions, governance records) but not as egress-controlled under this pattern.

Allowed claim language

The following claims are accurate for a correctly configured egress-controlled deployment:

  • “Provider traffic is forced through Keel-managed execution by customer network policy.”
  • “Keel-managed execution performs signed request-bound dispatch verification before calling providers.”
  • “Managed dispatch is constrained by code-level provider endpoint allowlisting.”
  • “Every managed dispatch emits outbound binding trace metadata and a dispatch.egress_bound governance event.”
  • “Direct-to-provider bypass from workloads is blocked at the network layer and detectable at the governance layer.”

Forbidden claim language

Do not use these claims; they are false or misleading:

  • “Tamper-proof.”
  • “Unbypassable.”
  • “Guaranteed enforcement.”
  • “Providers verify Keel permits.” — they do not.
  • “Keel blocks all direct provider egress by itself.” — it does not; customer network policy does.
  • “Provider endpoint allowlisting prevents bypass.” — it is code-level hardening, not network egress control.
  • “Outbound binding headers are verified by providers.” — they are traceability metadata.

What this surface does and does not claim

  • This is a deployment architecture, not a runtime feature toggle. The application-layer controls Keel provides (managed dispatch verification, code-level allowlisting, governance events) are present in Keel-managed execution; the network-layer controls are operated by the customer.
  • The architecture is available to Enterprise customers; concrete deployment work involves coordination between Keel solutions engineering and customer network operations.
  • Keel does not expose a server-side toggle that flips a deployment to “egress-controlled.” The properties are achieved through configuration and architecture, not through a feature flag.
  • The forbidden-claim language section is normative for customer-facing materials. Marketing language describing a Keel deployment as “egress-controlled” should match the allowed-claim language above.