Concept · Observability & Security

Zero Trust Networking

01

Why this matters

The traditional security model is castle-and-moat: a strong perimeter (firewall + VPN), and inside that perimeter, services trust each other implicitly. The flaw: one compromised laptop or one breached service inside the perimeter can reach everything. The 2013 Target breach and the SolarWinds compromise both followed this pattern, as have many other major incidents. Once an attacker is inside, the network gives them everything.

Zero trust flips the model: never trust, always verify. Every request — even between two of your own services in the same datacenter — must authenticate and authorize. No "internal vs external"; no "trusted network." Every connection is treated as if it's coming from the open internet.

02

The five principles

  1. Verify explicitly. Authenticate every request based on identity, device, location, and other available signals. No implicit trust because "you're inside the firewall."
  2. Least privilege access. Each service / user gets the minimum permissions to do its job. Calls are denied by default unless a policy explicitly allows them.
  3. Assume breach. Design as if attackers are already inside. Limit blast radius via segmentation.
  4. Continuous validation. A token issued an hour ago might no longer be valid. Re-check on every request, not just at login.
  5. End-to-end encryption. Every connection uses mTLS, which encrypts traffic and authenticates both sides, even within your own network.
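
As a rough illustration of principles 2 and 4, the sketch below re-checks a token on every request and denies anything not explicitly allowed. The `Token` type, the `ALLOWED` table, and the service and action names are hypothetical, invented for this example rather than taken from any particular product.

```python
from dataclasses import dataclass

# Hypothetical allowlist of (caller identity, action) pairs.
# Anything not listed here is denied: least privilege, deny by default.
ALLOWED = {
    ("checkout", "orders:read"),
    ("checkout", "orders:write"),
    ("reporting", "orders:read"),
}


@dataclass(frozen=True)
class Token:
    subject: str          # verified identity of the caller
    expires_at: float     # Unix timestamp after which the token is invalid
    revoked: bool = False


def authorize(token: Token, action: str, now: float) -> bool:
    """Evaluated on every request, not only at login."""
    # Continuous validation: a token that was fine an hour ago may be
    # expired or revoked by now.
    if token.revoked or now >= token.expires_at:
        return False
    # Deny by default: only explicitly listed pairs pass.
    return (token.subject, action) in ALLOWED
```

Passing `now` in explicitly keeps the check deterministic and easy to test; a real enforcement point would read the clock and consult a revocation source itself.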

03

The architecture

Practical implementation in a cloud-native stack:

  • Identity for every workload. Each pod / function / VM has its own identity (SPIFFE, IAM role, K8s ServiceAccount). No shared service accounts; no "the prod-app password."
  • mTLS everywhere. A service mesh (Istio, Linkerd) issues short-lived certs per workload + enforces mTLS on every connection.
  • Policy engine. "Service A can call Service B's /orders endpoint, but not /admin." Policy enforced at the proxy, not in app code. Open Policy Agent (OPA) is common.
  • Short-lived credentials. No long-lived API keys. Tokens expire in minutes; auto-rotated.
  • Audit log. Every authz decision logged. Who tried to access what, when, granted/denied.
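
To make the policy engine and audit log bullets concrete, here is a minimal sketch in plain Python. It is a stand-in for a real engine such as OPA, not OPA's API: the `Rule` type, the `check` function, the in-memory `AUDIT_LOG`, and the service names are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str        # caller identity (hypothetical service name)
    method: str        # HTTP method
    path_prefix: str   # allowed path and everything beneath it


# "Service A can call Service B's /orders endpoint, but not /admin."
RULES = [Rule("service-a", "GET", "/orders")]

# In-memory stand-in for an append-only audit sink.
AUDIT_LOG = []


def path_matches(path: str, prefix: str) -> bool:
    # Match on segment boundaries so "/orders-export" does not match "/orders".
    return path == prefix or path.startswith(prefix + "/")


def check(source: str, method: str, path: str, timestamp: str) -> bool:
    """Return the authz decision and record it, whether granted or denied."""
    allowed = any(
        r.source == source and r.method == method and path_matches(path, r.path_prefix)
        for r in RULES
    )
    AUDIT_LOG.append(json.dumps({
        "who": source,
        "what": f"{method} {path}",
        "when": timestamp,
        "decision": "granted" if allowed else "denied",
    }, sort_keys=True))
    return allowed
```

Keeping the rules as data rather than code is the point of enforcing policy at the proxy: the allowlist can change without redeploying any application.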

04

The user-side: BeyondCorp

Google's BeyondCorp is the canonical user-facing zero-trust deployment. The principles applied to employee access:

  • No corporate VPN. Engineers access internal apps over the public internet.
  • Every request authenticated via Google identity + device certificate.
  • Access decision based on user identity + device posture (managed laptop? OS up to date? not jailbroken?).
  • Per-app policy ("Engineering team can access internal Bug Tracker"; not based on "is on corp WiFi").

Result: Google employees worked remotely from anywhere with no VPN, securely, years before COVID forced everyone else into the same model. Cloudflare Access, Tailscale, Twingate are the productized versions for non-Google companies.
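
The access decision described in the bullets above can be sketched roughly as follows. This is not Google's implementation; the `Device` fields, the `APP_POLICY` table, and the group and app names are hypothetical. Note what is absent from the inputs: the network the request came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    managed: bool      # enrolled, company-managed laptop?
    os_patched: bool   # OS up to date?
    jailbroken: bool


# Hypothetical per-app policy: app name -> groups allowed to use it.
APP_POLICY = {"bug-tracker": {"engineering"}}


def device_healthy(device: Device) -> bool:
    return device.managed and device.os_patched and not device.jailbroken


def allow_access(user_groups: set, device: Device, app: str) -> bool:
    """Decide from user identity + device posture; network location is not an input."""
    allowed_groups = APP_POLICY.get(app, set())   # unknown app -> nobody allowed
    return device_healthy(device) and bool(user_groups & allowed_groups)
```

For example, an engineer on a managed but unpatched laptop is denied the bug tracker even on corp WiFi, while the same engineer on a healthy laptop is allowed from anywhere.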

05

Deep dive — the workload identity flow

How does Service A prove to Service B who it is, without sharing a long-lived credential?

  1. Service A starts. Platform (K8s, ECS, EC2) attaches a workload identity — a signed token from the platform: "I am pod foo in namespace bar."
  2. A wants to call B. A presents its workload identity token to a local sidecar (Envoy in the mesh).
  3. The sidecar exchanges that token for a fresh mTLS cert from the certificate authority (CA); the cert proves A's identity to anyone speaking mTLS. Cert TTL is short: minutes to hours, depending on how the mesh is configured.
  4. A's sidecar opens an mTLS connection to B's sidecar. Both sides present certs. Mutual proof of identity.
  5. B's sidecar checks policy: "Is identity A allowed to call this path on B?" Yes → forward to B's app. No → reject with 403.
  6. B's app processes the request in plain HTTP. Sidecar handled all the security.
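
The six steps above can be simulated in a few lines. This is a toy model under loud assumptions: HMAC with shared demo keys stands in for the asymmetric signatures and X.509 certs a real platform and CA would use, there is no actual TLS handshake, and every name (`platform_issue_token`, `ca_issue_cert`, `sidecar_b_handle`, the keys, the identity string) is hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

PLATFORM_KEY = b"demo-platform-key"   # stand-in for the platform's signing key
CA_KEY = b"demo-ca-key"               # stand-in for the CA's private key
CERT_TTL_SECONDS = 300                # short-lived: five minutes in this sketch


def _sign(key: bytes, message: str) -> str:
    return hmac.new(key, message.encode(), hashlib.sha256).hexdigest()


def platform_issue_token(workload: str) -> str:
    """Step 1: the platform attests which workload this is."""
    return f"{workload}.{_sign(PLATFORM_KEY, workload)}"


@dataclass(frozen=True)
class Cert:
    identity: str
    not_after: float
    signature: str


def ca_issue_cert(token: str, now: float) -> Cert:
    """Steps 2-3: exchange a valid platform token for a short-lived cert."""
    workload, _, sig = token.rpartition(".")
    expected = _sign(PLATFORM_KEY, workload)
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise PermissionError("invalid workload token")
    not_after = now + CERT_TTL_SECONDS
    return Cert(workload, not_after, _sign(CA_KEY, f"{workload}|{not_after}"))


def cert_valid(cert: Cert, now: float) -> bool:
    expected = _sign(CA_KEY, f"{cert.identity}|{cert.not_after}")
    return hmac.compare_digest(cert.signature.encode(), expected.encode()) and now < cert.not_after


# Policy at B's sidecar: which identity may call which path.
POLICY = {("ns/bar/pod/foo", "/orders")}


def sidecar_b_handle(cert: Cert, path: str, now: float) -> int:
    """Steps 4-5: verify the peer's cert, then check policy."""
    if not cert_valid(cert, now):
        # In real mTLS the handshake fails before any HTTP status exists.
        raise ConnectionError("mTLS handshake failed")
    if (cert.identity, path) not in POLICY:
        return 403
    return 200   # step 6: forward to B's app over plain HTTP
```

A usage run would issue a token for `ns/bar/pod/foo`, trade it for a cert, and call `sidecar_b_handle(cert, "/orders", now)`. Once `now` passes `not_after`, the same cert is refused, which is why no long-lived secret ever needs to be shared.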

App code knows nothing about identity, certs, or mTLS. The platform + mesh enforce zero trust. Compromise of one service exposes only what that service was already authorized for; lateral movement beyond that is blocked.

Interview answer

"Zero-trust replaces network-perimeter security with per-request identity verification. Every workload has its own short-lived identity, every connection uses mTLS via service mesh, every authorization checked against policy at the sidecar. No 'internal vs external' — even pod-to-pod calls are authenticated."

06

Real-world

Google BeyondCorp

The reference

Google's internal security model, begun around 2011 and described publicly from 2014. No VPN; per-request user + device verification. Widely cited as the deployment that popularized zero trust in practice.

Cloudflare Access

BeyondCorp for everyone else

SaaS zero-trust access for company apps. Replaces the legacy VPN. Adopted by thousands of companies after the COVID-era shift to remote work.

Istio + SPIFFE

Workload identity

SPIFFE is an open-source standard for workload identity. Istio issues SPIFFE-format identities to enable cluster-wide mTLS. Together they form a common K8s-native zero-trust stack.

HashiCorp Boundary + Vault

Zero-trust toolkit

Boundary for user access, Vault for short-lived credentials. Common in HashiCorp shops.

07

Used in problems

Payment gateway must implement zero-trust for PCI compliance. E-commerce internal microservices use mTLS via service mesh. WhatsApp's media servers fetch identities per session — long-lived secrets are forbidden.
