Anycast routing. All CDN PoPs announce the same IP prefix (typically a single /24 per service) via BGP. Each network's routers pick their best path to that prefix, which is usually a topologically nearby PoP; "nearby" is measured in routing terms (AS-path length and operator policy), not geography. A user in London is normally routed to the London PoP and a user in Tokyo to the Tokyo PoP, without DNS games. If a PoP dies, its BGP announcement is withdrawn and traffic reroutes to the next-best PoP.
This is dramatically simpler than DNS-based geographic routing (the Akamai-era approach): no TTL tuning, no guessing the user's location from the resolver's address, no stale DNS caches. Cloudflare popularized anycast for CDNs, and it is now common across the industry.
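The failover behaviour can be illustrated with a toy model. This is only a sketch under a strong simplifying assumption: route selection here uses AS-path length alone, whereas real BGP evaluates local preference and other attributes first. The `Router` class, the PoP codes, and the path lengths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Router:
    """A user's upstream router; remembers each PoP announcement it has heard."""
    name: str
    # PoP name -> AS-path length of that PoP's announcement as seen from here
    routes: Dict[str, int] = field(default_factory=dict)

    def announce(self, pop: str, as_path_len: int) -> None:
        self.routes[pop] = as_path_len

    def withdraw(self, pop: str) -> None:
        self.routes.pop(pop, None)

    def best_pop(self) -> Optional[str]:
        # Simplification: shortest AS path wins; ties are broken by name.
        if not self.routes:
            return None
        return min(self.routes, key=lambda pop: (self.routes[pop], pop))


london_isp = Router("london-isp")
london_isp.announce("LHR", 1)   # every PoP announces the same anycast prefix
london_isp.announce("AMS", 2)
london_isp.announce("NRT", 5)

print(london_isp.best_pop())    # LHR
london_isp.withdraw("LHR")      # London PoP fails; its announcement is withdrawn
print(london_isp.best_pop())    # AMS
```

No client-side change is involved in the failover: the user keeps connecting to the same IP address, and only the routers' view of where that address lives changes.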
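Cache hierarchy with tiered cache. Problem: 300 PoPs × 100 TB each is 30 PB of storage in aggregate, but cumulative customer content is far more than that, and each PoP caches independently, so any one PoP can only hold a subset. Result: with a flat edge, every PoP that misses on cold content fetches it from origin on its own, hammering customer servers with repeated fetches for the same objects.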
Solution: a regional tier between edge and origin. A PoP miss queries its regional tier. Because the regional tier has more storage and aggregates traffic from many PoPs, it has a better hit ratio. Only regional-tier misses hit origin. Net: origin load drops by roughly 10× compared to a flat edge, assuming the regional tier absorbs about 90% of edge misses.
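The arithmetic behind the 10× figure can be checked in a few lines. The hit ratios below are assumptions chosen for illustration: a 90% edge hit ratio and a regional tier that serves 90% of the requests the edge misses.

```python
def origin_fraction(edge_hit: float, regional_hit: float = 0.0) -> float:
    """Fraction of user requests that end up at the customer's origin."""
    return (1.0 - edge_hit) * (1.0 - regional_hit)


flat = origin_fraction(edge_hit=0.90)                       # no regional tier
tiered = origin_fraction(edge_hit=0.90, regional_hit=0.90)  # assumed regional hit ratio

print(f"flat edge: {flat:.1%} of requests reach origin")    # 10.0%
print(f"tiered:    {tiered:.1%} of requests reach origin")  # 1.0%
print(f"reduction: {flat / tiered:.0f}x")                   # 10x
```

The reduction factor is simply 1 / (1 - regional hit ratio), so the benefit depends entirely on how much of the edge's miss traffic the regional tier can serve.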
Request Flow — Cache Hit / Miss Cascade
```mermaid
sequenceDiagram
    participant U as User
    participant E as Edge PoP
    participant R as Regional
    participant O as Origin
    U->>E: GET /foo.jpg
    E->>E: cache lookup
    alt HIT (90% of the time)
        E-->>U: serve from NVMe (p50 ~5 ms)
    else MISS at edge
        E->>R: fetch /foo.jpg
        alt HIT at regional
            R-->>E: bytes + cache-control
            E-->>U: serve + populate edge cache
        else MISS at regional
            R->>O: fetch /foo.jpg
            O-->>R: bytes
            R-->>E: bytes
            E-->>U: serve + populate both caches
        end
    end
```
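The same cascade can be sketched in code. This is a minimal illustration, not a production design: the `Tier` class and the `origin` function are hypothetical, and the caches are unbounded dictionaries with no TTLs, eviction, or cache-control handling.

```python
from typing import Callable, Dict, List


class Tier:
    """One cache tier; `upstream` is whatever this tier asks on a miss."""

    def __init__(self, name: str, upstream: Callable[[str], bytes]):
        self.name = name
        self.upstream = upstream
        self.store: Dict[str, bytes] = {}
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self.store:
            return self.store[key]       # HIT: serve locally
        self.misses += 1
        body = self.upstream(key)        # MISS: go one level up
        self.store[key] = body           # populate this tier on the way back
        return body


origin_fetches: List[str] = []


def origin(key: str) -> bytes:
    """Stand-in for the customer's origin server."""
    origin_fetches.append(key)
    return f"bytes of {key}".encode()


regional = Tier("regional", upstream=origin)
edge_lhr = Tier("edge-LHR", upstream=regional.get)
edge_cdg = Tier("edge-CDG", upstream=regional.get)

edge_lhr.get("/foo.jpg")    # miss at edge, miss at regional, one origin fetch
edge_lhr.get("/foo.jpg")    # hit at edge
edge_cdg.get("/foo.jpg")    # miss at this edge, hit at regional
print(len(origin_fetches))  # 1
```

The third call is the point of the regional tier: a second PoP missing on the same object is served from the shared tier instead of generating another origin fetch.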
Cache key. The default is (host, URL path, query string). Customers can override it: strip tracking params, normalize case, or vary on request attributes such as Accept-Encoding or device type. Misconfigured cache keys are a top source of "why isn't my site caching?" support tickets.
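A cache-key function along these lines might look as follows. It is a sketch under stated assumptions: the `TRACKING_PARAMS` strip list, the `vary_on` parameter, and the choice to sort query parameters are illustrative, not any particular CDN's defaults.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed strip list; a real deployment would make this customer-configurable.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def cache_key(host: str, url: str, headers: dict,
              vary_on: tuple = ("accept-encoding",)) -> tuple:
    """Build a normalized (host, path, query, vary) cache key."""
    parts = urlsplit(url)
    # Drop tracking params and sort the rest so parameter order does not matter.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in TRACKING_PARAMS
    )
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    vary = tuple((name, lowered.get(name, "")) for name in vary_on)
    # Host names are case-insensitive; the path is left as-is.
    return (host.lower(), parts.path, urlencode(query), vary)


a = cache_key("Example.com", "/img/foo.jpg?w=200&utm_source=news",
              {"Accept-Encoding": "gzip"})
b = cache_key("example.com", "/img/foo.jpg?utm_campaign=x&w=200",
              {"accept-encoding": "gzip"})
print(a == b)  # True
```

Without the normalization, the two requests above would occupy two cache entries and each would cost a separate miss, which is the kind of silent hit-ratio loss behind many caching support tickets.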
Purge / invalidation. Two flavors: URL-based (specific URLs) and tag-based (arbitrary label applied to responses at cache-time, then "purge all cached items with tag X"). Purge flow:
- Customer API call hits control plane.
- Control plane writes purge message to a global pub/sub bus (Kafka / internal multicast).
- Every PoP subscribes; consumes message.
- Each PoP invalidates matching entries in its local cache (usually by writing a tombstone, so subsequent lookups are treated as misses).
Total time: seconds. Not instant, but predictable. A zone-wide "purge everything" can be much slower (hours) when it is carried out by invalidating millions of individual keys.
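The purge flow above can be sketched as follows. Everything here is hypothetical: `PurgeBus` stands in for the global pub/sub bus and delivers synchronously, `PopCache` is a toy in-memory cache, and the message shape (`type`, `tag`, `urls`) is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set


class PopCache:
    """Toy per-PoP cache with a tag index and tombstone-based purge."""

    def __init__(self, name: str):
        self.name = name
        self.entries: Dict[str, bytes] = {}
        self.tags: Dict[str, Set[str]] = defaultdict(set)  # tag -> cached keys
        self.tombstones: Set[str] = set()

    def put(self, key: str, body: bytes, tags=()) -> None:
        self.entries[key] = body
        self.tombstones.discard(key)   # a fresh fill supersedes an earlier purge
        for tag in tags:
            self.tags[tag].add(key)    # tag applied at cache time

    def get(self, key: str) -> Optional[bytes]:
        if key in self.tombstones:
            return None                # purged: behaves like a miss
        return self.entries.get(key)

    def on_purge(self, message: dict) -> None:
        if message["type"] == "url":
            keys = set(message["urls"])
        else:  # tag-based purge
            keys = self.tags.pop(message["tag"], set())
        self.tombstones |= keys


class PurgeBus:
    """Stand-in for the global pub/sub bus; every PoP receives every message."""

    def __init__(self) -> None:
        self.subscribers: List[PopCache] = []

    def publish(self, message: dict) -> None:
        for pop in self.subscribers:
            pop.on_purge(message)


bus = PurgeBus()
pops = [PopCache("LHR"), PopCache("NRT")]
bus.subscribers.extend(pops)
for pop in pops:
    pop.put("/a.jpg", b"A", tags=["product-42"])
    pop.put("/b.jpg", b"B", tags=["product-42"])
    pop.put("/c.css", b"C")

# What the control plane would publish after a customer's tag-purge API call.
bus.publish({"type": "tag", "tag": "product-42"})
print([pop.get("/a.jpg") for pop in pops])  # [None, None]
print([pop.get("/c.css") for pop in pops])  # [b'C', b'C']
```

Writing a tombstone is cheap compared with deleting data from disk, which is one reason a purge of a bounded set of keys can complete in seconds.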
Stampede prevention. If 1M users request a newly popular image simultaneously and the edge cache misses, all 1M requests would be forwarded to the regional tier or origin. Protection: request coalescing. Concurrent requests for the same key at a single PoP collapse into one upstream fetch: only the first miss goes upstream, and the others wait for its response. The invariant is at most one request in flight per key per PoP.
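Request coalescing can be sketched with `asyncio`. This is one possible shape, assuming a single-process PoP worker: the `Coalescer` class and `slow_origin` function are hypothetical, and the sketch only deduplicates in-flight fetches; it does not store the result in a cache.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class Coalescer:
    """Collapse concurrent misses for the same key into one upstream fetch."""

    def __init__(self, fetch_upstream: Callable[[str], Awaitable[bytes]]):
        self._fetch = fetch_upstream
        self._in_flight: Dict[str, asyncio.Task] = {}

    async def get(self, key: str) -> bytes:
        task = self._in_flight.get(key)
        if task is None:
            # First miss for this key: start the only upstream fetch.
            task = asyncio.create_task(self._fetch(key))
            self._in_flight[key] = task
            # Clear the slot once the fetch finishes, whether it succeeded or failed.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Leader and followers all wait on the same task; shield() keeps one
        # cancelled waiter from cancelling the shared fetch.
        return await asyncio.shield(task)


async def demo() -> int:
    upstream_calls = 0

    async def slow_origin(key: str) -> bytes:
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.05)      # simulated upstream latency
        return b"image-bytes"

    pop = Coalescer(slow_origin)
    results = await asyncio.gather(*(pop.get("/hot.jpg") for _ in range(1000)))
    assert all(body == b"image-bytes" for body in results)
    return upstream_calls


print(asyncio.run(demo()))  # 1
```

A thousand concurrent requests produce a single upstream call. If the fetch fails, every waiter sees the same exception and the next request starts a fresh fetch.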
Interview answer
"Anycast BGP routing sends users to the nearest PoP. Each PoP runs nginx/envoy with NVMe-backed cache, TLS termination with SNI-indexed certs, WAF, and workers. Cache misses cascade to a regional tier (origin shield) before ever hitting the customer's origin, cutting origin load 10×. Purge is global pub/sub from a control plane — seconds to propagate to all PoPs. Stampede protection via per-key request coalescing so only one upstream fetch runs for concurrent misses. Real-time logs stream to a central analytics pipeline."