Every request to your API starts with "what IP is api.example.com?" That's DNS. Get it wrong and users can't reach you. Set TTLs too long and users stay stuck on a dead IP for hours after you change it. Depend on a single DNS provider and you go down when it does (see the 2016 Dyn outage, when a DDoS attack on one provider made many major sites unreachable).
DNS is also the first place in your stack where you can do global traffic management: geo-route users to the nearest region, fail over a dead datacenter, split traffic by percentage for A/B tests. If you don't use it that way, you're leaving power on the table.
02
How resolution works
Client wants api.example.com. It asks its resolver (usually run by the ISP or Google at 8.8.8.8, or Cloudflare 1.1.1.1). The resolver walks the DNS hierarchy:
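Root name server → "who handles .com?" → "ask the .com TLD servers."
TLD name server → "who handles example.com?" → "ask example.com's authoritative name servers."
Authoritative name server → "what's api.example.com?" → "IP 203.0.113.42 (or: here's a CNAME → resolve that)."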
Each answer includes a TTL (time-to-live) — how long the resolver can cache it. Next time a client asks, resolver returns the cached answer without walking the hierarchy.
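The walk-then-cache behavior can be sketched in a few lines of Python. This is a toy model, not a real resolver: the server names (`com-tld-server`, `example-auth-server`), the 300-second TTL, and the injectable `clock` are assumptions made up for illustration, and no network traffic is involved.

```python
# Toy stand-ins for the three tiers of the hierarchy (all names are hypothetical).
ROOT = {"com": "com-tld-server"}
TLD = {"com-tld-server": {"example.com": "example-auth-server"}}
AUTHORITATIVE = {
    "example-auth-server": {"api.example.com": ("203.0.113.42", 300)}  # (ip, ttl seconds)
}


class CachingResolver:
    def __init__(self, clock):
        self.clock = clock   # callable returning the current time in seconds
        self.cache = {}      # name -> (ip, expiry time)
        self.walks = 0       # how many times the full hierarchy was walked

    def _walk_hierarchy(self, name):
        labels = name.split(".")
        tld = labels[-1]
        zone = ".".join(labels[-2:])
        tld_server = ROOT[tld]                  # root refers us to the TLD servers
        auth_server = TLD[tld_server][zone]     # TLD refers us to the zone's servers
        self.walks += 1
        return AUTHORITATIVE[auth_server][name]  # authoritative answer: (ip, ttl)

    def resolve(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]                    # cache hit: no walk needed
        ip, ttl = self._walk_hierarchy(name)
        self.cache[name] = (ip, now + ttl)      # remember the answer until the TTL expires
        return ip
```

Calling `resolve` twice within the TTL walks the hierarchy once; a call after the TTL has expired triggers a second walk.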
03
Record types you care about
| Record | Purpose | Example |
| --- | --- | --- |
| A | Domain → IPv4 | api.example.com → 203.0.113.42 |
| AAAA | Domain → IPv6 | api.example.com → 2001:db8::1 |
| CNAME | Alias → another domain | www.example.com → example.com |
| MX | Mail server for a domain | example.com → mx.google.com |
| TXT | Arbitrary text (SPF, DKIM, domain verification) | v=spf1 include:_spf.google.com ~all |
| NS | Which name servers are authoritative | example.com → ns1.cloudflare.com |
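To make the CNAME row concrete, here is a minimal sketch of how a lookup follows an alias until it reaches a record of the requested type. The zone is a plain dictionary, the `lookup` helper is hypothetical, and the A record for `example.com` (203.0.113.10) is an invented value so the alias has something to land on.

```python
# Hypothetical zone data: (name, record type) -> list of values.
ZONE = {
    ("api.example.com", "A"): ["203.0.113.42"],
    ("api.example.com", "AAAA"): ["2001:db8::1"],
    ("example.com", "A"): ["203.0.113.10"],          # invented for this example
    ("www.example.com", "CNAME"): ["example.com"],
    ("example.com", "TXT"): ["v=spf1 include:_spf.google.com ~all"],
}


def lookup(name, rtype, zone=ZONE, max_hops=8):
    """Return the values for (name, rtype), following CNAME aliases."""
    for _ in range(max_hops):
        if (name, rtype) in zone:
            return zone[(name, rtype)]
        alias = zone.get((name, "CNAME"))
        if alias is None:
            return []            # no record of this type and no alias to follow
        name = alias[0]          # restart the lookup at the alias target
    raise RuntimeError("CNAME chain too long (possible loop)")
```

The hop limit is a design choice for the sketch: it keeps a misconfigured alias loop from spinning forever.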
04
DNS as a traffic-management tool
Authoritative DNS servers can return different answers to different clients. This enables:
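Geo-routing. US client asks → returns US-East IP. EU client asks → returns EU-West IP. Usually close to latency-optimal, though the authoritative server sees the resolver's address (or an EDNS Client Subnet hint), not the client's, so the match is approximate.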
Weighted round-robin. 80% of answers point to region A, 20% to region B. Used for gradual traffic shifts during deploys.
Health-based failover. Authoritative DNS monitors each endpoint. If US-East fails health checks, return US-West until recovery.
Anycast. A routing-level technique rather than a DNS answer trick, often paired with the above: many servers announce the same IP from different regions via BGP, and the network itself routes each client to the nearest. Used by Cloudflare, most CDNs, and 1.1.1.1.
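The three DNS-level policies (geo-routing, weighted answers, health-based failover) can be sketched as answer-selection functions. Everything named here is an assumption for illustration: the endpoint table, the `GEO_PREFERENCE` ordering, and the `healthy` flag, which a real provider would set from its own health checks.

```python
import random

# Hypothetical endpoints; IPs come from documentation ranges.
ENDPOINTS = {
    "us-east": {"ip": "203.0.113.10", "weight": 80, "healthy": True},
    "eu-west": {"ip": "198.51.100.20", "weight": 20, "healthy": True},
}
# Preference order per client region: nearest first, then fallbacks.
GEO_PREFERENCE = {"US": ["us-east", "eu-west"], "EU": ["eu-west", "us-east"]}


def geo_answer(client_region, endpoints=ENDPOINTS):
    """Geo-routing with failover: the first healthy region in preference order."""
    for region in GEO_PREFERENCE[client_region]:
        if endpoints[region]["healthy"]:
            return endpoints[region]["ip"]
    raise RuntimeError("no healthy endpoint")


def weighted_answer(rng, endpoints=ENDPOINTS):
    """Weighted selection among healthy endpoints (e.g. an 80/20 traffic split)."""
    healthy = [e for e in endpoints.values() if e["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy endpoint")
    ips = [e["ip"] for e in healthy]
    weights = [e["weight"] for e in healthy]
    return rng.choices(ips, weights=weights)[0]
```

With both regions healthy, `weighted_answer(random.Random())` returns the us-east IP for roughly 80% of calls; marking us-east unhealthy sends every answer to eu-west.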
05
TTL — the propagation tradeoff
TTL controls how long resolvers cache the answer. Short TTL = fast failover, high query load on your DNS. Long TTL = cached everywhere, slow to propagate changes.
60–300 s: TTL for actively managed records (blue/green)
1–24 h: TTL for stable records
~48 h: worst-case stale cache (ignoring misbehaving resolvers)
a few ms: typical resolution time (cached)
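Before a planned cutover: lower the TTL days in advance, at least one full old-TTL period ahead, because resolvers that cached the record under the old TTL keep it until that TTL expires. After the cutover is stable: raise the TTL again to reduce query load and cost.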
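Both halves of the tradeoff reduce to simple arithmetic, sketched below. The function names are hypothetical, and the load estimate assumes every active resolver re-queries exactly once per TTL, which is a rough upper-level approximation rather than a measurement.

```python
from datetime import datetime, timedelta


def authoritative_qps(active_resolvers: int, ttl_seconds: int) -> float:
    """Rough query load: each resolver refreshes the record once per TTL."""
    return active_resolvers / ttl_seconds


def latest_time_to_lower_ttl(cutover: datetime, old_ttl_seconds: int) -> datetime:
    """Caches filled just before the TTL change still hold the record for the
    old TTL, so the change must land at least that long before the cutover."""
    return cutover - timedelta(seconds=old_ttl_seconds)


if __name__ == "__main__":
    for ttl in (60, 300, 3600):
        print(ttl, round(authoritative_qps(100_000, ttl), 1))
    print(latest_time_to_lower_ttl(datetime(2025, 1, 10, 12, 0), 86_400))
```

Under this assumption, dropping the TTL from one hour to 60 seconds multiplies authoritative query load by 60, which is why the short TTL is worth keeping only around the cutover.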
Figure: TTL=300s, when does each resolver see the new IP?
~300 s: average failover time with TTL=300
~10%: share of resolvers that ignore TTL
~24 hr: long tail of misbehaving resolvers
3–7 days: how far ahead to lower the TTL before a planned cutover
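A back-of-the-envelope model of that timeline, using the figures above, might look like the sketch below. It rests on two loud assumptions that are not from the source: well-behaved resolvers cached the old record at uniformly random moments (so their remaining cache time is spread evenly between 0 and the TTL), and the misbehaving share lets go of the old answer evenly over the 24-hour long tail.

```python
def stale_fraction(t_seconds: float,
                   ttl: float = 300.0,
                   misbehaving_share: float = 0.10,
                   long_tail: float = 24 * 3600.0) -> float:
    """Estimated fraction of resolvers still serving the old IP t seconds after the change."""
    # Well-behaved resolvers: linearly drain to zero by the time one TTL has passed.
    well_behaved = (1 - misbehaving_share) * max(0.0, 1 - t_seconds / ttl)
    # Misbehaving resolvers: assumed to drain linearly over the long tail instead.
    misbehaving = misbehaving_share * max(0.0, 1 - t_seconds / long_tail)
    return well_behaved + misbehaving


if __name__ == "__main__":
    for t in (0, 60, 150, 300, 3600, 24 * 3600):
        print(t, round(stale_fraction(t), 3))
```

The shape matters more than the exact numbers: most traffic moves within one TTL, and a small remainder lingers far longer than the TTL suggests.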
06
Deep dive — anycast, the trick behind 1.1.1.1
Normal IP: one machine owns that IP. Anycast IP: many machines across the world each announce "I am 1.1.1.1" via BGP. When a client sends a packet to 1.1.1.1, routing delivers it to the topologically nearest announcer, which is usually (though not always) also the geographically closest one.
Result: a single IP address served from ~300 cities. Any one PoP can fail and traffic automatically reroutes to the next closest. No DNS change needed. Clients don't know or care. This is how CDNs, global DNS resolvers, and modern anycast load balancers work.
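The rerouting behavior can be illustrated with a toy model in which each client network has a path cost to every PoP and traffic goes to the cheapest PoP still announcing the address. The PoP names, client networks, and costs are all made up, and real BGP path selection involves policy as well as distance.

```python
# Hypothetical path costs from each client network to each PoP announcing the anycast IP.
PATH_COST = {
    "paris-isp": {"frankfurt": 1, "new-york": 5, "singapore": 9},
    "tokyo-isp": {"frankfurt": 8, "new-york": 7, "singapore": 2},
}


def nearest_pop(client_network, announcing_pops):
    """Pick the lowest-cost PoP among those currently announcing the address."""
    costs = PATH_COST[client_network]
    candidates = [pop for pop in announcing_pops if pop in costs]
    if not candidates:
        raise RuntimeError("no PoP is announcing the address")
    return min(candidates, key=lambda pop: costs[pop])
```

If `frankfurt` stops announcing, `nearest_pop("paris-isp", {"new-york", "singapore"})` falls through to `new-york` without any DNS record changing.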
For your systems: usually you rent anycast from someone else (Cloudflare, AWS Global Accelerator). Running your own anycast means BGP peering at multiple internet exchanges — a commitment, not a side project.
07
Real-world
Route 53
AWS managed DNS
Integrated with AWS. Geo routing, weighted routing, health checks. 100% availability SLA. Default for AWS-native stacks.
Cloudflare DNS
Anycast everywhere
Free tier, and consistently among the fastest resolvers. Offers both authoritative DNS (name servers under ns.cloudflare.com) and a recursive resolver (1.1.1.1).
NS1 / Dyn
Traffic-steering specialists
Sophisticated routing (real-user metrics, data-driven). Pricier. Used by Netflix, LinkedIn historically.
Redundant providers
Run two authoritative DNS providers
Since the 2016 Dyn outage, many availability-sensitive companies keep DNS in two providers simultaneously (e.g., Cloudflare + Route 53). If one goes down, the other still answers.
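Why two providers help can be shown with a small availability check: a zone keeps resolving as long as at least one of its listed name servers belongs to a provider that is still up. The name server hostnames and provider labels below are hypothetical placeholders, not real delegation data.

```python
# Hypothetical NS sets: name server hostname -> provider operating it.
SINGLE_PROVIDER = {
    "ns1.provider-a.example": "provider-a",
    "ns2.provider-a.example": "provider-a",
}
DUAL_PROVIDER = {
    "ns1.provider-a.example": "provider-a",
    "ns2.provider-a.example": "provider-a",
    "ns1.provider-b.example": "provider-b",
    "ns2.provider-b.example": "provider-b",
}


def reachable_name_servers(name_servers, providers_down):
    """Name servers that can still answer while the given providers are down."""
    return sorted(ns for ns, provider in name_servers.items()
                  if provider not in providers_down)


def zone_resolves(name_servers, providers_down):
    """A resolver only needs one reachable authoritative server to get an answer."""
    return len(reachable_name_servers(name_servers, providers_down)) > 0
```

With `provider-a` down, `zone_resolves(SINGLE_PROVIDER, {"provider-a"})` is `False` while `zone_resolves(DUAL_PROVIDER, {"provider-a"})` stays `True`. The operational cost, not modeled here, is keeping the two providers' records in sync.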
08
Used in problems
A URL shortener uses short, memorable domains plus a low TTL for fast failover. YouTube and Netflix use DNS geo-routing to direct users to the nearest CDN edge. Google Maps uses DNS-based region routing.