Service A calls Service B. Where is B? Hard-coding an IP breaks the moment B autoscales, moves to a new host, or deploys to a new region. Hard-coding a hostname helps, but still routes through DNS caches that are slow to reflect instance changes. At microservice scale — hundreds of services, thousands of instances, constantly churning — you need a service discovery mechanism that stays accurate in seconds, not minutes.
Every serious production platform has one. Kubernetes, Consul, Eureka, Envoy xDS — same problem, different vocabulary.
02
The moving parts
Any service-discovery system has three roles:
Registry — an authoritative database of "service X has instances at IPs [a, b, c], each healthy/unhealthy."
Registration — how instances announce themselves (self-register on startup, or a sidecar / platform registers on their behalf).
Discovery — how a client finds which instance to call (query the registry, or have instances pushed to you).
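The three roles can be sketched as a toy in-memory registry. This is only an illustration: the `Registry` and `Instance` names, the method names, and the sample addresses are all hypothetical and do not correspond to any real registry's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Instance:
    host: str
    port: int


@dataclass
class Registry:
    """Toy registry: service name -> {instance: is_healthy}. Hypothetical API."""
    _services: Dict[str, Dict[Instance, bool]] = field(default_factory=dict)

    def register(self, service: str, instance: Instance) -> None:
        # Registration: an instance (or a sidecar on its behalf) announces itself.
        self._services.setdefault(service, {})[instance] = True

    def set_health(self, service: str, instance: Instance, healthy: bool) -> None:
        # The registry tracks healthy/unhealthy per instance.
        self._services[service][instance] = healthy

    def deregister(self, service: str, instance: Instance) -> None:
        self._services.get(service, {}).pop(instance, None)

    def discover(self, service: str) -> List[Instance]:
        # Discovery: a client asks which instances it may call.
        return [i for i, ok in self._services.get(service, {}).items() if ok]


registry = Registry()
a = Instance("10.0.0.1", 8080)
b = Instance("10.0.0.2", 8080)
registry.register("payments", a)
registry.register("payments", b)
registry.set_health("payments", b, False)
healthy = registry.discover("payments")  # only instance a is returned
```

A production registry would also need replication, persistence, and a way to notice dead instances, which this sketch leaves out.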
03
Client-side vs server-side discovery
Client-side
Client queries registry, picks instance
A client library fetches the instance list from the registry (Consul, Eureka, etcd), caches it locally, and picks one instance per request with its own load-balancing algorithm. Low latency and no extra hop, but every client has to implement discovery logic (or use a library).
Server-side
LB / gateway queries registry
Client hits a well-known LB (ALB, Envoy, Istio ingress). The LB queries the registry and picks an instance, so the client sees one virtual endpoint. Simpler clients, at the cost of one extra hop, and the LB becomes a scaling bottleneck.
Kubernetes is both
K8s Services are server-side: the client calls service.namespace.svc and kube-proxy routes the connection via iptables/IPVS. Add a service mesh (Istio, Linkerd) and each pod's sidecar holds the endpoint list and picks the instance itself, which is effectively client-side.
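To make the client-side variant concrete, here is a minimal sketch of a client that caches the instance list and round-robins over it. The `DiscoveryClient` class, the `fetch` callback (standing in for a registry query), and the 5-second cache TTL are assumptions for illustration, not any particular library's interface.

```python
import itertools
import time
from typing import Callable, List, Optional


class DiscoveryClient:
    """Hypothetical client-side discovery: cache the list, round-robin per request."""

    def __init__(self, fetch: Callable[[], List[str]], cache_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch            # stands in for a registry query
        self._cache_ttl = cache_ttl
        self._clock = clock            # injectable so the example is deterministic
        self._instances: List[str] = []
        self._fetched_at: Optional[float] = None
        self._counter = itertools.count()

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._cache_ttl:
            self._instances = list(self._fetch())
            self._fetched_at = now

    def pick(self) -> str:
        self._refresh_if_stale()
        if not self._instances:
            raise LookupError("no instances available")
        # Round-robin: no extra network hop, the choice is made in-process.
        return self._instances[next(self._counter) % len(self._instances)]


fake_now = [0.0]
backends = ["10.0.0.1:8080", "10.0.0.2:8080"]
client = DiscoveryClient(fetch=lambda: backends, cache_ttl=5.0,
                         clock=lambda: fake_now[0])
picks = [client.pick() for _ in range(3)]
```

The cache is the trade-off: between refreshes the client may keep routing to an instance the registry has already dropped, so the cache TTL bounds how stale its view can get.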
04
How registration stays accurate
An instance that crashes or loses network must be removed from the registry fast or clients will keep hitting a black hole. Two strategies:
Heartbeat-based TTL (Eureka, Consul). The instance registers with a TTL (say 30s) and renews it periodically (say every 10s). If renewals stop, the registry expires the entry once the TTL runs out. Simple, but detection is delayed by up to one TTL.
Platform-driven (Kubernetes). The platform knows which pods are healthy (via readiness probes) and updates Endpoints for you. No registration at all from the service's POV.
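The heartbeat-based TTL strategy can be sketched as a lease table keyed by instance. The `LeaseRegistry` class and its methods are hypothetical, and the 30s TTL simply mirrors the example figure above; neither Eureka nor Consul is implemented this way line for line.

```python
import time
from typing import Callable, Dict, List, Tuple


class LeaseRegistry:
    """Hypothetical TTL registry: an entry lives only while heartbeats keep arriving."""

    def __init__(self, ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so the example is deterministic
        self._expires_at: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, service: str, instance: str) -> None:
        # Register or renew: push the expiry one TTL into the future.
        self._expires_at[(service, instance)] = self._clock() + self._ttl

    def live_instances(self, service: str) -> List[str]:
        now = self._clock()
        # Expire every lease whose deadline has passed.
        for key in [k for k, t in self._expires_at.items() if t <= now]:
            del self._expires_at[key]
        return sorted(inst for (svc, inst) in self._expires_at if svc == service)


fake_now = [0.0]
reg = LeaseRegistry(ttl=30.0, clock=lambda: fake_now[0])
reg.heartbeat("orders", "10.0.0.1:80")
reg.heartbeat("orders", "10.0.0.2:80")

fake_now[0] = 10.0
reg.heartbeat("orders", "10.0.0.1:80")  # only the first instance renews

fake_now[0] = 35.0
live = reg.live_instances("orders")     # the second lease ran out at t=30
```

The detection delay is visible here: an instance that dies right after a heartbeat stays listed for a full TTL.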
Pair either of these with passive health checks on the caller side (outlier detection): "even if the registry says healthy, circuit-break if I've seen 10 failures in a row."
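The caller-side check described above might look like the following sketch, which counts consecutive failures per instance and filters the registry's list accordingly. `FailureTracker` is a hypothetical name, and the threshold of 10 is taken from the example in the text.

```python
from typing import Dict, List


class FailureTracker:
    """Hypothetical caller-side filter: eject an instance after N straight failures."""

    def __init__(self, threshold: int = 10):
        self._threshold = threshold
        self._consecutive_failures: Dict[str, int] = {}

    def record(self, instance: str, success: bool) -> None:
        if success:
            # Any success resets the streak.
            self._consecutive_failures[instance] = 0
        else:
            self._consecutive_failures[instance] = (
                self._consecutive_failures.get(instance, 0) + 1
            )

    def is_ejected(self, instance: str) -> bool:
        return self._consecutive_failures.get(instance, 0) >= self._threshold

    def usable(self, from_registry: List[str]) -> List[str]:
        # Trust the registry's list, minus instances this caller has seen failing.
        return [i for i in from_registry if not self.is_ejected(i)]


tracker = FailureTracker(threshold=10)
for _ in range(10):
    tracker.record("10.0.0.2:80", success=False)
tracker.record("10.0.0.1:80", success=False)
tracker.record("10.0.0.1:80", success=True)

usable = tracker.usable(["10.0.0.1:80", "10.0.0.2:80"])
```

A fuller circuit breaker would also re-admit an ejected instance after a cool-down (the half-open state); that is omitted here to keep the sketch short.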
05
Deep dive — Envoy's xDS, the modern approach
Envoy (and by extension Istio, AWS App Mesh, Consul Connect) uses the xDS protocol family: a set of gRPC streaming APIs through which a central control plane pushes service-discovery data to sidecars.
EDS (Endpoint Discovery Service) — pushes "service X has these endpoints with these load-balancing weights."
CDS (Cluster) — service metadata: timeouts, connection pools, TLS.
RDS (Route) — HTTP routing rules.
LDS (Listener) — which ports to accept on.
The control plane (Istiod, Consul server, etc.) watches the environment (K8s API, Consul catalog, Vault) and streams updates to every sidecar. When a new pod becomes ready, every other pod typically learns about it within moments, often well under a second, and can route to it.
Net effect: microservice discovery works at the same speed as the underlying orchestrator, with zero service-side library code. The sidecar abstracts everything away.
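The push model can be illustrated with an in-process simulation. This is not the xDS API: there is no gRPC here, and `ControlPlane`, `Sidecar`, `subscribe`, and `set_endpoints` are hypothetical names chosen only to show the shape of an EDS-style push.

```python
from typing import Callable, Dict, List

# Callback signature a subscriber exposes: (service name, current endpoints).
EndpointUpdate = Callable[[str, List[str]], None]


class ControlPlane:
    """Hypothetical stand-in for a control plane that pushes endpoint updates."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, List[str]] = {}
        self._subscribers: List[EndpointUpdate] = []

    def subscribe(self, on_update: EndpointUpdate) -> None:
        self._subscribers.append(on_update)
        # A new subscriber first receives the current state.
        for service, endpoints in self._endpoints.items():
            on_update(service, list(endpoints))

    def set_endpoints(self, service: str, endpoints: List[str]) -> None:
        # Called when the watched environment changes; fan out to every sidecar.
        self._endpoints[service] = list(endpoints)
        for notify in self._subscribers:
            notify(service, list(endpoints))


class Sidecar:
    def __init__(self, name: str) -> None:
        self.name = name
        self.endpoints: Dict[str, List[str]] = {}

    def on_update(self, service: str, endpoints: List[str]) -> None:
        # The application never sees this; the sidecar just swaps its routing table.
        self.endpoints[service] = endpoints


cp = ControlPlane()
s1 = Sidecar("a")
cp.subscribe(s1.on_update)
cp.set_endpoints("cart", ["10.0.0.1:80"])

s2 = Sidecar("b")
cp.subscribe(s2.on_update)  # late joiner catches up immediately
cp.set_endpoints("cart", ["10.0.0.1:80", "10.0.0.3:80"])
```

Because updates are pushed rather than polled, a sidecar's view lags the control plane only by delivery time, not by a cache-refresh interval.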
06
Real-world
Kubernetes
Platform-native discovery
DNS-based (my-service.namespace.svc.cluster.local) plus Endpoints objects kept up to date by a controller. No explicit registration.
Consul
General-purpose registry
HashiCorp. Works across VMs, containers, bare metal. Services register via agent; health-checked. DNS and HTTP interfaces.
Netflix Eureka
JVM client-side
Built at Netflix. Eureka clients register and send heartbeats; clients pull the registry and cache it locally. Still common in Spring Cloud stacks.
AWS Cloud Map
Managed
DNS + HTTP namespace for AWS services. Integrated with ECS, Route 53, App Mesh.
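For the Kubernetes entry above, discovery from the application's point of view is just a DNS lookup. The sketch below builds the name in the form shown earlier and resolves it with the standard library. The helper names are hypothetical, `cluster.local` is assumed as the cluster domain, and the lookup only succeeds from inside a cluster, so it is left commented out.

```python
import socket
from typing import List


def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    # Assumes the conventional <service>.<namespace>.svc.<cluster-domain> layout.
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(hostname: str, port: int) -> List[str]:
    # Plain DNS resolution; returns the distinct IP addresses for the name.
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


name = service_fqdn("my-service", "namespace")
# Inside a cluster one might then call:
# addresses = resolve(name, 80)
```

For an ordinary ClusterIP Service the lookup returns a single virtual IP and kube-proxy does the instance selection; a headless Service returns the pod IPs directly.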
07
Used in problems
News feed's microservices rely on service mesh discovery (every service has hundreds of instances). E-commerce checkout calls payment/shipping/inventory services via discovery. Notification system fans out to channel-specific microservices.