03
Patterns in practice
Feature flags for hot-disable. When a downstream starts misbehaving, flip a feature flag off instantly. The caller short-circuits to the fallback without deploying code. Stripe, LinkedIn, and Netflix all wire critical features this way.
Timeout-first, not retry-first. Every downstream call has a tight deadline (200ms, not 30s). If it's not back in time, skip it and use fallback. Don't let slow dependencies cascade into slow users.
Fallback is also tested. Most outages happen when the primary fails AND the fallback was never rehearsed. Chaos engineering (force-fail dependency X; verify fallback engages correctly) is how you keep fallbacks real.
Degrade cheap, not expensive. Serving trending-for-all to a million users is cheap; recomputing personalized recs synchronously is not. If the recs service is down, the fallback must be a pre-warmed global ranking, not a synchronous emergency calculation.