Feature Flags & Rollouts

01

Why this matters

"Deploy a new feature to all 100M users at once" is asking for a 3am incident. Feature flags let you separate code deploy from feature release: ship the code dark, then turn it on for 1% of users, watch metrics, ramp to 10%, 50%, 100%. If anything goes wrong, flip it off — no rollback, no redeploy.

Modern continuous deployment depends on this. Companies such as Stripe reportedly deploy many times per day; at that cadence, changes reach customers through flag-controlled rollouts rather than all at once.

02

Four flavors of flag

Type	Lifetime	Use for
Release flag	Days–weeks	Hide an unfinished feature in production until it's ready
Experiment flag	Weeks (A/B test duration)	Compare variant A vs B; analyze which performs better
Permission flag	Long-lived	"Premium tier sees this; free tier doesn't"
Operational kill switch	Permanent	"Disable the recommendations service if it misbehaves" — pairs with graceful degradation
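To make the kill-switch row concrete, here is a minimal sketch of a switch paired with graceful degradation. The names killSwitches, POPULAR_ITEMS, and getRecommendations are hypothetical and chosen for illustration; the in-memory Map stands in for whatever local cache a real flag client keeps.

```javascript
// Hypothetical in-memory switch state; in practice the flag client's
// local cache would hold this and be updated by the flag service.
const killSwitches = new Map([["recommendations", true]]);

// Static fallback content (made-up IDs) served while the switch is off.
const POPULAR_ITEMS = ["item-1", "item-2", "item-3"];

// fetchPersonalized is passed in so the sketch stays self-contained;
// it represents the call to the recommendations service.
function getRecommendations(userId, fetchPersonalized) {
  // Only an explicit `false` turns the feature off.
  if (killSwitches.get("recommendations") === false) {
    // Degrade gracefully: skip the service entirely and serve the fallback.
    return { items: POPULAR_ITEMS, degraded: true };
  }
  return { items: fetchPersonalized(userId), degraded: false };
}
```

In this sketch a missing switch leaves the service on. Whether a kill switch should fail open or closed when its state is unknown is a per-feature decision.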

03

How a check actually evaluates

Code path:

if (flags.isEnabled("new-feed-ranker", { user_id, country, plan })) {
  return newRanker(user);
} else {
  return legacyRanker(user);
}

Inside isEnabled:

Look up the flag's rules (cached locally; refreshed every few seconds).
Apply targeting: "enabled for premium users in EU" → check the context.
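Apply percentage rollout: hash the flag key together with the user_id deterministically into a 0–99 bucket; if bucket < rollout%, return true. Including the flag key in the hash keeps the first 1% of one flag from always being the same users as the first 1% of every other flag.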
Cache the result.

Critical: the bucketing is deterministic per user. Same user always gets the same answer until rollout% increases. Otherwise users would flip in/out of the experience randomly per request.
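The evaluation steps above can be sketched as follows. This is an illustration under assumptions, not any vendor's SDK: the rule shape, the rules object, and the helpers fnv1a, bucketFor, and matchesTargeting are all hypothetical, and FNV-1a is used only because it needs no imports (a production system might prefer a stronger hash).

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units. Chosen for brevity;
// any stable hash with a reasonably uniform output would do.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force unsigned 32-bit
}

// Map (flagKey, userId) to a stable bucket in 0..99.
function bucketFor(flagKey, userId) {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

// Hypothetical rule set, standing in for the locally cached rules.
const rules = {
  "new-feed-ranker": {
    enabled: true,
    targeting: { plan: ["premium"], country: ["DE", "FR"] },
    rolloutPercent: 10,
  },
};

// Every targeted attribute must match one of its allowed values.
function matchesTargeting(targeting, context) {
  return Object.entries(targeting).every(
    ([attr, allowed]) => allowed.includes(context[attr])
  );
}

function isEnabled(flagKey, context, ruleSet = rules) {
  const rule = ruleSet[flagKey];
  if (!rule || !rule.enabled) return false; // unknown or disabled: default OFF
  if (!matchesTargeting(rule.targeting ?? {}, context)) return false;
  return bucketFor(flagKey, context.user_id) < rule.rolloutPercent;
}
```

Because the final check is bucket < rolloutPercent, raising the percentage only adds users: anyone enabled at 10% stays enabled at 50%.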

04

Operational rules that make flags safe

Default to OFF for new flags. Code that hasn't been tested in prod stays dark.
Test both branches in CI. Run your suite with the flag on and off.
Time-bound release flags. Track them in a registry; flags older than 60 days get reviewed and deleted. Otherwise you accumulate dead branches.
Monitor by flag. Each flag should expose metrics: enabled-vs-disabled performance, error rate per branch. Alerts when one branch regresses.
Make rollback instant. Flag changes propagate in < 60s. If you have to redeploy to disable a feature, your flag system is broken.
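The time-bound rule for release flags lends itself to a small automated check. The sketch below assumes a hypothetical registry format (key, type, createdAt, owner) and a hypothetical helper staleReleaseFlags; the 60-day limit comes from the rule above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const MAX_RELEASE_FLAG_AGE_DAYS = 60;

// Hypothetical registry entries; a real registry might live in the flag
// platform or in a checked-in config file.
const registry = [
  { key: "new-feed-ranker", type: "release", createdAt: "2024-01-10", owner: "feed-team" },
  { key: "checkout-v2", type: "release", createdAt: "2024-03-15", owner: "payments" },
  { key: "recommendations-kill", type: "kill-switch", createdAt: "2022-06-01", owner: "recs" },
];

// Return release flags older than the limit, annotated with their age.
// Long-lived types (permission flags, kill switches) are skipped on purpose.
function staleReleaseFlags(entries, now, maxAgeDays = MAX_RELEASE_FLAG_AGE_DAYS) {
  return entries
    .filter((flag) => flag.type === "release")
    .map((flag) => ({
      ...flag,
      ageDays: Math.floor((now.getTime() - new Date(flag.createdAt).getTime()) / DAY_MS),
    }))
    .filter((flag) => flag.ageDays > maxAgeDays);
}

// Example: a CI job could fail, or open a ticket for the owner, for each entry.
for (const flag of staleReleaseFlags(registry, new Date("2024-04-01T00:00:00Z"))) {
  console.log(`${flag.key} is ${flag.ageDays} days old; owner: ${flag.owner}`);
}
```

Running this as a scheduled job and routing each result to the flag's owner is one way to keep the review from depending on anyone's memory.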

05

Deep dive — progressive delivery patterns

Beyond simple percentage rollouts, modern teams use a ladder:

Internal dogfood: flag enabled only for company employees (for example, filtered by email domain). Catches egregious bugs.
Beta cohort: opt-in users or specific accounts. Real-world feedback at low risk.
Canary deploy: the new build goes to 1 server out of N. This isolates risk at the instance level rather than the user level: if that server crashes, only its share of traffic is affected.
1% percentage rollout: the flag goes from 0% to 1%. Watch error rate, latency, and business metrics. Hold for an hour.
10% → 50% → 100%: ramp over hours or days. Each step is a checkpoint.
Cleanup: once the flag has been at 100% for a stable period, remove it and the legacy branch. Avoid permanent flag debt.

Sophisticated platforms (LaunchDarkly, Split, Optimizely) can automate the ladder: you define a rollout policy, and the platform advances rollout% based on guardrail metrics. It halts automatically if anomalies are detected.
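A guardrail-driven ladder can be sketched as a single decision function. This is not any platform's API: LADDER, GUARDRAILS, and nextRollout are hypothetical names, the thresholds reuse the figures quoted in this section, and reading "+0.1%" as an absolute increase of 0.1 percentage points in error rate is an assumption.

```javascript
// Percentage steps from the ladder described above.
const LADDER = [1, 10, 50, 100];

// Assumed guardrails: error rate may rise by at most 0.1 percentage points
// (0.001 as a fraction); P99 latency may rise by at most 10% relative.
const GUARDRAILS = { maxErrorRateDelta: 0.001, maxP99LatencyRatio: 1.10 };

// control   = metrics for users on the legacy branch
// treatment = metrics for users on the flagged branch
// Each is { errorRate: fraction between 0 and 1, p99Ms: milliseconds }.
function nextRollout(currentPercent, control, treatment, guardrails = GUARDRAILS) {
  const errorDelta = treatment.errorRate - control.errorRate;
  const latencyRatio = treatment.p99Ms / control.p99Ms;

  // Any breached guardrail reverts the flag to 0%.
  if (errorDelta > guardrails.maxErrorRateDelta ||
      latencyRatio > guardrails.maxP99LatencyRatio) {
    return { action: "revert", percent: 0 };
  }

  // Otherwise move to the next step, or report completion at the top.
  const next = LADDER.find((step) => step > currentPercent);
  if (next === undefined) return { action: "done", percent: currentPercent };
  return { action: "advance", percent: next };
}
```

A scheduler would call this after each hold period. A real system would also require a minimum sample size before trusting the comparison, which this sketch leaves out.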

Interview one-liner

"Every major change is behind a feature flag. We deploy code dark, ramp via percentage rollout with metric guardrails, and have an instant kill switch. Failed releases roll back in < 1 minute by flipping the flag — no redeploy required."

[Figure: progressive delivery ladder]

< 60 s: flag-flip propagation (LaunchDarkly)
+0.1%: error-rate threshold for auto-revert
+10%: P99 latency threshold for auto-revert
60 days: flag age before mandatory cleanup review

06

Real-world

LaunchDarkly

Managed feature-flag platform

SDKs for every language. Sub-second flag propagation. The default for teams that don't want to build it themselves.

Statsig / Optimizely

Flags + experimentation

Automatic A/B test analysis per flag. Used by companies with heavy experiment cadence.

Stripe / Netflix

Internal platforms

Built their own at scale, tightly integrated with their deploy pipelines.

Cloudflare Pages / Vercel

Edge-config flags

Flags are read at the CDN edge, so a change takes effect worldwide without a redeploy. Useful for static-site flag flips.

07

Used in problems

News feed gates new ranker rollouts via flags + A/B test. E-commerce uses flags for checkout flow experiments + kill switches per payment provider. Notification system gates new channel integrations.
