Sending a push notification to 10 million users sounds simple until you confront the fact that you are not the one delivering it. Apple is. Google is. The browser vendor is. Your server hands the message to APNs (Apple), FCM (Google), or a Web Push service (browser vendor), and they deliver it to the device. Each has different rate limits, payload sizes, delivery semantics, and error modes. Get this wrong and you can send 1M push notifications and have only 100k actually delivered.
02
The three platforms
| Platform | Owner | Token type | Max payload | Rate limits |
|---|---|---|---|---|
| APNs (iOS) | Apple | Device token (per app install) | 4 KB | Per-connection; ~1000s/sec on HTTP/2 streams |
| FCM (Android, but also iOS pass-through) | Google | FCM token | 4 KB | Per-project; high but throttles on errors |
| Web Push | Browser vendors (VAPID standard) | Browser-managed subscription | 4 KB | Per-endpoint |
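Because every platform in the table caps payloads at about 4 KB, a sender can reject oversized messages before spending a network call. The following is a minimal sketch, assuming the limit is measured on the UTF-8 encoded JSON body. The names `MAX_PAYLOAD_BYTES`, `payload_size`, and `fits_platform_limit` are illustrative and not part of any SDK.

```python
import json

# Limits taken from the table above; treat them as upper bounds.
# Assumption: the limit applies to the UTF-8 encoded JSON body. For Web Push,
# encryption overhead reduces the usable plaintext size further.
MAX_PAYLOAD_BYTES = {"apns": 4096, "fcm": 4096, "webpush": 4096}


def payload_size(payload: dict) -> int:
    """Return the size in bytes of the compact JSON encoding."""
    encoded = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def fits_platform_limit(platform: str, payload: dict) -> bool:
    """True if the encoded payload is within the platform's limit."""
    return payload_size(payload) <= MAX_PAYLOAD_BYTES[platform]


small = {"aps": {"alert": {"title": "Order shipped", "body": "Arriving Friday"}}}
large = {"aps": {"alert": {"body": "x" * 5000}}}
print(fits_platform_limit("apns", small))
print(fits_platform_limit("apns", large))
```

Measuring bytes rather than characters matters for localized text, where a single character can take several bytes.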
03
The end-to-end flow
1. App registers for push. The user taps "allow notifications," and the OS or browser generates a token.
2. App sends the token to your server. You store it indexed by user_id.
3. Trigger: something happens that should notify the user (new message, order shipped, etc.).
4. Your server constructs the payload, looks up the user's tokens, and sends to APNs / FCM / Web Push.
5. The platform delivers to the device. It may queue the message if the device is offline, bounded by the TTL you set: FCM caps the TTL at 28 days, while APNs honors the apns-expiration header and keeps only the most recent notification per app for an offline device.
6. Device receives + displays. User taps → opens your app to the relevant screen.
7. Tokens expire or become invalid. The platform tells you in the response. You must clean up dead tokens or risk being throttled.
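The token lifecycle in the flow above (store by user_id, then prune when the platform reports a dead token) can be sketched as follows. This is an in-memory stand-in, not a production store. `TokenRegistry` and `DEAD_TOKEN_STATUSES` are hypothetical names, and the status codes listed are assumptions that should be checked against each platform's current documentation.

```python
from collections import defaultdict

# Assumption: HTTP statuses that mean "this token is permanently dead".
# APNs answers 410 for unregistered tokens, FCM's HTTP v1 API answers 404
# (UNREGISTERED), and Web Push services answer 404 or 410 for expired
# subscriptions. Other error bodies may also indicate a bad token.
DEAD_TOKEN_STATUSES = {
    "apns": {410},
    "fcm": {404},
    "webpush": {404, 410},
}


class TokenRegistry:
    """In-memory stand-in for a token table indexed by user_id."""

    def __init__(self):
        # user_id -> set of (platform, token) pairs
        self._tokens = defaultdict(set)

    def register(self, user_id, platform, token):
        """Record a token the app reported after the user allowed push."""
        self._tokens[user_id].add((platform, token))

    def tokens_for(self, user_id):
        """Return the user's (platform, token) pairs in a stable order."""
        return sorted(self._tokens.get(user_id, ()))

    def handle_response(self, user_id, platform, token, status):
        """Drop the token if the platform reported it dead; True if removed."""
        if status in DEAD_TOKEN_STATUSES[platform]:
            self._tokens[user_id].discard((platform, token))
            return True
        return False
```

A worker would call `handle_response` with the status of every send, so that cleanup happens as a side effect of normal traffic rather than as a separate batch job.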
04
Deep dive — at scale, you need a fan-out tier
"User created post → notify their 10M followers" doesn't fit one HTTP call. Production architecture:
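Topic-based fan-out. FCM supports "topics": devices subscribe, you send one message, and the platform handles fan-out for you. APNs has no general-purpose equivalent (its broadcast channels are limited to Live Activities, and Critical Alerts are an unrelated entitlement for bypassing Do Not Disturb), so topic sends to iOS typically go through FCM. Cheaper but less flexible.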
Server-side fan-out via queues. For per-recipient personalization (different content per user), shard the recipient list into batches, push batches to a queue, workers pop + send.
Token validity batching. Group sends by how recently each token was confirmed valid, so that fresh tokens go first. Dead tokens fail fast, but every send to one still uses your quota, so cull them aggressively.
HTTP/2 connection pooling. APNs uses HTTP/2 — one persistent connection per worker can stream thousands of notifications/sec. Reusing connections is critical.
Per-platform workers. Separate worker pools for APNs, FCM, Web Push — different rate limits, different error handling, different retry policies.
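To get a feel for why connection reuse and pool sizing matter, a back-of-the-envelope calculation helps. The sketch below assumes a steady per-connection send rate. The figure of 2,000 sends/sec per connection is a hypothetical number chosen for illustration, not a documented platform limit.

```python
import math


def drain_time_seconds(recipients, connections, sends_per_sec_per_conn):
    """Time to push `recipients` notifications at a steady aggregate rate."""
    aggregate_rate = connections * sends_per_sec_per_conn
    return recipients / aggregate_rate


def connections_needed(recipients, sends_per_sec_per_conn, deadline_seconds):
    """Smallest connection count that finishes within the deadline."""
    required_rate = recipients / deadline_seconds
    return math.ceil(required_rate / sends_per_sec_per_conn)


# Hypothetical figures: 10M recipients, 2,000 sends/sec per connection.
print(drain_time_seconds(10_000_000, connections=50, sends_per_sec_per_conn=2_000))
print(connections_needed(10_000_000, sends_per_sec_per_conn=2_000, deadline_seconds=60))
```

Under these assumed numbers, 50 connections drain the list in 100 seconds, and finishing within one minute takes 84 connections.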
flowchart LR
T[Trigger event] --> R[Recipient resolver]
R -->|millions of user_ids| K[Kafka topic]
K --> A[APNs workers HTTP/2 pool]
K --> F[FCM workers HTTP/2 pool]
K --> W[Web Push workers]
A --> AC[(APNs)]
F --> FC[(FCM)]
W --> WC[(Browser endpoint)]
AC --> iOS[iPhone]
FC --> Android[Android]
WC --> Browser[Browser]
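The queue-based fan-out in the diagram can be sketched in a few lines. This is a simplified illustration: `queue.Queue` stands in for a Kafka topic, and the helper names `batched`, `fan_out`, and `drain` are hypothetical. A real recipient resolver would stream recipients rather than group them all in memory.

```python
import queue
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def fan_out(recipients, queues, batch_size=1000):
    """Route (platform, token) pairs into per-platform queues in batches.

    `queues` maps platform name -> queue.Queue (a stand-in for a Kafka
    topic or partition). Returns the number of batches enqueued.
    """
    by_platform = {}
    for platform, token in recipients:
        by_platform.setdefault(platform, []).append(token)
    count = 0
    for platform, tokens in by_platform.items():
        for chunk in batched(tokens, batch_size):
            queues[platform].put(chunk)
            count += 1
    return count


def drain(q, send):
    """Worker loop: pop batches until the queue is empty, sending each token."""
    sent = 0
    while True:
        try:
            chunk = q.get_nowait()
        except queue.Empty:
            return sent
        for token in chunk:
            send(token)
            sent += 1
```

Keeping one queue per platform lets each worker pool apply its own rate limits and retry policy without blocking the others.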
Interview answer
"Notifications fan out via Kafka. Per-platform worker pools (APNs, FCM, Web Push) drain the queue and push over persistent HTTP/2 connections. Token-invalid responses go back to a cleanup pipeline. Use collapse keys for idempotency. At 10M-recipient scale, the bottleneck is HTTP/2 connection throughput — saturate connections, not individual streams."
05
Common pitfalls
Treating push as guaranteed delivery. It's not. Phone offline, OS killed your app, user disabled notifications — the notification just doesn't arrive. Don't use push as the primary delivery channel for anything mandatory.
Token cleanup neglect. Send to dead tokens for months → APNs throttles you. Process error responses + remove invalid tokens promptly.
Cross-platform code paths. APNs and FCM payload shapes are different. Build a per-platform serializer; don't try to make one schema work everywhere.
Notification spam. User opens app once → 50 queued notifications. Coalesce server-side; respect "Do Not Disturb"; let users manage frequency.
Localization. User's preferred language might not match server default. Store language preference + render localized messages.
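The per-platform serializer mentioned in the pitfalls above might look like the following. This is a sketch under stated assumptions: `Notification`, `to_apns`, and `to_fcm` are hypothetical names for an internal model and its converters. The header and field names (`apns-topic`, `apns-push-type`, `apns-expiration`, `apns-collapse-id`, and the `android.ttl` and `android.collapse_key` fields of an FCM HTTP v1 message) follow the public APNs and FCM formats, but should be verified against current documentation before use.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    """Platform-neutral notification (hypothetical internal model)."""
    title: str
    body: str
    collapse_key: str = ""
    ttl_seconds: int = 3600


def to_apns(n: Notification, bundle_id: str, now: int) -> dict:
    """Build APNs request headers and JSON body.

    `now` is the current UNIX time in seconds; apns-expiration is absolute.
    """
    headers = {
        "apns-topic": bundle_id,
        "apns-push-type": "alert",
        "apns-expiration": str(now + n.ttl_seconds),
    }
    if n.collapse_key:
        headers["apns-collapse-id"] = n.collapse_key
    body = {"aps": {"alert": {"title": n.title, "body": n.body}}}
    return {"headers": headers, "body": body}


def to_fcm(n: Notification, token: str) -> dict:
    """Build an FCM HTTP v1 message body for an Android device.

    FCM expresses TTL as a relative duration string such as "3600s".
    """
    android = {"ttl": f"{n.ttl_seconds}s"}
    if n.collapse_key:
        android["collapse_key"] = n.collapse_key
    return {
        "message": {
            "token": token,
            "notification": {"title": n.title, "body": n.body},
            "android": android,
        }
    }
```

Note how the same TTL becomes an absolute timestamp for APNs and a relative duration for FCM, which is one reason a single shared schema tends to break down.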
06
Real-world
Apple Push Notification service (APNs)
iOS reference
HTTP/2 protocol. Token + payload + topic + priority. Strict 4 KB payload; for richer content, a notification service extension can download more data on the device.
Firebase Cloud Messaging (FCM)
Cross-platform
Originally Android-only; now serves iOS too (via APNs under the hood). Topic-based fan-out built in. SDK simplifies token mgmt.
Web Push Protocol
Browser standard
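VAPID-signed messages. Each browser vendor provides a push endpoint per subscription. Messages are handled by a service worker, so they can arrive even when the page is closed; whether they arrive when the browser itself is closed depends on the browser and OS.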
OneSignal / Pusher / Airship
Push-as-a-service
Wraps APNs + FCM + Web Push behind one API. Handles fan-out, segmentation, A/B testing. Used by teams that don't want to build the infra.
07
Used in problems
Notification system is fundamentally about doing this well at scale. WhatsApp uses native push for offline message delivery. News feed uses push for engagement re-activation.