06
Deep Dive — Fan-Out, Push vs Pull & Delivery Guarantees
Fan-Out: Staged Bulk Sends
The naive approach — enqueuing 50M messages at once — floods queues, blows memory during segment resolution, and causes a downstream provider stampede. The production approach uses three stages:
Stage 1 — Job Creation (instant): Write a single bulk_job row to the database, return 202.
Stage 2 — Segment Resolution (streaming): Fan-out service streams users in 5K batches via cursor, pre-filters by preferences (eliminates ~20% before queuing), throttles enqueue at caller-specified rate, and checkpoints progress for crash recovery.
Stage 3 — Delivery: Individual notifications flow through the normal pipeline at a steady, manageable rate.
Push vs Pull for In-App
Pull: User opens feed → query Cassandra (partition key = user_id, cluster key = created_at DESC). Single-partition read, ~2ms. Redis caches page 1 (last 50 notifications) for ~90% cache hit rate — 60× cheaper than storing the full feed in Redis RAM.
Push: Real-time badge updates via WebSocket Gateway. Connection registry in Redis maps user_id → gateway instance. On new notification, publish to the specific gateway holding the user's connection. At 10M concurrent connections, ~25 GB RAM per gateway instance across 20 pods.
Delivery Guarantees: At-Least-Once + Idempotency
Exactly-once across a network boundary with external providers is impossible (Two Generals' Problem). We always retry, with idempotency at 4 layers:
sequenceDiagram
participant PS as Payments Service
participant API as Ingestion API
participant RD as Redis (Dedup)
participant KF as Kafka Critical
participant PW as Processing Worker
participant PU as Push Worker
participant FCM as FCM (Google)
participant DT as Delivery Tracker
participant SMS as SMS Worker
PS->>API: POST /v1/notifications (OTP)
API->>RD: Check idempotency key
RD-->>API: Not seen ✓
API->>KF: Enqueue to notifications.critical
API-->>PS: 202 Accepted
KF->>PW: Consume message
PW->>RD: Check processed set (SETNX)
RD-->>PW: New ✓
PW->>PW: Resolve template + check prefs
PW->>KF: Enqueue to delivery.push
KF->>PU: Consume push message
PU->>RD: Check delivery log
RD-->>PU: Not sent ✓
PU->>FCM: Send push notification
FCM-->>PU: 200 OK
PU->>DT: Status → "sent"
Note over DT: Start 30s fallback timer
FCM-->>DT: Delivery receipt
DT->>DT: Status → "delivered"
DT->>DT: Cancel SMS fallback ✓
DT->>PS: Webhook: OTP delivered via push
Note over DT,SMS: If push fails or times out after 30s:
DT->>SMS: Trigger SMS fallback
SMS->>SMS: Deliver via Twilio
Layer 1 — Ingestion
Redis SETNX on idempotency key (24h TTL). Duplicate requests return same notification_id without re-queuing.
Layer 2 — Processing
Atomic SETNX on processed set. Kafka consumer commits offset after processing, not before.
Layer 3 — Delivery
Check delivery log before calling provider. FCM collapse_key reduces visible duplicates on device.
Layer 4 — Client
Client maintains local set of recent notification_ids (~200). Last line of defense against server-side dedup failures.
Retry Strategy
Exponential backoff with retry Kafka topics: delivery.push.retry.1 (2s), .retry.2 (8s), .retry.3 (32s), .dlq (dead letter). Critical: 6 retries (~10 min window). Low: 2 retries (~10s), then silent drop.