Schedule a reminder for "Thursday 9am in America/New_York" and have it arrive on the user's phone at exactly that local time — even when DST changes, the user moves timezones, or the delivery channel is flaky. The hard parts:
time-bucketed scanning that doesn't miss or duplicate reminders across millions of users,
timezone materialization that handles DST transitions correctly,
and at-least-once delivery with client-side dedup so a reminder arrives once, not zero or three times.
Google Calendar reminders, Slack scheduled messages, and medical appointment alerts all solve this.
⚡ Core: Time Buckets + Timezone + At-Least-Once · ~100M reminders/day · DST-safe · Multi-channel (push/SMS/email) · Recurring support
02
Requirements
Functional
Create a one-time reminder for a specific datetime + timezone
Create a recurring reminder (daily, weekly, custom RRULE)
Deliver via push notification, SMS, or email — user configures channel
Snooze (reschedule N minutes/hours from now)
Cancel or edit a pending reminder
Support all IANA timezones including DST transitions
Non-Functional
Delivery within ±30 seconds of the scheduled time
At-least-once delivery — never silently drop a reminder
Client-side dedup so user sees it once even if delivered twice
Scale to ~100M reminders/day = ~1200/sec avg, ~5K/sec peak
99.99% availability on the scheduling path
Recurring reminders must handle DST, leap years, "last day of month"
03
Scale Estimation
Reminders / day: ~100M (one-time + recurring instances combined)
Fire rate (peak): ~5K/sec (morning hours across top-3 timezones concentrate load)
Scan window: 1 minute (scanner wakes every 60s; processes all due in [now, now+60s])
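The figures above can be checked with simple division. The sketch below recomputes the average rate and, as a derived number not quoted in the text, the approximate size of a one-minute bucket at average and at peak load.

```python
# Back-of-envelope check of the scale figures quoted above.
REMINDERS_PER_DAY = 100_000_000
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60
PEAK_PER_SEC = 5_000            # peak fire rate quoted above

avg_per_sec = REMINDERS_PER_DAY / SECONDS_PER_DAY       # about 1,157/sec
avg_per_bucket = REMINDERS_PER_DAY / MINUTES_PER_DAY    # about 69,444 per minute bucket
peak_per_bucket = PEAK_PER_SEC * 60                     # 300,000 in a peak minute

print(round(avg_per_sec), round(avg_per_bucket), peak_per_bucket)
```

The exact average is slightly below the rounded ~1200/sec; the per-bucket sizes are what a single scanner pass has to read and dispatch.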
Create reminder. Body: {user_id, title, body, fire_at: "2026-04-17T09:00", timezone: "America/New_York", channel: "push", recurrence?: "RRULE:FREQ=DAILY"}. Server stores fire_at_utc computed from local time + IANA zone. Returns {reminder_id}.
GET /api/reminders?user_id=X&status=pending
List user's pending reminders. Paginated by fire_at_utc. Includes next fire time for recurring.
POST /api/reminders/{id}/snooze
Snooze: reschedule to now + N minutes. Creates a new one-time instance; original recurrence continues separately.
DELETE /api/reminders/{id}
Cancel pending reminder. For recurring: cancels the series; individual instance cancel via PATCH.
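As a rough illustration of the create call, here is a minimal sketch of what the server side might do with the request body: keep the local time and IANA zone as given, and derive fire_at_utc from them. The `Reminder` dataclass and `create_reminder` function are hypothetical names for illustration, not part of the API described above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from uuid import uuid4
from zoneinfo import ZoneInfo


@dataclass
class Reminder:
    reminder_id: str
    user_id: str
    title: str
    local_fire_at: datetime      # wall-clock time as the user entered it (naive)
    tz_name: str                 # IANA zone name, e.g. "America/New_York"
    fire_at_utc: datetime        # derived from the two fields above
    channel: str
    recurrence: Optional[str] = None


def create_reminder(body: dict) -> Reminder:
    """Hypothetical handler body: validate, convert local time to UTC, build the row."""
    zone = ZoneInfo(body["timezone"])                 # raises for an unknown zone name
    local = datetime.fromisoformat(body["fire_at"])   # e.g. "2026-04-17T09:00"
    if local.tzinfo is not None:
        raise ValueError("fire_at must be a local time without a UTC offset")
    fire_at_utc = local.replace(tzinfo=zone).astimezone(timezone.utc)
    return Reminder(
        reminder_id=str(uuid4()),
        user_id=body["user_id"],
        title=body["title"],
        local_fire_at=local,
        tz_name=body["timezone"],
        fire_at_utc=fire_at_utc,
        channel=body.get("channel", "push"),
        recurrence=body.get("recurrence"),
    )


reminder = create_reminder({
    "user_id": "u1", "title": "Stand-up",
    "fire_at": "2026-04-17T09:00", "timezone": "America/New_York",
    "channel": "push",
})
print(reminder.fire_at_utc.isoformat())   # 2026-04-17T13:00:00+00:00
```

Keeping both the local time and the zone name alongside the derived UTC value is what lets the UTC value be recomputed later.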
05
Architecture
Two flows: scheduling (user creates → store → index by fire_at_utc) and firing (scanner reads due reminders → dispatch to delivery channels). A recurrence expander materializes the next N instances of recurring reminders.
[Figure: Reminder Pipeline]
Request Flow
User · create reminder → Reminder svc · TZ to UTC convert → Reminder DB · fire_minute bucket → Scanner · read due bucket → Dedup (Redis) · SETNX check → Dispatcher · route to channel → Delivery · push / SMS / email
06
Deep Dive — Time Buckets + Timezone + At-Least-Once
Time-bucketed scanning. Reminders stored in Cassandra with partition key = fire_minute_utc (truncated to the minute). E.g., all reminders due at 2026-04-17T13:00 UTC are in partition 202604171300. Scanner wakes every 60 s, reads the current-minute partition, dispatches each reminder.
Why this works: partition key gives O(1) lookup of "everything due now." No full-table scan. Scanner is a single leader process (elected via distributed lock) per shard of the time space. Multiple scanners shard by minute-range to parallelize.
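The bucket idea can be sketched in a few lines. This is an in-memory stand-in, not Cassandra: a dict keyed by the truncated minute plays the role of the partitioned table, and `bucket_key`, `schedule`, and `scan` are hypothetical helper names.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_key(instant: datetime) -> str:
    """Truncate a timezone-aware instant to its UTC minute, e.g. '202604171300'."""
    return instant.astimezone(timezone.utc).strftime("%Y%m%d%H%M")


# Stand-in for the partitioned table: partition key -> reminder ids in that minute.
buckets = defaultdict(list)


def schedule(reminder_id: str, fire_at_utc: datetime) -> None:
    buckets[bucket_key(fire_at_utc)].append(reminder_id)


def scan(now_utc: datetime) -> list:
    """One scanner tick: a single keyed lookup, no scan over other minutes."""
    return list(buckets.get(bucket_key(now_utc), []))


schedule("r1", datetime(2026, 4, 17, 13, 0, 0, tzinfo=timezone.utc))
schedule("r2", datetime(2026, 4, 17, 13, 0, 45, tzinfo=timezone.utc))
schedule("r3", datetime(2026, 4, 17, 13, 1, 0, tzinfo=timezone.utc))

due = scan(datetime(2026, 4, 17, 13, 0, 10, tzinfo=timezone.utc))
print(due)   # ['r1', 'r2']
```

The lookup cost depends only on how many reminders share the minute, not on how many reminders exist overall.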
Timezone materialization. User says "9am in America/New_York." Server must compute the UTC equivalent at creation time. But: DST changes mean "9am" maps to different UTC offsets on different dates. For one-time reminders, compute once at creation. For recurring: compute the NEXT instance's UTC at recurrence-expand time, re-compute after each fire.
Critical rule: store the IANA zone name (America/New_York), NOT the UTC offset (-05:00). UTC offsets change with DST. If you stored offset, a daily 9am reminder set in January (-05:00) would fire at 10am local time in March when clocks spring forward (-04:00).
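The rule can be demonstrated with the standard library. The sketch below converts the same 9am wall-clock time in January and in mid-March 2026 (after the US spring-forward), once through the IANA zone and once through a fixed -05:00 offset. The helper names are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")
FIXED_MINUS_5 = timezone(timedelta(hours=-5))    # the "stored offset" mistake


def utc_from_zone(local: datetime) -> datetime:
    """Correct: the zone picks the right offset for the given date."""
    return local.replace(tzinfo=NY).astimezone(timezone.utc)


def utc_from_offset(local: datetime) -> datetime:
    """Buggy: the offset captured in winter is applied all year."""
    return local.replace(tzinfo=FIXED_MINUS_5).astimezone(timezone.utc)


january = datetime(2026, 1, 15, 9, 0)
march = datetime(2026, 3, 15, 9, 0)      # after clocks spring forward

print(utc_from_zone(january).hour, utc_from_offset(january).hour)   # 14 14
print(utc_from_zone(march).hour, utc_from_offset(march).hour)       # 13 14

# What the user actually sees when the stale offset is used in March:
wrong_local = utc_from_offset(march).astimezone(NY)
print(wrong_local.strftime("%H:%M"))     # 10:00
```

Both approaches agree in January; only the zone-based conversion keeps the reminder at 9am once the offset changes.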
Reminder Fire Sequence (Mermaid)
sequenceDiagram
    participant SC as Scanner
    participant DB as Reminder DB
    participant DD as Dedup (Redis)
    participant D as Dispatcher
    participant CH as Push / SMS / Email
    participant RE as Recurrence expander
    SC->>DB: read partition 202604171300
    DB-->>SC: [reminder_1, reminder_2, ...]
    loop each reminder
        SC->>DD: SETNX reminder_id (dedup)
        alt new (not seen)
            DD-->>SC: 1 (acquired)
            SC->>D: dispatch reminder
            D->>CH: deliver via configured channel
            CH-->>D: ack / fail
            alt recurring
                D->>RE: compute next instance
                RE->>DB: insert next fire_minute_utc partition
            end
            D->>DB: mark delivered
        else already processed
            DD-->>SC: 0 (skip)
        end
    end
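At-least-once delivery. Scanner reads due reminders → dispatches → marks delivered. If the scanner crashes after dispatch but before marking, the next scanner run sees the same reminder and re-dispatches it. Hence at-least-once. The dedup store (Redis SETNX with TTL) prevents most duplicates; client-side dedup (by reminder_id) catches the rest. Caveat: the SETNX check runs before dispatch, so a crash between the two leaves the key set with nothing sent. The key alone is therefore not proof of delivery; release it when a dispatch fails so a retry can claim the reminder again.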
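The crash-and-retry behaviour described above can be simulated without Redis. In this sketch `FakeDedupStore` is an in-memory stand-in that assumes the semantics of a set-if-absent key with a TTL; `process_bucket` and its `crash_before_mark` parameter are hypothetical and exist only to reproduce the failure.

```python
import time


class FakeDedupStore:
    """In-memory stand-in for a set-if-absent key with a TTL (assumed semantics)."""

    def __init__(self):
        self._expiry = {}

    def set_if_absent(self, key, ttl_s, now=None):
        now = time.monotonic() if now is None else now
        if self._expiry.get(key, float("-inf")) > now:
            return False                   # key still live: already claimed
        self._expiry[key] = now + ttl_s
        return True

    def release(self, key):
        self._expiry.pop(key, None)


def process_bucket(reminder_ids, dedup, send, delivered, crash_before_mark=None):
    """One scanner pass over a minute bucket: claim, send, then mark delivered."""
    for rid in reminder_ids:
        if rid in delivered:
            continue                       # marked by an earlier pass
        key = f"fired:{rid}"
        if not dedup.set_if_absent(key, ttl_s=600):
            continue                       # claimed by an earlier pass: skip
        try:
            send(rid)
        except Exception:
            dedup.release(key)             # nothing was sent; let a retry claim it
            raise
        if rid == crash_before_mark:
            raise RuntimeError("simulated crash after send, before mark")
        delivered.add(rid)


sent, delivered, dedup = [], set(), FakeDedupStore()
bucket = ["r1", "r2", "r3"]

try:
    process_bucket(bucket, dedup, sent.append, delivered, crash_before_mark="r2")
except RuntimeError:
    pass                                   # the scanner died mid-bucket

process_bucket(bucket, dedup, sent.append, delivered)   # next run re-reads the bucket
print(sent)   # ['r1', 'r2', 'r3']
```

Without the dedup claim, the second pass would send r2 again because it was never marked delivered. If the dedup key has expired or been lost by then, the duplicate goes out and the client-side check is the last line of defence.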
Recurring expansion. After a recurring reminder fires, the recurrence expander computes the next instance from the RRULE + IANA timezone, converts to UTC, and inserts into the appropriate time-bucket partition. Horizon: expand only 1–2 instances ahead (not "every Monday forever").
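For the simplest case, a daily rule, expanding one instance ahead might look like the sketch below: advance the wall-clock time by one calendar day, then convert to UTC through the IANA zone. `next_daily_fire_utc` is a hypothetical helper; a full implementation would evaluate the RRULE with a library rather than hard-code the daily step.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_fire_utc(last_local: datetime, tz_name: str) -> datetime:
    """Step one calendar day in local wall-clock time, then convert to UTC."""
    next_local = last_local + timedelta(days=1)      # naive math keeps 09:00 at 09:00
    return next_local.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


# Last fire was 9am on the day before the 2026 US spring-forward.
last_local = datetime(2026, 3, 7, 9, 0)
last_utc = last_local.replace(tzinfo=ZoneInfo("America/New_York")).astimezone(timezone.utc)
next_utc = next_daily_fire_utc(last_local, "America/New_York")

print(last_utc.isoformat())    # 2026-03-07T14:00:00+00:00
print(next_utc.isoformat())    # 2026-03-08T13:00:00+00:00
print(next_utc - last_utc)     # 23:00:00
```

The two UTC instants are 23 hours apart, which is why adding 24 hours to the previous fire_at_utc would drift the reminder to 10am local time.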
Interview answer
"Store reminders in Cassandra partitioned by fire_minute_utc. Scanner (leader-elected per shard) wakes every 60 s, reads the current-minute partition, dedup-checks via Redis SETNX, and dispatches to the configured channel (push/SMS/email). At-least-once by design — scanner re-processes unacked reminders. Client dedup by reminder_id. Timezones stored as IANA zone names, not offsets; UTC conversion happens at creation + recurrence expansion. Recurring reminders expand one instance ahead after each fire."
⚠
Anti-patterns
🚫
Store timezone as UTC offset (-05:00) instead of IANA name (America/New_York)
DST changes the offset twice a year. A daily 9am reminder created in winter fires at 10am local time in summer.
✓ Better: Store IANA zone + local time. Convert to UTC at computation time using a tz library.
🚫
Full-table scan every minute looking for due reminders
At 100M reminders, scanning the whole table every 60s is ~1.7M rows/sec just for the scan. Unscalable.
✓ Better: Partition by fire_minute_utc. Scanner reads only the one partition that's due now.
🚫
Expand all future instances of "every Monday forever" at creation time
Infinite storage. And if the user edits the recurrence, you have to delete + recreate all future instances.
✓ Better: Expand only the next 1–2 instances. After each fires, compute + insert the next one.
07
Tradeoffs & Design Choices
Polling (scanner every N sec) vs timer-wheel / DelayQueue. Polling is simpler and works well for minute-level precision. Timer wheels (Kafka-style) are better for sub-second precision but harder to distribute. For reminders (30 s tolerance): minute-bucket polling wins on simplicity.
At-least-once vs exactly-once delivery. Exactly-once across distributed systems is infeasible without 2PC (slow) or app-level dedup. At-least-once + dedup is the pragmatic answer. Client-side dedup is cheap (check reminder_id before displaying).
Cassandra vs Postgres for time buckets. Cassandra: partition by fire_minute_utc gives free sharding + fast partition reads. Postgres: range query on fire_at index, simpler ops. At 100M/day, Cassandra is the better fit. At 1M/day, Postgres is fine.
Single scanner leader vs sharded scanners. Single: simpler, no coordination needed, but SPOF. Sharded: each scanner owns a range of minutes (scanner_0 handles even minutes, scanner_1 handles odd, etc.). Leader election per shard via Redis lock.
Push vs pull for delivery confirmation. Push to APNs/FCM is fire-and-forget (no delivery guarantee from the platform). SMS has delivery receipts. Email has no reliable read-receipt. Accept: delivery = "we sent it"; display confirmation is the client's problem.
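The even/odd split mentioned under sharded scanners generalizes to a modulo over the minute of the day. A minimal sketch, assuming bucket keys of the form YYYYMMDDHHMM; `owner_shard` is a hypothetical helper name.

```python
def owner_shard(bucket: str, num_shards: int) -> int:
    """Map a 'YYYYMMDDHHMM' bucket key to the scanner shard that owns it."""
    hour, minute = int(bucket[8:10]), int(bucket[10:12])
    return (hour * 60 + minute) % num_shards


# With two shards this reproduces the even/odd split from the text.
print(owner_shard("202604171300", 2))   # 0  (even minute -> scanner_0)
print(owner_shard("202604171301", 2))   # 1  (odd minute  -> scanner_1)
```

Each scanner then only needs to hold the lock for its own shard number and can ignore buckets that map elsewhere.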
08
Failure Modes
⏰
Scanner leader dies mid-scan
Scanner crashes after dispatching 50 of 200 reminders in a minute bucket. 150 not fired.
→ Mitigation: leader lock has TTL (e.g., 90 s). On expiry, another scanner takes over and re-reads the partition. Already-dispatched reminders caught by dedup (Redis SETNX). 150 un-dispatched fire on retry.
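The takeover behaviour of a TTL-based leader lock can be sketched in memory. `LeaseLock` below is a stand-in for a lock key with an expiry, not a real distributed lock, and time is passed in explicitly so the sequence is easy to follow.

```python
class LeaseLock:
    """In-memory stand-in for a lock key that expires after ttl_s seconds."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, who: str, now: float) -> bool:
        expired = now >= self.expires_at
        if self.holder is None or expired or self.holder == who:
            # Free, expired, or a renewal by the current holder.
            self.holder, self.expires_at = who, now + self.ttl_s
            return True
        return False


lock = LeaseLock(ttl_s=90)
a_at_0 = lock.try_acquire("scanner-a", now=0)     # True: becomes leader
b_at_30 = lock.try_acquire("scanner-b", now=30)   # False: lease still held
# scanner-a crashes and stops renewing...
b_at_91 = lock.try_acquire("scanner-b", now=91)   # True: lease expired, takeover
print(a_at_0, b_at_30, b_at_91)                   # True False True
```

After the takeover, the new leader re-reads the partition, which is where the per-reminder dedup check matters.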
🌍
IANA timezone database update
A country changes its DST rules (happens ~yearly somewhere). Existing reminders computed with old rules fire at wrong local time.
→ Mitigation: store local time + IANA zone; recompute UTC on each recurrence expansion. Batch re-index job when tz database updates: scan all reminders in affected zone, recompute fire_at_utc.
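A re-index pass along these lines might look like the following sketch. Rows are plain dicts here, and the "stale" row simply carries a UTC value that no longer matches the current rules, standing in for one computed before a tz database update. `reindex_zone` is a hypothetical name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local: datetime, tz_name: str) -> datetime:
    return local.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def reindex_zone(rows: list, tz_name: str) -> list:
    """Recompute fire_at_utc for one zone; return (id, old_utc, new_utc) for moved rows."""
    moved = []
    for row in rows:
        if row["tz"] != tz_name:
            continue
        fresh = to_utc(row["local"], row["tz"])
        if fresh != row["fire_at_utc"]:
            moved.append((row["id"], row["fire_at_utc"], fresh))
            row["fire_at_utc"] = fresh    # caller would also move the row between buckets
    return moved


rows = [
    {"id": "ok", "tz": "America/New_York", "local": datetime(2026, 7, 1, 9, 0),
     "fire_at_utc": datetime(2026, 7, 1, 13, 0, tzinfo=timezone.utc)},
    {"id": "stale", "tz": "America/New_York", "local": datetime(2026, 7, 1, 9, 0),
     "fire_at_utc": datetime(2026, 7, 1, 14, 0, tzinfo=timezone.utc)},
    {"id": "other", "tz": "Europe/Paris", "local": datetime(2026, 7, 1, 9, 0),
     "fire_at_utc": datetime(2026, 7, 1, 7, 0, tzinfo=timezone.utc)},
]

moved = reindex_zone(rows, "America/New_York")
print([m[0] for m in moved])   # ['stale']
```

This only works because the local time and zone name are stored; with just a UTC timestamp there would be nothing to recompute from.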
📱
Push notification not delivered (phone offline)
APNs/FCM queue the push; phone was off for 3 hours; by then the reminder is irrelevant.
→ Mitigation: set TTL on push (e.g., 1 hour). Fallback channel: if push not delivered within 5 min, send SMS. Channel escalation ladder: push → SMS → email.
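The escalation ladder can be sketched as an ordered list of channels tried in turn. This is a simplification: each sender here returns a boolean immediately, whereas the mitigation above waits up to 5 minutes for a push confirmation before moving on. All names are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Sender = Callable[[str], bool]    # returns True when the channel confirms delivery


def deliver_with_escalation(reminder_id: str,
                            ladder: Sequence[Tuple[str, Sender]]) -> Optional[str]:
    """Try each channel in order; return the first one that confirms, else None."""
    for name, send in ladder:
        try:
            if send(reminder_id):
                return name
        except Exception:
            pass                  # treat a channel error like a missed confirmation
    return None


attempts = []


def push(rid: str) -> bool:
    attempts.append("push")
    return False                  # phone offline: no confirmation


def sms(rid: str) -> bool:
    attempts.append("sms")
    return True


def email(rid: str) -> bool:
    attempts.append("email")
    return True


used = deliver_with_escalation("r1", [("push", push), ("sms", sms), ("email", email)])
print(used, attempts)             # sms ['push', 'sms']
```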
🔁
Duplicate delivery on scanner restart
Scanner re-reads the same minute bucket after crash recovery. User gets the reminder twice.
→ Mitigation: Redis SETNX per reminder_id with 10-min TTL. Client-side dedup: reminder_id already displayed → skip. Belt + suspenders.
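The client-side half of this can be very small. One assumption in the sketch below goes beyond the text: the dedup key pairs reminder_id with the instance's fire time, so that later instances of a recurring reminder sharing the same id are not suppressed. `ReminderInbox` is a hypothetical class.

```python
class ReminderInbox:
    """Hypothetical client-side inbox that displays each reminder instance once."""

    def __init__(self):
        self._seen = set()
        self.displayed = []

    def on_receive(self, reminder_id: str, fire_at_utc: str, title: str) -> bool:
        key = (reminder_id, fire_at_utc)   # assumption: id + fire time identifies an instance
        if key in self._seen:
            return False                   # duplicate delivery: skip silently
        self._seen.add(key)
        self.displayed.append(title)
        return True


inbox = ReminderInbox()
first = inbox.on_receive("r1", "2026-04-17T13:00Z", "Stand-up")
duplicate = inbox.on_receive("r1", "2026-04-17T13:00Z", "Stand-up")
next_day = inbox.on_receive("r1", "2026-04-18T13:00Z", "Stand-up")
print(first, duplicate, next_day)          # True False True
```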
🗓️
"Last day of month" recurring reminder
User sets "remind me on the last day of every month." Feb has 28/29, Apr has 30, May has 31. RRULE BYMONTHDAY=-1 handles this — but naive implementations break.
→ Mitigation: use a proper RRULE library (e.g., python-dateutil, rrule.js) that handles BYMONTHDAY=-1 correctly. Never hand-roll date math.
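As a quick check of the library route, the sketch below uses python-dateutil (a third-party package, assumed to be installed) to expand a monthly rule with `bymonthday=-1` across the start of a leap year.

```python
from datetime import datetime

from dateutil.rrule import MONTHLY, rrule

# Monthly on the last day of the month, at 09:00, starting from January 2028.
rule = rrule(MONTHLY, bymonthday=-1, dtstart=datetime(2028, 1, 1, 9, 0), count=4)
last_days = [d.date().isoformat() for d in rule]

print(last_days)   # ['2028-01-31', '2028-02-29', '2028-03-31', '2028-04-30']
```

The results are naive local datetimes; each one would still be converted to UTC through the reminder's IANA zone before being placed in a bucket.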
09
Interview Tips
Lead with time-bucketed partitioning. "Cassandra partitioned by fire_minute_utc. Scanner reads one partition per tick." This is the insight.
IANA zone, not UTC offset. Say it explicitly: "we store America/New_York, not -05:00." This is the DST answer and shows you've hit this bug.
At-least-once + dedup is the pattern. Don't propose exactly-once. Acknowledge the duplicate risk; show the mitigation (Redis SETNX + client dedup).
Recurring = expand one ahead. Never pre-generate all future instances. Expand after each fire.
Channel escalation. Push fails silently; SMS is more reliable; email is least timely. Escalation ladder shows product thinking.