Google Calendar

A calendar and scheduling platform handling recurring events, calendar sharing, and free/busy queries for 1.5 billion users across every timezone. The hard parts: an RRULE recurrence engine that expands "every Monday forever" into concrete instances on read without infinite storage, a free/busy query system that resolves availability across N calendars in milliseconds, and a sync layer that keeps events consistent across web, mobile, and CalDAV clients even after offline edits. iCal defined the spec; Google Calendar made it work at planet scale.

Core: RRULE Expansion + Free/Busy · 1.5B users · ~500M events/day · ~10K free/busy QPS · Cross-timezone complexity
02

Requirements

Functional
  • Create, update, delete events with RRULE recurrence rules (daily, weekly, monthly, custom)
  • Query events within a time window (day/week/month view) with expanded recurring instances
  • Share calendars with ACL permissions (view only, edit, manage)
  • Query free/busy across multiple calendars for scheduling
  • Set reminders and receive notifications (push, email, SMS)
  • Handle timezones correctly (IANA zones, DST transitions)
  • Sync across devices (web, mobile, CalDAV clients)
Non-Functional
  • Event read < 100 ms p99 for week-view queries
  • Free/busy resolution < 200 ms across 10 calendars
  • Offline edits sync correctly with conflict resolution
  • 99.99% availability (calendar is critical infrastructure)
  • Scale to 1.5B users, ~500M events created/day
  • Strong consistency on event writes; eventual on notifications
03

Scale Estimation

1.5B users, ~800M DAU. The average active user creates ~0.6 events/day (including recurring expansions) -> ~500M events created/day -> ~6K writes/sec. Reads dominate: ~20 calendar views/user/day x 800M = 16B reads/day -> ~185K reads/sec.

Free/busy queries are bursty during business hours: ~10K QPS peak. Each free/busy query fans out across 3-10 calendars, so effective DB load is 30K-100K calendar lookups/sec for free/busy alone.
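The arithmetic behind these figures can be checked with a few lines of Python. This is a back-of-envelope sketch: the variable names are illustrative, and every input is one of the estimates stated above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

dau = 800_000_000               # daily active users (estimate from the text)
events_per_user_per_day = 0.6   # average event creates per active user
views_per_user_per_day = 20     # calendar views per active user

# 800M x 0.6 = 480M, which the text rounds up to ~500M events/day.
events_per_day = dau * events_per_user_per_day
writes_per_sec = 500_000_000 / SECONDS_PER_DAY
reads_per_sec = dau * views_per_user_per_day / SECONDS_PER_DAY

# Each free/busy query fans out across 3-10 calendars.
freebusy_qps = 10_000
lookups_per_sec = (freebusy_qps * 3, freebusy_qps * 10)

print(f"{events_per_day:,.0f} events/day")
print(f"{writes_per_sec:,.0f} writes/sec")
print(f"{reads_per_sec:,.0f} reads/sec")
print(f"{lookups_per_sec[0]:,} to {lookups_per_sec[1]:,} calendar lookups/sec")
```

The exact values (about 5.8K writes/sec and 185K reads/sec) round to the headline numbers used in the rest of this section.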

  • 1.5B total users
  • ~6K event writes/sec
  • ~185K calendar view reads/sec
  • ~10K free/busy QPS (peak)
  • ~500M events created/day
  • ~2 PB event metadata storage (1 KB avg x 2T total events)
  • ~50K reminder notifications/sec
  • 24 timezone-aware DST transitions/year to handle

The dominant complexity is not raw throughput but correctness: RRULE expansion must handle DST, leap years, "last Friday of month," and single-instance exceptions. Free/busy is an interval-merge problem across expanded instances. Storage is modest compared to media platforms — events are small (~1 KB each).

04

API Design

POST /v1/calendars/{calendarId}/events

{ title, start, end, timezone, rrule?, attendees?, reminders[] }. Creates an event. If rrule is provided (e.g. FREQ=WEEKLY;BYDAY=MO), stores the rule — does not materialize instances.

GET /v1/calendars/{calendarId}/events?timeMin=&timeMax=

Returns all events (single + expanded recurring instances) within the requested window. Recurrence expander generates virtual instances from stored RRULEs on the fly.

GET /v1/freebusy

{ timeMin, timeMax, items: [{ id: calendarId }...] }. Batch free/busy query across N calendars. Returns busy intervals per calendar. Backed by Redis bitmap cache.

PATCH /v1/events/{eventId}

{ originalStartTime, ...overrides }. Edits a single instance of a recurring event. Creates an exception override row keyed by (recurringEventId, originalStartTime).

POST /v1/calendars/{calendarId}/share

{ userId, role: "reader"|"writer"|"owner" }. Grants ACL permissions on a calendar. Propagates to free/busy visibility.

05

High-Level Architecture

The architecture splits into four paths from the API gateway:

  • Event path: CRUD operations on events/RRULEs in a sharded Event DB (partitioned by calendar_id).
  • Read path: Recurrence Expander generates virtual instances from stored RRULEs within the queried window, merges with exception overrides.
  • Free/busy path: Queries the Redis free/busy cache (pre-computed bitmaps per calendar per day) or falls back to expansion.
  • Sync path: CalDAV/push sync service maintains per-client sync tokens and pushes change notifications via WebSocket (web) or FCM/APNs (mobile).

Architecture diagram (Event CRUD + Read + Sync). Components: Client, API Gateway, Event Service, Event DB (sharded), Recurrence Expander, Redis free/busy cache, Notification Service, Sync Service (CalDAV), WS / FCM / APNs, Reminder Queue, ACL Store, Kafka events.

Request flow, step by step:
  1. Client: Web / Mobile / CalDAV
  2. API Gateway: Auth + rate-limit
  3. Event Service: CRUD + validation
  4. Event DB: Sharded by calendar_id
  5. Recurrence Expander: RRULE -> instances
  6. Redis Cache: Free/busy bitmaps
  7. Sync Service: CalDAV + push
06

Deep Dive — RRULE Expansion & Free/Busy

The Core Question

How do you store "every Monday at 10 AM forever" without writing infinite rows, yet answer "show me next week" in under 100 ms — including timezone-correct DST transitions and single-instance exceptions?

Step 1 — Store the rule, not the instances. A recurring event is one row: { id, calendar_id, title, dtstart, rrule: "FREQ=WEEKLY;BYDAY=MO", timezone: "America/New_York" }. No materialized instances. Storage: O(1) per recurring series regardless of how far into the future it repeats.
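As a rough illustration of the stored row, the sketch below models one recurring series as a single record and splits the rule string into its parts. The field names follow the row described above; `parse_rrule` and the sample values are hypothetical, and the helper does no validation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecurringEvent:
    """One stored row per recurring series, however long it repeats."""
    id: str
    calendar_id: str
    title: str
    dtstart: datetime  # local wall-clock start of the first instance
    rrule: str         # e.g. "FREQ=WEEKLY;BYDAY=MO"
    timezone: str      # IANA zone ID, e.g. "America/New_York"


def parse_rrule(rrule: str) -> dict[str, str]:
    """Split 'KEY=VALUE;KEY=VALUE' into a dict (hypothetical helper, no validation)."""
    return dict(part.split("=", 1) for part in rrule.split(";") if part)


event = RecurringEvent(
    id="evt_1",
    calendar_id="cal_1",
    title="Team sync",
    dtstart=datetime(2026, 1, 5, 10, 0),
    rrule="FREQ=WEEKLY;BYDAY=MO",
    timezone="America/New_York",
)
rule = parse_rrule(event.rrule)
print(rule)
```

A production system would hand the rule string to a real RRULE library rather than parse it by hand; the point here is only that the series occupies one row.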

Step 2 — Expand on read within the window. When a user opens their week view (e.g. Apr 13-19), the Recurrence Expander loads all events whose dtstart ≤ windowEnd and whose rrule could produce instances in the window. It walks the RRULE forward from dtstart using the IANA timezone, generating instances. For "every Monday," it yields Apr 13 at 10:00 EDT. Key: expansion is bounded by the query window — never "expand all."
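Window-bounded expansion can be sketched for the simplest case, a plain weekly rule. This is a minimal illustration, not a general RRULE engine: `expand_weekly` is a hypothetical helper, the datetimes are naive local wall-clock times in the event's zone, and the year 2026 is assumed so that Apr 13 falls on a Monday. Instead of walking every week since dtstart, it jumps directly to the first occurrence inside the window, a shortcut that only works for fixed-interval rules.

```python
from datetime import datetime, timedelta

WEEK = timedelta(weeks=1)


def expand_weekly(dtstart: datetime, window_start: datetime, window_end: datetime) -> list[datetime]:
    """Return weekly occurrences of dtstart inside [window_start, window_end)."""
    # Number of whole weeks to skip: ceil((window_start - dtstart) / WEEK),
    # written with floor division on the negated difference.
    weeks_to_skip = max(0, -((dtstart - window_start) // WEEK))
    occurrence = dtstart + weeks_to_skip * WEEK

    instances = []
    while occurrence < window_end:
        instances.append(occurrence)
        occurrence += WEEK
    return instances


# "Every Monday at 10:00", first held on Monday 2026-01-05.
week_view = expand_weekly(
    dtstart=datetime(2026, 1, 5, 10, 0),
    window_start=datetime(2026, 4, 13),
    window_end=datetime(2026, 4, 20),  # exclusive upper bound
)
print(week_view)
```

The work done is proportional to the number of instances in the window, not to the age of the series.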

Step 3 — Merge exceptions. Single-instance overrides (e.g., "move this Monday to Tuesday") are stored as exception rows keyed by (recurring_event_id, original_start_time). The expander generates the virtual instance, checks for an override, and either replaces or deletes it. An EXDATE marks a deleted instance.
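The merge step can be sketched as a dictionary lookup per generated instance. The shapes here are assumptions for illustration: instances are plain dicts, overrides are keyed by (recurring_event_id, original_start_time) as described above, and a value of None stands for a cancelled instance (the EXDATE case).

```python
from datetime import datetime
from typing import Optional

# Overrides keyed by (recurring_event_id, original_start_time).
# None marks a cancelled instance; a dict holds the overridden fields.
Overrides = dict[tuple[str, datetime], Optional[dict]]


def merge_exceptions(series_id: str, instances: list[dict], overrides: Overrides) -> list[dict]:
    merged = []
    for instance in instances:
        key = (series_id, instance["start"])
        if key not in overrides:
            merged.append(instance)  # no exception: keep the virtual instance
        elif overrides[key] is not None:
            merged.append({**instance, **overrides[key]})  # apply overridden fields
        # else: cancelled instance, drop it
    return merged


mondays = [
    {"start": datetime(2026, 4, 13, 10, 0), "title": "Team sync"},
    {"start": datetime(2026, 4, 20, 10, 0), "title": "Team sync"},
    {"start": datetime(2026, 4, 27, 10, 0), "title": "Team sync"},
]
overrides = {
    ("evt_1", datetime(2026, 4, 13, 10, 0)): {"start": datetime(2026, 4, 14, 10, 0)},  # moved to Tuesday
    ("evt_1", datetime(2026, 4, 20, 10, 0)): None,  # cancelled
}
result = merge_exceptions("evt_1", mondays, overrides)
print(result)
```

One caveat this sketch ignores: an override can move an instance into or out of the query window, so a real expander also has to look up overrides by their new start time.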

Step 4 — Timezone correctness. All RRULEs are expanded in the event's IANA timezone, not UTC. "Every day at 9 AM America/New_York" must produce 9 AM EST in winter and 9 AM EDT in summer. Storing a UTC offset (e.g., -05:00) would break on DST transitions. The expander uses the tz database to resolve each instance.
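The difference between an IANA zone and a fixed offset is easy to demonstrate with Python's standard `zoneinfo` module. This sketch assumes a tz database is available on the host; the dates are arbitrary examples and `local_instance` is a hypothetical helper.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def local_instance(day: date, wall_time: time, tz) -> datetime:
    """Build one instance at the given wall-clock time in the given zone."""
    return datetime.combine(day, wall_time, tzinfo=tz)


# Same rule ("every day at 9 AM New York"), resolved per instance via the tz database.
winter = local_instance(date(2026, 1, 15), time(9, 0), NEW_YORK)
summer = local_instance(date(2026, 7, 15), time(9, 0), NEW_YORK)
print(winter.isoformat())  # 2026-01-15T09:00:00-05:00
print(summer.isoformat())  # 2026-07-15T09:00:00-04:00

# Anti-pattern: the same event stored with a fixed -05:00 offset.
fixed_offset = timezone(timedelta(hours=-5))
drifted = local_instance(date(2026, 7, 15), time(9, 0), fixed_offset)
print(drifted.astimezone(NEW_YORK).isoformat())  # 2026-07-15T10:00:00-04:00
```

With the IANA zone the meeting stays at 9 AM local all year; with the fixed offset it appears at 10 AM New York time during daylight saving.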

Step 5 — Free/busy as interval query. Free/busy expands all events across the requested calendars within the window, then merges overlapping busy intervals. For performance, a Redis bitmap cache stores per-calendar busy slots at 15-minute granularity (96 bits/day). Cache is invalidated on event write. At 10K QPS, most free/busy queries hit the bitmap cache — no expansion needed.
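The interval-merge half of this step is the classic sort-and-sweep. A minimal sketch, assuming busy intervals have already been expanded into (start, end) pairs; `merge_busy` is an illustrative name.

```python
from datetime import datetime

Interval = tuple[datetime, datetime]


def merge_busy(intervals: list[Interval]) -> list[Interval]:
    """Merge overlapping or touching busy intervals into a minimal sorted list."""
    merged: list[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


day = datetime(2026, 4, 13)
busy = merge_busy([
    (day.replace(hour=13), day.replace(hour=14)),
    (day.replace(hour=9), day.replace(hour=10)),
    (day.replace(hour=9, minute=30), day.replace(hour=11)),
])
print(busy)
```

Sorting dominates the cost, so merging n intervals is O(n log n); the bitmap cache exists to avoid doing this on the hot path.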

Step 6 — Change notifications. On any event mutation, the Event Service publishes to Kafka. The Sync Service maintains a per-client syncToken (a monotonic version). Web clients receive changes via WebSocket; mobile clients via FCM/APNs push. CalDAV clients poll with their sync token. The baseline conflict policy is last-writer-wins with a sequence-number tiebreaker; concurrent offline edits to the same event need field-level merging so that one device's change is not silently dropped.
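Incremental sync with a monotonic token can be sketched as an append-only change log. This is an assumed design for illustration only: the `ChangeLog` class, its fields, and the in-memory list are hypothetical stand-ins for whatever the Sync Service actually stores.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Per-calendar change log; the latest version doubles as the sync token."""
    version: int = 0
    entries: list[tuple[int, str, str]] = field(default_factory=list)  # (version, event_id, op)

    def record(self, event_id: str, op: str) -> int:
        """Append a mutation and return the new version."""
        self.version += 1
        self.entries.append((self.version, event_id, op))
        return self.version

    def changes_since(self, sync_token: int) -> tuple[list[tuple[int, str, str]], int]:
        """Return changes newer than the client's token, plus the token to store next."""
        changes = [entry for entry in self.entries if entry[0] > sync_token]
        return changes, self.version


log = ChangeLog()
log.record("evt_1", "create")
log.record("evt_2", "create")
first_sync, token = log.changes_since(0)  # full sync for a new client

log.record("evt_1", "update")
delta, token = log.changes_since(token)   # incremental sync
print(delta, token)
```

A polling CalDAV client and a push-woken mobile client can both use the same call; push only tells the client when to ask.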

Sequence — Recurring Event Create & Week View Query (Mermaid)

sequenceDiagram
    participant C as Client
    participant API as API Gateway
    participant ES as Event Service
    participant DB as Event DB
    participant RE as Recurrence Expander
    participant R as Redis (free/busy)
    C->>API: POST /calendars/{id}/events {rrule: "FREQ=WEEKLY;BYDAY=MO"}
    API->>ES: create recurring event
    ES->>DB: INSERT event row with RRULE
    ES->>R: invalidate free/busy cache for calendar
    ES-->>C: 201 Created {eventId}
    C->>API: GET /calendars/{id}/events?timeMin=Apr13&timeMax=Apr19
    API->>ES: query week view
    ES->>DB: SELECT events WHERE calendar_id=X AND dtstart <= Apr19
    DB-->>ES: event rows (single + recurring)
    ES->>RE: expand RRULEs within [Apr13, Apr19]
    RE->>RE: walk RRULE in IANA tz, generate instances
    RE->>DB: fetch exception overrides for window
    RE-->>ES: expanded instances + merged exceptions
    ES-->>C: [{Mon Apr 13 10:00 EDT}, ...]
07

Key Design Decisions & Tradeoffs

Expand on read

Generate recurring instances at query time from stored RRULE

No storage waste — "every Monday forever" is one row. Handles infinite recurrences naturally. Compute cost on every read, but bounded by the query window (typically 7-31 days). RRULE expansion is CPU-light (~microseconds per instance).

Materialize instances

Pre-generate N months of recurring instances as concrete rows

Fast reads (plain range query, no expansion). But requires a rolling materialization job. "Every Monday forever" needs a cutoff. Editing the series means updating thousands of rows. Storage grows with recurrence frequency x lookahead window.

Shard by calendar_id

All events for a calendar co-located on one shard

Week-view query is single-shard. Free/busy for one calendar is single-shard. Sharing a calendar means one shard serves all viewers. Hot calendars (company-wide) can become hot shards, mitigated by read replicas.

Shard by user_id

All calendars for a user on one shard

Optimizes "show all my calendars" query. But shared calendars span shards — free/busy across shared calendars requires cross-shard scatter-gather. Calendar_id sharding wins for the dominant access pattern.
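Routing by calendar_id can be sketched with a stable hash. The shard count and the choice of SHA-256 with modulo placement are assumptions for illustration; the text does not say how shards are assigned.

```python
import hashlib

NUM_SHARDS = 256  # hypothetical shard count


def shard_for(calendar_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a calendar to a shard with a hash that is stable across processes."""
    digest = hashlib.sha256(calendar_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every event of one calendar lands on the same shard, so a week view
# is a single-shard query. A free/busy request fans out to one shard per calendar.
attendee_calendars = ["cal_alice", "cal_bob", "cal_room_42"]
shards_to_query = {shard_for(cal) for cal in attendee_calendars}
print(shards_to_query)
```

Plain modulo placement makes resharding expensive; consistent hashing or a lookup table would be the usual refinement.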

--

Anti-patterns

X
Materialize all future instances of "every Monday forever"

Infinite recurrence = infinite rows. Even with a 10-year cutoff, one rule generates 520 rows. Multiply by millions of recurring events and storage explodes. Editing the series requires bulk updates.

Better: Store the RRULE once; expand on read within the requested time window. One row, bounded compute.
X
Store timezone as UTC offset (-05:00) instead of IANA zone (America/New_York)

A UTC offset does not encode DST rules. "9 AM at -05:00" stays at -05:00 in summer, when New York shifts to -04:00, so the event shows up an hour off in local time for the entire daylight-saving period.

Better: Always store the IANA timezone identifier. Expand using the tz database so 9 AM remains 9 AM local through DST transitions.
X
Free/busy query scans all events linearly

At 10K QPS with 3-10 calendars each, a linear scan over all events per calendar is O(N) per calendar. Calendars with years of history become slow.

Better: Use a Redis bitmap cache at 15-minute granularity (96 bits/day). Invalidate on write. Free/busy becomes a bitwise OR across calendars — O(1) per day.
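The bitmap idea can be sketched with plain Python integers standing in for the Redis values. The helper names are hypothetical; in Redis the combining step would map onto a bitwise OR over the per-calendar keys (for example with `BITOP OR`).

```python
SLOTS_PER_DAY = 24 * 4  # 96 fifteen-minute slots


def slot_index(hour: int, minute: int = 0) -> int:
    """Index of the 15-minute slot containing hour:minute."""
    return hour * 4 + minute // 15


def busy_mask(start_slot: int, end_slot: int) -> int:
    """Bitmap with bits start_slot..end_slot-1 set (end is exclusive)."""
    return ((1 << (end_slot - start_slot)) - 1) << start_slot


def combined_busy(bitmaps: list[int]) -> int:
    """A slot is busy if any calendar is busy: bitwise OR across calendars."""
    combined = 0
    for bitmap in bitmaps:
        combined |= bitmap
    return combined


def free_slots(bitmap: int) -> list[int]:
    return [i for i in range(SLOTS_PER_DAY) if not (bitmap >> i) & 1]


alice = busy_mask(slot_index(9), slot_index(10))     # 09:00-10:00
bob = busy_mask(slot_index(9, 30), slot_index(11))   # 09:30-11:00
combined = combined_busy([alice, bob])
free = free_slots(combined)
print(len(free), "free slots;", slot_index(11), "is the first free slot after the meetings")
```

Events that do not align to 15-minute boundaries have to be rounded outward (start down, end up) so the bitmap never reports a busy slot as free.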
08

What Can Go Wrong

RRULE edge cases: leap year, DST, end-of-month
"Every month on the 31st" — what happens in February? "Every day at 2:30 AM" — what happens on DST spring-forward when 2:30 AM doesn't exist? These produce silent data bugs.
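-> Mitigation: use a battle-tested RRULE library (libical) and define explicit policies: skip nonexistent times, clamp to the last day of the month. Note that RFC 5545 itself ignores instances that fall on an invalid date or a nonexistent local time, so clamping is a product decision that should be documented. Back the policies with extensive property-based tests.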
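The end-of-month case makes the need for an explicit policy concrete. The sketch below expands "every month on day N" for one year under two policies; `monthly_on_day` and the policy names are hypothetical.

```python
import calendar
from datetime import date


def monthly_on_day(year: int, day_of_month: int, policy: str = "skip") -> list[date]:
    """Expand 'every month on day N' for one year under an explicit policy."""
    instances = []
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]  # number of days in the month
        if day_of_month <= last_day:
            instances.append(date(year, month, day_of_month))
        elif policy == "clamp":
            instances.append(date(year, month, last_day))  # fall back to the last day
        # policy == "skip": the date does not exist, so no instance this month
    return instances


skipped = monthly_on_day(2026, 31, policy="skip")
clamped = monthly_on_day(2026, 31, policy="clamp")
print(len(skipped), len(clamped))
print(clamped[1])
```

Under "skip" the series has seven instances in 2026; under "clamp" it has twelve, with February landing on the 28th (the 29th in a leap year).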
Notification storm on shared-calendar bulk edit
Admin edits a recurring event on a 10,000-person company calendar. Naive notification = 10K pushes x number of instances modified. Notification service drowns.
-> Mitigation: batch and deduplicate — one notification per user per series edit, not per instance. Rate-limit shared-calendar notifications with a coalescing window.
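Coalescing per user per series can be sketched as a de-duplication step before fan-out. The input shapes are assumptions: each change is a (series_id, instance_key) pair and subscribers are user IDs; the time-based coalescing window is left out.

```python
def coalesce_notifications(changed_instances: list[tuple[str, str]], subscribers: list[str]) -> list[tuple[str, str]]:
    """Return one (user_id, series_id) notification per user per edited series."""
    # dict.fromkeys de-duplicates series IDs while keeping first-seen order.
    series_ids = list(dict.fromkeys(series_id for series_id, _ in changed_instances))
    return [(user_id, series_id) for user_id in subscribers for series_id in series_ids]


# A series edit that touches 52 weekly instances, seen by three subscribers.
changes = [("evt_1", f"2026-W{week:02d}") for week in range(1, 53)]
notifications = coalesce_notifications(changes, ["u1", "u2", "u3"])
print(len(changes) * 3, "naive pushes vs", len(notifications), "coalesced")
```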
Timezone mismatch between creator and viewer
A creator in Tokyo sets a meeting at "3 PM", but a viewer in London sees it at 3 PM London time instead of 7 AM BST (6 AM GMT in winter). The cause is losing the creator's timezone during storage or display.
-> Mitigation: always store and transmit the event's IANA timezone. Render in the viewer's local tz but show the original tz on hover. Never silently convert.
Sync conflict — same event edited on two offline devices
User edits an event title on their phone (offline) and the location on their laptop (offline). Both come online — which version wins? Naive last-write-wins loses one edit.
-> Mitigation: field-level merge with vector clocks. If different fields changed, merge both. If same field, use sequence number + timestamp tiebreaker. Surface conflict to user if ambiguous.
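The field-level part of this mitigation can be sketched as a three-way merge against the last version both devices synced. Note the substitution: this sketch uses a shared base snapshot rather than vector clocks to decide which side changed a field, and it assumes all three versions carry the same keys. `merge_fields` is a hypothetical helper.

```python
def merge_fields(base: dict, local: dict, remote: dict) -> tuple[dict, list[str]]:
    """Three-way merge per field; returns the merged event and any conflicting fields."""
    merged = {}
    conflicts = []
    for key in base:
        b, l, r = base[key], local[key], remote[key]
        if l == r:
            merged[key] = l          # both sides agree (or neither changed)
        elif l == b:
            merged[key] = r          # only the remote side changed this field
        elif r == b:
            merged[key] = l          # only the local side changed this field
        else:
            merged[key] = r          # true conflict: keep remote, report it
            conflicts.append(key)
    return merged, conflicts


base = {"title": "Sync", "location": "Room A"}
phone = {"title": "Team sync", "location": "Room A"}   # edited the title offline
laptop = {"title": "Sync", "location": "Room B"}       # edited the location offline
merged, conflicts = merge_fields(base, phone, laptop)
print(merged, conflicts)
```

Fields reported in `conflicts` are where a sequence-number or timestamp tiebreaker, or a prompt to the user, would take over.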
Free/busy cache stale after rapid edits
User creates 20 events in quick succession. Free/busy bitmap cache lags behind — a colleague sees stale availability and double-books.
-> Mitigation: write-through cache updates (update the bitmap synchronously on each event write). For burst writes, use a short async coalescing window (~500 ms), then rebuild the affected day's bitmap once.
Hot shard on company-wide calendar
A 50,000-employee company calendar is one shard. Monday morning: everyone opens their calendar. One DB shard takes 50K concurrent reads.
-> Mitigation: read replicas for hot calendars. Detect hot shards via QPS monitoring; auto-promote replicas. Cache expanded week views in Redis with short TTL (30s).
09

Interview Tips

  1. Lead with RRULE expansion. This is THE differentiator. Explain "store the rule, expand on read" immediately — it shows you understand the core complexity rather than treating it as a simple CRUD app.
  2. Timezones are not optional. Mention IANA timezone IDs (not UTC offsets) early. DST handling is the canonical "sounds easy, is hard" problem. Naming it explicitly earns credibility.
  3. Free/busy is an interval problem. Frame it as merging intervals across expanded instances. Then optimize with bitmap caching. This shows algorithmic thinking applied to systems.
  4. Single-instance exceptions are the tricky part. "This Monday is moved to Tuesday" as an override row merged at read time. Shows you've thought beyond the happy path.
  5. Sync conflict resolution. Mention field-level merge + vector clocks. This demonstrates distributed systems depth — calendar is inherently a multi-device, occasionally-offline system.
11

Evolution

1

MVP — Single Postgres + cron reminders

One DB table for events, no recurrence support. Cron job scans for upcoming events every minute, sends email reminders. Works to ~100K users. Google Calendar launched similarly in 2006.

2

RRULE support + per-calendar sharding

Add RRULE column to events table. Build an in-process recurrence expander. Shard event DB by calendar_id. Exception overrides table added. Scales to ~50M users.

3

CalDAV sync + push notifications

CalDAV endpoint for third-party clients (Apple Calendar, Thunderbird). Sync tokens for incremental sync. FCM/APNs push replaces polling. WebSocket for web. Scales to ~500M users.

4

Free/busy service + room booking

Dedicated free/busy microservice with Redis bitmap cache. Room resource calendars with conflict detection. Scheduling assistant ("find a time" across 8 attendees). Enterprise features. Scales to ~1B users.

5

ML scheduling suggestions + Gemini integration

ML model suggests optimal meeting times based on habits, focus-time preferences, and travel time. Gemini integration for natural-language event creation ("schedule lunch with Alex next week"). Smart RSVPs and automatic rescheduling proposals.
