Exercise · Social & Feed

Twitter / News Feed

Whiteboard exercise. Try the problem cold, then reveal the rubric to self-score.

Out of 10 points · 45 min whiteboard

Prompt

Design Twitter's home timeline. Assume 500M DAU, each user following 200 accounts on average, ~6K tweets/sec globally, and a feed-load latency target of < 200 ms at p99.

Time budget: 45 min whiteboard. Draw architecture, estimate numbers, discuss tradeoffs.

Hint 1

Fan-out on write vs on read — this is the canonical tradeoff. What if a user follows a celebrity with 100M followers?

Hint 2

Hybrid: push to most users, pull for celebrities. The cutover threshold is a key design decision.
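The hybrid in this hint can be sketched in a few lines. This is a minimal in-memory illustration, not a reference design: the class name `HybridTimeline`, its methods, and the dict-based stores are all hypothetical stand-ins for the follow graph, tweet store, and feed cache. The 100K cutover is taken from the rubric and is a tunable. The sketch assumes timestamps increase per author, and it ignores accounts that cross the threshold after tweets were already pushed.

```python
import heapq
from collections import defaultdict
from itertools import islice

CELEBRITY_THRESHOLD = 100_000  # follower cutover; a tunable, not a fixed rule


class HybridTimeline:
    """Hypothetical in-memory stand-in for follow graph, tweet store, and feed cache."""

    def __init__(self, threshold=CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)  # author -> follower ids
        self.following = defaultdict(set)  # user -> followed author ids
        self.authored = defaultdict(list)  # author -> [(ts, tweet_id)], oldest first
        self.pushed = defaultdict(list)    # user -> [(ts, tweet_id)] materialized feed

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.threshold

    def post(self, author, tweet_id, ts):
        """Store the tweet; return how many fan-out writes it caused."""
        self.authored[author].append((ts, tweet_id))
        if self.is_celebrity(author):
            return 0  # pull path: followers merge this in at read time
        for user in self.followers[author]:
            self.pushed[user].append((ts, tweet_id))  # push path
        return len(self.followers[author])

    def home(self, user, limit=20):
        """Merge the pushed feed with recent tweets pulled from followed celebrities."""
        sources = [sorted(self.pushed[user], reverse=True)]
        for author in self.following[user]:
            if self.is_celebrity(author):
                sources.append(reversed(self.authored[author]))
        # Every source is newest-first, so a k-way merge yields a newest-first feed.
        merged = heapq.merge(*sources, reverse=True)
        return [tweet_id for _, tweet_id in islice(merged, limit)]
```

The design point is that write cost is bounded by the threshold, while read cost grows only with the number of celebrities a user follows, which is usually small.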

Hint 3

Pre-compute materialized feeds. Don't assemble the feed at read time by querying 200 followed accounts × several days of tweets.

Rubric

+1 BoE: ~290K feed loads/sec (implies ~50 loads per user per day at 500M DAU); 6K tweets/sec × 200 fan-out = 1.2M fan-out events/sec worst case
+2 Hybrid fan-out explicit: push-on-write for normal users (≤ 100K followers), pull-on-read for celebrities
+2 Feed storage: Redis sorted-set per user (timeline), TTL-bounded; backfill from tweet DB when empty
+1 Tweet storage: Cassandra, partition by user_id; read path: GET by tweet_id
+1 Ranker vs chronological: mention algorithmic re-ranking with recent engagement features
+1 Cache warming for users about to log in; invalidation on new tweet from followed account
+1 Image/video handled via URL reference; CDN serves; tweet stores cheap pointer only
+1 Addresses hot-key problem on celebrity's fan-out queue; backpressure / shedding plan
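The back-of-envelope item in the list above can be checked with a few lines of arithmetic. The inputs come from the prompt, except `LOADS_PER_USER_PER_DAY`, which is an assumption back-solved so that the result matches the ~290K loads/sec figure. The 100M-follower celebrity is the one from Hint 1.

```python
DAU = 500_000_000
AVG_FOLLOWS = 200
TWEETS_PER_SEC = 6_000
SECONDS_PER_DAY = 86_400
LOADS_PER_USER_PER_DAY = 50        # assumption: not stated in the prompt
CELEBRITY_FOLLOWERS = 100_000_000  # from Hint 1

# Read side: how many home-timeline loads arrive per second.
feed_loads_per_sec = DAU * LOADS_PER_USER_PER_DAY / SECONDS_PER_DAY

# Write side: timeline inserts per second if every tweet is pushed to 200 feeds.
fanout_writes_per_sec = TWEETS_PER_SEC * AVG_FOLLOWS

# Feed loads per tweet posted; a rough gauge of how read-heavy the system is.
loads_per_tweet = feed_loads_per_sec / TWEETS_PER_SEC

# One celebrity tweet pushed to every follower, expressed as seconds of the
# whole system's normal fan-out throughput.
celebrity_tweet_seconds = CELEBRITY_FOLLOWERS / fanout_writes_per_sec

print(f"feed loads/sec:        {feed_loads_per_sec:,.0f}")
print(f"fan-out writes/sec:    {fanout_writes_per_sec:,}")
print(f"feed loads per tweet:  {loads_per_tweet:.0f}")
print(f"celebrity tweet cost:  {celebrity_tweet_seconds:.0f} s of fan-out capacity")
```

The last figure is the quantitative case for the hybrid: a single pushed celebrity tweet would consume more than a minute of the entire system's fan-out budget.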

Self-score: tally the points you would have mentioned unprompted. 7+ is interview-ready on this problem.

Common mistakes

Naive: query all followed users' tweets at read time and sort (doesn't scale past ~1K users)
Ignores celebrity problem; fan-out-on-write to 100M followers
Stores full tweet bytes in feed (should be IDs only with DB fetch)
No eviction / TTL on pre-computed feeds (memory explodes)
Single Postgres table for all tweets (billions of rows)
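Two of the items above (full tweet bytes in the feed, no eviction or TTL) are easiest to see by contrast with a bounded, ID-only feed. The sketch below is an in-memory illustration under stated assumptions: `FeedCache`, `hydrate`, the 800-entry cap, and the 7-day TTL are all hypothetical choices, and the policy of refreshing the TTL on read while skipping pushes to expired feeds is one option among several. With Redis sorted sets, the same idea would map roughly onto `ZADD`, `ZREMRANGEBYRANK`, and `EXPIRE`.

```python
import time

MAX_FEED_ENTRIES = 800         # hypothetical per-user cap
FEED_TTL_SECONDS = 7 * 86_400  # hypothetical idle TTL


class FeedCache:
    """Per-user timelines holding (timestamp, tweet_id) pairs only, never tweet bodies."""

    def __init__(self, max_entries=MAX_FEED_ENTRIES, ttl=FEED_TTL_SECONDS, clock=time.time):
        self.max_entries = max_entries
        self.ttl = ttl
        self.clock = clock
        self.feeds = {}  # user_id -> [expires_at, entries newest first]

    def _live(self, user_id):
        """Return the feed slot if present and unexpired, else drop it and return None."""
        slot = self.feeds.get(user_id)
        if slot is None or slot[0] <= self.clock():
            self.feeds.pop(user_id, None)
            return None
        return slot

    def backfill(self, user_id, entries):
        """Rebuild a feed (e.g. from the tweet DB) when the cache has none."""
        newest_first = sorted(entries, reverse=True)[: self.max_entries]
        self.feeds[user_id] = [self.clock() + self.ttl, newest_first]

    def push(self, user_id, tweet_id, ts):
        """Fan-out write. Skips users with no live feed; they backfill on next read."""
        slot = self._live(user_id)
        if slot is None:
            return False
        slot[1].append((ts, tweet_id))
        slot[1].sort(reverse=True)
        del slot[1][self.max_entries:]  # evict the oldest entries beyond the cap
        return True

    def read_ids(self, user_id, limit=20):
        """Return newest tweet IDs, or None if the caller must backfill."""
        slot = self._live(user_id)
        if slot is None:
            return None
        slot[0] = self.clock() + self.ttl  # a read counts as activity
        return [tweet_id for _, tweet_id in slot[1][:limit]]


def hydrate(tweet_ids, tweet_store):
    """Resolve IDs to tweet bodies, skipping IDs that no longer exist."""
    return [tweet_store[t] for t in tweet_ids if t in tweet_store]
```

Because the feed holds only IDs, a deleted or edited tweet needs no feed rewrite: `hydrate` simply reflects the current state of the tweet store.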