Exercise · Storage & Data

Amazon S3

Whiteboard exercise. Try the problem cold, then check the rubric to self-score.

Out of 10 points · 45 min whiteboard
01

Prompt

Design a bucket-and-key object store with read-your-writes consistency, ~11 nines of durability, and exabyte-scale capacity. The hard parts: a key → bytes service that doesn't fall over at millions of requests per second; erasure-coded durability that is cheaper than 3× replication yet still rebuilds gracefully after failures; and a hot-partition story: what happens when everyone writes to keys starting with 2026-04-12/? For scale, S3 stores hundreds of trillions of objects and serves tens of millions of requests per second.

Time budget: 45 min whiteboard. Draw architecture, estimate numbers, discuss tradeoffs.

02

Hints (progressive)

Hint 1

Durability math is the core. 11 nines is a number. Work backward from it: what does that imply about replication factor, geographic spread, verification?
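
One way to work backward from the durability number is a rough independent-failure model. The sketch below is illustrative only: the 2% annual drive failure rate, the one-day repair window, and the assumption that shard failures are independent are placeholders chosen for the example, not S3's published parameters. The function names are hypothetical.

```python
from math import comb, log10


def window_loss_probability(n: int, tolerated: int, p: float) -> float:
    """Probability that more than `tolerated` of `n` shards fail within one
    repair window, assuming independent failures with per-shard probability p."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(tolerated + 1, n + 1)
    )


def annual_nines(n: int, tolerated: int, afr: float = 0.02,
                 repair_days: float = 1.0) -> float:
    """Approximate 'nines' of annual durability for a single object.

    afr:         assumed annual failure rate of one drive (placeholder value)
    repair_days: assumed time to detect and rebuild a lost shard (placeholder)
    """
    windows_per_year = 365.0 / repair_days
    p = afr / windows_per_year  # chance one shard dies inside a window
    annual_loss = windows_per_year * window_loss_probability(n, tolerated, p)
    return -log10(annual_loss)


if __name__ == "__main__":
    # 3 full copies: data survives the loss of any 2 of the 3.
    print(f"3x replication: {annual_nines(3, 2):.1f} nines")
    # 10 data + 4 parity shards: data survives the loss of any 4 of the 14.
    print(f"RS(10, 4):      {annual_nines(14, 4):.1f} nines")
```

The model ignores correlated failures (a rack, a power domain, a whole facility), which is exactly why the hint also asks about geographic spread and verification: independence is the assumption that breaks first in practice.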

Hint 2

Name erasure coding by type. "Reed-Solomon (10, 4)" or "(12, 4)" is much more credible than "we use erasure coding." Know the storage-vs-durability tradeoff by heart.
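
The storage-vs-durability tradeoff reduces to a few lines of arithmetic. The sketch below treats 3× replication as a degenerate code with 1 data shard and 2 extra copies, and assumes classic Reed-Solomon repair, which reads k surviving shards to rebuild one lost shard. The helper names are made up for illustration.

```python
def storage_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per byte of user data."""
    return (data_shards + parity_shards) / data_shards


def repair_reads(data_shards: int) -> int:
    """Shards read to rebuild one lost shard (classic Reed-Solomon repair)."""
    return data_shards


# (data shards k, parity shards m); replication is modeled as k=1, m=2.
schemes = {
    "3x replication": (1, 2),
    "RS(10, 4)": (10, 4),
    "RS(12, 4)": (12, 4),
}

for name, (k, m) in schemes.items():
    print(
        f"{name}: {storage_overhead(k, m):.2f}x raw storage, "
        f"survives {m} lost shards, reads {repair_reads(k)} shard(s) per rebuild"
    )
```

RS(10, 4) stores 1.4× the user bytes and RS(12, 4) about 1.33×, against 3× for replication, and both tolerate four lost shards instead of two. The price is repair traffic: rebuilding one shard means reading 10 or 12 others rather than copying a single replica.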

Hint 3

Split the two planes. Index service (metadata) vs data fleet (bytes). Don't muddle them. The index is transactional; bytes are immutable blobs.
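
A toy model can make the two-plane split concrete. Everything here is a hypothetical sketch: the class names, the in-memory dictionaries, and the choice to address blobs by content hash are assumptions for illustration, not a description of how S3 is built.

```python
import hashlib
from typing import Dict, Tuple


class DataFleet:
    """Byte plane: immutable blobs, written once and never modified."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, payload: bytes) -> str:
        blob_id = hashlib.sha256(payload).hexdigest()
        self._blobs.setdefault(blob_id, payload)  # existing blobs are untouched
        return blob_id

    def read(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


class IndexService:
    """Metadata plane: (bucket, key) -> blob id, updated in one step."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], str] = {}

    def commit(self, bucket: str, key: str, blob_id: str) -> None:
        self._entries[(bucket, key)] = blob_id

    def lookup(self, bucket: str, key: str) -> str:
        return self._entries[(bucket, key)]


class ObjectStore:
    def __init__(self) -> None:
        self.index = IndexService()
        self.data = DataFleet()

    def put(self, bucket: str, key: str, payload: bytes) -> None:
        blob_id = self.data.write(payload)       # 1. store the bytes first
        self.index.commit(bucket, key, blob_id)  # 2. then publish the pointer

    def get(self, bucket: str, key: str) -> bytes:
        return self.data.read(self.index.lookup(bucket, key))


store = ObjectStore()
store.put("photos", "cat.jpg", b"v1")
store.put("photos", "cat.jpg", b"v2")  # overwrite = new blob + index update
print(store.get("photos", "cat.jpg"))
```

Because an overwrite only swaps the index pointer, a reader sees either the old object or the new one, never a mix, and a read after a completed put goes through the updated index entry. That is the intuition behind keeping the transactional logic in the index and the bytes immutable.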

03

Rubric — 10 points

  • +2 Durability math is the core. 11 nines is a number. Work backward from it: what does that imply about replication factor, geographic spread, verification?
  • +2 Name erasure coding by type. "Reed-Solomon (10, 4)" or "(12, 4)" is much more credible than "we use erasure coding." Know the storage-vs-durability tradeoff by heart.
  • +2 Split the two planes. Index service (metadata) vs data fleet (bytes). Don't muddle them. The index is transactional; bytes are immutable blobs.
  • +2 Hot partitions are the classic follow-up. The interviewer will ask "what if everyone's key starts with today's date?" Have the auto-split + hash-prefix answer ready.
  • +2 Multipart upload is more than chunking. It's resumable, parallelizable, and lets you abort partial work. Describe each of those benefits distinctly.
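
The hash-prefix half of the hot-partition answer can be sketched in a few lines. The setup is an assumption made for the example: a toy index that is range-partitioned lexicographically into 16 ranges, with hypothetical helper names. Auto-splitting a hot range is the other half of the answer and is not shown.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical index layout: 15 split points give 16 lexicographic key ranges.
SPLIT_POINTS = list("123456789abcdef")


def partition_for(stored_key: str) -> int:
    """Toy range partitioner: find the key range that contains stored_key."""
    return bisect.bisect_right(SPLIT_POINTS, stored_key)


def hash_prefixed(key: str, prefix_len: int = 2) -> str:
    """Prepend a short hash of the key so adjacent keys scatter across ranges."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


def load_by_partition(keys) -> Counter:
    return Counter(partition_for(k) for k in keys)


keys = [f"2026-04-12/object-{i}" for i in range(10_000)]
raw = load_by_partition(keys)
spread = load_by_partition(hash_prefixed(k) for k in keys)

print("partitions hit without prefix:  ", len(raw))
print("partitions hit with hash prefix:", len(spread))
```

Without the prefix, every date-stamped key sorts into the same range and one partition takes all the writes. With it, the same keys scatter across all ranges. The cost is that a lexicographic listing by date no longer maps to one contiguous range, which is worth stating as the tradeoff.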

Self-score: tally the points you would have mentioned unprompted. 7+ is interview-ready on this problem.

04

Red flags (things that tank the interview)

  • No back-of-envelope estimation — jumps straight into components without quantifying scale for Amazon S3
  • Single point of failure — no replication, failover, or redundancy discussed
  • Ignores data model and storage choices — hand-waves the database layer
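
To avoid the first red flag, a back-of-envelope pass can be as short as the sketch below. The object count and request rate are single points picked from the ranges in the prompt; the per-entry index size and the per-partition request ceiling are hypothetical round numbers chosen only to make the arithmetic concrete.

```python
import math

OBJECTS = 300 * 10**12          # assumed point within "hundreds of trillions"
REQUESTS_PER_SEC = 50 * 10**6   # assumed point within "tens of millions"
INDEX_ENTRY_BYTES = 1_000       # hypothetical metadata size per object
PARTITION_RPS = 5_000           # hypothetical ceiling for one index partition


def index_size_petabytes(objects: int, entry_bytes: int) -> float:
    """Total index size in PB (1 PB = 10**15 bytes), before replication."""
    return objects * entry_bytes / 10**15


def min_index_partitions(total_rps: int, per_partition_rps: int) -> int:
    """Fewest partitions that can absorb the load if it is spread evenly."""
    return math.ceil(total_rps / per_partition_rps)


print(f"index size: {index_size_petabytes(OBJECTS, INDEX_ENTRY_BYTES):.0f} PB")
print(f"index partitions: {min_index_partitions(REQUESTS_PER_SEC, PARTITION_RPS):,}")
```

Under these assumptions the metadata alone runs to hundreds of petabytes and needs on the order of ten thousand partitions, which is enough to justify treating the index as its own distributed system rather than "a database".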