Exercise · Communication

Gmail

Whiteboard exercise. Try the problem cold, then reveal the rubric to self-score.

Out of 10 points · 45 min whiteboard
01

Prompt

Design an email service for ~1.8B users that handles ~1B incoming emails per day, plus outgoing mail. The hard parts: an SMTP ingress that accepts mail from anyone on the public internet while rejecting terabytes of spam per day; a per-user mailbox store sharded across thousands of machines, with search, labels, and conversation threading; and delivery reputation management so that your outbound mail doesn't get marked as spam by other providers. Gmail pioneered "search, don't sort": it replaced folders with labels plus full-text search, which changed the email paradigm.

Time budget: 45 min whiteboard. Draw architecture, estimate numbers, discuss tradeoffs.

02

Hints (progressive)

Hint 1

SMTP is not HTTP. Many candidates model ingress as a REST endpoint. SMTP's temp-fail retry model is core; losing that means losing mail.
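
A minimal sketch of the temp-fail idea, assuming a hypothetical ingress that classifies each delivery attempt into one of three outcomes. The `Outcome` enum and function names are illustrative, not part of any real mail server. The reply codes follow the standard SMTP convention: 2xx means the receiver has taken responsibility for the message, 4xx asks the sender to queue and retry, 5xx is a permanent rejection.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical result of trying to accept one inbound message."""
    STORED = "stored"                # message durably written to the mailbox store
    TRANSIENT_FAILURE = "transient"  # e.g. the target shard is briefly unavailable
    REJECTED = "rejected"            # e.g. policy or spam rejection


def smtp_reply(outcome: Outcome) -> tuple[int, str]:
    """Map an ingress outcome to an SMTP reply code and text."""
    if outcome is Outcome.STORED:
        # Only acknowledge after the durable write: 250 transfers responsibility.
        return 250, "OK: message accepted"
    if outcome is Outcome.TRANSIENT_FAILURE:
        # 4xx tells the sending server to keep the message queued and retry.
        return 451, "Temporary failure, please retry later"
    # 5xx is permanent: the sender should bounce instead of retrying.
    return 550, "Message rejected"


def sender_should_retry(code: int) -> bool:
    """A compliant sending server retries only on 4xx replies."""
    return 400 <= code < 500
```

The design point is that the receiver never needs its own retry queue for inbound mail: as long as it withholds the 250 until the write is durable, the sender's queue is the safety net.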

Hint 2

Spam is 40% of the story. Not an afterthought. DKIM/SPF/DMARC + ML + user feedback loop is the engineering mass of the inbound path.

Hint 3

Per-user sharding + labels + search. The "Gmail innovation" triangle. Don't present a folder-based model — mention labels + search explicitly.
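
One way to picture labels plus search on the whiteboard is a per-user structure with two indexes: label to message IDs, and term to message IDs. The sketch below is an in-memory toy under that assumption; the `Mailbox` class and its whitespace tokenizer are hypothetical and say nothing about how Gmail actually stores mail.

```python
from collections import defaultdict


class Mailbox:
    """Toy per-user mailbox: one copy of each message, many labels."""

    def __init__(self):
        self.bodies = {}                  # message id -> body text
        self.by_label = defaultdict(set)  # label -> message ids
        self.by_term = defaultdict(set)   # lowercase term -> message ids

    def add(self, msg_id, body, labels):
        self.bodies[msg_id] = body
        for label in labels:
            # A message can carry several labels; a folder model allows only one.
            self.by_label[label].add(msg_id)
        for term in body.lower().split():
            self.by_term[term].add(msg_id)

    def search(self, term, label=None):
        """Return ids containing `term`, optionally restricted to one label."""
        hits = set(self.by_term.get(term.lower(), set()))
        if label is not None:
            hits &= self.by_label.get(label, set())
        return sorted(hits)


mb = Mailbox()
mb.add("m1", "Quarterly invoice attached", {"inbox", "finance"})
mb.add("m2", "Invoice reminder", {"finance"})
mb.add("m3", "Lunch tomorrow", {"inbox"})
```

Because both indexes live inside one user's mailbox, sharding by user keeps every label filter and search query on a single shard.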

03

Rubric — 10 points

  • +2 SMTP is not HTTP. Many candidates model ingress as a REST endpoint. SMTP's temp-fail retry model is core; losing that means losing mail.
  • +2 Spam is 40% of the story. Not an afterthought. DKIM/SPF/DMARC + ML + user feedback loop is the engineering mass of the inbound path.
  • +2 Per-user sharding + labels + search. The "Gmail innovation" triangle. Don't present a folder-based model — mention labels + search explicitly.
  • +2 Threading is subtle. In-Reply-To + References + subject-similarity fallback. Mention all three.
  • +2 Outbound reputation matters as much as inbound filtering. Gmail's deliverability is a product feature. Don't forget to architect the send side.

Self-score: tally the points you would have mentioned unprompted. 7+ is interview-ready on this problem.
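
The threading item in the rubric names three signals. A minimal sketch of how they might be ordered is below, assuming exact match on a normalized subject stands in for "subject similarity". The `Threader` class is hypothetical and is not a description of Gmail's actual rules.

```python
import re

# Strip any run of leading "Re:" / "Fw:" / "Fwd:" prefixes, case-insensitively.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    return _PREFIX.sub("", subject).strip().lower()


class Threader:
    def __init__(self):
        self.thread_of_message = {}  # Message-ID -> thread id
        self.thread_of_subject = {}  # normalized subject -> thread id
        self._next_thread = 0

    def assign(self, message_id, subject, in_reply_to=None, references=()):
        # 1) In-Reply-To, then 2) References, newest ancestor first.
        candidates = ([in_reply_to] if in_reply_to else []) + list(reversed(references))
        thread = next(
            (self.thread_of_message[c] for c in candidates if c in self.thread_of_message),
            None,
        )
        # 3) Fall back to the normalized subject when no header matches.
        key = normalize_subject(subject)
        if thread is None and key:
            thread = self.thread_of_subject.get(key)
        if thread is None:
            thread = self._next_thread
            self._next_thread += 1
        self.thread_of_message[message_id] = thread
        if key:
            self.thread_of_subject.setdefault(key, thread)
        return thread


t = Threader()
a = t.assign("<1@x>", "Launch plan")
b = t.assign("<2@x>", "Re: Launch plan", in_reply_to="<1@x>")
c = t.assign("<3@x>", "RE: launch plan", references=["<1@x>", "<2@x>"])
d = t.assign("<4@x>", "Fwd: Launch plan")  # no headers, subject fallback
e = t.assign("<5@x>", "Different topic")
```

The subject fallback is the risky leg: unrelated messages that share a subject would merge, so a fuller design would likely bound it with a time window or participant overlap.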

04

Red flags (things that tank the interview)

  • No back-of-envelope estimation — jumps straight into components without quantifying Gmail's scale
  • Single point of failure — no replication, failover, or redundancy discussed
  • Ignores data model and storage choices — hand-waves the database layer
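
To avoid the first red flag, the estimation can be as short as the sketch below. The user and message counts come from the prompt. The average message size, peak factor, and replica count are assumptions chosen only to make the arithmetic concrete, so swap in your own.

```python
SECONDS_PER_DAY = 24 * 60 * 60

INCOMING_PER_DAY = 1_000_000_000  # from the prompt
USERS = 1_800_000_000             # from the prompt
AVG_MESSAGE_BYTES = 75_000        # assumption: ~75 KB average, attachments included
PEAK_FACTOR = 3                   # assumption: peak traffic vs. daily average
REPLICAS = 3                      # assumption: copies kept of each message

avg_msgs_per_sec = INCOMING_PER_DAY / SECONDS_PER_DAY
peak_msgs_per_sec = avg_msgs_per_sec * PEAK_FACTOR
raw_bytes_per_day = INCOMING_PER_DAY * AVG_MESSAGE_BYTES
replicated_bytes_per_day = raw_bytes_per_day * REPLICAS
msgs_per_user_per_day = INCOMING_PER_DAY / USERS

print(f"average ingress: {avg_msgs_per_sec:,.0f} msgs/s")
print(f"peak ingress:    {peak_msgs_per_sec:,.0f} msgs/s")
print(f"raw storage:     {raw_bytes_per_day / 1e12:,.0f} TB/day")
print(f"with replicas:   {replicated_bytes_per_day / 1e12:,.0f} TB/day")
print(f"per user:        {msgs_per_user_per_day:.2f} msgs/day")
```

Under these assumptions the ingress averages roughly 11.6K messages per second and raw storage grows by about 75 TB per day before replication, which is enough to justify sharding the mailbox store and fronting it with a fleet of SMTP receivers.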