Gmail

An email service for ~1.8B users handling ~1B incoming emails per day plus outgoing. The hard parts: an SMTP ingress that accepts mail from anyone on the public internet while rejecting terabytes of daily spam; a per-user mailbox store sharded across thousands of machines with search, labels, and conversation threading; and delivery reputation management so your outbound mail doesn't get marked spam by other providers. Gmail pioneered "search, don't sort" — replaced folders with labels + full-text search, which changed the email paradigm.

⚡ Core: SMTP Ingress + Spam + Search · 1.8B users · ~1B emails/day · ~40% spam rejection · Per-user full-text index

Requirements

Functional

Receive email from any SMTP sender on the internet
Send email to any recipient domain (outbound SMTP)
Store per-user mailbox: read/unread, starred, labeled, archived
Conversation threading (stitch replies via In-Reply-To / References headers)
Full-text search across message bodies + attachments
Labels (Gmail's innovation — multi-label instead of folders), filters, priorities
Attachments up to ~25 MB inline; Drive links for larger
Spam + phishing filtering; DKIM/SPF/DMARC verification

Non-Functional

End-to-end delivery, from the sender's send to the recipient's inbox: < 60 seconds typical
99.99% availability for receive; 99.9% for send (retries are built into SMTP)
No lost mail — durability is sacred; users lose trust fast
Inbox load < 200 ms p99; search < 500 ms p99
Strong consistency on user actions (mark-read, archive, delete)
Scale to ~1.8B users, ~15 GB avg quota, ~25 EB total storage pre-dedup

Scale Estimation

Users: ~1.8B. Disclosed by Google; ~300M Workspace + ~1.5B consumer.
Emails / sec (in): ~12K avg. ~1B/day pre-spam ÷ 86,400 s; peak ~100K during business hours.
Spam filtered: ~40%. Google blocks tens of billions of spam/phishing msgs/day; most at the edge.
Avg message size: ~75 KB. HTML body + headers; attachments stored separately.
Total storage: ~25 EB. 1.8B × ~15 GB avg use; deduped across threads; tiered hot/cold.
Search index size: ~1% of data. Inverted index per user; tokens from subject + body + attachments.
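The estimates above can be rechecked with a few lines of arithmetic. This is only a back-of-envelope sketch using the section's own round inputs (the variable names are illustrative, and decimal units are assumed). Dividing ~1B messages/day by 86,400 seconds gives an average on the order of 11.6K msg/s, and 1.8B × 15 GB comes to 27 EB, which the text rounds to ~25 EB.

```python
# Back-of-envelope check of the scale estimates.
# All inputs are the section's round numbers, not measured values.
SECONDS_PER_DAY = 86_400

inbound_per_day = 1_000_000_000   # ~1B emails/day, pre-spam
users = 1_800_000_000             # ~1.8B users
avg_use_gb = 15                   # ~15 GB per user (assumed average)
avg_msg_kb = 75                   # ~75 KB per message

avg_inbound_per_sec = inbound_per_day / SECONDS_PER_DAY
total_storage_eb = users * avg_use_gb / 1e9            # GB -> EB (decimal)
daily_ingest_tb = inbound_per_day * avg_msg_kb / 1e9   # KB -> TB (decimal)

print(f"avg inbound rate: {avg_inbound_per_sec:,.0f} msg/s")
print(f"total storage:    {total_storage_eb:.0f} EB")
print(f"daily ingest:     {daily_ingest_tb:.0f} TB/day")
```

The daily ingest figure (about 75 TB/day of message bodies) is derived here and is not stated in the text; it follows directly from the message count and average size.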

API Design

Three surfaces: the SMTP edge (internet-facing protocol, not HTTP), the web/mobile API (HTTP+JSON for client apps), and IMAP/POP (legacy clients). Most of the engineering attention goes into the SMTP edge and the web API.

SMTP port 25 on Gmail's MX hosts (e.g. gmail-smtp-in.l.google.com): receive from internet

Standard SMTP conversation: HELO/EHLO → MAIL FROM → RCPT TO → DATA. Sender verification (SPF lookup on the sender domain), DKIM signature check, and DMARC policy enforcement all happen here. Most spam is rejected at SMTP time, often before any message bytes are accepted.
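The command ordering in that conversation can be sketched as a small server-side state machine. This is an illustrative toy, not a real MTA: the `SmtpSession` class and its state names are hypothetical, there is no networking or SPF/DKIM logic, and only the reply codes (250, 354, 500, 503) are taken from the SMTP standard.

```python
class SmtpSession:
    """Tracks where one client is in the HELO -> MAIL FROM -> RCPT TO -> DATA sequence."""

    def __init__(self) -> None:
        self.state = "start"

    def handle(self, line: str) -> int:
        """Return the SMTP reply code for one command line."""
        verb = line.strip().upper()
        if verb.startswith(("HELO", "EHLO")):
            self.state = "greeted"
            return 250
        if verb.startswith("MAIL FROM:"):
            if self.state != "greeted":
                return 503          # bad sequence of commands
            self.state = "have_sender"
            return 250
        if verb.startswith("RCPT TO:"):
            # Several RCPT TO commands may follow one MAIL FROM.
            if self.state not in ("have_sender", "have_rcpt"):
                return 503
            self.state = "have_rcpt"
            return 250
        if verb == "DATA":
            if self.state != "have_rcpt":
                return 503
            self.state = "greeted"  # simplified: treat the transaction as finished
            return 354              # "start mail input"
        return 500                  # command not recognized


session = SmtpSession()
for cmd in ["EHLO mta.example.org", "MAIL FROM:<a@example.org>",
            "RCPT TO:<b@example.com>", "DATA"]:
    print(cmd, "->", session.handle(cmd))
```

Because every step returns a reply code, the edge can refuse a sender at any point in the sequence, which is what makes rejection before the DATA phase possible.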

GET /api/v1/messages?q=from:boss&labelIds=INBOX&maxResults=50

List messages matching a query. Gmail's query syntax is the primary API. Returns message stubs with metadata + snippets. Backed by a per-user search index (an Elasticsearch equivalent).
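To make the query-string idea concrete, here is a minimal sketch of splitting a query such as `from:boss receipt` into operators and free-text terms. The `parse_query` function is hypothetical, and it deliberately ignores quoted phrases, negation, and OR, which a real query language would need.

```python
def parse_query(q: str) -> tuple[dict[str, list[str]], list[str]]:
    """Split a query into operator:value pairs and plain search terms."""
    operators: dict[str, list[str]] = {}
    terms: list[str] = []
    for token in q.split():
        key, sep, value = token.partition(":")
        if sep and key and value:
            # e.g. "from:boss" -> operators["from"] = ["boss"]
            operators.setdefault(key.lower(), []).append(value.lower())
        else:
            terms.append(token.lower())
    return operators, terms


ops, terms = parse_query("from:boss receipt has:attachment")
print(ops)    # {'from': ['boss'], 'has': ['attachment']}
print(terms)  # ['receipt']
```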

GET /api/v1/messages/{id}?format=full

Fetch the full message, including body, headers, and attachment refs. Kept separate from listing so list operations stay fast and bodies are loaded lazily.

POST /api/v1/messages/send

Submit an outgoing message (web UI). The server DKIM-signs it, stores it in "Sent", and enqueues it for outbound SMTP delivery. Returns message_id immediately; delivery happens asynchronously.

POST /api/v1/messages/{id}/modify

Add or remove labels (marking read = removing the UNREAD label). Strongly consistent write to the user's datastore; this is the fast path for UI responsiveness.

GET /api/v1/threads/{id}

Fetch the full conversation (all replies in a thread). The thread ID is computed server-side from the Message-ID chain.
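One way to picture "thread ID from the Message-ID chain" is the sketch below. It assumes messages arrive in order and that each carries a `message_id` plus optional `in_reply_to` and `references` fields; the function name and dict layout are hypothetical. It does not handle out-of-order arrival, thread merging, or subject-similarity fallbacks.

```python
def assign_threads(messages):
    """Map each message_id to a thread_id by following reply headers.

    A message joins the thread of the first known ancestor it references;
    otherwise it starts a new thread rooted at its own Message-ID.
    """
    thread_of = {}  # message_id -> thread_id
    for msg in messages:
        parents = list(msg.get("references", []))
        if msg.get("in_reply_to"):
            parents.append(msg["in_reply_to"])
        thread_id = next((thread_of[p] for p in parents if p in thread_of), None)
        if thread_id is None:
            thread_id = msg["message_id"]
        thread_of[msg["message_id"]] = thread_id
    return thread_of


inbox = [
    {"message_id": "a"},
    {"message_id": "b", "in_reply_to": "a"},
    {"message_id": "c", "references": ["a", "b"], "in_reply_to": "b"},
    {"message_id": "d"},
]
print(assign_threads(inbox))
```

Here messages a, b, and c end up in the thread rooted at "a", while d starts its own thread.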

POST /api/v1/messages/{id}/report-spam

User reports spam. This does more than move the message: the report feeds back into the spam classifier as a user signal, which is how the filter improves over time.

Architecture

The system has a distinctive shape: an SMTP edge fleet facing the hostile internet, a spam + policy layer that decides accept/reject/quarantine, a per-user mailbox store (sharded by user_id) holding messages + labels + thread data, and a search index per user. An outbound SMTP fleet handles send-mail with delivery reputation management.

[Figure: Gmail end-to-end architecture]

Request Flow

Sender MTA (internet SMTP) → SMTP edge (IP rep + SPF) → DKIM verify (crypto auth) → Spam / virus (ML classifier) → Mailbox svc (write by user) → Search index (invert body) → Notify (push + IMAP)

Deep Dive — Inbound Flow + Spam / Phishing Defense

When mail arrives on port 25 of a Gmail MX host, these steps run before the message is stored:

Connection-level check. Sender IP reputation lookup. Known bad-actor networks get 4xx temp-fail (forces retry from clean IPs) or hard reject.
SMTP handshake. Standard SMTP. At MAIL FROM, the sender domain's SPF record is checked: "is this IP authorized to send for this domain?" Failures are flagged, then rejected or quarantined according to the domain's DMARC policy.
DATA phase. Accept message bytes. On body completion, verify the DKIM signature (cryptographic proof that the domain owner signed this message). Unsigned messages and invalid signatures are heavily penalized by the spam classifier.
Spam classifier. ML model reads headers + body features (links, suspicious TLDs, phrase patterns, sender reputation, recipient engagement history with this sender, etc.). Output: {spam_prob, phishing_prob, label}.
Attachment + URL scanning. Attachments run through VirusTotal-equivalent in-house scanner. URLs checked against Google Safe Browsing. Phishing URLs trigger rewrite to warning interstitial.
Routing decision. Accept (→ inbox), accept-to-spam (→ spam folder), reject (4xx temp or 5xx hard). Non-accepts return over SMTP so sending server knows.
Storage write. Accepted messages written to per-user mailbox (sharded by recipient user_id). Thread computed from headers (In-Reply-To + References + subject-similarity heuristics). Search index updated.
Notify. Push notification to the device if the user has the app installed; IMAP IDLE notifies mail clients that hold an open connection.
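The routing decision in the steps above can be sketched as a single function over the classifier output. Everything here is an assumption for illustration: the `Verdict` fields, the 0.5 cutoff, and the choice of 451 for the temp-fail are hypothetical, and a real edge would apply the IP reputation check before DATA rather than alongside the content scores.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    spam_prob: float
    phishing_prob: float
    malware: bool = False
    bad_ip_reputation: bool = False


def route(v: Verdict, spam_cutoff: float = 0.5) -> tuple[int, str]:
    """Return (SMTP reply code, destination) for one scored message."""
    if v.malware:
        return 550, "REJECT"      # hard reject: the sender gets a permanent failure
    if v.bad_ip_reputation:
        return 451, "TEMPFAIL"    # temp-fail: the sender is expected to retry later
    if max(v.spam_prob, v.phishing_prob) >= spam_cutoff:
        return 250, "SPAM"        # accepted, but stored under the SPAM label
    return 250, "INBOX"


print(route(Verdict(spam_prob=0.02, phishing_prob=0.01)))
print(route(Verdict(spam_prob=0.93, phishing_prob=0.10)))
print(route(Verdict(spam_prob=0.10, phishing_prob=0.10, malware=True)))
```

Note that both accept paths return 250 to the sender; only the stored label differs, so the sender cannot tell inbox placement from spam-folder placement.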

Inbound Sequence: SMTP → Inbox (Mermaid)

sequenceDiagram
    participant S as Sender MTA
    participant E as SMTP edge
    participant F as Spam / virus
    participant M as Mailbox svc
    participant IX as Search index
    participant N as Notify
    S->>E: CONNECT + HELO
    E->>E: IP rep lookup
    S->>E: MAIL FROM + RCPT TO
    E->>E: SPF + DMARC check
    S->>E: DATA (headers+body)
    E->>E: DKIM verify
    E->>F: score message
    alt accept to inbox
        F-->>E: ham
        E->>M: store for user
        M->>IX: index body
        M->>N: fanout event
        E-->>S: 250 OK
    else accept to spam
        F-->>E: spam
        E->>M: store to SPAM label
        E-->>S: 250 OK (quiet)
    else reject
        F-->>E: malware / policy
        E-->>S: 550 Rejected
    end

Per-user mailbox shard. Bigtable row key is something like user_id#inv_ts#msg_id — so all messages for a user are co-located and scanning the inbox by time is a cheap range read. Labels stored as a separate indexed table (user_id#label#inv_ts#msg_id → msg_id) so "all messages with label=Work" is also a range read.
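A short sketch shows why the inverted timestamp makes "newest first" a plain range read. The helper names, the microsecond resolution, and the 19-digit zero padding are assumptions made for the example; the key shapes follow the `user_id#inv_ts#msg_id` and `user_id#label#inv_ts#msg_id` patterns described above. It also assumes user IDs and labels never contain `#`.

```python
MAX_TS = 2**63 - 1  # assumed upper bound for a microsecond timestamp


def message_row_key(user_id: str, ts_micros: int, msg_id: str) -> str:
    # Newer messages get a smaller inv_ts, so they sort first lexicographically.
    inv_ts = MAX_TS - ts_micros
    return f"{user_id}#{inv_ts:019d}#{msg_id}"


def label_row_key(user_id: str, label: str, ts_micros: int, msg_id: str) -> str:
    inv_ts = MAX_TS - ts_micros
    return f"{user_id}#{label}#{inv_ts:019d}#{msg_id}"


keys = sorted(
    message_row_key("u42", ts, msg_id)
    for ts, msg_id in [(100, "m1"), (300, "m3"), (200, "m2")]
)
for key in keys:
    print(key)  # m3 first, then m2, then m1
```

The zero padding matters: without a fixed width, string ordering and numeric ordering of `inv_ts` would disagree.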

Search. Each user has their own inverted index. A query "from:boss receipt" intersects the "from:boss" posting list with the "receipt" posting list, restricted to that user's corpus. Indexes live in a separate service and are built asynchronously after message ingest (a few seconds of lag at most). A Zanzibar-style ACL check applies on every query: search never crosses users.
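The posting-list intersection can be illustrated with an in-memory toy. The `UserIndex` class is hypothetical and uses Python sets where a real index would use sorted, compressed posting lists; tokenization is a plain whitespace split.

```python
from collections import defaultdict


class UserIndex:
    """Inverted index for a single user's mailbox."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)  # token -> set of message ids

    def add(self, msg_id: int, sender: str, body: str) -> None:
        self.postings[f"from:{sender.lower()}"].add(msg_id)
        for word in body.lower().split():
            self.postings[word].add(msg_id)

    def search(self, *tokens: str) -> set:
        # Intersect starting from the shortest list to keep the work small.
        lists = sorted((self.postings.get(t, set()) for t in tokens), key=len)
        if not lists:
            return set()
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
        return result


# One index per user: a query can only ever see the caller's own corpus.
indexes = defaultdict(UserIndex)
indexes["u42"].add(1, "boss", "receipt attached")
indexes["u42"].add(2, "boss", "lunch tomorrow")
indexes["u42"].add(3, "alice", "receipt for lunch")
print(indexes["u42"].search("from:boss", "receipt"))  # {1}
```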

Outbound reputation. To send email that actually lands in inboxes (not spam folders at the recipient), Gmail must maintain pristine IP reputation. Outbound fleet sends from dedicated IPs with SPF + DKIM signing. Rate limits per sending-user. Bulk senders (marketing accounts) are flagged and throttled. Gmail's deliverability advantage is maintained by aggressively shutting down senders who get reported as spam.
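Per-sending-user rate limits are often implemented with a token bucket; the sketch below is one such option, not a description of Gmail's actual limiter. The class name, capacity, and refill rate are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allows bursts up to `capacity` sends, refilled at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0)])  # burst of 2, then blocked
print(bucket.allow(1.0))  # one token refilled after a second
```

A bulk sender would simply be assigned a smaller bucket or a slower refill rate than an ordinary account.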

Interview answer

"SMTP edge does connection-level IP reputation, SPF/DKIM/DMARC auth, then accepts DATA. Before writing, message scores through ML spam + phishing classifier with URL + attachment scanning. Accept decision routes to inbox, spam folder, or reject. Per-user mailbox on Bigtable keyed by (user_id, ts). Threading via In-Reply-To headers. Per-user search index built asynchronously. Outbound fleet DKIM-signs and maintains IP reputation via throttling + abuse response. User spam reports feed back into classifier training."

Tradeoffs & Design Choices

Labels vs folders. Gmail's big bet. Folders are hierarchical single-placement; labels are multi-valued. Enables "archive but I'll find it via search" — a UX that requires reliable fast search. Folders are storage-cheap; labels require reverse index.
Per-user sharding vs domain sharding. Per-user is simpler for consistency + per-user quotas + privacy boundaries. Domain sharding is natural for corporate Workspace. Gmail does per-user + domain-aware routing.
Threading algorithm quality. In-Reply-To/References is the RFC-compliant mechanism, but some email clients strip or omit these fields, so same-subject heuristics are needed as a fallback. Each added heuristic carries false-positive risk. Interview-worthy because it's never fully solved.
Attachments inline vs separate. Separate: dedup across messages (same PDF attached to 100 forwards = 1 copy). Inline costs more storage. Gmail stores separate + dedup + garbage-collects when all refs gone.
Accept-and-filter vs reject-at-SMTP. Accept + filter to spam folder = better user experience (can recover). Reject = defends sender reputation. Gmail does both: soft-spam to folder for maybe-spam, hard-reject for obvious malware / IP-reputation-bad.
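The "separate + dedup + garbage-collect" choice for attachments can be sketched as a content-addressed store with reference counts. The `AttachmentStore` class is hypothetical, and SHA-256 as the content key is an assumption; the text does not say which hash or storage layout is used.

```python
import hashlib


class AttachmentStore:
    """Stores each distinct attachment once and frees it when unreferenced."""

    def __init__(self) -> None:
        self.blobs = {}  # content hash -> bytes
        self.refs = {}   # content hash -> number of messages referencing it

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = data          # first copy: actually store the bytes
        self.refs[key] = self.refs.get(key, 0) + 1
        return key

    def release(self, key: str) -> None:
        self.refs[key] -= 1
        if self.refs[key] == 0:             # last reference gone: collect the blob
            del self.refs[key]
            del self.blobs[key]


store = AttachmentStore()
first = store.put(b"%PDF-1.7 quarterly report")
second = store.put(b"%PDF-1.7 quarterly report")  # same file forwarded again
print(first == second, len(store.blobs))          # True 1
```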

Failure Modes

📬

Legitimate mail classified as spam

Important message (wedding invite, job offer) lands in spam. User misses it.

→ Mitigation: user spam reports (both directions) feed classifier; personalized signal per-user; "Never mark as spam" explicit override; frequent senders auto-whitelisted by engagement history.

🎣

Phishing slips through to inbox

Targeted phishing link evades classifier; user clicks; account compromised.

→ Mitigation: URL rewrite so all links route through Safe Browsing click-check at click-time (not just send-time); suspicious-sender warning banners for mails from outside the org; MFA on the account reduces impact of credential theft.

🌊

DDoS / mail bomb

Attacker sends 1M emails to one user (or spam from many IPs).

→ Mitigation: per-sender IP rate limiting at SMTP edge; per-recipient quota alerts; connection-level tarpitting on abuse; temp-fail 4xx forces retry backoff to thin the traffic.

💥

Mailbox shard failure

One Bigtable tablet goes offline. Millions of users affected.

→ Mitigation: Bigtable replication across AZs; inbound mail queued (not lost) during outage and flushed on recovery; reads serve from replica; end-user impact is latency spike, not data loss.

🌍

Outbound IP gets blacklisted

A compromised user account sends spam from Gmail IPs. Other providers blacklist us.

→ Mitigation: per-user outbound rate limit; anomaly detection (user suddenly sends 10× normal); auto-suspension on abuse signal; dedicated outbound IP pools so bad actors don't poison shared IPs.
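The "suddenly sends 10× normal" check can be sketched as a comparison against a per-user baseline. The function name, the use of a simple mean, and the minimum baseline of 5 sends/day are all assumptions for illustration; a production detector would likely use more robust statistics.

```python
from statistics import mean


def is_send_anomaly(history, today, factor=10.0, min_baseline=5.0):
    """Flag a user whose sends today exceed `factor` times their usual volume.

    history: recent daily send counts for this user.
    min_baseline: floor so new or quiet accounts are not flagged for tiny volumes.
    """
    baseline = max(mean(history), min_baseline) if history else min_baseline
    return today > factor * baseline


print(is_send_anomaly([20, 25, 15], today=250))  # True: 250 > 10 x 20
print(is_send_anomaly([20, 25, 15], today=150))  # False
```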

🔐

Search leaks across users

Bug in query builder forgets to scope by user; results include other users' mail. Catastrophic privacy event.

→ Mitigation: defense-in-depth — ACL check at API layer + query always filtered by user_id + row-level filtering in index service + audit logs on cross-user result anomalies. Never rely on one check.
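The layered checks in that mitigation can be sketched as follows. The `ScopedIndex` class and its shared row list are hypothetical and exist only to show three independent checks: one at the API boundary, one inside the query, and one on the results.

```python
class ScopedIndex:
    def __init__(self, rows):
        # rows: list of (owner_id, msg_id, text); a shared store for illustration only
        self._rows = rows

    def _raw_search(self, owner_id, term):
        # Layer 2: the owner filter is part of the query itself.
        return [(owner, msg_id) for owner, msg_id, text in self._rows
                if owner == owner_id and term in text.lower()]

    def search(self, caller_id, term):
        # Layer 1: refuse callers without an identity at the API boundary.
        if not caller_id:
            raise PermissionError("caller identity required")
        hits = self._raw_search(caller_id, term.lower())
        # Layer 3: re-verify ownership on the way out, independent of the query.
        leaked = [msg_id for owner, msg_id in hits if owner != caller_id]
        if leaked:
            raise RuntimeError(f"cross-user results detected: {leaked}")
        return [msg_id for _, msg_id in hits]


index = ScopedIndex([
    ("alice", "m1", "Invoice for March"),
    ("bob", "m2", "Invoice overdue"),
])
print(index.search("alice", "invoice"))  # ['m1']
```

If a future change broke the filter in `_raw_search`, the result check in `search` would still stop the leak and raise an error that can be audited.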

Interview Tips

SMTP is not HTTP. Many candidates model ingress as a REST endpoint. SMTP's temp-fail retry model is core; losing that means losing mail.
Spam is 40% of the story. Not an afterthought. DKIM/SPF/DMARC + ML + user feedback loop is the engineering mass of the inbound path.
Per-user sharding + labels + search. The "Gmail innovation" triangle. Don't present a folder-based model — mention labels + search explicitly.
Threading is subtle. In-Reply-To + References + subject-similarity fallback. Mention all three.
Outbound reputation matters as much as inbound filtering. Gmail's deliverability is a product feature. Don't forget to architect the send side.
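The temp-fail retry model mentioned in the first tip can be sketched from the sending side. The function name, the 5-minute base delay, the 4-hour cap, and the attempt limit are hypothetical values chosen for the example; only the meaning of the reply-code classes (2xx success, 4xx transient, 5xx permanent) comes from SMTP itself.

```python
def next_action(reply_code, attempt, base_delay_s=300, max_delay_s=4 * 3600, max_attempts=30):
    """Decide what a sending MTA does after one delivery attempt.

    Returns (action, delay_seconds) where action is 'delivered', 'retry', or 'bounce'.
    """
    if 200 <= reply_code < 300:
        return ("delivered", 0)
    if 400 <= reply_code < 500 and attempt < max_attempts:
        # Transient failure: keep the message queued and back off exponentially.
        delay = min(base_delay_s * 2 ** attempt, max_delay_s)
        return ("retry", delay)
    # Permanent failure, or retries exhausted: return the message to the sender.
    return ("bounce", 0)


for code, attempt in [(250, 0), (451, 0), (451, 3), (451, 10), (550, 0)]:
    print(code, attempt, next_action(code, attempt))
```

The key property is that a 4xx never drops the message: it stays in the sender's queue, which is why modelling ingress as a fire-and-forget REST call loses mail.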

Evolution

MVP — per-user folders, IMAP, simple MTA

Folder-based mailbox on local disk per user. Sendmail / Postfix for SMTP. No unified search. Works at Yahoo Mail early-2000s scale.

Search-first + labels (Gmail 2004)

Per-user inverted index. Multi-label messages instead of single-folder. "Just archive, you'll search for it" paradigm shift.

ML spam + DKIM/DMARC

Bayesian → deep-learning spam classifier. DKIM cryptographic sender auth. DMARC policy enforcement. Spam rate drops from 50% of inbox to < 1%.

Conversation view + smart categories

Thread detection + Primary/Social/Promotions tabs. Smart Reply + Smart Compose (ML-generated suggestions) added later.

LLM-integrated (Help me write, summarize)

Gemini-style models draft replies, summarize threads, extract action items. Requires per-user, per-thread context — surfacing relevant messages back into the model prompt is the new "search".
