URL Shortener

02

Requirements

Functional

Users can shorten any URL to a 7-character code
Anyone with the short link gets redirected to the original URL
Links can optionally have a custom expiry date
Creators can view click analytics — count, geography, device
Users can optionally provide a custom alias instead of an auto-generated code

Non-Functional

99.99% availability — broken links are catastrophic
Redirects under 100ms p99 globally
Short codes must be globally unique — zero collision tolerance
Read-heavy: 1000:1 read-to-write ratio
Eventual consistency acceptable — 50ms replication lag is fine

Key Clarifying Questions

Before designing, always ask: Do links expire or live forever? Do you need click analytics? Are custom aliases required? Expected scale — 1M DAU or 100M DAU? These four questions change the architecture significantly.

03

Scale Estimation

Metric	Calculation	Result
Daily Active Users	bit.ly scale assumption	100M
Daily writes (new links)	100M × 1% create links	1M / day
Write RPS (average)	1,000,000 ÷ 86,400	~12 / sec
Write RPS (peak)	12 × 5× peak factor	~60 / sec
Daily reads (clicks)	100M × 10 clicks/user	1B / day
Read RPS (average)	1,000,000,000 ÷ 86,400	~11,600 / sec
Read RPS (peak)	viral moments	~100,000 / sec
Read : Write ratio	11,600 ÷ 12	~1,000 : 1
Storage (5 years)	1M/day × 365 × 5 × 500B	~1 TB
Short code space (7 chars)	62^7	3.5 trillion
Runway at current write rate	3.5T ÷ 1M/day	~9,500 years
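
The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch of the same back-of-envelope estimate; the variable names are illustrative, and every input (100M DAU, 1% creators, 10 clicks per user, 500 bytes per record, 5× peak factor) is the assumption already stated in the table.

```python
SECONDS_PER_DAY = 86_400

dau = 100_000_000                  # assumed daily active users
daily_writes = dau // 100          # 1% of users create a link each day
daily_reads = dau * 10             # 10 clicks per user per day

write_rps = daily_writes / SECONDS_PER_DAY
read_rps = daily_reads / SECONDS_PER_DAY
peak_write_rps = write_rps * 5     # 5x peak factor

bytes_per_record = 500             # assumed average row size
storage_5y_bytes = daily_writes * 365 * 5 * bytes_per_record

code_space = 62 ** 7               # 7-character Base62 codes
runway_years = code_space / daily_writes / 365

print(f"write RPS (avg)  ~{write_rps:.0f}")
print(f"write RPS (peak) ~{peak_write_rps:.0f}")
print(f"read RPS (avg)   ~{read_rps:,.0f}")
print(f"read:write       ~{read_rps / write_rps:,.0f}:1")
print(f"storage (5y)     ~{storage_5y_bytes / 1e12:.2f} TB")
print(f"code space       {code_space:,}")
print(f"runway           ~{runway_years:,.0f} years")
```

Without rounding 62^7 down to 3.5T, the runway comes out slightly above the table's rounded figure; the conclusion is unchanged.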

The Number That Defines Everything

The 1,000:1 read-to-write ratio is the single most important number in this system. It tells us writes are trivially easy (PostgreSQL handles 60/sec in its sleep) and reads are the entire engineering challenge. Every architectural decision — Redis, CDN, LFU eviction, hotness tiers — exists because of this ratio.

Storage Insight

Storage is not a bottleneck. 1TB over 5 years fits on a single modern SSD. This tells you the problem is throughput, not capacity. You are not building a data warehouse — you are building a high-speed lookup service.

04

API Design

POST /v1/urls Create a new short URL

// Request
{
  "long_url": "https://amazon.com/dp/B09G9FPHY6/ref=pd_ci_mcx...",
  "custom_alias": "my-launch",            // optional
  "expires_at": "2025-12-31T00:00:00Z"    // optional
}

// Response 201
{
  "short_code": "x9kZ3mP",
  "short_url": "https://bit.ly/x9kZ3mP",
  "created_at": "2025-01-15T10:00:00Z"
}

GET /:code Redirect to original URL

// Response 302 Found (NOT 301: analytics requires every click to be tracked)
Location: https://amazon.com/dp/B09G9FPHY6/ref=pd_ci_mcx...

// Response 404 if code not found or expired
{ "error": "link_not_found" }

GET /v1/urls/:code/analytics Get click analytics for a link

// Response 200
{
  "short_code": "x9kZ3mP",
  "total_clicks": 47293,
  "clicks_last_24h": 3847,
  "top_countries": ["US", "GB", "IN"],
  "top_devices": ["mobile", "desktop"]
}

DELETE /v1/urls/:code Delete a short URL

Returns 204 No Content. Soft-delete — record purged after 30 days. Must also DEL from Redis and purge CDN entry synchronously.

Critical: 302 not 301

Always use 302 Found (a temporary redirect), never 301 Moved Permanently. A 301 is cacheable by default, so browsers typically store the redirect and stop contacting your servers on later clicks. You lose analytics for every repeat visit. Since analytics is the commercial value of a URL shortener, using 301 destroys your product while appearing to optimise it.
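
As an illustration of the redirect contract, here is a minimal sketch of the lookup behind GET /:code. The `LINKS` dictionary and the `resolve` function are hypothetical stand-ins for the real storage and handler; only the 302/404 behaviour and the `link_not_found` body come from the API description above.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical in-memory stand-in for the links table.
LINKS = {
    "x9kZ3mP": {"long_url": "https://example.com/landing", "expires_at": None},
    "old1234": {
        "long_url": "https://example.com/old",
        "expires_at": datetime(2020, 1, 1, tzinfo=timezone.utc),
    },
}


def resolve(code: str, now: Optional[datetime] = None):
    """Return (status, headers, body) for GET /:code."""
    now = now or datetime.now(timezone.utc)
    link = LINKS.get(code)
    expired = (
        link is not None
        and link["expires_at"] is not None
        and link["expires_at"] <= now
    )
    if link is None or expired:
        # Unknown and expired codes are indistinguishable to the caller.
        return 404, {"Content-Type": "application/json"}, '{"error": "link_not_found"}'
    # 302 rather than 301, so repeat clicks keep reaching the server.
    return 302, {"Location": link["long_url"]}, ""
```

A real handler would also record the click before returning; that step is left out to keep the sketch focused on the status codes.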

05

High-Level Architecture

[Figure: full system architecture diagram]

CDN Edge

Tier 2 and Tier 3 viral links live here. A user in Tokyo clicking a viral link never touches your origin servers — the CDN edge node responds in under 5ms. The hotness monitor pushes links to CDN when they cross 1,000 clicks/minute.

API Servers

Stateless compute — any instance handles any request. This enables horizontal scaling. Session state lives in Redis, not in memory. Each server claims keys from the pool, never generates them on the fly.

Redis Cache

LFU eviction protects historically popular links. TTL jitter (±1hr) prevents expiry cliffs. Sliding window counters track clicks/minute per link. A cache hit costs <1ms vs ~5ms for a DB read.
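
TTL jitter is simple to sketch. The snippet below assumes a 24-hour base TTL (the TTL=86400 used elsewhere in this design) and the ±1 hour jitter mentioned above; the function name `jittered_ttl` is illustrative.

```python
import random

BASE_TTL_SECONDS = 86_400   # 24 hours
JITTER_SECONDS = 3_600      # +/- 1 hour


def jittered_ttl(rng: random.Random) -> int:
    """Base TTL shifted by a uniform random offset so keys do not all expire together."""
    return BASE_TTL_SECONDS + rng.randint(-JITTER_SECONDS, JITTER_SECONDS)


rng = random.Random()
print(jittered_ttl(rng))  # somewhere between 82800 and 90000
```

Keys written in the same second then expire spread over a two-hour window instead of at a single instant.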

Key Pool DB

Pre-generated 7-char Base62 codes stored in a keys_available table. FOR UPDATE SKIP LOCKED prevents race conditions across servers. The background generator maintains 10M+ codes available at all times.
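
One way the background generator could work is sketched below. It uses in-memory sets in place of the keys_available and keys_used tables, so `top_up`, `pool`, and `used` are hypothetical names; a real generator would insert into the database and rely on a unique constraint rather than a set lookup.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits  # Base62
CODE_LENGTH = 7


def random_code() -> str:
    """One random 7-character Base62 code."""
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))


def top_up(pool: set, used: set, target_size: int) -> int:
    """Add fresh codes until the pool holds target_size; return how many were added."""
    added = 0
    while len(pool) < target_size:
        code = random_code()
        if code in pool or code in used:
            continue  # uniqueness is resolved here, offline, not on the write path
        pool.add(code)
        added += 1
    return added
```

Because duplicates are rejected while filling the pool, the write path never needs its own collision check.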

06

Deep Dive — Key Generation & Hotness Tiers

Two Core Concepts

This problem has two technically interesting cores: how you generate unique short codes safely across distributed servers, and how you detect and serve viral links without touching your database. Everything else is standard web infrastructure.

Sequence — Cache Hit vs Cache Miss Mermaid.js

sequenceDiagram
    participant C as Client
    participant CDN as CDN Edge
    participant API as API Server
    participant R as Redis
    participant DB as PostgreSQL
    C->>CDN: GET /x9kZ3mP
    alt CDN hit (Tier 2/3)
        CDN-->>C: 302 redirect (< 5ms globally)
    else CDN miss
        CDN->>API: Forward request
        API->>R: GET x9kZ3mP
        alt Redis hit
            R-->>API: long_url (< 1ms)
            API-->>C: 302 redirect (~10ms total)
        else Redis miss
            API->>DB: SELECT long_url WHERE short_code = 'x9kZ3mP'
            DB-->>API: long_url (~5ms)
            API->>R: SET x9kZ3mP TTL=86400 (async)
            API-->>C: 302 redirect (~20ms total)
        end
    end

Key Generation — The Pre-Generated Pool

Three approaches exist for generating short codes. Random on-demand requires a DB read on every write to check for collisions. MD5 hashing is deterministic (deduplication for free) but needs salt-and-rehash on collision. The pre-generated key pool wins because all uniqueness checking happens offline in the background — the write path just claims a key atomically.
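
For comparison, the MD5 option with salt-and-rehash might look like the sketch below. The salt format (`url#n`), the modulo reduction to 7 Base62 characters, and the dictionary standing in for the database are all assumptions made for illustration.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits  # Base62
CODE_LENGTH = 7


def base62(n: int) -> str:
    """Encode a non-negative integer in Base62."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, r = divmod(n, 62)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))


def md5_code(long_url: str, salt: int = 0) -> str:
    """Deterministic 7-character code derived from the URL (plus a salt on retry)."""
    payload = long_url if salt == 0 else f"{long_url}#{salt}"
    digest = hashlib.md5(payload.encode("utf-8")).digest()
    n = int.from_bytes(digest, "big") % (62 ** CODE_LENGTH)
    return base62(n).rjust(CODE_LENGTH, ALPHABET[0])


def shorten(long_url: str, table: dict) -> str:
    """Insert long_url into table (code -> url), rehashing with a salt on collision."""
    salt = 0
    while True:
        code = md5_code(long_url, salt)
        existing = table.get(code)
        if existing is None:
            table[code] = long_url
            return code
        if existing == long_url:
            return code  # same URL seen before: deduplication for free
        salt += 1        # a different URL owns this code: salt and rehash
```

The lookup inside `shorten` is exactly the write-path database read that the pre-generated pool avoids.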

The critical SQL clause is FOR UPDATE SKIP LOCKED. When 10 API servers simultaneously claim keys, each one locks a row and any server that finds its target already locked simply skips to the next available key. No race condition. No collision. No coordination overhead. The entire operation is a single atomic transaction.

-- Atomic key claim, safe across all API servers simultaneously
BEGIN;

-- Lock one free key; rows already locked by other servers are skipped
SELECT short_code
FROM keys_available
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- 'x9kZ3mP' stands for the short_code returned by the SELECT above
DELETE FROM keys_available
WHERE short_code = 'x9kZ3mP';

INSERT INTO keys_used (short_code, long_url, user_id)
VALUES ('x9kZ3mP', 'https://amazon.com/...', 12345);

COMMIT;

Hotness Monitor — Three-Tier Traffic Detection

Every click increments a Redis counter with minute-level granularity: INCR clicks:x9kZ3mP:2024-01-15-14:37. Every 60 seconds the hotness monitor sums the last 5 minute-buckets and divides by 5 to get a stable clicks/minute figure for each active link. Links are then sorted into three tiers based on traffic thresholds.

Tier 1 (100–999/min): Redis cache with TTL refreshed on access.
Tier 2 (1,000–9,999/min): Redis plus CDN edge globally, TTL renewed every 60s by the monitor.
Tier 3 (10,000+/min): pre-computed static redirect served at CDN with no application logic involved at all.

Demotion is passive. When traffic drops, the monitor stops renewing the CDN TTL, and the entry expires naturally within 5 minutes.
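
The bucket arithmetic and tier thresholds can be sketched as follows. A plain dictionary stands in for the Redis counters, and the function names are illustrative. Labelling links under 100 clicks/min as tier 0 is a convention of this sketch, not something the design names.

```python
from datetime import datetime, timedelta

WINDOW_MINUTES = 5


def bucket_key(code: str, minute: datetime) -> str:
    """Counter key for one link and one minute, e.g. clicks:x9kZ3mP:2024-01-15-14:37."""
    return f"clicks:{code}:{minute:%Y-%m-%d-%H:%M}"


def clicks_per_minute(counters: dict, code: str, now: datetime) -> float:
    """Average of the last WINDOW_MINUTES minute buckets, including the current one."""
    minute = now.replace(second=0, microsecond=0)
    total = sum(
        counters.get(bucket_key(code, minute - timedelta(minutes=i)), 0)
        for i in range(WINDOW_MINUTES)
    )
    return total / WINDOW_MINUTES


def tier_for(rate: float) -> int:
    """Map a clicks/minute rate to a hotness tier (0 means below Tier 1)."""
    if rate >= 10_000:
        return 3
    if rate >= 1_000:
        return 2
    if rate >= 100:
        return 1
    return 0
```

Averaging over five buckets smooths out a single noisy minute, at the cost of reacting a little more slowly to a sudden spike.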

Why 7 Characters? The Math.

7 characters from Base62 (a–z, A–Z, 0–9) gives 62^7 ≈ 3.5 trillion unique combinations. At 1 million new links per day, that's about 9,500 years of runway. 6 characters would give about 56.8 billion combinations, roughly 155 years at the same rate: enough mathematically, but with little margin if write volume grows 100×. 8 characters is unnecessary. 7 is the sweet spot derived from requirements, not guesswork.

The avalanche effect of a good hash function means outputs are spread uniformly across the entire 3.5 trillion slot space, whichever encoding (here Base62) is applied afterwards. Even after 2 billion stored URLs, you're using only about 0.057% of available space, so a hashed or uniformly random new code collides with an existing one with that same probability: roughly 0.057% per insert.
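
The numbers in this section are easy to check. The sketch below assumes codes are drawn uniformly at random, in which case the chance that a new code lands on an occupied slot equals the occupied fraction of the space.

```python
code_space_6 = 62 ** 6
code_space_7 = 62 ** 7

writes_per_day = 1_000_000
stored_urls = 2_000_000_000

years_6 = code_space_6 / writes_per_day / 365
years_7 = code_space_7 / writes_per_day / 365

# For a uniformly random code, P(collision) = occupied slots / total slots.
occupied_fraction = stored_urls / code_space_7

print(f"62^6 = {code_space_6:,} (~{years_6:.0f} years of runway)")
print(f"62^7 = {code_space_7:,} (~{years_7:,.0f} years of runway)")
print(f"occupied after 2B URLs: {occupied_fraction:.3%}")
```

With the pre-generated pool this probability only affects the offline generator, never the write path.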

Request Flow — Step Through

Client · Browser → CDN Edge · Cloudflare → API Server · Stateless → Redis · Cache <1ms → PostgreSQL · Source of truth


07

Key Design Decisions & Tradeoffs

Option A — Chosen

Pre-Generated Key Pool

All uniqueness work happens offline. Write path just claims a key atomically with FOR UPDATE SKIP LOCKED. Zero collision possible. No DB reads on write path. Background generator easily keeps pace with 60 writes/sec.

✓ Production answer — zero write-path complexity

Options B & C

Random / MD5 On-Demand

Random requires a DB existence check on every write (extra round trip). MD5 hashing gives free deduplication but requires salt-and-rehash on collision. Both work at small scale but add reactive complexity vs the pool's proactive approach.

~ Acceptable at low scale

Option A — Chosen

302 Found (Temporary Redirect)

Every click hits your servers. Full analytics on every request — count, device, country, referrer. The short URL can be updated or expired at any time. Analytics is the product — this is the only viable choice.

✓ Required for analytics-driven business model

Option B

301 Permanent Redirect

Browser caches redirect forever after first click. Near-zero server load for returning users. But you lose all subsequent click data. You cannot update the destination. You cannot expire the link. Destroys your product while appearing to optimise it.

✗ Kills analytics — only for internal tools

Option A — Chosen

LFU Cache Eviction

Evicts by total access frequency. A link with 10M historical clicks that went quiet 2 hours ago stays protected. URL access follows a power law — historically popular links predict future clicks better than recent ones.

✓ Right for power-law access patterns

Option B

LRU Cache Eviction

Evicts by recency. The most commonly configured Redis eviction policy (allkeys-lru), although Redis ships with noeviction as its default. Works well for most systems but can evict a viral link during a brief quiet period, causing a thundering herd on the next traffic spike. Suboptimal for URL shorteners specifically.

~ Fine for generic caching, suboptimal here
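
The difference between the two policies shows up in a toy simulation. This sketch uses exact access counts; Redis's actual LFU is an approximation with a decaying probabilistic counter, so treat this as an illustration of the idea rather than of Redis internals. The `simulate` function and the key names are made up for the example.

```python
from collections import Counter, OrderedDict


def simulate(policy: str, capacity: int, accesses: list) -> set:
    """Replay accesses through a tiny cache and return the keys left in it."""
    cache = OrderedDict()   # keys ordered from least to most recently used
    freq = Counter()        # access count per cached key
    for key in accesses:
        if key in cache:
            cache.move_to_end(key)
            freq[key] += 1
            continue
        if len(cache) >= capacity:
            if policy == "lru":
                victim = next(iter(cache))                  # least recently used
            else:
                victim = min(cache, key=lambda k: freq[k])  # least frequently used
            del cache[victim]
            del freq[victim]
        cache[key] = None
        freq[key] = 1
    return set(cache)


# A historically hot link goes quiet while three one-off links are clicked.
accesses = ["hot"] * 1000 + ["a", "b", "c"]
print(sorted(simulate("lru", 3, accesses)))
print(sorted(simulate("lfu", 3, accesses)))
```

Under LRU the hot link is the oldest entry when the cache fills, so it is the one evicted; under LFU its access count protects it and a one-off link is dropped instead.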

Option A — Chosen

PostgreSQL

ACID guarantees, FOR UPDATE SKIP LOCKED for key pool, complex queries, familiar tooling. Handles the 60 writes/sec peak using about 1.2% of capacity (assuming a ceiling of roughly 5,000 writes/sec for a single primary). Read replicas scale reads to 100x. Simple schema, proven at this scale.

✓ Start here always

Option B

Cassandra / DynamoDB

Massive write throughput, linear horizontal scaling, no single point of failure. But you give up joins and default strong consistency, and transactional guarantees are more limited than PostgreSQL's. Adds significant operational complexity for a problem that doesn't need it. PostgreSQL handles our write load in its sleep.

~ At 1000× scale only

08

What Can Go Wrong

🔥

Thundering Herd

A viral cached link expires. Thousands of requests simultaneously miss the cache and all query the database at once. DB CPU spikes, latency climbs, cascading failure possible.

→ Fix: Mutex lock on cache miss — only one request queries DB, rest wait for cache population
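
Within a single process, the mutex-on-miss idea can be sketched with one lock per key. The class name `SingleFlightCache` and the `load_from_db` callback are hypothetical. Across many API servers the same pattern would need a shared lock (for example a Redis key set with the NX option), which this sketch does not cover.

```python
import threading
import time


class SingleFlightCache:
    """On a miss, only one thread per key loads from the DB; the rest wait for it."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the _locks dictionary

    def get(self, code: str) -> str:
        value = self._cache.get(code)
        if value is not None:
            return value
        with self._guard:
            lock = self._locks.setdefault(code, threading.Lock())
        with lock:
            # Re-check: another thread may have filled the cache while we waited.
            value = self._cache.get(code)
            if value is None:
                value = self._load(code)
                self._cache[code] = value
        return value


db_calls = []


def slow_db_lookup(code: str) -> str:
    db_calls.append(code)
    time.sleep(0.05)  # simulate query latency
    return "https://example.com/" + code


cache = SingleFlightCache(slow_db_lookup)
threads = [threading.Thread(target=cache.get, args=("x9kZ3mP",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(db_calls))  # number of DB queries issued for 50 concurrent misses
```

For simplicity the per-key locks are never removed; a longer-lived version would clean them up once the key is cached.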

⚡

New Viral Link Stampede

Celebrity posts a brand new link. It has never been cached. 100,000 simultaneous first-clicks all miss cache at the same moment before the hotness monitor can promote it.

→ Fix: Hotness monitor promotes to CDN within 60s. Redis absorbs the burst in the interim — it handles 100K ops/sec comfortably.

💾

Primary Database Failure

Primary DB goes down. All writes fail — new links cannot be created. Reads survive on replicas but may serve slightly stale data during the failover window.

→ Fix: AWS RDS Multi-AZ automated failover. Standby is typically promoted within one to two minutes. Reads unaffected, because replicas still serve. CDN protects hot links entirely.

🌊

Redis Failure

Cache goes dark. 100% of reads fall to the database immediately. At 11,600 reads/second, a single PostgreSQL instance is overwhelmed within seconds — not minutes.

→ Fix: Redis Sentinel for automatic failover (<60s). CDN provides partial safety net — Tier 2/3 links continue serving from edge nodes unaffected.

🗝️

Key Pool Runs Dry

Background key generator crashes silently. Pool drains at 60 writes/sec. When pool hits zero, write requests fail — users cannot shorten new URLs.

→ Fix: Alert at <1M keys remaining (about 4.6 hours of runway at the 60 writes/sec peak; the full 10M pool lasts about 46 hours). Fallback to random generation with collision check. Generator restart resolves within minutes.
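
The runway behind an alert threshold is a one-line calculation. The helper names below are illustrative; the 1M threshold, the 10M pool size, and the 60 writes/sec peak are the figures used in this design.

```python
ALERT_THRESHOLD = 1_000_000  # keys remaining


def runway_hours(keys_remaining: int, writes_per_sec: float) -> float:
    """Hours until the pool is empty if the generator has stopped."""
    return keys_remaining / writes_per_sec / 3600


def should_alert(keys_remaining: int) -> bool:
    return keys_remaining < ALERT_THRESHOLD


print(f"{runway_hours(1_000_000, 60):.1f} h at the alert threshold")
print(f"{runway_hours(10_000_000, 60):.1f} h with a full pool")
```

At the average write rate of about 12/sec the same 1M keys last roughly a day, so the peak figure is the conservative one to plan around.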

🔄

Stale Cache After URL Update

User updates their short link destination. DB is updated immediately but Redis and CDN still serve the old URL. Users get redirected to the wrong place — a correctness bug, not just a performance issue.

→ Fix: On every URL update, explicitly DEL from Redis and call CDN purge API synchronously before returning success to the user.
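
The ordering of the invalidation steps matters, and can be sketched as below. Dictionaries stand in for PostgreSQL and Redis, and `purge_cdn` is a hypothetical callback wrapping whatever purge API the CDN provides.

```python
from typing import Callable, Dict


def update_destination(
    code: str,
    new_url: str,
    db: Dict[str, str],
    cache: Dict[str, str],
    purge_cdn: Callable[[str], None],
) -> None:
    """Update the source of truth, then invalidate every cache layer before returning."""
    db[code] = new_url        # 1. write to the database first
    cache.pop(code, None)     # 2. equivalent of Redis DEL; safe if the key is absent
    purge_cdn(code)           # 3. blocking CDN purge; an exception here fails the request


db = {"x9kZ3mP": "https://old.example"}
cache = {"x9kZ3mP": "https://old.example"}
purged = []
update_destination("x9kZ3mP", "https://new.example", db, cache, purged.append)
print(db, cache, purged)
```

Writing the database first means a cache refill that races with the update can only pick up the new value, never resurrect the old one after the purge.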

🌍

Full Region Outage

Entire datacenter or cloud region goes dark. All links hosted in that region become inaccessible. Every QR code, marketing email, and printed poster pointing to those links breaks simultaneously.

→ Fix: Multi-region active-active deployment. DNS failover routes to healthy region in <60s. CDN edge nodes are globally distributed — hot links unaffected by origin region failures.

09

Interview Tips

01

Clarify before drawing anything. Ask four questions: Do links expire? Do you need analytics? Custom aliases? Expected scale? These change the architecture significantly. Interviewers reward candidates who treat requirements as inputs, not assumptions.

02

Lead with the read/write ratio. Say it explicitly: "This system is 1,000:1 read-heavy — that single number defines the architecture." Everything that follows — Redis, CDN, LFU — should trace back to this ratio. Interviewers want to see you derive architecture from data.

03

Avoid the NoSQL trap. Most candidates jump to Cassandra or DynamoDB to "handle scale." This signals poor judgment. PostgreSQL handles 60 writes/sec using 1.2% of capacity. Say: "I'd start with PostgreSQL and migrate to NoSQL only if I hit its write ceiling" — which won't happen at this scale.

04

Name all three key generation options, then justify the pool. Random → MD5 hash → pre-generated pool. The magic phrase: "FOR UPDATE SKIP LOCKED prevents race conditions atomically across distributed servers." Knowing this SQL clause signals genuine production database experience.

05

The 301 vs 302 question always comes up. Answer: "301 destroys analytics — the entire commercial value of a URL shortener is click tracking, so 302 is the only viable choice." Candidates who say 301 don't understand the business model.

06

The hotness monitor is your differentiator. Most candidates stop at "put Redis in front of the DB." Describing a sliding window counter that detects tier transitions and pushes viral links to CDN edge separates good answers from great ones. It shows you think in systems, not just components.

07

Name your failure modes before being asked. Say proactively: "The thundering herd is the main risk — a mutex lock on cache miss means only one request queries the DB while others wait." Candidates who raise problems before being asked look senior. Those who wait to be asked look reactive.

08

End with the evolution story. "I'd start with a monolith — one server, one PostgreSQL, no Redis. Ship fast." Then walk through: add Redis → add read replicas → add CDN tiering → shard DB → multi-region. Each step triggered by a specific metric. This shows engineering maturity — solutions proportional to problems.

10

How the Design Evolves

Principle

Each phase is triggered by a specific metric crossing a threshold — not by time, not by team size, not by intuition. Premature optimisation is the enemy. Every phase adds complexity that must be justified by a real problem.

Phase 1 — 0 to 100K users

Single Server Monolith

One server, one PostgreSQL database, no cache, no queue. Random code generation with collision check is fine at this scale. Ship fast. This handles ~500 reads/sec easily. Most startups live here for years. The correct answer for MVP is boring infrastructure.

Phase 2 — 100K to 10M users

Add Redis + Key Pool

DB read latency starts climbing. Add Redis cache with LFU eviction in front of PostgreSQL. Migrate to the pre-generated key pool to eliminate write-path collision checks. Add a read replica. Most products live here their entire lifecycle. This configuration handles ~50K reads/sec.

Phase 3 — 10M to 100M users

Add CDN + Hotness Monitor

Cache hit rate is high but viral spikes still stress the origin. Add CDN edge caching for hot links, driven by the hotness monitor. Add more read replicas. API servers scale horizontally behind the load balancer. Redis cluster for cache HA. This is the architecture described in this document — the "full" design.

Phase 4 — 100M+ users

Database Sharding + Multi-Region

The trigger here is not write throughput, which is still only about 60 writes/sec, but data volume and index size, which make a single primary operationally hard to manage at this scale. Shard by short_code hash range across multiple primaries. Deploy full active-active stack in multiple regions. Global load balancing with GeoDNS. Most engineers never operate here, but knowing it demonstrates architectural depth.
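
Routing by hash range might look like the sketch below. The choice of SHA-256, the 32-bit hash space, and equal-width ranges are assumptions for illustration; the design only specifies sharding by short_code hash range.

```python
import hashlib
from bisect import bisect_right

HASH_SPACE = 2 ** 32


def code_hash(short_code: str) -> int:
    """Stable 32-bit hash of a short code (the same on every server and restart)."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def make_ranges(shard_count: int) -> list:
    """Exclusive upper bounds of equal-width hash ranges, one per shard."""
    return [HASH_SPACE * (i + 1) // shard_count for i in range(shard_count)]


def shard_for(short_code: str, upper_bounds: list) -> int:
    """Index of the shard whose hash range contains this code."""
    return bisect_right(upper_bounds, code_hash(short_code))


bounds = make_ranges(4)
print(shard_for("x9kZ3mP", bounds))
```

Keeping explicit range boundaries, rather than taking the hash modulo the shard count, allows one range to be split later without remapping every other key.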

Phase 5 — Planet Scale

Custom Infrastructure

At true planet scale (think bit.ly or t.co), consider migrating the hot path entirely to a NoSQL store like Cassandra for the short_code → long_url lookup table. Custom hardware at CDN PoPs. Dedicated analytics pipeline (Kafka → Flink → data warehouse). The read path becomes a pure key-value lookup with no application logic — raw bytes from edge nodes in under 2ms globally.

