06
Deep Dive — Preventing Double-Selling Under Extreme Concurrency
This is THE interesting problem in this design. When 10,000 users click on the same seat within a second, exactly one must win and 9,999 must be told "seat taken" — instantly, with no race conditions.
The Naive Approach (and Why It Fails)
-- Thread 1 and Thread 2 both run this simultaneously
SELECT status FROM seats WHERE seat_id = 'A1' AND event_id = 'E1';
-- Both see: AVAILABLE ← race condition window
UPDATE seats SET status = 'HELD', user_id = 'U1' WHERE seat_id = 'A1' AND event_id = 'E1';
-- Thread 1 wins
UPDATE seats SET status = 'HELD', user_id = 'U2' WHERE seat_id = 'A1' AND event_id = 'E1';
-- Thread 2 ALSO succeeds → DOUBLE SOLD
Two reads happen before either write. Both see "available" and both proceed. This is a classic read-then-write race condition.
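The interleaving can be reproduced deterministically. The sketch below uses an in-memory SQLite table as a stand-in for PostgreSQL (an assumption made only so the example runs anywhere); the helper names `make_db`, `naive_hold`, and `guarded_hold` are hypothetical. It contrasts the unconditional UPDATE with one that repeats the availability check in its WHERE clause and inspects the affected-row count.

```python
import sqlite3

WHERE_SEAT = "WHERE seat_id = 'A1' AND event_id = 'E1'"

def make_db() -> sqlite3.Connection:
    """Create an in-memory table holding one available seat."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seats (seat_id TEXT, event_id TEXT, status TEXT, user_id TEXT)"
    )
    conn.execute("INSERT INTO seats VALUES ('A1', 'E1', 'AVAILABLE', NULL)")
    conn.commit()
    return conn

def read_status(conn: sqlite3.Connection) -> str:
    return conn.execute(f"SELECT status FROM seats {WHERE_SEAT}").fetchone()[0]

def owner(conn: sqlite3.Connection) -> str:
    return conn.execute(f"SELECT user_id FROM seats {WHERE_SEAT}").fetchone()[0]

def naive_hold(conn: sqlite3.Connection, user_id: str) -> int:
    """Unconditional write that trusts an earlier read."""
    cur = conn.execute(
        f"UPDATE seats SET status = 'HELD', user_id = ? {WHERE_SEAT}", (user_id,)
    )
    affected = cur.rowcount
    conn.commit()
    return affected

def guarded_hold(conn: sqlite3.Connection, user_id: str) -> int:
    """Conditional write: the availability check is part of the UPDATE itself."""
    cur = conn.execute(
        f"UPDATE seats SET status = 'HELD', user_id = ? {WHERE_SEAT} "
        "AND status = 'AVAILABLE'",
        (user_id,),
    )
    affected = cur.rowcount
    conn.commit()
    return affected

# Naive: both "threads" read first, then both write.
conn = make_db()
seen = [read_status(conn), read_status(conn)]
naive_results = [naive_hold(conn, "U1"), naive_hold(conn, "U2")]
naive_owner = owner(conn)

# Guarded: the second write matches zero rows and is rejected.
conn = make_db()
guarded_results = [guarded_hold(conn, "U1"), guarded_hold(conn, "U2")]
guarded_owner = owner(conn)
```

In the naive run both writes report one affected row and the second silently overwrites the first. In the guarded run the second write reports zero rows, which is the signal the service turns into "seat taken."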
The Two-Layer Hold Pattern
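The solution uses Redis as a fast gate and PostgreSQL as the authority. Redis rejects nearly all contending requests without ever touching the database: of 10,000 simultaneous attempts on one seat, only the single SET NX winner proceeds to PostgreSQL.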
sequenceDiagram
participant U as User
participant IS as Inventory Service
participant R as Redis
participant PG as PostgreSQL
U->>IS: POST /hold (seat A1)
IS->>R: SET seat:E1:A1 user_123 NX EX 480
alt Key already exists
R-->>IS: nil (FAIL)
IS-->>U: 409 Seat Taken
else Key set successfully
R-->>IS: OK
IS->>PG: UPDATE seats SET status='HELD' WHERE status='AVAILABLE' AND version=N
alt rows_affected = 1
PG-->>IS: 1 row updated
IS-->>U: 200 Hold Confirmed (8 min)
else rows_affected = 0
PG-->>IS: 0 rows
IS->>R: DEL seat:E1:A1
IS-->>U: 409 Seat Taken
end
end
Why SET NX Instead of Redlock?
SET NX — Simple Claim
Single atomic command. 1 network round-trip. The NX flag means "only set if not exists." The EX 480 gives an 8-minute TTL. We're claiming, not locking — the hold itself is the state.
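To make the claim semantics concrete, here is a toy in-memory model of the two SET options used above. `TinyStore` is a hypothetical stand-in, not Redis and not a Redis client; it mirrors only the behavior described here (NX refuses to overwrite a live key, EX expires it) and uses a manual clock so the TTL can be exercised without waiting.

```python
from typing import Dict, Optional, Tuple

class TinyStore:
    """Toy model of SET with NX and EX; illustrative only."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)
        self.now = 0.0  # manual clock, in seconds

    def set(self, key: str, value: str, nx: bool = False,
            ex: Optional[int] = None) -> Optional[bool]:
        current = self._data.get(key)
        if current is not None and current[1] <= self.now:
            # The key has outlived its TTL, so treat it as absent.
            del self._data[key]
            current = None
        if nx and current is not None:
            return None  # key exists: the claim is refused
        expires = self.now + ex if ex is not None else float("inf")
        self._data[key] = (value, expires)
        return True

store = TinyStore()
first = store.set("seat:E1:A1", "user_123", nx=True, ex=480)   # claim succeeds
second = store.set("seat:E1:A1", "user_456", nx=True, ex=480)  # refused
store.now = 481                                                # TTL has elapsed
third = store.set("seat:E1:A1", "user_789", nx=True, ex=480)   # claim succeeds
```

The value stored under the key is the holder's user ID, so the key doubles as the record of who holds the seat, which is why no separate unlock step is needed.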
Redlock — Overkill
Typically deployed across 5 independent Redis masters, with the lock acquired on a majority of them, and it relies on assumptions about bounded clock drift. Designed for mutual exclusion (lock → work → unlock), but we don't have a critical section. Known issues with GC pauses and clock drift (see Kleppmann's critique).
What if Redis Succeeds but DB Fails?
The Redis SET NX succeeds, but the PostgreSQL UPDATE fails (timeout, crash, disk full). Now Redis thinks the seat is held, but the DB thinks it's available — split-brain state.
async def hold_seat(event_id, seat_id, user_id):
    redis_key = f"seat:{event_id}:{seat_id}"
    # Phase 1: Fast gate
    acquired = await redis.set(redis_key, user_id, nx=True, ex=480)
    if not acquired:
        return HoldResult.SEAT_TAKEN
    # Phase 2: Authoritative write
    try:
        rows = await db.execute("""
            UPDATE seats SET status = 'HELD', user_id = %s,
                held_until = NOW() + INTERVAL '8 min',
                version = version + 1
            WHERE seat_id = %s AND event_id = %s
              AND status = 'AVAILABLE'
        """, [user_id, seat_id, event_id])
        if rows == 0:
            await redis.delete(redis_key)  # Clean up
            return HoldResult.SEAT_TAKEN
        return HoldResult.SUCCESS
    except Exception:
        await redis.delete(redis_key)  # Roll back Redis
        return HoldResult.RETRY
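One refinement worth considering: both cleanup paths above delete the key unconditionally. If the database call ever stalled past the TTL, the key could by then belong to a different user, and the delete would erase their hold. A common guard is a compare-and-delete executed atomically as a Lua script. The sketch below is an assumption layered on the original design, not part of it; `RELEASE_IF_OWNER`, `release_hold`, and `FakeClient` are hypothetical names, and `FakeClient` only imitates the script's effect so the example runs without a Redis server. With redis-py, `client.eval(script, numkeys, *keys_and_args)` would send the script to the server.

```python
RELEASE_IF_OWNER = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""

def release_hold(client, key: str, user_id: str) -> bool:
    """Delete the hold key only if it still belongs to user_id."""
    return client.eval(RELEASE_IF_OWNER, 1, key, user_id) == 1

class FakeClient:
    """In-memory stand-in that imitates the script's effect (demo only)."""

    def __init__(self) -> None:
        self.data = {}

    def eval(self, script: str, numkeys: int, key: str, owner: str) -> int:
        # Same decision the Lua script makes: delete only on an owner match.
        if self.data.get(key) == owner:
            del self.data[key]
            return 1
        return 0

client = FakeClient()
client.data["seat:E1:A1"] = "user_456"  # the key now belongs to someone else

stale_release = release_hold(client, "seat:E1:A1", "user_123")
still_held = "seat:E1:A1" in client.data
owner_release = release_hold(client, "seat:E1:A1", "user_456")
```

The stale caller's release is a no-op, while the real owner's release removes the key.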
Defense in Depth — 4 Safety Layers
The Hierarchy of Truth
Layer 1 — PostgreSQL is the source of truth (survives restarts, is ACID).
Layer 2 — Redis is the performance optimization (fast filter, may be stale).
Layer 3 — TTL is the self-healing mechanism (bounds duration of any inconsistency to 8 min).
Layer 4 — Reconciliation job is the safety net (scans every 30s, catches anything TTL hasn't fixed).
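A minimal sketch of what one reconciliation pass might do, treating the database as the authority. Everything here is an assumption for illustration: the `Seat` dataclass, the `reconcile` function, the action labels, and the use of plain dictionaries in place of PostgreSQL and Redis. A production job would also need a grace period so that it does not drop a key whose database write is still in flight.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Seat:
    status: str                     # 'AVAILABLE', 'HELD', or 'SOLD'
    user_id: Optional[str] = None
    held_until: float = 0.0         # epoch seconds; meaningful only when HELD

def reconcile(db: Dict[str, Seat], redis_holds: Dict[str, str],
              now: float) -> List[Tuple[str, str]]:
    """Repair both stores in place and return the actions taken."""
    actions: List[Tuple[str, str]] = []
    # 1. Holds whose deadline has passed go back on sale.
    for key, seat in db.items():
        if seat.status == "HELD" and seat.held_until <= now:
            seat.status, seat.user_id = "AVAILABLE", None
            redis_holds.pop(key, None)
            actions.append(("release_expired", key))
    # 2. Redis keys with no matching hold in the database are orphans.
    #    (No grace period here; see the caveat in the text above.)
    for key in list(redis_holds):
        seat = db.get(key)
        if seat is None or seat.status == "AVAILABLE":
            del redis_holds[key]
            actions.append(("drop_orphan_key", key))
    return actions

db = {
    "E1:A1": Seat("HELD", "U1", held_until=100.0),   # expired hold
    "E1:A2": Seat("AVAILABLE"),                      # DB write never landed
    "E1:A3": Seat("HELD", "U3", held_until=900.0),   # healthy hold
}
redis_holds = {"E1:A1": "U1", "E1:A2": "U2", "E1:A3": "U3"}
actions = reconcile(db, redis_holds, now=500.0)
```

After the pass, only the healthy hold on A3 remains in both stores.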
"Best Available" — FOR UPDATE SKIP LOCKED
Many users don't pick specific seats — they request "2 best available in Section B." PostgreSQL's FOR UPDATE SKIP LOCKED is perfect:
SELECT seat_id FROM seats
WHERE event_id = ? AND status = 'AVAILABLE' AND price_tier = ?
ORDER BY row_number ASC, seat_position ASC -- front-center is "best"
LIMIT ?
FOR UPDATE SKIP LOCKED -- Skip rows locked by other transactions
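If 10 people request "best available" simultaneously, each transaction skips the rows the others have already locked, so they each get different seats without blocking each other. No deadlocks, no waiting. The SELECT must run in the same transaction as the UPDATE that marks the seats HELD, because the row locks are released when the transaction ends.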
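The skipping behavior can be modeled in a few lines. This is a toy simulation of the semantics, not PostgreSQL: `pick_best_available` is a hypothetical function, and the `locked` set stands in for row locks held by transactions that have not yet committed.

```python
from typing import Dict, List, Set

def pick_best_available(seats: List[Dict], locked: Set[str],
                        count: int) -> List[str]:
    """Return up to `count` seat ids, skipping rows another requester holds."""
    picked: List[str] = []
    ordered = sorted(seats, key=lambda s: (s["row_number"], s["seat_position"]))
    for seat in ordered:
        if len(picked) == count:
            break
        if seat["status"] != "AVAILABLE" or seat["seat_id"] in locked:
            continue  # SKIP LOCKED: move on instead of waiting
        picked.append(seat["seat_id"])
    locked.update(picked)  # locks are held until the transaction ends
    return picked

seats = [
    {"seat_id": f"B{i}", "row_number": 1, "seat_position": i, "status": "AVAILABLE"}
    for i in range(1, 5)
]
locked: Set[str] = set()

first = pick_best_available(seats, locked, 2)
second = pick_best_available(seats, locked, 2)
third = pick_best_available(seats, locked, 2)
```

Note the third requester gets an empty result rather than an error. With LIMIT, a query can return fewer rows than requested, so the caller has to compare the number of seats returned with the number asked for.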
General Admission — Atomic Counter
For events without reserved seating, per-seat locking is unnecessary. Instead, use a single Redis DECR:
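async def buy_general_admission(event_id):
    key = f"event:{event_id}:remaining"
    remaining = await redis.decr(key)  # atomic; returns the new value
    if remaining >= 0:
        return True  # Purchase succeeds; write to DB async
    await redis.incr(key)  # Roll back the over-decrement
    return False  # Sold out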
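The decrement-then-roll-back pattern can be checked under real thread contention. The sketch below replaces Redis with a lock-protected `AtomicCounter` (a hypothetical stand-in that mimics the atomicity of DECR and INCR) and races 50 buyers against 10 tickets.

```python
import threading
from typing import List, Tuple

class AtomicCounter:
    """Stand-in for a Redis integer key with atomic DECR/INCR."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value

def try_purchase(counter: AtomicCounter) -> bool:
    remaining = counter.decr()
    if remaining >= 0:
        return True   # a ticket was reserved
    counter.incr()    # undo the over-decrement
    return False      # sold out

def simulate(capacity: int, buyers: int) -> Tuple[int, int]:
    """Race `buyers` threads for `capacity` tickets; return (sold, remaining)."""
    counter = AtomicCounter(capacity)
    results: List[bool] = []
    results_lock = threading.Lock()

    def buyer() -> None:
        ok = try_purchase(counter)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), counter.value

sold, remaining = simulate(capacity=10, buyers=50)
```

Exactly `capacity` purchases succeed regardless of interleaving, because no decrement can return a negative value until all 10 tickets are taken, and every failed buyer restores the unit it removed, so the counter settles back at zero.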