Airbnb

A two-sided marketplace with ~7M listings and ~100M nights booked per year. The hard parts: date-range availability search — "find homes in Lisbon for 4 guests, June 3–10, under $200/night" is a surprisingly gnarly query; no-double-booking with strong consistency on the calendar; and dynamic pricing + complex fees (cleaning fee, service fee, taxes that vary by jurisdiction) without ever showing the guest one number and charging another. Airbnb solved the "Amazon but for someone's spare bedroom" problem at planet scale.

⚡ Core: Availability + Pricing + Booking · 7M listings · 100M nights/year · Calendar consistency · Two-sided marketplace
02

Requirements

Functional
  • Search listings by location, date range, guests, amenities, price; sort + filter + map view
  • View listing detail: photos, amenities, reviews, host, approximate location (exact address shown only after booking)
  • Book a stay: select dates → see total price → confirm → charge
  • Host tools: create/edit listing, manage calendar, set pricing, accept/decline requests
  • Messaging between guest and host pre- and post-booking
  • Reviews (both directions) after checkout with a blind review window
  • Cancellations with policy-driven refunds
Non-Functional
  • Search < 300 ms p99 for the default map-view
  • No double-bookings, ever — strong consistency on calendar writes
  • Price displayed = price charged (no silent recalculations)
  • Scale to 7M active listings, 100M nights/year = ~275k nights/day booked
  • Eventual consistency is fine for search / browse; strong consistency on booking
  • 99.99% availability on booking-critical path
03

Scale Estimation

Listings
~7M active
~15M registered including inactive; 7M live at any time
Nights booked / day
~275K
100M/year ÷ 365; peak summer is 2–3× winter
Search QPS
~50K
Searches outnumber calendar writes by roughly four orders of magnitude (50K/sec vs ~3/sec); most users are window-shopping
Calendar writes / sec
~10
275K booked nights/day ÷ 86,400 s ≈ 3 night-rows/sec avg; ~10/sec peak; low volume but consistency-critical
Photos / listing
~15
7M × 15 = ~100M photos; ~500 TB in CDN; derivatives auto-generated
Booking saga steps
~6
availability lock → price quote → payment auth → reservation → host notify → lock release
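
The estimates above can be reproduced with a few lines of arithmetic. This is a rough sketch that uses only the figures quoted in this section; the total index size for the availability bitmaps is derived here (one 365-bit bitmap per listing) and is not a number from the source.

```python
SECONDS_PER_DAY = 86_400

# Inputs quoted in the estimation above.
nights_per_year = 100_000_000
active_listings = 7_000_000
photos_per_listing = 15
search_qps = 50_000

# Derived figures.
nights_per_day = nights_per_year / 365                     # about 274K
night_writes_per_sec = nights_per_day / SECONDS_PER_DAY    # about 3.2
search_to_write_ratio = search_qps / night_writes_per_sec  # about 16,000 to 1
total_photos = active_listings * photos_per_listing        # 105M

# One availability bit per night for 365 days, rounded up to whole bytes.
bitmap_bytes = (365 + 7) // 8
all_bitmaps_mb = active_listings * bitmap_bytes / 1e6      # derived, not from the source

print(f"nights/day          ~ {nights_per_day:,.0f}")
print(f"night writes/sec    ~ {night_writes_per_sec:.1f}")
print(f"search:write ratio  ~ {search_to_write_ratio:,.0f}:1")
print(f"photos              = {total_photos:,}")
print(f"bitmap bytes        = {bitmap_bytes} per listing, {all_bitmaps_mb:.0f} MB total")
```

The takeaway is the asymmetry: the write path is tiny in volume, so it can afford locks and transactions, while the read path has to be served from a denormalized index.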
04

API Design

GET /search?location=Lisbon&checkin=2026-06-03&checkout=2026-06-10&guests=4&price_max=200

Primary search. Returns {listings: [{id, title, price_per_night, total_price, photo_urls, location_approx, rating}], total_count, map_bbox}. Backed by Elasticsearch index pre-pruned for the date window.

GET /listings/{id}?checkin=X&checkout=Y&guests=N

Listing detail + authoritative price quote for the requested date window. Returns line-item breakdown: nightly × N, cleaning fee, service fee, taxes.
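
A minimal sketch of how the line-item breakdown might be assembled, using Decimal so that the displayed total and the charged total cannot diverge through float rounding. The `build_quote` helper, the 14% service fee, the 6% tax rate, and the choice of taxable base are all hypothetical values picked for illustration; real fee and tax rules vary by jurisdiction.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


@dataclass(frozen=True)
class PriceBreakdown:
    nightly_subtotal: Decimal
    cleaning_fee: Decimal
    service_fee: Decimal
    taxes: Decimal

    @property
    def total(self) -> Decimal:
        return (self.nightly_subtotal + self.cleaning_fee
                + self.service_fee + self.taxes)


def build_quote(nightly_prices, cleaning_fee, service_fee_rate, tax_rate):
    """Sum per-night prices, then add fees and taxes, rounding each line to cents."""
    subtotal = sum(nightly_prices, Decimal("0"))
    service_fee = (subtotal * service_fee_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    # Assumption: tax applies to nights + cleaning + service fee.
    taxable = subtotal + cleaning_fee + service_fee
    taxes = (taxable * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return PriceBreakdown(subtotal, cleaning_fee, service_fee, taxes)


# Seven nights: five at $180 and two at $210 (dynamic per-night pricing).
quote = build_quote(
    nightly_prices=[Decimal("180")] * 5 + [Decimal("210")] * 2,
    cleaning_fee=Decimal("60"),
    service_fee_rate=Decimal("0.14"),  # hypothetical
    tax_rate=Decimal("0.06"),          # hypothetical
)
print(quote, quote.total)
```

Rounding each line item once and summing the rounded values means the breakdown the guest sees always adds up to the total that is charged.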

POST /reservations/quote

Lock in a price quote. Server returns {quote_id, expires_at (10 min), total}. Guest has 10 minutes to complete booking at this price. Protects against race on fee changes mid-checkout.

POST /reservations

Confirm booking. Body: {quote_id, payment_token, guest_message}. Idempotency-key required. Executes the booking saga; returns {reservation_id, status}.
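
The interaction between the 10-minute quote and the idempotency key can be sketched with an in-memory stand-in. The `ReservationApi` class and its method names are hypothetical, and the dictionaries stand in for durable storage; the point is only that a retried request with the same key returns the stored response instead of booking twice, and that an expired quote is rejected.

```python
import uuid
from dataclasses import dataclass

QUOTE_TTL_SECONDS = 600  # the 10-minute window described above


@dataclass(frozen=True)
class PriceQuote:
    quote_id: str
    total_cents: int
    expires_at: float


class ReservationApi:
    """In-memory sketch; a real service would persist both maps."""

    def __init__(self):
        self._quotes = {}
        self._responses_by_key = {}

    def create_quote(self, total_cents, now):
        quote = PriceQuote(str(uuid.uuid4()), total_cents, now + QUOTE_TTL_SECONDS)
        self._quotes[quote.quote_id] = quote
        return quote

    def create_reservation(self, idempotency_key, quote_id, now):
        # A replayed key returns the original response without re-running the saga.
        if idempotency_key in self._responses_by_key:
            return self._responses_by_key[idempotency_key]
        quote = self._quotes.get(quote_id)
        if quote is None or now >= quote.expires_at:
            return {"status": "quote_expired"}
        response = {
            "reservation_id": str(uuid.uuid4()),
            "status": "confirmed",
            "total_cents": quote.total_cents,
        }
        self._responses_by_key[idempotency_key] = response
        return response


api = ReservationApi()
quote = api.create_quote(total_cents=165_869, now=0.0)
first = api.create_reservation("key-1", quote.quote_id, now=30.0)
retry = api.create_reservation("key-1", quote.quote_id, now=31.0)   # network retry
late = api.create_reservation("key-2", quote.quote_id, now=601.0)   # after expiry
print(first == retry, late["status"])
```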

GET /listings/{id}/calendar?start=2026-06&end=2026-09

Per-listing availability calendar. Returns array of (date, status, nightly_price). Used by guest detail view + host dashboard. Served from a read replica; stale by < 1 s.

POST /listings

Host creates or edits a listing. Inline validation; async indexing into Elasticsearch (< 60 s lag).

POST /listings/{id}/calendar

Host blocks dates / sets per-night prices. Writes to authoritative calendar store; invalidates search index entries for those dates.

POST /reservations/{id}/cancel

Cancel with policy check. Returns refund breakdown. Calendar re-opens; emits event to re-index search.

05

Architecture

Three conceptual tiers: the search tier (read-heavy, eventually consistent), the booking tier (write-heavy, strongly consistent), and pricing / ML (mostly offline feature stores + real-time serving). Search and booking both touch the calendar, but for different purposes — search reads a pre-joined, denormalized view; booking takes a short-lived lock on the authoritative row.

[Figure: Search + Booking Architecture. Clients: Guest app/web (browse + book) and Host app (listings + calendar), both behind an API Gateway (auth + routing). Search tier (eventually consistent): Search svc (query builder), Pricing svc (quote + ML), Elasticsearch (geo + date index), Feature store (host/guest features). Booking tier (strong consistency): Reservation svc (booking saga), Calendar svc (authoritative availability), Payment svc (auth + capture). Storage: Postgres for reservations (sharded by listing_id), Postgres for the calendar (row per date per listing), Postgres for listings (master catalog), S3/CDN for photos, Redis price cache. Kafka carries booking events and calendar CDC to the search reindexer.]
Request Flow
Guest (search + browse) → Search svc (ES bitmap query) → Quote (10 min price lock) → Reservation (saga start) → Calendar (FOR UPDATE lock) → Payment (auth w/ idempotency) → Commit + Event (flip bits + Kafka)
06

Deep Dive — Availability Search + Booking Saga

Two things are unusual about Airbnb search. First, the filter is a date range, not a single date — you need listings where every night in [checkin, checkout) is available. Second, pricing is per-night and dynamic — the "total price" shown in search results has to be computed per (listing, date-range). Both are hard to do at 50K QPS.

Indexing strategy. The search index stores, for each listing, a compact availability bitmap (one bit per night for the next ~365 days — 46 bytes) plus geo (H3 cells), amenities (tokenized), and a pre-computed nightly price per day. A search query:

  1. Geo-filter to listings in the bounding box.
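  2. AND the bitmap against the requested date range: any listing with a zero bit in the range is excluded. (Elasticsearch has no built-in per-document bitmap field, so this step typically runs as a script or plugin filter, or availability is modeled as indexed date values instead.)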
  3. Price filter: compute total (sum of nightly prices for range + fees) and filter to ≤ price_max.
  4. Rank by personalized relevance (learning-to-rank model; features include match score, reviews, host response rate, booking probability).
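
The bitmap check in step 2 can be sketched with plain Python integers. This is an illustration of the representation, not of how any particular search engine stores it: bit i of the integer is 1 when night i (counted from a hypothetical `INDEX_START` date) is available, and a stay is bookable only if every bit in [checkin, checkout) is set.

```python
from datetime import date

INDEX_START = date(2026, 1, 1)  # hypothetical day 0 of the bitmap
HORIZON_DAYS = 365              # one bit per night, about 46 bytes


def range_mask(checkin, checkout):
    """Bits set for every night in [checkin, checkout)."""
    first = (checkin - INDEX_START).days
    nights = (checkout - checkin).days
    if nights <= 0 or first < 0 or first + nights > HORIZON_DAYS:
        raise ValueError("date range outside the indexed horizon")
    return ((1 << nights) - 1) << first


def is_available(bitmap, checkin, checkout):
    mask = range_mask(checkin, checkout)
    return (bitmap & mask) == mask  # any zero bit in range fails the check


def block_nights(bitmap, checkin, checkout):
    """Clear the bits for booked or host-blocked nights."""
    return bitmap & ~range_mask(checkin, checkout)


def search(listings, checkin, checkout):
    """listings maps listing_id -> availability bitmap."""
    return [lid for lid, bm in listings.items() if is_available(bm, checkin, checkout)]


all_open = (1 << HORIZON_DAYS) - 1
listings = {
    "fully-open": all_open,
    "booked-jun-5-6": block_nights(all_open, date(2026, 6, 5), date(2026, 6, 7)),
}
hits = search(listings, date(2026, 6, 3), date(2026, 6, 10))
print(hits)
```

Because the whole range collapses into one AND plus one comparison, the cost per candidate listing does not grow with the length of the stay.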

Critically, the search index is eventually consistent. Host edits and bookings emit events on Kafka; an indexer consumer updates Elasticsearch with a few-second lag. That lag is acceptable because the booking-time re-check is strongly consistent (see saga below).

Booking saga. A booking is not a single DB write; it's a sequence of operations across multiple services, each of which can fail. If any step fails, prior steps must be compensated.

  1. Quote lock (optional). The guest hit the quote endpoint earlier, so we have a quote_id reserving a price for 10 minutes.
  2. Availability lock. Lock the (listing_id, date_range) rows in the Postgres calendar with SELECT … FOR UPDATE so that concurrent bookings for the same nights block. These are row-level locks held by the open transaction, not Postgres advisory locks.
  3. Re-verify availability + price. The eventually consistent search index might have been stale; check authoritatively. If unavailable or price changed materially, abort.
  4. Authorize payment. Call Stripe with idempotency key = reservation attempt id.
  5. Commit reservation. Write reservation row + flip calendar bits to booked. Atomic via DB transaction.
  6. Release the lock (the commit releases the row locks). Emit ReservationCreated event to Kafka.
  7. Downstream, async: host notification, calendar sync to iCal/Google Cal, search re-index, ML features updated.

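If payment authorization fails at step 4: release the lock and surface the error; there is no auth to void. If the step 5 commit fails after a successful authorization: retry the commit with the same reservation attempt id (the idempotency key makes the retry safe), and void the auth if the retry ultimately fails. Worst case, manual reconciliation flags a held auth that never committed.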
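
The saga steps and their compensations can be sketched in memory. Everything here is a stand-in under stated assumptions: `FakePayments` is a hypothetical substitute for the payment provider, a per-listing `threading.Lock` plays the role of the row locks, and the `fail_commit` flag simulates a database failure at step 5. It is not a model of any real service's code.

```python
import threading


class PaymentDeclined(Exception):
    pass


class FakePayments:
    """Hypothetical payment stand-in keyed by idempotency key."""

    def __init__(self, decline=False):
        self.decline = decline
        self.auths = {}

    def authorize(self, idempotency_key, amount_cents):
        if idempotency_key in self.auths:  # replay returns the same auth
            return idempotency_key
        if self.decline:
            raise PaymentDeclined()
        self.auths[idempotency_key] = "authorized"
        return idempotency_key

    def void(self, auth_id):
        self.auths[auth_id] = "voided"


class BookingSaga:
    def __init__(self, calendar, payments):
        self.calendar = calendar  # (listing_id, night) -> "open" | "booked"
        self.payments = payments
        self.events = []
        self._locks = {}

    def book(self, attempt_id, listing_id, nights, quoted_cents, current_cents,
             fail_commit=False):
        lock = self._locks.setdefault(listing_id, threading.Lock())
        with lock:  # step 2: stand-in for SELECT ... FOR UPDATE
            # Step 3: authoritative re-check of availability and price.
            if any(self.calendar.get((listing_id, n)) != "open" for n in nights):
                return "409 unavailable"
            if current_cents != quoted_cents:
                return "409 price changed"
            # Step 4: authorize; nothing to compensate if this fails.
            try:
                auth_id = self.payments.authorize(attempt_id, quoted_cents)
            except PaymentDeclined:
                return "402 payment failed"
            # Step 5: commit. A real DB transaction makes this all-or-nothing.
            if fail_commit:
                self.payments.void(auth_id)  # compensation for step 4
                return "500 commit failed, auth voided"
            for n in nights:
                self.calendar[(listing_id, n)] = "booked"
        # Step 6: lock released above; publish the event.
        self.events.append(("ReservationCreated", attempt_id))
        return "201 created"


calendar = {("L1", "2026-06-03"): "open", ("L1", "2026-06-04"): "open"}
saga = BookingSaga(calendar, FakePayments())
r1 = saga.book("attempt-1", "L1", ["2026-06-03", "2026-06-04"], 40_000, 40_000)
r2 = saga.book("attempt-2", "L1", ["2026-06-04"], 20_000, 20_000)
print(r1, "|", r2)
```

The ordering matters: the only step that needs an explicit undo is the payment authorization, because every earlier step is either read-only or released automatically with the lock.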

Booking Saga (Mermaid sequence diagram)
sequenceDiagram
    participant G as Guest
    participant R as Reservation svc
    participant C as Calendar svc
    participant P as Payment svc
    participant K as Kafka
    G->>R: POST /reservations (quote_id)
    R->>C: lock (listing, dates) FOR UPDATE
    C-->>R: acquired + authoritative availability
    alt available + price matches quote
        R->>P: authorize (idempotency_key)
        P-->>R: auth_id
        R->>C: COMMIT: mark dates booked
        R->>K: ReservationCreated event
        R-->>G: 201 { reservation_id }
    else unavailable / price changed
        R->>C: release lock
        R-->>G: 409 Conflict
    else payment failed
        R->>C: release lock
        R-->>G: 402 Payment Failed
    end

Why not optimistic concurrency? Could work — read-version, write-check-version. Pessimistic FOR UPDATE is chosen because contention on the same (listing, dates) is rare at the individual row level (each listing has few concurrent attempts) and optimistic retries on conflict would surface "someone beat you" errors to users more often. Pessimistic locks held for <2 s are cheaper than the UX cost of aborted optimistic writes.
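
For comparison, the optimistic alternative (read-version, write-check-version) looks roughly like the sketch below. `CalendarRow`, `VersionConflict`, and `book_if_unchanged` are hypothetical names; in a SQL store the same check is usually an UPDATE with a `WHERE version = ?` clause followed by a look at the affected row count.

```python
from dataclasses import dataclass


@dataclass
class CalendarRow:
    status: str = "open"
    version: int = 0


class VersionConflict(Exception):
    """Someone else wrote the row after we read it."""


def book_if_unchanged(row, expected_version):
    # Compare-and-set: the write only lands if nobody wrote since our read.
    if row.version != expected_version:
        raise VersionConflict()
    if row.status != "open":
        raise ValueError("night is not open")
    row.status = "booked"
    row.version += 1


row = CalendarRow()
guest_a_read = row.version  # both guests read version 0
guest_b_read = row.version

book_if_unchanged(row, guest_a_read)  # guest A wins
try:
    book_if_unchanged(row, guest_b_read)
    outcome = "booked"
except VersionConflict:
    outcome = "conflict"  # guest B sees a "someone beat you" error
print(row, outcome)
```

The loser here learns about the conflict only at write time, which is the user-facing abort the paragraph above argues against.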

Interview answer

"Search uses an Elasticsearch index with per-listing date-availability bitmaps + geo + price; queries intersect the bitmap with the requested range. Index is eventually consistent; updated via Kafka CDC from the authoritative calendar store. Bookings go through a saga: pessimistic lock on (listing, date-range) → re-verify authoritatively → payment auth with idempotency → commit reservation + flip calendar bits atomically → emit event. Price quotes are locked for 10 min at checkout to prevent silent repricing."

07

Tradeoffs & Design Choices

  • Eventually-consistent search, strongly-consistent booking. This is the key architectural choice. Search is read-heavy, latency-critical, and can tolerate a few seconds of staleness because the booking path re-checks authoritatively. Trying to make search strongly consistent would cost orders of magnitude more.
  • Pessimistic vs optimistic calendar locks. Pessimistic for the booking path because contention is rare + concentrated; optimistic with retry for calendar-view reads. Don't generalize one to the other.
  • Pre-computed nightly prices vs compute-on-query. Pre-compute for search (needs to scale to 50k QPS; ML-driven pricing can't run per query). Recompute authoritatively at quote time (by then it's a single listing, single computation).
  • Photo storage: direct-to-S3. Hosts upload directly to S3 via pre-signed URLs. App servers never handle the bytes. Derivatives (different sizes) generated async.
  • Reviews are blind until both sides submit (or 14 days pass). A product-design choice that becomes a data-model choice: review goes into a "pending" state until the reveal trigger. Simpler than it sounds but often missed in interviews.
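
The blind-review rule from the last bullet can be sketched as a small state check. `ReviewPair` is a hypothetical name, and measuring the 14 days from checkout is an assumption; the text only says that reviews stay hidden until both sides submit or 14 days pass.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

REVEAL_AFTER = timedelta(days=14)
ROLES = {"guest", "host"}


@dataclass
class ReviewPair:
    checkout_at: datetime
    submitted: dict = field(default_factory=dict)  # role -> review text

    def submit(self, role, text):
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self.submitted[role] = text

    def visible(self, now):
        """Pending reviews stay hidden until the reveal trigger fires."""
        both_in = set(self.submitted) >= ROLES
        window_over = now >= self.checkout_at + REVEAL_AFTER  # assumption: counted from checkout
        return dict(self.submitted) if (both_in or window_over) else {}


checkout = datetime(2026, 6, 10)
pair = ReviewPair(checkout)
pair.submit("guest", "Great stay")
hidden = pair.visible(checkout + timedelta(days=1))     # host has not reviewed yet
pair.submit("host", "Lovely guest")
revealed = pair.visible(checkout + timedelta(days=1))   # both submitted

one_sided = ReviewPair(checkout)
one_sided.submit("guest", "Fine")
timed_out = one_sided.visible(checkout + timedelta(days=15))  # window elapsed
print(hidden, revealed, timed_out)
```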
08

Failure Modes

🔁
Double-booking via race condition
Two guests try to book the same listing for overlapping dates at the exact same millisecond.
→ Mitigation: pessimistic lock on (listing_id, date range) via SELECT … FOR UPDATE. First wins; second retries or fails. Unique constraint on (listing_id, date) with status='booked' as belt-and-suspenders.
💸
Payment authorized, reservation commit fails
DB connection drops between auth and commit. The guest's card carries an authorization hold, but there is no reservation.
→ Mitigation: idempotent commit retry with same reservation_id. If the retry ultimately fails, a reconciliation job scans for orphaned auths and voids them. The authorization stays valid for up to 7 days, which leaves time for recovery.
🗓️
iCal sync drift
Host also lists on Booking.com; two platforms book the same dates, we don't sync fast enough.
→ Mitigation: poll host's external iCal URLs every 10 minutes; mark dates "soft blocked" immediately on detection; surface conflicts in host dashboard with guided resolution UI. Acknowledge we can't fully prevent — it's a platform interop problem.
📸
Host uploads 100 MB of RAW photos
Mobile upload times out; synchronous transcoding kills the API tier.
→ Mitigation: direct-to-S3 via pre-signed URL; client-side downsample to 5MB max JPEG; async transcoder generates CDN variants; listing marked "photos processing" during the 1–2 min gap.
🌍
Search region hot-spot on big event
Olympics in Paris → 50× search volume for Paris dates. Elasticsearch cluster CPU pegs.
→ Mitigation: shard Elasticsearch by geo with replication; cache hot query results with a short TTL (60 s — stale is OK for browse); rate-limit + queue search beyond threshold; pre-warm caches for known events.
🛡️
Bot scraping for market-intelligence
Competitors scrape all listings + pricing at high rate. Search QPS explodes with no conversion.
→ Mitigation: bot detection on User-Agent + IP patterns; rate limit per IP/account; request signing for official mobile apps; CAPTCHA on anonymous search above thresholds.
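
The belt-and-suspenders constraint from the double-booking mitigation above can be demonstrated with SQLite from the Python standard library. This sketch models booked nights as their own table with a primary key on (listing_id, night), which is one way to get the uniqueness guarantee; the table and column names are hypothetical, and a Postgres deployment would express the same idea in its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE booked_nights (
        listing_id     INTEGER NOT NULL,
        night          TEXT    NOT NULL,
        reservation_id TEXT    NOT NULL,
        PRIMARY KEY (listing_id, night)  -- at most one booking per night
    )
    """
)


def book(conn, listing_id, nights, reservation_id):
    """Insert all nights in one transaction; any overlap rolls the whole stay back."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.executemany(
                "INSERT INTO booked_nights VALUES (?, ?, ?)",
                [(listing_id, night, reservation_id) for night in nights],
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = book(conn, 42, ["2026-06-03", "2026-06-04"], "res-A")
second = book(conn, 42, ["2026-06-02", "2026-06-03"], "res-B")  # overlaps June 3
rows = conn.execute(
    "SELECT night, reservation_id FROM booked_nights ORDER BY night"
).fetchall()
print(first, second, rows)
```

Even if the locking layer had a bug, the second insert would be rejected by the database itself, and the transaction ensures the non-overlapping night from the losing booking is not left behind.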
09

Interview Tips

  1. Lead with the consistency split. "Search is eventually consistent; booking is strongly consistent; the booking saga re-verifies authoritatively." This sentence shows you've thought about the hard part.
  2. Date-range availability bitmaps. Mention them by name. Most candidates don't know this representation; it's specific and correct.
  3. Price quote locking. 10-minute quote window. Prevents "I see $200, I go enter my card, I see $230." A small detail that shows care for UX.
  4. Saga, not 2PC. Airbnb is a canonical saga example. Don't propose distributed transactions across payment + calendar.
  5. Reviews and trust. Often skipped. The blind-review system is a product-design decision with a data-model implication — worth mentioning.
11

Evolution

1

MVP — single Rails monolith + Postgres

One DB. Search is WHERE clauses. Hosts + guests + listings + reservations all in one schema. Works to ~10K listings. Airbnb ran this way for years.

2

Elasticsearch for search + sharded Postgres

Search moves out of the OLTP DB. Postgres sharded by listing_id. Reservations separated from listings. Carries to ~1M listings.

3

Service-oriented architecture

Monolith split into Listing / Reservation / Payment / Search services. Saga orchestrator for bookings. Kafka for event propagation.

4

ML-driven pricing + ranking

Smart Pricing (dynamic nightly prices per listing), Learning-to-Rank for search. Feature store + offline training pipelines. ~7M listings.

5

Global multi-region + personalization

Search serves from nearest region; booking writes replicate cross-region. Personalized search results based on prior browse + book history.
