(a) Cart data model. Per-user document stored in DynamoDB (partition key: user_id) and cached in Redis as a JSON hash.
// DynamoDB / Redis document
{
"user_id": "u_abc123", // or "guest_xyz" for anonymous
"items": [
{ "sku": "B08N5WRWNW", "qty": 2, "price_at_add": 29.99,
"variant_id": "color_black", "added_at": "2026-04-13T10:05:00Z" }
],
"updated_at": "2026-04-13T10:05:00Z",
"ttl": 1778666700 // DynamoDB TTL (epoch seconds): updated_at + 30 days, auto-delete when idle
}
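Write path: Cart Service writes Redis first (fast ack to client), then writes DynamoDB asynchronously. Strictly speaking this is write-behind rather than write-through, because the client is acknowledged before the durable write lands. On a Redis miss, load from DynamoDB and backfill Redis. DynamoDB TTL auto-cleans stale carts after 30 days.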
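A minimal sketch of the two mechanics above: computing the ttl attribute from updated_at, and the read path with backfill. Plain dicts stand in for Redis and DynamoDB; the names ttl_epoch, load_cart, cache, and table are illustrative assumptions, not part of the design above.

```python
from datetime import datetime, timedelta, timezone

CART_IDLE_TTL = timedelta(days=30)  # from the text: auto-delete after 30 days idle


def ttl_epoch(updated_at_iso: str) -> int:
    """Return the TTL attribute (Unix epoch seconds) for a cart last updated at the given time."""
    updated_at = datetime.fromisoformat(updated_at_iso.replace("Z", "+00:00"))
    return int((updated_at + CART_IDLE_TTL).timestamp())


def load_cart(user_id: str, cache: dict, table: dict):
    """Read path: try the cache first; on a miss, read the durable store and backfill the cache."""
    cart = cache.get(user_id)
    if cart is None:
        cart = table.get(user_id)      # stand-in for a DynamoDB GetItem
        if cart is not None:
            cache[user_id] = cart      # backfill so the next read is a cache hit
    return cart


if __name__ == "__main__":
    print(ttl_epoch("2026-04-13T10:05:00Z"))
```

Every cart write would recompute ttl from the new updated_at, which is what makes the 30 days an idle window rather than a fixed lifetime.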
(b) Guest-to-auth merge. When a guest user logs in, the Merge Service unions the guest cart with the saved authenticated cart:
- Read both carts (guest + auth) from Redis/DynamoDB.
- For items in only one cart: include as-is.
- For items in both carts (same SKU): keep the higher quantity. The user clearly wants at least that many.
- Refresh all prices from Price Service — guest cart may be hours old.
- Write merged cart under the authenticated user_id. Delete guest cart.
- Adjust inventory soft-holds: release guest holds, acquire auth holds for the merged quantities.
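The union rule above can be sketched as a pure function. This is an illustration under assumptions: items are keyed by sku alone (as the text states), and the refreshed price is stored in a current_price field next to price_at_add. The function name and the current_price callback are hypothetical.

```python
from typing import Callable, Dict, List


def merge_carts(
    guest_items: List[dict],
    auth_items: List[dict],
    current_price: Callable[[str], float],
) -> List[dict]:
    """Union two carts by SKU, keep the higher quantity on overlap, refresh prices."""
    merged: Dict[str, dict] = {}
    for item in list(auth_items) + list(guest_items):
        existing = merged.get(item["sku"])
        if existing is None:
            merged[item["sku"]] = dict(item)  # copy so the inputs are not mutated
        else:
            existing["qty"] = max(existing["qty"], item["qty"])
    for item in merged.values():
        # Assumption: current_price wraps a Price Service lookup.
        item["current_price"] = current_price(item["sku"])
    return list(merged.values())


if __name__ == "__main__":
    guest = [{"sku": "A", "qty": 3, "price_at_add": 10.0}]
    auth = [{"sku": "A", "qty": 2, "price_at_add": 9.0}]
    print(merge_carts(guest, auth, lambda sku: 11.0))
```

Keeping the merge free of I/O makes it easy to unit-test; the writes, the guest-cart delete, and the hold adjustments would wrap around it.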
(c) Inventory soft-hold with TTL. When an item is added to cart, hold that unit for 15 minutes so it doesn't sell out from under the user.
-- Redis: soft-hold on cart-add
-- Key: inventory:{sku}:available
DECR inventory:{sku}:available -- reserve 1 unit
SET hold:{user_id}:{sku} 1 EX 900 -- 15-min TTL key
-- On TTL expiry (keyspace notification):
INCR inventory:{sku}:available -- restore stock
-- On checkout (hard commit):
DEL hold:{user_id}:{sku} -- prevent TTL restore
-- Inventory already decremented; now it's permanent
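If the user doesn't check out within 15 minutes, the hold key's TTL expires, a listener on the keyspace notification runs INCR to restore the stock, and the item becomes available to others. If they do check out, the hold key is deleted first (so no expiry event fires) and the decrement becomes permanent. Redis delivers keyspace notifications over Pub/Sub without a delivery guarantee, so a periodic reconciliation sweep is a sensible fallback for missed expirations.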
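To make the reserve / expire / commit lifecycle concrete, here is a small in-memory model. It is a stand-in for Redis, not a Redis client: the class name, the injectable clock, and the choice to store the held quantity alongside the expiry are all assumptions made for illustration.

```python
import time


class SoftHoldInventory:
    """In-memory model of the soft-hold flow (reserve, TTL expiry, hard commit)."""

    def __init__(self, stock, hold_seconds=900, clock=time.monotonic):
        self.available = dict(stock)   # sku -> units neither held nor sold
        self.holds = {}                # (user_id, sku) -> (qty, expires_at)
        self.hold_seconds = hold_seconds
        self.clock = clock

    def release_expired(self):
        """Restore stock for every hold whose TTL has passed (models the expiry handler)."""
        now = self.clock()
        for key, (qty, expires_at) in list(self.holds.items()):
            if expires_at <= now:
                del self.holds[key]
                self.available[key[1]] += qty

    def reserve(self, user_id, sku, qty=1):
        """Decrement available stock and record a hold; False if it cannot be granted."""
        self.release_expired()
        if (user_id, sku) in self.holds:
            return False  # an existing hold is changed via a quantity update, not a second hold
        if self.available.get(sku, 0) < qty:
            return False
        self.available[sku] -= qty
        self.holds[(user_id, sku)] = (qty, self.clock() + self.hold_seconds)
        return True

    def commit(self, user_id, sku):
        """Checkout: drop the hold so nothing restores the stock. False if the hold is gone."""
        self.release_expired()
        return self.holds.pop((user_id, sku), None) is not None
```

Storing the quantity with the hold matters: an expiry event carries only the key name, so the restore step needs some other record of how many units to return.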
(d) Abandoned-cart recovery. Cart Service publishes a Kafka event on every cart update. A Flink consumer tracks "time since last update" per cart. When a cart that contains items has been idle for more than 1 hour, the consumer fires an event to the Email Service: "You left items in your cart." Recovery emails recapture an estimated 5-10% of abandoned carts, which is worth billions at Amazon scale.
Recovery timing matters. Sending the email too early (15 min) annoys users still browsing. Too late (24 hours) and they've forgotten or bought elsewhere. The sweet spot is 1 hour for high-intent carts (high-value items), 4 hours for low-value carts. ML models optimize send time per user based on historical open rates and conversion patterns.
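The timing rule can be sketched as a simple check. The 1-hour and 4-hour delays come from the text; the dollar threshold separating high-value from low-value carts is an assumption (the text does not define it), and the function names are hypothetical. A real deployment would express this as Flink timers rather than a polling function.

```python
from datetime import datetime, timedelta, timezone

HIGH_VALUE_THRESHOLD = 100.0  # assumption: the text does not say what counts as high-value


def recovery_delay(cart: dict) -> timedelta:
    """1 hour for high-value carts, 4 hours otherwise (delays taken from the text)."""
    total = sum(item["qty"] * item["price_at_add"] for item in cart["items"])
    return timedelta(hours=1) if total >= HIGH_VALUE_THRESHOLD else timedelta(hours=4)


def is_due_for_recovery(cart: dict, now: datetime) -> bool:
    """True when a non-empty cart has been idle longer than its recovery delay."""
    if not cart["items"]:
        return False
    updated_at = datetime.fromisoformat(cart["updated_at"].replace("Z", "+00:00"))
    return now - updated_at > recovery_delay(cart)
```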
Write-through vs write-behind detail. The Cart Service uses a write-behind pattern with Kafka as a durable buffer:
- Client calls POST /cart/items. Cart Service writes to Redis immediately; client gets a response in < 5 ms.
- Cart Service publishes a cart-updated event to Kafka (async, non-blocking).
- A DynamoDB writer consumer reads from Kafka and writes to DynamoDB.
- If the Cart Service (or Redis) fails after the Redis write but before the Kafka publish, the update never reaches DynamoDB and can be lost, even though the client already got a success response. Risk window: ~1-2 ms. Acceptable for a shopping cart (not for payments).
- If DynamoDB writer fails, Kafka retains the event. Consumer retries with exponential backoff. Eventually consistent.
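The retry behaviour in the last step can be sketched as below. The write callable stands in for the DynamoDB write, and the attempt count, base delay, and injectable sleep are assumptions chosen for illustration. A production consumer would typically also add jitter and would only commit the Kafka offset after a successful write.

```python
import time


def write_with_backoff(write, event, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call write(event), retrying with exponentially growing delays between attempts.

    Returns True on success, False once max_attempts have all failed.
    """
    for attempt in range(max_attempts):
        try:
            write(event)
            return True
        except Exception:
            # Wait base_delay, 2x, 4x, ... but do not sleep after the final attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return False
```

Because an event may be delivered and written more than once, the write should be idempotent; putting the full cart document keyed by user_id has that property.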
Cart Lifecycle: Add to Merge to Checkout
sequenceDiagram
participant U as User (guest)
participant CS as Cart Service
participant R as Redis
participant D as DynamoDB
participant IS as Inventory svc
participant MS as Merge Service
participant CO as Checkout
U->>CS: POST /cart/items {sku, qty:2}
CS->>R: HSET cart:guest_xyz (item)
CS->>D: PutItem (async write-behind)
CS->>IS: DECR inventory:{sku} by 2
IS-->>CS: hold confirmed (15-min TTL)
CS-->>U: cart updated, price_at_add=$29.99
Note over U: User logs in
U->>MS: POST /cart/merge {guest_cart_id}
MS->>R: HGETALL cart:guest_xyz, cart:u_abc123
MS->>MS: union items, keep higher qty
MS->>R: HSET cart:u_abc123 (merged)
MS->>D: PutItem (merged, async)
MS->>IS: release guest holds, acquire auth holds
MS-->>U: merged cart returned
U->>CO: POST /checkout
CO->>IS: hard-commit inventory (DEL hold keys)
CO-->>U: order confirmed
Soft-hold edge cases. The 15-min TTL soft-hold has several edge cases that need careful handling:
- User updates quantity from 2 to 5: Cart Service computes delta (+3), calls DECR by 3. If insufficient stock for the delta, reject the update and return "only N available."
- User removes item: Cart Service calls INCR by the held quantity. Stock is immediately available to others.
- TTL race on checkout: User's hold expires at T=15:00, checkout request arrives at T=14:59. The hold key might expire between "check hold exists" and "delete hold key." Solution: use a Lua script that atomically checks and deletes the hold, then hard-commits the inventory decrement.
- Multiple tabs / devices: User adds item on phone (hold created), adds same item on laptop (second hold attempted). Deduplicate by user_id + sku — only one hold per user per SKU.
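The quantity-update and remove cases reduce to the same delta arithmetic, sketched here as a pure function. The function name and the convention that N in "only N available" means the most this user could hold are assumptions. In Redis the check and the update would need to run atomically, for example inside a Lua script, for the same reason given for the checkout race.

```python
def change_held_quantity(available: int, held: int, new_qty: int):
    """Return (available, held) after changing a hold from `held` units to `new_qty` units.

    A positive delta reserves more stock, a negative delta releases it,
    and new_qty == 0 models removing the item from the cart.
    """
    if new_qty < 0:
        raise ValueError("quantity cannot be negative")
    delta = new_qty - held
    if delta > available:
        # The user already holds `held` units, so the most they can have is held + available.
        raise ValueError(f"only {held + available} available")
    return available - delta, new_qty
```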
Price service integration. Every GET /cart call refreshes prices from the Price Service cache (a Redis cluster with catalog prices updated every 5 minutes from the product catalog DB). The response includes both price_at_add (historical) and current_price. If they differ by more than 5%, the UI shows a "price changed" badge. Before checkout, a final price validation ensures the user pays the current price — never the stale one.
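A sketch of the badge rule, assuming the 5% difference is measured relative to price_at_add (the text does not say which price is the base). The function names and the price_changed response field are hypothetical.

```python
def price_changed(price_at_add: float, current_price: float, threshold: float = 0.05) -> bool:
    """True when the current price differs from the add-time price by more than the threshold."""
    if price_at_add <= 0:
        return current_price != price_at_add  # avoid dividing by zero on free items
    return abs(current_price - price_at_add) / price_at_add > threshold


def annotate_item(item: dict, current_price: float) -> dict:
    """Return a copy of a cart item with current_price and a price_changed flag for the UI."""
    return {
        **item,
        "current_price": current_price,
        "price_changed": price_changed(item["price_at_add"], current_price),
    }
```

The badge is advisory only; the final validation before checkout always charges current_price regardless of the flag.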
Interview answer
"Cart is a per-user JSON document in DynamoDB (durable) with a Redis cache (fast). Writes go through Redis first, then async to DynamoDB via Kafka write-behind. Guest carts merge on login via union strategy — same SKU keeps higher quantity, prices refreshed. Inventory soft-hold uses Redis DECR with a 15-minute TTL key: if checkout happens, delete the TTL key and hard-commit; if abandoned, TTL expires and INCR restores stock. Kafka events on cart updates feed a Flink consumer that triggers abandon-recovery emails after 1 hour idle."