06
Deep Dive — The Media Upload Pipeline
The Core Question
How do you accept 95M uploads/day, transcode each into 4-6 renditions, and have them globally available within seconds — without your servers ever touching the bytes?
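Step 1 — Direct-to-S3 upload. Client requests pre-signed URLs for an S3 multipart upload (one URL per part), then uploads chunks of ~5 MB directly to S3. The upload is resumable: after a network drop, the client resumes from the last confirmed chunk. Server sees only metadata: upload_id, expected size, declared MIME. This keeps the media bytes off the API tier entirely. At 95M uploads/day (~1,100 uploads/s) and an assumed average upload of 5-6 MB, that is roughly 50 Gbps of sustained inbound bandwidth the API tier never carries.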
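The chunking and resume logic in Step 1 can be sketched as follows. This is a minimal, client-side illustration only: the names `Part`, `plan_parts`, and `parts_to_resume` are hypothetical, and the set of confirmed part numbers is assumed to come from whatever the storage service reports as already uploaded.

```python
from dataclasses import dataclass

# 5 MiB is S3's minimum size for every part except the last one.
PART_SIZE = 5 * 1024 * 1024


@dataclass(frozen=True)
class Part:
    number: int  # part numbers start at 1
    offset: int  # byte offset into the local file
    length: int  # number of bytes in this part


def plan_parts(total_size: int, part_size: int = PART_SIZE) -> list[Part]:
    """Split an upload of total_size bytes into consecutive fixed-size parts."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append(Part(number, offset, length))
        offset += length
        number += 1
    return parts


def parts_to_resume(plan: list[Part], confirmed: set[int]) -> list[Part]:
    """Return the parts still to send, given the part numbers already confirmed."""
    return [p for p in plan if p.number not in confirmed]


plan = plan_parts(12 * 1024 * 1024)          # a 12 MiB upload
todo = parts_to_resume(plan, confirmed={1})  # connection dropped after part 1
```

Here `plan` holds three parts (5 MiB, 5 MiB, 2 MiB) and `todo` holds parts 2 and 3, so only 7 MiB is re-sent after the drop.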
Step 2 — Validate + extract. S3 event triggers a Lambda. Reads the original, validates format/size, extracts EXIF, runs initial moderation (NSFW classifier + virus scan). Marks the upload ready or rejects.
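As a rough illustration of the format and size checks in Step 2, the sketch below compares the declared MIME type against the file's leading bytes. The size ceiling, the function names, and the returned reason strings are assumptions for the example; EXIF extraction, moderation, and virus scanning are out of scope here.

```python
from typing import Optional, Tuple

MAX_BYTES = 4 * 1024**3  # hypothetical 4 GiB ceiling, not a figure from the text


def sniff_mime(head: bytes) -> Optional[str]:
    """Guess the container type from the first bytes of the file."""
    if head.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "image/webp"
    # ISO base media files carry an 'ftyp' box at offset 4. Treating all of
    # them as MP4 is a simplification; a real validator would check the brand.
    if head[4:8] == b"ftyp":
        return "video/mp4"
    return None


def validate_upload(declared_mime: str, expected_size: int,
                    actual_size: int, head: bytes) -> Tuple[bool, str]:
    """Return (ok, reason) for an uploaded original."""
    if actual_size != expected_size:
        return False, "size mismatch"
    if actual_size > MAX_BYTES:
        return False, "too large"
    detected = sniff_mime(head)
    if detected is None:
        return False, "unrecognized format"
    if detected != declared_mime:
        return False, "declared MIME does not match content"
    return True, "ready"


jpeg_head = b"\xff\xd8\xff\xe0" + b"\x00" * 12
ok_result = validate_upload("image/jpeg", 2_000_000, 2_000_000, jpeg_head)
bad_result = validate_upload("image/png", 2_000_000, 2_000_000, jpeg_head)
```

Checking the bytes rather than trusting the client matters because the server never saw the upload; the declared MIME is only a claim until the validator confirms it.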
Step 3 — Transcode ladder. A GPU worker pool consumes ready uploads. Photos get resized to 5 sizes (thumbnail 150², card 600², feed 1080², profile 1440², full 2048²) + 2 formats (WebP for modern, JPEG fallback). Reels get the full HLS ladder: 240p / 480p / 720p / 1080p (+ AV1 for newer clients) split into 4-second segments. One reel = ~40 derivative files; one photo = ~10.
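The derivative counts in Step 3 can be checked by enumerating the outputs. The sketch below uses the sizes, formats, rungs, and segment length from the text; the key layout (`<id>/<rung>/seg_0000.ts` and so on) is a hypothetical naming scheme, and the AV1 variants are left out.

```python
import math
from itertools import product

PHOTO_SIZES = {"thumbnail": 150, "card": 600, "feed": 1080,
               "profile": 1440, "full": 2048}
PHOTO_FORMATS = ("webp", "jpg")
REEL_RUNGS = ("240p", "480p", "720p", "1080p")
SEGMENT_SECONDS = 4


def photo_derivatives(media_id: str) -> list[str]:
    """One key per (size, format) pair: 5 sizes x 2 formats."""
    return [f"{media_id}/{name}_{px}.{fmt}"
            for (name, px), fmt in product(PHOTO_SIZES.items(), PHOTO_FORMATS)]


def reel_derivatives(media_id: str, duration_s: float) -> list[str]:
    """Master playlist, one playlist per rung, and the segments of each rung."""
    segments = math.ceil(duration_s / SEGMENT_SECONDS)
    keys = [f"{media_id}/master.m3u8"]
    for rung in REEL_RUNGS:
        keys.append(f"{media_id}/{rung}/index.m3u8")
        keys.extend(f"{media_id}/{rung}/seg_{i:04d}.ts" for i in range(segments))
    return keys


photo_keys = photo_derivatives("p1")
reel_keys = reel_derivatives("r1", duration_s=30)
```

A photo yields 10 keys. A 30-second reel yields 8 segments per rung, so 32 segments plus 4 rung playlists and 1 master playlist, 37 files in total, which is in line with the "~40" figure.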
Step 4 — Push to CDN. Derivatives are written to S3 under predictable paths. The CDN either pulls each object on first request (lazy) or is pre-warmed via an API call (for content likely to go viral, such as posts from verified accounts). The HLS master manifest is stitched together from the per-rendition URLs so players can switch bitrates adaptively.
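To make the manifest step concrete, here is a minimal sketch of an HLS master playlist builder. The `#EXTM3U` and `#EXT-X-STREAM-INF` tags are standard HLS; the bitrates, frame dimensions, URL layout, and the `cdn.example.com` host are illustrative assumptions, not values from the pipeline described here.

```python
RUNGS = [
    # (name, width, height, peak bits per second) -- illustrative values only
    ("240p", 426, 240, 400_000),
    ("480p", 854, 480, 1_200_000),
    ("720p", 1280, 720, 2_800_000),
    ("1080p", 1920, 1080, 5_000_000),
]


def master_playlist(base_url: str, rungs=RUNGS) -> str:
    """Build a master playlist that points at one media playlist per rung."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, width, height, bandwidth in rungs:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}"
        )
        lines.append(f"{base_url}/{name}/index.m3u8")
    return "\n".join(lines) + "\n"


manifest = master_playlist("https://cdn.example.com/r1")
```

The player reads this one file, then picks a rung based on measured throughput; because the paths are predictable, the manifest can be generated without listing the bucket.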
Step 5 — Ack to client. Handlers on the Kafka event chain (UploadReady → PostCreated) update the post status as the pipeline progresses. The client polls the post; once its status is processed, the post becomes visible in feeds.
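The client-side polling in Step 5 might look like the sketch below, which backs off between requests instead of polling at a fixed rate. Only the `processed` status comes from the text; the `rejected` status, the timing defaults, and the `fetch_status` callable (standing in for a call to the post endpoint) are assumptions.

```python
import time
from typing import Callable


def wait_until_processed(
    fetch_status: Callable[[], str],
    timeout_s: float = 60.0,
    initial_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the post is processed; return False on rejection or timeout."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        status = fetch_status()
        if status == "processed":
            return True
        if status == "rejected":
            return False
        if clock() + delay > deadline:
            return False  # the next poll would land past the deadline
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped


# Demo with a fake status source and a no-op sleep so it runs instantly.
_fake_statuses = iter(["pending", "pending", "processed"])
visible = wait_until_processed(lambda: next(_fake_statuses), sleep=lambda _s: None)
```

Short initial delays suit the 1-2 s thumbnail target, while the cap keeps a slow reel transcode from generating a steady stream of requests.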
Latency budget: first thumbnail in 1-2s, full HLS ladder in 10-30s for reels. Users see their photo immediately (we serve the original or a fast-path resize); the optimal rendition replaces it within seconds.
Cost realities — a 30-second reel transcoded into 4 renditions × 2 codecs takes ~30 GPU-seconds. At 95M uploads/day with, say, 30% reels, that is 28.5M reels × 30 GPU-seconds ≈ 855M GPU-seconds, or ~237,500 GPU-hours/day for transcoding alone. At AWS spot pricing ($0.50/GPU-hour for older A10G), that comes to ~$119,000/day, or roughly $43M/year. At Instagram's actual scale with H100s and AV1 re-encoding, it's tens of millions in annual GPU spend.
Sequence — Upload & Transcode
sequenceDiagram
participant C as Client
participant API as API
participant S3 as S3
participant L as Validator (Lambda)
participant K as Kafka
participant T as Transcode workers
participant CDN as CDN
C->>API: POST /uploads/initiate
API->>S3: pre-sign multipart URL
API-->>C: { upload_url, upload_id }
C->>S3: PUT chunks (5 MB each, resumable)
S3->>L: ObjectCreated event
L->>L: validate, EXIF, NSFW, virus
L->>K: UploadReady{ id, type }
K->>T: consume
T->>S3: read original
T->>T: encode 5 sizes + 2 formats (photo)<br/>or HLS ladder (reel)
T->>S3: write derivatives
T->>K: TranscodeComplete{ id, manifest }
K->>API: status update
CDN->>S3: lazy pull on first request
CDN-->>C: serve derivative