04
Deep dive — cross-database migration with CDC
You're moving from MySQL to Postgres. Different schemas, different connection strings, different SQL dialects. Naive: schedule a maintenance window, dump-restore, point app at new DB. Hours of downtime; data loss if anything fails.
The right way using CDC:
- Set up Postgres with target schema. Empty.
- Backfill: streaming export from MySQL → import to Postgres. Take hours/days; no downtime since MySQL still serves prod.
- Start CDC: Debezium tails MySQL binlog → applies changes to Postgres. New writes flow through. Postgres now mirrors MySQL with seconds of lag.
- Verify: run consistency checks. Row counts match? Sample rows match? Latency under control?
- Dual-write: deploy app version that writes to both MySQL and Postgres explicitly. Catches bugs the CDC layer might miss.
- Switch reads: deploy app reading from Postgres. MySQL still being written but unused.
- Stop writing to MySQL: deploy app writing only to Postgres.
- Decommission MySQL: archive, then delete.
Each step takes hours/days. The whole migration takes weeks. But traffic is served the entire time. This is how Stripe migrated billions of rows from MongoDB to Postgres; how Shopify continuously evolves its sharded MySQL fleet.
Interview answer
"Zero-downtime migrations follow expand-then-contract: add new schema, dual-write, backfill, switch reads, stop dual-write, drop old. Each step is independently deployable and reversible. Cross-database moves use CDC to keep target in sync during the dual-write window — Debezium tails the source's WAL/binlog and replays to the destination."