"Which database?" is the single most common design decision in every system design interview. Pick wrong and your design either can't scale, can't query, or can't stay consistent. PostgreSQL, MySQL, Cassandra, DynamoDB, and MongoDB each dominate a different workload shape.
This comparison arms you with a crisp mental model so you never wave your hands when the interviewer asks "Why not just use Postgres?"
02
Head-to-head comparison
Database
Consistency
Scaling model
Query flexibility
Ops complexity
Sweet spot
PostgreSQL
Strong (ACID, serializable)
Vertical + read replicas; Citus for horizontal
Full SQL, JSON, CTEs, window functions
Moderate (mature tooling)
General OLTP, complex queries, ACID-critical
MySQL
Strong (InnoDB ACID)
Vertical + read replicas; Vitess for sharding
Standard SQL, decent JSON support
Low-moderate
Web apps, read-heavy OLTP, proven at scale (Meta, GitHub)
Key insight: Relational databases (Postgres, MySQL) give you query flexibility and strong consistency at the cost of harder horizontal scaling. NoSQL databases (Cassandra, DynamoDB, MongoDB) give you horizontal scaling at the cost of query constraints or consistency trade-offs.
03
Decision flowchart
Walk through these questions in order during your interview:
Do you need ACID transactions + complex queries (joins, aggregations)? → PostgreSQL. Default choice for most OLTP. Battle-tested, extensible, rich ecosystem.
Write-heavy time-series / append-only workload at massive scale? → Cassandra. Linear write scaling. Model data by query (denormalize aggressively). No joins.
Serverless, simple key-value / key-sort access patterns? → DynamoDB. Zero ops. Single-digit ms latency. Design your access patterns around partition key + sort key.
Rapidly evolving schema, document-shaped data? → MongoDB. Flexible schema for prototyping. Good aggregation pipeline. Careful with shard key selection for production.
Read-heavy web app, proven MySQL ecosystem (Vitess, ProxySQL)? → MySQL. Simpler than Postgres, massive community, proven at Meta/GitHub scale.
The interview answer: "Postgres for most OLTP. Cassandra for write-heavy time-series. DynamoDB for serverless KV. MongoDB for flexible schema prototyping."
04
When to pick what — deeper context
PostgreSQL — the default choice
If you don't have a specific reason NOT to use Postgres, use Postgres. It handles JSON (JSONB), full-text search, geospatial (PostGIS), and time-series (TimescaleDB) via extensions. Horizontal scaling via Citus or read replicas covers most workloads up to ~10TB. Beyond that, consider sharding or moving hot paths to a purpose-built store.
Cassandra — built for writes
Cassandra's LSM-tree storage engine is optimized for writes. It achieves linear write scaling by partitioning data across nodes with consistent hashing. The trade-off: you must model data by query pattern (no ad-hoc joins). Reads are fast only when you query by partition key. Time-series data (metrics, IoT, activity logs) is the canonical use case.
DynamoDB — zero ops at any scale
DynamoDB's single-table design pattern packs multiple entity types into one table using composite keys. This feels unnatural to SQL developers but enables single-digit-ms reads at any scale. Watch out for hot partitions (adaptive capacity helps but doesn't eliminate the problem) and GSI cost (each GSI is essentially a full table copy).
MongoDB — schema flexibility has a cost
MongoDB's flexible schema is a double-edged sword. Great for prototyping and evolving data models, but production systems need schema validation to prevent data rot. The aggregation pipeline is surprisingly powerful. Shard key selection is make-or-break: a bad shard key creates hot spots that are painful to fix post-deployment.
The polyglot persistence pattern
Real systems often use multiple databases: Postgres for transactional core + Redis for caching + Elasticsearch for search + Cassandra for analytics. Mention this pattern in interviews to show real-world awareness, but keep your primary store choice simple and justified.