Why this matters
One server, even a massive one, tops out at roughly 100k writes/sec (the exact ceiling depends on hardware and workload). Replication buys you read scale but not write scale: every write still goes through the leader. Once writes exceed one box, you shard: split the data across N boxes, each owning a slice. Done right, write throughput scales roughly linearly with N. Done wrong, you get hot shards, impossible cross-shard queries, and a year-long migration project.
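To make "each box owns a slice" concrete, here is a minimal sketch of hash-based routing, which is only one possible way to assign keys to shards. The names `NUM_SHARDS`, `shard_for`, and `distribution`, and the `user:<id>` key format, are assumptions for illustration. It also shows how a single very popular key produces a hot shard no matter how evenly the hash spreads the rest.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical N; real deployments pick this from capacity planning


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so is unsuitable for routing.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys, num_shards: int = NUM_SHARDS) -> list:
    """Count how many writes land on each shard for a stream of keys."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


if __name__ == "__main__":
    # Many distinct keys: writes spread close to evenly across shards.
    uniform = [f"user:{i}" for i in range(10_000)]
    print("uniform:", distribution(uniform))

    # One key receives 90% of the writes: its shard becomes hot.
    skewed = ["user:42"] * 9_000 + [f"user:{i}" for i in range(1_000)]
    print("skewed: ", distribution(skewed))
```

With many distinct keys the per-shard counts come out close to equal, which is the "scales with N" case. In the skewed run one shard absorbs at least 90% of the writes, so adding boxes does not help until the key choice changes.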
Sharding is the most-misused word in system design. Know the four strategies, their failure modes, and when each is correct.