Shared-Nothing Architecture

01

Why this matters

The single most important architectural principle for scalable systems: no resource is shared across nodes. No shared memory, no shared disk, no shared lock manager, no shared cache. Each node owns its slice of data and serves its slice of requests. Adding capacity = adding more nodes. Removing capacity = retiring nodes once their data has moved elsewhere. That's it. Because nodes never coordinate on the data path, coordination overhead does not grow with cluster size.

Stonebraker named the pattern in 1986 ("The Case for Shared Nothing"). Most web-scale systems you can name are shared-nothing at some layer: DynamoDB, Cassandra, and Kafka at the storage layer; BigTable and Snowflake at the compute layer, on top of shared storage. The opposite (shared-disk databases like Oracle RAC) hits coordination ceilings far sooner.

02

The three architectures

Pattern | What's shared | Scaling ceiling | Example
Shared-everything | Memory + disk + CPU | 1 machine | Single-server Postgres
Shared-disk | Disk; CPU + memory per node | ~20 nodes (lock manager bottleneck) | Oracle RAC, IBM DB2 pureScale
Shared-nothing | Nothing; communication via network only | ~1000s of nodes | Cassandra, DynamoDB, BigQuery, Kafka

03

Why it scales

Shared anything = coordination. Coordination cost grows at least linearly with node count (often O(N²) for messages, O(N) for lock contention), so useful throughput scales sub-linearly. Past some N, adding a node makes everything slower because the added coordination overhead exceeds the added work.

Shared-nothing eliminates coordination from the data path. A node receives a request, looks up only its local data (or computes only its slice), and returns. It doesn't ask other nodes anything. Throughput is roughly N × per-node throughput: close to linear, as long as load is spread evenly across nodes.
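
The difference between the two regimes can be sketched with a toy model. The sketch below assumes the Universal Scalability Law form for relative throughput, which this page does not name. The coefficient values are illustrative assumptions, not measurements.

```python
def relative_throughput(n: int, contention: float = 0.0, crosstalk: float = 0.0) -> float:
    """Speedup of an n-node cluster over a single node (Universal Scalability Law).

    contention: serialized work such as a shared lock manager, grows as O(N).
    crosstalk:  pairwise coordination messages, grows as O(N^2).
    Both coefficients are illustrative assumptions, not measured values.
    """
    return n / (1 + contention * (n - 1) + crosstalk * n * (n - 1))


def peak_node_count(contention: float, crosstalk: float, max_n: int = 10_000) -> int:
    """Node count in 1..max_n that gives the highest modeled throughput."""
    return max(
        range(1, max_n + 1),
        key=lambda n: relative_throughput(n, contention, crosstalk),
    )
```

With both coefficients at zero (the shared-nothing ideal) the model reduces to N. With `contention=0.05` and `crosstalk=0.001`, modeled throughput peaks at 31 nodes and falls after that, which is the "adding a node makes everything slower" regime.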

The pattern extends beyond data: stateless services are shared-nothing at the compute layer (no shared session state). Sharded databases are shared-nothing at the storage layer (each shard owns its data). The principle propagates: anywhere you can avoid sharing, do.

04

What you give up

Shared-nothing isn't free. The trade-offs:

  • No global transactions for free. A transaction touching data on N shards becomes a distributed transaction (saga, 2PC, or "model your data so transactions stay local").
  • Cross-shard joins are app-level. A manually sharded DB can't join across nodes; you fetch then join in code, or denormalize.
  • Routing layer needed. Some component must know which node owns which data — a router, smart client, or consistent hashing.
  • Rebalancing. Adding/removing a node moves some data. Done well (consistent hashing, virtual nodes), it's automatic and incremental. Done poorly, it's a maintenance window.
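
The routing and rebalancing points above can be sketched with a consistent-hash ring. This is a minimal in-memory sketch, not any particular database's implementation. The `HashRing` class, the node names, and the choice of 100 virtual nodes per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative, in-memory only)."""

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []  # sorted (token, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node claims `vnodes` tokens scattered around the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def owner(self, key: str) -> str:
        # First token at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]
```

Any client holding the same node list computes the same `owner(key)`, so no central lookup is needed on the data path. When a node joins, the only keys that change owner are the ones the new node takes over; everything else stays put, which is what makes rebalancing incremental.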
05

Deep dive — designing your data model for shared-nothing

The single biggest determinant of shared-nothing success: pick a partition key that aligns with your access pattern. If most queries are scoped to "this user's data," shard by user_id. If most queries are "this tenant's data," shard by tenant_id. Each query then hits exactly one node — perfect locality.

Bad partition keys force scatter-gather: query goes to all N nodes, results merge. Latency = slowest node's response. Throughput = N× lower than single-shard queries. This is what kills naive sharded systems.
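
A little arithmetic shows how quickly scatter-gather degrades. The sketch below assumes each shard is slow independently with the same probability, and that a query costs one unit of work on every shard it touches. Both are simplifications, and the function names are hypothetical.

```python
def slow_query_probability(shards_touched: int, p_slow_per_shard: float) -> float:
    """Chance that at least one touched shard responds slowly.

    Assumes shards are slow independently of each other (a simplification).
    """
    return 1.0 - (1.0 - p_slow_per_shard) ** shards_touched


def cluster_query_capacity(num_shards: int, per_shard_qps: float, shards_touched: int) -> float:
    """Queries/sec the cluster sustains if each query costs one unit of work per touched shard."""
    return num_shards * per_shard_qps / shards_touched
```

If 1% of responses from any one shard are slow, a single-shard query is slow 1% of the time, while a query fanned out to 100 shards is slow about 63% of the time. On a 100-shard cluster at 1,000 queries/sec per shard, single-shard queries give a capacity of 100,000 queries/sec; full scatter-gather gives 1,000.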

Real example — Twitter:

  • Shard tweets by user_id. Reading "all of Alice's tweets" hits one shard. Win.
  • "Tweets mentioning @bob." Bob's mentions are spread across millions of authors' shards. Scatter-gather across the entire fleet. Lose.
  • Solution: build a second data structure, a "mentions index", sharded by mentioned-user. Each tweet costs one extra write per mentioned user, but reads stay shard-local. Denormalization to preserve the shared-nothing read property.
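
The write-twice, read-once shape of that solution can be sketched in a few lines. This is an in-memory toy, not Twitter's actual design. `NUM_SHARDS`, `shard_for`, `Shard`, and the `@name` parsing rule are hypothetical names and choices made for illustration.

```python
import re
from collections import defaultdict

NUM_SHARDS = 8  # hypothetical fleet size


def shard_for(user: str) -> int:
    # Stable toy hash so results are reproducible across runs; not production routing.
    return sum(user.encode()) % NUM_SHARDS


class Shard:
    """One node's private state: nothing here is visible to other shards."""

    def __init__(self) -> None:
        self.tweets_by_author: dict[str, list[str]] = defaultdict(list)
        self.mentions_of: dict[str, list[str]] = defaultdict(list)


shards = [Shard() for _ in range(NUM_SHARDS)]


def post_tweet(author: str, text: str) -> None:
    # Write 1: the tweet itself, on the author's shard.
    shards[shard_for(author)].tweets_by_author[author].append(text)
    # Extra writes: one index entry per distinct mentioned user, on that user's shard.
    for mentioned in set(re.findall(r"@(\w+)", text)):
        shards[shard_for(mentioned)].mentions_of[mentioned].append(f"{author}: {text}")


def timeline(author: str) -> list[str]:
    # Single-shard read: only the author's shard is consulted.
    return shards[shard_for(author)].tweets_by_author.get(author, [])


def mentions(user: str) -> list[str]:
    # Single-shard read: no scatter-gather across the fleet.
    return shards[shard_for(user)].mentions_of.get(user, [])
```

Both read paths touch exactly one shard. The cost moves to the write path: the index entry usually lands on a different shard from the tweet, so the two writes are not atomic without the distributed-transaction machinery listed under the trade-offs.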
The interview rule

"For shared-nothing scale, every hot query must hit exactly one shard. If a query inherently spans shards, build a denormalized view sharded by that query's access key. Two writes; one read. Always."

06

Real-world

DynamoDB / Cassandra

Shared-nothing by design

Each partition lives on its node(s). No cross-partition coordination at the data layer.

BigQuery / Snowflake

Compute-storage separation

Storage sits on a shared, S3-like layer; compute is a set of shared-nothing nodes pulling slices in parallel. A hybrid twist on the pattern: shared storage, shared-nothing compute.

Kafka brokers

Partition-per-broker

Each partition has one leader broker that, by default, serves its reads and writes. Producers and consumers route directly to that leader, so routing is shared-nothing: no global coordinator on the data path.

CockroachDB / TiDB

Shared-nothing SQL

Range-sharded; each range is replicated by its own Raft group. Cross-range transactions are supported but pay for extra coordination; queries scoped to a single range stay local.

07

Used in problems

Every system in this portfolio that scales beyond one box uses shared-nothing somewhere. URL shortener shards by short-code hash. News feed shards by user_id. Uber shards by geohash. E-commerce federates by domain.
