Concept · Databases

Database Federation

01

Why this matters

One huge database holds everything: users, orders, products, reviews, notifications, sessions. As the company grows, this DB becomes the bottleneck — every team's slow query affects everyone, schema changes touch everyone, scaling means scaling for the noisiest workload.

Database federation (also called functional or vertical sharding) splits the data by function: users go in one DB, orders in another, products in a third. Each tuned for its workload. Each scaled independently. Distinct from horizontal sharding, which splits a single table across nodes by row.

02

Functional vs horizontal split

Horizontal sharding

One table, split by row across N nodes

Users 1 to 1M on shard A, 1M+1 to 2M on shard B. Same schema everywhere. Solves "this table is too big."

Functional federation

Different tables on different DBs

Users live on user-DB. Orders on order-DB. Products on product-DB. Each DB has its own schema. Solves "this monolith is too coupled."

Often used together. Federate first (split monolith into per-domain DBs); shard horizontally if any single domain DB outgrows one node.
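The two kinds of split can be contrasted with a small routing sketch. Everything named here (the database names, the shard boundaries, the two helper functions) is a hypothetical illustration, not a real library API: federation picks a database from the table's domain, while sharding picks a node from the row's key.

```python
from bisect import bisect_right

# Hypothetical database names; a real system would hold DSNs or pools here.
DOMAIN_DBS = {
    "users": "user-db",
    "orders": "order-db",
    "products": "product-db",
}

# Horizontal shards for one domain, keyed by the first user id each shard holds.
# Assumed layout: shard A holds ids 1..1,000,000; shard B holds 1,000,001 and up.
USER_SHARD_STARTS = [1, 1_000_001]
USER_SHARD_NAMES = ["user-db-shard-a", "user-db-shard-b"]


def route_by_function(domain: str) -> str:
    """Functional federation: the table's domain picks the database."""
    return DOMAIN_DBS[domain]


def route_by_row(user_id: int) -> str:
    """Horizontal sharding: the row's key picks the node within one domain."""
    if user_id < USER_SHARD_STARTS[0]:
        raise ValueError(f"user_id {user_id} is below the first shard range")
    # Index of the last shard whose starting id is <= user_id.
    idx = bisect_right(USER_SHARD_STARTS, user_id) - 1
    return USER_SHARD_NAMES[idx]
```

When both are combined, a request would first be routed by function and then, only inside a domain that has been sharded, by row.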

03

What the split unlocks

  • Independent scaling. Read-heavy product catalog needs replicas. Write-heavy order log needs sharding. Solved per-domain.
  • Independent technology. Users in Postgres for ACID. Sessions in Redis for speed. Search in Elasticsearch. Time-series metrics in InfluxDB. Polyglot persistence falls naturally out of federation.
  • Independent ops. Schedule order-DB maintenance for Tuesday and user-DB maintenance for Wednesday. Smaller blast radius per upgrade.
  • Independent ownership. Each team owns its DB. Schema changes don't require cross-team coordination.
  • Smaller indexes. Each DB is smaller, so its indexes are more likely to fit in RAM and queries run faster.

04

What you give up

The price is steep: you can no longer JOIN across DBs. Need "user + their orders"? You fetch from user-DB, then fetch from order-DB, join in app code (or denormalize).
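A minimal sketch of the application-side join, assuming two in-memory SQLite databases stand in for user-DB and order-DB. The table layouts, sample rows, and the `user_with_orders` helper are made up for illustration.

```python
import sqlite3
from typing import Optional

# Two separate connections: no SQL statement can see both databases at once.
user_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])

order_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total_cents INTEGER)"
)
order_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 2500), (11, 1, 990), (12, 2, 400)],
)


def user_with_orders(user_id: int) -> Optional[dict]:
    """Fetch from user-DB, then from order-DB, and join in application code."""
    user = user_db.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if user is None:
        return None
    orders = order_db.execute(
        "SELECT id, total_cents FROM orders WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return {
        "id": user[0],
        "name": user[1],
        "orders": [{"id": o[0], "total_cents": o[1]} for o in orders],
    }
```

Note that the two reads are not a single snapshot: an order written between them can appear for a user whose row was read a moment earlier. That is part of the price of the split.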

Other costs:

  • No multi-DB transactions. Atomic "decrement inventory + create order" across two DBs needs a saga or two-phase commit (2PC).
  • Cross-DB foreign keys don't exist. You can't enforce order.user_id REFERENCES user.id across DBs.
  • More moving pieces — N DBs to monitor, back up, restore.
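The saga option for "decrement inventory + create order" can be sketched as two local transactions plus a compensating action. This is a simplified illustration under stated assumptions: two in-memory SQLite databases stand in for inventory-DB and order-DB, all function names are hypothetical, and the `CHECK (qty <= 2)` rule exists only to give the second step a way to fail. A production saga would also persist its progress so a crash between steps can be recovered.

```python
import sqlite3
from typing import Optional

inventory_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

inventory_db.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
inventory_db.execute("INSERT INTO stock VALUES ('widget', 5)")
inventory_db.commit()

# Hypothetical business rule (at most 2 units per order) so step 2 can fail.
order_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT NOT NULL, "
    "qty INTEGER NOT NULL CHECK (qty <= 2))"
)
order_db.commit()


def reserve_stock(sku: str, qty: int) -> bool:
    """Step 1: local transaction on inventory-DB. True if stock was reserved."""
    with inventory_db:  # commits on success, rolls back on exception
        cur = inventory_db.execute(
            "UPDATE stock SET qty = qty - ? WHERE sku = ? AND qty >= ?",
            (qty, sku, qty),
        )
        return cur.rowcount == 1


def release_stock(sku: str, qty: int) -> None:
    """Compensation for step 1: put the reserved units back."""
    with inventory_db:
        inventory_db.execute(
            "UPDATE stock SET qty = qty + ? WHERE sku = ?", (qty, sku)
        )


def create_order(sku: str, qty: int) -> int:
    """Step 2: local transaction on order-DB. Returns the new order id."""
    with order_db:
        cur = order_db.execute(
            "INSERT INTO orders (sku, qty) VALUES (?, ?)", (sku, qty)
        )
        return cur.lastrowid


def place_order(sku: str, qty: int) -> Optional[int]:
    """Run the saga: reserve, then create; undo the reservation on failure."""
    if not reserve_stock(sku, qty):
        return None
    try:
        return create_order(sku, qty)
    except sqlite3.Error:
        release_stock(sku, qty)  # compensate instead of rolling back
        return None


def stock_of(sku: str) -> int:
    return inventory_db.execute(
        "SELECT qty FROM stock WHERE sku = ?", (sku,)
    ).fetchone()[0]
```

Unlike a real transaction, the reservation is briefly visible to other readers before the compensation runs. Sagas trade atomicity for eventual consistency.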

Anti-pattern

Federating too early. A startup with 5 engineers and one Postgres doesn't need federation — it needs better indexes and connection pooling. Federation is for when team boundaries and scale jointly justify the operational tax.

05

Deep dive — the migration path

Going from one DB to federated takes months. The pattern most companies use:

  1. Identify a domain to extract. Pick something with low coupling (orders, recommendations, search). Avoid the most-joined tables (users) until last.
  2. Wrap access in a service. No more direct SQL from random callers: only the new "Orders Service" reads and writes orders. This forces a clean API boundary before the data moves.
  3. Dual-write transition. New writes go to both old DB and new DB. Reads still hit old DB.
  4. Backfill. Copy historical data from old to new. Verify consistency.
  5. Cut over reads. Switch the service to read from new DB. Old DB becomes write-only.
  6. Stop dual-writes. All writes go only to new DB. Old DB schema is dead.

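Each step ships independently, and every step before the last is easy to roll back: until dual-writes stop, the old DB still holds a complete copy. CDC tools such as Debezium can take over much of steps 3-4 by streaming changes from the database's transaction log (the WAL in Postgres) instead of relying on application-level dual-writes. Even in the best case, expect weeks per domain, which is why the full migration runs to months.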
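Steps 3 to 6 can be sketched as a service whose read and write targets depend on a migration phase. This is a toy model under loud assumptions: plain dicts stand in for the old and new databases, and the `Phase` and `OrdersService` names are hypothetical. A real migration would also need to handle failed second writes and concurrent updates during backfill.

```python
from enum import Enum


class Phase(Enum):
    OLD_ONLY = 1    # before the migration starts
    DUAL_WRITE = 2  # step 3: write both, read old
    READ_NEW = 3    # step 5: write both, read new
    NEW_ONLY = 4    # step 6: old DB no longer written


class OrdersService:
    """The only component allowed to touch order data (step 2)."""

    def __init__(self, old_db: dict, new_db: dict):
        self.old_db = old_db
        self.new_db = new_db
        self.phase = Phase.OLD_ONLY

    def write(self, order_id: int, order: dict) -> None:
        if self.phase is not Phase.NEW_ONLY:
            self.old_db[order_id] = order
        if self.phase is not Phase.OLD_ONLY:
            self.new_db[order_id] = order

    def read(self, order_id: int):
        use_new = self.phase in (Phase.READ_NEW, Phase.NEW_ONLY)
        source = self.new_db if use_new else self.old_db
        return source.get(order_id)

    def backfill(self) -> int:
        """Step 4: copy historical rows the new DB does not have yet."""
        copied = 0
        for order_id, order in self.old_db.items():
            if order_id not in self.new_db:  # keep newer dual-written rows
                self.new_db[order_id] = order
                copied += 1
        return copied

    def consistent(self) -> bool:
        """Verification gate before cutting reads over."""
        return self.old_db == self.new_db
```

Rolling back during the dual-write and read-cutover phases is then just a matter of setting `phase` to an earlier value, because the old store has received every write so far.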

06

Real-world

Amazon (early 2000s)

Service-per-domain

Leadership famously decreed that every team must expose its data only via service APIs, which forced functional federation as a side effect. Often cited as groundwork for AWS.

eBay

Dozens of federated DBs

Auctions, listings, users, payments — each its own DB. Sharded horizontally within each domain.

Shopify

"Pods" architecture

Each pod is a federated stack — DB + workers + cache — serving a subset of merchants. Federation + sharding combined.

Microservices in general

Each owns its DB

The microservice principle "each service has its own DB" is functional federation by another name.

07

Used in problems

E-commerce splits user-DB / order-DB / product-DB / inventory-DB. Uber federates by domain (rides, payments, driver-location, eats). News feed federates posts vs engagement vs feed-cache. Stock trading separates positions, orders, market-data, settlements.
