Multi-Tenancy

01

Why this matters

You're building a B2B SaaS product with 10,000 customer companies, each with its own users, data, and settings. Should each customer get its own database, its own EC2 instances, its own Kubernetes namespace? Or do they all share one stack with logical isolation?

This is the multi-tenancy question. The answer drives infrastructure cost (per-tenant cost can differ by up to roughly 10,000× between the cheapest and the most isolated model), the security model, performance isolation, and onboarding speed (minutes vs days). Every SaaS company has to answer it, and a wrong choice is hard to undo.

02

The three tenancy models

Model | Isolation | Cost per tenant | Onboarding | Best for
Pool (shared everything) | Logical only (tenant_id filter) | Pennies | Instant | Small tenants, freemium SaaS
Bridge (shared compute, dedicated DB) | Compute pooled, data isolated | $10s/mo | Minutes | Mid-market with data residency needs
Silo (dedicated everything) | Full stack per tenant | $100s-1000s/mo | Hours-days | Enterprise with strict isolation requirements

03

Pool model — the default

One database, one app fleet. Every table has a tenant_id column. Every query filters by it. Every cache key includes it. The app fleet is shared — tenant 5's traffic and tenant 8000's traffic land on the same servers.

Wins: dirt cheap — even free tenants have near-zero marginal cost. Onboarding is just INSERT INTO tenants. Single deploy serves everyone.

Pains:

  • Noisy neighbor. One huge tenant (or one slow query) affects everyone. Need per-tenant rate limits and query timeouts.
  • Data leak risk. Forget the WHERE tenant_id = ? predicate in one query → tenant 5 sees tenant 8000's data. Catastrophic. Mitigate with row-level security in Postgres, or repository-pattern wrappers that always inject the filter.
  • Per-tenant operations are hard. Backups, exports, and deletes for one tenant have to pick that tenant's rows out of tables shared with everyone else; there is no per-tenant dump or restore.
  • Compliance friction. "We need our data in the EU" → can't deliver without moving that tenant's data to dedicated storage (bridge or silo).
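
The repository-wrapper mitigation can be sketched as follows. This is a minimal illustration, not a production data layer: the invoices table, the TenantScopedRepo class, and make_demo_db are hypothetical names, and SQLite stands in for the real database. SQLite has no row-level security, so the sketch covers only the wrapper approach; with Postgres you would typically add row-level security as a second line of defense.

```python
import sqlite3


class TenantScopedRepo:
    """Hypothetical wrapper: every query it issues is filtered by one tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add_invoice(self, description: str) -> int:
        # The tenant_id comes from the repo, never from the caller.
        cur = self._conn.execute(
            "INSERT INTO invoices (tenant_id, description) VALUES (?, ?)",
            (self._tenant_id, description),
        )
        return cur.lastrowid

    def list_invoices(self) -> list[tuple[int, str]]:
        # The WHERE tenant_id = ? predicate is injected here, so call sites cannot forget it.
        rows = self._conn.execute(
            "SELECT id, description FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return rows.fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory pool-model table shared by all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, description TEXT NOT NULL)"
    )
    return conn
```

Application code receives a TenantScopedRepo built from the authenticated request and never touches the raw connection, so a missing filter becomes a code-review problem in one class instead of in every query.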

04

Bridge model — the practical compromise

Shared application tier; one database (or schema) per tenant. Best of both?

  • Each tenant's data physically separate → easier compliance, instant per-tenant backup/restore, no risk of cross-tenant leak from a missing WHERE.
  • App tier still shared — one deploy, one fleet, one ops burden.
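  • Database connections multiply: 10k tenants × 5 connections each = 50k connections, far more than a single Postgres server is normally configured to accept. Needs aggressive pooling (PgBouncer in front of the per-tenant DBs) or a move to schemas inside a shared database.
  • Schema migrations are harder: one migration becomes 10k migrations. The migration framework has to run them across every tenant DB and cope with some of them failing partway.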
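
The last two points can be sketched together. This is a toy under stated assumptions: one SQLite file per tenant stands in for one database per tenant, TenantConnections and migrate_all are hypothetical names, tenant names are trusted identifiers, and a small LRU cap stands in for a real pooler such as PgBouncer.

```python
import os
import sqlite3
import tempfile
from collections import OrderedDict


class TenantConnections:
    """One SQLite file per tenant; keeps at most `max_open` connections open (LRU)."""

    def __init__(self, base_dir: str, max_open: int) -> None:
        self._base_dir = base_dir
        self._max_open = max_open
        self._open: "OrderedDict[str, sqlite3.Connection]" = OrderedDict()

    def get(self, tenant: str) -> sqlite3.Connection:
        if tenant in self._open:
            self._open.move_to_end(tenant)  # mark as most recently used
            return self._open[tenant]
        if len(self._open) >= self._max_open:
            _, oldest = self._open.popitem(last=False)  # evict least recently used
            oldest.close()
        conn = sqlite3.connect(os.path.join(self._base_dir, f"{tenant}.db"))
        self._open[tenant] = conn
        return conn

    def open_count(self) -> int:
        return len(self._open)

    def close_all(self) -> None:
        while self._open:
            _, conn = self._open.popitem()
            conn.close()


def migrate_all(conns: TenantConnections, tenants: list[str], ddl: str) -> dict[str, str]:
    """Apply one DDL statement to each tenant DB in turn; return error messages by tenant."""
    failures: dict[str, str] = {}
    for tenant in tenants:
        conn = conns.get(tenant)
        try:
            conn.execute(ddl)
            conn.commit()
        except sqlite3.Error as exc:
            failures[tenant] = str(exc)  # record and continue with the next tenant
    return failures


def demo() -> tuple[dict[str, str], int, list[str]]:
    with tempfile.TemporaryDirectory() as base_dir:
        conns = TenantConnections(base_dir, max_open=2)
        tenants = ["acme", "globex", "initech"]
        ddl = "CREATE TABLE invoices (id INTEGER PRIMARY KEY)"
        first = migrate_all(conns, tenants, ddl)   # fresh DBs: no failures
        open_now = conns.open_count()              # capped at max_open
        second = migrate_all(conns, tenants, ddl)  # table already exists everywhere
        conns.close_all()
        return first, open_now, sorted(second)
```

The second run in demo() shows why migrations across many tenant databases should be idempotent or tracked per tenant: rerunning a plain CREATE TABLE fails on every database that already has it.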

Common variant: shared DB, one schema per tenant (Postgres). Schemas are cheap to create, and a single migration run can loop over every schema, though it still has to be applied to each one. A common pattern in B2B SaaS.

05

Deep dive — the noisy neighbor problem

The pool model's #1 failure: one tenant's behavior wrecks others' experience. Mitigations form a hierarchy:

  1. Rate limit per tenant. Hard cap on RPS; return 429 if exceeded. Token bucket per tenant_id. Prevents an accidental DoS caused by one tenant's bug.
  2. Query timeouts per tenant. Tenant's SQL exceeds 5 seconds → kill it. Without this, a bad query holds DB resources indefinitely.
  3. Resource quotas. Cap each tenant's storage, API calls, and worker minutes. Usage past the quota is throttled automatically.
  4. Bulkhead per tenant. Big tenants get dedicated thread pools or worker queues. Smaller tenants share one pool.
  5. Promote big tenants to silo. When a tenant becomes ≥1% of total traffic, move them to dedicated infrastructure (sometimes with a price increase). Best of both worlds.
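
Step 1 of the hierarchy can be sketched as a token bucket keyed by tenant_id. This is an in-process illustration with hypothetical names (TenantRateLimiter, status_for); a shared fleet would normally keep the buckets in a shared store so that every server sees the same counts. Time is passed in explicitly to keep the example deterministic; real code would likely use time.monotonic().

```python
from dataclasses import dataclass


@dataclass
class _Bucket:
    tokens: float
    last_refill: float


class TenantRateLimiter:
    """Token bucket per tenant_id: refills `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self._rate = rate
        self._capacity = capacity
        self._buckets: dict[str, _Bucket] = {}

    def allow(self, tenant_id: str, now: float) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            # A tenant's first request starts with a full bucket.
            bucket = _Bucket(tokens=self._capacity, last_refill=now)
            self._buckets[tenant_id] = bucket
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - bucket.last_refill)
        bucket.tokens = min(self._capacity, bucket.tokens + elapsed * self._rate)
        bucket.last_refill = now
        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False


def status_for(limiter: TenantRateLimiter, tenant_id: str, now: float) -> int:
    """Translate the limiter decision into the HTTP status the API would return."""
    return 200 if limiter.allow(tenant_id, now) else 429
```

With rate=1.0 and capacity=2.0, three requests from tenant "t5" at the same instant return 200, 200, 429, while a request from "t8000" at that instant still returns 200: one tenant draining its bucket does not touch another tenant's.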

Shopify's "pods" architecture is the textbook example: every shop is in a pod (a complete app + DB stack); pods serve N shops; large shops get their own pod. Combines pool scalability with silo-level isolation per high-value tenant.
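
A pool + silo hybrid of this kind can be sketched as a placement function. This is not Shopify's implementation: tenants_to_promote, assign_pods, the pod naming, and the hash-based placement are all assumptions made for illustration, with the 1% traffic threshold taken from the mitigation list above.

```python
import zlib


def tenants_to_promote(requests_by_tenant: dict[str, int], threshold: float = 0.01) -> list[str]:
    """Return tenants whose share of total requests is at or above `threshold`."""
    total = sum(requests_by_tenant.values())
    if total == 0:
        return []
    return sorted(t for t, n in requests_by_tenant.items() if n / total >= threshold)


def assign_pods(
    requests_by_tenant: dict[str, int], shared_pods: int, threshold: float = 0.01
) -> dict[str, str]:
    """Map each tenant to a dedicated pod (if promoted) or to one of the shared pods."""
    promoted = set(tenants_to_promote(requests_by_tenant, threshold))
    assignment: dict[str, str] = {}
    for tenant in requests_by_tenant:
        if tenant in promoted:
            assignment[tenant] = f"dedicated-{tenant}"
        else:
            # crc32 is stable across processes, unlike Python's built-in hash() for str.
            index = zlib.crc32(tenant.encode("utf-8")) % shared_pods
            assignment[tenant] = f"shared-{index}"
    return assignment
```

A real system would measure traffic over a time window and store the tenant-to-pod mapping explicitly, since hash-modulo placement moves tenants whenever the number of shared pods changes.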

Interview answer

"Default to pool tenancy with strict per-tenant rate limits, query timeouts, and row-level security. Promote large tenants to dedicated stacks (silo) once they exceed 1% of total traffic. Avoids over-provisioning while keeping noisy-neighbor risk bounded."

06

Real-world

Slack

Pool with workspace sharding

Workspaces are sharded across many DB clusters. Very large customers (big enterprises) get dedicated clusters.

Shopify

Pods

Each pod is a full stack serving N shops. Large shops get their own pod. Pool + silo hybrid.

Stripe

Pool with row-level filters

Shared, horizontally sharded datastores plus Redis, with no per-account database. Account-ID filtering everywhere. Per-account rate limits and quotas. Row-level access is strictly audited.

AWS GovCloud

Silo for compliance

Separate AWS GovCloud (US) regions for US government customers, isolated from the commercial regions. The most extreme silo: separate infrastructure and separate accounts.

07

Used in problems

E-commerce platforms (like Shopify) are inherently multi-tenant. A notification system scopes its quotas and delivery channels per tenant. Google Docs treats each user or org as a tenant for sharing and access control.
