Concept · Distributed Systems

Tunable Consistency per Query

01

Why this matters

"What consistency does our database give us?" is the wrong question. Different queries need different guarantees. A user's password change must be strongly consistent (the next login should see the new hash). A like-count display can be eventually consistent (15 seconds of staleness is invisible to users).

Forcing strong consistency on everything is expensive. Forcing weak consistency on everything is unsafe. Tunable per-query consistency — choosing the level at the call site, per request — is how Dynamo-style and modern distributed databases let you have both. The interview-grade answer to "is this CP or AP?" is "it depends per query."

02

The consistency knobs

Two settings per operation, both relative to N replicas:

  • W = how many replicas must acknowledge a write before it returns success.
  • R = how many replicas must respond before a read returns.

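Magic inequality: W + R > N guarantees that every read set overlaps the replicas that acknowledged the latest successful write, so at least one response carries the newest value: strong consistency. W + R ≤ N gives only eventual consistency, because a read can miss every replica that took the write. See quorum.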
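The inequality can be checked by brute force. The sketch below is illustrative only (the class and method names are made up for this example, not taken from any driver): it enumerates every possible write set of size W and read set of size R over N replicas and reports whether they always share at least one replica.

```java
// Illustrative sketch: a set of replicas is a bitmask over N replicas (small N only).
final class QuorumOverlap {
    /** The rule of thumb from the text: W + R > N. */
    static boolean satisfiesInequality(int n, int w, int r) {
        return w + r > n;
    }

    /** Brute force: does every W-sized write set intersect every R-sized read set? */
    static boolean everyReadSeesLatestWrite(int n, int w, int r) {
        for (int writeSet = 0; writeSet < (1 << n); writeSet++) {
            if (Integer.bitCount(writeSet) != w) continue;
            for (int readSet = 0; readSet < (1 << n); readSet++) {
                if (Integer.bitCount(readSet) != r) continue;
                // An empty intersection means the read missed the write entirely.
                if ((writeSet & readSet) == 0) return false;
            }
        }
        return true;
    }
}
```

For N=3, everyReadSeesLatestWrite(3, 2, 2) is true and everyReadSeesLatestWrite(3, 1, 1) is false, matching the inequality. Overlap is only what the inequality buys; a real system still needs timestamps or versions to tell which of the overlapping copies is newest.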

Common settings on N=3:

W          | R          | Property                        | Cost                                | Use for
3          | 1          | Read-your-writes for free       | Slow writes                         | Read-heavy critical data
2 (QUORUM) | 2 (QUORUM) | Strong consistency, balanced    | Both moderate                       | Default for important data
1          | 1          | Eventual                        | Fastest, both                       | Counters, cache, low-stakes data
3 (ALL)    | 3 (ALL)    | Strong + redundant verification | Slow + brittle (one down = blocked) | Audit data (rare)
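The "Cost" column follows from simple arithmetic: an operation that needs W (or R) acknowledgements out of N can lose at most N − W (or N − R) replicas. A minimal sketch, assuming strict quorums with no sloppy-quorum or hinted-handoff fallback; the class and method names are hypothetical:

```java
// Illustrative sketch: how many replica failures each setting survives.
final class QuorumTolerance {
    /** Replicas that may be down while writes still succeed. */
    static int writeFailuresTolerated(int n, int w) {
        return n - w;
    }

    /** Replicas that may be down while reads still succeed. */
    static int readFailuresTolerated(int n, int r) {
        return n - r;
    }

    /** One-line summary of a (W, R) setting on N replicas. */
    static String describe(int n, int w, int r) {
        String mode = (w + r > n) ? "strong" : "eventual";
        return String.format("W=%d R=%d: %s, writes survive %d down, reads survive %d down",
                w, r, mode, writeFailuresTolerated(n, w), readFailuresTolerated(n, r));
    }
}
```

On N=3, QUORUM/QUORUM keeps both reads and writes available with one replica down, while ALL/ALL tolerates none, which is the "one down = blocked" entry in the table.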

03

Per-query application

The real superpower: pick the level per call site, not per database.

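// Cassandra example (DataStax Java driver 4.x; session is a CqlSession, table and column names are illustrative)
session.execute(SimpleStatement.newInstance("INSERT INTO orders (id, total) VALUES (?, ?)", orderId, total)
    .setConsistencyLevel(ConsistencyLevel.QUORUM));  // strong when paired with QUORUM reads
session.execute(SimpleStatement.newInstance("INSERT INTO view_log (user_id, page) VALUES (?, ?)", userId, page)
    .setConsistencyLevel(ConsistencyLevel.ONE));     // eventual

session.execute(SimpleStatement.newInstance("SELECT * FROM user_profile WHERE id = ?", userId)
    .setConsistencyLevel(ConsistencyLevel.QUORUM));  // strong read
session.execute(SimpleStatement.newInstance("SELECT likes FROM like_count WHERE post_id = ?", postId)
    .setConsistencyLevel(ConsistencyLevel.ONE));     // eventual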

Same DB, same connection, same cluster. The application code chooses per query. Each query pays only what it needs.
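One way to keep these choices from scattering across the codebase is a small lookup that maps each call site to the level it needs. The sketch below is an assumption about how an application might organise this, not a feature of any driver; the Level enum, the call-site names, and the QUORUM fallback are all hypothetical.

```java
import java.util.Map;

// Illustrative sketch: one place that records which call site gets which level.
final class ConsistencyPolicy {
    enum Level { ONE, QUORUM, ALL }

    // Hypothetical call-site names mapped to the level each one needs.
    private static final Map<String, Level> LEVELS = Map.of(
            "orders.insert", Level.QUORUM,
            "view_log.insert", Level.ONE,
            "user_profile.select", Level.QUORUM,
            "like_count.select", Level.ONE);

    /** Unknown call sites fall back to QUORUM: safe by default, cheap by opt-in. */
    static Level levelFor(String callSite) {
        return LEVELS.getOrDefault(callSite, Level.QUORUM);
    }
}
```

Defaulting to the stronger level is a design choice: a forgotten entry then costs latency rather than correctness.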

04

Beyond W/R — full consistency menu

Modern systems offer a richer menu than just W/R numbers:

  • Linearizable: strongest. Every operation appears to take effect at a single instant, in real-time order. Cassandra's SERIAL (for lightweight transactions), Spanner default.
  • Sequential / strong-after-read: all clients observe one global order of operations, and each client's own operations appear in the order it issued them. Most apps' actual need.
  • Bounded staleness: "may be up to 5 seconds stale." Predictable upper bound. Useful for "this dashboard updates every 5s" UX.
  • Session consistency / read-your-writes: within one user's session, reads always see that session's earlier writes. Implemented via session-pinned routing or version vectors.
  • Eventual: replicas converge once writes stop, with no specific bound on how long that takes. Cheapest.
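Session consistency can be sketched with a single version token. This is a simplification under stated assumptions: each key carries one monotonically increasing version (a scalar, not a full version vector), and the session remembers the version of its last write. The class and method names are hypothetical.

```java
// Illustrative sketch: read-your-writes with a scalar version token.
final class SessionReads {
    /**
     * Returns the index of the first replica whose copy is at least as new as
     * the session's last write, or -1 if no replica has caught up yet.
     */
    static int firstFreshReplica(long[] replicaVersions, long sessionToken) {
        for (int i = 0; i < replicaVersions.length; i++) {
            if (replicaVersions[i] >= sessionToken) {
                return i;
            }
        }
        return -1; // caller may retry, wait, or escalate to a quorum read
    }
}
```

With replica versions {3, 5, 4} and a session token of 5, only the second replica may serve the read; a session that has written nothing (token 0) can read from any of them.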

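DynamoDB exposes "strong" vs "eventual" reads as a flag per request. Spanner defaults to strong (externally consistent) reads and CockroachDB to serializable transactions, but both allow stale reads, per read or per read-only transaction, for reporting queries. These can be far faster for analytics that don't need real-time data, because a nearby replica can answer without coordinating with the leader.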
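The routing decision behind a bounded-staleness read can be sketched as a single comparison. This is a conceptual model only, not how Spanner or CockroachDB implement it internally; the names and the millisecond clock are assumptions made for the example.

```java
// Illustrative sketch: serve from a follower only if it is fresh enough.
final class BoundedStaleness {
    /** True if a follower that last synced at followerSyncedAtMs may serve a read at nowMs. */
    static boolean followerMayServe(long nowMs, long followerSyncedAtMs, long maxStalenessMs) {
        return nowMs - followerSyncedAtMs <= maxStalenessMs;
    }

    /** Picks the cheap follower when it is within the bound, otherwise the leader. */
    static String route(long nowMs, long followerSyncedAtMs, long maxStalenessMs) {
        return followerMayServe(nowMs, followerSyncedAtMs, maxStalenessMs) ? "follower" : "leader";
    }
}
```

With a 5-second bound, a follower that synced 3 seconds ago serves the read, while one that synced 6 seconds ago forces a trip to the leader.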

05

Deep dive — the routing pattern

Per-query consistency is enabled by a smart driver/coordinator:

  1. App makes call with consistency level annotation.
  2. Coordinator picks N replicas owning the key (via consistent hashing).
  3. Sends request to enough of them to satisfy W or R.
  4. Waits for that many acks; returns.
  5. Async fires off requests to remaining replicas (for read repair / write fan-out beyond W).
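Steps 3 to 5 can be sketched for the read path as follows. This is a simplified, single-threaded model under stated assumptions: replica responses are given as a list already in arrival order, and the newest copy is chosen by timestamp. The class, its fields, and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a coordinator read: wait for R responses, return the newest.
final class CoordinatorRead {
    /** One replica's answer; arrival order is the order in the input list. */
    static final class Response {
        final String replica;
        final String value;
        final long timestamp;

        Response(String replica, String value, long timestamp) {
            this.replica = replica;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Steps 3-4: take the first r responses and return the newest value among them. */
    static Response read(List<Response> inArrivalOrder, int r) {
        if (r < 1 || inArrivalOrder.size() < r) {
            throw new IllegalStateException("not enough replicas responded");
        }
        Response newest = inArrivalOrder.get(0);
        for (Response resp : inArrivalOrder.subList(0, r)) {
            if (resp.timestamp > newest.timestamp) {
                newest = resp;
            }
        }
        return newest;
    }

    /** Step 5: replicas holding an older copy than the winner are read-repair candidates. */
    static List<String> staleReplicas(List<Response> all, Response winner) {
        List<String> stale = new ArrayList<>();
        for (Response resp : all) {
            if (resp.timestamp < winner.timestamp) {
                stale.add(resp.replica);
            }
        }
        return stale;
    }
}
```

If the fastest replica holds an old copy, a read with r = 1 returns it, while r = 2 also waits for the next replica and returns the newer value. That is the per-query trade between latency and freshness.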

The cost shows up in latency. A quorum read (R=2) cannot return until the second-fastest replica responds, so its P99 follows the slow tail of whichever replica completes the quorum rather than the fastest one. This is why hedging and aggressive timeout tuning matter so much.
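The latency effect is an order statistic: if the coordinator contacts all replicas in parallel (an assumption of this sketch), an operation that needs k acknowledgements completes when the k-th fastest replica answers. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch: operation latency is the k-th smallest replica latency.
final class QuorumLatency {
    /** Latency of an operation that needs `needed` acks out of the given replicas. */
    static long latencyMs(long[] replicaLatenciesMs, int needed) {
        if (needed < 1 || needed > replicaLatenciesMs.length) {
            throw new IllegalArgumentException("needed must be between 1 and the replica count");
        }
        long[] sorted = replicaLatenciesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        return sorted[needed - 1];
    }
}
```

With replica latencies of 5 ms, 40 ms, and 7 ms, ONE completes in 5 ms, QUORUM in 7 ms, and ALL in 40 ms: a quorum hides one straggler, while ALL is always exposed to the slowest replica.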

Interview answer

"We tune consistency per query. Critical writes (orders, payments) use QUORUM. High-throughput counters use ONE. User profile reads use QUORUM by default but drop to ONE in batch jobs that don't need real-time. Each query pays for the consistency it actually needs — no global tax."

06

Real-world

Cassandra

Per-query level

11 levels: ONE, TWO, THREE, QUORUM, ALL, LOCAL_QUORUM, EACH_QUORUM, etc. Apps annotate each query.

DynamoDB

Strong vs Eventual flag

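Boolean ConsistentRead parameter on GetItem (also available on Query and Scan). A strongly consistent read consumes about 2× the read-capacity units of an eventually consistent one. Apps choose per request.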
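The 2× figure can be worked through for provisioned capacity, where one read-capacity unit covers one strongly consistent read of up to 4 KB and an eventually consistent read costs half as much. The sketch below covers only that rule; transactional reads and on-demand billing are out of scope, and the class and method names are hypothetical.

```java
// Illustrative sketch: read-capacity units consumed by a single read.
final class ReadCapacity {
    static final int UNIT_BYTES = 4 * 1024;

    static double unitsPerRead(int itemBytes, boolean stronglyConsistent) {
        // Item size is rounded up to the next 4 KB block, with a minimum of one block.
        int blocks = Math.max(1, (itemBytes + UNIT_BYTES - 1) / UNIT_BYTES);
        return stronglyConsistent ? blocks : blocks / 2.0;
    }
}
```

A 3,000-byte item costs 1 unit for a strong read and 0.5 for an eventual one; a 9,000-byte item rounds up to three blocks, so 3 units versus 1.5.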

Spanner / CockroachDB

Bounded staleness queries

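Strongly consistent by default. For analytics, CockroachDB accepts AS OF SYSTEM TIME '-10s' on read-only queries, and Spanner offers exact- and bounded-staleness reads. Both can be served by a nearby replica, which is often far cheaper and fine for non-critical reads.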

MongoDB

Read concern + Write concern

Per-operation. readConcern: "majority", writeConcern: { w: "majority" }. Fine-grained control like Cassandra.

07

Used in problems

News feed mixes QUORUM writes (post creation) with ONE reads (feed display). E-commerce checkout uses QUORUM for orders + ONE for product-view-count updates. Google Docs uses linearizable for OT operations + bounded staleness for "saved" indicators.
