Concept · Observability & Security

Field-Level Encryption

01

Why this matters

"Our database is encrypted." Sounds great. Look closer: encrypted at rest with a single key controlled by the cloud provider. A DBA, a compromised app, an SRE running SELECT * sees plaintext. Encryption-at-rest only protects against someone stealing the disk — almost nobody's actual threat model.

Field-level encryption encrypts specific sensitive fields (SSN, credit card, medical record) with keys the database doesn't have. Even a full DB dump or rogue admin sees ciphertext. Plaintext only exists in app memory while serving an authorized request.

02

The encryption layers

Layer | Protects against | Plaintext visible to
Disk encryption (LUKS, BitLocker) | Stolen physical disk | Anyone with OS access
DB-level encryption-at-rest (TDE) | Stolen DB file, backup | Anyone with DB connection
Column-level encryption | Backup theft, rogue DBA | Anyone with the column key (often app server)
Field-level encryption with KMS | Rogue DBA, compromised app, partial key compromise | Only authorized request handlers with KMS access
End-to-end encryption (E2EE) | Everyone except endpoint owners | Only the data owners; server sees nothing

Each layer adds protection at a cost. Most apps need encryption-at-rest + field-level for sensitive fields. E2EE only when the threat model truly excludes the operator (Signal, WhatsApp messages, password managers).

03

The KMS-based pattern

The standard production approach:

  1. KMS (AWS KMS, Google Cloud KMS, HashiCorp Vault Transit) holds the master key. App can never extract it.
  2. To encrypt: app generates a random per-record data-encryption key (DEK), encrypts the field with AES-GCM, asks KMS to encrypt the DEK with the master key (KEK), stores ciphertext + encrypted DEK in the DB. KEK never leaves KMS.
  3. To decrypt: app reads ciphertext + encrypted DEK from DB, asks KMS to decrypt the DEK, then locally decrypts the field. KEK still never leaves KMS.
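  4. Audit: every KMS decrypt is logged. You can see which principal (app or service role) requested a decrypt and when. To tie a log entry to a specific user or record, the app has to pass those identifiers along with the request, for example as encryption context in AWS KMS.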
  5. Rotate keys: rotate the KEK in KMS without re-encrypting all records (KMS handles the chain). Rotate DEKs by re-encrypting affected records.

Why two keys (envelope encryption): KMS would be a bottleneck if every field had to be sent to it for encryption and decryption. With per-record DEKs, KMS only ever handles a small DEK, once per record, and the bulk encryption of the field data happens locally in the app.
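The five steps above can be sketched in a few lines. This is a minimal illustration, not a production implementation: FakeKMS is a hypothetical in-memory stand-in for a real KMS, EncryptedField, encrypt_field and decrypt_field are made-up names, and the code assumes the third-party cryptography package is installed. Binding the record ID as associated data is an extra assumption that is not part of the pattern as described above.

```python
import os
from dataclasses import dataclass

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


class FakeKMS:
    """Hypothetical stand-in for a real KMS: the KEK never leaves this object."""

    def __init__(self) -> None:
        self._kek = AESGCM.generate_key(bit_length=256)

    def encrypt_dek(self, dek: bytes, context: bytes) -> bytes:
        # Wrap the DEK under the KEK; the nonce is prepended to the result.
        nonce = os.urandom(12)
        return nonce + AESGCM(self._kek).encrypt(nonce, dek, context)

    def decrypt_dek(self, wrapped: bytes, context: bytes) -> bytes:
        nonce, body = wrapped[:12], wrapped[12:]
        return AESGCM(self._kek).decrypt(nonce, body, context)


@dataclass
class EncryptedField:
    """What gets stored in the DB: ciphertext plus the wrapped DEK."""
    nonce: bytes
    ciphertext: bytes
    wrapped_dek: bytes


def encrypt_field(kms: FakeKMS, record_id: str, plaintext: str) -> EncryptedField:
    dek = AESGCM.generate_key(bit_length=256)  # fresh DEK for this record
    nonce = os.urandom(12)                     # must be unique per encryption
    aad = record_id.encode()                   # assumption: bind ciphertext to its record
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext.encode(), aad)
    wrapped_dek = kms.encrypt_dek(dek, aad)    # the only KMS call on the write path
    return EncryptedField(nonce, ciphertext, wrapped_dek)


def decrypt_field(kms: FakeKMS, record_id: str, field: EncryptedField) -> str:
    aad = record_id.encode()
    dek = kms.decrypt_dek(field.wrapped_dek, aad)  # the only KMS call on the read path
    return AESGCM(dek).decrypt(field.nonce, field.ciphertext, aad).decode()


kms = FakeKMS()
stored = encrypt_field(kms, "user-42", "123-45-6789")
recovered = decrypt_field(kms, "user-42", stored)
```

In a real deployment the two FakeKMS methods would be network calls to the KMS, and the app would keep the plaintext DEK in memory only for the duration of the request.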

04

Deep dive — searchable encrypted fields

Encrypted fields lose searchability. WHERE encrypted_email = ... can't match because every encryption produces different ciphertext (semantic security). How do you query?

Three approaches:

1. Deterministic encryption. Same plaintext always produces same ciphertext. WHERE encrypted_email = encrypted('alice@x.com') works. Cost: leaks frequency information — attacker can see "this ciphertext appears 1000 times" and infer it's a common value. Acceptable for high-cardinality fields (emails); dangerous for low-cardinality (zip code).

2. Blind index. Store a keyed HMAC of the plaintext as a separate searchable column, using a key that is distinct from the encryption key. HMAC("alice@x.com") is deterministic; the encrypted value isn't. Search the HMAC column, then decrypt the matching encrypted value. The CipherSweet library implements this for PHP and Node.js.
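A minimal blind-index sketch using only the Python standard library. INDEX_KEY, normalize, blind_index, find_by_email and the in-memory rows list are all hypothetical names for illustration; in practice the index key would come from a secret store and the rows would live in the database, with the ciphertext column produced by randomized encryption.

```python
import hashlib
import hmac
import os

# Hypothetical index key; keep it separate from the encryption keys.
INDEX_KEY = os.urandom(32)


def normalize(email: str) -> str:
    # Normalize before hashing so "Alice@X.com " and "alice@x.com" match.
    return email.strip().lower()


def blind_index(value: str, key: bytes = INDEX_KEY) -> str:
    return hmac.new(key, normalize(value).encode(), hashlib.sha256).hexdigest()


# Stand-in for a table: a searchable HMAC column next to an opaque ciphertext.
rows = [
    {"id": 1, "email_bidx": blind_index("alice@x.com"), "email_ct": b"<ciphertext-1>"},
    {"id": 2, "email_bidx": blind_index("bob@x.com"), "email_ct": b"<ciphertext-2>"},
]


def find_by_email(email: str) -> list:
    # Equivalent of: SELECT id FROM users WHERE email_bidx = :target
    target = blind_index(email)
    return [r["id"] for r in rows if hmac.compare_digest(r["email_bidx"], target)]
```

Calling find_by_email("Alice@X.com ") returns the ID of the first row; the app would then decrypt email_ct for that row only. The blind index is itself deterministic, so the frequency-leak caveat from approach 1 applies to the index column as well.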

3. Encryption schemes designed to be searchable (CryptDB academic work, MongoDB CSFLE). Here the encryption scheme itself supports equality, range, or order queries over ciphertext. This always comes with cryptographic tradeoffs: every searchable scheme leaks more than the unsearchable equivalent.

Practical default: deterministic encryption for primary lookups (user IDs, SSNs), random encryption for the rest, blind indexes when you genuinely need to search.

Interview answer

"Field-level encryption via envelope encryption — KMS holds the master key, per-record data keys encrypt sensitive fields. Search via blind indexes (HMAC) or deterministic encryption depending on cardinality. Rogue admin sees ciphertext; auditor sees every decrypt event in KMS logs."

05

When to skip field-level encryption

  • Your threat model doesn't include rogue admins / compromised app. Disk encryption + good IAM is enough.
  • The data isn't actually sensitive. Tagging non-sensitive fields creates ops burden with no real benefit.
  • You can use tokenization instead. For payment cards, SSNs, etc., outsourcing to a vault provider is operationally simpler.
  • Latency is critical. Each KMS round trip adds ~5-10ms. For a 1ms p99 endpoint serving 100k QPS, this is unacceptable. Cache decrypted DEKs (with care).
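One way to read "cache decrypted DEKs (with care)" is a small in-process cache with a short time-to-live, so repeated reads of the same record skip the KMS round trip. The sketch below uses only the standard library; DekCache, its unwrap callback, and the 300-second default TTL are assumptions for illustration, not a recommendation.

```python
import time
from typing import Callable, Dict, Tuple


class DekCache:
    """Hypothetical TTL cache mapping wrapped DEK -> plaintext DEK."""

    def __init__(
        self,
        unwrap: Callable[[bytes], bytes],      # e.g. a function that calls KMS decrypt
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._unwrap = unwrap
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[bytes, Tuple[float, bytes]] = {}
        self.kms_calls = 0                      # counter for observing hit rate

    def get(self, wrapped_dek: bytes) -> bytes:
        now = self._clock()
        entry = self._entries.get(wrapped_dek)
        if entry is not None and entry[0] > now:
            return entry[1]                     # cache hit: no KMS round trip
        self.kms_calls += 1                     # miss or expired: go to KMS
        dek = self._unwrap(wrapped_dek)
        self._entries[wrapped_dek] = (now + self._ttl, dek)
        return dek
```

The "with care" part is the tradeoff: cached plaintext DEKs sit in app memory for the whole TTL, and cache hits produce no KMS audit entry. With strictly per-record DEKs the cache only helps when the same record is read repeatedly.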

06

Real-world

AWS KMS + DynamoDB Encryption Client

SDK pattern

Library wraps DynamoDB calls; transparently encrypts/decrypts marked fields using KMS-rooted keys.

MongoDB Client-Side Field Level Encryption (CSFLE)

Driver-level

Encryption happens in the MongoDB driver before data hits server. Server stores ciphertext; never sees plaintext.

HashiCorp Vault Transit

Encryption-as-a-service

App calls Vault: "encrypt this string with key X." Vault returns ciphertext. Same for decrypt. Centralized + auditable.

Signal protocol

End-to-end variant

Field-level encryption taken to E2E — server only sees ciphertext, can't decrypt at all. Per-message key ratcheting.

07

Used in problems

Payment gateway field-encrypts card metadata (last 4 digits, expiry). WhatsApp uses E2EE for messages, so neither WhatsApp nor any other intermediary can read the content; only the participants' devices can. E-commerce field-encrypts PII (address, phone) in the order DB.
