03
Exponential backoff + full jitter
The battle-tested recipe:
- Exponential backoff. Wait 100ms, then 200, 400, 800... doubling each retry. Gives the service time to recover.
- Full jitter (AWS Architecture Blog, 2015). Replace the exact wait with random(0, current_backoff). This spreads clients' retries over the whole interval, so there is no synchronized spike.
- Max retries. 3–5 is typical. Beyond that, accept failure.
- Max backoff cap. 30 seconds or so. Don't wait an hour for retry 7.
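To make the knobs above concrete, here is a minimal sketch that lists the upper bound of each retry's wait before jitter is applied. The helper name backoff_ceilings and its parameter names are made up for illustration; the 100 ms initial delay and 30 s cap are the values from the bullets.

```python
def backoff_ceilings(initial=0.1, max_backoff=30.0, retries=10):
    """Upper bound (seconds) of the wait before each retry, before jitter."""
    # Double the ceiling on every retry, but never exceed max_backoff.
    return [min(max_backoff, initial * 2 ** n) for n in range(retries)]

ceilings = backoff_ceilings()
for n, ceiling in enumerate(ceilings, start=1):
    # With full jitter, the actual wait is drawn uniformly from 0..ceiling.
    print(f"retry {n}: wait somewhere in 0..{ceiling:g}s")
```

With these values the cap first applies at the tenth retry (0.1 × 2^9 = 51.2 s would exceed 30 s), well past the 3 to 5 retries suggested above. The cap matters mostly when the initial delay is larger or the retry budget is generous.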
Pseudocode:
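for attempt in 0..max_retries-1:
    try: return call()
    except Transient as err:
        last_error = err
        if attempt == max_retries-1: break   # final attempt: don't sleep, just fail
        base = min(max_backoff, initial * 2^attempt)   # 100ms, 200ms, 400ms, ...
        wait = random(0, base)
        sleep(wait)
throw last_error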
Exponential backoff with jitter
import random
import time

def retry(fn, max_attempts=5, base=0.1, cap=10.0):
    """Full-jitter exponential backoff (AWS recommended)."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # in real code, catch only transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # base × 2^attempt, capped; uniform jitter 0..backoff
            backoff = min(cap, base * (2 ** attempt))
            time.sleep(random.uniform(0, backoff))

# Delays before retries 1-4 (defaults): 0-0.1s, 0-0.2s, 0-0.4s, 0-0.8s
# The fifth attempt is the last one, so no sleep follows it
# Full-jitter outperforms "backoff + small jitter" for contended resources
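A rough way to see why the spread matters is to count how many clients land in the same short time window on a given retry. The sketch below is a toy model, not the simulation from the AWS post: every client is assumed to fail at t=0, "small jitter" is assumed to mean shaving up to 10% off the backoff, and the names peak_load and mode are invented for this example.

```python
import random
from collections import Counter

def peak_load(n_clients, retry_no, mode, base=0.1, bucket=0.01, seed=0):
    """Largest number of clients retrying within the same time bucket.

    Toy model: all clients failed at t=0 and are about to make retry
    number `retry_no` (0-based), so the ceiling is base * 2**retry_no.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    ceiling = base * 2 ** retry_no
    buckets = Counter()
    for _ in range(n_clients):
        if mode == "full":
            wait = rng.uniform(0, ceiling)              # full jitter
        elif mode == "small":
            wait = rng.uniform(0.9 * ceiling, ceiling)  # assumed: up to 10% off
        else:
            wait = ceiling                              # no jitter at all
        buckets[int(wait / bucket)] += 1
    return max(buckets.values())

peaks = {mode: peak_load(1000, retry_no=3, mode=mode)
         for mode in ("none", "small", "full")}
print(peaks)
```

In this model the no-jitter case puts all 1000 clients in a single 10 ms bucket, small jitter squeezes them into the last tenth of the 0.8 s interval, and full jitter spreads them across the whole interval. It only measures peak concurrency; it says nothing about total completion time or wasted calls.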