03
Tuning the breaker
Four parameters:
- Failure threshold — "50% of the last 20 calls failed" OR "10 consecutive failures." Percentage-based thresholds scale with traffic volume better than count-based ones, but they misfire on small samples, hence the minimum call count below.
- Minimum call count — don't trip on 1 failure out of 1 call. Require ≥ 20 calls before judging.
- Cooldown — how long to stay open. Typical: 30s for fast-recovering services, minutes for external APIs.
- What counts as failure — timeouts? 5xx? 4xx? Usually: timeouts + 5xx. 4xx is the client's fault, not the service's.
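As a rough illustration of how the threshold, minimum call count, and failure definition fit together, here is a minimal sketch. The names `is_failure` and `FailureRateWindow` are hypothetical helpers invented for this example, and the defaults simply mirror the numbers quoted above (50% of the last 20 calls, at least 20 calls before judging).

```python
from collections import deque


def is_failure(status_code=None, timed_out=False):
    """Count timeouts and 5xx responses as failures; 4xx is the caller's error."""
    if timed_out:
        return True
    return status_code is not None and 500 <= status_code <= 599


class FailureRateWindow:
    """Tracks the outcomes of the most recent calls (hypothetical helper)."""

    def __init__(self, window_size=20, min_calls=20, failure_rate=0.5):
        self.outcomes = deque(maxlen=window_size)  # True means the call failed
        self.min_calls = min_calls
        self.failure_rate = failure_rate

    def record(self, failed):
        self.outcomes.append(bool(failed))

    def should_trip(self):
        # Too few samples: do not judge, so 1 failure out of 1 call cannot trip.
        if len(self.outcomes) < self.min_calls:
            return False
        return sum(self.outcomes) / len(self.outcomes) >= self.failure_rate


window = FailureRateWindow()
window.record(is_failure(status_code=503))  # a 5xx counts as a failure
window.record(is_failure(status_code=404))  # a 4xx does not
print(window.should_trip())  # False: only 2 calls recorded so far
```

The cooldown is left out here on purpose; it belongs to the state machine rather than to the failure-rate calculation.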
Circuit breaker state machine
import time
from enum import Enum


class State(Enum):
    CLOSED = 1
    OPEN = 2
    HALF_OPEN = 3


class CircuitBreaker:
    def __init__(self, fail_threshold=5, reset_timeout=30):
        self.fail_threshold = fail_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout    # cooldown in seconds
        self.failures = 0
        self.state = State.CLOSED
        self.opened_at = 0.0

    def call(self, fn, *args, **kw):
        if self.state == State.OPEN:
            # time.monotonic() is immune to wall-clock adjustments.
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = State.HALF_OPEN  # cooldown elapsed: allow a probe
            else:
                raise RuntimeError("circuit open")  # fail fast, fn is not called
        try:
            r = fn(*args, **kw)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return r

    def _on_success(self):
        # Any success resets the count and closes the circuit.
        self.failures = 0
        self.state = State.CLOSED

    def _on_failure(self):
        self.failures += 1
        # A failed half-open probe reopens at once: the count was never reset.
        if self.failures >= self.fail_threshold:
            self.state = State.OPEN
            self.opened_at = time.monotonic()
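The transitions that the CircuitBreaker class encodes across call, _on_success, and _on_failure can also be read as a lookup table. The sketch below is only an illustration of that view; the `Event` names and the `next_state` function are hypothetical and do not appear in the class itself.

```python
from enum import Enum


class State(Enum):
    CLOSED = 1
    OPEN = 2
    HALF_OPEN = 3


class Event(Enum):  # hypothetical event names, chosen for this example
    THRESHOLD_REACHED = "threshold_reached"
    COOLDOWN_ELAPSED = "cooldown_elapsed"
    PROBE_SUCCEEDED = "probe_succeeded"
    PROBE_FAILED = "probe_failed"


# Each (state, event) pair that changes the state; anything else is a no-op.
TRANSITIONS = {
    (State.CLOSED, Event.THRESHOLD_REACHED): State.OPEN,
    (State.OPEN, Event.COOLDOWN_ELAPSED): State.HALF_OPEN,
    (State.HALF_OPEN, Event.PROBE_SUCCEEDED): State.CLOSED,
    (State.HALF_OPEN, Event.PROBE_FAILED): State.OPEN,
}


def next_state(state, event):
    """Return the new state, or the current one if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Writing the table out makes it easy to check that every state has a way out: OPEN leaves only through the cooldown, and HALF_OPEN always resolves to CLOSED or OPEN after a single probe.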