Message Brokers Compared

01

Why this matters

Every distributed system eventually needs asynchronous communication. The wrong broker choice leads to lost messages, impossible replay, or a $50k/month AWS bill. Kafka, RabbitMQ, SQS, and Redis pub/sub each solve a different slice of the problem — and interviewers expect you to know which slice.

This comparison gives you a single decision framework so you can justify your choice in 30 seconds during a system design round.

02

Head-to-head comparison

Broker	Ordering	Durability	Throughput	Consumer model	Replay	Use case
Apache Kafka	Per-partition FIFO	Replicated commit log on disk	Millions msg/s per cluster	Pull-based consumer groups	Yes — offset rewind	Event sourcing, CDC, analytics pipelines
RabbitMQ	Per-queue FIFO (no global)	Durable queues + publisher confirms	Tens of thousands msg/s	Push to consumers, ack-based	No — consumed = gone	Task queues, complex routing, RPC
AWS SQS	Best-effort (FIFO queues available)	Managed, replicated across AZs	Nearly unlimited (managed)	Pull-based, visibility timeout	No — once consumed + deleted	Serverless decoupling, Lambda triggers
Redis Pub/Sub	Per-channel publish order	None — in-memory, fire-and-forget	Very high for small payloads	Push to all subscribers	No — missed = lost	Real-time notifications, cache invalidation

Key insight: Kafka is a log; RabbitMQ is a queue; SQS is a managed queue; Redis pub/sub is a broadcast channel. The data model determines everything else.

03

Decision flowchart

Use this mental model during interviews:

Need replay + ordering? → Kafka. Event sourcing, audit logs, CDC streams, analytics — anything where consumers must reprocess history.
Need complex routing (topic/fanout/headers)? → RabbitMQ. Its exchange types (direct, topic, fanout, headers) give it more routing flexibility than the other three brokers compared here.
Serverless / zero-ops? → AWS SQS. No brokers to manage. Cost scales to zero because billing is per request. Pairs with Lambda for event-driven architectures.
Fire-and-forget real-time broadcast? → Redis Pub/Sub. Cache invalidation, presence updates, live scores — when losing the occasional message is acceptable.

The interview answer: "Use Kafka when you need replay + ordering. RabbitMQ for complex routing. SQS for serverless. Redis pub/sub for fire-and-forget real-time."
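
The flowchart can be written down as a short function. This is only a sketch of the mental model above: the function name `pick_broker` and its flag names are made up for illustration, and the questions are checked in the same order as the list.

```python
def pick_broker(needs_replay=False, needs_complex_routing=False,
                wants_zero_ops=False, tolerates_loss=False):
    """Walk the flowchart questions in order and return the first match."""
    if needs_replay:
        return "Kafka"            # replay + ordering
    if needs_complex_routing:
        return "RabbitMQ"         # topic/fanout/headers exchanges
    if wants_zero_ops:
        return "AWS SQS"          # serverless, nothing to operate
    if tolerates_loss:
        return "Redis Pub/Sub"    # fire-and-forget broadcast
    return "undecided"            # none of the four questions applied


audit_log_choice = pick_broker(needs_replay=True)
live_scores_choice = pick_broker(tolerates_loss=True)
```

The ordering matters: a system that needs both replay and zero-ops lands on Kafka here, which matches the priority the list gives to replay.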

04

When to pick what — deeper context

Kafka deep considerations

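Kafka shines when you have multiple consumers that each need the full stream. Consumer groups give you both pub/sub semantics (each group receives every record) and queue semantics (consumers within one group split the partitions between them). Because records are retained by time or size rather than deleted on consumption, a new analytics pipeline can start from the oldest retained offset, which is day-one data only if retention is configured to keep it. Downside: operational complexity (ZooKeeper or KRaft for cluster metadata, partition rebalancing, and usually a separate schema registry).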
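
To make the log model concrete, here is a toy in-memory sketch. It is not the Kafka client API: `TinyLog` and its methods are hypothetical, it models a single partition, and it tracks one offset per group, whereas real Kafka assigns whole partitions to the consumers in a group.

```python
from collections import defaultdict


class TinyLog:
    """Toy single-partition log: reading never removes a record."""

    def __init__(self):
        self.records = []
        self.offsets = defaultdict(int)  # group id -> next offset to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group):
        """Return the next record for this group, or None if caught up."""
        offset = self.offsets[group]
        if offset >= len(self.records):
            return None
        self.offsets[group] = offset + 1
        return self.records[offset]

    def seek(self, group, offset):
        """Move a group's position; rewinding is what makes replay possible."""
        self.offsets[group] = offset


log = TinyLog()
for event in ["order-1", "order-2", "order-3"]:
    log.append(event)

# Queue semantics: two workers in the same group share the work.
worker_a = log.poll("billing")
worker_b = log.poll("billing")

# Pub/sub semantics: a different group sees the full stream from offset 0.
analytics = [log.poll("analytics") for _ in range(3)]

# Replay: rewind the billing group and read the first record again.
log.seek("billing", 0)
replayed = log.poll("billing")
```

The point of the sketch is that consumption only moves a pointer. The records stay in the log, so a late-arriving group or a rewound group can read them again.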

RabbitMQ deep considerations

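RabbitMQ's AMQP model gives you dead-letter exchanges, TTL, and priority queues out of the box; delayed delivery needs either the delayed-message exchange plugin or a TTL plus dead-letter pattern. It's a strong choice for task distribution where you want at-least-once processing with manual acks: an unacked message is redelivered, so consumers should be idempotent. Not ideal for high-throughput streaming: at roughly 50K msg/s you'll likely hit limits, depending on message size and durability settings.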
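
With manual acks, a message is redelivered if the consumer fails before acknowledging it, so the handler can run more than once for the same message. The sketch below simulates that with a toy queue. `TinyQueue` is hypothetical and is not a RabbitMQ client; a raised `RuntimeError` stands in for a consumer that dies before sending its ack.

```python
from collections import deque


class TinyQueue:
    """Toy queue with manual acks: a failed delivery is requeued."""

    def __init__(self, messages):
        self.ready = deque(messages)

    def consume(self, handler):
        while self.ready:
            message = self.ready.popleft()
            try:
                handler(message)            # returning normally counts as an ack
            except RuntimeError:
                self.ready.append(message)  # no ack: requeue for redelivery


processed = []    # side effects that were actually applied
seen_ids = set()  # idempotency guard kept by the consumer
attempts = {}     # delivery count per message, for inspection


def handler(message):
    attempts[message] = attempts.get(message, 0) + 1
    if message not in seen_ids:
        seen_ids.add(message)
        processed.append(message)
    # Simulate a crash after the work is done but before the ack is sent.
    if message == "job-2" and attempts[message] == 1:
        raise RuntimeError("consumer died before ack")


TinyQueue(["job-1", "job-2", "job-3"]).consume(handler)
```

Here `job-2` is delivered twice but its side effect is applied once, because the consumer checks `seen_ids` first. In a real system that guard would usually live in a database or cache rather than in process memory.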

SQS deep considerations

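SQS Standard queues offer at-least-once delivery with best-effort ordering. FIFO queues guarantee ordering within a message group and exactly-once processing (duplicate sends are dropped within a 5-minute deduplication interval), but by default cap at 300 API calls per second per action, which is 3,000 msg/s with batches of 10; high-throughput mode raises this. The visibility timeout model is simple, but a message reappears and can be processed twice if processing exceeds the timeout. Use a dead-letter queue (DLQ) for poison-pill handling.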
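
The duplicate-on-timeout behaviour is easier to see with an explicit clock. The following is a toy model, not the AWS SDK: `TinySqs`, its methods, and the `now` parameter are all hypothetical, and time is passed in by hand instead of read from a real clock.

```python
class TinySqs:
    """Toy queue with a visibility timeout, driven by an explicit clock."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> time at which it becomes visible

    def send(self, message_id):
        self.messages[message_id] = 0

    def receive(self, now):
        """Return one visible message id and hide it, or None if none is visible."""
        for message_id, visible_at in self.messages.items():
            if visible_at <= now:
                self.messages[message_id] = now + self.visibility_timeout
                return message_id
        return None

    def delete(self, message_id):
        self.messages.pop(message_id, None)


queue = TinySqs(visibility_timeout=30)
queue.send("m1")

first = queue.receive(now=0)    # worker A takes m1; hidden until t=30
hidden = queue.receive(now=10)  # nothing visible while A is working
second = queue.receive(now=45)  # A overran the timeout, so m1 reappears
queue.delete("m1")              # the message is gone only once deleted
after_delete = queue.receive(now=100)
```

Both `first` and `second` return the same message, which is the duplicate the paragraph above warns about. The usual mitigations are a timeout longer than the worst-case processing time, or extending the timeout while work is in progress.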

Redis Pub/Sub deep considerations

Zero persistence means that if a subscriber is disconnected when a message is published, that subscriber never sees it. Redis Streams (a different feature from pub/sub) add a stored, append-only log and consumer groups, which makes them Redis's closest equivalent to Kafka. In interviews, clarify whether you mean pub/sub or Streams.
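
The difference between the two models can be sketched without a Redis server. Both classes below are hypothetical toys, not the Redis client API: `TinyPubSub` delivers only to subscribers that exist at publish time, while `TinyStream` keeps entries so a late reader can start from the beginning.

```python
class TinyPubSub:
    """Toy broadcast channel: delivers only to currently connected subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        inbox = []
        self.subscribers.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self.subscribers:
            inbox.append(message)
        return len(self.subscribers)  # how many subscribers received it


class TinyStream:
    """Toy stream: entries are stored, so readers choose where to start."""

    def __init__(self):
        self.entries = []

    def add(self, message):
        self.entries.append(message)

    def read_from(self, index):
        return self.entries[index:]


channel = TinyPubSub()
early = channel.subscribe()
channel.publish("score 1-0")
late = channel.subscribe()       # joins after the first message
channel.publish("score 2-0")

stream = TinyStream()
stream.add("score 1-0")
stream.add("score 2-0")
late_reader = stream.read_from(0)  # a late reader still gets everything
```

The late pub/sub subscriber ends up with only the second score, while the late stream reader gets both. That gap is the whole reason to say which of the two you mean.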

Hybrid patterns

Real systems often combine brokers: Kafka for the event backbone + SQS for individual service task queues + Redis pub/sub for WebSocket fan-out. Name the pattern and the interviewer knows you've built real systems.