Message brokers are the plumbing that lets services communicate without knowing about each other. This article is a gentle introduction — what brokers are, why they matter, and a practical framework for choosing the right one.
The problem
When service A needs to tell service B about something, the naive solution is a direct HTTP call. Problems appear fast:
- If B is down, A’s operation fails
- If B is slow, A’s latency climbs
- If a new consumer is added later, A must be modified to know about it
- If services C and D also care, A has to call all three
A message broker sits between them. A publishes the event. B, C, D subscribe. A doesn’t know who’s listening. The broker handles delivery, buffering, and retries.
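To make the buffering point concrete, here is a minimal in-memory sketch. It is not a real broker: `BufferingBroker`, `publish`, and `drain` are hypothetical names invented for this illustration, and a real broker would add networking, persistence, and acknowledgements. It only shows that A's publish succeeds while B is down and that B catches up later.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy in-memory stand-in for a broker: it holds events until the consumer asks for them. */
final class BufferingBroker {
    private final Queue<String> buffer = new ArrayDeque<>();

    /** Called by service A. Succeeds whether or not any consumer is currently running. */
    void publish(String event) {
        buffer.add(event);
    }

    /** Called by service B when it is up. Returns everything that accumulated, oldest first. */
    List<String> drain() {
        List<String> delivered = new ArrayList<>(buffer);
        buffer.clear();
        return delivered;
    }

    public static void main(String[] args) {
        BufferingBroker broker = new BufferingBroker();
        broker.publish("order-1 placed"); // B is down; A is unaffected
        broker.publish("order-2 placed");
        System.out.println(broker.drain()); // B comes back and catches up
    }
}
```

With a direct HTTP call, both publishes would have failed while B was down; here A never notices.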
The two primary models
Queue (point-to-point). One sender, one receiver per message. The broker distributes work across consumers. Example: job queue for image processing.
Pub/sub (topic-based). One sender, many receivers. Every subscriber gets a copy. Example: “order placed” event consumed by email, analytics, shipping.
Modern brokers blend both. Kafka’s consumer groups let you choose: consumers in the same group split the messages between them (queue-like), while separate groups each receive every message (pub/sub-like).
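The consumer-group idea can be sketched with a toy model. This is a simplification, not how Kafka is implemented: Kafka assigns whole partitions to group members, whereas this sketch hands out individual messages round-robin. `GroupedTopic` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of consumer groups: every group sees every message, members of a group split them. */
final class GroupedTopic {
    // group name -> members of that group; each member is modeled as the list of messages it received
    private final Map<String, List<List<String>>> groups = new LinkedHashMap<>();
    private long published = 0;

    /** Adds a consumer to a group and returns its inbox. */
    List<String> join(String group) {
        List<String> inbox = new ArrayList<>();
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(inbox);
        return inbox;
    }

    /** Delivers one copy per group; inside a group, members take turns (round-robin). */
    void publish(String message) {
        for (List<List<String>> members : groups.values()) {
            int turn = (int) (published % members.size());
            members.get(turn).add(message);
        }
        published++;
    }

    public static void main(String[] args) {
        GroupedTopic topic = new GroupedTopic();
        List<String> worker1 = topic.join("workers");
        List<String> worker2 = topic.join("workers");
        List<String> audit = topic.join("audit");
        topic.publish("m1");
        topic.publish("m2");
        System.out.println(worker1 + " " + worker2 + " " + audit);
    }
}
```

The two workers share the stream between them, while the single-member audit group receives both messages.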
The big three
RabbitMQ: traditional message queue. Strong routing rules (exchanges, bindings). Excellent for work queues and complex routing. Persistence is possible, but a classic queue removes each message once it is acknowledged, so long-term retention and replay are not its strength. Push-based delivery.
Kafka: distributed log. Messages are stored durably on disk in partitioned topics. Consumers pull at their own pace and track offsets. Excellent for high-throughput event streams and replay. Less suited to workloads made up of occasional small messages where per-message latency matters most.
NATS: minimalist messaging. Very fast, very small. Great for internal service communication. Its JetStream subsystem adds persistence.
Three other honorable mentions: Redis Streams (light, handy if already using Redis), AWS SQS/SNS (managed, pay-per-message), Google Pub/Sub (managed, global).
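The log model described for Kafka (records stay on disk, consumers pull and track their own offsets) is what makes replay possible. Below is a rough single-partition sketch in plain Java. `OffsetLog`, `poll`, and `seek` are hypothetical names chosen for illustration and are not the Kafka client API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy single-partition log: reading never removes records, each consumer only moves its own offset. */
final class OffsetLog {
    private final List<String> records = new ArrayList<>();
    private final Map<String, Integer> nextOffset = new HashMap<>(); // consumer name -> next offset to read

    void append(String record) {
        records.add(record);
    }

    /** Returns up to max records from the consumer's current position and advances that position. */
    List<String> poll(String consumer, int max) {
        int from = nextOffset.getOrDefault(consumer, 0);
        int to = Math.min(records.size(), from + max);
        nextOffset.put(consumer, to);
        return new ArrayList<>(records.subList(from, to));
    }

    /** Rewinds (or fast-forwards) a consumer; rewinding to 0 replays the full history. */
    void seek(String consumer, int offset) {
        nextOffset.put(consumer, Math.max(0, Math.min(offset, records.size())));
    }

    public static void main(String[] args) {
        OffsetLog log = new OffsetLog();
        log.append("a");
        log.append("b");
        log.append("c");
        System.out.println(log.poll("email", 2));      // first two records
        System.out.println(log.poll("analytics", 10)); // independent position: all three
        log.seek("email", 0);
        System.out.println(log.poll("email", 10));     // replay from the start
    }
}
```

A classic queue cannot do the last step: once a message is acknowledged it is gone, whereas here the record stays and only the reader's position changes.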
Choosing one
Answer in order:
- Is it a work queue (process-it-once)? → RabbitMQ or SQS
- Event stream with multiple consumers and replay? → Kafka
- Low-latency internal signaling? → NATS
- Already on managed cloud? → the cloud’s native option, usually
The wrong choice: “Kafka for everything because it’s popular”. For most RabbitMQ use cases, Kafka’s strengths (partitioning, durability, high throughput) are overkill, and its operational complexity is a tax.
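"Answer in order" means the first question you answer "yes" to decides. Written as code, the checklist is a chain of early returns. This is only a restatement of the list above: `BrokerChecklist` is a hypothetical name, and the final fallback string is an assumption, since the checklist does not say what to do when nothing matches.

```java
/** The checklist as a first-match-wins function. */
final class BrokerChecklist {
    static String suggest(boolean workQueue, boolean eventStreamWithReplay,
                          boolean lowLatencySignaling, boolean onManagedCloud) {
        if (workQueue) return "RabbitMQ or SQS";
        if (eventStreamWithReplay) return "Kafka";
        if (lowLatencySignaling) return "NATS";
        if (onManagedCloud) return "the cloud's native option";
        return "no clear match"; // assumption: the checklist does not cover this case
    }

    public static void main(String[] args) {
        // A replayable event stream on a managed cloud: the earlier question wins
        System.out.println(suggest(false, true, false, true));
    }
}
```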
Key concepts across brokers
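Delivery guarantees: at-most-once, at-least-once, exactly-once. Exactly-once is almost always a lie at the infrastructure level, because a consumer can crash after processing a message but before acknowledging it, and the broker will then redeliver. Design consumers to be idempotent and treat the guarantee as at-least-once.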
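One common way to make a consumer idempotent is to remember which message IDs it has already handled. The sketch below assumes every message carries a unique ID. `IdempotentConsumer` is a hypothetical name, and the in-memory set stands in for what would need to be a durable store in a real service.

```java
import java.util.HashSet;
import java.util.Set;

/** Handling the same message ID twice has the same effect as handling it once. */
final class IdempotentConsumer {
    private final Set<String> seenIds = new HashSet<>(); // assumption: a real service keeps this durably
    private int emailsSent = 0;

    /** Returns true if the message was processed, false if it was a duplicate and skipped. */
    boolean handle(String messageId) {
        if (!seenIds.add(messageId)) {
            return false; // redelivery: the side effect already happened
        }
        emailsSent++; // stand-in for the real side effect
        return true;
    }

    int emailsSent() {
        return emailsSent;
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        consumer.handle("msg-1");
        consumer.handle("msg-1"); // the broker redelivers after a lost acknowledgement
        System.out.println(consumer.emailsSent()); // 1
    }
}
```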
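Ordering: Kafka preserves order within a partition, and RabbitMQ within a single queue (as long as one consumer reads it and nothing is redelivered). Elsewhere, ordering is often optional or absent; SQS, for example, only orders messages on FIFO queues. If ordering matters, check explicitly.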
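Per-partition ordering is usually turned into per-entity ordering by routing on a key: if every event for one order goes to the same partition, those events stay in sequence. A minimal sketch of the idea follows. It uses `String.hashCode()` purely for illustration (Kafka's own default partitioner uses a different hash function), and `KeyPartitioner` is a hypothetical name.

```java
/** Routes messages with the same key to the same partition. */
final class KeyPartitioner {
    /** Maps a key to a partition index in the range [0, partitionCount). */
    static int partitionFor(String key, int partitionCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        // Every event for order-42 maps to the same partition, so their relative order is kept
        System.out.println(partitionFor("order-42", 6) == partitionFor("order-42", 6)); // true
    }
}
```

Events with different keys may land on different partitions, and no order is implied between those.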
Backpressure: what happens when producers outrun consumers. Some brokers buffer without bound (and eventually run out of memory or disk); others block producers or drop messages. Know your broker’s behavior.
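The drop and block policies can be seen in miniature with a bounded queue from the Java standard library. This is an in-process analogy rather than any particular broker's behavior, and `BoundedBuffer` is a hypothetical name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** A fixed-capacity buffer with two producer-side policies for when it is full. */
final class BoundedBuffer {
    private final BlockingQueue<String> queue;

    BoundedBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Drop policy: returns false at once when the buffer is full. */
    boolean offerOrDrop(String message) {
        return queue.offer(message);
    }

    /** Block policy: waits up to timeoutMillis for space, which slows the producer down. */
    boolean offerOrWait(String message, long timeoutMillis) {
        try {
            return queue.offer(message, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Consumer side: returns the oldest message, or null if the buffer is empty. */
    String poll() {
        return queue.poll();
    }

    public static void main(String[] args) {
        BoundedBuffer buffer = new BoundedBuffer(2);
        buffer.offerOrDrop("m1");
        buffer.offerOrDrop("m2");
        System.out.println(buffer.offerOrDrop("m3")); // false: buffer is full, message dropped
        buffer.poll();                                // the consumer frees a slot
        System.out.println(buffer.offerOrDrop("m3")); // true
    }
}
```

The unbounded alternative would be a queue with no capacity limit, which never rejects a message and simply grows until memory runs out.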
Durability: is the message safe if the broker restarts? Usually yes, but only if it is configured that way: messages and queues marked persistent, replication enabled, and writes synced to disk.
A minimum viable setup
For a Spring Boot service using RabbitMQ:
    import org.springframework.amqp.rabbit.annotation.RabbitListener;
    import org.springframework.stereotype.Component;

    @Component
    public class OrderEventHandler {

        // Invoked for each message that arrives on the "orders.placed" queue
        @RabbitListener(queues = "orders.placed")
        public void onOrderPlaced(OrderPlaced event) {
            // handle
        }
    }

And publishing:

    // Arguments: exchange, routing key, payload
    rabbitTemplate.convertAndSend("orders.exchange", "orders.placed",
            new OrderPlaced(order.getId(), order.getAmount()));

Kafka equivalent:

    // Consumer side
    @KafkaListener(topics = "orders.placed", groupId = "email-service")
    public void onOrderPlaced(OrderPlaced event) { ... }

    // Producer side. Arguments: topic, key, payload
    kafkaTemplate.send("orders.placed", order.getId().toString(), event);

Either way, the consumer side is a method with an annotation. The complexity is in configuration: serialization, error handling, retry policy, dead-letter queues.
What goes wrong
- No idempotency — message redelivery creates duplicates
- No dead-letter queue — bad messages crash consumers in a loop
- Unbounded consumer lag — monitoring never alerts until production breaks
- Schema drift — producer adds a field, consumer can’t parse, everything fails
Each of these has well-known fixes; you just have to know to apply them from day one.
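As one example of such a fix, here is a framework-free sketch of the dead-letter idea: retry a failing message a bounded number of times, then park it instead of looping forever. `DeadLetteringConsumer` is a hypothetical name. In practice you would normally use the broker's or framework's own retry and dead-letter support rather than hand-rolling it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Wraps a handler with bounded retries and a dead-letter list for messages that keep failing. */
final class DeadLetteringConsumer {
    private final Consumer<String> handler;
    private final int maxAttempts;
    private final List<String> deadLetters = new ArrayList<>();

    DeadLetteringConsumer(Consumer<String> handler, int maxAttempts) {
        this.handler = handler;
        this.maxAttempts = maxAttempts;
    }

    /** Tries the handler up to maxAttempts times, then parks the message and moves on. */
    void consume(String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return; // success: nothing more to do
            } catch (RuntimeException e) {
                // a real consumer would log the failure and back off before the next attempt
            }
        }
        deadLetters.add(message); // kept for inspection so later messages are not blocked
    }

    List<String> deadLetters() {
        return deadLetters;
    }

    public static void main(String[] args) {
        DeadLetteringConsumer consumer = new DeadLetteringConsumer(message -> {
            if (message.startsWith("bad")) {
                throw new IllegalArgumentException("cannot parse " + message);
            }
        }, 3);
        consumer.consume("ok-1");
        consumer.consume("bad-2");
        consumer.consume("ok-3"); // still processed: the bad message did not block the queue
        System.out.println(consumer.deadLetters());
    }
}
```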
Closing note
Message brokers solve real problems but introduce new ones — delivery semantics, ordering, failure handling, monitoring. The rule: don’t reach for a broker unless you have one of the problems it solves. Two services that always need to respond synchronously should still call each other directly. Three services where decoupling and reliability outweigh the simplicity of HTTP — that’s where a broker starts to pay for itself.