Message Queues
Message queues let services communicate asynchronously. Instead of one service calling another directly and waiting for a response, it places a message on a queue. The receiving service processes it when it's ready. This decoupling makes systems more resilient — if the receiver is temporarily down, messages wait rather than fail.
Messaging Patterns
Point-to-point delivers each message to exactly one consumer. This is the classic work queue: multiple workers pull from the same queue, and each message is processed once. It's the right pattern for distributing tasks across a pool of workers.
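The competing-consumers behavior can be sketched in-process with Python's standard library (the names here are illustrative, not a real broker API): several workers pull from one queue, and each task is handled exactly once.

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def worker(worker_id):
    # Each worker competes for messages on the shared queue.
    while True:
        task = tasks.get()
        if task is None:            # sentinel: shut down
            tasks.task_done()
            return
        with results_lock:
            results.append((worker_id, task))
        tasks.task_done()

for n in range(100):
    tasks.put(n)

workers = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in workers:
    t.start()
for _ in workers:
    tasks.put(None)                 # one sentinel per worker
for t in workers:
    t.join()

# Every task was processed exactly once, by some worker.
```

Which worker handles which task is nondeterministic; the guarantee is only that each message goes to one of them.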
Publish/subscribe delivers each message to all subscribed consumers. A single event — "order placed" — can trigger inventory updates, email notifications, and analytics processing simultaneously, without the publisher knowing about any of them.
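A minimal in-process sketch of the fan-out (with hypothetical names, not a broker API): each subscriber gets its own copy of every message, and the publisher never references its consumers.

```python
from collections import defaultdict

subscribers = defaultdict(list)     # topic -> list of subscriber inboxes

def subscribe(topic):
    inbox = []
    subscribers[topic].append(inbox)
    return inbox

def publish(topic, message):
    # Deliver a copy of the message to every subscriber of the topic.
    for inbox in subscribers[topic]:
        inbox.append(message)

inventory = subscribe("order.placed")
email = subscribe("order.placed")
analytics = subscribe("order.placed")

publish("order.placed", {"order_id": 42})
# All three consumers see the same event; the publisher knows none of them.
```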
Request/reply uses queues to implement asynchronous RPC. The sender includes a reply-to address with the message; the receiver processes it and sends the response to that address. Useful when you need the decoupling benefits of a queue but still need a response.
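The reply-to mechanism can be sketched with a per-call reply queue and a correlation id to match responses to requests (the doubling server and field names are invented for illustration):

```python
import queue
import threading
import uuid

requests = queue.Queue()

def server():
    while True:
        msg = requests.get()
        if msg is None:
            return
        reply = {"correlation_id": msg["correlation_id"],
                 "result": msg["payload"] * 2}
        msg["reply_to"].put(reply)   # send response to the reply-to address

threading.Thread(target=server, daemon=True).start()

def call(payload):
    reply_to = queue.Queue()
    corr_id = str(uuid.uuid4())
    requests.put({"payload": payload,
                  "correlation_id": corr_id,
                  "reply_to": reply_to})
    reply = reply_to.get(timeout=5)
    # The correlation id confirms this reply answers our request.
    assert reply["correlation_id"] == corr_id
    return reply["result"]

answer = call(21)
```

In a real broker the reply-to address is a queue name rather than an object reference, but the correlation-id handshake is the same.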
Delivery Guarantees
This is where messaging gets nuanced.
At-most-once means a message may be lost but will never be delivered twice. Fast, but unsuitable when every message matters.
At-least-once means every message will be delivered, but duplicates are possible. This is the most common guarantee and the right default for most systems — but it requires your consumers to be idempotent (processing the same message twice produces the same result as processing it once).
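Idempotency under at-least-once delivery is often implemented by tracking processed message ids. A toy sketch (in production the seen-set would live in durable storage, not memory):

```python
processed_ids = set()
balance = 0

def handle(message):
    global balance
    if message["id"] in processed_ids:
        return                       # duplicate: already applied, skip
    balance += message["amount"]
    processed_ids.add(message["id"])

deposit = {"id": "msg-001", "amount": 50}
handle(deposit)
handle(deposit)   # redelivered duplicate causes no double-credit
```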
Exactly-once is what everyone wants and almost no system truly provides. Most implementations achieve it through at-least-once delivery combined with idempotent processing or deduplication on the consumer side.
Reliability
Message persistence writes messages to disk so they survive broker restarts. There's a throughput cost, but the alternative is data loss during outages.
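A toy sketch of the idea, assuming an append-only log file (the file name and JSON-lines format are made up for illustration): fsync before acknowledging the publish is exactly where the throughput cost comes from.

```python
import json
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "messages.log")

def persist(message):
    with open(log_path, "a") as f:
        f.write(json.dumps(message) + "\n")
        f.flush()
        os.fsync(f.fileno())         # durability: data is on disk before we ack

def recover():
    # What a restarted broker would reload after a crash.
    with open(log_path) as f:
        return [json.loads(line) for line in f]

persist({"id": 1, "body": "hello"})
persist({"id": 2, "body": "world"})
recovered = recover()
```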
Acknowledgments let the broker know a message has been successfully processed. If the consumer crashes before acknowledging, the broker redelivers the message to another consumer.
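The ack/redelivery cycle can be sketched with an in-flight table (hypothetical broker internals, simplified to one consumer): unacked messages go back on the queue when the consumer disappears.

```python
import queue

ready = queue.Queue()
in_flight = {}                       # delivered but not yet acknowledged

def deliver():
    msg = ready.get()
    in_flight[msg["id"]] = msg
    return msg

def ack(msg_id):
    del in_flight[msg_id]            # safe to forget the message now

def requeue_unacked():
    # e.g. the consumer connection dropped before acknowledging
    for msg in in_flight.values():
        ready.put(msg)
    in_flight.clear()

ready.put({"id": "m1", "body": "task"})
msg = deliver()
# ... consumer crashes here, never calls ack(msg["id"]) ...
requeue_unacked()
redelivered = deliver()              # the same message, offered again
ack(redelivered["id"])
```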
Dead letter queues capture messages that repeatedly fail processing. Rather than blocking the queue or losing the message, the broker moves it aside for inspection and manual intervention. This prevents one bad message from stalling an entire pipeline.
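A sketch of the dead-lettering decision, assuming a retry counter carried on the message (`MAX_ATTEMPTS`, the poison message, and the handler are all invented for illustration):

```python
import queue

MAX_ATTEMPTS = 3
main_q = queue.Queue()
dead_letter_q = queue.Queue()

def handler(msg):
    if msg["body"] == "poison":
        raise ValueError("cannot process")

def consume_once():
    msg = main_q.get()
    try:
        handler(msg)
    except Exception:
        msg["attempts"] = msg.get("attempts", 0) + 1
        if msg["attempts"] >= MAX_ATTEMPTS:
            dead_letter_q.put(msg)   # park it for human inspection
        else:
            main_q.put(msg)          # retry later

main_q.put({"body": "poison"})
while not main_q.empty():
    consume_once()

# The bad message lands in the DLQ after MAX_ATTEMPTS failures,
# and the main queue keeps flowing.
```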
Retry with backoff handles transient failures by waiting progressively longer between attempts. Exponential backoff with jitter prevents thundering herds when a downstream service recovers.
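One common variant is "full jitter": each retry waits a random duration between zero and an exponentially growing cap. A sketch with illustrative parameters:

```python
import random

BASE = 0.1      # seconds; initial backoff window
CAP = 30.0      # never wait longer than this

def backoff_delay(attempt):
    # Window doubles each attempt, capped; the actual delay is random
    # within it, which spreads out retries from many clients.
    return random.uniform(0, min(CAP, BASE * (2 ** attempt)))

delays = [backoff_delay(a) for a in range(8)]
```

Without the jitter, every client that failed at the same moment would retry at the same moment, recreating the load spike that a recovering service can least afford.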
RabbitMQ
RabbitMQ is a widely deployed message broker that implements the AMQP protocol. Its routing model is built around three concepts:
Exchanges receive messages and route them to queues based on rules. A direct exchange routes by exact key match. A topic exchange supports pattern matching. A fanout exchange broadcasts to all bound queues.
Queues store messages until consumers retrieve them. They can be durable (survive restarts), exclusive (single consumer), or have TTLs (messages expire after a duration).
Bindings connect exchanges to queues with routing rules. This separation of routing from storage gives RabbitMQ considerable flexibility in how messages flow through the system.
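To make topic-exchange pattern matching concrete, here is a sketch of the matching rule in plain Python — `*` matches exactly one dot-separated word, `#` matches zero or more. This mimics AMQP topic semantics for illustration; it is not RabbitMQ code.

```python
def topic_match(pattern, routing_key):
    def match(p, k):
        if not p:
            return not k              # both exhausted: match
        if p[0] == "#":
            # '#' may consume zero or more remaining words
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

assert topic_match("orders.*.created", "orders.eu.created")
assert topic_match("orders.#", "orders.eu.created.v2")
assert not topic_match("orders.*", "orders.eu.created")
```

A binding with pattern `orders.#` on a topic exchange would receive every message whose routing key starts with `orders.`, while `orders.*.created` is pinned to exactly three words.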
Observability
Queues are a blind spot if you don't instrument them. The key metrics to track:
- Queue depth — how many messages are waiting. A consistently growing queue means consumers can't keep up.
- Consumer lag — the delay between when a message is published and when it's processed. This is the latency your users actually experience.
- Processing rate — messages consumed per second. Compare this against the publish rate to understand whether you're keeping up, falling behind, or have headroom.
- Dead letter queue depth — messages that failed processing. This should be zero in steady state; any non-zero value warrants investigation.
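The first three signals above can be derived from two counters and a timestamp per message. A sketch with hypothetical hooks (a real system would export these to a metrics backend rather than keep them in memory):

```python
import time

published = 0
consumed = 0
publish_times = {}      # message id -> publish timestamp
lags = []               # observed end-to-end latencies

def on_publish(msg_id):
    global published
    published += 1
    publish_times[msg_id] = time.monotonic()

def on_consume(msg_id):
    global consumed
    consumed += 1
    lags.append(time.monotonic() - publish_times.pop(msg_id))

for i in range(5):
    on_publish(i)
for i in range(3):
    on_consume(i)

queue_depth = published - consumed    # 2 messages still waiting
max_lag = max(lags)                   # worst latency a consumer saw
```

Processing rate is just the consumed counter sampled over a time window, compared against the published counter over the same window.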
See metrics for how to instrument these signals and logging for the event patterns that make queue processing debuggable.
When to Use Something Else
Message queues aren't the only option for async communication. Event streaming platforms like Apache Kafka retain an ordered log of events that multiple consumers can replay independently — better suited for event sourcing, audit trails, and high-throughput data pipelines. Managed services like Amazon SQS remove operational overhead at the cost of fewer features and vendor coupling.