AlgoDaily - What Is a Message Queue?

One Pager Cheat Sheet

A message queue is an asynchronous channel where producers send messages to a broker for later processing by consumers, decoupling services, absorbing traffic spikes, and improving resilience so systems can hand off work without waiting and users see “it just works.”
Queues provide decoupling, buffering, load leveling, retry, and fan-out—patterns that underpin microservices and serverless systems—and implement backpressure by growing when consumers lag, signaling the need to scale workers or throttle producers.
Queues let producers proceed without waiting for consumers: the broker buffers work, so a producer can return as soon as its enqueue is acknowledged. This enables asynchronous processing, failure isolation, and higher throughput, but it depends on queue capacity and policies (a full queue pushes back on producers) and requires you to handle ordering (FIFO), at-least-once delivery, durability, idempotency, and eventual consistency.
Messaging systems have three components: Producer (creates messages), Broker (persists, routes, delivers), and Consumer (receives; can join a consumer group). They rely on acknowledgment (Ack) to signal that processing finished and avoid redelivery, and they support delivery modes such as at-least-once (may duplicate), at-most-once (may drop), and exactly-once (hard, platform-constrained).
Delivery guarantees span At-least-once (duplicates possible—build consumer idempotency; e.g., SQS Standard may deliver duplicates and reorder), FIFO/Exactly-once-ish (per-queue deduplication such as SQS FIFO's deduplication windows), and Exactly-once processing (achieved with transactions + idempotent producers in Kafka Streams/apps, with important trade-offs).
Exactly-once guarantees are neither free nor universal: achieving exactly-once requires broker-specific primitives (e.g., idempotent producers, transactions, sequence numbers) plus extra coordination and persistent state, which introduces performance and operational costs, often only applies within a bounded scope (not end-to-end without two-phase commit or application-level idempotency), and still places a correctness burden on client design to avoid duplicates.
A minimal producer/consumer demo can use Python's standard library alone (no external deps) to demonstrate buffering, retries, and ack-like behavior.
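One way such a demo might look is sketched below. It is an illustration rather than the lesson's original code: `run_demo`, the simulated failure of job 2, and the use of `task_done()` as the "ack" are all assumptions made for the example.

```python
import queue
import threading


def run_demo(num_jobs=5, max_attempts=3):
    """Run one producer and one consumer over a bounded in-process queue."""
    jobs = queue.Queue(maxsize=2)  # small buffer, so the producer blocks when it is full
    processed = []                 # job ids that completed, in completion order
    attempts = {}                  # job id -> number of processing attempts

    def producer():
        for job_id in range(num_jobs):
            jobs.put(job_id)       # blocks while the buffer is full
        jobs.put(None)             # sentinel: no more work

    def handle(job_id):
        attempts[job_id] = attempts.get(job_id, 0) + 1
        # Hypothetical flaky step: job 2 fails on its first attempt only.
        if job_id == 2 and attempts[job_id] == 1:
            raise RuntimeError("transient failure")
        processed.append(job_id)

    def consumer():
        while True:
            job_id = jobs.get()
            if job_id is None:
                jobs.task_done()
                break
            for _ in range(max_attempts):
                try:
                    handle(job_id)
                    break
                except RuntimeError:
                    continue       # retry in place
            jobs.task_done()       # ack-like: mark this item as finished

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed, attempts
```

Calling `run_demo()` returns the processed job ids plus a per-job attempt count, so you can see that the flaky job was retried before it was marked done.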
Because a broker can't know whether a consumer finished processing until it receives an ack, it treats unacknowledged messages (or those past a visibility timeout) as potentially unprocessed and will redeliver them to prevent message loss, which yields at-least-once delivery—so consumers must handle duplicate delivery via idempotent processing or deduplication.
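The redelivery rule above can be sketched with a toy in-memory broker. `TinyBroker`, its method names, and the explicit `now` argument (used instead of a real clock so the behavior is deterministic) are hypothetical and do not mirror any real broker's API.

```python
class TinyBroker:
    """In-memory queue with a visibility timeout (illustrative only)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.ready = []      # message ids waiting for delivery, in FIFO order
        self.in_flight = {}  # message id -> time at which it becomes visible again
        self.bodies = {}     # message id -> body
        self.next_id = 0

    def send(self, body):
        msg_id = self.next_id
        self.next_id += 1
        self.bodies[msg_id] = body
        self.ready.append(msg_id)
        return msg_id

    def receive(self, now):
        # Unacked messages whose timeout has passed go back to the ready list.
        expired = sorted(m for m, deadline in self.in_flight.items() if deadline <= now)
        for msg_id in expired:
            del self.in_flight[msg_id]
            self.ready.append(msg_id)
        if not self.ready:
            return None
        msg_id = self.ready.pop(0)
        self.in_flight[msg_id] = now + self.visibility_timeout
        return msg_id, self.bodies[msg_id]

    def ack(self, msg_id):
        # Acknowledged messages are deleted and never redelivered.
        self.in_flight.pop(msg_id, None)
        self.bodies.pop(msg_id, None)
```

A consumer that receives a message and never acks it will see the same message again once the timeout elapses, which is exactly why handlers need to tolerate duplicates.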
FIFO enforces ordering, and partitions let you scale by splitting a stream while preserving order within each partition. In the model popularized by Kafka, consumers in the same group divide the partitions among themselves, so consumer groups provide horizontal scaling with automatic rebalancing: you add nodes to increase throughput, at the cost that ordering is guaranteed only within a partition, not across the whole stream.
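A rough sketch of the two ideas, key-based partitioning and sharing partitions across a consumer group, follows. The function names are made up for illustration, CRC32 stands in for whatever hash a real client uses, and round-robin is only one of several assignment strategies.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a message key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions round-robin across the consumers in one group."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment
```

Because the same key always lands in the same partition, all events for one key stay ordered relative to each other. Calling `assign_partitions` again with a longer consumer list imitates a rebalance after a node joins.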
Protocols like AMQP (with exchanges, queues, and bindings — e.g. AMQP 0-9-1 in RabbitMQ), MQTT (a lightweight pub/sub for IoT) and cloud HTTP APIs determine message-routing semantics (e.g. fanout, topic, direct), QoS levels, and interoperability across languages.
AMQP models messaging as broker-mediated, with first-class exchanges, queues, and bindings, so the broker applies routing logic (e.g., direct, fanout, topic, headers) via exchange/binding rules. This provides flexible, server-side routing that decouples producers from consumers.
RabbitMQ is a mature broker, widely deployed on-premises and in the cloud, that supports AMQP and MQTT. You declare exchanges, bind queues, publish, and ack; it offers both classic queues and stream-like features; and it follows the core mental model publishers → exchange (routing) → queue → consumers, with dead-letter exchanges used for retries.
Producers always publish to an exchange via basic.publish; the default exchange ("") only makes it look like publishing directly to a queue. Exchanges fundamentally decouple producers from queues to enable routing, pub/sub, dead-lettering, and other features, and UI/client conveniences hide but do not remove the exchange layer.
Apache Kafka is a distributed commit log built around partitions and retention that delivers high throughput, durable history and replay for both streaming pipelines and classic messaging, while Kafka Streams provides stateful processing with exactly-once semantics when configured correctly—operators must still manage replication, partitions, and compaction policies.
Kafka is fundamentally an append-only, durable log where each topic is split into partitions (ordered sequences of messages with monotonically increasing offsets stored in segment files), achieving durability via local persistence and replication (e.g., acks=all, min.insync.replicas, ISR), enabling replayable history, high throughput, and fault tolerance, with data kept according to cleanup.policy (retention.ms/retention.bytes or compact).
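To make the log model concrete, here is a toy single-partition log in which appends return offsets and readers can start from any offset. `PartitionLog` is a hypothetical class for illustration; it ignores segments, replication, and retention entirely.

```python
class PartitionLog:
    """Append-only list of records addressed by monotonically increasing offsets."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Store a record and return the offset it was written at."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records (offset, value) pairs starting at offset."""
        chunk = self._records[offset:offset + max_records]
        return list(enumerate(chunk, start=offset))
```

Reading never removes anything, so a consumer that remembers its own offset can resume where it left off or rewind to 0 and replay the whole history.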
Redis Streams provide append-only logs via commands like XADD/XREAD and support consumer groups with XREADGROUP, giving ordered entries with IDs, pending lists, and claiming of stuck deliveries. They offer more control than simple pub/sub and are great for lightweight pipelines if you already run Redis, but watch memory usage and persistence settings (AOF/RDB).
AWS SQS offers Standard queues with high throughput, at-least-once delivery, and best-effort ordering (duplicates possible), and FIFO queues for ordered, exactly-once processing via deduplication windows (no duplicate inserts within the dedupe interval); all use HTTP APIs and are ideal for decoupling serverless functions, running background jobs, and handling retries with visibility timeouts.
Amazon SQS Standard queues offer high throughput and at-least-once delivery but only best-effort ordering (due to distribution, duplicates, concurrent ReceiveMessage consumers, retries/visibility timeouts and batching), while FIFO queues give guaranteed ordering within a message group ID plus deduplication/exactly-once processing (with lower throughput); if you must use Standard, include sequence numbers and make consumers idempotent.
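The "sequence numbers plus idempotent consumers" advice could be sketched as a small resequencer. This assumes the producer stamps each message with a gap-free sequence number starting at 0; `resequence` is a hypothetical helper, not an SQS feature.

```python
def resequence(messages):
    """Return payloads in sequence order, dropping duplicates.

    messages: iterable of (seq, payload) pairs, possibly reordered or repeated.
    """
    buffered = {}   # seq -> payload for messages that arrived early
    next_seq = 0    # next sequence number that may be released
    ordered = []
    for seq, payload in messages:
        if seq < next_seq or seq in buffered:
            continue  # already released or already buffered: a duplicate
        buffered[seq] = payload
        while next_seq in buffered:
            ordered.append(buffered.pop(next_seq))
            next_seq += 1
    return ordered
```

In a long-running consumer the buffer and `next_seq` would have to live in durable storage, and a missing sequence number would stall release until it arrives.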
Google Cloud Pub/Sub uses topics and subscriptions where subscribers must ack to prevent redelivery; it avoids simultaneous delivery of the same message to multiple subscribers of a single subscription, deletes acknowledged messages asynchronously, and supports global fan-out, event ingestion, and managed push/pull subscriptions.
Apache Pulsar separates serving (brokers) from storage (Apache BookKeeper), enabling segment-tiered storage and independent scaling, and making it a strong fit for multi-tenancy, geo-replication, and very long retention with offloading, unlike Kafka's tighter storage–compute coupling.
Pulsar achieves a separation of serving and storage by having brokers perform client-facing serving while BookKeeper bookies persist messages in replicated ledgers for durable, replicated storage and offloading to object stores, enabling independent scaling, lightweight brokers, long retention, and simpler operations compared with systems that couple compute and storage.
NATS uses subjects for ultra-fast messaging, while JetStream adds persistence, replay, and configurable QoS. Together they give you both low-latency fire-and-forget and durable streams, which makes the pair well suited to control-plane and edge messaging.
Point-to-point uses one queue with many workers (competing consumers), so each message goes to exactly one worker. Pub/Sub publishes to a topic with fan-out, so each subscription receives its own copy, and each subscription typically uses ack + redelivery.
Follow the order Receive message → Process message → Commit/Save side effects → Ack. Brokers often use at-least-once delivery, so side effects must be durably persisted before you send the ack; if you ack first and then crash before saving, the work is lost. Handle possible redelivery with idempotency, the outbox pattern/transactions, visibility timeout or lease renewal, and a dead-letter queue for poison messages.
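The difference between the two patterns can be shown with plain collections. Both function names are invented for this sketch, and the round-robin hand-off is a simplification: with real competing consumers, whichever worker is free takes the next message.

```python
from collections import deque


def point_to_point(messages, workers):
    """Each message is handed to exactly one worker (competing consumers)."""
    delivered = {w: [] for w in workers}
    pending = deque(messages)
    turn = 0
    while pending:
        worker = workers[turn % len(workers)]
        delivered[worker].append(pending.popleft())
        turn += 1
    return delivered


def pub_sub(messages, subscriptions):
    """Every subscription receives its own copy of every message (fan-out)."""
    return {s: list(messages) for s in subscriptions}
```

With three messages and two workers, point-to-point splits the work between them, while pub/sub gives each subscription the full list.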
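The ordering described above fits in a few lines. `consume_one` and its three callbacks are placeholders for whatever your client library and storage layer provide; the only point is that the ack comes last.

```python
def consume_one(message, process, commit, ack):
    """Handle one message, acknowledging only after side effects are saved."""
    result = process(message)  # pure work: compute what should change
    commit(result)             # durably save side effects; may raise
    ack(message)               # reached only if commit succeeded
```

If `commit` raises, the ack is never sent, so the broker will redeliver the message later instead of silently dropping the work.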
In AMQP-ish Routing, since we can’t import RabbitMQ libs here, we’ll model routing decisions with standard collections.
A topic exchange implements pattern-based (wildcard) routing: it compares a message's routing key to each queue's binding key pattern and delivers to every queue whose pattern matches. Binding keys use token-based wildcards over dot-separated words, where * matches exactly one word and # matches zero or more words (so # alone is a catch-all, and placements like a.#.b are supported). This is not regular-expression matching, and it is what distinguishes topic exchanges from direct, fanout, and headers exchanges.
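The matching rule can be modeled without any RabbitMQ library. The sketch below is an approximation for learning purposes: `topic_matches` and `route` are hypothetical names, bindings are represented as simple (queue, pattern) pairs, and edge cases such as empty routing keys are not treated specially.

```python
def topic_matches(pattern, routing_key):
    """Return True if a topic binding pattern matches a routing key.

    '*' matches exactly one dot-separated word; '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            return j == len(k)          # pattern used up: key must be too
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, j2) for j2 in range(j, len(k) + 1))
        if j == len(k):
            return False                # words left in pattern, none in key
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def route(bindings, routing_key):
    """Return the sorted names of queues whose binding pattern matches."""
    return sorted(q for q, pattern in bindings if topic_matches(pattern, routing_key))
```

For example, with bindings `("audit", "#")`, `("eu", "orders.eu.*")`, and `("all_orders", "orders.#")`, the routing key `orders.eu.created` reaches all three queues, while `orders.us.created` skips `eu`.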
Because at-least-once can duplicate deliveries, make consumers idempotent (e.g., use keys, dedupe tables, or a transactional outbox); even with Kafka's exactly-once features, end-to-end guarantees still depend on the whole pipeline.
Idempotent handlers ensure the same result on repeated processing by making operations deterministic or guarding them with deduplication/transactional mechanisms: idempotency keys with a dedupe table or unique constraint, upsert semantics, a transactional outbox, or tracking last_processed_id. The system's observable state and external side effects then remain unchanged despite at-least-once delivery, which is also what preserves end-to-end guarantees when you rely on Kafka's exactly-once features.
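As one possible shape for the dedupe-table approach, the sketch below uses an in-memory SQLite database from the standard library. The table names, `make_store`, and `apply_credit` are invented for the example; the key idea is that the dedupe record and the side effect commit in the same transaction.

```python
import sqlite3


def make_store():
    """Create an in-memory database with a dedupe table and a balances table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
    return db


def apply_credit(db, message_id, account, amount):
    """Apply a credit once per message_id; return False for a duplicate delivery."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            # The primary key rejects a message_id that was already handled.
            db.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            cur = db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (amount, account),
            )
            if cur.rowcount == 0:
                db.execute(
                    "INSERT INTO balances (account, amount) VALUES (?, ?)",
                    (account, amount),
                )
        return True
    except sqlite3.IntegrityError:
        return False
```

Delivering the same message twice leaves the balance unchanged after the first call, because the second insert into `processed` fails and the whole transaction rolls back.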
When work fails, retry with backoff, and after N tries send to a dead-letter queue (DLQ) for inspection; many systems also use a visibility timeout so messages not acked within the window are automatically redelivered (classic in SQS or via ack deadlines).
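Retry-then-dead-letter logic might look like the following. The helper names, the default delays, and the plain list used as a DLQ are assumptions for illustration; production code would usually add jitter and let the broker own the dead-letter queue.

```python
import time


def backoff_delays(base, factor, max_attempts):
    """Delays to wait between attempts (exponential, no jitter)."""
    return [base * factor ** i for i in range(max_attempts - 1)]


def process_with_retry(message, handler, max_attempts, dead_letters,
                       sleep=time.sleep, base=0.1, factor=2):
    """Try handler up to max_attempts times, then park the message in dead_letters."""
    delays = backoff_delays(base, factor, max_attempts)
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True
        except Exception as exc:
            if attempt == max_attempts - 1:
                # Out of attempts: keep the message and the error for inspection.
                dead_letters.append((message, repr(exc)))
                return False
            sleep(delays[attempt])
```

Passing a custom `sleep` (for example a list's `append`) makes the wait schedule easy to inspect in tests without actually pausing.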
A hands-on demo can use Go channels to implement in-process queues and fan-out workers with no external libraries.
Go channels are in-process, typed queues (chan T) provided by the runtime. They give compile-time type safety and built-in synchronization via the blocking semantics of buffered and unbuffered channels (providing automatic backpressure and per-sender FIFO ordering); they compose easily with goroutines and select into patterns like fan-out/fan-in, pipelines, and worker pools; and they support close/range and context.Context for lifecycle and coordination, with low latency and low overhead. However, they are in-process only, lack persistence and distributed delivery, and can cause deadlocks or goroutine leaks if misused, so external queues are needed for cross-process or durable workloads.
Pick Redis Streams for simple streams + groups (if already on Redis); Kafka for massive throughput, replay, partitions, stream processing; SQS for managed queueing with HTTP + serverless; Pub/Sub for fan-out at global scale with managed subs; RabbitMQ for AMQP routing, varied protocols, easy ops; Pulsar for multi-tenant systems with storage decoupled from brokers; and NATS + JetStream for ultra-low latency control plane with optional durability.
Maintain secure endpoints (use TLS), enforce credentials/roles and least privilege, monitor lag, consumer errors, and DLQ rates, scale via partitions, consumer groups, and sharding, and ensure backups and retention for compliance and replay.
