Handling Asynchronous Failures with SQS Dead Letter Queues

Naveen Teja

2/27/2026

In message-driven architectures, consumers (such as Lambda functions or ECS worker tasks) pull messages from an Amazon SQS queue. If a message is malformed or a downstream database is temporarily unavailable, the consumer fails to process it. The message then becomes visible again once its visibility timeout expires and is redelivered, over and over, until the queue's retention period runs out (4 days by default).

This 'poison pill' scenario wastes consumer capacity on work that can never succeed and, on a FIFO queue, can stall every message behind it in the same message group. To resolve this, AWS provides dead-letter queues (DLQs). A DLQ is simply a secondary SQS queue that acts as a quarantine zone. Once a message has been received more times than the configured maxReceiveCount without being deleted, SQS automatically moves it to the DLQ.

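Engineers can then set up alarms on the DLQ to trigger alerts, inspect the failed messages, fix the underlying bug, and use DLQ redrive (the StartMessageMoveTask API, or the "Start DLQ redrive" action in the SQS console) to send the messages back to the main queue for reprocessing. Note that the redrive policy itself works in the other direction: it is the setting on the main queue that routes failed messages into the DLQ. Here is the Terraform to link a main queue to a DLQ.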

sqs-dlq.tf
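# Main queue that consumers read from.
resource "aws_sqs_queue" "main_queue" {
  name = "order-processing-queue"

  # Route a message to the DLQ once it has been received
  # more than 5 times without being deleted.
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.dlq.arn
    maxReceiveCount     = 5
  })
}

# Quarantine queue that holds messages the consumers could not process.
resource "aws_sqs_queue" "dlq" {
  name = "order-processing-dlq"
}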
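
The alerting step mentioned above could be wired up in the same file. The following is a minimal sketch, not a definitive setup: the SNS topic, the alarm name, and the 5-minute period are assumptions made for illustration and are not part of the original configuration. It watches ApproximateNumberOfMessagesVisible on the DLQ and fires as soon as at least one message is sitting there.

```hcl
# Hypothetical SNS topic for notifications (assumed name; subscribe
# email, chat, or paging endpoints to it separately).
resource "aws_sns_topic" "dlq_alerts" {
  name = "order-processing-dlq-alerts"
}

# Alarm when the DLQ holds at least one message.
resource "aws_cloudwatch_metric_alarm" "dlq_not_empty" {
  alarm_name        = "order-processing-dlq-not-empty"
  alarm_description = "Messages are waiting in the order-processing DLQ"

  namespace   = "AWS/SQS"
  metric_name = "ApproximateNumberOfMessagesVisible"
  dimensions = {
    QueueName = aws_sqs_queue.dlq.name
  }

  statistic           = "Maximum"
  period              = 300 # seconds; an assumed evaluation window
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0

  # Treat gaps in the metric as "no problem" rather than as an alarm.
  treat_missing_data = "notBreaching"

  alarm_actions = [aws_sns_topic.dlq_alerts.arn]
}
```

The queue-depth metric is used here instead of NumberOfMessagesSent because messages that SQS moves into a DLQ after failed processing are not counted as sends, so a send-based alarm would likely stay silent.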