Real-Time Log Analytics with Amazon Kinesis Data Firehose

Naveen Teja

2/27/2026

Modern observability requires centralizing logs from hundreds of microservices into a searchable, analytics-ready format. Relying solely on CloudWatch Logs can make deep querying slow and expensive at scale. A common architecture streams these logs to an indexing engine such as Amazon OpenSearch Service or to a data lake in Amazon S3.

Amazon Kinesis Data Firehose (renamed Amazon Data Firehose in 2024) is a fully managed service for delivering streaming data to destinations such as S3 and OpenSearch Service in near real time. It scales automatically to match the throughput of your data and handles buffering, compression, and format conversion (for example, converting JSON to Apache Parquet) before delivery.
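As a rough illustration of the format conversion mentioned above, the sketch below shows a delivery stream that converts JSON records to Parquet. It assumes AWS provider v5 or later, one JSON object per incoming record, and three resources defined elsewhere: `aws_iam_role.firehose_role`, `aws_s3_bucket.log_archive`, and a Glue table named `aws_glue_catalog_table.app_logs` (a hypothetical name) that describes the log schema. The stream name is also a placeholder, and the role would additionally need permission to read the Glue table.

```hcl
# Sketch only: resource and table names below are illustrative assumptions.
resource "aws_kinesis_firehose_delivery_stream" "log_stream_parquet" {
  name        = "application-logs-to-parquet" # hypothetical name
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn   = aws_iam_role.firehose_role.arn
    bucket_arn = aws_s3_bucket.log_archive.arn

    # Format conversion requires a buffer size of at least 64 MB.
    buffering_size     = 64
    buffering_interval = 60

    data_format_conversion_configuration {
      # Parse each incoming record as JSON.
      input_format_configuration {
        deserializer {
          open_x_json_ser_de {}
        }
      }

      # Write the converted records as Parquet.
      output_format_configuration {
        serializer {
          parquet_ser_de {}
        }
      }

      # The column names and types come from a Glue Data Catalog table.
      schema_configuration {
        database_name = aws_glue_catalog_table.app_logs.database_name
        table_name    = aws_glue_catalog_table.app_logs.name
        role_arn      = aws_iam_role.firehose_role.arn
      }
    }
  }
}
```

Note that this sketch leaves `compression_format` unset: with conversion enabled, compression is normally handled by the Parquet serializer rather than by the stream-level setting.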

Setting up a Firehose delivery stream involves defining a source (such as a CloudWatch Logs subscription filter) and a destination. The following Terraform configuration provisions a Firehose stream that buffers incoming logs until 60 seconds have elapsed or 5 MB has accumulated, whichever comes first, and then writes them GZIP-compressed to an S3 bucket for querying with Athena.

kinesis-firehose.tf
resource "aws_kinesis_firehose_delivery_stream" "log_stream" {
  name        = "application-logs-to-s3"
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn   = aws_iam_role.firehose_role.arn
    bucket_arn = aws_s3_bucket.log_archive.arn

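    # Flush when 5 MB has accumulated or 60 seconds have passed, whichever comes first.
    # AWS provider v5+ names these buffering_size / buffering_interval (formerly buffer_*).
    buffering_size     = 5
    buffering_interval = 60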

    compression_format = "GZIP"
  }
}
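
The configuration above references an IAM role and an S3 bucket without defining them, and it does not show the source side. The following is a minimal sketch of those supporting pieces, not the article's original setup: the bucket name, role names, policy names, and the `log_group_name` variable are all hypothetical, and the permissions are a reasonable baseline rather than a hardened policy.

```hcl
# Sketch only: every name below that is not in the article is an assumption.

variable "log_group_name" {
  description = "Hypothetical: the CloudWatch Logs group to stream from"
  type        = string
}

resource "aws_s3_bucket" "log_archive" {
  bucket = "example-log-archive-bucket" # placeholder; bucket names are globally unique
}

# Role that Firehose assumes to write into the bucket.
resource "aws_iam_role" "firehose_role" {
  name = "firehose-log-delivery"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "firehose.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "firehose_to_s3" {
  name = "firehose-to-s3"
  role = aws_iam_role.firehose_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ]
      Resource = [
        aws_s3_bucket.log_archive.arn,
        "${aws_s3_bucket.log_archive.arn}/*"
      ]
    }]
  })
}

# Role that CloudWatch Logs assumes to put records into the delivery stream.
resource "aws_iam_role" "cwl_to_firehose" {
  name = "cwl-to-firehose"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "logs.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "cwl_put_records" {
  name = "cwl-put-records"
  role = aws_iam_role.cwl_to_firehose.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["firehose:PutRecord", "firehose:PutRecordBatch"]
      Resource = aws_kinesis_firehose_delivery_stream.log_stream.arn
    }]
  })
}

# The source: forward every event in the log group to the delivery stream.
resource "aws_cloudwatch_log_subscription_filter" "to_firehose" {
  name            = "all-events-to-firehose"
  log_group_name  = var.log_group_name
  filter_pattern  = "" # an empty pattern matches all log events
  destination_arn = aws_kinesis_firehose_delivery_stream.log_stream.arn
  role_arn        = aws_iam_role.cwl_to_firehose.arn
}
```

One design point to check before relying on this: CloudWatch Logs already gzip-compresses the payloads it sends to Firehose, so combining a subscription filter with `compression_format = "GZIP"` can produce doubly compressed objects. Depending on how the data will be queried, it may be simpler to leave stream-level compression off or to decompress the records in the stream first.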