
Improving throughput of serverless streaming workloads for Kafka

Compute Blog



This article provides guidance on optimizing AWS Lambda for processing high-volume Apache Kafka streams, with a focus on throughput and scaling.

  • Use Provisioned Mode instead of on-demand mode for bursty workloads to get predictable, fast scaling
  • Apply ESM filtering to drop irrelevant records before Lambda invocation, reducing cost and concurrency
  • Configure batch window and batch size to process more records per invocation and improve efficiency
  • Optimize handler code by reducing per-record work and increasing memory allocation, which also raises the CPU share Lambda assigns to the function
  • Monitor OffsetLag, Duration, Concurrency, and Errors metrics to detect issues and guide tuning
  • A single provisioned poller can process up to 5 MB/s of Kafka data
  • Follow an iterative optimization loop: baseline, filter, batch, speed up, test spikes, alert, re-evaluate
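
As a rough illustration of how the filtering, batching, and Provisioned Mode points above fit together, the sketch below sizes the poller count from the 5 MB/s per-poller figure and assembles the keyword arguments for an event source mapping. The function name, cluster ARN, topic, filter pattern, batch values, and the 20% headroom are all hypothetical placeholders, not values from the article. The parameter names follow the boto3 `create_event_source_mapping` call as best understood here and should be checked against the current documentation.

```python
import json
import math

# Per-poller ceiling quoted in the summary above.
POLLER_THROUGHPUT_MBPS = 5.0


def pollers_needed(peak_mbps: float, headroom: float = 1.2) -> int:
    """Estimate pollers for a peak ingest rate.

    The 20% default headroom is an assumption for illustration only.
    """
    return max(1, math.ceil(peak_mbps * headroom / POLLER_THROUGHPUT_MBPS))


def build_esm_config(function_name: str, cluster_arn: str,
                     topic: str, peak_mbps: float) -> dict:
    """Assemble keyword arguments for an event source mapping (not sent anywhere)."""
    # Hypothetical filter: only forward records whose JSON value has status "active".
    pattern = json.dumps({"value": {"status": ["active"]}})
    return {
        "FunctionName": function_name,
        "EventSourceArn": cluster_arn,
        "Topics": [topic],
        "StartingPosition": "LATEST",
        # Larger batches and a short window mean more records per invocation.
        "BatchSize": 500,
        "MaximumBatchingWindowInSeconds": 5,
        # Records that do not match are dropped before the function is invoked.
        "FilterCriteria": {"Filters": [{"Pattern": pattern}]},
        # Provisioned Mode: keep a floor of pollers and cap the ceiling.
        "ProvisionedPollerConfig": {
            "MinimumPollers": 1,
            "MaximumPollers": pollers_needed(peak_mbps),
        },
    }


config = build_esm_config(
    function_name="orders-processor",  # hypothetical
    cluster_arn="arn:aws:kafka:us-east-1:123456789012:cluster/example/abc",  # hypothetical
    topic="orders",  # hypothetical
    peak_mbps=40,
)
```

With credentials in place, the dictionary could be passed on as `boto3.client("lambda").create_event_source_mapping(**config)`. The batch size and window are starting points to revisit after each pass through the optimization loop.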

The article emphasizes that effective Kafka-Lambda optimization requires understanding the poll-filter-batch-invoke workflow and using observability metrics to drive configuration decisions.
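
On the invoke side of that workflow, the handler receives a whole batch at once. The following is a minimal sketch of a batch-oriented handler, assuming the Kafka event shape in which `records` maps each topic-partition key to a list of records with a base64-encoded `value`. The `bulk_write` sink is a hypothetical stand-in for whatever downstream store is used. The idea it illustrates is reducing per-record work by decoding in one pass and writing once per batch.

```python
import base64
import json


def decode_record(record: dict) -> dict:
    """Decode one Kafka record value, assumed to be base64-encoded JSON."""
    return json.loads(base64.b64decode(record["value"]))


def bulk_write(items: list) -> int:
    """Hypothetical sink: a real one would issue a single batched write."""
    return len(items)


def handler(event, context):
    """Process every record in the batch with one downstream call."""
    payloads = []
    # Each key is a topic-partition; each value is that partition's record list.
    for records in event.get("records", {}).values():
        payloads.extend(decode_record(r) for r in records)
    # One write for the whole batch instead of one per record.
    written = bulk_write(payloads)
    return {"processed": written}


def make_record(payload: dict) -> dict:
    """Helper for local testing: wrap a payload the way the event carries it."""
    return {"value": base64.b64encode(json.dumps(payload).encode()).decode()}


sample_event = {
    "records": {
        "orders-0": [make_record({"id": 1}), make_record({"id": 2})],
        "orders-1": [make_record({"id": 3})],
    }
}
result = handler(sample_event, None)
```

Because the handler is a plain function, it can be timed locally against sample events of different batch sizes before the event source mapping is tuned.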





Related articles

  • Mar 21, 2024: Build an end-to-end serverless streaming pipeline with Apache Kafka on Amazon MSK using Python
  • Jan 16, 2024: Real-time serverless data ingestion from your Kafka clusters into Amazon Timestream using Kafka Connect
  • Aug 2, 2024: Improve Apache Kafka scalability and resiliency using Amazon MSK tiered storage
  • Jun 3, 2024: Optimize write throughput for Amazon Kinesis Data Streams
