Improving throughput of serverless streaming workloads for Kafka
Compute Blog
This article provides guidance on optimizing AWS Lambda for processing high-volume Apache Kafka streams, with a focus on throughput and scaling.
- Use Provisioned Mode rather than on-demand mode for bursty workloads, so that scaling is fast and predictable
- Apply event source mapping (ESM) filtering to drop irrelevant records before Lambda is invoked, which reduces cost and concurrency
- Configure the batch window and batch size so that each invocation processes more records
- Optimize handler code by reducing per-record work, and increase memory allocation to get more CPU
- Monitor the OffsetLag, Duration, Concurrency, and Errors metrics to detect issues and guide tuning
- A single provisioned poller can process up to 5 MB/s of Kafka data
- Follow an iterative optimization loop: baseline, filter, batch, speed up, test spikes, alert, re-evaluate
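The filtering, batching, and provisioned-poller points above can be sketched together as one event source mapping configuration. This is a minimal sketch under stated assumptions, not code from the article: the function name, cluster ARN, topic, the `status` filter field, and every numeric value are hypothetical, and the key names follow the boto3 `create_event_source_mapping` request shape as best understood here, so they should be checked against the current Lambda API reference. The poller count is derived from the 5 MB/s per-poller figure quoted in the list.

```python
import json
import math

# From the summary above: one provisioned poller handles up to 5 MB/s.
POLLER_MB_PER_S = 5.0


def pollers_needed(peak_mb_per_s: float) -> int:
    """Smallest number of pollers whose combined ceiling covers the given rate."""
    return max(1, math.ceil(peak_mb_per_s / POLLER_MB_PER_S))


def build_esm_params(function_name, cluster_arn, topic,
                     baseline_mb_per_s, peak_mb_per_s):
    """Assemble keyword arguments for an event source mapping request.

    All values are illustrative assumptions, not recommendations.
    """
    # Hypothetical filter: only records whose JSON value has status == "ERROR"
    # reach the function; everything else is dropped before invocation.
    pattern = {"value": {"status": ["ERROR"]}}
    return {
        "FunctionName": function_name,
        "EventSourceArn": cluster_arn,
        "Topics": [topic],
        "StartingPosition": "LATEST",
        "BatchSize": 500,                     # records per invocation (assumed)
        "MaximumBatchingWindowInSeconds": 5,  # wait up to 5 s to fill a batch (assumed)
        "FilterCriteria": {"Filters": [{"Pattern": json.dumps(pattern)}]},
        "ProvisionedPollerConfig": {
            "MinimumPollers": pollers_needed(baseline_mb_per_s),
            "MaximumPollers": pollers_needed(peak_mb_per_s),
        },
    }


params = build_esm_params(
    function_name="orders-consumer",  # hypothetical
    cluster_arn="arn:aws:kafka:us-east-1:123456789012:cluster/demo/abc",  # hypothetical
    topic="orders",  # hypothetical
    baseline_mb_per_s=4.0,
    peak_mb_per_s=22.0,
)
print(json.dumps(params["ProvisionedPollerConfig"]))
# prints {"MinimumPollers": 1, "MaximumPollers": 5}
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("lambda").create_event_source_mapping(**params)`; the call is left out so the sketch runs offline.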
The article emphasizes that effective Kafka-Lambda optimization requires understanding the poll-filter-batch-invoke workflow and using observability metrics to drive configuration decisions.
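The poll-filter-batch-invoke workflow can be illustrated with a toy model that counts how many records and invocations result from a given filter and batch size. This is only a sketch for building intuition: the function `poll_filter_batch_invoke`, the record shape, and the numbers are hypothetical, and the model ignores the batch window and payload size limits that also trigger an invocation in the real service.

```python
from typing import Callable, Iterable, List


def poll_filter_batch_invoke(
    records: Iterable[dict],
    keep: Callable[[dict], bool],
    batch_size: int,
    handler: Callable[[List[dict]], None],
) -> dict:
    """Toy model of the workflow: count polled records, delivered records, invocations."""
    polled = 0
    delivered = 0
    invocations = 0
    batch: List[dict] = []
    for record in records:        # poll
        polled += 1
        if not keep(record):      # filter: dropped records never reach the handler
            continue
        batch.append(record)      # batch
        if len(batch) == batch_size:
            handler(batch)        # invoke
            invocations += 1
            delivered += len(batch)
            batch = []
    if batch:                     # flush the final partial batch
        handler(batch)
        invocations += 1
        delivered += len(batch)
    return {"polled": polled, "delivered": delivered, "invocations": invocations}


# Hypothetical stream: 1,000 records, every tenth one is an error.
stream = [{"id": i, "status": "ERROR" if i % 10 == 0 else "OK"} for i in range(1000)]
seen: List[int] = []  # batch sizes observed by the stand-in handler

stats = poll_filter_batch_invoke(
    stream,
    keep=lambda r: r["status"] == "ERROR",
    batch_size=25,
    handler=lambda b: seen.append(len(b)),
)
print(stats)
# prints {'polled': 1000, 'delivered': 100, 'invocations': 4}
```

Running the same stream with `keep=lambda r: True` and `batch_size=1` yields 1,000 invocations, which shows why filtering and batching reduce cost and concurrency.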
Related articles
- Mar 21, 2024: Build an end-to-end serverless streaming pipeline with Apache Kafka on Amazon MSK using Python
- Jan 16, 2024: Real-time serverless data ingestion from your Kafka clusters into Amazon Timestream using Kafka Connect
- Aug 2, 2024: Improve Apache Kafka scalability and resiliency using Amazon MSK tiered storage
- Jun 3, 2024: Optimize write throughput for Amazon Kinesis Data Streams