Run Apache Spark Structured Streaming jobs at scale on Amazon EMR Serverless
Big Data Blog
The article covers running Apache Spark Structured Streaming jobs on Amazon EMR Serverless and highlights the enhancements that support processing streaming data in near real time:
- New 'Streaming' job mode introduced in EMR 7.1 for simplified streaming job submission
- Kinesis connector with enhanced fan-out support for improved read throughput
- Fine-grained scaling capabilities using dynamic allocation
- Built-in Availability Zone resiliency and automatic job retry mechanisms
- Advanced log management with rotation and compression
- Integration with Amazon Managed Service for Prometheus for monitoring
- Support for Kinesis Data Streams, Amazon MSK, and self-managed Kafka clusters
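As a rough illustration of the streaming job mode and retry behavior listed above, the sketch below builds a request for the EMR Serverless `StartJobRun` API using boto3 parameter names (`mode`, `retryPolicy`, `jobDriver`). The helper names `build_streaming_job_request` and `submit`, the default of 5 failed attempts per hour, and every ID, ARN, and S3 path are hypothetical placeholders, not values from the article.

```python
from typing import Any, Dict


def build_streaming_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    max_failed_attempts_per_hour: int = 5,  # assumed value, tune per workload
) -> Dict[str, Any]:
    """Build keyword arguments for an EMR Serverless StartJobRun call."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        # Job mode for long-running streaming workloads.
        "mode": "STREAMING",
        # Retry limit for streaming jobs, expressed as failures per hour.
        "retryPolicy": {"maxFailedAttemptsPerHour": max_failed_attempts_per_hour},
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                # Set explicitly here for illustration only.
                "sparkSubmitParameters": "--conf spark.dynamicAllocation.enabled=true",
            }
        },
    }


def submit(request: Dict[str, Any]) -> str:
    """Submit the request; needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so building a request has no dependency

    client = boto3.client("emr-serverless")
    return client.start_job_run(**request)["jobRunId"]


if __name__ == "__main__":
    # Hypothetical placeholder identifiers.
    request = build_streaming_job_request(
        application_id="00example123",
        execution_role_arn="arn:aws:iam::111122223333:role/example-job-role",
        entry_point="s3://example-bucket/scripts/streaming_job.py",
    )
    print(request)
```

Keeping request construction separate from submission makes the payload easy to inspect or unit test before anything is sent to AWS.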
Together, these features position EMR Serverless as a scalable option for real-time data processing, with a focus on performance, cost-effectiveness, and simplified infrastructure management.