
Run Apache Spark Structured Streaming jobs at scale on Amazon EMR Serverless

Big Data Blog



The article explains how to run Apache Spark Structured Streaming jobs on Amazon EMR Serverless and highlights the enhancements that support processing streaming data in near real time:

  • New 'Streaming' job mode introduced in EMR 7.1 for simplified streaming job submission
  • Enhanced Kinesis connector with fan-out support for improved throughput
  • Fine-grained scaling capabilities using dynamic allocation
  • Built-in Availability Zone resiliency and automatic job retry mechanisms
  • Advanced log management with rotation and compression
  • Integration with Amazon Managed Service for Prometheus for monitoring
  • Support for Kinesis Data Streams, Amazon MSK, and self-managed Kafka clusters

Together, these features let teams process data in near real time on EMR Serverless without managing the underlying infrastructure, with controls for tuning performance and cost.
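As an illustration of the 'Streaming' job mode, dynamic allocation, and retry features listed above, here is a minimal sketch of the request a client might send to the EMR Serverless StartJobRun API. It is not taken from the article: the function name `build_streaming_job_request`, the executor limits, and the retry value are assumptions chosen for illustration, and the application ID, role ARN, and script location are placeholders to be supplied by the caller.

```python
from typing import Any, Dict


def build_streaming_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
) -> Dict[str, Any]:
    """Build keyword arguments for a StartJobRun call in streaming mode.

    Hypothetical helper: all concrete values below are illustrative.
    """
    # Standard Spark dynamic allocation settings; the limits are assumptions.
    spark_conf = [
        "--conf spark.dynamicAllocation.enabled=true",
        "--conf spark.dynamicAllocation.minExecutors=1",
        "--conf spark.dynamicAllocation.maxExecutors=10",
    ]
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        # Selects the streaming job mode instead of the default batch mode.
        "mode": "STREAMING",
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": " ".join(spark_conf),
            }
        },
        # Retry limit for a long-running job; the value 5 is an assumption.
        "retryPolicy": {"maxFailedAttemptsPerHour": 5},
    }
```

With AWS credentials configured, the resulting dictionary could be passed to boto3 as `boto3.client("emr-serverless").start_job_run(**request)`. Building the request separately from the call keeps the settings easy to inspect before anything is submitted.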





Related articles

Jun 9, 2026: Build stateful streaming applications with Apache Spark 4.0 on Amazon EMR Serverless
Sep 4, 2024: Use Apache Spark on Amazon EMR Serverless directly from Amazon SageMaker Studio
Jun 4, 2024: Introducing Amazon EMR Serverless Streaming jobs for continuous processing on streaming data
Jun 9, 2026: Run Interactive Workloads on Amazon EMR Serverless with Spark Connect
