Big Data Blog
This article announces Spark Connect support on Amazon EMR Serverless, enabling interactive PySpark development from local environments while executing at scale on EMR Serverless.
- Develop and debug Spark applications locally in IDEs, Jupyter notebooks, or SageMaker without deploy-and-check cycles
- Client-server architecture separates application code from the Spark engine, which communicate over a secure gRPC connection protected by TLS
- Each session has a unique ARN, enabling per-session IAM permissions, cost allocation, and CloudTrail auditing
- Pay only for compute resources during active sessions; automatic scaling via dynamic resource allocation
- Supports interactive ETL, dbt-spark, Iceberg analytics, S3 Tables, and exploratory data analysis
- Available in EMR release 7.13.0 (Spark 3.5.6) and later across all AWS regions
- No additional charge; pricing matches standard EMR Serverless batch job model
Spark Connect on EMR Serverless simplifies PySpark development by bridging local development and production-scale execution with automatic infrastructure management and fine-grained access controls.
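To make the client-server split concrete, here is a minimal sketch of how a local PySpark client attaches to a remote Spark Connect endpoint. It relies only on the standard Spark Connect connection string (`sc://host:port/;key=value`) and `SparkSession.builder.remote`, both part of open-source PySpark 3.4 and later. The host name, the port, and the use of a bearer token are assumptions for illustration; how EMR Serverless exposes the session endpoint and credentials is not covered in this summary, so check the EMR Serverless documentation for the actual values.

```python
from typing import Optional


def build_connect_url(host: str, port: int = 443, token: Optional[str] = None) -> str:
    """Build a Spark Connect connection string: sc://host:port/;key=value;...

    `use_ssl=true` asks the client to open the gRPC channel over TLS.
    `token` is optional and is assumed here to be a URL-safe bearer token.
    """
    params = ["use_ssl=true"]
    if token:
        params.append(f"token={token}")
    return f"sc://{host}:{port}/;" + ";".join(params)


def count_buckets(url: str) -> int:
    """Run a small aggregation on the remote Spark engine and return the group count."""
    # Imported lazily so the URL helper works without PySpark installed.
    # Requires the Spark Connect client, e.g. `pip install "pyspark[connect]"`.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote(url).getOrCreate()
    try:
        # The DataFrame plan is built locally and executed on the server.
        df = spark.range(1000).selectExpr("id", "id % 10 AS bucket")
        return df.groupBy("bucket").count().count()
    finally:
        spark.stop()
```

With a real endpoint in hand, calling `count_buckets(build_connect_url("<your-endpoint-host>", token="<your-token>"))` from an IDE or notebook would run the aggregation remotely while the Python process stays local. Both placeholders are hypothetical and must be replaced with values from your own application.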