Apache Spark 4.0.1 preview now available on Amazon EMR Serverless

Big Data Blog

This article announces Apache Spark 4.0.1 preview availability on Amazon EMR Serverless, introducing major enhancements for analytics, data engineering, and governance.

ANSI SQL mode now default, enforcing standard SQL behavior for data integrity
VARIANT data type efficiently handles JSON/XML without repeated parsing overhead
SQL scripting enables loops, conditionals, and session variables directly in SQL
Pipe syntax (|>) chains SQL operations for improved readability and maintainability
Python data source API allows building custom connectors without Scala expertise
Queryable streaming state enables debugging and monitoring of stateful applications
Apache Iceberg v3 support provides transaction guarantees and audit trails
Amazon S3 Tables integration with automatic optimization and maintenance
Lake Formation full table access supported for Iceberg, Delta, and Hive tables
Runtime requirements: Scala 2.13.16, Java 17+, Python 3.9+, Pandas 2.0.0+
Preview limitations: No fine-grained access control, Spark Connect, Hudi, or interactive applications
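
As a rough illustration of three of the SQL features listed above (pipe syntax, the VARIANT type, and ANSI mode), here is a minimal Python sketch that submits a few Spark SQL statements. The `orders` view, its columns, the JSON payload, and the `run_examples` helper are hypothetical and are not taken from the article; `spark` is assumed to be an existing Spark 4.x `SparkSession`.

```python
# Sketch only: the `orders` view and its columns are hypothetical sample data.
SETUP_SQL = """
CREATE OR REPLACE TEMP VIEW orders AS
SELECT * FROM VALUES (1, 150.0), (1, 50.0), (2, 300.0) AS t(customer_id, amount)
"""

QUERIES = {
    # Pipe syntax: each |> step consumes the result of the previous step.
    "pipe_syntax": """
        FROM orders
        |> WHERE amount > 100
        |> AGGREGATE SUM(amount) AS total GROUP BY customer_id
        |> ORDER BY total DESC
    """,
    # VARIANT: parse the JSON once, then extract a typed field by path.
    "variant": """
        SELECT variant_get(
            parse_json('{"device": {"id": 7}}'), '$.device.id', 'int'
        ) AS device_id
    """,
    # ANSI mode: try_cast yields NULL for invalid input instead of failing.
    "ansi_safe_cast": "SELECT try_cast('abc' AS INT) AS n",
}


def run_examples(spark):
    """Run each example query on the given session; return rows keyed by name."""
    spark.sql(SETUP_SQL)
    return {name: spark.sql(query).collect() for name, query in QUERIES.items()}
```

With ANSI mode on, a plain `CAST('abc' AS INT)` raises an error rather than silently returning NULL, which is why the sketch uses `try_cast` for input that may be malformed.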

Spark 4.0.1 on EMR Serverless simplifies data engineering workflows with SQL enhancements, Python improvements, and streaming capabilities, and adds support for semi-structured data while maintaining governance controls.
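
To give a sense of the Python data source API mentioned in the feature list, the following is a minimal sketch of a custom connector written entirely in Python. The `CounterDataSource` and `CounterReader` classes, the `counter` format name, and the `count` option are hypothetical names chosen for illustration; the fallback import exists only so the sketch can be read and exercised without Spark installed.

```python
try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Assumption for illustration: plain base classes when PySpark is absent.
    DataSource = DataSourceReader = object


class CounterReader(DataSourceReader):
    """Hypothetical reader that yields (value, squared) rows."""

    def __init__(self, count):
        self.count = count

    def read(self, partition):
        # Each yielded tuple becomes one row matching the declared schema.
        for i in range(self.count):
            yield (i, i * i)


class CounterDataSource(DataSource):
    """Hypothetical data source exposed under the format name 'counter'."""

    @classmethod
    def name(cls):
        return "counter"

    def schema(self):
        # Schema given as a DDL string.
        return "value int, squared int"

    def reader(self, schema):
        # Options arrive as strings; 'count' is a made-up option name.
        return CounterReader(int(self.options.get("count", "3")))
```

On a Spark 4.x session, such a class would typically be registered with `spark.dataSource.register(CounterDataSource)` and then read with `spark.read.format("counter").load()`.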

