Optimize Amazon EMR runtime for Apache Spark with EMR S3A

Big Data Blog



Amazon EMR 7.10 introduces an enhanced S3A file system connector that offers improved performance for Apache Spark workloads across various EMR deployment options.

  • EMR S3A is now the default S3 file system connector for all EMR deployment types
  • Read performance is comparable to EMRFS and 1.08x faster than open-source S3A
  • Write performance shows significant improvements:
    • 7% faster for static partition overwrites
    • 215% faster for dynamic partition overwrites
  • Benchmarks used a 3 TB TPC-DS dataset with 104 Spark SQL queries
  • Runtime costs are slightly lower, which improves cost-efficiency
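
The figures above mix two notations ("1.08x speedup" and "N% faster"), so it can help to put them on one scale. The sketch below is illustrative only. It assumes "N% faster" means throughput rose by N%, i.e. a speedup factor of 1 + N/100; the summary does not say how the percentages were computed, so the source article may define them differently. The function names are hypothetical helpers, not part of any EMR or Spark API.

```python
def speedup_from_percent_faster(percent_faster: float) -> float:
    """Convert 'N% faster' to a speedup factor.

    Assumption: 'N% faster' means new throughput = old throughput * (1 + N/100).
    """
    return 1.0 + percent_faster / 100.0


def relative_runtime(speedup: float) -> float:
    """Runtime of the faster system as a fraction of the baseline runtime."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Figures quoted in the summary, expressed as speedup factors.
claims = {
    "read vs open-source S3A": 1.08,
    "static partition overwrite": speedup_from_percent_faster(7),
    "dynamic partition overwrite": speedup_from_percent_faster(215),
}

for label, speedup in claims.items():
    fraction = relative_runtime(speedup)
    print(f"{label}: {speedup:.2f}x speedup, {fraction:.0%} of baseline runtime")
```

Under this reading, the dynamic partition overwrite figure corresponds to a job finishing in roughly a third of the baseline time, while the read and static-overwrite gains are single-digit percentage reductions in runtime.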

The new connector offers enhanced performance, standardization, and cross-platform portability for big data analytics workloads.




