
Run Apache Spark 3.5.1 workloads 4.5 times faster with Amazon EMR runtime for Apache Spark

Big Data Blog



This article discusses the performance improvements in the latest Amazon EMR runtime for Apache Spark, which is optimized to run Spark workloads faster than open-source Apache Spark.

Specifically, the article covers:

  • Benchmark results showing that Amazon EMR 7.1 runs Apache Spark 3.5.1 workloads 4.5 times faster than open-source Apache Spark 3.5.1, with 2.8 times better price-performance
  • Recent improvements in the Amazon EMR runtime, including optimizations to physical operators, query planning, and Amazon S3 requests, plus the use of Java 17
  • The methodology and configurations used to benchmark Apache Spark 3.5.1 and Amazon EMR
  • Instructions for running the TPC-DS benchmark on Apache Spark and Amazon EMR clusters
  • A concluding recommendation to use the latest Amazon EMR release to benefit from these performance optimizations
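
The two headline figures can be related with a little arithmetic. The sketch below is only an illustration, not the article's benchmarking method: it assumes price-performance is the ratio of total run costs, and that each run's cost is an hourly rate multiplied by runtime. Under those assumptions, the hourly cost ratio implied by the two figures is the speedup divided by the price-performance gain. The function name and its parameters are hypothetical.

```python
def implied_cost_rate_ratio(speedup: float, price_perf_gain: float) -> float:
    """Return the hourly cost ratio (EMR / open source) implied by two ratios.

    Assumptions (not taken from the article):
      cost = hourly_rate * runtime
      speedup = runtime_oss / runtime_emr
      price_perf_gain = cost_oss / cost_emr
    Rearranging gives rate_emr / rate_oss = speedup / price_perf_gain.
    """
    if speedup <= 0 or price_perf_gain <= 0:
        raise ValueError("both ratios must be positive")
    return speedup / price_perf_gain


# Figures quoted in the summary: 4.5x faster, 2.8x better price-performance.
ratio = implied_cost_rate_ratio(4.5, 2.8)
print(f"Implied hourly cost ratio: {ratio:.2f}")
```

Read this way, a 4.5 times speedup paired with a 2.8 times price-performance gain would correspond to an hourly cost roughly 1.6 times that of the open-source setup. The real benchmark may define and aggregate these metrics differently.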




Related articles

  • Aug 8, 2024: Amazon EMR 7.2 now supports Apache Spark 3.5.1
  • Aug 26, 2024: Amazon EMR 7.1 runtime for Apache Spark and Iceberg can run Spark workloads 2.7 times faster than Apache Spark 3.5.1 and Iceberg 1.5.2
  • Dec 27, 2024: Amazon EMR 7.5 runtime for Apache Spark and Iceberg can run Spark workloads 3.6 times faster than Spark 3.5.3 and Iceberg 1.6.1
  • May 27, 2026: Amazon EMR now supports Apache Spark 4.0.2 in general availability
