
Amazon EMR 7.1 runtime for Apache Spark and Iceberg can run Spark workloads 2.7 times faster than Apache Spark 3.5.1 and Iceberg 1.5.2

Big Data Blog



The article discusses performance improvements in the Amazon EMR 7.1 runtime for Apache Spark and Apache Iceberg, which runs Spark workloads 2.7 times faster than open source Apache Spark 3.5.1 with Iceberg 1.5.2 on the 3 TB TPC-DS benchmark dataset.

Specifically, the article covers:

  • Eight new optimizations added in Amazon EMR 7.1, covering Spark's DataSource V2 and Iceberg-specific enhancements
  • Benchmark results showing Amazon EMR 7.1 completing the TPC-DS 3 TB workload 2.7 times faster (0.56 hours versus 1.55 hours) and 2.2 times more cost-efficiently than open source Spark 3.5.1 with Iceberg 1.5.2
  • Instructions to run the TPC-DS benchmark on Amazon EMR 7.1 and open source Spark 3.5.1 for comparison
  • Conclusion highlighting the performance advantage of using the latest Amazon EMR releases for Spark and Iceberg workloads
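
As a quick sanity check on the headline figure, the speedup can be recomputed from the two rounded runtimes quoted above. This is only a minimal sketch: the function name `speedup` and the constant names are illustrative and do not come from the article, and the inputs are the rounded hours from this summary rather than the article's raw measurements.

```python
def speedup(baseline_hours: float, optimized_hours: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if baseline_hours <= 0 or optimized_hours <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_hours / optimized_hours


# Rounded total runtimes quoted in the summary, in hours (assumed inputs).
OSS_SPARK_HOURS = 1.55  # open source Spark 3.5.1 with Iceberg 1.5.2
EMR_HOURS = 0.56        # Amazon EMR 7.1 runtime

ratio = speedup(OSS_SPARK_HOURS, EMR_HOURS)
time_saved_pct = (1 - EMR_HOURS / OSS_SPARK_HOURS) * 100

print(f"speedup: {ratio:.2f}x")
print(f"runtime reduction: {time_saved_pct:.1f}%")
```

With these rounded inputs the ratio comes out near 2.8, close to the reported 2.7; the small gap presumably reflects rounding in the published hours. The cost-efficiency figure cannot be reproduced the same way, because it also depends on per-hour pricing that this summary does not state.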




Related articles

Dec 27, 2024: Amazon EMR 7.5 runtime for Apache Spark and Iceberg can run Spark workloads 3.6 times faster than Spark 3.5.3 and Iceberg 1.6.1
Nov 27, 2025: Run Apache Spark and Iceberg 4.5x faster than open source Spark with Amazon EMR
Nov 27, 2025: Run Apache Spark and Apache Iceberg write jobs 2x faster with Amazon EMR
Aug 8, 2024: Amazon EMR 7.2 now supports Apache Spark 3.5.1
