Reducing costs for shuffle-heavy Apache Spark workloads with serverless storage for Amazon EMR Serverless

Big Data Blog

This article explains how serverless storage for Amazon EMR Serverless reduces costs for shuffle-heavy Apache Spark workloads: about 26% overall in a TPC-DS benchmark, with savings reaching 85% for certain query patterns.

Serverless storage eliminates local disk provisioning for Spark workloads
Benchmarking on the TPC-DS dataset showed 26.65% total cost savings versus standard disks
80% of queries benefit, with average savings of 47%, when shuffle data is externalized
Requires Dynamic Resource Allocation to be enabled so that idle executors are released early
Runtime increases by 37.94% due to external shuffle read/write latency
Inverted triangle queries (high cardinality input, low cardinality output) benefit most
Hourglass pattern queries with varying executor demand also see significant savings
Rectangle pattern queries with sustained high parallelism see minimal cost benefits
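
The Dynamic Resource Allocation requirement in the list above can be sketched as a small configuration helper. This is a minimal illustration, not the article's own setup: the two property names are standard Apache Spark settings, the 60s idle timeout is an assumed example value, and the helper name to_spark_submit_parameters is hypothetical. The property that turns on serverless storage itself is deliberately left out because this summary does not name it.

```python
# Standard Apache Spark properties for Dynamic Resource Allocation.
# The idle timeout is an illustrative assumption, not a figure from the article.
DRA_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_spark_submit_parameters(conf: dict[str, str]) -> str:
    """Render a property map as a single string of --conf flags."""
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


if __name__ == "__main__":
    print(to_spark_submit_parameters(DRA_CONF))
```

A string in this form is the kind of value typically passed as the Spark submit parameters of an EMR Serverless job run. A shorter idle timeout releases executors sooner, which is the behavior the cost savings depend on.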

Serverless storage enables cost optimization for Spark workloads with dynamic resource patterns by decoupling shuffle data from compute, allowing immediate release of idle resources.
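
To see how cost can fall while runtime rises, the two benchmark figures can be combined in a rough back-of-the-envelope calculation. The sketch below assumes that cost is simply the average billed resource rate multiplied by runtime. That is a simplification introduced here, not a pricing model from the article, and it ignores any separate charge for the storage itself. The function name relative_resource_rate is hypothetical.

```python
def relative_resource_rate(cost_savings_pct: float, runtime_increase_pct: float) -> float:
    """Average billed resources per unit time, relative to the baseline run.

    Assumes cost = average resource rate x runtime (a simplification).
    """
    cost_ratio = 1 - cost_savings_pct / 100        # new cost / baseline cost
    runtime_ratio = 1 + runtime_increase_pct / 100  # new runtime / baseline runtime
    return cost_ratio / runtime_ratio


if __name__ == "__main__":
    # Benchmark figures quoted in the summary: 26.65% savings, 37.94% longer runtime.
    rate = relative_resource_rate(26.65, 37.94)
    print(f"relative average resource rate: {rate:.3f}")
```

Under that assumption, the job holds roughly half the resources on average while it runs, which is consistent with idle executors being released once their shuffle data no longer lives on local disk.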


