How Amazon Ads uses Iceberg optimizations to accelerate their Spark workload on Amazon S3
Storage Blog
Amazon Ads optimized their Spark workload on Amazon S3 by adopting Apache Iceberg's new base-2 object store file layout. The reported performance and cost improvements were:
- Reduced total EMR processing time by 22% (from 11.5 to 9 hours)
- Decreased EMR compute costs by 20%
- Reduced S3 storage costs by 32%
- Eliminated manual job retries
- Reduced S3 5xx errors by 77%
The gains came from Iceberg's new 20-character base-2 hash in object key names, which distributes traffic more evenly across S3 prefixes and improves request scaling performance.
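To illustrate the idea, here is a minimal sketch of how a 20-character base-2 hash in the key name spreads files across prefixes. It is not Iceberg's implementation: the hash function (SHA-256 from the Python standard library), the helper names `binary_hash` and `object_key`, the table root, and the file names are all assumptions chosen for illustration, and the real layout may place or split the hash differently.

```python
import hashlib
from collections import Counter

HASH_BITS = 20  # length of the binary hash string described above


def binary_hash(file_name: str) -> str:
    """Return a 20-character string of 0s and 1s derived from file_name.

    Assumption: SHA-256 is a stand-in here, not the hash Iceberg uses.
    """
    digest = hashlib.sha256(file_name.encode("utf-8")).digest()
    # Keep only the low 20 bits of the first four digest bytes.
    value = int.from_bytes(digest[:4], "big") & ((1 << HASH_BITS) - 1)
    # Zero-pad so every hash has exactly HASH_BITS characters.
    return format(value, f"0{HASH_BITS}b")


def object_key(table_root: str, file_name: str) -> str:
    """Build a hypothetical object key with the hash ahead of the file name."""
    return f"{table_root}/{binary_hash(file_name)}/{file_name}"


if __name__ == "__main__":
    # Hypothetical file names; count how many land under each 4-bit prefix.
    names = [f"part-{i:05d}.parquet" for i in range(16000)]
    buckets = Counter(binary_hash(n)[:4] for n in names)
    print(object_key("warehouse/db/table/data", names[0]))
    for prefix, count in sorted(buckets.items()):
        print(prefix, count)
```

Because each character is only `0` or `1`, every leading character splits the key space into two halves. Sequentially named files therefore end up under many different leading prefixes instead of piling onto one, which is the property the summary credits for better request scaling.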