Compaction support for Avro and ORC file formats in Apache Iceberg tables in Amazon S3
Big Data Blog
This article covers the expansion of automatic compaction to the Avro and ORC file formats for Apache Iceberg tables stored in Amazon S3. Key points:
- Amazon S3 Tables now supports automatic compaction for Avro and ORC file formats, in addition to Parquet
- The feature aims to help manage large-scale analytic tables by addressing challenges like small file proliferation
- Performance tests showed query time improvements after compaction:
  - Avro tables saw up to a 27.29% query time reduction
  - ORC tables saw up to a 39.98% query time reduction
- The solution was tested using a simulated IoT data pipeline with over 20 billion events
- The compaction feature is available in S3 Tables and for Iceberg tables in general purpose S3 buckets
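The query time reductions quoted above are presumably measured relative to the query time before compaction. As a minimal sketch of that arithmetic, the Python snippet below computes a percentage reduction from two timings. The function name `query_time_reduction` and both timing values are hypothetical, chosen only to illustrate the formula; they are not measurements from the article.

```python
def query_time_reduction(before_s: float, after_s: float) -> float:
    """Return the percent reduction in query time relative to the baseline."""
    if before_s <= 0:
        raise ValueError("baseline query time must be positive")
    return (before_s - after_s) / before_s * 100.0


# Hypothetical timings in seconds, for illustration only (not from the article).
baseline_s = 100.0   # query time on the uncompacted table
compacted_s = 60.02  # query time on the same table after compaction

reduction = query_time_reduction(baseline_s, compacted_s)
print(f"{reduction:.2f}% query time reduction")
```

A reduction of roughly 40% therefore means the compacted table answers the same query in about 60% of the original time.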
The article provides a comprehensive guide to setting up and testing the new compaction capabilities across different file formats.
Related articles
- Jul 15, 2025: Amazon S3 now supports compaction of Apache Avro and ORC formats for Apache Iceberg tables
- Jun 24, 2025: Amazon S3 now supports sort and z-order compaction for Apache Iceberg tables
- Jun 24, 2025: New: Improve Apache Iceberg query performance in Amazon S3 with sort and z-order compaction
- Dec 19, 2024: Accelerate queries on Apache Iceberg tables through AWS Glue auto compaction