The Amazon SageMaker lakehouse architecture now automates optimization configuration of Apache Iceberg tables on Amazon S3
Big Data Blog
AWS has announced a new feature in the Amazon SageMaker lakehouse architecture that automates optimization of Apache Iceberg tables stored on Amazon S3 through catalog-level configuration.
- Enables automatic optimization for new Iceberg tables with a single Data Catalog configuration
- Continuously optimizes tables by compacting small files and removing snapshots and unreferenced files that are no longer needed
- Allows data lake administrators to configure table optimizations at both catalog and table levels
- Supports different compaction strategies, such as bin-pack and sort, to suit the characteristics of a specific table
- Reduces operational overhead and improves data lake management efficiency
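The relationship between catalog-level and table-level configuration can be pictured with a small sketch. This is not the AWS Glue or SageMaker API: the `OptimizerSettings` class, its field names, and the `effective_settings` helper are hypothetical, and the rule that table-level values take precedence over catalog defaults is an assumption made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class OptimizerSettings:
    """Hypothetical optimizer settings; field names are illustrative, not an AWS schema."""
    compaction_enabled: bool = False
    compaction_strategy: str = "binpack"  # e.g. "binpack" or "sort"
    snapshot_retention_enabled: bool = False
    orphan_file_deletion_enabled: bool = False


def effective_settings(catalog_defaults: OptimizerSettings,
                       table_overrides: Optional[dict] = None) -> OptimizerSettings:
    """Return the settings a table would use.

    Assumption: values set at the table level win, and anything the table
    does not set falls back to the catalog-level default.
    """
    if not table_overrides:
        return catalog_defaults
    # replace() builds a new instance and raises TypeError on unknown field names.
    return replace(catalog_defaults, **table_overrides)


# One catalog-level configuration that every new table inherits.
catalog = OptimizerSettings(
    compaction_enabled=True,
    snapshot_retention_enabled=True,
    orphan_file_deletion_enabled=True,
)

# A new table with no overrides picks up the catalog defaults unchanged.
new_table = effective_settings(catalog)

# A table that is usually filtered on one column might opt into sort compaction.
sorted_table = effective_settings(catalog, {"compaction_strategy": "sort"})
```

In this model an administrator sets the defaults once, and only tables with unusual access patterns need their own entry.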
This feature simplifies Iceberg table maintenance, helping organizations maintain high-performing and cost-effective data lakes with minimal manual intervention.
Related articles
- Aug 8, 2025: Amazon SageMaker lakehouse architecture now automates optimization configuration of Apache Iceberg tables
- Jul 28, 2025: Accelerate your data quality journey for lakehouse architecture with Amazon SageMaker, Apache Iceberg on AWS, Amazon S3 tables, and AWS Glue Data Quality
- Jul 16, 2025: Amazon SageMaker simplifies data management with automated lakehouse onboarding and metadata ingestion
- Mar 13, 2025: Amazon S3 Tables integration with SageMaker Lakehouse is now generally available