Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

Big Data Blog

This article covers how to use AWS Glue ETL to perform data merge, partition evolution, and schema evolution operations on Apache Iceberg tables in a transactional data lake.

Specifically, the article covers:

Overview of the solution architecture
Setting up the infrastructure with AWS CloudFormation
Creating an Iceberg table using AWS Lambda and granting access using AWS Lake Formation
Integrating Iceberg with the AWS Glue Data Catalog and Amazon S3
Merging data from a Dropzone location into the Iceberg table using an AWS Glue ETL job
Querying the Iceberg table using Amazon Athena
Performing partition evolution on the Iceberg table
Performing schema evolution on the Iceberg table
Updating data in the Iceberg table using a positional update
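
As a rough illustration of the merge, partition evolution, and schema evolution steps listed above, the sketch below assembles the kind of Spark SQL statements that Iceberg accepts for each one. It is not taken from the article. The catalog, database, table, view, and column names (glue_catalog, demo_db, orders, dropzone_updates, order_id, order_ts, discount_code) are hypothetical placeholders, and the helper functions are invented for illustration only.

```python
from typing import List

# Hypothetical identifiers: none of these names come from the article.
TABLE = "glue_catalog.demo_db.orders"
SOURCE_VIEW = "dropzone_updates"


def merge_sql(table: str, source_view: str, key: str) -> str:
    """Upsert: update rows whose key matches, insert the rest."""
    return (
        f"MERGE INTO {table} t USING {source_view} s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


def add_partition_field_sql(table: str, transform: str) -> str:
    """Partition evolution: only newly written data uses the added field."""
    return f"ALTER TABLE {table} ADD PARTITION FIELD {transform}"


def add_column_sql(table: str, column: str, data_type: str) -> str:
    """Schema evolution: a metadata-only change that adds a column."""
    return f"ALTER TABLE {table} ADD COLUMNS ({column} {data_type})"


def build_statements() -> List[str]:
    """Return the three statements in the order the list above describes."""
    return [
        merge_sql(TABLE, SOURCE_VIEW, "order_id"),
        add_partition_field_sql(TABLE, "days(order_ts)"),
        add_column_sql(TABLE, "discount_code", "string"),
    ]


if __name__ == "__main__":
    for statement in build_statements():
        print(statement)
```

In an AWS Glue ETL job, each string would typically be passed to spark.sql(...), assuming the Spark session is configured with the Iceberg SQL extensions and a catalog backed by the AWS Glue Data Catalog. The sketch only builds the statements and does not connect to Spark.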


Dec 18, 2025

Create and update Apache Iceberg tables with partitions in the AWS Glue Data Catalog using the AWS SDK and AWS CloudFormation

Nov 21, 2024

AWS Glue Data Catalog supports automatic optimization of Apache Iceberg tables through your Amazon VPC

Sep 12, 2024

The AWS Glue Data Catalog now supports storage optimization of Apache Iceberg tables

Dec 19, 2024

AWS Glue Data Catalog offers advanced automatic optimization for Apache Iceberg tables



Related articles