Apache Iceberg is an open table format for large, high-throughput, updatable datasets. On AWS, it supports schema and partition evolution in transactional data lakes built with AWS Glue ETL, AWS Lake Formation, and Amazon S3.

<div>
<p>This article shows how to use AWS Glue ETL to merge data and to perform partition evolution and schema evolution on Apache Iceberg tables in a transactional data lake.</p>
<p>Specifically, it covers:</p>
<ul>
<li>Overview of the solution architecture</li>
<li>Setting up the infrastructure with AWS CloudFormation</li>
<li>Creating an Iceberg table using AWS Lambda and granting access using AWS Lake Formation</li>
<li>Integrating Iceberg with the AWS Glue Data Catalog and Amazon S3</li>
<li>Merging data from a Dropzone location into the Iceberg table using an AWS Glue ETL job</li>
<li>Querying the Iceberg table using Amazon Athena</li>
<li>Performing partition evolution on the Iceberg table</li>
<li>Performing schema evolution on the Iceberg table</li>
<li>Updating data in the Iceberg table using a positional update</li>
</ul>
</div>
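
To give a flavor of the merge, partition evolution, and schema evolution steps listed above, here is a minimal sketch that builds the kind of Spark SQL statements Iceberg accepts for each. It is not the article's actual job code: the helper `iceberg_statements`, the table `glue_catalog.sales_db.orders`, the view `dropzone_updates`, and the columns `order_id`, `order_ts`, and `discount_code` are all hypothetical names chosen for illustration. The `MERGE INTO` and `ADD PARTITION FIELD` forms assume the Iceberg Spark SQL extensions are enabled in the job's Spark session.

```python
def iceberg_statements(table: str, source_view: str, key: str) -> dict[str, str]:
    """Build Spark SQL strings for three common Iceberg table operations.

    All column names used below (order_ts, discount_code) are hypothetical.
    """
    return {
        # Upsert: update rows whose key matches, insert the rest.
        "merge": (
            f"MERGE INTO {table} AS t USING {source_view} AS s "
            f"ON t.{key} = s.{key} "
            "WHEN MATCHED THEN UPDATE SET * "
            "WHEN NOT MATCHED THEN INSERT *"
        ),
        # Partition evolution: add a partition field without rewriting old data.
        "partition_evolution": (
            f"ALTER TABLE {table} ADD PARTITION FIELD year(order_ts)"
        ),
        # Schema evolution: add a new nullable column.
        "schema_evolution": (
            f"ALTER TABLE {table} ADD COLUMNS (discount_code string)"
        ),
    }


if __name__ == "__main__":
    statements = iceberg_statements(
        "glue_catalog.sales_db.orders", "dropzone_updates", "order_id"
    )
    for name, sql in statements.items():
        print(f"-- {name}\n{sql}\n")
```

In a Glue ETL job, each string would typically be passed to `spark.sql(...)` on a session configured with an Iceberg catalog. Building the statements as plain strings keeps the sketch runnable without a Spark cluster.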

