Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

Big Data Blog

This article explains how to build a data lake on AWS using AWS Glue, AWS DMS, and Apache Iceberg. It covers the process of loading data from a legacy SQL Server database into a transactional data lake on AWS.

Specifically, the article covers:

Setting up AWS Glue connections to the source database
Creating AWS Glue jobs for full load and change data capture (CDC) of data from SQL Server to the data lake
Handling schema evolution in the data lake using Iceberg's merge capabilities
Configuring AWS Step Functions to orchestrate the data ingestion workflow
Best practices for optimizing cost, performance, and resilience of the data ingestion pipeline
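
The full load and CDC steps listed above can be pictured with a small in-memory sketch. It assumes each change record carries an `Op` column with the values I, U, or D, as AWS DMS writes for change data. The `id` key, the `apply_cdc` helper, and the sample rows are hypothetical. An actual AWS Glue job would more likely express this logic as an Iceberg MERGE INTO statement than as Python dictionaries.

```python
from typing import Dict, Iterable

Row = Dict[str, object]


def apply_cdc(table: Dict[object, Row], changes: Iterable[Row], key: str = "id") -> Dict[object, Row]:
    """Apply ordered change records to a copy of a table keyed by primary key.

    Assumption: each change has an "Op" field (I = insert, U = update, D = delete)
    and the remaining fields are the row's columns.
    """
    result = dict(table)  # shallow copy so the full-load snapshot is not mutated
    for change in changes:
        op = change["Op"]
        row = {k: v for k, v in change.items() if k != "Op"}
        pk = row[key]
        if op == "D":
            # Delete: drop the row if it exists
            result.pop(pk, None)
        elif op in ("I", "U"):
            # Upsert: new columns in the change are added to the stored row,
            # a rough stand-in for schema evolution during a merge
            result[pk] = {**result.get(pk, {}), **row}
        else:
            raise ValueError(f"unknown Op value: {op!r}")
    return result


# Hypothetical full-load snapshot and a batch of later changes
full_load = {
    1: {"id": 1, "name": "Ana"},
    2: {"id": 2, "name": "Bo"},
}
changes = [
    {"Op": "U", "id": 1, "name": "Ana", "email": "ana@example.com"},
    {"Op": "D", "id": 2},
    {"Op": "I", "id": 3, "name": "Cy"},
]

current = apply_cdc(full_load, changes)
print(current)
```

The sketch applies changes strictly in order, so a delete followed by an insert for the same key leaves the inserted row in place. A production pipeline would also need to deduplicate changes per key within a batch before merging.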



Oct 30, 2024

Modernize your legacy databases with AWS data lakes, Part 3: Build a data lake processing layer

Oct 30, 2024

Modernize your legacy databases with AWS data lakes, Part 1: Migrate SQL Server using AWS DMS

Apr 3, 2024

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Mar 13, 2025

Build a managed Apache Iceberg data lake using Starburst and Amazon S3 Tables



Related articles