
Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

Big Data Blog



This article explains how to build a transactional data lake on AWS using AWS Glue, AWS DMS, and Apache Iceberg, loading data from a legacy SQL Server database.

Specifically, the article covers:

  • Setting up AWS Glue connections to the source database
  • Creating AWS Glue jobs for full load and change data capture (CDC) of data from SQL Server to the data lake
  • Handling schema evolution in the data lake using Iceberg's merge capabilities
  • Configuring AWS Step Functions to orchestrate the data ingestion workflow
  • Best practices for optimizing cost, performance, and resilience of the data ingestion pipeline
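
To make the full load and CDC items above more concrete, here is a minimal sketch of the upsert-and-delete logic that a merge step performs. It is plain Python operating on dictionaries, not AWS Glue or Iceberg code, and it is not taken from the article. It assumes each change record carries an `Op` field with the values `I`, `U`, or `D`, as AWS DMS writes in its CDC output; the function name `apply_cdc`, the `id` key column, and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_cdc(table: Dict[Any, Row], changes: Iterable[Row], key: str = "id") -> Dict[Any, Row]:
    """Apply ordered CDC records to a table held as {primary_key: row}.

    Inserts and updates replace the whole row (an upsert); deletes drop it.
    The input table is not modified; a new dict is returned.
    """
    result = dict(table)
    for change in changes:
        op = change["Op"]
        # Strip the operation flag so only data columns are stored.
        row = {col: val for col, val in change.items() if col != "Op"}
        pk = row[key]
        if op == "D":
            result.pop(pk, None)  # deleting a missing key is a no-op
        elif op in ("I", "U"):
            result[pk] = row      # new columns in the record are kept as-is
        else:
            raise ValueError(f"unknown Op value: {op!r}")
    return result


# Hypothetical sample data: a full-load snapshot followed by three changes.
snapshot = {1: {"id": 1, "name": "a"}, 2: {"id": 2, "name": "b"}}
cdc_batch = [
    {"Op": "U", "id": 1, "name": "a2"},
    {"Op": "D", "id": 2},
    {"Op": "I", "id": 3, "name": "c", "email": "c@example.com"},
]
merged = apply_cdc(snapshot, cdc_batch)
```

Because each upsert stores the record as it arrives, a column that first appears in a later change (`email` here) is carried into the merged table, which is a loose analogy for the schema evolution item in the list. In an actual pipeline this logic would typically be expressed as a merge against the Iceberg table rather than in application code.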




Related articles

  • Oct 30, 2024: Modernize your legacy databases with AWS data lakes, Part 3: Build a data lake processing layer
  • Oct 30, 2024: Modernize your legacy databases with AWS data lakes, Part 1: Migrate SQL Server using AWS DMS
  • Apr 3, 2024: Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake
  • Mar 13, 2025: Build a managed Apache Iceberg data lake using Starburst and Amazon S3 Tables
