
Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB




This article walks through building a real-time serverless data analytics solution with AWS Glue, AWS DMS, and Amazon DynamoDB. The solution demonstrates how to:

  • Join streaming data from Kinesis with a dynamically changing reference table in DynamoDB
  • Replicate data from an Aurora MySQL database to DynamoDB using AWS DMS
  • Create an AWS Glue streaming ETL job to process and enrich data in near real time
  • Ingest data into a transactional data lake using Apache Hudi
  • Enable querying of the enriched data using Amazon Athena
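
The core idea behind the first and third steps, joining each micro-batch of stream events against a reference table that is looked up fresh every time, can be sketched in plain Python. This is an illustration rather than the article's code: `enrich_batch`, the `customer_id` key, and the `ref_` prefix are hypothetical names, and an in-memory dict stands in for the DynamoDB table that AWS DMS keeps current. In a real job the `lookup` callable would wrap a DynamoDB read.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]


def enrich_batch(
    batch: Iterable[Record],
    lookup: Callable[[object], Optional[Record]],
    key: str = "customer_id",  # hypothetical join key, not from the article
) -> List[Record]:
    """Join one micro-batch of stream events with reference data.

    The lookup runs per event at processing time, so any change that CDC
    has already applied to the reference table is visible to the next
    batch without restarting the job.
    """
    enriched: List[Record] = []
    for event in batch:
        merged = dict(event)  # copy so the input event is not mutated
        ref = lookup(event.get(key))
        if ref is not None:
            for name, value in ref.items():
                if name != key:
                    # Prefix reference columns to avoid name collisions.
                    merged[f"ref_{name}"] = value
        enriched.append(merged)  # events without a match pass through as-is
    return enriched


if __name__ == "__main__":
    # In-memory stand-in for the DynamoDB reference table.
    reference = {"c1": {"customer_id": "c1", "tier": "silver"}}
    print(enrich_batch([{"customer_id": "c1", "amount": 10}], reference.get))

    # Simulate a CDC update arriving between two micro-batches.
    reference["c1"] = {"customer_id": "c1", "tier": "gold"}
    print(enrich_batch([{"customer_id": "c1", "amount": 25}], reference.get))
```

Events with no matching reference row are kept unchanged here, which corresponds to a left join; dropping them instead would give inner-join behavior.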

Key technical components include change data capture (CDC) to keep the reference data up to date, native Apache Hudi integration in AWS Glue, and a streaming job that picks up changes to the reference data without interruption.
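
As a rough idea of what the Hudi side involves, the sketch below builds a set of Hudi write options for an upsert into a table synced to the Glue Data Catalog. The option keys are standard Apache Hudi settings, but the helper name `hudi_write_options`, the argument values, and the choice of a copy-on-write table are assumptions made for illustration; the article's actual configuration may differ.

```python
from typing import Dict


def hudi_write_options(
    table: str,
    database: str,
    record_key: str,
    precombine_field: str,
) -> Dict[str, str]:
    """Build Hudi options for an upsert (hypothetical helper)."""
    return {
        "hoodie.table.name": table,
        # Upsert: rows with an existing record key are updated in place.
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        # On duplicate keys, the row with the largest value here wins.
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.table.type": "COPY_ON_WRITE",  # assumption
        # Register the table in the catalog so query engines can find it.
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.database": database,
        "hoodie.datasource.hive_sync.table": table,
        "hoodie.datasource.hive_sync.use_jdbc": "false",
        "hoodie.datasource.hive_sync.mode": "hms",
    }


if __name__ == "__main__":
    # All names below are placeholders.
    opts = hudi_write_options("enriched_events", "analytics", "event_id", "event_ts")
    for name, value in sorted(opts.items()):
        print(f"{name}={value}")
```

In a Glue job with the `--datalake-formats` job parameter set to `hudi`, options like these would typically be passed to a Spark write such as `df.write.format("hudi").options(**opts).mode("append").save(path)`, where `df` and `path` are the enriched DataFrame and the target S3 location.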





Related articles

  • Aug 16, 2023: Implement a serverless CDC process with Apache Iceberg using Amazon DynamoDB and Amazon Athena
  • Jun 10, 2026: Real-time CDC from Aurora PostgreSQL to Amazon S3 Tables using Debezium and Firehose
  • Oct 23, 2025: Unlock real-time data insights with schema evolution using Amazon MSK Serverless, Iceberg, and AWS Glue streaming
  • Jul 8, 2025: Introducing Amazon Keyspaces CDC streams
