Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB
This article walks through building a real-time serverless data analytics solution with AWS Glue, AWS DMS, and Amazon DynamoDB. The solution demonstrates how to:
- Join streaming data from Kinesis with a dynamically changing reference table in DynamoDB
- Replicate data from an Aurora MySQL database to DynamoDB using AWS DMS
- Create an AWS Glue streaming ETL job to process and enrich data in near-real-time
- Ingest data into a transactional data lake using Apache Hudi
- Enable querying of the enriched data using Amazon Athena
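The core idea in the steps above, enriching each streaming record against a reference table that keeps changing underneath it, can be sketched in plain Python. This is a minimal illustration, not the article's Glue job: the in-memory dictionary stands in for the DynamoDB table, and the names `apply_cdc_event`, `enrich_batch`, `customer_id`, and the event shape are all hypothetical. The point it shows is that the reference data is looked up again for every micro-batch, so a change replicated by CDC is visible to the next batch without restarting the job.

```python
from typing import Dict, List

# Hypothetical stand-in for the DynamoDB reference table, keyed by customer_id.
ReferenceTable = Dict[str, dict]


def apply_cdc_event(table: ReferenceTable, event: dict) -> None:
    """Apply one change event (insert, update, or delete) to the reference table."""
    if event["op"] == "delete":
        table.pop(event["key"], None)
    else:
        # Inserts and updates are both treated as upserts.
        table[event["key"]] = event["row"]


def enrich_batch(batch: List[dict], table: ReferenceTable) -> List[dict]:
    """Left-join a micro-batch of stream records with the current reference data."""
    enriched = []
    for record in batch:
        ref = table.get(record["customer_id"])
        # Unmatched records are kept, with the enrichment column set to None.
        enriched.append({**record, "customer_name": ref["name"] if ref else None})
    return enriched


# Assumed sample data: one reference row and three single-record micro-batches.
reference: ReferenceTable = {"c1": {"name": "Alice"}}
first = enrich_batch([{"order_id": 1, "customer_id": "c1"}], reference)

# A CDC update arrives between batches; the next batch sees the new value.
apply_cdc_event(reference, {"op": "update", "key": "c1", "row": {"name": "Alice B."}})
second = enrich_batch([{"order_id": 2, "customer_id": "c1"}], reference)

# A CDC delete removes the row; later records for that key are left unmatched.
apply_cdc_event(reference, {"op": "delete", "key": "c1"})
third = enrich_batch([{"order_id": 3, "customer_id": "c1"}], reference)
```

In a real streaming job the lookup would go to the reference store itself rather than a local dictionary, which trades some read cost per batch for always-current reference data.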
Key technical components include using CDC (change data capture) to keep the reference data up to date, the native Apache Hudi integration in AWS Glue, and a streaming job that picks up changes to the reference data without interruption.
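As a rough illustration of what writing to a Hudi table involves, the sketch below builds the kind of options dictionary a Spark-based job typically passes to the Hudi writer. The configuration keys are standard Apache Hudi write and Hive-sync options, but the helper `build_hudi_options`, the table and database names, and the choice of key and precombine fields are assumptions made for this example; the article's actual settings may differ.

```python
from typing import Dict


def build_hudi_options(
    table_name: str,
    database: str,
    record_key: str,
    precombine_field: str,
) -> Dict[str, str]:
    """Build Hudi write options for an upsert into a catalog-synced table."""
    return {
        "hoodie.table.name": table_name,
        # Field that uniquely identifies a record within the table.
        "hoodie.datasource.write.recordkey.field": record_key,
        # When two records share a key, the larger value of this field wins.
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
        # Sync table metadata to the metastore so query engines can find it.
        "hoodie.datasource.hive_sync.enable": "true",
        "hoodie.datasource.hive_sync.database": database,
        "hoodie.datasource.hive_sync.table": table_name,
        "hoodie.datasource.hive_sync.use_jdbc": "false",
        "hoodie.datasource.hive_sync.mode": "hms",
    }


# Hypothetical names, chosen only for illustration.
hudi_options = build_hudi_options(
    table_name="enriched_orders",
    database="analytics",
    record_key="order_id",
    precombine_field="event_time",
)

# In a Spark job, these options would typically be applied as:
#   df.write.format("hudi").options(**hudi_options).mode("append").save(path)
```

In AWS Glue, the native Hudi integration is usually enabled by setting the `--datalake-formats` job parameter to `hudi`, so no separate connector needs to be bundled with the job.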