
Efficiently compare items across two Amazon DynamoDB tables

Database Blog



This article presents an efficient algorithm for comparing two Amazon DynamoDB tables to identify differences in items, implemented in the open-source Bulk Executor tool.

  • Algorithm leverages DynamoDB's consistent hashing and ordered scan results for efficient comparison
  • Compares scan sequences using parallel segmented scans, so each segment can be processed independently
  • Handles schema validation, identical items, changed attributes, missing items, and partition key differences
  • Demonstrated on 500M-item tables (~180 GB each), compared in 6.5 minutes for under $10
  • Uses AWS Glue to run hundreds of segmented scans in parallel
  • Outputs differences as added (+), removed (-), or changed (*) items with optional full details
  • Results can be stored in Amazon S3 for large datasets
  • Useful for migrations, point-in-time recovery verification, and data propagation validation
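
As a rough illustration of the segmented-scan idea in the bullets above, the sketch below builds matching Scan request parameters for the same segment of two tables. `Segment` and `TotalSegments` are real DynamoDB Scan parameters; the helper name `paired_segment_requests`, the table names, and the assumption that equal segment numbers in both tables cover the same slice of the key space are illustrative and are not taken from the Bulk Executor source.

```python
def paired_segment_requests(table_a, table_b, total_segments):
    """Build one pair of Scan request parameter dicts per segment.

    Hypothetical helper: each pair could be handed to an independent worker
    that scans segment N of both tables and compares the two result streams.
    """
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")

    pairs = []
    for segment in range(total_segments):
        # Segment and TotalSegments are the Scan API's parallel-scan parameters.
        common = {"Segment": segment, "TotalSegments": total_segments}
        pairs.append(
            (
                {"TableName": table_a, **common},
                {"TableName": table_b, **common},
            )
        )
    return pairs


if __name__ == "__main__":
    for req_a, req_b in paired_segment_requests("orders_src", "orders_copy", 4):
        print(req_a, req_b)
```

Each dictionary could be passed as keyword arguments to a DynamoDB client's scan call, with one worker per pair; how the real tool distributes that work across AWS Glue is not shown here.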

The Bulk Executor tool provides a fast, scalable, and cost-effective way to compare large DynamoDB tables using a linear-time algorithm with minimal memory overhead.
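
To make the single-pass comparison concrete, here is a minimal sketch, not the Bulk Executor's actual implementation. It walks two scan result streams in lock step and buffers only items that have not yet found a partner, so memory grows with the number of differences rather than with table size. The function name `compare_scans`, the `key_attrs` parameter, the use of plain dictionaries for items, and the choice that `-` means "only in the first table" and `+` means "only in the second" are all assumptions made for illustration.

```python
from itertools import zip_longest

_END = object()  # sentinel marking an exhausted stream


def compare_scans(scan_a, scan_b, key_attrs):
    """Yield (marker, key) pairs describing how stream B differs from stream A.

    '*' = same key, different attributes
    '-' = key present only in A (assumed to mean "removed")
    '+' = key present only in B (assumed to mean "added")
    """
    pending_a = {}  # key -> item seen in A, not yet seen in B
    pending_b = {}  # key -> item seen in B, not yet seen in A

    for a, b in zip_longest(scan_a, scan_b, fillvalue=_END):
        for item, mine, other, from_a in (
            (a, pending_a, pending_b, True),
            (b, pending_b, pending_a, False),
        ):
            if item is _END:
                continue
            key = tuple(item[attr] for attr in key_attrs)
            if key in other:
                # Partner already seen on the other side: compare and drop it.
                partner = other.pop(key)
                item_a, item_b = (item, partner) if from_a else (partner, item)
                if item_a != item_b:
                    yield ("*", key)
            else:
                mine[key] = item

    # Anything still unmatched exists in only one of the two streams.
    for key in pending_a:
        yield ("-", key)
    for key in pending_b:
        yield ("+", key)


if __name__ == "__main__":
    table_a = [{"pk": "u1", "v": 1}, {"pk": "u2", "v": 2}, {"pk": "u3", "v": 3}]
    table_b = [{"pk": "u1", "v": 1}, {"pk": "u3", "v": 30}, {"pk": "u4", "v": 4}]
    for marker, key in compare_scans(table_a, table_b, ["pk"]):
        print(marker, key)
```

When both streams arrive in the same relative order, as the summary says DynamoDB scans do, most items match immediately and the pending buffers stay small. A refinement that flushes unmatched items as soon as a later item matches would bound memory further, but it is omitted here to keep the sketch short.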




