
Managing duplicate objects in Amazon S3

Storage Blog



This article presents a way to find and delete duplicate objects in Amazon S3 to reduce storage costs, using Amazon Athena, AWS Lambda, and S3 Batch Operations.

Specifically, the article covers:

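  • Identifying duplicate objects by comparing their ETags with an Athena query on the S3 Inventory report (an ETag reflects object content, but it is a plain MD5 digest only for objects that were not uploaded in multiple parts and are not encrypted with SSE-KMS or SSE-C)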
  • Creating a Lambda function to delete a single S3 object
  • Configuring an S3 Batch Operations job to invoke the Lambda function and delete the identified duplicate objects
  • Prerequisites, walkthrough steps, and things to know about the solution
  • Cleaning up the resources created for the solution
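
The identification step in the list above can be sketched in a few lines of Python. This is only an illustration of the grouping idea, not the article's method: the article runs an Athena query on the S3 Inventory report, while this sketch groups already-loaded inventory rows in memory. The function name `find_duplicates`, the row fields (`key`, `etag`, `size`, `last_modified`), and the rule of keeping the oldest copy in each group are all assumptions made for the example.

```python
from collections import defaultdict


def find_duplicates(rows):
    """Return the keys of objects that duplicate an earlier object.

    `rows` is an iterable of dicts with the hypothetical fields
    'key', 'etag', 'size', and 'last_modified' (ISO 8601 strings in
    one consistent format, so they sort correctly as plain text).
    """
    # Group objects that share both ETag and size.
    groups = defaultdict(list)
    for row in rows:
        groups[(row["etag"], row["size"])].append(row)

    duplicates = []
    for members in groups.values():
        if len(members) < 2:
            continue  # a unique object, nothing to delete
        # Assumption: keep the oldest copy and flag the rest.
        # The key is a tie-breaker so the result is deterministic.
        ordered = sorted(members, key=lambda r: (r["last_modified"], r["key"]))
        duplicates.extend(r["key"] for r in ordered[1:])
    return duplicates
```

Calling `find_duplicates` on the parsed inventory rows yields the keys that a deletion job could then target. Because a matching ETag does not always guarantee identical content, reviewing the list before deleting anything is a sensible precaution.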



Related articles

  • Mar 27, 2024: Maintaining object immutability by automatically extending Amazon S3 Object Lock retention periods
  • Jul 17, 2025: Copy objects between any Amazon S3 storage classes using S3 Batch Operations
  • Jan 16, 2025: Preventing unintended encryption of Amazon S3 objects
  • Jan 23, 2026: Applying Amazon S3 Object Lock at scale for petabytes of existing data
