Processing Amazon S3 objects at scale with AWS Step Functions Distributed Map S3 prefix

Compute Blog

The article covers processing large-scale data files with AWS Step Functions Distributed Map, focusing on a new capability for iterating over the Amazon S3 objects under a prefix with less management overhead.

- Enables processing of large datasets by running workflow iterations in parallel
- Introduces prefix-based iteration with the `LOAD_AND_FLATTEN` transformation
- Demonstrates an application log processing and summarization use case
- Provides a sample workflow that:
  - Iterates over log files from an S3 prefix
  - Puts hourly error count metrics into CloudWatch
  - Stores metrics in DynamoDB
  - Invokes a Lambda function for metrics aggregation
- Supports multiple input types, including CSV, JSON, JSONL, and Parquet
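
As a rough illustration of the prefix-based iteration summarized above, the following Python sketch builds a Distributed Map state definition as a plain dictionary and prints it as JSON. It is not the article's sample workflow. The bucket name, prefix, state name `ProcessLogEntry`, the `JSONL` input type, and the concurrency value are all assumptions chosen for illustration, and the inner `Pass` state is a placeholder for the real per-entry processing steps. The field names reflect the Amazon States Language as best understood here and should be checked against the current Step Functions documentation.

```python
import json


def build_distributed_map_state(bucket: str, prefix: str,
                                max_concurrency: int = 100) -> dict:
    """Return a sketch of a Distributed Map state that reads every
    object under an S3 prefix and flattens the contents into items.

    All concrete values here are illustrative assumptions.
    """
    return {
        "Type": "Map",
        # Upper bound on child workflow executions running at once (assumed value).
        "MaxConcurrency": max_concurrency,
        "ItemReader": {
            # List the objects under the prefix instead of reading one file.
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "ReaderConfig": {
                # Assumed log format; the summary lists CSV, JSON, JSONL and Parquet.
                "InputType": "JSONL",
                # Load each listed object and emit its records as individual items.
                "Transformation": "LOAD_AND_FLATTEN",
            },
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "ProcessLogEntry",
            "States": {
                # Placeholder: a real workflow would process the log entry here.
                "ProcessLogEntry": {"Type": "Pass", "End": True},
            },
        },
        "End": True,
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    state = build_distributed_map_state("amzn-s3-demo-bucket", "logs/")
    print(json.dumps(state, indent=2))
```

Building the definition in code like this keeps the bucket and prefix as parameters, so one sketch can be reused for different log locations. The resulting JSON would still need to be embedded in a complete state machine definition before deployment.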

The feature simplifies data processing workflows by eliminating the need for nested workflows and custom code, making it easier to build dynamic, resilient data processing pipelines.


Nov 4, 2025

Orchestrating big data processing with AWS Step Functions Distributed Map

Oct 1, 2025

How to export to Amazon S3 Tables by using AWS Step Functions Distributed Map

Nov 4, 2025

Optimizing nested JSON array processing using AWS Step Functions Distributed Map

Sep 18, 2025

AWS Step Functions expands data source options and improves observability for Distributed Map


Related articles