Build end-to-end Apache Spark pipelines with Amazon MWAA, Batch Processing Gateway, and Amazon EMR on EKS clusters
Big Data Blog
This article discusses building end-to-end Apache Spark pipelines using Amazon Managed Workflows for Apache Airflow (MWAA), Batch Processing Gateway (BPG), and Amazon EMR on EKS clusters.
- Enables routing Spark workloads across multiple EMR on EKS clusters
- Introduces a custom Airflow operator (BPGOperator) for seamless job submission
- Provides a solution for a healthcare analytics company that needs separate data processing environments
- Offers benefits like separation of responsibilities and centralized code management
- Demonstrates incremental migration strategy for existing Airflow DAGs
The solution allows organizations to build flexible, scalable data processing pipelines using AWS services, with clear separation between infrastructure and data engineering teams.
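The routing idea summarized above can be illustrated with a small sketch. This is not the Batch Processing Gateway API or the article's BPGOperator, neither of which is shown in this summary. The names `SparkJob`, `GatewayRouter`, the `queue` routing key, and the cluster names are all hypothetical, and the round-robin policy is an assumption chosen for simplicity. The sketch only shows how a gateway layer might let data engineers submit jobs by queue while the infrastructure team decides which EMR on EKS clusters serve that queue.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class SparkJob:
    """Hypothetical job description; not the real BPG request schema."""
    name: str
    queue: str  # assumed routing key, e.g. one queue per environment


class GatewayRouter:
    """Toy stand-in for a gateway that maps queues to clusters."""

    def __init__(self, clusters_by_queue: Dict[str, List[str]]) -> None:
        for queue, clusters in clusters_by_queue.items():
            if not clusters:
                raise ValueError(f"queue {queue!r} has no clusters")
        # One round-robin iterator per queue (assumed policy).
        self._cycles: Dict[str, Iterator[str]] = {
            queue: cycle(clusters)
            for queue, clusters in clusters_by_queue.items()
        }

    def route(self, job: SparkJob) -> str:
        """Return the name of the cluster that should run this job."""
        if job.queue not in self._cycles:
            raise ValueError(f"no cluster registered for queue {job.queue!r}")
        return next(self._cycles[job.queue])


# Hypothetical cluster names, owned by the infrastructure team.
router = GatewayRouter({
    "dev": ["emr-eks-dev"],
    "prod": ["emr-eks-prod-a", "emr-eks-prod-b"],
})

# Jobs as a data engineer might declare them, knowing only the queue.
jobs = [
    SparkJob("ingest", "prod"),
    SparkJob("clean", "prod"),
    SparkJob("adhoc", "dev"),
    SparkJob("report", "prod"),
]

targets = [router.route(job) for job in jobs]
for job, target in zip(jobs, targets):
    print(f"{job.name} -> {target}")
```

In an Airflow setting, a custom operator would typically wrap the submission step so that a DAG author supplies only the job description, which is the separation of responsibilities the summary describes.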