
Gang scheduling pods on Amazon EKS using AWS Batch multi-node processing jobs

HPC Blog



This article discusses how to use AWS Batch multi-node parallel (MNP) jobs to enable gang scheduling of pods across nodes in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster for distributed workloads such as machine learning, weather forecasting, and computational fluid dynamics.

Specifically, the article covers:

  • An introduction to Dask, a flexible library for parallel computing in Python, and AWS Batch, a fully managed batch processing service
  • An overview of the example Dask "Hello World!" script used to demonstrate distributed processing across worker nodes
  • How to build an application container with the Dask script and initialize it correctly using AWS Batch environment variables
  • How to configure an AWS Batch MNP job definition that launches the main node (running the Dask scheduler and analysis script) and the worker nodes (running Dask workers)
  • How to submit the job, monitor its progress, and view the output logs from the worker nodes
  • A conclusion on how AWS Batch MNP jobs make it possible to gang schedule pods across EKS nodes
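
As a rough illustration of the container initialization step in the list above, the sketch below picks a node's role from AWS Batch MNP environment variables and builds the matching Dask command. It is not taken from the article: the helper names `node_role` and `dask_command` are hypothetical, the variable names are those documented for AWS Batch multi-node parallel jobs and are assumed to be available in the EKS pods as well, and port 8786 is simply Dask's default scheduler port.

```python
import os

SCHEDULER_PORT = 8786  # Dask's default scheduler port (assumed unchanged)


def node_role(env):
    """Return 'main' or 'worker' by comparing this node's index to the main node's index."""
    node_index = int(env["AWS_BATCH_JOB_NODE_INDEX"])
    main_index = int(env["AWS_BATCH_JOB_MAIN_NODE_INDEX"])
    return "main" if node_index == main_index else "worker"


def dask_command(env):
    """Build the Dask CLI command this node would run (hypothetical helper)."""
    if node_role(env) == "main":
        # The main node hosts the scheduler; the analysis script would be
        # started separately once workers have connected.
        return ["dask", "scheduler", "--port", str(SCHEDULER_PORT)]
    # Worker nodes connect to the scheduler on the main node's private IP,
    # which AWS Batch exposes to child nodes.
    main_ip = env["AWS_BATCH_JOB_MAIN_NODE_PRIVATE_IPV4_ADDRESS"]
    return ["dask", "worker", f"tcp://{main_ip}:{SCHEDULER_PORT}"]


if __name__ == "__main__":
    # Only print a command when running inside an MNP job.
    if "AWS_BATCH_JOB_NODE_INDEX" in os.environ:
        print(" ".join(dask_command(os.environ)))
```

An entrypoint script could pass the resulting list to `subprocess.run`; keeping the decision in a pure function that takes the environment as an argument makes it easy to check locally without launching a job.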




Related articles

  • Jul 11, 2024: AWS Batch now supports gang-scheduling on Amazon EKS using multi-node parallel jobs
  • Sep 13, 2024: Use Batch Processing Gateway to automate job management in multi-cluster Amazon EMR on EKS environments
  • Oct 8, 2025: Improve Kubernetes pod scheduling accuracy using Amazon EBS
  • Jul 16, 2025: Amazon EKS enables ultra scale AI/ML workloads with support for 100K nodes per cluster
