AWS Batch now supports gang-scheduling on Amazon EKS using multi-node parallel jobs
This article announces the general availability of Multi-Node Parallel (MNP) jobs in AWS Batch on Amazon Elastic Kubernetes Service (Amazon EKS). An MNP job spans multiple Amazon EC2 instances, which makes it suitable for tightly coupled High Performance Computing (HPC) applications such as training multi-layer AI/ML models.
Specifically, the article covers:
- AWS Batch helps launch, configure, and manage nodes in Amazon EKS clusters for MNP jobs
- MNP jobs can use inter-instance communication frameworks like NCCL, Gloo, MPI, UCC, as well as machine learning and parallel computing libraries like PyTorch and Dask
- MNP jobs can be configured using the RegisterJobDefinition API or the AWS Batch Management Console
- MNP jobs on AWS Batch are available in all AWS Regions where AWS Batch is supported
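
To make the RegisterJobDefinition point concrete, here is a minimal sketch of what a request for an MNP job on Amazon EKS might look like. It only builds the request as a Python dictionary and does not call AWS. The helper name `build_mnp_job_definition`, the job name, the image, the container name, and the resource values are hypothetical placeholders. The field layout (`type`, `nodeProperties`, `nodeRangeProperties`, `eksProperties`) reflects our reading of the AWS Batch API and should be checked against the current API reference before use.

```python
def build_mnp_job_definition(name, image, num_nodes, command):
    """Build keyword arguments for a multi-node parallel job definition.

    All values passed in are illustrative; the field layout is an
    assumption to verify against the AWS Batch API reference.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")

    # One pod specification shared by every node in the job.
    pod_properties = {
        "containers": [
            {
                "name": "worker",  # hypothetical container name
                "image": image,
                "command": list(command),
                "resources": {
                    # Placeholder sizes; tune for the real workload.
                    "limits": {"cpu": "4", "memory": "8Gi"},
                },
            }
        ]
    }

    return {
        "jobDefinitionName": name,
        "type": "multinode",
        "nodeProperties": {
            "numNodes": num_nodes,
            "mainNode": 0,  # index of the node that coordinates the job
            "nodeRangeProperties": [
                {
                    # Node index range "first:last", covering all nodes here.
                    "targetNodes": f"0:{num_nodes - 1}",
                    "eksProperties": {"podProperties": pod_properties},
                }
            ],
        },
    }


request = build_mnp_job_definition(
    name="demo-mnp-training",          # hypothetical job definition name
    image="example.com/train:latest",  # hypothetical container image
    num_nodes=4,
    command=["python", "train.py"],
)
```

With credentials and a Region configured, a dictionary like this could be passed to boto3 as `boto3.client("batch").register_job_definition(**request)`. Keeping the request construction separate from the API call makes it easy to inspect or unit-test the definition before registering it.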