
Amazon SageMaker HyperPod now supports automatic Slurm topology management




This article announces automatic Slurm topology management for Amazon SageMaker HyperPod, which optimizes network configuration for distributed training clusters.

  • Automatically selects optimal network topology based on GPU instance types
  • Dynamically maintains topology as cluster scales or nodes are replaced
  • Supports tree topology for hierarchical interconnects, such as those of ml.p5 instances
  • Supports block topology for uniform high-bandwidth connectivity, such as that of ml.p6e instances
  • Handles mixed instance type clusters with compatible topology selection
  • Enabled by default with no manual configuration required
  • Available in all AWS Regions supporting SageMaker HyperPod

This feature improves distributed training performance by automatically optimizing GPU-to-GPU communication and NCCL operations without manual topology management.
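
To make the selection behavior described above concrete, here is a minimal Python sketch of how a topology could be chosen from the instance types in a cluster. It is an illustration only, not HyperPod's actual logic: the `choose_topology` function, the prefix table, and the example instance-type strings are all hypothetical, and the rule that mixed clusters fall back to tree topology is an assumption, since the announcement says only that a "compatible" topology is selected.

```python
# Hypothetical prefix table built from the two examples in the announcement:
# ml.p5 -> tree (hierarchical interconnect), ml.p6e -> block (uniform bandwidth).
# Longer prefixes come first so "ml.p6e" is never shadowed by a shorter match.
TOPOLOGY_BY_PREFIX = [
    ("ml.p6e", "block"),
    ("ml.p5", "tree"),
]

# Assumption: unknown instance types and mixed clusters use tree topology.
DEFAULT_TOPOLOGY = "tree"


def topology_for(instance_type: str) -> str:
    """Return the assumed topology for a single instance type."""
    for prefix, topology in TOPOLOGY_BY_PREFIX:
        if instance_type.startswith(prefix):
            return topology
    return DEFAULT_TOPOLOGY


def choose_topology(instance_types) -> str:
    """Pick one topology for the whole cluster.

    If every instance type maps to the same topology, use it; otherwise
    fall back to DEFAULT_TOPOLOGY (an assumption, not documented behavior).
    """
    kinds = {topology_for(t) for t in instance_types}
    return kinds.pop() if len(kinds) == 1 else DEFAULT_TOPOLOGY


if __name__ == "__main__":
    # Example instance-type strings, used here purely for illustration.
    print(choose_topology(["ml.p5.48xlarge"]))
    print(choose_topology(["ml.p6e-gb200.36xlarge"]))
    print(choose_topology(["ml.p5.48xlarge", "ml.p6e-gb200.36xlarge"]))
```

Because the feature is enabled by default, none of this needs to be written by hand. The sketch only shows the kind of decision that automatic topology management takes over.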





Related articles

  • Mar 3, 2026: Amazon SageMaker HyperPod now supports API-driven Slurm configuration
  • Mar 25, 2026: Amazon SageMaker HyperPod now supports continuous provisioning for Slurm-orchestrated clusters
  • May 7, 2026: Amazon SageMaker HyperPod now supports AMI-based node lifecycle configuration for Slurm clusters
  • Mar 26, 2025: Announcing multi-head node support in Slurm for Amazon SageMaker HyperPod clusters
