
Amazon SageMaker HyperPod now supports continuous provisioning for enhanced cluster operations




Amazon SageMaker HyperPod has introduced continuous provisioning, a new capability that improves AI/ML cluster operations for enterprise customers. Continuous provisioning:

  • Automatically provisions remaining cluster capacity in the background
  • Allows training jobs to start immediately on available instances
  • Automatically retries node provisioning failures without manual intervention
  • Enables concurrent operations like scaling nodes and applying patches
  • Provides real-time visibility through new Events APIs

The feature is currently available for SageMaker HyperPod clusters that use the Amazon EKS orchestrator. To enable it, set the NodeProvisioningMode parameter to "Continuous" when creating a new cluster.
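As a rough illustration of the setting described above, the following Python sketch assembles a CreateCluster request with NodeProvisioningMode set to "Continuous" and, separately, submits it through boto3. The helper names (build_create_cluster_request, create_cluster), the ARNs, subnet and security group IDs, instance type, and S3 path are all hypothetical placeholders, and the sketch assumes a boto3 release recent enough to accept the NodeProvisioningMode parameter. Treat it as a starting point under those assumptions, not a reference configuration.

```python
from typing import Any, Dict, List


def build_create_cluster_request(
    cluster_name: str,
    eks_cluster_arn: str,
    instance_groups: List[Dict[str, Any]],
    subnet_ids: List[str],
    security_group_ids: List[str],
) -> Dict[str, Any]:
    """Assemble keyword arguments for a SageMaker CreateCluster call.

    Hypothetical helper: it only builds a dictionary and makes no AWS calls.
    """
    return {
        "ClusterName": cluster_name,
        # Continuous provisioning is currently tied to the EKS orchestrator.
        "Orchestrator": {"Eks": {"ClusterArn": eks_cluster_arn}},
        "InstanceGroups": instance_groups,
        "VpcConfig": {
            "Subnets": subnet_ids,
            "SecurityGroupIds": security_group_ids,
        },
        # The setting named in the announcement.
        "NodeProvisioningMode": "Continuous",
    }


def create_cluster(request: Dict[str, Any]) -> str:
    """Submit the request and return the new cluster's ARN.

    Requires AWS credentials and a boto3 version that supports
    NodeProvisioningMode (an assumption of this sketch).
    """
    import boto3  # imported lazily so building a request needs no AWS SDK

    client = boto3.client("sagemaker")
    return client.create_cluster(**request)["ClusterArn"]


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    worker_group = {
        "InstanceGroupName": "workers",
        "InstanceType": "ml.g5.8xlarge",
        "InstanceCount": 2,
        "ExecutionRole": "arn:aws:iam::111122223333:role/ExampleHyperPodRole",
        "LifeCycleConfig": {
            "SourceS3Uri": "s3://example-bucket/lifecycle/",
            "OnCreate": "on_create.sh",
        },
    }
    request = build_create_cluster_request(
        cluster_name="example-cluster",
        eks_cluster_arn="arn:aws:eks:us-west-2:111122223333:cluster/example",
        instance_groups=[worker_group],
        subnet_ids=["subnet-0123456789abcdef0"],
        security_group_ids=["sg-0123456789abcdef0"],
    )
    print(request["NodeProvisioningMode"])
```

Keeping the request builder separate from the API call makes it possible to inspect or unit-test the configuration without touching an AWS account; create_cluster is only invoked when you are ready to provision.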




Related articles

  • Aug 11, 2025: Amazon SageMaker HyperPod now provides a new cluster setup experience
  • Mar 25, 2026: Amazon SageMaker HyperPod now supports continuous provisioning for Slurm-orchestrated clusters
  • Sep 2, 2025: Announcing the new cluster creation experience for Amazon SageMaker HyperPod
  • Jun 20, 2024: Amazon SageMaker HyperPod now supports configurable cluster storage
