Amazon EKS enables ultra scale AI/ML workloads with support for 100K nodes per cluster

Containers Blog



Amazon EKS has introduced support for ultra-large-scale AI/ML workloads with the ability to run a single Kubernetes cluster with up to 100,000 worker nodes.

  • Scales a single cluster to as many as 1.6 million AWS Trainium accelerators or 800,000 NVIDIA GPUs
  • Supports training very large AI models on that pooled compute
  • Aims to accelerate AI innovation, reduce costs, and keep the choice of frameworks flexible
  • Relies on architectural changes in EKS to sustain high-performance workloads at this scale
  • Is used by organizations such as Anthropic and Amazon's AGI team for advanced AI research
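
The accelerator totals in the list above follow from simple per-node arithmetic. The sketch below is a minimal illustration, assuming every one of the 100,000 nodes carries the same number of accelerators. The per-node counts (16 Trainium accelerators, 8 NVIDIA GPUs) are inferred by dividing the quoted totals by the node limit; they are not stated in this summary, and the function name is hypothetical.

```python
# Node limit quoted in the summary above.
MAX_NODES = 100_000

# Assumed accelerators per node, inferred from the quoted totals
# (1.6 million / 100,000 and 800,000 / 100,000). Real clusters may mix instance types.
ACCELERATORS_PER_NODE = {"AWS Trainium": 16, "NVIDIA GPU": 8}


def cluster_accelerators(nodes: int, per_node: int) -> int:
    """Return the total accelerator count for a cluster of identical nodes."""
    if nodes < 0 or per_node < 0:
        raise ValueError("nodes and per_node must be non-negative")
    return nodes * per_node


for name, per_node in ACCELERATORS_PER_NODE.items():
    print(f"{name}: {cluster_accelerators(MAX_NODES, per_node):,}")
```

Under these assumptions, the two quoted ceilings describe the same 100,000-node limit populated with different instance families, not two separate limits.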

This breakthrough allows organizations to pursue ambitious AI goals, from training trillion-parameter models to advancing artificial general intelligence, while maintaining Kubernetes compatibility.





Related articles

  • Jul 15, 2025: Amazon EKS now supports up to 100,000 worker nodes per cluster
  • May 29, 2025: Introducing AI on EKS: powering scalable AI workloads with Amazon EKS
  • Jul 16, 2025: Under the hood: Amazon EKS ultra scale clusters
  • Oct 10, 2024: Powering the Next Generation of AI Workloads on Amazon EKS with Anyscale
