
CloudWatch Container Insights now delivers observability for NVIDIA GPUs on EKS


The article introduces a new feature in Amazon CloudWatch Container Insights with Enhanced Observability for EKS: automatic discovery and monitoring of critical health and performance metrics from NVIDIA GPUs. With these metrics, users can track the health and performance of GPU-accelerated workloads, such as AI/ML training jobs, and troubleshoot issues more quickly.

Specifically, the article covers:

  • Overview of the new GPU metric monitoring capabilities in CloudWatch Container Insights
  • Benefits of the enhanced observability, including automatic dashboards, faster problem isolation, and resource optimization
  • How to get started by enabling the feature in the EKS console or CloudWatch Agent configuration
  • Availability of the NVIDIA GPU metrics in all AWS regions, along with pricing information
  • Link to the Container Insights user guide for further details
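
The CloudWatch Agent route mentioned in the list above could look roughly like the following. This is only a sketch: the helper `build_agent_config` and the cluster name are hypothetical, and the key names (`enhanced_container_insights`, `accelerated_compute_metrics`) are assumptions recalled from the agent's JSON configuration format, not taken from the article. Verify them against the Container Insights user guide before use.

```python
import json


def build_agent_config(cluster_name: str, collect_gpu_metrics: bool = True) -> dict:
    """Return a CloudWatch agent configuration fragment as a plain dict.

    Hypothetical helper for illustration. The key names below are
    assumptions about the agent's JSON schema and should be checked
    against the Container Insights user guide.
    """
    return {
        "logs": {
            "metrics_collected": {
                "kubernetes": {
                    "cluster_name": cluster_name,
                    # Assumed key: enables Container Insights with
                    # Enhanced Observability.
                    "enhanced_container_insights": True,
                    # Assumed key: toggles NVIDIA GPU metric collection.
                    "accelerated_compute_metrics": collect_gpu_metrics,
                }
            }
        }
    }


if __name__ == "__main__":
    # "my-gpu-cluster" is a placeholder cluster name.
    print(json.dumps(build_agent_config("my-gpu-cluster"), indent=2))
```

Building the fragment as a dictionary and serializing it with `json.dumps` keeps the output valid JSON; passing `collect_gpu_metrics=False` shows how GPU collection might be switched off under the same assumptions.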

