Announcing the Preview of Amazon SageMaker Profiler: Track and visualize detailed hardware performance data for your model training workloads

This article announces the preview of Amazon SageMaker Profiler, a tool for tracking and visualizing hardware performance during deep learning model training on AWS.

  • Tracks CPU and GPU activities including utilization, kernel runs, memory operations, and data transfers
  • Provides Python modules for annotating PyTorch and TensorFlow training scripts
  • Offers a UI dashboard with visualizations of GPU/CPU active time, utilization trends, and kernel performance
  • Includes a timeline interface showing detailed kernel launches and runs at the operation level
  • Supports PyTorch 2.0.0 and 1.13.1, and TensorFlow 2.12.0 and 2.11.1, on specific GPU instance types
  • Available in US East, US West, and Europe regions
  • Generates up to 10x less profiling data than open-source alternatives

SageMaker Profiler helps ML practitioners optimize resource utilization and reduce training costs by identifying performance bottlenecks in large-scale distributed training jobs.
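
To make the idea of annotating a training script concrete, here is a minimal sketch in plain Python. It is a stand-in for illustration only, not the SageMaker Profiler module: `SpanRecorder`, `annotate`, `train_step`, and `run_training` are hypothetical names, and the recorder measures only wall-clock time on the CPU rather than GPU kernels or memory operations.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SpanRecorder:
    """Hypothetical stand-in for a profiler: collects named, timed spans."""

    def __init__(self):
        self._durations = defaultdict(list)

    @contextmanager
    def annotate(self, label):
        # Time the enclosed block and store the duration under `label`.
        start = time.perf_counter()
        try:
            yield
        finally:
            self._durations[label].append(time.perf_counter() - start)

    def counts(self):
        # Number of recorded spans per label.
        return {label: len(t) for label, t in self._durations.items()}

    def mean_seconds(self):
        # Average duration per label, in seconds.
        return {label: sum(t) / len(t) for label, t in self._durations.items()}


def train_step(batch):
    # Placeholder for a real forward/backward pass.
    return sum(x * x for x in batch)


def run_training(recorder, num_steps=3):
    # Wrap each step, and the forward pass inside it, in a named span.
    for step in range(num_steps):
        batch = [float(step + i) for i in range(1000)]
        with recorder.annotate("step"):
            with recorder.annotate("forward"):
                train_step(batch)
```

Calling `run_training(SpanRecorder())` and then inspecting `mean_seconds()` gives a per-label average, which is roughly the kind of per-region breakdown a profiler dashboard aggregates, though a real profiler also captures hardware-level activity that this sketch cannot see.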


