
Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

Machine Learning Blog



AWS and NVIDIA have announced new capabilities for AI inference in Amazon SageMaker, focusing on three key advancements:

  • NVIDIA NIM microservices now available in AWS Marketplace for SageMaker Inference
  • NVIDIA Nemotron-4 model added to SageMaker JumpStart
  • New inference-optimized P5e and G6e instances powered by NVIDIA H200 and L40S GPUs

The NIM microservices package several pre-trained models, such as Nemotron-4, Llama 3.1, and Mixtral 8x7B, and can be deployed to SageMaker Inference through AWS Marketplace. The new GPU instances offer significant performance improvements: H200 GPUs provide about 1.7 times more GPU memory than the H100 (141 GB versus 80 GB), and L40S GPUs deliver up to 2.5 times better performance than the previous generation.
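As a rough illustration of what deploying a Marketplace model package to one of the new instance types could look like, the sketch below builds the request payloads for the SageMaker `create_model`, `create_endpoint_config`, and `create_endpoint` API calls. It is not taken from the article. The helper names, the ARNs, the resource naming scheme, and the default instance type `ml.g6e.xlarge` are all assumptions for illustration; which instance types a given NIM listing supports depends on that listing.

```python
from typing import Any, Dict


def build_nim_deployment_requests(
    name: str,
    model_package_arn: str,
    role_arn: str,
    instance_type: str = "ml.g6e.xlarge",  # assumed default; check the listing
    instance_count: int = 1,
) -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for the three SageMaker calls needed to host
    a model package. Nothing is sent to AWS here."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")

    config_name = f"{name}-config"  # hypothetical naming scheme
    return {
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            # A subscribed Marketplace listing is referenced by its
            # model package ARN instead of a container image URI.
            "PrimaryContainer": {"ModelPackageName": model_package_arn},
        },
        "create_endpoint_config": {
            "EndpointConfigName": config_name,
            "ProductionVariants": [
                {
                    "VariantName": "AllTraffic",
                    "ModelName": name,
                    "InstanceType": instance_type,
                    "InitialInstanceCount": instance_count,
                }
            ],
        },
        "create_endpoint": {
            "EndpointName": f"{name}-endpoint",
            "EndpointConfigName": config_name,
        },
    }


def deploy(requests: Dict[str, Dict[str, Any]], region: str) -> None:
    """Send the payloads to SageMaker. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("sagemaker", region_name=region)
    for operation in ("create_model", "create_endpoint_config", "create_endpoint"):
        getattr(client, operation)(**requests[operation])
```

Keeping the payload builder separate from the `deploy` call makes it possible to inspect or review the configuration before any billable resources are created. A P5e deployment would pass `instance_type="ml.p5e.48xlarge"` instead of the default.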

These advancements aim to simplify AI model deployment, improve inference performance, and provide more accessible and scalable generative AI capabilities for AWS customers.





Related articles

Dec 6, 2024: Amazon SageMaker introduces new capabilities to accelerate scaling of Generative AI Inference
Aug 29, 2024: Accelerate Generative AI Inference with NVIDIA NIM Microservices on Amazon SageMaker
Jul 25, 2024: Amazon SageMaker inference launches faster auto scaling for generative AI models
Jul 9, 2024: Amazon SageMaker introduces a new generative AI inference optimization capability
