Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

Machine Learning Blog

AWS and NVIDIA have announced new capabilities for AI inference in Amazon SageMaker, focusing on three key advancements:

NVIDIA NIM microservices now available in AWS Marketplace for SageMaker Inference
NVIDIA Nemotron-4 model added to SageMaker JumpStart
New inference-optimized P5e and G6e instances powered by NVIDIA H200 and L40S GPUs

The NIM microservices package pre-trained models such as Nemotron-4, Llama 3.1, and Mixtral 8x7B, which can be deployed to SageMaker directly from AWS Marketplace. The new GPU instances offer significant improvements over previous generations: H200 GPUs provide 1.7 times more GPU memory, and L40S GPUs deliver up to 2.5 times better performance.

These advancements aim to simplify AI model deployment, improve inference performance, and provide more accessible and scalable generative AI capabilities for AWS customers.

Dec 6, 2024

Amazon SageMaker introduces new capabilities to accelerate scaling of Generative AI Inference

Aug 29, 2024

Accelerate Generative AI Inference with NVIDIA NIM Microservices on Amazon SageMaker

Jul 25, 2024

Amazon SageMaker inference launches faster auto scaling for generative AI models

Jul 9, 2024

Amazon SageMaker introduces a new generative AI inference optimization capability

Related articles