New Amazon SageMaker integration with NVIDIA NIM inference microservices


This article discusses the new integration of Amazon SageMaker with NVIDIA NIM inference microservices, which is intended to improve price-performance when running large language models (LLMs) on NVIDIA GPU-accelerated infrastructure.

Specifically, the article covers:

SageMaker integration with NVIDIA NIM, which provides high-performance AI containers for LLM inference
NIM's support for pre-optimized LLMs such as Llama, Mistral, NVIDIA Nemotron, and StarCoder
Ability to create GPU-optimized versions of other LLMs using NIM tools
Deployment of NIM containers on SageMaker by creating inference endpoints
Availability of NIM containers in all AWS regions where SageMaker is available
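
The deployment step listed above (creating a SageMaker inference endpoint from a NIM container) can be sketched roughly as follows. This is a minimal illustration, not the procedure from the article: it assumes the standard three-call SageMaker flow (create_model, create_endpoint_config, create_endpoint) as exposed by boto3, and the endpoint name, image URI, role ARN, instance type, and environment variables are all hypothetical placeholders that would have to be replaced with values from your own account and the NIM container documentation.

```python
from typing import Any, Dict, Optional


def build_requests(
    name: str,
    image_uri: str,
    role_arn: str,
    instance_type: str = "ml.g5.2xlarge",  # assumed GPU instance type; pick one suited to the model
    environment: Optional[Dict[str, str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Build the keyword arguments for the three SageMaker calls that create an endpoint."""
    return {
        # Registers the container image (here, a NIM image) as a SageMaker model.
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "PrimaryContainer": {
                "Image": image_uri,
                # Any variables the container needs; none are assumed here.
                "Environment": dict(environment or {}),
            },
        },
        # Describes the hardware that will host the model.
        "create_endpoint_config": {
            "EndpointConfigName": name,
            "ProductionVariants": [
                {
                    "VariantName": "AllTraffic",
                    "ModelName": name,
                    "InstanceType": instance_type,
                    "InitialInstanceCount": 1,
                }
            ],
        },
        # Starts the endpoint from the configuration above.
        "create_endpoint": {
            "EndpointName": name,
            "EndpointConfigName": name,
        },
    }


def deploy(sm_client: Any, requests: Dict[str, Dict[str, Any]]) -> None:
    """Issue the three calls in order against a SageMaker client."""
    for operation in ("create_model", "create_endpoint_config", "create_endpoint"):
        getattr(sm_client, operation)(**requests[operation])


# Usage (requires boto3 and AWS credentials; all values below are placeholders):
#   import boto3
#   requests = build_requests(
#       name="nim-llm-demo",
#       image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<nim-image>:<tag>",
#       role_arn="arn:aws:iam::<account>:role/<sagemaker-execution-role>",
#   )
#   deploy(boto3.client("sagemaker"), requests)
```

Keeping the request construction separate from the calls makes it possible to inspect or log the payloads before anything is created in the account.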
