New Amazon SageMaker integration with NVIDIA NIM inference microservices
This article describes the new integration of Amazon SageMaker with NVIDIA NIM inference microservices, which is intended to improve price-performance when running large language models (LLMs) on NVIDIA GPU-accelerated infrastructure.
Specifically, the article covers:
- SageMaker integration with NVIDIA NIM, which provides high-performance AI containers for LLM inference
- NIM's support for pre-optimized LLMs such as Llama, Mistral, NVIDIA Nemotron, and StarCoder
- Ability to create GPU-optimized versions of other LLMs using NIM tools
- Deployment of NIM containers on SageMaker by creating inference endpoints
- Availability of NIM containers in all AWS regions where SageMaker is available
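The endpoint-creation step in the list above can be sketched as follows. This is a minimal illustration, not the article's own procedure: it assumes the standard three SageMaker hosting calls (`create_model`, `create_endpoint_config`, `create_endpoint`) as exposed by a boto3 SageMaker client, and the function names, the `AllTraffic` variant name, and every argument value are hypothetical placeholders. The actual NIM image URI, instance type, and any required container environment variables are not given in this summary and would have to come from the NIM and SageMaker documentation.

```python
def build_nim_endpoint_requests(name, image_uri, role_arn, instance_type,
                                environment=None):
    """Build payloads for the three SageMaker hosting calls.

    All argument values are supplied by the caller; nothing here is a real
    NIM image URI or a recommended instance type.
    """
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,  # container image to host (assumed: a NIM image)
            "Environment": dict(environment or {}),  # container env vars, if any
        },
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",  # hypothetical variant name
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy_nim_endpoint(sagemaker_client, **kwargs):
    """Issue the three calls in order against a SageMaker client.

    `sagemaker_client` is assumed to behave like boto3.client("sagemaker").
    """
    model, endpoint_config, endpoint = build_nim_endpoint_requests(**kwargs)
    sagemaker_client.create_model(**model)
    sagemaker_client.create_endpoint_config(**endpoint_config)
    return sagemaker_client.create_endpoint(**endpoint)
```

With boto3 installed and AWS credentials configured, `deploy_nim_endpoint(boto3.client("sagemaker"), name=..., image_uri=..., role_arn=..., instance_type=...)` would start endpoint creation. Keeping the payload construction separate from the API calls lets the requests be inspected or tested without touching an AWS account.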