Deploy Meta Llama 3.1 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium

Machine Learning Blog

AWS has announced the availability of Meta Llama 3.1 models on Amazon SageMaker JumpStart, deployable on AWS Trainium and Inferentia instances with significant cost benefits.

Meta Llama 3.1 is a family of multilingual LLMs in 8B, 70B, and 405B parameter sizes
The models support a context length of 128,000 tokens and are optimized for inference
They are available in both base and instruction-tuned variants
Deployment is possible through the SageMaker Studio UI or the SageMaker Python SDK
Deploying on Inferentia and Trainium can reduce inference costs by up to 50% compared to GPU deployment

Together, these options give teams flexible ways to deploy large language models, with AWS Neuron providing the performance and cost improvements on Inferentia and Trainium instances.
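The Python SDK path mentioned above can be sketched roughly as follows. This is a minimal sketch, not the article's own code: the model ID, the instance type, and the `inputs`/`parameters` payload shape are assumptions that should be checked against the JumpStart model catalog, and `build_payload` and `deploy_and_query` are hypothetical helper names introduced here for illustration. Running `deploy_and_query` needs the `sagemaker` package, AWS credentials, and an execution role, and it creates a billable endpoint.

```python
from typing import Any, Dict

# Assumed values, not given in the summary above: confirm the exact model ID
# and the supported Inferentia/Trainium instance types in the JumpStart catalog.
ASSUMED_MODEL_ID = "meta-textgenerationneuron-llama-3-1-8b"
ASSUMED_INSTANCE_TYPE = "ml.inf2.48xlarge"


def build_payload(prompt: str, max_new_tokens: int = 128,
                  temperature: float = 0.6) -> Dict[str, Any]:
    """Build a text-generation request body (assumed inputs/parameters shape)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
    }


def deploy_and_query(prompt: str,
                     model_id: str = ASSUMED_MODEL_ID,
                     instance_type: str = ASSUMED_INSTANCE_TYPE) -> Any:
    """Deploy a JumpStart model, send one prompt, then clean up the endpoint."""
    # Imported lazily so build_payload can be used without the SDK installed.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id=model_id, instance_type=instance_type)
    # Meta Llama models require explicitly accepting the end-user license agreement.
    predictor = model.deploy(accept_eula=True)
    try:
        return predictor.predict(build_payload(prompt))
    finally:
        # Delete the model before the endpoint to avoid ongoing charges.
        predictor.delete_model()
        predictor.delete_endpoint()
```

Keeping the payload builder separate from the deployment call makes it possible to inspect or unit-test the request body without touching AWS. Deploying through the SageMaker Studio UI would replace only the `deploy_and_query` step.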


Jan 17, 2024
Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium

May 2, 2024
AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in Amazon SageMaker JumpStart

Sep 25, 2024
Llama 3.2 models from Meta are now available in Amazon SageMaker JumpStart

Apr 18, 2024
Meta Llama 3 models are now available in Amazon SageMaker JumpStart



Related articles