Amazon SageMaker Inference now supports G6e instances
Machine Learning Blog
AWS has announced support for G6e instances on Amazon SageMaker Inference, powered by NVIDIA L40S Tensor Core GPUs, offering enhanced capabilities for generative AI workloads.
- Supports 1, 4, and 8 GPU configurations with 48 GB of GPU memory per GPU
- Can deploy large language models up to 14B parameters on a single GPU node
- Provides up to 400 Gbps networking throughput
- Offers better performance and cost-effectiveness compared to G5 instances
- Ideal for use cases like chatbots, text generation, and image generation
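The single-GPU sizing claim above can be sanity-checked with rough arithmetic. The sketch below assumes FP16 weights (2 bytes per parameter), which the summary does not state, and counts model weights only; KV cache, activations, and runtime overhead would need to fit in the remaining headroom. The function name `weight_memory_gb` is illustrative.

```python
# Rough estimate of the GPU memory needed for model weights alone.
# Assumption (not from the announcement): FP16 weights, 2 bytes per parameter.

GPU_MEMORY_GB = 48  # per-GPU memory stated for G6e instances


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


weights_gb = weight_memory_gb(14e9)        # 14B-parameter model
headroom_gb = GPU_MEMORY_GB - weights_gb   # left for KV cache, activations, etc.

print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

Under these assumptions a 14B-parameter model leaves a meaningful share of the 48 GB free, which is consistent with the claim about handling longer context lengths on a single GPU.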
Key performance improvements include up to 37% lower latency and 60% higher throughput, along with the ability to serve larger models with longer context lengths more efficiently.
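On SageMaker, the instance type is chosen in the endpoint configuration. The following is a minimal sketch of a request payload for boto3's `create_endpoint_config`, not a complete deployment. The model name, config name, and variant name are hypothetical placeholders, and `ml.g6e.xlarge` is assumed here as the single-GPU size; check the instance types available in your Region before using it.

```python
# Build the payload for SageMaker's CreateEndpointConfig API.
# All names below are hypothetical placeholders, not values from the announcement.


def g6e_endpoint_config(model_name: str,
                        config_name: str,
                        instance_type: str = "ml.g6e.xlarge",  # assumed single-GPU size
                        instance_count: int = 1) -> dict:
    """Return keyword arguments suitable for create_endpoint_config."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,  # must already exist in SageMaker
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }


cfg = g6e_endpoint_config("my-llm", "my-llm-g6e-config")
print(cfg["ProductionVariants"][0]["InstanceType"])

# To submit the request (requires AWS credentials and an existing model):
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**cfg)
```

Keeping the payload in a plain function makes it easy to swap the instance type when comparing G5 and G6e endpoints side by side.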
Related articles
- Dec 11, 2024: Amazon SageMaker AI announces availability of P5e and G6e instances for Inference
- Oct 15, 2024: Amazon SageMaker Studio notebooks now support G6e instance types
- Apr 20, 2026: Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances
- Apr 21, 2022: Amazon SageMaker Serverless Inference is now generally available