Amazon SageMaker Inference now supports G6e instances

Machine Learning Blog

AWS has announced support for G6e instances on Amazon SageMaker Inference, powered by NVIDIA L40S Tensor Core GPUs, offering enhanced capabilities for generative AI workloads.

Supports 1, 4, and 8 GPU configurations with 48 GB of GPU memory per GPU
Can deploy large language models of up to 14B parameters on a single-GPU node
Provides up to 400 Gbps networking throughput
Offers better performance and cost-effectiveness compared to G5 instances
Ideal for use cases like chatbots, text generation, and image generation
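
The sizing figures above can be sanity-checked with some rough arithmetic. The sketch below is only an illustration, not AWS's sizing method. It assumes 16-bit weights (2 bytes per parameter), counts weights only, and reserves a hypothetical 30% of GPU memory as headroom for the KV cache, activations, and runtime overhead. The function names and the headroom value are assumptions made for this example.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights alone, in decimal GB.

    Assumption: 2 bytes per parameter (16-bit weights) unless overridden.
    """
    return num_params * bytes_per_param / 1e9


def fits(num_params: float,
         gpu_count: int,
         gpu_memory_gb: float = 48.0,   # per-GPU memory stated in the summary
         bytes_per_param: int = 2,
         headroom: float = 0.3) -> bool:  # hypothetical reserve, not an AWS figure
    """Return True if the weights fit in the memory left after the reserve."""
    usable_gb = gpu_count * gpu_memory_gb * (1 - headroom)
    return weight_memory_gb(num_params, bytes_per_param) <= usable_gb


if __name__ == "__main__":
    # A 14B-parameter model with 16-bit weights on one 48 GB GPU.
    print(weight_memory_gb(14e9))
    print(fits(14e9, gpu_count=1))
```

Under these assumptions a 14B-parameter model needs about 28 GB for weights, which leaves room on a single 48 GB GPU. Longer context lengths increase the KV cache, so the real limit depends on the workload and serving stack.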

Key performance improvements include up to 37% lower latency, 60% higher throughput, and more efficient handling of larger models with longer context lengths.


Related articles

Dec 11, 2024
Amazon SageMaker AI announces availability of P5e and G6e instances for Inference

Jul 23, 2026
Amazon SageMaker AI inference now supports G7 instances

Oct 15, 2024
Amazon SageMaker Studio notebooks now support G6e instance types

Jun 23, 2026
SageMaker Notebook Instances now support G6e instance types