Home icon

Introducing latency-optimized inference for foundation models in Amazon Bedrock

News



Amazon Bedrock introduces latency-optimized inference for foundation models, enhancing AI application performance.

  • Supports Anthropic's Claude 3.5 Haiku and Meta's Llama 3.1 405B and 70B models
  • Provides faster response times without compromising accuracy
  • Leverages AWS Trainium2 and advanced software optimizations
  • Requires no additional setup or model fine-tuning
  • Particularly beneficial for latency-sensitive applications like chatbots

The feature is currently available in the US East (Ohio) Region, offering improved inference speed for generative AI applications.



Go to article

The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.

Related articles

Mar 5
2025
Announcing latency-optimized inference for Amazon Nova Pro foundation model in Amazon Bedrock
Dec 23
2024
Amazon Bedrock Agents, Flows, and Knowledge Bases now supports Latency Optimized Models
Nov 6
2024
Integrate foundation models into your code with Amazon Bedrock
Jan 28
2025
Optimizing AI responsiveness: A practical guide to Amazon Bedrock latency-optimized inference

The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.