Amazon Bedrock introduces prompt caching, a new feature that can reduce costs by up to 90% and latency by up to 85% by caching repetitive prompts across multiple API calls.


<div>
<p>
Amazon Bedrock has announced a preview of prompt caching, a new feature designed to optimize generative AI model performance and reduce costs.
</p>
<ul>
<li>Reduces costs by up to 90% and latency by up to 85% for supported models</li>
<li>Caches frequently used prompts across multiple API calls</li>
<li>Avoids reprocessing repetitive context like system prompts and common examples</li>
<li>Currently available for Claude 3.5 Haiku, Claude 3.5 Sonnet v2, and Nova models</li>
<li>Initially limited to select customers in US West (Oregon) and US East (N. Virginia) regions</li>
</ul>
<p>
The feature is part of Amazon Bedrock's broader goal of providing secure, privacy-focused generative AI capabilities with improved performance and cost-efficiency.
</p>
</div>


Amazon Bedrock announces preview of prompt caching

Related articles

Related articles

Apr 7
2025
Amazon Bedrock announces general availability of prompt caching

Apr 7
2025
Effectively use prompt caching on Amazon Bedrock

Jan 26
2026
Amazon Bedrock now supports 1-hour duration for prompt caching

Jul 10
2024
Amazon Bedrock Prompt Management and Prompt Flows now available in preview