Evaluate models or RAG systems using Amazon Bedrock Evaluations – Now generally available
Machine Learning Blog
Amazon has announced the general availability of Amazon Bedrock Evaluations, which lets organizations systematically evaluate models and Retrieval Augmented Generation (RAG) systems, including those hosted outside Amazon Bedrock. Key highlights include:
- Bring Your Own Inference (BYOI) capabilities for evaluating models and RAG systems from any provider
- New citation metrics for RAG systems: citation precision and citation coverage
- Ability to evaluate model responses using an LLM-as-a-judge (LLMaaJ) workflow
- Support for evaluating retrieval and generation quality across different platforms
- Flexible evaluation using the Amazon Bedrock console, the Python SDK, and APIs
The release enables organizations to assess generative AI application performance systematically, regardless of where the models or systems are deployed, helping improve quality and reliability.
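To give a rough intuition for the two citation metrics named above, here is a minimal Python sketch. It is a simplified, set-based illustration and not the formula Amazon Bedrock uses: the announcement describes an LLM-as-a-judge workflow, whereas this sketch assumes you already have hand-labeled passage IDs. The function names, the passage IDs, and the notion of a "supporting" passage set are all hypothetical.

```python
def citation_precision(cited, supporting):
    """Fraction of cited passages that actually support the response.

    Assumption: `supporting` is a hand-labeled set of passage IDs that
    back the response. Bedrock itself relies on a judge model instead.
    """
    cited, supporting = set(cited), set(supporting)
    if not cited:
        return 0.0  # nothing cited, so no citation can be correct
    return len(cited & supporting) / len(cited)


def citation_coverage(cited, supporting):
    """Fraction of supporting passages that the response actually cites."""
    cited, supporting = set(cited), set(supporting)
    if not supporting:
        return 1.0  # nothing needed a citation, so nothing is missing
    return len(cited & supporting) / len(supporting)


# Hypothetical example: the response cites p1 and p4, but the passages
# that really support it are p1, p2, p3, and p5.
cited = ["p1", "p4"]
supporting = ["p1", "p2", "p3", "p5"]
print(citation_precision(cited, supporting))  # 0.5
print(citation_coverage(cited, supporting))   # 0.25
```

In this toy reading, low precision points to citations that do not back the answer, while low coverage points to supported claims whose sources were never cited.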