
Evaluate models or RAG systems using Amazon Bedrock Evaluations – Now generally available

Machine Learning Blog



Amazon has announced the general availability of Amazon Bedrock Evaluations, which lets organizations systematically evaluate machine learning models and Retrieval Augmented Generation (RAG) systems wherever they are deployed. Key highlights include:

  • Bring Your Own Inference (BYOI) capabilities for evaluating models and RAG systems from any provider
  • New citation metrics for RAG systems: citation precision and citation coverage
  • Ability to evaluate model responses using an LLM-as-a-judge (LLMaaJ) workflow
  • Support for evaluating retrieval and generation quality across different platforms
  • Flexible evaluation using the Amazon Bedrock console, the Python SDK, and APIs

The release enables organizations to assess the performance of generative AI applications regardless of where the models or systems are deployed, which helps improve their quality and reliability.
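
The summary above names citation precision and citation coverage but does not define them. As a rough illustration only, the sketch below assumes a simple set-based reading: precision is the share of cited passages that are actually relevant, and coverage is the share of relevant passages that were cited. The function names and passage IDs are hypothetical, and the way Amazon Bedrock actually computes these metrics may differ.

```python
# Illustrative sketch only: a set-based interpretation of the two citation
# metrics. This is an assumption, not Amazon Bedrock's implementation.

def citation_precision(cited, relevant):
    """Fraction of cited passages that are relevant (assumed definition)."""
    cited, relevant = set(cited), set(relevant)
    if not cited:
        return 0.0  # nothing was cited, so no citation can be correct
    return len(cited & relevant) / len(cited)


def citation_coverage(cited, relevant):
    """Fraction of relevant passages that were cited (assumed definition)."""
    cited, relevant = set(cited), set(relevant)
    if not relevant:
        return 0.0  # nothing needed citing; treated as 0.0 in this sketch
    return len(cited & relevant) / len(relevant)


# Hypothetical passage IDs for a single RAG response.
cited_passages = ["p1", "p2", "p3"]
relevant_passages = ["p1", "p3", "p4", "p5"]

precision = citation_precision(cited_passages, relevant_passages)
coverage = citation_coverage(cited_passages, relevant_passages)
print(f"precision={precision:.2f} coverage={coverage:.2f}")
```

Under this reading, a response that cites few but accurate passages scores high on precision and low on coverage, so the two numbers are best read together.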




Related articles

Mar 20, 2025: Amazon Bedrock now supports RAG Evaluation (generally available)
Apr 23, 2024: Amazon Bedrock model evaluation is now generally available
Mar 14, 2025: Evaluating RAG applications with Amazon Bedrock knowledge base evaluation
Oct 9, 2024: Amazon Bedrock Model Evaluation now supports evaluating custom models
