
New RAG evaluation and LLM-as-a-judge capabilities in Amazon Bedrock

AWS News Blog



AWS has announced two new evaluation capabilities in Amazon Bedrock for improving generative AI applications:

  • RAG (Retrieval Augmented Generation) evaluation in Knowledge Bases
  • Model Evaluation with LLM-as-a-judge capabilities

Key features include:

  • Automatic knowledge base assessment using large language models
  • Evaluation metrics covering quality and responsible AI criteria
  • Side-by-side comparison of different configuration settings
  • Normalized scoring from 0 to 1
  • Natural language explanations for evaluation results
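
As an illustration of how normalized 0 to 1 scores make configurations comparable, the following minimal sketch averages per-metric scores and ranks two setups. Everything in it is an assumption made for illustration: the configuration names, metric names, score values, and the helper functions `average_score` and `rank_configurations` are hypothetical and are not output from, or part of, any Amazon Bedrock API.

```python
from statistics import mean

# Hypothetical per-metric scores, each assumed to be normalized to the 0-1 range.
# Names and values are illustrative only, not results produced by Amazon Bedrock.
results = {
    "chunking_fixed_300": {
        "correctness": 0.82,
        "completeness": 0.74,
        "helpfulness": 0.79,
    },
    "chunking_hierarchical": {
        "correctness": 0.88,
        "completeness": 0.81,
        "helpfulness": 0.77,
    },
}


def average_score(scores):
    """Return the unweighted mean of the metric scores, rejecting values outside 0-1."""
    for metric, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{metric} score {value} is outside the 0-1 range")
    return mean(scores.values())


def rank_configurations(all_results):
    """Return (configuration, average score) pairs sorted from best to worst."""
    averaged = [(name, average_score(scores)) for name, scores in all_results.items()]
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_configurations(results):
        print(f"{name}: {score:.2f}")
```

An unweighted mean is only one possible way to aggregate; a team that cares more about one criterion than another could weight the metrics instead.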

Both capabilities were announced in preview in multiple AWS Regions. They aim to help developers streamline testing and improve AI-powered applications more efficiently.





Related articles

  • Dec 2, 2024: Amazon Bedrock Model Evaluation now includes LLM-as-a-judge (Preview)
  • Dec 2, 2024: Amazon Bedrock Knowledge Bases now supports RAG evaluation (Preview)
  • Mar 20, 2025: Amazon Bedrock now supports RAG Evaluation (generally available)
  • Mar 20, 2025: Amazon Bedrock Model Evaluation LLM-as-a-judge is now generally available