New RAG evaluation and LLM-as-a-judge capabilities in Amazon Bedrock
AWS News Blog
AWS has announced two new evaluation capabilities in Amazon Bedrock for improving generative AI applications:
- RAG (Retrieval Augmented Generation) evaluation in Knowledge Bases
- Model Evaluation with LLM-as-a-judge capabilities
Key features include:
- Automatic knowledge base assessment using large language models
- Metrics covering quality and responsible AI criteria
- Comparison of different configuration settings
- Normalized scoring from 0 to 1
- Natural language explanations for evaluation results
Both capabilities launched in preview in multiple AWS Regions and aim to help developers test and improve AI-powered applications more efficiently.
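To make the normalized 0-to-1 scoring and the configuration comparison more concrete, here is a minimal Python sketch. It does not call Amazon Bedrock. The configuration names, metric names, and score values are all hypothetical, and averaging the metrics is just one simple way to rank configurations, not necessarily how the service reports results.

```python
from statistics import mean

# Hypothetical per-metric scores, each assumed to be normalized to [0, 1].
# Names and values are illustrative only, not output from Amazon Bedrock.
SCORES = {
    "config_a": {"correctness": 0.80, "completeness": 0.70, "helpfulness": 0.75},
    "config_b": {"correctness": 0.85, "completeness": 0.80, "helpfulness": 0.75},
}


def validate(scores: dict[str, dict[str, float]]) -> None:
    """Raise ValueError if any score falls outside the normalized range."""
    for config, metrics in scores.items():
        for metric, value in metrics.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{config}/{metric} is out of range: {value}")


def summarize(scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """Return the mean metric score for each configuration."""
    validate(scores)
    return {config: mean(metrics.values()) for config, metrics in scores.items()}


def best_config(scores: dict[str, dict[str, float]]) -> str:
    """Return the name of the configuration with the highest mean score."""
    summary = summarize(scores)
    return max(summary, key=summary.get)


if __name__ == "__main__":
    for config, avg in summarize(SCORES).items():
        print(f"{config}: {avg:.2f}")
    print("best:", best_config(SCORES))
```

Because every metric shares the same 0-to-1 scale, scores can be averaged or compared side by side without rescaling. A real comparison would likely weigh metrics differently and treat responsible AI metrics separately from quality metrics.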
Related articles
Dec 2, 2024
Amazon Bedrock Model Evaluation now includes LLM-as-a-judge (Preview)
Dec 2, 2024
Amazon Bedrock Knowledge Bases now supports RAG evaluation (Preview)
Mar 20, 2025
Amazon Bedrock now supports RAG Evaluation (generally available)
Mar 20, 2025
Amazon Bedrock Model Evaluation LLM-as-a-judge is now generally available