Amazon Bedrock Model Evaluation now includes LLM-as-a-judge (Preview)


Amazon Bedrock has introduced a new Model Evaluation feature called LLM-as-a-judge (Preview) that allows users to evaluate and compare foundation models more effectively.

Users can choose an LLM as a judge to evaluate models
Select from multiple judge LLMs available on Amazon Bedrock
Assess quality metrics like correctness, completeness, and professional tone
Evaluate responsible AI metrics such as harmfulness and answer refusal
Bring custom prompt datasets for personalized evaluations

This approach provides human-like evaluation quality at lower cost and in significantly less time than traditional human-based evaluation, and it offers a more sophisticated alternative to earlier automatic evaluation methods.
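As a rough illustration of the custom prompt datasets mentioned above, the following sketch serializes a few prompts into JSON Lines, one JSON object per line. The key names (`prompt`, `referenceResponse`, `category`), the helper `build_prompt_dataset`, and the example records are assumptions made for illustration and do not come from the announcement; check the Amazon Bedrock documentation for the exact dataset format before relying on it.

```python
import json


def build_prompt_dataset(records):
    """Serialize records to JSON Lines text, one JSON object per line.

    The key names used here (prompt, referenceResponse, category) are an
    assumption about the dataset format, not something the announcement
    states; verify them against the Amazon Bedrock documentation.
    """
    lines = []
    for record in records:
        if not record.get("prompt"):
            raise ValueError("every record needs a non-empty 'prompt'")
        entry = {"prompt": record["prompt"]}
        # Optional fields are copied only when the record provides them.
        for key in ("referenceResponse", "category"):
            if key in record:
                entry[key] = record[key]
        lines.append(json.dumps(entry, ensure_ascii=False))
    # Terminate every line with a newline; empty input yields an empty string.
    return "".join(line + "\n" for line in lines)


# Hypothetical example records, invented for illustration only.
EXAMPLE_RECORDS = [
    {
        "prompt": "What is the capital of France?",
        "referenceResponse": "Paris",
        "category": "geography",
    },
    {"prompt": "Summarize the water cycle in one sentence."},
]

if __name__ == "__main__":
    print(build_prompt_dataset(EXAMPLE_RECORDS), end="")
```

The resulting text could be saved to a file and uploaded wherever the evaluation job expects its dataset; that step is omitted here because it depends on account-specific settings.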


Mar 20, 2025

Amazon Bedrock Model Evaluation LLM-as-a-judge is now generally available

Feb 12, 2025

LLM-as-a-judge on Amazon Bedrock Model Evaluation

Dec 2, 2024

New RAG evaluation and LLM-as-a-judge capabilities in Amazon Bedrock

Oct 9, 2024

Amazon Bedrock Model Evaluation now supports evaluating custom models



Related articles