Amazon Bedrock Model Evaluation now includes LLM-as-a-judge (Preview)
Amazon Bedrock Model Evaluation now offers LLM-as-a-judge (Preview), a feature that uses a large language model to evaluate and compare the responses of foundation models.
- Choose an LLM to act as the judge when evaluating models
- Select from multiple judge LLMs available on Amazon Bedrock
- Assess quality metrics like correctness, completeness, and professional tone
- Evaluate responsible AI metrics such as harmfulness and answer refusal
- Bring custom prompt datasets for personalized evaluations
This approach provides human-like evaluation quality at lower cost and in significantly less time than human-based evaluation, and it offers a more sophisticated alternative to earlier automatic evaluation methods.
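As an illustration of the custom prompt dataset option listed above, here is a minimal Python sketch that serializes a few prompts to JSON Lines. The field names `prompt`, `referenceResponse`, and `category` are an assumption about the dataset layout and are not stated in this announcement, and the helper `to_jsonl` and the sample records are hypothetical. Check the Amazon Bedrock documentation for the exact schema before using it.

```python
import json

# Assumed field names: "prompt" (required), "referenceResponse" and
# "category" (optional). These are not confirmed by the announcement.
OPTIONAL_KEYS = ("referenceResponse", "category")


def to_jsonl(records):
    """Serialize records to JSON Lines text, one JSON object per line."""
    lines = []
    for i, rec in enumerate(records):
        if not rec.get("prompt"):
            raise ValueError(f"record {i} is missing a non-empty 'prompt'")
        row = {"prompt": rec["prompt"]}
        # Copy optional fields only when they are present and non-empty.
        for key in OPTIONAL_KEYS:
            if rec.get(key):
                row[key] = rec[key]
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Hypothetical sample records, for illustration only.
sample_records = [
    {
        "prompt": "Summarize the refund policy in two sentences.",
        "referenceResponse": "Refunds are issued within 30 days of purchase.",
        "category": "Summarization",
    },
    {"prompt": "Explain what a foundation model is."},
]

dataset_text = to_jsonl(sample_records)
```

The resulting text can be written to a `.jsonl` file and supplied as the evaluation dataset. Validating each record before serialization surfaces a missing prompt early, before an evaluation job is started.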