Amazon Bedrock Model Evaluation LLM-as-a-judge is now generally available
Amazon Bedrock Model Evaluation's LLM-as-a-judge capability is now generally available. Key points:
- Evaluates and compares models using another LLM as the judge
- Supports quality metrics such as correctness, completeness, and professional tone
- Includes responsible AI metrics such as harmfulness and answer refusal
- Works with all Amazon Bedrock models, including serverless and marketplace models
- Adds a "bring your own inference responses" option for evaluating outputs from any model or system
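To make the idea concrete, here is a minimal, generic sketch of LLM-as-a-judge scoring over pre-generated responses. It is not the Amazon Bedrock API: the record fields (`prompt`, `response`), the metric names, the 1 to 5 scale, and the `stub_judge` function are all assumptions made for illustration, and a real setup would replace the stub with a call to an actual judge model.

```python
import json
from typing import Callable

# Hypothetical metric names, loosely mirroring the quality metrics listed above.
METRICS = ("correctness", "completeness", "professional_tone")


def build_judge_prompt(prompt: str, response: str, metric: str) -> str:
    """Build the text sent to the judge model for one metric."""
    return (
        f"Rate the response for {metric} on a scale of 1 to 5.\n"
        f"Prompt: {prompt}\n"
        f"Response: {response}\n"
        'Reply with JSON like {"score": 3}.'
    )


def judge_records(records: list, judge: Callable[[str], str]) -> dict:
    """Average the judge's score per metric over all records."""
    if not records:
        return {}
    totals = {metric: 0.0 for metric in METRICS}
    for rec in records:
        for metric in METRICS:
            reply = judge(build_judge_prompt(rec["prompt"], rec["response"], metric))
            totals[metric] += json.loads(reply)["score"]
    return {metric: totals[metric] / len(records) for metric in METRICS}


def stub_judge(judge_prompt: str) -> str:
    """Placeholder judge that always returns 4; swap in a real model call."""
    return json.dumps({"score": 4})


# "Bring your own" responses: outputs already produced by any model or system.
records = [
    {"prompt": "What is 2 + 2?", "response": "4"},
    {"prompt": "Name the capital of France.", "response": "Paris"},
]
scores = judge_records(records, stub_judge)
print(scores)
```

Passing the judge in as a plain callable keeps the scoring loop independent of any particular provider, so the same loop works whether the responses came from a hosted model or an external system.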
The service provides human-like evaluation quality at a lower cost than human evaluation and significantly reduces evaluation time, making model selection more efficient and comprehensive.