AI judging AI: Scaling unstructured text analysis with Amazon Nova

Machine Learning Blog

The article describes an approach to scaling unstructured text analysis that uses multiple large language models (LLMs) as judges on Amazon Bedrock, so that large volumes of customer feedback can be analyzed efficiently.

Proposed a workflow using multiple AI models to generate and evaluate thematic summaries of text data
Demonstrated how to use Amazon Nova Pro, Claude 3 Sonnet, and other models to analyze and rate feedback
Implemented statistical metrics like Cohen's kappa and Spearman's rho to compare model performance
Showed that LLMs can reach up to 91% inter-model agreement, compared with 79% human-to-model agreement
Highlighted the importance of human oversight despite high AI performance

The solution offers a scalable method for organizations to analyze large volumes of unstructured text data quickly and reliably using generative AI technologies.


Jul 17
2025

Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI

Sep 2
2025

Natural language-based database analytics with Amazon Nova

Oct 16
2025

Optimizing document AI and structured outputs by fine-tuning Amazon Nova Models and on-demand inference

Apr 21
2025

Build an automated generative AI solution evaluation pipeline with Amazon Nova


