Improve LLM performance with human and AI feedback on Amazon SageMaker for Amazon Engineering

Machine Learning Blog

This article discusses how the Amazon EU Design and Construction team improved the performance of their question answering bot using human and AI feedback on Amazon SageMaker.

Specifically, the article covers:

Collecting user feedback from Amazon engineers during a pilot project to identify limitations in the bot's responses
Performing supervised fine-tuning on a Mistral-7B model using the feedback
Using another LLM (Anthropic Claude 2) to generate AI feedback scores for bot responses to augment the limited human feedback
Using reinforcement learning with the human and AI feedback data to further fine-tune the Mistral-7B model
Measuring the bot's improved performance after reinforcement learning, shown by an 8% increase in AI feedback scores
Concluding with the benefits of using AI feedback to scale reinforcement learning while reducing subject matter expert workload
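
As a rough illustration of how human and AI feedback might be merged into one reward signal, here is a minimal sketch. It is not the article's implementation: the `FeedbackRecord` type, the rule of preferring a human score over an AI score, and the reading of the 8% figure as a relative change are all assumptions made for this example, since the summary above does not say how scores were combined or how the increase was calculated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackRecord:
    """One bot response with whatever feedback exists for it (hypothetical schema)."""
    question: str
    response: str
    human_score: Optional[float] = None  # rating from a subject matter expert, if any
    ai_score: Optional[float] = None     # rating produced by a judge LLM, if any


def reward(record: FeedbackRecord) -> float:
    """Assumed rule: use the human score when present, otherwise the AI score."""
    if record.human_score is not None:
        return record.human_score
    if record.ai_score is not None:
        return record.ai_score
    raise ValueError("record has no feedback score")


def mean_reward(records: list[FeedbackRecord]) -> float:
    """Average reward over a non-empty batch of records."""
    if not records:
        raise ValueError("no records to average")
    return sum(reward(r) for r in records) / len(records)


def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent (assumes before != 0)."""
    return (after - before) / before * 100.0
```

With this sketch, a record that has both scores contributes its human score, and a record rated only by the judge model contributes its AI score, so the sparse human feedback is never overwritten by the more plentiful AI feedback.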

Jun 24, 2025

Power Your LLM Training and Evaluation with the New SageMaker AI Generative AI Tools

Jul 24, 2024

LLM experimentation at scale using Amazon SageMaker Pipelines and MLflow

Apr 22, 2025

Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15

Dec 24, 2025

Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer

Related articles