
Improve LLM performance with human and AI feedback on Amazon SageMaker for Amazon Engineering

Machine Learning Blog



This article discusses how the Amazon EU Design and Construction team improved the performance of their question answering bot using human and AI feedback on Amazon SageMaker.

Specifically, the article covers:

  • Collecting user feedback from Amazon engineers during a pilot project to identify limitations in the bot's responses
  • Performing supervised fine-tuning on a Mistral-7B model using the feedback
  • Using another LLM (Anthropic Claude 2) to generate AI feedback scores for bot responses to augment the limited human feedback
  • Using reinforcement learning with the human and AI feedback data to further fine-tune the Mistral-7B model
  • Measuring the bot's performance after reinforcement learning, which showed an 8% increase in AI feedback scores
  • Concluding that AI feedback helps scale reinforcement learning while reducing the workload on subject matter experts
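
The feedback-combination step in the list above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the `FeedbackRecord` type, the "prefer the human score, fall back to the AI score" rule, the helper names, and all sample numbers are assumptions made for this example. It shows one plausible way to merge sparse human ratings with AI-generated ratings into a reward dataset, and how a relative score increase such as the reported 8% could be computed from before-and-after evaluation scores.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Tuple


@dataclass
class FeedbackRecord:
    """One bot answer with whatever ratings exist for it (hypothetical schema)."""
    question: str
    response: str
    human_score: Optional[float] = None  # rating from a subject matter expert, if any
    ai_score: Optional[float] = None     # rating from a judge LLM, if any


def reward(record: FeedbackRecord) -> Optional[float]:
    """Assumed rule: use the human score when present, otherwise the AI score."""
    if record.human_score is not None:
        return record.human_score
    return record.ai_score


def build_reward_dataset(records: List[FeedbackRecord]) -> List[Tuple[str, str, float]]:
    """Keep only records that carry at least one score, paired with that score."""
    dataset = []
    for r in records:
        score = reward(r)
        if score is not None:
            dataset.append((r.question, r.response, score))
    return dataset


def relative_improvement(before: List[float], after: List[float]) -> float:
    """Percentage change in mean score between two evaluation runs."""
    base = mean(before)
    return (mean(after) - base) / base * 100.0


# Made-up sample data, for illustration only.
records = [
    FeedbackRecord("Q1", "A1", human_score=1.0, ai_score=0.7),  # human rating wins
    FeedbackRecord("Q2", "A2", ai_score=0.4),                   # AI rating fills the gap
    FeedbackRecord("Q3", "A3"),                                 # unrated, dropped
]
reward_dataset = build_reward_dataset(records)
uplift = relative_improvement(before=[0.5, 0.5], after=[0.54, 0.54])
```

With these sample values, `reward_dataset` holds two rows and `uplift` is approximately 8 percent. In a real pipeline the reward rows would feed a reinforcement learning step; that part is omitted here because the summary does not describe it in detail.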




Related articles

  • Jun 24, 2025: Power Your LLM Training and Evaluation with the New SageMaker AI Generative AI Tools
  • Jul 24, 2024: LLM experimentation at scale using Amazon SageMaker Pipelines and MLflow
  • Apr 22, 2025: Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15
  • Dec 24, 2025: Optimizing LLM inference on Amazon SageMaker AI with BentoML’s LLM-Optimizer
