
How Beekeeper optimized user personalization with Amazon Bedrock

Machine Learning Blog



This article describes how Beekeeper built an automated system on Amazon Bedrock that continuously optimizes LLM and prompt selection for user personalization.

  • Beekeeper created a dynamic evaluation system that tests model/prompt combinations and ranks them on a live leaderboard
  • System evaluates quality using compression ratio, action item presence, hallucination detection, and vector comparison metrics
  • Baseline leaderboard established using synthetic test data with ground truth annotations
  • Manual validation performed via Amazon Mechanical Turk on statistically significant sample sizes
  • User feedback incorporated through prompt mutation process without affecting other users
  • Production deployment uses top three model/prompt pairs at 50%, 30%, and 20% traffic ratios
  • Preliminary results show 13-24% better ratings when aggregated per tenant
  • Solution uses Amazon EventBridge, EKS, Lambda, RDS, and Bedrock for orchestration and evaluation
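
The 50%, 30%, and 20% traffic split across the top three model/prompt pairs can be illustrated with weighted random selection. This is a minimal sketch, not Beekeeper's implementation: the pair identifiers and the function names `pick_pair` and `simulate` are hypothetical, and only the three ratios come from the summary above.

```python
import random
from collections import Counter

# Hypothetical labels for the three top-ranked model/prompt pairs.
TOP_PAIRS = ["pair_rank_1", "pair_rank_2", "pair_rank_3"]
# Traffic shares stated in the summary: 50%, 30%, and 20%.
TRAFFIC_WEIGHTS = [0.5, 0.3, 0.2]


def pick_pair(rng: random.Random) -> str:
    """Choose one model/prompt pair for a request, weighted by traffic share."""
    return rng.choices(TOP_PAIRS, weights=TRAFFIC_WEIGHTS, k=1)[0]


def simulate(n_requests: int, seed: int = 0) -> Counter:
    """Route n_requests and count how many each pair receives."""
    rng = random.Random(seed)
    return Counter(pick_pair(rng) for _ in range(n_requests))


if __name__ == "__main__":
    total = 10_000
    counts = simulate(total)
    for pair in TOP_PAIRS:
        # Observed shares should land close to the configured weights.
        print(pair, counts[pair] / total)
```

Selecting per request keeps the router stateless. A real deployment might instead pin each user to one pair so that feedback can be attributed consistently, which this sketch does not attempt.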

Beekeeper's approach automates LLM selection and prompt optimization, enabling smaller teams to continuously improve results while balancing quality, cost, and latency without manual intervention.
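
One way to picture a leaderboard that balances quality, cost, and latency is a weighted composite score. The sketch below is an assumption for illustration only: the `Candidate` fields, the weights, the normalization, and the sample numbers are all hypothetical, and the summary does not say how Beekeeper actually combines these factors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str         # hypothetical model/prompt pair label
    quality: float    # 0..1, higher is better (assumed scale)
    cost_usd: float   # cost per 1,000 requests, lower is better
    latency_s: float  # average response time in seconds, lower is better


def score(c: Candidate, max_cost: float, max_latency: float,
          w_quality: float = 0.6, w_cost: float = 0.2,
          w_latency: float = 0.2) -> float:
    """Composite score: quality is rewarded, relative cost and latency are penalized."""
    cost_term = 1 - c.cost_usd / max_cost if max_cost else 1.0
    latency_term = 1 - c.latency_s / max_latency if max_latency else 1.0
    return w_quality * c.quality + w_cost * cost_term + w_latency * latency_term


def leaderboard(candidates: list[Candidate]) -> list[Candidate]:
    """Return the candidates sorted from best to worst composite score."""
    max_cost = max(c.cost_usd for c in candidates)
    max_latency = max(c.latency_s for c in candidates)
    return sorted(candidates,
                  key=lambda c: score(c, max_cost, max_latency),
                  reverse=True)


if __name__ == "__main__":
    # Made-up numbers, purely to exercise the ranking.
    pairs = [
        Candidate("pair_a", quality=0.9, cost_usd=4.0, latency_s=2.0),
        Candidate("pair_b", quality=0.8, cost_usd=1.0, latency_s=1.0),
        Candidate("pair_c", quality=0.6, cost_usd=0.5, latency_s=0.5),
    ]
    for rank, c in enumerate(leaderboard(pairs)[:3], start=1):
        print(rank, c.name)
```

With these sample values the highest-quality pair does not rank first, because its cost and latency are the worst of the three. That is the kind of trade-off a composite score is meant to surface.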





