Amazon SageMaker AI launches multi-turn reinforcement learning for AI agent model customization
This article announces Amazon SageMaker AI's new multi-turn reinforcement learning capability for fine-tuning AI agent models on complex, multi-step tasks.
- Multi-turn RL trains models against custom agent environments with full-sequence rewards
- Enables smaller, lower-cost models to match larger general-purpose models on target workloads
- Fully serverless with no infrastructure provisioning required; pay only for tokens processed
- Integrates with Amazon Bedrock AgentCore Runtime, EKS, EC2, Fargate, or custom frameworks
- SageMaker manages the training loop, rollout orchestration, trajectory collection, and checkpoint management
- Built-in MLflow tracking for inspecting agent trajectories, rewards, and traces
- Supported models, available in us-west-2: Qwen 3.6 27B, Nova Lite 2.0, GPT-OSS-20B, Gemma 31B
Multi-turn RL simplifies complex agent model training by eliminating custom infrastructure setup and providing managed end-to-end training capabilities.
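To make "multi-turn" and "full-sequence rewards" concrete, here is a minimal sketch in plain Python. It is not the SageMaker AI API: `CountdownEnv`, `rollout`, and `per_turn_returns` are hypothetical names invented for illustration, and the toy environment stands in for a real agent environment. The sketch shows only the general idea of collecting a trajectory over several turns, scoring the whole sequence once at the end, and spreading that score back over the individual turns.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Turn:
    observation: str
    action: str


@dataclass
class Trajectory:
    turns: List[Turn] = field(default_factory=list)
    reward: float = 0.0  # a single reward for the whole sequence


class CountdownEnv:
    """Hypothetical toy environment: the agent must bring a counter to 0."""

    def __init__(self, start: int = 3, max_turns: int = 5) -> None:
        self.start = start
        self.max_turns = max_turns
        self.value = start
        self.steps = 0

    def reset(self) -> str:
        self.value, self.steps = self.start, 0
        return str(self.value)

    def step(self, action: str) -> Tuple[str, bool]:
        self.steps += 1
        if action == "decrement":
            self.value -= 1
        done = self.value == 0 or self.steps >= self.max_turns
        return str(self.value), done

    def score(self) -> float:
        # The reward depends only on the final outcome of the episode.
        return 1.0 if self.value == 0 else 0.0


def rollout(env: CountdownEnv, policy: Callable[[str], str]) -> Trajectory:
    """Run one multi-turn episode and attach the sequence-level reward."""
    traj = Trajectory()
    obs, done = env.reset(), False
    while not done:
        action = policy(obs)
        traj.turns.append(Turn(obs, action))
        obs, done = env.step(action)
    traj.reward = env.score()
    return traj


def per_turn_returns(traj: Trajectory, gamma: float = 1.0) -> List[float]:
    """Credit the final reward to each turn, discounted by distance to the end."""
    n = len(traj.turns)
    return [traj.reward * gamma ** (n - 1 - i) for i in range(n)]


good = rollout(CountdownEnv(), lambda obs: "decrement")
bad = rollout(CountdownEnv(), lambda obs: "wait")
print(len(good.turns), good.reward, per_turn_returns(good, gamma=0.5))
print(len(bad.turns), bad.reward)
```

In a managed service, the rollout loop and trajectory collection would be handled for you; the part a user typically supplies is the environment and its scoring logic, which is what `CountdownEnv` stands in for here.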
Related articles
- May 4, 2026: Amazon SageMaker AI launches AI agent experience for model customization
- May 4, 2026: Agent-guided workflows to accelerate model customization in Amazon SageMaker AI
- Jan 14, 2026: Transform AI development with new Amazon SageMaker AI model customization and large-scale training capabilities
- Mar 25, 2026: Amazon SageMaker AI now supports serverless reinforcement fine-tuning for 12 additional models