Amazon SageMaker AI launches multi-turn reinforcement learning for AI agent model customization

This article announces Amazon SageMaker AI's new multi-turn reinforcement learning capability for fine-tuning AI agent models on complex, multi-step tasks.

Multi-turn RL trains models against custom agent environments with full-sequence rewards
Enables smaller, lower-cost models to match larger general-purpose models on target workloads
Fully serverless, with no infrastructure to provision; you pay only for the tokens processed
Integrates with Amazon Bedrock AgentCore Runtime, EKS, EC2, Fargate, or custom frameworks
SageMaker AI manages the training loop, rollout orchestration, trajectory collection, and checkpoint management
Built-in MLflow tracking for inspecting agent trajectories, rewards, and traces
Supported models in us-west-2: Qwen 3.6 27B, Nova Lite 2.0, GPT-OSS-20B, and Gemma 31B

Multi-turn RL simplifies complex agent model training by eliminating custom infrastructure setup and providing managed end-to-end training capabilities.
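To make "multi-turn" and "full-sequence rewards" concrete, here is a minimal Python sketch of a rollout in which the agent acts over several turns and a single reward is assigned only after the whole trajectory is complete. It is an illustration under stated assumptions, not the SageMaker AI API: `CounterEnv`, `rollout`, `scripted_policy`, and the `Trajectory` type are hypothetical names invented for this example, and a scripted function stands in for the model being trained.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Turn:
    observation: str
    action: str


@dataclass
class Trajectory:
    turns: List[Turn] = field(default_factory=list)
    reward: float = 0.0  # one reward for the whole sequence


class CounterEnv:
    """Hypothetical toy environment: reach `target` by issuing "inc" actions."""

    def __init__(self, target: int, max_turns: int = 5) -> None:
        self.target = target
        self.max_turns = max_turns
        self.value = 0
        self.turn = 0

    def _observe(self) -> str:
        return f"value={self.value} target={self.target}"

    def reset(self) -> str:
        self.value = 0
        self.turn = 0
        return self._observe()

    def step(self, action: str) -> Tuple[str, bool]:
        if action == "inc":
            self.value += 1
        self.turn += 1
        # The episode ends when the agent stops or runs out of turns.
        done = action == "stop" or self.turn >= self.max_turns
        return self._observe(), done

    def score(self) -> float:
        # Full-sequence reward: judged on the final state, not per turn.
        return 1.0 if self.value == self.target else 0.0


def rollout(env: CounterEnv, policy: Callable[[str], str]) -> Trajectory:
    """Run one multi-turn episode and attach a single end-of-episode reward."""
    trajectory = Trajectory()
    observation = env.reset()
    done = False
    while not done:
        action = policy(observation)
        trajectory.turns.append(Turn(observation, action))
        observation, done = env.step(action)
    trajectory.reward = env.score()
    return trajectory


def scripted_policy(observation: str) -> str:
    """Stand-in for a model: increment until the target is reached, then stop."""
    fields = dict(part.split("=") for part in observation.split())
    return "inc" if int(fields["value"]) < int(fields["target"]) else "stop"


trajectory = rollout(CounterEnv(target=3), scripted_policy)
print([turn.action for turn in trajectory.turns], trajectory.reward)
```

In a real training setup, the collected trajectories and their rewards would feed a policy-update step; that part is omitted here because the summary does not describe which algorithm the service uses.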

May 4, 2026

Amazon SageMaker AI launches AI agent experience for model customization

Jul 2, 2026

Best practices for multi-turn reinforcement learning in Amazon SageMaker AI

May 4, 2026

Agent-guided workflows to accelerate model customization in Amazon SageMaker AI

Jan 14, 2026

Transform AI development with new Amazon SageMaker AI model customization and large-scale training capabilities

