Reinforcement fine-tuning for Amazon Nova: Teaching AI through feedback

Machine Learning Blog

This article explains reinforcement fine-tuning (RFT) for Amazon Nova models, a technique that teaches AI through evaluation rather than imitation, requiring only prompts and quality criteria instead of massive labeled datasets.

RFT learns by evaluating outcomes through test cases and reward functions instead of imitating labeled examples
Supports code generation and math reasoning by verifying outputs automatically without step-by-step demonstrations
Available across four tiers: Amazon Bedrock (fully managed), SageMaker Training Jobs (flexible control), SageMaker HyperPod (enterprise-scale), and Nova Forge (multi-turn agentic workflows)
Uses two reward approaches: RLVR (reinforcement learning with verifiable rewards, implemented as rule-based Lambda functions) for objective tasks, and RLAIF (reinforcement learning from AI feedback, using AI judges) for subjective evaluation
Ideal for code generation, customer service, content moderation, and financial analysis, where outcomes are verifiable
Requires the model to produce at least one correct solution among 4-8 attempts; use supervised fine-tuning (SFT) first if it consistently fails
Supports LoRA (parameter-efficient, lower cost) and full-rank training with different resource tradeoffs
Works with reasoning models that show intermediate thinking steps for complex analytical tasks
Reduces token usage and operational complexity compared to supervised fine-tuning
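
To make the rule-based (RLVR) reward idea above concrete, here is a minimal sketch of a reward function for a math task, wrapped in a Lambda-style handler. The event shape (`samples`, `completion`, `expected`) and the returned `rewards` key are hypothetical names chosen for illustration; they are not the actual contract used by Amazon Bedrock or SageMaker, which the article itself should be consulted for.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[float]:
    """Return the last number that appears in the completion, or None."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return float(matches[-1]) if matches else None


def math_reward(completion: str, expected: float, tol: float = 1e-6) -> float:
    """Score 1.0 if the final number matches the expected answer, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if abs(answer - expected) <= tol else 0.0


def lambda_handler(event, context):
    # ASSUMPTION: the event carries a list of samples, each with the model's
    # completion and the known-correct answer. This shape is illustrative only.
    rewards = [
        math_reward(sample["completion"], sample["expected"])
        for sample in event["samples"]
    ]
    return {"rewards": rewards}
```

The reward depends only on the final outcome, so no step-by-step demonstration is needed: any reasoning path that ends in the right number scores the same.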

RFT enables efficient model customization for tasks with verifiable outcomes, offering a scalable alternative to traditional supervised fine-tuning across multiple implementation tiers.
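
The readiness requirement in the list above (at least one correct solution among 4-8 attempts) can be checked before training with a sketch like the following. It assumes you have already sampled several attempts per prompt and graded each one as correct or not. The function names and the 0.5 threshold are hypothetical; the source only says to use SFT first if the model consistently fails.

```python
from typing import Sequence


def fraction_solvable(results: Sequence[Sequence[bool]]) -> float:
    """Fraction of prompts for which at least one sampled attempt was correct.

    results[i][j] is True if attempt j on prompt i passed the reward check.
    """
    if not results:
        return 0.0
    solved = sum(1 for attempts in results if any(attempts))
    return solved / len(results)


def suggest_starting_point(results: Sequence[Sequence[bool]],
                           threshold: float = 0.5) -> str:
    # ASSUMPTION: the threshold is an arbitrary illustrative cutoff,
    # not a value stated in the article.
    return "RFT" if fraction_solvable(results) >= threshold else "SFT first"


# Example: 3 prompts, 4 attempts each.
graded = [
    [False, False, True, False],   # solved on the third attempt
    [False, False, False, False],  # never solved
    [True, True, False, True],     # solved several times
]
print(fraction_solvable(graded))
print(suggest_starting_point(graded))
```

If most prompts never receive a correct attempt, the reward signal is too sparse for reinforcement learning to improve on, which is why supervised fine-tuning is suggested as the first step in that case.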



Mar 25, 2026
Reinforcement fine-tuning on Amazon Bedrock with OpenAI-Compatible APIs: a technical walkthrough

Jul 21, 2026
Exploring self-distilled reasoning for supervised fine-tuning with Amazon Nova

Oct 29, 2025
Web Grounding: Build accurate AI applications with Amazon Nova models

Oct 29, 2025
Build more accurate AI applications with Amazon Nova Web Grounding



Related articles