Nova Forge SDK series part 2: Practical guide to fine-tune Nova models using data mixing capabilities
Machine Learning Blog
This article provides a comprehensive guide to fine-tuning Amazon Nova models using the Nova Forge SDK with data mixing capabilities, enabling domain specialization while preserving general intelligence.
- Data mixing blends domain-specific training data with Amazon-curated datasets to prevent catastrophic forgetting
- Five-stage workflow: environment setup, data preparation, training configuration, model training, and evaluation
- Requires SageMaker HyperPod cluster with GPU instances (ml.p5.48xlarge recommended)
- Data sanitization removes tokens conflicting with Nova's chat template (e.g., "System:", "Assistant:")
- LoRA (Low-Rank Adaptation) is parameter-efficient; full-rank SFT available for deeper adaptation
- Configurable data mixing ratios control batch composition (e.g., 50% customer data, 50% Nova-curated)
- Evaluation measures both domain performance and general capability retention using MMLU and IFEval benchmarks
- Previously reported results showed a 12-point F1 improvement on the domain task while keeping MMLU scores near baseline
- Best practice: iterate on mixing ratios, not just hyperparameters; always evaluate on both axes
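The sanitization and mixing-ratio points above can be made concrete with a small sketch. It is plain Python and does not use the Nova Forge SDK; the function names `sanitize` and `mixed_batch`, the marker list, and the sampling strategy are all assumptions made for illustration, not the SDK's actual behavior.

```python
import random
import re

# Assumed marker list: the summary names only "System:" and "Assistant:" as examples.
RESERVED_MARKERS = ("System:", "Assistant:")
_MARKER_RE = re.compile("|".join(re.escape(m) for m in RESERVED_MARKERS))


def sanitize(text: str) -> str:
    """Remove reserved role markers, then tidy the spaces they leave behind."""
    stripped = _MARKER_RE.sub("", text)
    # Collapse runs of spaces/tabs but leave newlines untouched.
    return re.sub(r"[ \t]{2,}", " ", stripped).strip()


def mixed_batch(customer, curated, batch_size, customer_ratio, rng):
    """Build one batch with roughly `customer_ratio` of its items from `customer`.

    Hypothetical helper: samples with replacement from each pool, then shuffles.
    """
    n_customer = round(batch_size * customer_ratio)
    n_curated = batch_size - n_customer
    batch = rng.choices(customer, k=n_customer) + rng.choices(curated, k=n_curated)
    rng.shuffle(batch)
    return batch


# Usage with toy records (illustrative strings, not real training data).
customer_pool = [sanitize(t) for t in ["System: claim form A", "claim form B"]]
curated_pool = ["general sample 1", "general sample 2", "general sample 3"]
batch = mixed_batch(customer_pool, curated_pool, batch_size=8,
                    customer_ratio=0.5, rng=random.Random(0))
```

With `customer_ratio=0.5` and a batch size of 8, four items come from each pool; changing only that ratio is the kind of iteration the best-practice bullet describes.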
The guide demonstrates that data mixing enables practical production fine-tuning by balancing domain expertise with general intelligence through systematic iteration and dual-axis evaluation.
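One way to picture dual-axis evaluation is as a selection rule over several training runs: keep only the runs whose general-capability score stays close to the base model, then choose the one with the best domain score. The sketch below assumes that rule; `RunResult`, `pick_ratio`, the tolerance parameter, and every number in the usage example are hypothetical placeholders, not results from the article.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RunResult:
    customer_ratio: float  # share of customer data in each batch
    domain_f1: float       # score on the domain task
    mmlu: float            # score on the general benchmark


def pick_ratio(runs: Sequence[RunResult], baseline_mmlu: float,
               max_mmlu_drop: float) -> Optional[RunResult]:
    """Return the run with the best domain F1 among those that retain MMLU.

    A run is eligible if its MMLU is no more than `max_mmlu_drop` below the
    baseline. Returns None when no run is eligible.
    """
    eligible = [r for r in runs if baseline_mmlu - r.mmlu <= max_mmlu_drop]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.domain_f1)


# Placeholder numbers for illustration only.
runs = [
    RunResult(customer_ratio=1.0, domain_f1=0.80, mmlu=0.60),
    RunResult(customer_ratio=0.5, domain_f1=0.78, mmlu=0.69),
    RunResult(customer_ratio=0.25, domain_f1=0.72, mmlu=0.70),
]
best = pick_ratio(runs, baseline_mmlu=0.70, max_mmlu_drop=0.02)
```

In this toy data the run trained only on customer data has the highest domain score but is rejected for losing too much general capability, which is the trade-off that checking both axes is meant to expose.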