
Nova Forge SDK series part 2: Practical guide to fine-tune Nova models using data mixing capabilities

Machine Learning Blog



This article provides a comprehensive guide to fine-tuning Amazon Nova models using the Nova Forge SDK with data mixing capabilities, enabling domain specialization while preserving general intelligence.

  • Data mixing blends domain-specific training data with Amazon-curated datasets to prevent catastrophic forgetting
  • Five-stage workflow: environment setup, data preparation, training configuration, model training, and evaluation
  • Requires a SageMaker HyperPod cluster with GPU instances (ml.p5.48xlarge recommended)
  • Data sanitization removes tokens conflicting with Nova's chat template (e.g., "System:", "Assistant:")
  • LoRA (Low-Rank Adaptation) offers parameter-efficient tuning; full-rank SFT is available for deeper adaptation
  • Configurable data mixing ratios control batch composition (e.g., 50% customer data, 50% Nova-curated)
  • Evaluation measures both domain performance and general capability retention using MMLU and IFEval benchmarks
  • Previous results showed a 12-point F1 improvement on the domain task while keeping MMLU scores near baseline
  • Best practice: iterate on mixing ratios, not just hyperparameters; always evaluate on both axes
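
The sanitization and mixing-ratio points above can be illustrated with a small, SDK-independent sketch. It does not use the Nova Forge SDK; `sanitize`, `mixed_batch`, and the `ROLE_MARKERS` pattern are hypothetical names chosen for illustration, and the marker list covers only the two examples quoted in the summary. The real SDK may sanitize and mix data differently.

```python
import random
import re

# Assumption: only the two role markers quoted in the summary are handled here.
ROLE_MARKERS = re.compile(r"\b(?:System|Assistant):")


def sanitize(text: str) -> str:
    """Remove role-like markers, then collapse the leftover whitespace."""
    without_markers = ROLE_MARKERS.sub("", text)
    return re.sub(r"\s+", " ", without_markers).strip()


def mixed_batch(customer, curated, batch_size, customer_ratio, rng):
    """Build one batch with a fixed share of customer examples.

    Sampling is with replacement, so either pool may be smaller than its share.
    """
    if not 0.0 <= customer_ratio <= 1.0:
        raise ValueError("customer_ratio must be between 0 and 1")
    n_customer = round(batch_size * customer_ratio)
    n_curated = batch_size - n_customer
    batch = rng.choices(customer, k=n_customer) + rng.choices(curated, k=n_curated)
    rng.shuffle(batch)  # avoid grouping all customer examples at the front
    return batch


if __name__ == "__main__":
    # Placeholder strings, not real training data.
    customer = [sanitize(t) for t in ["System: reset the router", "Check the invoice total"]]
    curated = ["general example A", "general example B", "general example C"]
    batch = mixed_batch(customer, curated, batch_size=8, customer_ratio=0.5,
                        rng=random.Random(0))
    print(batch)
```

With `customer_ratio=0.5` and `batch_size=8`, each batch holds four customer examples and four curated ones, which matches the 50/50 composition mentioned in the list.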

The guide demonstrates that data mixing enables practical production fine-tuning by balancing domain expertise with general intelligence through systematic iteration and dual-axis evaluation.
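
One way to make dual-axis evaluation concrete is to treat general-capability retention as a constraint and domain performance as the objective when comparing runs with different mixing ratios. The sketch below is an assumption about how such a comparison could be scripted, not the guide's method; `RunResult`, `pick_run`, and the `max_mmlu_drop` threshold are hypothetical, and the scores in the usage example are placeholders rather than measured results.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RunResult:
    customer_ratio: float  # share of customer data in each batch
    domain_f1: float       # domain-task score (higher is better)
    mmlu: float            # general-capability score (higher is better)


def pick_run(runs: Sequence[RunResult], baseline_mmlu: float,
             max_mmlu_drop: float) -> Optional[RunResult]:
    """Return the run with the best domain F1 among those that stay within
    max_mmlu_drop points of the baseline MMLU score, or None if none do."""
    eligible = [r for r in runs if baseline_mmlu - r.mmlu <= max_mmlu_drop]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.domain_f1)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not results from the article.
    runs = [
        RunResult(customer_ratio=0.25, domain_f1=80.0, mmlu=69.5),
        RunResult(customer_ratio=0.50, domain_f1=84.0, mmlu=69.5),
        RunResult(customer_ratio=1.00, domain_f1=86.0, mmlu=68.0),
    ]
    print(pick_run(runs, baseline_mmlu=70.0, max_mmlu_drop=1.0))
```

Framing the comparison this way keeps a run with the highest domain score from being selected if it gives up too much general capability, which is the failure mode data mixing is meant to prevent.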





Related articles

  • Mar 18, 2026: Introducing Nova Forge SDK, a seamless way to customize Nova models for enterprise AI
  • Mar 2, 2026: Building specialized AI without sacrificing intelligence: Nova Forge data mixing in action
  • Mar 18, 2026: Kick off Nova customization experiments using Nova Forge SDK
  • Dec 2, 2025: Introducing Amazon Nova Forge: Build your own frontier models using Nova
