Revolutionizing large language model training with Arcee and AWS Trainium

Machine Learning Blog



This article is a case study of how Arcee, an AI company, used AWS Trainium to make continual pre-training (CPT) of large language models (LLMs) for specialized domains more efficient and cost-effective.

Specifically, the article covers:

  • An introduction to Arcee's CPT and model merging techniques for domain adaptation of LLMs
  • The benefits of using AWS Trainium for accelerating LLM training and reducing costs
  • Detailed steps for setting up a Trainium cluster using AWS ParallelCluster and training an LLM using Neuron Distributed
  • Results showing improved perplexity scores for a Llama 2 7B model adapted to PubMed data using CPT on Trainium
  • A conclusion on the potential of combining Arcee's methodologies with Trainium for LLM development across industries
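
For readers unfamiliar with the metric in the results bullet: perplexity is conventionally defined as the exponential of the average negative log-likelihood a model assigns to each token, so lower is better. The sketch below illustrates that standard definition only; it is not the evaluation code from the article, and the function name and the sample log-probabilities are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical example: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among 4 options.
uniform_over_four = [math.log(0.25)] * 8
print(perplexity(uniform_over_four))

# Hypothetical example: higher per-token probabilities give lower perplexity.
more_confident = [math.log(0.5)] * 8
print(perplexity(more_confident))
```

A model that has been adapted to a domain such as PubMed would be expected to assign higher probability to held-out text from that domain, which shows up as a lower perplexity on that text.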




Related articles

  • Apr 29, 2024: Develop and train large models cost-efficiently with Metaflow and AWS Trainium
  • Jun 17, 2024: Accelerate deep learning training and simplify orchestration with AWS Trainium and AWS Batch
  • Sep 7, 2025: How NetoAI trained a Telecom-specific large language model using Amazon SageMaker and AWS Trainium
  • Oct 27, 2025: Building large language models for the public sector on AWS
