AstraZeneca fine-tunes genomics foundation models with Amazon SageMaker

Industries Blog



AstraZeneca's Centre for Genomics Research (CGR) used Amazon SageMaker to fine-tune HyenaDNA, a genomic foundation model, to predict pathogenic genetic variants, particularly in non-coding regions of the human genome.

  • Used the HyenaDNA model, which can process sequences of up to one million tokens at single-nucleotide resolution
  • Improved pathogenicity prediction by examining a 32,000-nucleotide context around each variant
  • Outperformed the existing baseline (CADD) on four out of five test datasets, by 20.9% on average
  • Utilized Amazon SageMaker for model training, including JupyterLab, distributed training, and hyperparameter optimization
  • Stored training data on Amazon Elastic File System (EFS) for collaborative and efficient workflows
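
To make the idea of a fixed nucleotide context around a variant concrete, here is a minimal Python sketch. It is an illustration only and not CGR's pipeline: the function names, the `VOCAB` mapping, the `N` padding, and the restriction to single-nucleotide variants are all assumptions, and the token ids are not those of HyenaDNA's actual tokenizer.

```python
from typing import List, Tuple

# Hypothetical vocabulary for illustration; not HyenaDNA's actual token ids.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def variant_window(reference: str, pos: int, ref: str, alt: str,
                   context: int = 32_000) -> Tuple[str, str]:
    """Return (reference_window, alternate_window), each `context` long.

    `pos` is the 0-based index of a single-nucleotide variant in `reference`.
    Window positions that fall outside the sequence are padded with 'N'.
    """
    if len(ref) != 1 or len(alt) != 1:
        raise ValueError("this sketch handles single-nucleotide variants only")
    if reference[pos].upper() != ref.upper():
        raise ValueError("ref allele does not match the reference sequence")

    left = context // 2                      # nucleotides kept before the variant
    start, end = pos - left, pos - left + context
    core = reference[max(start, 0):min(end, len(reference))].upper()
    padded = "N" * max(-start, 0) + core + "N" * max(end - len(reference), 0)

    # The variant always sits at index `left` inside the window.
    alt_window = padded[:left] + alt.upper() + padded[left + 1:]
    return padded, alt_window


def tokenize(seq: str) -> List[int]:
    """Map each nucleotide to one token id (single-nucleotide resolution)."""
    return [VOCAB.get(base, VOCAB["N"]) for base in seq]
```

Under these assumptions, the reference and alternate windows would each be tokenized and passed to the fine-tuned model, so the prediction can draw on the sequence surrounding the variant and not only on the variant itself.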

The solution demonstrates how advanced AI models can help understand genetic variations and potentially discover novel drug targets by predicting which genetic variants are likely to cause disease.





