AstraZeneca fine-tunes genomics foundation models with Amazon SageMaker
Industries Blog
AstraZeneca's Centre for Genomics Research (CGR) used Amazon SageMaker to fine-tune HyenaDNA, a genomic foundation model, to predict pathogenic genetic variants, particularly in non-coding regions of the human genome.
- Used the HyenaDNA model, which can process sequences of up to one million DNA tokens at single-nucleotide resolution
- Improved pathogenicity prediction by examining a 32,000-nucleotide context around each variant
- Outperformed the existing baseline (CADD) on four of five test datasets, by 20.9% on average
- Utilized Amazon SageMaker for model training, including JupyterLab, distributed training, and hyperparameter optimization
- Stored training data on Amazon Elastic File System (EFS) for collaborative and efficient workflows
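To make the idea of a fixed nucleotide context around a variant concrete, here is a minimal sketch of how such an input could be prepared. It is not the CGR pipeline: the function names, the `N` padding, and the five-symbol vocabulary are assumptions for illustration, and the real HyenaDNA tokenizer assigns different token ids.

```python
# Hypothetical vocabulary: one token id per nucleotide, "N" for unknown bases
# or padding. The actual HyenaDNA tokenizer uses its own ids.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def extract_window(reference: str, variant_pos: int, context_len: int) -> tuple[str, int]:
    """Cut a window of context_len bases around a 0-based variant position.

    The window is padded with "N" where it runs past either end of the
    reference. Returns the window and the variant's index inside it.
    """
    half = context_len // 2
    start = variant_pos - half
    end = start + context_len
    left_pad = max(0, -start)
    right_pad = max(0, end - len(reference))
    core = reference[max(0, start):min(len(reference), end)]
    return "N" * left_pad + core + "N" * right_pad, half


def apply_snv(window: str, offset: int, alt: str) -> str:
    """Substitute a single-nucleotide variant at the given window offset."""
    return window[:offset] + alt + window[offset + 1:]


def tokenize(sequence: str) -> list[int]:
    """Map each nucleotide to one token id (single-nucleotide tokenization)."""
    return [VOCAB.get(base, VOCAB["N"]) for base in sequence.upper()]


# Toy usage: a 32,000-nucleotide window around position 20,000 of a synthetic reference.
reference = "ACGT" * 10_000
window, offset = extract_window(reference, variant_pos=20_000, context_len=32_000)
alt_tokens = tokenize(apply_snv(window, offset, "T"))
```

In a sketch like this, the reference and alternate windows would both be tokenized and passed to the fine-tuned model; how the model's output is turned into a pathogenicity score is not described in this summary.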
The solution demonstrates how advanced AI models can help understand genetic variations and potentially discover novel drug targets by predicting which genetic variants are likely to cause disease.