Genomics England uses Amazon SageMaker to predict cancer subtypes and patient survival from multi-modal data
Machine Learning Blog
This article discusses a collaboration between Genomics England and AWS to develop multi-modal machine learning models for predicting cancer subtypes and patient survival using genomic and imaging data.
Specifically, the article covers:
- The datasets used, including genomic and whole slide imaging data from The Cancer Genome Atlas (TCGA) for breast cancer and gastrointestinal cancers
- Three multi-modal machine learning frameworks implemented:
  - PORPOISE - an attention-based network combining pathology images and genomic features
  - Hierarchical Extremum Encoding (HEEC) - a novel architecture from AWS using tree ensembles and hierarchical representations
  - Hierarchical Image Pyramid Transformer (HIPT) - a self-supervised approach for the imaging modality
- The architecture deployed on AWS, which uses Amazon SageMaker for data processing, model training, and model deployment
- Best practices such as separating development and production environments, and using CI/CD pipelines for automation
The collaboration aimed to enhance Genomics England's understanding of cancer biomarkers and biology using multi-modal data.
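To make the idea of combining pathology images with genomic features more concrete, here is a minimal sketch of attention-based pooling over slide patches followed by simple concatenation with a genomic vector. It is not the PORPOISE, HEEC, or HIPT implementation: the function names (`attention_pool`, `fuse`), the scoring vector `w`, the embedding sizes, and the random inputs are all assumptions made for illustration, and plain concatenation stands in for whatever fusion the actual frameworks use.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_embeddings, w):
    """Collapse many patch embeddings from one slide into a single vector.

    Each patch gets a scalar score tanh(w . x); the scores are normalised
    with softmax and used as weights for a weighted average of the patches.
    `w` is a hypothetical scoring vector (learned in a real model).
    """
    scores = [
        math.tanh(sum(wi * xi for wi, xi in zip(w, patch)))
        for patch in patch_embeddings
    ]
    weights = softmax(scores)
    dim = len(patch_embeddings[0])
    pooled = [
        sum(a * patch[d] for a, patch in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


def fuse(slide_vector, genomic_vector):
    """Late fusion by concatenation (a simplification chosen for this sketch)."""
    return list(slide_vector) + list(genomic_vector)


# Illustrative inputs only: 5 patches with 4-dimensional embeddings,
# and a 3-dimensional genomic feature vector.
random.seed(0)
patches = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
w = [random.gauss(0, 1) for _ in range(4)]
genomic = [0.2, -1.3, 0.7]

pooled, weights = attention_pool(patches, w)
fused = fuse(pooled, genomic)
print("attention weights:", [round(a, 3) for a in weights])
print("fused vector length:", len(fused))
```

In a trained model the fused vector would feed a downstream head for subtype classification or survival prediction; here it only shows how a variable number of image patches and a fixed-length genomic vector can end up in one representation.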