Train self-supervised vision transformers on overhead imagery with Amazon SageMaker
This article demonstrates training self-supervised DINO vision transformers on satellite imagery using Amazon SageMaker and the BigEarthNet-S2 dataset.
- Self-supervised learning reduces annotation costs for satellite/aerial image analysis tasks
- DINO algorithm trains student/teacher networks on overhead imagery without labels
- Custom PyTorch dataset class loads BigEarthNet-S2 multispectral images from S3
- SageMaker distributed data parallel library enables multi-GPU training on large instances
- Pre-trained DINO model transfers to downstream land cover classification task
- BigEarthNet-S2 pre-training achieved 6.7% higher average precision than ImageNet pre-training
- Training took approximately 11 hours on an ml.p3.16xlarge instance with 83% GPU utilization
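
The student/teacher training mentioned in the list above can be illustrated with a minimal, framework-free sketch of the DINO objective for one pair of views. It is not the article's training code. The function names, temperatures, and momentum values are illustrative assumptions, and a real run would operate on batched tensors in PyTorch rather than on Python lists.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution. Temperatures are assumed example values."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ema_update(old, new, momentum):
    """Exponential moving average, used for the teacher weights and the center."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]


# Toy example: outputs of the two networks for two augmented views of one image.
center = [0.0, 0.0, 0.0]
teacher_out = [2.0, 0.0, -1.0]
student_out = [1.5, 0.2, -0.8]
loss = dino_loss(student_out, teacher_out, center)

# Only the student receives gradients. The teacher weights follow the student
# through an EMA (momentum value assumed), and the center tracks teacher outputs.
teacher_weights = ema_update([0.5, -0.3], [0.6, -0.1], momentum=0.996)
center = ema_update(center, teacher_out, momentum=0.9)
```

No labels appear anywhere in the loss: the only training signal is agreement between the two networks on different views of the same image, which is what makes the approach usable on unlabeled overhead imagery.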
The solution provides a template for training self-supervised vision transformers on large-scale unlabeled satellite imagery, improving model performance on downstream tasks compared to generic ImageNet pre-training.
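
As a rough sketch of how such a training job might be configured, the snippet below collects keyword arguments for the SageMaker Python SDK's `PyTorch` estimator with the distributed data parallel library enabled. The helper name, role ARN, script name, S3 path, and framework and Python versions are placeholders and assumptions, not values from the article. Only the instance type comes from the summary above.

```python
def build_estimator_kwargs(role_arn, entry_point, framework_version, py_version):
    """Hypothetical helper: keyword arguments for sagemaker.pytorch.PyTorch."""
    return {
        "entry_point": entry_point,
        "role": role_arn,
        "instance_type": "ml.p3.16xlarge",  # instance type reported in the summary
        "instance_count": 1,  # assumption: a single instance
        "framework_version": framework_version,
        "py_version": py_version,
        # Turns on the SageMaker distributed data parallel library.
        "distribution": {"smdistributed": {"dataparallel": {"enabled": True}}},
    }


kwargs = build_estimator_kwargs(
    role_arn="arn:aws:iam::111122223333:role/ExampleSageMakerRole",  # placeholder
    entry_point="train_dino.py",  # hypothetical script name
    framework_version="1.13",  # assumption: check which versions the library supports
    py_version="py39",  # assumption
)

# With the SageMaker SDK installed and AWS credentials configured, the job
# could then be launched along these lines (the S3 path is a placeholder):
#
#   from sagemaker.pytorch import PyTorch
#   estimator = PyTorch(**kwargs)
#   estimator.fit({"train": "s3://example-bucket/bigearthnet-s2/"})
```

Keeping the arguments in a plain dictionary makes it easy to inspect or log the configuration before any billable resources are started.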