Warner Bros. Discovery achieves 60% cost savings and faster ML inference with AWS Graviton
Machine Learning Blog
This article describes how Warner Bros. Discovery achieved significant cost and performance improvements by migrating ML inference workloads to AWS Graviton-based instances.
- Achieved 60% average cost savings, up to 88% for catalog ranking models
- Reduced latency by 7% to 60%, depending on the ML model
- XGBoost models showed 60% P99 latency reduction
- Used AWS Graviton processors, whose Neon and SVE vector instructions are optimized for ML workloads
- Leveraged SageMaker Inference Recommender for automated benchmarking
- Employed shadow testing to validate performance before production deployment
- Completed proof-of-concept in one week, full migration in one month
- Serves 125M+ users across 100+ countries with sub-100ms latency requirements
- Planning to migrate remaining models to achieve 100% Graviton adoption
The migration at Warner Bros. Discovery (WBD) demonstrates how AWS Graviton processors deliver substantial cost savings and performance improvements for large-scale ML recommendation systems while maintaining reliability and user experience.
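The kind of comparison a shadow test feeds into can be sketched in a few lines of Python. This is a minimal illustration, not WBD's tooling: the function names, the nearest-rank percentile method, and the way cost is expressed (an hourly instance price per variant) are all assumptions made for the example. Only the 100 ms latency budget is taken from the summary above.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (pct is an integer, 1-100)."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def percent_reduction(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100


def compare_variants(baseline_ms, shadow_ms, baseline_hourly_usd,
                     shadow_hourly_usd, latency_budget_ms=100.0):
    """Summarize a shadow test: P99 latency change, cost change, and budget check.

    baseline_ms / shadow_ms are per-request latencies collected from the
    production variant and the (hypothetical) Graviton shadow variant.
    """
    baseline_p99 = percentile(baseline_ms, 99)
    shadow_p99 = percentile(shadow_ms, 99)
    return {
        "baseline_p99_ms": baseline_p99,
        "shadow_p99_ms": shadow_p99,
        "p99_reduction_pct": percent_reduction(baseline_p99, shadow_p99),
        "cost_savings_pct": percent_reduction(baseline_hourly_usd, shadow_hourly_usd),
        "within_budget": shadow_p99 <= latency_budget_ms,
    }
```

A shadow variant would only be promoted when `within_budget` holds and neither percentage is negative. Comparing the tail (P99) rather than the mean matters here because the latency requirement applies to nearly every request, not to the average one.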