Migrating enterprise ML workloads from Databricks to AWS for large scale ML

Industries Blog



This article details Kargo's migration of enterprise ML workloads from Databricks to AWS, achieving significant improvements in cost, scalability, and operational efficiency.

  • Replaced Delta Lake ETL with AWS Glue and Apache Iceberg for ACID transactions and schema evolution
  • Consolidated scattered modeling logic into containerized Python packages deployed via Amazon ECR
  • Implemented SageMaker Pipelines for end-to-end orchestration with deterministic artifact versioning
  • Achieved 40% cost reduction through serverless AWS Glue and Athena replacing persistent clusters
  • Sped up pipeline execution 3-5x by running SageMaker pipelines in parallel
  • Decoupled real-time inference serving from training using sidecar containers for zero-downtime updates
  • Standardized observability via Amazon CloudWatch for unified monitoring across all components
  • Maintained byte-for-byte output parity with original Databricks pipelines for production safety
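The output-parity check in the last point can be illustrated with a minimal sketch. It is not Kargo's actual validation code: it assumes both pipelines' outputs have been exported to two local directories with matching relative paths, and the names `file_digest` and `compare_outputs` are hypothetical helpers introduced only for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def list_files(root: Path) -> dict[str, Path]:
    """Map each file's path relative to root onto its full path."""
    return {
        p.relative_to(root).as_posix(): p
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_outputs(legacy_dir: Path, migrated_dir: Path) -> dict[str, str]:
    """Compare two output trees file by file (hypothetical helper).

    Returns a mapping of relative path -> mismatch reason. An empty
    mapping means the two trees are byte-for-byte identical.
    """
    legacy = list_files(legacy_dir)
    migrated = list_files(migrated_dir)
    mismatches: dict[str, str] = {}
    for rel in sorted(legacy.keys() | migrated.keys()):
        if rel not in migrated:
            mismatches[rel] = "missing in migrated output"
        elif rel not in legacy:
            mismatches[rel] = "unexpected in migrated output"
        elif file_digest(legacy[rel]) != file_digest(migrated[rel]):
            mismatches[rel] = "content differs"
    return mismatches
```

A check like this only passes if both pipelines serialize their outputs deterministically (same row order, same encoding, no embedded timestamps), so in practice it would likely be run on a canonical export of the results rather than on raw engine-specific files.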

The migration demonstrates how thoughtful re-architecture, rather than lift-and-shift, enables scalable ML platforms that support both offline optimization and real-time inference at advertising scale.





