How Amazon trains sequential ensemble models at scale with Amazon SageMaker Pipelines

Machine Learning Blog

This article details how Amazon uses Amazon SageMaker Pipelines to train sequential ensemble models for use case identification in Salesforce opportunities, using a multi-layered BERTopic approach.

The solution uses three sequential BERTopic models to generate hierarchical topic clustering
Each BERTopic model consists of embedding, dimension reduction, clustering, and keyword identification steps
Key challenges include data preprocessing, scalable compute, and coordinating multi-layer model training
The implementation uses SageMaker Pipelines steps including Processing, Training, Callback, and Model registration
The solution uses custom Docker images and integrates with AWS services such as Amazon SQS and AWS Lambda for workflow orchestration
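
The layered idea in the points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: it assumes each layer re-clusters the documents inside every cluster produced by the previous layer, and it replaces BERTopic with a hypothetical keyword-matching stand-in (`toy_layer`) so the example runs without extra dependencies. The names `hierarchical_topics`, `toy_layer`, and `Clusterer` are invented for this sketch.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# A "layer" is any function that maps documents to one cluster label each.
# In the article each layer is a BERTopic model; here it is a stand-in.
Clusterer = Callable[[Sequence[str]], List[int]]


def hierarchical_topics(
    docs: Sequence[str], layers: Sequence[Clusterer]
) -> List[Tuple[int, ...]]:
    """Return one label path per document, e.g. (0, 2, 1) for three layers."""
    paths: List[Tuple[int, ...]] = [() for _ in docs]
    for layer in layers:
        # Group document indices by the label path assigned so far.
        groups: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
        for idx, path in enumerate(paths):
            groups[path].append(idx)
        # Re-cluster each group independently with the current layer.
        for path, indices in groups.items():
            labels = layer([docs[i] for i in indices])
            for i, label in zip(indices, labels):
                paths[i] = path + (label,)
    return paths


def toy_layer(keywords: Sequence[str]) -> Clusterer:
    """Hypothetical clusterer: index of the first keyword found, else -1."""
    def cluster(batch: Sequence[str]) -> List[int]:
        labels = []
        for doc in batch:
            text = doc.lower()
            labels.append(
                next((k for k, word in enumerate(keywords) if word in text), -1)
            )
        return labels
    return cluster


if __name__ == "__main__":
    docs = [
        "Migrate database to managed storage",
        "Train vision model on GPU cluster",
        "Archive database backups to cold storage",
    ]
    layers = [
        toy_layer(["database", "model"]),
        toy_layer(["migrate", "archive", "train"]),
        toy_layer(["storage", "gpu"]),
    ]
    for doc, path in zip(docs, hierarchical_topics(docs, layers)):
        print(path, doc)
```

Because every layer is just a callable, the stand-in could be swapped for a function that fits a real topic model on each batch and returns its labels. In a pipeline setting, each layer would then correspond to a separate training step that consumes the previous step's output.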

The approach automatically identifies use cases from text data, which improves sales analytics and recommendation models.


Related articles

Nov 27, 2024: Efficiently train models with large sequence lengths using Amazon SageMaker model parallel
Oct 28, 2024: Customized model monitoring for near real-time batch inference with Amazon SageMaker
Aug 7, 2024: Automate the machine learning model approval process with Amazon SageMaker Model Registry and Amazon SageMaker Pipelines
Apr 16, 2024: Distributed training and efficient scaling with the Amazon SageMaker Model Parallel and Data Parallel Libraries