Build a RAG-based QnA application using Llama3 models from SageMaker JumpStart

Machine Learning Blog

This article provides a step-by-step guide to building a Retrieval-Augmented Generation (RAG) question-answering application using the Llama3-8B model from SageMaker JumpStart and the BGE Large EN v1.5 embedding model. Specifically, the article covers:

An overview of Llama3-8B, BGE Large EN v1.5, and RAG
Prerequisites and setup for the solution
Deploying the Llama3-8B and BGE Large EN v1.5 models on SageMaker JumpStart
Setting up the models with LangChain
Preparing data (Amazon's Annual Reports) and generating embeddings
Implementing different approaches for document retrieval and question answering using LangChain:
- Regular Retrieval Chain
- Parent Document Retriever Chain
Cleaning up resources to avoid incurring unnecessary costs
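
The two retrieval approaches listed above differ in what gets handed to the language model. The following is a minimal, dependency-free sketch of that difference, not the article's LangChain code: a regular retriever returns the small chunk that best matches the query, while a parent document retriever matches on small chunks but returns the larger parent chunk they came from. The `embed` function is a toy bag-of-words stand-in for an embedding model such as BGE Large EN v1.5, and all function names, chunk sizes, and the sample text are assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split(text, size):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(documents, parent_size=40, child_size=8):
    """Index every small child chunk together with the parent chunk it belongs to."""
    index = []
    for doc in documents:
        for parent in split(doc, parent_size):
            for child in split(parent, child_size):
                index.append((embed(child), child, parent))
    return index


def rank(index, query):
    """Return index entries sorted by similarity to the query, best first."""
    q = embed(query)
    return sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)


def regular_retrieve(index, query, k=1):
    """Regular retrieval: return the k best-matching small chunks."""
    return [child for _, child, _ in rank(index, query)[:k]]


def parent_retrieve(index, query, k=1):
    """Parent document retrieval: match on small chunks, return their parents."""
    parents = []
    for _, _, parent in rank(index, query):
        if parent not in parents:
            parents.append(parent)
        if len(parents) == k:
            break
    return parents


if __name__ == "__main__":
    # Made-up sample text, used only to exercise the functions.
    docs = [
        "The example company opened a new warehouse in spring. "
        "Shipping times improved after the warehouse opened. "
        "Customer surveys later in the year reflected the faster delivery."
    ]
    idx = build_index(docs, parent_size=30, child_size=6)
    print(regular_retrieve(idx, "warehouse shipping times"))
    print(parent_retrieve(idx, "warehouse shipping times"))
```

Matching on small chunks keeps the similarity search precise, while returning the parent chunk gives the model more surrounding context to answer from; the trade-off is a longer prompt per retrieved result.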



