Build a RAG-based QnA application using Llama3 models from SageMaker JumpStart
Machine Learning Blog
This article provides a step-by-step guide to building a Retrieval-Augmented Generation (RAG) question-answering application using the Llama3-8B model from SageMaker JumpStart and the BGE Large EN v1.5 embedding model. It covers the following topics:
- An overview of Llama3-8B, BGE Large EN v1.5, and RAG
- Prerequisites and setup for the solution
- Deploying the Llama3-8B and BGE Large EN v1.5 models on SageMaker JumpStart
- Setting up the models with LangChain
- Preparing data (Amazon's Annual Reports) and generating embeddings
- Implementing two approaches to document retrieval and question answering using LangChain:
  - Regular Retrieval Chain
  - Parent Document Retriever Chain
- Cleaning up resources to avoid incurring unnecessary costs
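
The difference between the two retrieval approaches listed above can be illustrated with a minimal, dependency-free sketch. It is not the article's implementation: the article uses LangChain with SageMaker-hosted models, whereas this sketch assumes a toy bag-of-words `embed` function in place of BGE Large EN v1.5, and the sample `docs`, the `query`, and all function names are hypothetical. The idea shown is that a regular retriever returns the best-matching small chunk, while a parent document retriever matches on small chunks but returns the larger document each chunk came from.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def split(text, size):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def regular_retrieve(query, docs, chunk_size=8, k=1):
    """Regular retrieval: rank small chunks and return the top-k chunks themselves."""
    chunks = [c for d in docs for c in split(d, chunk_size)]
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def parent_retrieve(query, docs, child_size=8, k=1):
    """Parent document retrieval: rank small child chunks, return their parent documents."""
    children = [(c, i) for i, d in enumerate(docs) for c in split(d, child_size)]
    q = embed(query)
    ranked = sorted(children, key=lambda p: cosine(q, embed(p[0])), reverse=True)
    parents = []
    for _, idx in ranked:
        if idx not in parents:      # keep each parent once, in rank order
            parents.append(idx)
        if len(parents) == k:
            break
    return [docs[i] for i in parents]

# Hypothetical sample text, not taken from Amazon's annual reports.
docs = [
    "Net sales increased in the year. Growth was driven by cloud services "
    "and advertising revenue across regions.",
    "The company hired many employees. Fulfillment centers expanded "
    "in several countries during the period.",
]
query = "What drove net sales growth?"

print(regular_retrieve(query, docs))  # a short chunk of the first document
print(parent_retrieve(query, docs))   # the full first document
```

In this sketch the regular retriever hands the language model only a fragment that stops mid-sentence, while the parent retriever hands it the full passage that contains the answer. That trade-off between precise matching and sufficient context is the motivation usually given for the parent document approach.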