Amazon Kendra as a Retriever to Build Retrieval Augmented Generation (RAG) Systems
Amazon Kendra Retrieval API: Overview
Retrieval augmented generation (RAG) is a technique for building question-answering applications with generative artificial intelligence (AI). RAG systems have two components: a retriever and a large language model (LLM). Given a query, the retriever identifies the most relevant chunks of text from a corpus of documents and feeds them to the LLM. The LLM then analyzes these passages as context and generates a comprehensive response to the query.
Amazon Kendra is a fully managed service that provides out-of-the-box semantic search capabilities for state-of-the-art ranking of documents and passages. You can use Amazon Kendra as a retriever for RAG systems. It can source the most relevant content and documents from your enterprise data to maximize the quality of your RAG payload, yielding better LLM responses than conventional or keyword-based search solutions.
This blog post shows you how to use Amazon Kendra as a retriever for RAG systems and describes the benefits of this approach.
Amazon Kendra Retrieval API: Steps
To use Amazon Kendra as a retriever for RAG systems, follow these steps:
- Create an index in Amazon Kendra and add your data sources. You can use pre-built connectors for popular data sources such as Amazon Simple Storage Service (Amazon S3), SharePoint, Confluence, and websites. Amazon Kendra supports common document formats such as HTML, Word, PowerPoint, PDF, Excel, and plain text files.
- Use the Retrieve API to retrieve the top 100 most relevant passages from the documents in your index for a given query. The Retrieve API looks at chunks of text, or excerpts referred to as passages, and returns them using semantic search. Semantic search considers the search query’s context along with all the available information from the indexed documents. You can also override boosting at the index level, filter based on document fields or attributes, filter based on the user or their group access to documents, and include certain fields in the response that might provide useful additional information.
- Send the retrieved passages along with the query as a prompt to the LLM of your choice. The LLM will use the passages as context to generate a natural language answer for the query.
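The retrieval and prompt-assembly steps above can be sketched in Python with the AWS SDK (boto3). This is a minimal illustration, not a complete implementation: the index ID, passage count, and prompt wording are placeholder choices for the example, and the call to the LLM itself is left to the provider of your choice.

```python
def retrieve_passages(index_id, query, top_k=10):
    """Fetch the most relevant passages from an Amazon Kendra index."""
    import boto3  # AWS SDK for Python; imported here to keep the helper below standalone

    kendra = boto3.client("kendra")
    response = kendra.retrieve(IndexId=index_id, QueryText=query, PageSize=top_k)
    # Each result item carries the passage text plus its source document metadata.
    return [
        {"title": item["DocumentTitle"], "content": item["Content"]}
        for item in response["ResultItems"]
    ]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the query into a prompt for an LLM."""
    context = "\n\n".join(p["content"] for p in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The string returned by `build_prompt` can then be sent to the LLM of your choice; the prompt template shown here is just one reasonable layout for grounding the model in the retrieved passages.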
Benefits
Using Amazon Kendra as a retriever for RAG systems has several benefits:
- You can leverage the high-accuracy search in Amazon Kendra to retrieve the most relevant passages from your enterprise data, improving the accuracy and quality of your LLM responses.
- You can use Amazon Kendra’s deep learning search models that are pre-trained on 14 domains and don’t require any machine learning expertise. So, there’s no need to deal with word embeddings, document chunking, and other lower-level complexities typically required for RAG implementations.
- You can easily integrate Amazon Kendra with various LLMs, such as those coming soon via Amazon Bedrock and Amazon Titan, transforming how developers and enterprises can solve traditionally complex challenges in natural language processing and understanding.
Conclusion
In this blog post, we showed you how to use Amazon Kendra as a retriever to build retrieval augmented generation (RAG) systems. We explained what RAG is, how it works, how to use Amazon Kendra as a retriever, and its benefits. We hope you find this blog post useful and informative.