What is Retrieval-Augmented Generation?
Large Language Models have a training cutoff, and they lack domain-specific knowledge: because they are trained for general-purpose tasks, you cannot use them out of the box to ask questions about your own data.
That’s where Retrieval-Augmented Generation (RAG) comes in: an architecture that retrieves the most relevant, contextually important data and supplies it to the LLM when answering questions.
The three key components for building a RAG system are:
An Embedding Model, which converts your data into vector embeddings;
A Vector Database, which stores and retrieves those embeddings; and
A Large Language Model (LLM), which answers using the context retrieved from the vector database.
Clarifai provides all three in a single platform, letting you seamlessly build RAG applications.
How to build a Retrieval-Augmented Generation system
As part of our “AI in 5” series, where we show you how to create amazing things in just 5 minutes, this blog walks through building a RAG system in just 4 lines of code using Clarifai’s Python SDK.
Step 1: Install Clarifai and set your Personal Access Token as an environment variable
First, install the Clarifai Python SDK with a pip command.
Now, set your Clarifai Personal Access Token (PAT) as an environment variable so the SDK can access the LLMs and the vector store. To create a new PAT, sign up for Clarifai, or if you already have an account, log in to the portal and go to the Security option in Settings. Create a new token by providing a description and selecting the scopes, then copy the token and set it as an environment variable.
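The two setup commands look like this (the token value below is a placeholder you replace with your own PAT):

```shell
# Install the Clarifai Python SDK
pip install clarifai

# Set your Personal Access Token (placeholder value) as an environment variable
export CLARIFAI_PAT="YOUR_PAT_HERE"
```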
Once you have installed the Clarifai Python SDK and set your Personal Access Token as an environment variable, you can see that all you need are just these 4 lines of code to build a RAG system. Let’s look at them!
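Here they are as a sketch of the full pipeline. The user ID and file path are placeholders; running this requires the clarifai package installed and your CLARIFAI_PAT exported as shown in Step 1.

```python
from clarifai.rag import RAG

rag_agent = RAG.setup(user_id="YOUR_USER_ID")            # create the app and RAG workflow
rag_agent.upload(file_path="path/to/your_document.pdf")  # chunk, embed, and index the document
print(rag_agent.chat(messages=[{"role": "human", "content": "Summarize this document."}]))
```

The rest of this post walks through what each of these lines does.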
Step 2: Set up the RAG system by passing your Clarifai user ID
First, import the RAG class from the Clarifai Python SDK. Then set up your RAG system by calling the setup method with your Clarifai user ID. Since you are already signed up to the platform, you can find your user ID under the Account option in Settings.
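In code, with a placeholder in place of your actual user ID:

```python
from clarifai.rag import RAG

# "YOUR_USER_ID" is a placeholder -- use the user ID from your account settings.
rag_agent = RAG.setup(user_id="YOUR_USER_ID")
```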
Once you pass the user ID, the setup method will create:
A Clarifai app with “Text” as the base workflow. If you are not familiar with apps, they are the basic building blocks for creating projects on the Clarifai platform; your data, annotations, models, predictions, and searches are all contained within applications. Here, the app acts as your vector database: once you upload data to it, the app embeds the data and indexes the embeddings based on your base workflow, and you can then query those embeddings for similarity.
A RAG prompter workflow inside the newly created app. Workflows in Clarifai let you combine multiple models and operators to build powerful multi-modal systems for various use cases. Let’s look at the RAG prompter workflow and what it does.
The workflow has an input, a RAG prompter model type, and a text-to-text model type. Here’s the flow: whenever a user sends an input prompt, the RAG prompter uses that prompt to find the relevant context in the Clarifai vector store. The context is then passed, along with the prompt, to the text-to-text model, which by default is the Mistral-7B-Instruct model; the LLM uses both the retrieved context and the user query to generate the answer. That’s the RAG prompter workflow.
You don’t need to worry about any of this, as the setup method handles these tasks for you. All you need to do is pass your user ID.
There are other parameters available in the setup method:
app_url: If you already have a Clarifai app that contains your data, you can pass the URL of that app instead of creating an app from scratch using the user ID.
llm_url: As we have seen, the prompter workflow uses the Mistral-7B-Instruct model by default, but there are many other open-source and third-party LLMs in the Clarifai community. You can pass the URL of your preferred LLM.
base_workflow: As mentioned, your data is embedded in the Clarifai app based on the base workflow. By default this is the Text workflow, but other workflows are available as well; you can specify your preferred one.
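A sketch combining these optional parameters; both URLs below are placeholders illustrating the expected format, not real resources to copy verbatim:

```python
from clarifai.rag import RAG

# Reuse an existing app instead of creating a new one, and swap in a different LLM.
rag_agent = RAG.setup(
    app_url="https://clarifai.com/YOUR_USER_ID/YOUR_APP_ID",  # placeholder app URL
    llm_url="https://clarifai.com/SOME_PROVIDER/SOME_APP/models/SOME_LLM",  # placeholder LLM URL
    base_workflow="Text",  # the default embedding workflow
)
```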
Step 3: Upload your Documents
Next, upload your documents to embed and store them in the Clarifai vector database. You can pass a file path to your document, a folder path to the documents, or a public URL to the document.
In this example, I am passing the path to a PDF file, which is a recent survey paper on multimodal LLMs. Once you upload the document, it will be loaded and parsed into chunks based on the chunk_size and chunk_overlap parameters. By default, the chunk_size is set to 1024, and the chunk_overlap is set to 200. However, you can adjust these parameters.
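The upload step might look like this; the path is a placeholder for your own PDF, and the two chunking parameters are shown with their documented defaults:

```python
# Assumes rag_agent was created by RAG.setup in the previous step.
rag_agent.upload(
    file_path="path/to/multimodal_llm_survey.pdf",  # placeholder path
    chunk_size=1024,    # max characters per chunk (default)
    chunk_overlap=200,  # characters shared by consecutive chunks (default)
)
```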
Once the document is parsed into chunks, it will ingest the chunks into the Clarifai app.
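To build intuition for what chunk_size and chunk_overlap do, here is a minimal, hypothetical sketch of overlapping chunking. This is not Clarifai’s actual parser, just an illustration of how a sliding window with overlap splits a document:

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of at most chunk_size characters, where each
    window shares chunk_overlap characters with the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# A 2,500-character document becomes 3 overlapping chunks.
doc = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(doc)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps retrieval return coherent context.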
Step 4: Chat with your Documents
Finally, chat with your data using the chat method. Here, I am asking it to summarize the PDF and its research on multimodal large language models.
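The chat call might look like this, assuming the rag_agent from the setup and upload steps above:

```python
# Ask a question grounded in the uploaded document; the RAG prompter retrieves
# the relevant chunks and the LLM answers using them as context.
response = rag_agent.chat(
    messages=[{"role": "human",
               "content": "Summarize the PDF and its research on multimodal LLMs."}]
)
print(response)
```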
Conclusion
That’s how easy it is to build a RAG system with the Python SDK in 4 lines of code. To summarize: to set up the RAG system, pass your user ID, or if you have your own Clarifai app, pass that app URL. You can also pass your preferred LLM and workflow.
Next, upload the documents, and there is an option to specify the chunk_size and chunk_overlap parameters to help parse and chunk the documents.
Finally, chat with your documents. You can find the link to the Colab Notebook here to implement this.
If you’d prefer to watch this tutorial you can find the YouTube video here.