The RAG task
In this short blog post, I want to highlight how to use the watsonx Granite model series and LangChain to answer questions about the State of the Union speech. In other words, I want a Granite LLM to generate answers grounded in the provided document content (the State of the Union speech).
This type of task is called Retrieval-Augmented Generation (RAG). RAG is a versatile pattern that can unlock a number of use cases requiring factual recall of information, such as querying a knowledge base in natural language.
In its simplest form, RAG requires three steps:
- Index knowledge base passages (once)
- Retrieve relevant passage(s) from a knowledge base (for every user query)
- Generate a response by feeding the user query and the retrieved passage(s) into a large language model (for every user query)
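The three steps above can be sketched in plain Python. This is a toy illustration only: retrieval here is simple word overlap and "generation" is a templated string, purely to make the control flow concrete (a real pipeline uses embeddings and an LLM, as shown later in this post).

```python
def index(passages):
    # Step 1: index the knowledge base once (here: tokenize each passage).
    return [(p, set(p.lower().split())) for p in passages]

def retrieve(indexed, query):
    # Step 2: retrieve the passage with the highest word overlap with the query.
    q = set(query.lower().split())
    return max(indexed, key=lambda pair: len(q & pair[1]))[0]

def generate(query, passage):
    # Step 3: a real system would feed query + passage to an LLM here.
    return f"Answering '{query}' using context: '{passage}'"

kb = index([
    "The union is strong and the economy is growing.",
    "The speech covered healthcare and infrastructure.",
])
question = "what did the speech cover about healthcare"
print(generate(question, retrieve(kb, question)))
```

The point of the sketch is the shape of the pipeline: indexing happens once, while retrieval and generation happen for every user query.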
The implementation
Watsonx.ai foundation models are now supported in the LangChain ecosystem, so you can talk to IBM Granite models from LangChain code. You can also easily combine those models with the RetrievalQA chain type, the Chroma vector store, and HuggingFaceEmbeddings, all designed by LangChain to simplify and automate the RAG task.
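A minimal sketch of that wiring might look like the following. This is an assumption-laden example, not the exact notebook code: it assumes the `langchain`, `langchain-ibm`, `chromadb`, and `sentence-transformers` packages are installed, that a watsonx.ai API key is set in the `WATSONX_APIKEY` environment variable, and that the file name, project id, and Granite model id are placeholders you would replace with your own.

```python
# Sketch only: package names, model id, URL, and file paths are assumptions.
from langchain.chains import RetrievalQA
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_ibm import WatsonxLLM

# Index the speech once: load, split into chunks, embed, and store in Chroma.
docs = TextLoader("state_of_the_union.txt").load()
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_documents(docs)
db = Chroma.from_documents(chunks, HuggingFaceEmbeddings())

# A Granite model served by watsonx.ai acts as the generator
# (model_id and project_id are placeholders).
llm = WatsonxLLM(
    model_id="ibm/granite-13b-chat-v2",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="YOUR_PROJECT_ID",
)

# RetrievalQA retrieves relevant chunks and feeds them to the LLM with the query.
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=db.as_retriever()
)
print(qa.invoke("What did the president say about the economy?"))
```

The "stuff" chain type simply stuffs the retrieved chunks into the prompt alongside the question, which matches the simple three-step RAG recipe described earlier.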
If you are interested in a step-by-step description of the solution, please check out this Medium story. Code snippets and a sample notebook are attached. Enjoy!
#watsonx.ai #ai-spotlight