Forget RAG and welcome Agentic RAG!

By Armand Ruiz posted Wed January 15, 2025 04:57 PM

  

Native RAG
In Native RAG, the most common implementation nowadays, the user query is processed through a pipeline that includes retrieval, reranking, synthesis, and generation of a response. 
 
This process leverages retrieval and generation-based methods to provide accurate and contextually relevant answers. 
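
To make the four stages concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not a production pipeline: word overlap stands in for embedding search, a simple ratio stands in for a reranking model, and `fake_llm` is a hypothetical placeholder for a real model call. All function names are illustrative.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, docs, k=3):
    """Stage 1 (retrieval): keep up to k docs that share at least one word with the query."""
    q = set(tokenize(query))
    scored = [(len(q & set(tokenize(d))), d) for d in docs]
    hits = [(s, d) for s, d in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits[:k]]

def rerank(query, candidates, k=2):
    """Stage 2 (reranking): re-score by the fraction of a doc's words found in the query.
    This is an assumed stand-in for a dedicated reranking model."""
    q = set(tokenize(query))

    def score(doc):
        toks = tokenize(doc)
        return sum(t in q for t in toks) / max(len(toks), 1)

    return sorted(candidates, key=score, reverse=True)[:k]

def synthesize(query, passages):
    """Stage 3 (synthesis): pack the selected passages and the question into one prompt."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def fake_llm(prompt):
    """Stage 4 (generation): hypothetical placeholder; it only reports the prompt size."""
    return f"(stub answer generated from a {len(prompt)}-character prompt)"

def answer(query, docs, llm=fake_llm):
    """Run the fixed pipeline: retrieve, rerank, synthesize, generate."""
    passages = rerank(query, retrieve(query, docs))
    return llm(synthesize(query, passages))

DOCS = [
    "RAG combines retrieval with text generation.",
    "Reranking reorders retrieved passages by relevance.",
    "Bananas are rich in potassium.",
    "Agents can call tools and plan multi-step tasks.",
]
QUERY = "How does reranking order retrieved passages?"

if __name__ == "__main__":
    print(answer(QUERY, DOCS))
```

Note that the order of the stages is fixed: every query takes the same path, whatever it asks. That rigidity is what the agentic approach relaxes.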
 
Agentic RAG
Agentic RAG is an advanced, agent-based approach to question answering over multiple documents in a coordinated manner. Typical tasks include comparing different documents, summarizing specific documents, or comparing various summaries.
 
Agentic RAG is a flexible framework that supports complex tasks requiring planning, multi-step reasoning, tool use, and learning over time. 
 
Key Components and Architecture
- Document Agents: Each document is assigned a dedicated agent capable of answering questions and summarizing within its own document. 
 
- Meta-Agent: A top-level agent manages all the document agents, orchestrating their interactions and integrating their outputs to generate a coherent and comprehensive response. 
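
The two roles above can be sketched in a few lines of Python. This is a simplified illustration, not a reference implementation: the class names `DocumentAgent` and `MetaAgent`, the stopword list, and the keyword-overlap routing are all assumptions made for the example. In a real system, each agent would typically wrap an LLM with retrieval and summarization tools, and the meta-agent would decide which agents to call through model-driven planning.

```python
import re
from dataclasses import dataclass

# Assumed, deliberately tiny stopword list so that routing ignores filler words.
STOPWORDS = {"a", "the", "is", "and", "what", "how", "much", "does", "at", "per"}

def keywords(text):
    """Return the set of lowercase word tokens in text, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}

@dataclass
class DocumentAgent:
    """Answers questions and summarizes within a single document."""
    name: str
    text: str

    def sentences(self):
        return [s for s in re.split(r"(?<=[.!?])\s+", self.text.strip()) if s]

    def relevance(self, question):
        """Number of question keywords that also appear in this document."""
        return len(keywords(question) & keywords(self.text))

    def answer(self, question):
        """Toy answer: the sentence sharing the most keywords with the question."""
        q = keywords(question)
        return max(self.sentences(), key=lambda s: len(q & keywords(s)))

    def summarize(self):
        """Toy summary: the first sentence of the document."""
        return self.sentences()[0]

class MetaAgent:
    """Top-level agent that routes a question to document agents and merges their outputs."""

    def __init__(self, agents):
        self.agents = list(agents)

    def route(self, question):
        """Pick the agents whose documents look relevant, most relevant first."""
        relevant = [a for a in self.agents if a.relevance(question) > 0]
        return sorted(relevant, key=lambda a: a.relevance(question), reverse=True)

    def ask(self, question):
        chosen = self.route(question)
        if not chosen:
            return "No document agent could answer this question."
        return "\n".join(f"{a.name}: {a.answer(question)}" for a in chosen)

    def compare_summaries(self):
        """Collect one summary per document so they can be read side by side."""
        return "\n".join(f"{a.name}: {a.summarize()}" for a in self.agents)

meta = MetaAgent([
    DocumentAgent("pricing", "The Pro plan costs 20 dollars per month. Annual billing gives a discount."),
    DocumentAgent("security", "All data is encrypted at rest. Access requires two-factor authentication."),
])

if __name__ == "__main__":
    print(meta.ask("What is the plan cost and is data encrypted?"))
    print(meta.compare_summaries())
```

The design point is the separation of concerns: each document agent only knows its own document, while the meta-agent decides who to consult and how to combine the results.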
 
Features and Benefits
- Autonomy: Agents act independently to retrieve, process, and generate information. 
 
- Adaptability: The system can adjust strategies based on new data and changing contexts. 
 
- Proactivity: Agents can anticipate needs and take preemptive actions to achieve goals. 

Applications
Agentic RAG is particularly useful in scenarios requiring thorough and nuanced information processing and decision-making. 
 
A few days ago, I discussed how the future of AI lies in AI Agents. RAG is currently the most popular use case, and with an agentic architecture, you will supercharge RAG!


#GenerativeAI

Comments

Thu January 16, 2025 09:17 AM

Great insights in your post! One concern I have with Agentic RAG is the challenge of managing operational costs and maintaining predictability, especially since it heavily relies on user behavior and the complexity of interactions. The high token usage for retrieval and response generation often results in unpredictable and potentially unsustainable costs as demand grows. Even Sam Altman has pointed out that their $200 subscription doesn't fully cover operational expenses, which highlights how tough it can be to balance advanced capabilities with financial sustainability.

What are your thoughts on tackling these cost challenges? Could shifting to smaller, localized models help improve cost predictability? Or perhaps optimizing token efficiency with specialized hardware for inference, like IBM NorthPole or Groq LPU, could be the key? Or maybe it's a combination of both? Would love to hear your thoughts!

Wed January 15, 2025 10:09 PM

AI agents are the next big thing!