IBM Fusion

IBM Fusion Content Aware Storage: Making Data Better, Faster and Stronger

By Shu Mookerjee posted 14 days ago

  

 

RAG Against the Machine

Data is typically consumed more quickly than it’s produced. After all, the more information an organization generates, the more analytics it can gather for operations, process improvement or business efficiency.

This same information can be used to ground Artificial Intelligence and Machine Learning (AI/ML) applications, helping a business automate more and respond faster. One common way to do this is a process called “Retrieval-Augmented Generation” (RAG), which supplies relevant content to a model at query time rather than retraining it, and involves three basic steps:
 
  1. Searching the knowledge base for content relevant to the question
  2. Augmenting the question with that content as additional context, to facilitate a more accurate response
  3. Sending the augmented question to the generative model (for example, an LLM) to generate a response
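
The three steps above can be sketched in a few lines of Python. This is a toy illustration only, not how CAS or any IBM component is implemented: the word-overlap ranking stands in for a real search service, and `llm` is a hypothetical callable standing in for whatever model the application uses.

```python
import re
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Step 1: rank documents by word overlap with the question (toy relevance)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    # Keep only documents that share at least one word with the question.
    return [d for d in ranked[:top_k] if q & tokenize(d)]


def augment(question: str, passages: List[str]) -> str:
    """Step 2: combine the retrieved passages with the original question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Step 3: send the augmented prompt to the model (hypothetical callable)."""
    return llm(prompt)


def answer(question: str, documents: List[str], llm: Callable[[str], str]) -> str:
    """Run search, augment and generate in order."""
    return generate(augment(question, search(question, documents)), llm)
```

Calling `answer("When do backups run?", docs, llm)` with a list of short text snippets runs all three steps; in practice the search step is where most of the engineering effort goes.
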

The challenge, however, is that most of this data is unstructured, with its contents locked away in non-database formats like emails, presentations or multimedia files. This type of data does not easily lend itself to extraction or inspection, making analytics slow, difficult and expensive.


What’s In YOUR Content?

This is where Content Aware Storage (CAS) comes in! Introduced in the latest release of Fusion Software (2.9.1), CAS is an innovative step towards solving these challenges and getting the most out of organizational data. 

It’s built on NVIDIA’s inference microservices (NIM), which use the NeMo Retriever to pull data from text, images or charts by translating the relevant information into numerical values called vectors. These vectors are stored in a vector database and fed through pre-built, automated pipelines that enhance and accelerate the RAG process and keep the information up to date.
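
To make the idea of vectors and a vector database concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `embed` is a toy hashed bag-of-words function standing in for a real embedding model such as the ones NeMo Retriever serves, and `VectorStore` is a hypothetical in-memory class, not the database CAS uses.

```python
import hashlib
import math
import re
from typing import Dict, List, Tuple

DIM = 64  # toy dimensionality; real embedding models use far more dimensions


def embed(text: str) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # md5 gives a stable bucket across runs (built-in hash() is randomised).
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class VectorStore:
    """Hypothetical in-memory vector database keyed by document id."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[List[float], str]] = {}

    def __len__(self) -> int:
        return len(self._rows)

    def upsert(self, doc_id: str, text: str) -> None:
        """Insert a document, or replace its vector if the id already exists."""
        self._rows[doc_id] = (embed(text), text)

    def query(self, text: str, top_k: int = 1) -> List[Tuple[float, str]]:
        """Return the top_k (cosine similarity, doc_id) pairs for the query text."""
        q = embed(text)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc_id)
            for doc_id, (vec, _) in self._rows.items()
        ]
        scored.sort(reverse=True)
        return scored[:top_k]
```

Because `upsert` overwrites an existing id, re-processing a changed document replaces its old vector instead of adding a stale duplicate, which is one simple way a pipeline can keep the store current.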

Deployed as a service from the Fusion dashboard, CAS provides turn-key processing of unstructured data and makes the extracted information easily available to RAG applications. 
 
CAS leverages the Global Data Platform data service (IBM Storage Scale) to cache existing enterprise data residing on S3 sources without having to manage multiple copies of the data. CAS also supports automated processing of data residing in the native IBM Storage Scale filesystem and can track changes in that data, automatically processing the modifications for use in RAG applications.
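
How CAS detects changes internally is not covered here, so the following is only a generic sketch of the idea: compare content fingerprints between two snapshots and re-process whatever is new or modified. The function names and the in-memory `current_files` mapping are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Set, Tuple


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


def diff_snapshots(
    previous: Dict[str, str], current_files: Dict[str, bytes]
) -> Tuple[Set[str], Set[str], Dict[str, str]]:
    """Compare the last known fingerprints with the current file contents.

    Returns (changed, removed, new_snapshot), where `changed` holds paths that
    are new or modified and `removed` holds paths that no longer exist.
    """
    current = {path: fingerprint(data) for path, data in current_files.items()}
    changed = {path for path, digest in current.items() if previous.get(path) != digest}
    removed = set(previous) - set(current)
    return changed, removed, current
```

A pipeline built this way would re-extract and re-embed only the paths in `changed`, and drop the vectors for the paths in `removed`.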
 
From an infrastructure standpoint, the CAS service runs on x86 servers and uses NVIDIA L40S or H100 Graphics Processing Units (GPUs) to accelerate the data extraction.


Watch Your (Natural) Language!

To make accessing the content even easier, the CAS service provides a search API that supports semantic search, BM25 keyword search, and hybrid search (semantic + BM25 keyword). This means that data can be queried in real time with natural language prompts.
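
For readers unfamiliar with BM25, the sketch below scores documents against a query with the standard Okapi BM25 formula. It does not call the CAS search API; the parameter defaults (`k1=1.5`, `b=0.75`) and the smoothed IDF variant are common choices assumed here, and the variant CAS uses may differ.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(
    query: str, documents: List[str], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Return one BM25 score per document, in the same order as `documents`."""
    docs = [tokenize(d) for d in documents]
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates (k1) and is normalised by document length (b).
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(score)
    return scores
```

Keyword scoring like this rewards exact term matches, which is why it complements semantic search: one catches precise names and codes, the other catches paraphrases.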
 
Once the prompt is received, the CAS process creates a vector representation of the search and performs a query against the vector database using the method specified by the application (semantic, keyword, or hybrid). The top results are sent to an aggregator component, which uses additional metadata to optimize the results. After a few iterations (to ensure data integrity), the results are returned to the application.
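
How the semantic and keyword results are merged in hybrid mode is not spelled out above. One widely used technique is reciprocal rank fusion (RRF), sketched below purely as an assumption about what such a merge could look like; the function name and the constant `k=60` are illustrative, not taken from CAS.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in, so
    documents ranked well by more than one method rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a semantic ranking `["a", "b", "c"]` with a keyword ranking `["b", "d", "a"]` puts `b` first, because it sits near the top of both lists. RRF needs only ranks, not raw scores, which sidesteps the problem that cosine similarities and BM25 scores are on different scales.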
 
In a customer-provided RAG chatbot, the search results returned by the CAS search API can be combined with the original prompt or question and sent to a large language model (LLM) to generate the answer provided to the end user.
 
All this to say that Content Aware Storage is the next step in using AI/ML to get the most from an organization’s data, generating deeper insights that improve responsiveness, optimization and efficiency.

 
For more information:

 
Please see Vincent Hsu’s blog here and join us for the Enhancing AI Results with Content-Aware Storage webinar on April 10th and Accelerate with ATG Webinar on CAS on April 29th.


