

How to Chat with Multiple Documents Using Watsonx and Streamlit

By Abdul Rahman posted 5 days ago

  

Chat with Multiple Documents Using Watsonx and LangChain 😻


A Streamlit-powered app for querying multiple document types using Watsonx and LangChain.
Ever needed to search through multiple documents at once? With IBM Watsonx and LangChain, this project offers a streamlined, AI-powered solution. Users can easily upload a variety of file types (PDFs, DOCX, CSV, etc.) and get accurate answers based on the content. Whether you're handling research papers, business reports, or technical documentation, this app has you covered, making document retrieval faster and more efficient!


"Note: While this app runs efficiently on machines with low specifications, for faster indexing and response times, I recommend using a more powerful machine."

Live Demo 

🎉 Try It Now: 🌐 Experience the Live Demo and see how easily you can query multiple documents!

Link - (https://huggingface.co/spaces/RAHMAN00700/Chat-with-Multiple-Documents-Using-Streamlit-and-Watsonx)

This is what the app looks like.

Features

  • File Support:    Upload and query PDFs, Word documents, PowerPoint slides, CSVs, JSON, YAML, HTML, and plain text files.

  • Watsonx LLM Integration:  Leverage the power of IBM Watsonx models for precise, context-aware querying.

  • Embeddings with HuggingFace: Fast and accurate document indexing using HuggingFace embeddings.

  • RAG (Retrieval Augmented Generation):  Combines robust document retrieval with LLMs for precise answer generation.

  • User-Friendly Interface: An intuitive and responsive app interface built with Streamlit.
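
To make the retrieve-then-generate idea behind the RAG feature concrete, here is a minimal sketch using only the Python standard library. It is not the app's code: word-count vectors stand in for the HuggingFace embeddings, the function names are hypothetical, and the final call to a Watsonx model is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(q, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble the prompt that would be sent to the language model."""
    context = "\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only this context:\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
```

In a real pipeline the chunks would come from the uploaded files and the similarity search would run over embedding vectors in a vector store, but the flow is the same: rank chunks against the question, then hand the best ones to the model as context.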

Installation
Follow these steps to set up and run the project locally and get started with multi-document retrieval in minutes.
Step 1: Prerequisites
  • Python 3.8+ installed on your system.

  • Install `pip` (Python package manager).

  • An IBM Watsonx API key and Project ID.

  • Install Git if not already installed.
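
If you want to confirm the prerequisites before going further, a small preflight script can help. This is a sketch, not part of the repository: the function name is hypothetical, and it assumes the tools are exposed on PATH as `pip` and `git` (on some systems pip is only available as `pip3` or `python -m pip`). It does not check the Watsonx credentials.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means everything was found."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or newer is required")
    for tool in ("pip", "git"):
        # shutil.which returns None when the executable is not on PATH
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


for problem in check_prerequisites():
    print("Missing:", problem)
```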
Step 2: Clone the Repository
 
The source code is on GitHub (https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx). Clone it and move into the project directory:
git clone https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx.git
cd Multi-Doc-Retrieval-Watsonx
 
 

Step 3: Install Dependencies

1. Create a virtual environment (optional but recommended):
    python -m venv env
    source env/bin/activate  # On Windows: .\env\Scripts\activate
2. Install required Python packages:
   pip install -r requirements.txt
  
3. Set environment variables:
Create a `.env` file in the project directory with the following keys:
  WATSONX_API_KEY=<your_watsonx_api_key>
  WATSONX_PROJECT_ID=<your_watsonx_project_id>
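
For illustration, here is one way the two keys could be read and validated at startup. It is a minimal standard-library sketch with hypothetical function names; the app itself may load the file differently (for example through a dotenv package), so treat this as an assumption about the pattern rather than a copy of `app.py`.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file from disk and return its settings as a dict."""
    with open(path, encoding="utf-8") as handle:
        return parse_env_text(handle.read())


def require_credentials(values):
    """Return (api_key, project_id), or raise if either one is missing."""
    names = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")
    missing = [name for name in names if not values.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return values["WATSONX_API_KEY"], values["WATSONX_PROJECT_ID"]
```

Calling `require_credentials(load_env_file())` early means a missing key fails with a clear message instead of an authentication error later on.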
Step 4: Run the App
Once you've installed the dependencies and set up your environment variables, launch the app by running:
   streamlit run app.py
 
Then, open the displayed URL (typically http://localhost:8501) in your browser.

Step 5: How to Use the App
"Pro Tip: Try uploading different small document types simultaneously to test the app's ability to cross-reference information and provide context-aware answers!"
  • Upload Documents: Drag and drop supported files (e.g., PDFs, DOCX, JSON) in the app sidebar.

  • Select Model and Parameters: Choose a Watsonx model and configure settings like output tokens and decoding methods.

  • Ask Questions: Enter queries in the chat input to retrieve answers based on the uploaded documents.
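
As a rough picture of what the model settings amount to, the sidebar choices can be thought of as a small parameter dictionary passed along with each request. The sketch below is an assumption, not the app's code: the helper name and default values are hypothetical, and the key names follow watsonx.ai text generation parameter naming (`decoding_method`, `max_new_tokens`, and so on) as I understand it.

```python
def build_generation_params(decoding_method="greedy", max_new_tokens=200,
                            temperature=0.7, top_k=50, top_p=1.0):
    """Build a generation-parameter dict from the user's selections."""
    if decoding_method not in ("greedy", "sample"):
        raise ValueError("decoding_method must be 'greedy' or 'sample'")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")

    params = {
        "decoding_method": decoding_method,
        "max_new_tokens": max_new_tokens,
    }
    # Sampling controls only matter when the model samples its output
    if decoding_method == "sample":
        params.update(temperature=temperature, top_k=top_k, top_p=top_p)
    return params
```

Greedy decoding gives repeatable answers, which tends to suit document question answering; sampling adds variety at the cost of consistency.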

---

Project Structure
Multi-Doc-Retrieval-Watsonx/
├── app.py               # Main application file
├── requirements.txt     # Python dependencies
├── README.md            # Project documentation
└── .env                 # Environment variables (not included in repo, create manually)
 
"Note: The .env file contains sensitive API keys and should not be shared publicly."
Dependencies
  • Streamlit:  Builds the app's interactive user interface.

  • LangChain: Handles document retrieval and enhances query responses.

  • HuggingFace Transformers:  Provides embeddings for efficient document search.

  • Watsonx Models: Powers the AI's text generation and question answering.

  • Python Libraries: Supports file handling (pandas, python-docx, python-pptx, etc.) for processing different document types.
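
To show how handling several file types usually comes down to dispatching on the file extension, here is a minimal sketch covering only the formats the Python standard library can read (plain text, HTML, CSV, JSON). All names are hypothetical and this is not the app's implementation; PDF, DOCX, and PPTX would need the third-party libraries listed above and are deliberately left out.

```python
import csv
import io
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document (tags are dropped)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def html_to_text(raw):
    parser = _TextExtractor()
    parser.feed(raw)
    return " ".join(parser.parts)


def csv_to_text(raw):
    rows = csv.reader(io.StringIO(raw))
    return "\n".join(", ".join(row) for row in rows)


def json_to_text(raw):
    # Re-serialize so the indexed text has a stable, readable layout
    return json.dumps(json.loads(raw), indent=2)


# Map each supported extension to the function that turns it into text
LOADERS = {
    ".txt": str,
    ".html": html_to_text,
    ".csv": csv_to_text,
    ".json": json_to_text,
}


def extract_text(filename, raw):
    """Pick a loader from the file extension and return plain text."""
    suffix = Path(filename).suffix.lower()
    loader = LOADERS.get(suffix)
    if loader is None:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return loader(raw)
```

Adding a new format then means writing one function and registering it in `LOADERS`, without touching the rest of the pipeline.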
Contributing 
"I welcome all contributions! Feel free to suggest improvements, fix bugs, or add new features. Let's make this project even better together. If you found this helpful, give it a star ⭐ on GitHub!"
  • Fork the repository.
  • Create a feature branch:
git checkout -b feature-name
  • Commit your changes:
git commit -m 'Add a new feature'
  • Push to the branch:   
git push origin feature-name
  • Open a Pull Request.

Conclusion 

"Ready to supercharge your document search capabilities? Try the demo now, clone the repository, and start exploring the possibilities of AI-powered multi-document retrieval with Watsonx and LangChain!"


 


#watsonx.ai
#GenerativeAI
