Chat with Multiple Documents Using Watsonx and LangChain
A Streamlit-powered app for querying multiple document types using Watsonx and LangChain.
Ever needed to search through multiple documents at once? With IBM Watsonx and LangChain, this project offers a streamlined, AI-powered solution. Users can easily upload a variety of file types (PDFs, DOCX, CSV, etc.) and get accurate answers based on the content. Whether you're handling research papers, business reports, or technical documentation, this app has you covered, making document retrieval faster and more efficient!
"Note: While this app runs efficiently on machines with low specifications, for faster indexing and response times, I recommend using a more powerful machine."
Live Demo
Try It Now: Experience the Live Demo and see how easily you can query multiple documents!
Link - (https://huggingface.co/spaces/RAHMAN00700/Chat-with-Multiple-Documents-Using-Streamlit-and-Watsonx)
Features
- File Support: Upload and query PDFs, Word documents, PowerPoint slides, CSVs, JSON, YAML, HTML, and plain text files.
- Watsonx LLM Integration: Leverage the power of IBM Watsonx models for precise, context-aware querying.
- Embeddings with HuggingFace: Fast and accurate document indexing using HuggingFace embeddings.
- RAG (Retrieval Augmented Generation): Combines robust document retrieval with LLMs for precise answer generation.
- User-Friendly Interface: An intuitive and responsive app interface built with Streamlit.
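The retrieval-augmented flow named in the feature list can be illustrated with a dependency-free sketch. This is not the app's implementation: the bag-of-words `embed` function is a toy stand-in for HuggingFace embeddings, and `retrieve` and `build_prompt` are hypothetical helper names. The assembled prompt is what would be passed to a Watsonx model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical chunks, as if extracted from several uploaded documents.
chunks = [
    "The quarterly report shows revenue grew 12 percent.",
    "The onboarding guide explains how to request a laptop.",
    "Revenue growth was driven by the cloud division.",
]
question = "What drove revenue growth?"
top_chunks = retrieve(question, chunks, k=2)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real pipeline the chunks would come from the uploaded files, and a vector store would replace the linear scan in `retrieve`.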
Installation
Follow these steps to set up and run the project locally. You can get started with multi-document retrieval in minutes.
Step 1 : Prerequisites
- Python 3.8+ installed on your system.
- `pip` (Python package manager).
- An IBM Watsonx API key and Project ID.
- Git.
Step 2 : Clone the Repository
Go to the GitHub page. (https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx)
git clone https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx.git
cd Multi-Doc-Retrieval-Watsonx
Step 3 : Install Dependencies
1. Create a virtual environment (optional but recommended):
python -m venv env
source env/bin/activate # On Windows: .\env\Scripts\activate
2. Install required Python packages:
pip install -r requirements.txt
3. Set environment variables:
Create a `.env` file in the project directory with the following keys:
WATSONX_API_KEY=<your_watsonx_api_key>
WATSONX_PROJECT_ID=<your_watsonx_project_id>
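A quick way to confirm that both keys are present before launching the app is a small check like the one below. It is a minimal sketch using only the standard library; `parse_env_text` and `load_settings` are hypothetical helper names, and the app itself may load the `.env` file differently (for example with a dotenv package).

```python
import os

REQUIRED_KEYS = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env=None):
    """Return the required settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {k: env[k] for k in REQUIRED_KEYS}


# Placeholder values for illustration only; never commit real keys.
sample = "WATSONX_API_KEY=example-key\nWATSONX_PROJECT_ID=example-project\n"
settings = load_settings(parse_env_text(sample))
print(sorted(settings))
```

Failing early with a clear message is easier to debug than an authentication error raised later by the model client.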
Step 4 : Run the App
Once you've installed the dependencies and set up your environment variables, launch the app with Streamlit, using the main application file `app.py`:
streamlit run app.py
Step 5 : How to Use the app
"Pro Tip: Try uploading different small document types simultaneously to test the app's ability to cross-reference information and provide context-aware answers!"
- Upload Documents: Drag and drop supported files (e.g., PDFs, DOCX, JSON) in the app sidebar.
- Select Model and Parameters: Choose a Watsonx model and configure settings like output tokens and decoding methods.
- Ask Questions: Enter queries in the chat input to retrieve answers based on the uploaded documents.
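The model settings mentioned above can be thought of as a small parameter dictionary. The sketch below is an assumption, not the app's code: the key names `decoding_method`, `max_new_tokens`, and `temperature` follow the watsonx.ai text generation parameters as I understand them, and `build_generation_params` is a hypothetical helper.

```python
ALLOWED_DECODING = ("greedy", "sample")


def build_generation_params(decoding_method="greedy", max_new_tokens=200, temperature=0.7):
    """Validate sidebar-style settings and return a parameter dictionary."""
    if decoding_method not in ALLOWED_DECODING:
        raise ValueError(f"decoding_method must be one of {ALLOWED_DECODING}")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    params = {
        "decoding_method": decoding_method,
        "max_new_tokens": max_new_tokens,
    }
    # Temperature only matters when sampling; greedy decoding ignores it.
    if decoding_method == "sample":
        params["temperature"] = temperature
    return params


greedy_params = build_generation_params()
sample_params = build_generation_params("sample", max_new_tokens=300, temperature=0.5)
print(greedy_params)
print(sample_params)
```

Greedy decoding gives repeatable answers, which suits document Q&A. Sampling adds variety at the cost of consistency.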
---
Project Structure
Multi-Doc-Retrieval-Watsonx/
├── app.py # Main application file
├── requirements.txt # Python dependencies
├── README.md # Project documentation
└── .env # Environment variables (not included in repo, create manually)
"Note: The `.env` file contains sensitive API keys and should not be shared publicly."
Dependencies
- Streamlit: Builds the app's interactive user interface.
- LangChain: Handles document retrieval and enhances query responses.
- HuggingFace Transformers: Provides embeddings for efficient document search.
- Watsonx Models: Powers the AI's text generation and question answering.
- Python Libraries: Supports file handling (pandas, python-docx, python-pptx, etc.) for processing different document types.
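Handling many file types usually comes down to dispatching on the file extension. The sketch below shows that pattern with standard-library parsers only. It is an illustration rather than the app's code: `extract_text` and `LOADERS` are hypothetical names, and formats such as DOCX or PPTX would need the third-party libraries listed above.

```python
import csv
import io
import json
from pathlib import Path


def text_from_csv(data):
    """Flatten CSV rows into one line of text per row."""
    rows = csv.reader(io.StringIO(data))
    return "\n".join(", ".join(row) for row in rows)


def text_from_json(data):
    """Re-serialize JSON with stable key order so it indexes consistently."""
    return json.dumps(json.loads(data), indent=2, sort_keys=True)


# Map each supported extension to the function that turns its content into text.
LOADERS = {
    ".txt": lambda data: data,
    ".csv": text_from_csv,
    ".json": text_from_json,
}


def extract_text(filename, data):
    """Pick a loader by file extension and return plain text for indexing."""
    suffix = Path(filename).suffix.lower()
    loader = LOADERS.get(suffix)
    if loader is None:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return loader(data)


csv_text = extract_text("people.CSV", "name,age\nAda,36\n")
json_text = extract_text("config.json", '{"b": 1, "a": 2}')
print(csv_text)
print(json_text)
```

Adding a new format then means registering one more entry in `LOADERS`, without touching the rest of the pipeline.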
Contributing
"I welcome all contributions! Feel free to suggest improvements, fix bugs, or add new features. Let's make this project even better together. If you found this helpful, give it a star on GitHub!"
- Fork the repository.
- Create a feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m 'Add a new feature'
- Push the branch to your fork:
git push origin feature-name
Conclusion
"Ready to supercharge your document search capabilities? Try the demo now, clone the repository, and start exploring the possibilities of AI-powered multi-document retrieval with Watsonx and LangChain!"