How to Chat with Multiple Documents Using Watsonx and Streamlit

By Abdul Rahman posted Sat November 16, 2024 03:45 AM

Chat with Multiple Documents Using Watsonx and LangChain😻

A Streamlit-powered app for querying multiple document types using Watsonx and LangChain.

Ever needed to search through multiple documents at once? With IBM Watsonx and LangChain, this project offers a streamlined, AI-powered solution. Users can easily upload a variety of file types (PDFs, DOCX, CSV, etc.) and get accurate answers based on the content. Whether you're handling research papers, business reports, or technical documentation, this app has you covered, making document retrieval faster and more efficient!

"Note: While this app runs efficiently on machines with low specifications, for faster indexing and response times, I recommend using a more powerful machine."

Live Demo

🎉 Try It Now:🌐 Experience the Live Demo and see how easily you can query multiple documents!

Link - (https://huggingface.co/spaces/RAHMAN00700/Chat-with-Multiple-Documents-Using-Streamlit-and-Watsonx)

Features

File Support: Upload and query PDFs, Word documents, PowerPoint slides, CSVs, JSON, YAML, HTML, and plain text files.

Watsonx LLM Integration: Leverage the power of IBM Watsonx models for precise, context-aware querying.

Embeddings with HuggingFace: Fast and accurate document indexing using HuggingFace embeddings.

RAG (Retrieval Augmented Generation): Combines robust document retrieval with LLMs for precise answer generation.

User-Friendly Interface: An intuitive and responsive app interface built with Streamlit.
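
To make the RAG feature concrete, here is a minimal sketch of the retrieve-then-prompt flow. It is not the app's actual code: the function names are hypothetical, and a toy bag-of-words vector stands in for the HuggingFace embeddings so the example runs with only the standard library. In the real app, the final prompt would be sent to a Watsonx model.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40):
    """Split text into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts (an assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one LLM prompt."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Retrieving only the top few chunks keeps the prompt short, which is the main reason RAG scales to several uploaded documents at once.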

Installation

Follow these steps to set up and run the project locally on your machine. You can get started with multi-document retrieval in minutes!

Step 1 : Prerequisites

Python 3.8+ installed on your system.

Install `pip` (Python package manager).

An IBM Watsonx API key and Project ID.

Install Git if not already installed.

Step 2 : Clone the Repository

Go to the GitHub page. (https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx)

git clone https://github.com/Abd-al-RahmanH/Multi-Doc-Retrieval-Watsonx.git
cd Multi-Doc-Retrieval-Watsonx

Step 3 : Install Dependencies

1. Create a virtual environment (optional but recommended):

    python -m venv env
    source env/bin/activate  # On Windows: .\env\Scripts\activate

2. Install required Python packages:

   pip install -r requirements.txt

3. Set environment variables:

Create a `.env` file in the project directory with the following keys:

  WATSONX_API_KEY=<your_watsonx_api_key>
  WATSONX_PROJECT_ID=<your_watsonx_project_id>
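
As a rough illustration of how these two values might be read at startup, the sketch below checks that both variables are present before the app continues. The helper name `load_watsonx_credentials` is hypothetical, and the sketch assumes the `.env` values have already been loaded into the process environment (for example by a dotenv-style loader); how `app.py` actually does this may differ.

```python
import os


def load_watsonx_credentials(env=None):
    """Return the Watsonx settings, or raise a clear error if one is missing.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    names = ("WATSONX_API_KEY", "WATSONX_PROJECT_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Failing early with the name of the missing variable is usually easier to debug than an authentication error later in the request.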

Step 4 : Run the App

Once you've installed the dependencies and set up your environment variables, launch the app by running:

   streamlit run app.py

Then, open the displayed URL (typically http://localhost:8501) in your browser.

Step 5 : How to Use the App

"Pro Tip: Try uploading different small document types simultaneously to test the app's ability to cross-reference information and provide context-aware answers!"

Upload Documents: Drag and drop supported files (e.g., PDFs, DOCX, JSON) in the app sidebar.

Select Model and Parameters: Choose a Watsonx model and configure settings like output tokens and decoding methods.

Ask Questions: Enter queries in the chat input to retrieve answers based on the uploaded documents.
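
The model settings from the second step can be pictured as a small parameter dictionary. This is only a sketch: the helper name is hypothetical, the default values are illustrative, and the key names `decoding_method` and `max_new_tokens` (with `greedy` and `sample` as decoding options) reflect my understanding of the watsonx.ai text generation parameters, so check them against the current IBM documentation.

```python
ALLOWED_DECODING = ("greedy", "sample")  # assumed set of decoding methods


def build_generation_params(decoding_method="greedy", max_new_tokens=200):
    """Validate the sidebar choices and return them as a parameter dict."""
    if decoding_method not in ALLOWED_DECODING:
        raise ValueError(f"decoding_method must be one of {ALLOWED_DECODING}")
    if not isinstance(max_new_tokens, int) or max_new_tokens < 1:
        raise ValueError("max_new_tokens must be a positive integer")
    return {
        "decoding_method": decoding_method,
        "max_new_tokens": max_new_tokens,
    }
```

Greedy decoding gives repeatable answers, which tends to suit document question answering; sampling trades that for more varied wording.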

---

Project Structure

Multi-Doc-Retrieval-Watsonx/
├── app.py               # Main application file
├── requirements.txt     # Python dependencies
├── README.md            # Project documentation
└── .env                 # Environment variables (not included in repo, create manually)

"Note: The .env file contains sensitive API keys and should not be shared publicly."

Dependencies

Streamlit: Builds the app's interactive user interface.

LangChain: Handles document retrieval and enhances query responses.

HuggingFace Transformers: Provides embeddings for efficient document search.

Watsonx Models: Powers the AI's text generation and question answering.

Python Libraries: Supports file handling (pandas, python-docx, python-pptx, etc.) for processing different document types.
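
To show how multi-format file handling might be organized, the sketch below dispatches on the file extension and converts the uploaded bytes to plain text. It is an assumption about structure, not the app's code: `extract_text` is a hypothetical name, and only the formats that the standard library can parse are covered. PDF, DOCX, and PPTX would need the third-party libraries listed above.

```python
import csv
import io
import json
from pathlib import Path


def extract_text(filename, data):
    """Turn raw uploaded bytes into plain text based on the file extension."""
    suffix = Path(filename).suffix.lower()
    text = data.decode("utf-8", errors="replace")
    if suffix == ".txt":
        return text
    if suffix == ".csv":
        # Flatten each row into one comma-separated line of text.
        rows = csv.reader(io.StringIO(text))
        return "\n".join(", ".join(row) for row in rows)
    if suffix == ".json":
        # Re-serialize so nested structures become readable, indented text.
        return json.dumps(json.loads(text), indent=2)
    raise ValueError(f"Unsupported file type: {suffix or filename}")
```

Normalizing every format to plain text first means the chunking and embedding steps do not need to know which file type a passage came from.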

Contributing

"I welcome all contributions! Feel free to suggest improvements, fix bugs, or add new features. Let's make this project even better together. If you found this helpful, give it a star ⭐ on GitHub!"

Fork the repository.
Create a feature branch:

git checkout -b feature-name

Commit your changes:

git commit -m 'Add a new feature'

Push to the branch:

git push origin feature-name

Open a Pull Request.

Conclusion

"Ready to supercharge your document search capabilities? Try the demo now, clone the repository, and start exploring the possibilities of AI-powered multi-document retrieval with Watsonx and LangChain!"

More Blogs and Interesting Projects

For more blogs and interesting projects, visit my personal website: https://abdulrahmanh.com/

#watsonx.ai
#GenerativeAI

Permalink

https://community.ibm.com/community/user/blogs/abdul-rahman/2024/11/16/how-to-chat-with-multiple-documents-using-watsonx