Can PDFs be passed to tools as input by an agent? If so, how?
I created an agent and gave it a tool, which is a flow that I built. The flow includes a document classifier (with 4 classification options), a branch, and 4 different field extractors.
Flow: Start-->Document Classifier-->Branch-->Extractor-->End
When I upload a PDF to the chat and give the instruction "Use the IDP_Pipeline tool with the provided pdf", the agent usually returns this error: "{'error': 'Message: The file type of the provided document content is not supported by Document Processing. with code: 400, and of type: Not supported; The supported file types are: .pdf, .tiff, .tif, .png, .jpg, .jpeg}".
I definitely provided a PDF. I believe the issue is that tools cannot accept a PDF as input. If the document classifier accepts a PDF but the tool that wraps it cannot, then the classifier can never receive one. Maybe I am misinformed. I attached a screenshot (no_pdf.png) as my evidence that tools cannot accept a PDF as input.
Input and output schemas need to be provided for activities such as Document Classifier and Document Extractor.
Goal: Upload a .pdf in the agent chat interface. When the agent receives the PDF, it calls my custom IDP_Pipeline tool with the PDF as input. The tool then classifies the document and extracts fields from it, and returns the classification and the extracted fields. (Probably as a JSON object, but there is no documentation on the output schema of classifiers or extractors.)
Maybe I need to use the binary string of the pdf as input? But that doesn't seem quite right.
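To make the "binary string" idea concrete, here is a minimal sketch of the check I would like to run on whatever the tool actually receives. It is plain Python, not a product API: `describe_payload` and `looks_like_pdf` are hypothetical helper names, and I am assuming the payload arrives either as raw bytes or as a base64 string. The only fact it relies on is that a PDF file starts with the `%PDF-` header.

```python
import base64
import binascii

# Every PDF file begins with this header (for example "%PDF-1.7").
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF header appears near the start of the data."""
    # The header is normally at offset 0; searching the first 1024 bytes
    # tolerates a small amount of leading junk.
    return PDF_MAGIC in data[:1024]


def describe_payload(payload) -> str:
    """Classify a tool input (hypothetical helper, for debugging only)."""
    if isinstance(payload, bytes):
        return "raw pdf bytes" if looks_like_pdf(payload) else "bytes, not a pdf"
    if isinstance(payload, str):
        try:
            # validate=True rejects characters outside the base64 alphabet.
            decoded = base64.b64decode(payload, validate=True)
        except (binascii.Error, ValueError):
            return "text, not base64 (maybe a file name or URL)"
        return "base64-encoded pdf" if looks_like_pdf(decoded) else "base64, not a pdf"
    return "unsupported type"
```

If the tool input turned out to be a file name, a URL, or an encoded string rather than the file content itself, that might explain why Document Processing reports an unsupported file type even though I uploaded a PDF.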
Any help would be appreciated.
Here is a Loom link to a screen recording with voiceover that shows my issue. Loom is a trusted cloud-based screen recording and video sharing tool.
Exploring Intelligent Document Processing with IDP Pipeline
------------------------------
Benjamin Tranter
------------------------------