Content Management and Capture

  • 1.  FileNet/BAW directly calling Datacap OCR action

    Posted Mon May 24, 2021 04:17 AM
    Hi,
    Is it possible to create a FileNet / BAW workflow that directly calls a Datacap OCR action (such as ABBYY Recognize) at some point and retrieves the processed text?
    I have created Datacap workflows that push images and text to a FileNet repository, but I have not tried the reverse: calling the Datacap OCR engine from a FileNet or BAW workflow.
    Has anyone tried this, and did it work? How did you do it? Is it all about the Datacap REST API?

    ------------------------------
    dsakai
    ------------------------------


  • 2.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Tue May 25, 2021 04:28 AM
    I think you want a very simple answer, but as far as I know there isn't one :-)
    First of all, it helps to know that the OCR action works per page, on black-and-white TIFF images, inside a Datacap application rule.
    So you have to create and configure a Datacap application that exports your P8 document content into a batch. Inside that application, the file must be converted to black-and-white TIFF using other Datacap actions, and then it runs through the OCR action.
    After OCR, ABBYY produces a text file and an HTML file for each page of your P8 document content.

    After merging the HTML / text files of the same document using another Datacap action (I think you would have to write your own for this...), you can import the resulting file (HTML, TXT, or both) as a new version of the same P8 document. Then you can use it for content search.
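
    The merge step described above can be sketched outside Datacap as a plain script. This is only an illustration, not a Datacap action: it assumes the per-page OCR text files sit in one batch directory and that the page order can be read from the last number in each file name (for example tm000001.txt, tm000002.txt). The function name and the page-marker lines are hypothetical.

```python
import re
from pathlib import Path


def merge_page_texts(batch_dir: Path, output_file: Path, pattern: str = "*.txt") -> int:
    """Concatenate per-page OCR text files into one document-level file.

    Returns the number of page files merged.
    """
    def page_key(path: Path):
        # Sort by the last number in the file name so page 10 follows page 9.
        numbers = re.findall(r"\d+", path.stem)
        return (int(numbers[-1]) if numbers else 0, path.name)

    # Collect the page files first, skipping the output file if it already
    # exists in the same directory and matches the pattern.
    pages = sorted(
        (p for p in batch_dir.glob(pattern) if p.resolve() != output_file.resolve()),
        key=page_key,
    )
    with output_file.open("w", encoding="utf-8") as out:
        for index, page in enumerate(pages, start=1):
            # Hypothetical marker line so page boundaries stay visible.
            out.write(f"--- page {index} ({page.name}) ---\n")
            out.write(page.read_text(encoding="utf-8", errors="replace").rstrip("\n"))
            out.write("\n")
    return len(pages)
```

    Calling merge_page_texts(Path("batch"), Path("document_text.out")) would write one combined file that could then be checked in as a new version of the P8 document. Merging the HTML files needs more care than plain concatenation, since each page file is a complete HTML document.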

    I am 99% sure that nothing like this currently exists out of the box...
    I developed something like it for a presales presentation about 4 years ago... or perhaps more...

    I hope this helps you...

    Dorothea


  • 3.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Tue May 25, 2021 06:48 AM
    Thanks for your reply.

    > The file must be converted based on Datacap other actions to black-and-white TIFF in your Datacap application and then runs through OCR action.
    > After OCR ABBYY process results a text file and HTML file for each page of your P8 document content.
    > After merging the html / text files of the same document content using another Datacap action (I think here you have to create on your own one...), you can import this resulting file (html or txt or both) as a new document

    So, aside from installing and configuring the Datacap application, can the remaining steps (sending the image, converting it to TIFF, running it through ABBYY OCR, and importing the resulting file) all be done via the Datacap wTM API, directed and called from FileNet?

    ------------------------------
    dsakai
    ------------------------------



  • 4.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Tue May 25, 2021 12:25 PM
    Dsakai,

    The short answer is no, you cannot call Datacap actions from P8. Normally we would have a Datacap workflow handle OCR and send the documents back to P8.

    Happy to hop on a call to discuss best practices here.

    ------------------------------
    Camden Weis
    Head of Sales
    (336) 491-9517
    cweis@DASpartner.com
    ------------------------------



  • 5.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Fri May 28, 2021 01:44 AM
    Can P8 call Datacap wTM API methods? If so, could P8 upload images and retrieve the resulting text via those APIs?

    ------------------------------
    dsakai
    ------------------------------



  • 6.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Tue May 25, 2021 04:11 PM
    This is actually relatively simple to do. Datacap has a REST API called wTM, which can work in both batch mode and what they refer to as transactional mode. The API is a bit funky, but in principle it behaves like any HTTP RESTful POST. BAW can call a RESTful service; this URL explains well how that works:

    https://www.ibm.com/docs/en/baw/20.x?topic=service-invoking-rest-by-using-javascript


    Now, the main funky thing is how you upload an XML file together with the file you want to process (I suggest you have a PDF or TIFF ready).
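
    One common way for an HTTP API to accept an XML descriptor plus a binary file in a single POST is a multipart/form-data body. The sketch below shows that encoding with the Python standard library only. Whether a given wTM method expects multipart or a raw request body is an assumption to confirm in the Datacap REST API reference; the field names, file names, and the placeholder XML are hypothetical.

```python
import uuid


def build_multipart(parts):
    """Encode (field_name, file_name, content_type, payload_bytes) tuples
    as a multipart/form-data body.

    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for field_name, file_name, content_type, payload in parts:
        chunks.append(f"--{boundary}\r\n".encode("ascii"))
        chunks.append(
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{file_name}"\r\n'.encode("utf-8")
        )
        chunks.append(f"Content-Type: {content_type}\r\n\r\n".encode("ascii"))
        chunks.append(payload)
        chunks.append(b"\r\n")
    # The closing delimiter ends the body.
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


# Hypothetical payloads: a placeholder XML descriptor and fake image bytes.
content_type, body = build_multipart([
    ("descriptor", "request.xml", "application/xml", b"<placeholder/>"),
    ("image", "page.tif", "image/tiff", b"II*\x00fake"),
])
```

    The returned content_type goes into the Content-Type request header and body becomes the POST payload. In BAW the same structure would be assembled in JavaScript rather than Python.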

    This is the documentation for the Datacap API:

    https://www.ibm.com/docs/en/datacap/9.1.8?topic=reference-datacap-web-services-rest-api-methods


    I have done this in a Node app and, a long time ago, in BAW. If I find time, I'll do it in BAW again and share the TWX.
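
    For readers who want a picture of the transactional mode mentioned above, here is a minimal sketch of the order of calls for one OCR round trip: start a transaction, upload the image, execute the rules, download the result, end the transaction. It only builds the list of requests and sends nothing. The route templates and HTTP verbs are assumptions written from memory of the transactional methods, and the base URL is made up; confirm every path, verb, and payload against the Datacap REST API reference linked above before relying on them.

```python
from dataclasses import dataclass
from urllib.parse import quote

# ASSUMED route templates and verbs, for illustration only. Verify each one
# against the Datacap REST API reference before use.
ASSUMED_ROUTES = {
    "start": ("GET", "Transaction/Start"),
    "upload": ("POST", "Transaction/SetFile/{tx}/{name}/{ext}"),
    "execute": ("POST", "Transaction/Execute"),
    "download": ("GET", "Transaction/GetFile/{tx}/{name}/{ext}"),
    "end": ("GET", "Transaction/End/{tx}"),
}


@dataclass(frozen=True)
class PlannedCall:
    step: str
    method: str
    url: str


def plan_transaction(base_url, tx_id, image_name, image_ext, result_ext="txt"):
    """Return the ordered HTTP calls for one transactional OCR round trip.

    In a real client, tx_id would come from the response to the "start"
    call; it is passed in here so the whole plan can be shown at once.
    """
    base = base_url.rstrip("/")
    values = {"tx": quote(tx_id, safe=""), "name": quote(image_name, safe="")}
    calls = []
    for step in ("start", "upload", "execute", "download", "end"):
        method, template = ASSUMED_ROUTES[step]
        # Upload uses the image extension, download asks for the OCR result.
        ext = result_ext if step == "download" else image_ext
        path = template.format(ext=quote(ext, safe=""), **values)
        calls.append(PlannedCall(step, method, f"{base}/{path}"))
    return calls


# Hypothetical base URL and identifiers.
for call in plan_transaction("http://datacap.example/wtm", "42", "tm000001", "tif"):
    print(call.step, call.method, call.url)
```

    In BAW, each planned call would map to one REST invocation from a service flow or script task, with the transaction id carried between steps in a process variable.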

    ------------------------------
    Daniel Crow
    ------------------------------



  • 7.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Fri May 28, 2021 01:41 AM
    Thank you for your confirmation.
    Our team is looking for ways to integrate Datacap with BAW (and FileNet).
    I am glad you have done it already. I will save the links.

    > The API is a bit funky, but in principle behaves like any HTTP Restful post. BAW has the ability to call a Restful service. This is a really good URL to explain how it works.


    ------------------------------
    dsakai
    ------------------------------



  • 8.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Mon May 31, 2021 05:40 PM
    Hi,

    You can do that in P8: follow the idea in the link below and add a set of Datacap wTM calls alongside, or instead of, saving the document to a folder:
     - https://www.ibm.com/docs/en/datacap/9.1.7?topic=idffp-downloading-bulk-filenet-p8-content-datacap-by-using-filenet-sweep-job

    ------------------------------
    Jurandir Patria
    ------------------------------



  • 9.  RE: FileNet/BAW directly calling Datacap OCR action

    Posted Tue June 01, 2021 01:22 AM
    Thank you for the link! I am saving it as well.
    I found the following description on the page, which is encouraging.

    ----
    You can use the FileNet sweep framework and its bulk processing capabilities to download the document content into a directory. Once these files are in the directory, a Datacap application can be used to ingest the documents into Datacap and extract information that can be exported back into FileNet P8. See the IBM Knowledge Center for more details about handling bulk processing with FileNet sweeps:

    ------------------------------
    dsakai
    ------------------------------