IBM Security QRadar SOAR

  • 1.  Get PDF attachment content

    Posted Wed September 20, 2023 10:10 AM

    Hi,

    I need to get the content of a PDF file as a string. I haven't found an app or Python code that handles this case. Can you help me?

    Thanks, regards!



    ------------------------------
    Federico Camelino
    ------------------------------


  • 2.  RE: Get PDF attachment content

    Posted Thu September 21, 2023 03:18 PM

    Hi, while we don't have an app to get the content of a PDF as a string, you may find the Image OCR Functions for IBM SOAR app helpful as it's able to interpret text from image files.



    ------------------------------
    Priya Sapra
    ------------------------------



  • 3.  RE: Get PDF attachment content

    Posted Fri September 22, 2023 06:56 AM

    Greetings,

    Sadly, there is no ready-made package for extracting the contents of a PDF.

    However, you can obtain the contents of a PDF file (or any attachment, really) using the REST API functionality:

    https://exchange.xforce.ibmcloud.com/hub/extension/b1e4814282b33a826f36c72cf1bc4751

    First, query to get all incident attachments using:

    /orgs/{org_id}/incidents/{inc_id}/attachments

    Obtain the ID of the desired attachment, and then get the content using:

    /orgs/{org_id}/incidents/{inc_id}/attachments/{attach_id}/contents

    Be careful with this approach, though. Malicious code can be embedded in PDF files and executed when their contents are read. I'm not entirely sure whether this can be triggered by GET requests specifically for PDFs, but you should still be cautious if you are going to read external or untrusted files.
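
The two REST calls described in this reply can be sketched in Python roughly as follows. This is only an illustration under assumptions that are not confirmed in the thread: the base URL (including any prefix such as `/rest`), the use of an API key with HTTP basic authentication, and the `id` and `name` fields on each attachment entry. All function names are hypothetical.

```python
import base64
import json
import urllib.request


def attachments_url(base_url, org_id, inc_id):
    """URL that lists all attachments of an incident."""
    return f"{base_url.rstrip('/')}/orgs/{org_id}/incidents/{inc_id}/attachments"


def contents_url(base_url, org_id, inc_id, attach_id):
    """URL that returns the contents of a single attachment."""
    return f"{attachments_url(base_url, org_id, inc_id)}/{attach_id}/contents"


def find_attachment_id(attachments, filename):
    """Return the id of the first attachment with the given name, else None.

    Assumption: each entry is a dict with "id" and "name" keys.
    """
    for item in attachments:
        if item.get("name") == filename:
            return item.get("id")
    return None


def http_get(url, key_id, key_secret):
    """GET a URL with HTTP basic auth and return the body as bytes.

    Assumption: the server accepts an API key id/secret as basic auth.
    """
    token = base64.b64encode(f"{key_id}:{key_secret}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return response.read()


def download_attachment(base_url, org_id, inc_id, filename, key_id, key_secret):
    """List the incident's attachments, pick one by name, return its bytes."""
    listing = json.loads(http_get(attachments_url(base_url, org_id, inc_id), key_id, key_secret))
    attach_id = find_attachment_id(listing, filename)
    if attach_id is None:
        raise LookupError(f"no attachment named {filename!r}")
    return http_get(contents_url(base_url, org_id, inc_id, attach_id), key_id, key_secret)
```

The download step only moves bytes around; it does not open or render the PDF, so the caution above applies mainly to whatever parses the file afterwards.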

    Cheers!



    ------------------------------
    Pol Estecha Hernández
    ------------------------------



  • 4.  RE: Get PDF attachment content

    Posted Fri September 29, 2023 03:09 PM

    Hello Pol, I tested the function you suggested. I cannot obtain the content of the PDF file as plain text, but I can obtain it as JSON. I am attaching screenshots of the playbook error and the configuration of the "Call REST API" function.
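
A PDF is a binary format, so even a successful download does not yield plain text by itself; the bytes still need to go through a PDF parser. The sketch below is one possible way to do that, assuming the third-party `pypdf` package can be installed where the code runs (it is not mentioned anywhere in this thread, and may not be available inside a playbook script). The function names are hypothetical. It first checks for the `%PDF-` header, which helps tell real PDF bytes apart from a JSON response.

```python
import io


def looks_like_pdf(data):
    """True when the bytes start with the PDF file header."""
    return data.startswith(b"%PDF-")


def extract_pdf_text(data):
    """Return the text of every page in a PDF given as bytes.

    Raises ValueError when the bytes are not a PDF, for example when the
    body is a JSON document instead of the file contents.
    """
    if not looks_like_pdf(data):
        raise ValueError("body does not start with %PDF-; it may be JSON rather than the file")
    # Third-party dependency (assumption): imported here so that the header
    # check above can be used even when pypdf is not installed.
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(data))
    # extract_text() can come back empty for scanned, image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

For scanned PDFs that contain only images, text extraction like this returns little or nothing, which is where an OCR approach such as the one suggested earlier in the thread may be the better fit.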

    Thanks and regards!



    ------------------------------
    Federico Camelino
    ------------------------------



  • 5.  RE: Get PDF attachment content

    Posted Wed November 15, 2023 10:40 AM

    Hi Pol and team,

    Do you have any news on this topic?

    I look forward to your reply. Regards!



    ------------------------------
    Federico Camelino
    ------------------------------