Hi Michael,
I am not sure whether the PDFs in your case are password-protected or just have security restrictions.
In our case, most PDFs are not password-protected but have access restrictions, meaning you can open the PDF manually but content copying/OCR is not allowed.
https://community.ibm.com/community/user/automation/communities/community-home/digestviewer/viewthread?GroupId=3769&MessageKey=db2618ba-5ce7-4b67-bb55-08a7c3794554&CommunityKey=ab6c0dd9-b4c5-406a-b324-bdd687a0adde&tab=digestviewer&ReturnUrl=%2fcommunity%2fuser%2fautomation%2fcommunities%2fcommunity-home%2fdigestviewer%3ftab%3ddigestviewer%26CommunityKey%3dab6c0dd9-b4c5-406a-b324-bdd687a0adde
The difference is that PDFFREDocumentToImage fails with "PDF is password-protected" in the first case, in which case the PDF cannot be opened manually and there is no other way but to resubmit the PDF and reprocess the batch.
In the second case, PDFFREDocumentToImage fails with "PDF has access restrictions". There are a couple of ways to handle it:
1) If using PDFFREDocumentToImage in version 9.1.6 or above, you can use a setting before calling this action.
https://www.ibm.com/support/knowledgecenter/SSZRWV_9.1.7/com.ibm.dc.reference.doc/dcacb182.htm
"Processing Secured PDF Files
For PDF files that have the properties Content Copying: Not Allowed or Content Copying for Accessibility: Not Allowed enabled, the action will remove these properties automatically from the PDF so that image files can be created for each page of the PDF. If the PDF is changed, a backup of the original PDF will be saved in the batch directory. The original PDF will be saved with the name "filename.original.pdf", for example TM000001.original.pdf.
The default suffix "Original" can be changed by setting the DCO variable "y_PdfBackupSuffix" prior to calling the action PDFFREDocumentToImage.
rrSet(".secure", "@X.y_PdfBackupSuffix")
PDFFREDocumentToImage(300, 18, 32, 33,".bw.tif", ".color.tif", ".gray.tif", 0, false, 100)
This example will back up the original file as PDFFileName.secure.pdf, remove the security properties from the file associated with the DCO object, and then create an image for each page in the PDF. If y_PdfBackupSuffix is not set, then by default .original.pdf will be appended to the backup file."
2) If using versions older than 9.1.6, you can try using the solution provided in this article, however PDFDocumentToImage is deprecated in older versions of Datacap.
https://www.ibm.com/support/pages/unable-convert-secured-pdf-file-convert-actions-ibm-datacap
More Info:
Working with secure PDF files using Datacap
Kavitha
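If the batch log or alert text is available outside Datacap, the two messages quoted above can be told apart with a simple text match. Here is a minimal Python sketch, assuming the log wording matches the messages in this thread (it may differ by Datacap version). The function name and category labels are made up for illustration, and this is not a Datacap API.

```python
# Message fragments as quoted in this thread; the exact wording in a real
# batch log is an assumption and may vary by Datacap version.
PASSWORD_MSG = "PDF is password-protected"
RESTRICTED_MSG = "PDF has access restrictions"


def classify_pdf_failure(log_text: str) -> str:
    """Return 'password', 'restricted', or 'other' for a log excerpt."""
    text = log_text.lower()
    if PASSWORD_MSG.lower() in text:
        return "password"      # PDF must be resubmitted by the sender
    if RESTRICTED_MSG.lower() in text:
        return "restricted"    # candidate for the options listed above
    return "other"             # anything else goes to the catch-all group
```

A notification script could use the returned label to decide which support group receives the alert.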
------------------------------
Kteegala
Original Message:
Sent: Mon September 20, 2021 10:08 AM
From: Michael Shadley
Subject: Password Protected PDFs
Thanks for the suggestion, Kavitha! I think that will work nicely for us. I wasn't aware of the Exception-related actions and how they worked. Yes, batches do occasionally abort due to what look like random memory issues on the server, but 9 times out of 10 it's because of a security restriction on the PDF. So if we can't distinguish between the two failure modes within Datacap, it's probably not too big an issue. We'll just send an alert e-mail to the line-of-business support group. And resubmitting the PDF after checking for security restrictions should fix either scenario.
Michael
------------------------------
Michael Shadley
Original Message:
Sent: Sun September 19, 2021 05:07 PM
From: Kteegala
Subject: Password Protected PDFs
Hi Michael,
You can also take advantage of NENU jobs that run hourly to send notifications for batches, and use a filter to identify the batches that failed at various stages.
Are all failures in EScan caused by the PDFFREDocumentToImage action? From your query, it seems like they fail for other reasons as well.
If so, you should be able to move the PDFFREDocumentToImage action into a separate ruleset, use an exception handler for PDFFREDocumentToImage failures, and call the LogSendEmail action before aborting the batch. You can use the exception handler to decide whether you want to abort a batch for any failure or take some other action.
https://www.ibm.com/docs/en/datacap/9.1.8?topic=actions-logsendemail
https://www.ibm.com/docs/en/datacap/9.1.8?topic=actions-exceptionsethandler
Hope this helps.
Kavitha
------------------------------
Kteegala
Original Message:
Sent: Fri September 17, 2021 12:20 PM
From: Michael Shadley
Subject: Password Protected PDFs
Hi all,
We have a Datacap application to process PDFs coming from our business users into a shared network folder. It converts the pages to individual TIFF images using PDFFREDocumentToImage to present in Verify. The business users know they aren't supposed to send PDFs that have password protection enabled, but it still happens from time to time. This results in a batch that is aborted in the EScan step, since PDFFREDocumentToImage always aborts a batch if it encounters an error during conversion. We would like to notify a line-of-business support group when a password-protected PDF is submitted rather than have our Datacap IT support group research and resolve the batch. However, the added complication is that sometimes batches abort in EScan because of random server glitches or memory management bugs. So we can't say that every batch that aborts in EScan is due to a password-protected PDF. We'd really like to trap for password-protected PDFs and treat those as one use case and have everything else be treated as a second catch-all use case that would be handled by the IT support group.
So is there a best practice for detecting/handling password-protected PDFs in Datacap? Or is the best practice to use some sort of preprocessor to prevent them from getting to Datacap in the first place? I know we're not the first organization to face this problem.
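To illustrate the preprocessor idea, here is a rough Python sketch of a check that could run against the shared folder before Datacap picks the files up. It assumes a secured PDF can be spotted by an /Encrypt entry in the raw bytes. That is only a heuristic: it does not tell a password-protected PDF from one with only access restrictions. The function names and the folder layout are hypothetical.

```python
from pathlib import Path


def looks_secured(data: bytes) -> bool:
    """Heuristic: a PDF whose bytes contain /Encrypt uses a security handler."""
    is_pdf = b"%PDF-" in data[:1024]  # header should appear near the start
    return is_pdf and b"/Encrypt" in data


def find_secured_pdfs(folder: str) -> list:
    """Return names of *.pdf files in `folder` that look secured."""
    flagged = []
    for path in sorted(Path(folder).glob("*.pdf")):
        if looks_secured(path.read_bytes()):
            flagged.append(path.name)
    return flagged
```

Files flagged this way could be moved aside and reported to the business users, so that only unsecured PDFs reach the EScan step.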
Thanks for your suggestions.
------------------------------
Michael Shadley
BOK Financial
------------------------------