Introduction to ADP JSON Output

By XUE XU posted Fri April 01, 2022 04:13 AM

After a file is uploaded to ADP and has been extracted and processed, the output can be downloaded in JSON format. It includes classification, extraction and OCR information. This blog introduces some common and useful parts of the JSON output so that third-party applications can consume them easily.

How to get the JSON output for the uploaded document?

The JSON output comes from the public API: GET /v1/projects/<project_id>/analyzers/<file_id>/json
For more details about the public API, refer to: https://community.ibm.com/community/user/automation/blogs/xue-xu/2022/03/27/automation-document-processing-api-introduction
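
As a rough illustration of calling the endpoint above, here is a minimal Python sketch. Only the path comes from this post. The base URL, the bearer-token header and the helper names `json_output_url` and `fetch_json_output` are assumptions for illustration, so check how authentication works in your own deployment.

```python
import json
import urllib.request


def json_output_url(base_url: str, project_id: str, file_id: str) -> str:
    """Build the URL of the JSON output endpoint named in this post."""
    return f"{base_url.rstrip('/')}/v1/projects/{project_id}/analyzers/{file_id}/json"


def fetch_json_output(base_url: str, project_id: str, file_id: str, token: str) -> dict:
    """Download and parse the JSON output.

    Assumption: the server accepts a bearer token in the Authorization header.
    """
    request = urllib.request.Request(
        json_output_url(base_url, project_id, file_id),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical host and IDs, used only to show the resulting URL.
print(json_output_url("https://adp.example.com/api", "my-project", "my-file"))
```

Keeping the URL builder separate from the network call makes it possible to check the path without a running server.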

How to get the predicted document class for the uploaded document?

The 'Classification' part contains the Document Class information predicted by ADP for the uploaded document.

'Actual': Name of the Document Class predicted by ADP for this document.
'ID': ID of that Document Class.
'ClassMatch': Confidence level of the prediction.
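
The three fields above could be read as in the sketch below. It assumes that 'Classification' sits at the top level of the parsed JSON, which this post does not state, and the sample values are made up for illustration.

```python
def predicted_class(output: dict) -> tuple:
    """Return (name, class ID, confidence) from the 'Classification' part.

    Assumption: 'Classification' is a top-level object in the JSON output.
    Missing fields come back as None instead of raising an error.
    """
    classification = output.get("Classification") or {}
    return (
        classification.get("Actual"),
        classification.get("ID"),
        classification.get("ClassMatch"),
    )


# Hypothetical sample; real values depend on your project.
sample = {"Classification": {"Actual": "Invoice", "ID": "class-001", "ClassMatch": 0.97}}
name, class_id, confidence = predicted_class(sample)
print(f"Predicted class: {name} (ID {class_id}, confidence {confidence})")
```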

How to get the value/key/position/validation result for one key class?

'KVPTable' lists all KVPs (Key Value Pairs) in the page. Each KVP contains the Key, the Value, the positions of the Key and Value, an ID, a confidence, and the ID and name of the KeyClass.

'Sensitivity': Sensitivity of the content
'Mandatory': Whether the KeyClass is mandatory, that is, whether it must have a valid KVP matched
'EditedValue': Whether the value has been edited or is as detected by OCR
'ObjectDetectionMatch': Whether object detection matched
'ValueType': 'Text', 'Barcode', 'Table', 'Checkbox' or 'Signature'
'ValueMetrics': Statistical data for values of the KeyClass, such as the minimum, maximum and average length of a value
'ValidatorResult': Result of the KeyClass validator check

If the result is 'fail', 'ValidatorFailures' shows more details.

- 'Severity': currently one of 3 levels: Informational, Warning, Error
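
A consumer that only cares about failed validations might filter a page's 'KVPTable' as sketched below. The field names 'ValidatorResult', 'ValidatorFailures' and 'Severity' come from this post. The rest is assumed: that 'ValidatorFailures' is a list of objects each carrying a 'Severity', and that the result is compared case-insensitively. The 'Key' and 'Value' names in the sample are hypothetical.

```python
def failed_validations(kvp_table: list) -> list:
    """Return (kvp, failures) pairs for KVPs whose validator check failed."""
    failed = []
    for kvp in kvp_table:
        result = str(kvp.get("ValidatorResult", "")).lower()
        if result == "fail":
            failed.append((kvp, kvp.get("ValidatorFailures") or []))
    return failed


def has_error(failures: list) -> bool:
    """True if any failure has severity 'Error' (assumed spelling)."""
    return any(str(f.get("Severity", "")).lower() == "error" for f in failures)


# Hypothetical KVPTable of one page.
kvp_table = [
    {"Key": "Total", "Value": "12.50", "ValidatorResult": "pass"},
    {"Key": "Date", "Value": "32/13/2022", "ValidatorResult": "fail",
     "ValidatorFailures": [{"Severity": "Error"}]},
]
for kvp, failures in failed_validations(kvp_table):
    print(kvp["Key"], "failed validation; error level:", has_error(failures))
```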

How to get the top KVP for one key class if multiple KVPs are extracted for it?

Multiple candidate KVPs can be extracted for one Key Class. ADP gives a ranked list of these KVPs, and the highest-ranked KVP is the one ADP predicts to be the best for this Key Class.

Each item in 'KeyClassRankedList' describes one KeyClass object, including its ID, name, type and a sorted KVPList. The KVPList can contain one or more items.
'PageNo': Number of the page where the KVP is found; numbering starts from 0
'KVPID': Unique value that identifies the KVP
'Reserved1': Reserved for internal use only
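
Picking the top candidate per Key Class could look like the sketch below. It assumes that the sorted KVPList puts the highest-ranked KVP first, which is a reading of the description above rather than something this post states outright. 'KeyClassName' in the sample is a hypothetical field name.

```python
def top_kvps(ranked_list: list) -> list:
    """Pair each KeyClass entry with the first item of its KVPList.

    Assumption: KVPList is sorted best-first. Entries without any
    candidate are paired with None.
    """
    result = []
    for key_class in ranked_list:
        kvp_list = key_class.get("KVPList") or []
        result.append((key_class, kvp_list[0] if kvp_list else None))
    return result


# Hypothetical 'KeyClassRankedList' content.
ranked = [
    {"KeyClassName": "Invoice Number",
     "KVPList": [{"KVPID": "kvp-7", "PageNo": 0}, {"KVPID": "kvp-3", "PageNo": 1}]},
    {"KeyClassName": "Due Date", "KVPList": []},
]
for key_class, best in top_kvps(ranked):
    if best is None:
        print(key_class["KeyClassName"], "has no candidate KVP")
    else:
        print(key_class["KeyClassName"], "->", best["KVPID"], "on page", best["PageNo"])
```

The 'KVPID' of the chosen item can then be used to find the full KVP, with its value and position, on the page given by 'PageNo'.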

How to get the OCR result?

pageList contains all the extraction data of the file, page by page. Every page contains blocklist, page info, KVPTable, etc.

For the OCR result, the page is divided into several blocks. Each block contains lines, and each line contains words.

- Each block contains block ID, position (StartX, StartY, Width and Height), line data
- Each line contains line ID, position, word data
- Each word contains word ID, position, OCR confidence, style, length, value

'PageInfo' contains basic information about the page, such as language, page dimensions (width and height), ID, dpi data and OCR confidence.
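
To show how the block, line and word nesting can be walked, here is a minimal sketch that rebuilds the plain text of one page. This post does not give the exact field names for the nested lists or the word value, so 'BlockList', 'LineList', 'WordList' and 'WordValue' are placeholders kept in constants. Replace them with the names from the full output description.

```python
# Placeholder field names (assumptions); adjust to the real JSON output.
BLOCKS, LINES, WORDS, WORD_VALUE = "BlockList", "LineList", "WordList", "WordValue"


def page_text(page: dict) -> str:
    """Join the words of each line with spaces, one output line per OCR line."""
    lines = []
    for block in page.get(BLOCKS, []):
        for line in block.get(LINES, []):
            words = [str(word.get(WORD_VALUE, "")) for word in line.get(WORDS, [])]
            lines.append(" ".join(words))
    return "\n".join(lines)


# Hypothetical page with one block of two lines.
page = {
    BLOCKS: [
        {LINES: [
            {WORDS: [{WORD_VALUE: "Invoice"}, {WORD_VALUE: "No."}, {WORD_VALUE: "123"}]},
            {WORDS: [{WORD_VALUE: "Total"}, {WORD_VALUE: "12.50"}]},
        ]},
    ]
}
print(page_text(page))
```

The same loop is a natural place to read each word's position or OCR confidence, for example to flag low-confidence words for review.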

Full JSON output description:
https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/21.0.3?topic=api-outputs
