Content Management and Capture

OCRPL confidence levels

Duncan Shields posted Mon February 02, 2026 06:02 AM

Hi everyone,

I've been using OCRPL with a customer that processes a number of similar payment slips, some machine-printed and some handwritten.

The level and accuracy of capture is pretty impressive, but it is let down by what I consider to be overly high confidence levels.

Nearly every character has 100% confidence which, especially with handwritten fields, is an artificially high level. It basically makes it impossible to use confidence levels to manage whether or not a slip should be sent for manual validation.

There is nothing worse than false positive results where the captured fields are wrong, but the confidence is 100%.

I'm using a variety of methods to get the values, mostly PopulateZNField, as all the fields have been captured using the OCR_PL Recognize() action. Where a field hasn't been populated, it falls back to RecognizeFieldOCRPL(). I have noticed that on some occasions, even on the same field, two different values might be captured, yet both will have 100% confidence.

I've noticed that occasionally RecognizeFieldOCRPL() appears to capture the field upside down and back to front, so 950.00 would appear as 00.056 (still with 100% confidence).
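
For illustration, the 950.00 to 00.056 case is consistent with the field being read after a 180-degree rotation: the character order is reversed and a 9 comes back as a 6. Below is a minimal post-processing sketch for detecting and undoing that, assuming the field is a currency amount with two decimal places. The function name, the patterns and the table of digits that survive rotation are all assumptions for illustration; none of this is part of Datacap or OCRPL.

```python
import re

# Assumption: these glyphs are still readable as digits after a 180-degree
# rotation (6 and 9 swap). Which digits really survive depends on the font
# or handwriting, so treat this table as illustrative only.
ROTATABLE = "0125689."
ROTATED = str.maketrans(ROTATABLE, "0125986.")

AMOUNT = re.compile(r"^\d+\.\d{2}$")            # e.g. 950.00
UPSIDE_DOWN_AMOUNT = re.compile(r"^\d{2}\.\d+$")  # e.g. 00.056


def unrotate_amount(text):
    """Return the amount as it would read right side up, or None.

    A value that already looks like an amount is returned unchanged.
    A value that looks like a rotated amount is reversed and has its
    digits mapped back. Anything else yields None.
    """
    if AMOUNT.match(text):
        return text
    if UPSIDE_DOWN_AMOUNT.match(text) and all(ch in ROTATABLE for ch in text):
        candidate = text[::-1].translate(ROTATED)
        if AMOUNT.match(candidate):
            return candidate
    return None
```

With this sketch, unrotate_amount("00.056") returns "950.00". A corrected value like this is still a guess, so it is probably better sent for manual validation than accepted at the engine's reported confidence.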

Has anyone else had similar experiences with OCRPL? Is there anything I can do to get more realistic confidence levels in the results?

Has anyone tried DC9.1.10 with OCRPL? Have there been any significant changes?

As an aside, OCRA's results are much less accurate than OCRPL's, but when it does fail to get it right, its confidence levels reflect this.

Thanks,

Duncan

Suman Suhag posted Mon February 16, 2026 05:13 AM

import re

from abbyy_sdk import AbbyyAPI  # Hypothetical; replace with actual ABBYY import

AMOUNT_RE = re.compile(r'^\d+\.\d{2}$')  # Normal format, e.g. 950.00
REVERSED_AMOUNT_RE = re.compile(r'^\d{2}\.\d+$')  # Back to front, e.g. 00.056


def recognize_field_with_validation(image_path, field_name):
    api = AbbyyAPI(app_id='your_app_id', password='your_password')

    # Preprocess: detect and correct orientation
    processed_image = api.process_image(image_path, options={'orientation': 'auto'})

    # Primary recognition
    result = api.recognize_field(processed_image, field_name)
    value = result['text']
    confidence = result['confidence']  # Often inflated

    # Custom validation for numeric fields (e.g., currency)
    if AMOUNT_RE.match(value):
        return value, confidence

    # Reversed? e.g., 00.056 -> 650.00. Reversing the string does not undo
    # rotated glyphs (a 9 read upside down comes back as 6), so the result
    # is only a candidate; penalize confidence for the correction.
    if REVERSED_AMOUNT_RE.match(value):
        return value[::-1], confidence * 0.8

    # Fallback to RecognizeFieldOCRPL with retries
    for attempt in range(3):
        alt_result = api.recognize_field_ocr_pl(processed_image, field_name, options={'retries': 1})
        alt_value = alt_result['text']
        if AMOUNT_RE.match(alt_value):
            return alt_value, alt_result['confidence'] * 0.9  # Slight penalty

    return None, 0  # Failed


# Usage
value, conf = recognize_field_with_validation('path/to/image.jpg', 'amount_field')
print(f"Recognized: {value}, Adjusted Confidence: {conf}")
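
Since two reads of the same field can disagree while both report 100% confidence, the disagreement itself may be a more useful signal than the reported number. A minimal sketch of that idea follows, assuming each read is available as a text and confidence pair. The Read class, the function names, the cap of 50 and the threshold of 90 are all hypothetical values for illustration, not anything defined by Datacap or OCRPL.

```python
from dataclasses import dataclass


@dataclass
class Read:
    text: str
    confidence: float  # 0-100, as reported by the engine


def effective_confidence(reads, disagreement_cap=50.0):
    """Lowest reported confidence if all reads agree, otherwise capped.

    The cap encodes the assumption that two reads giving different text
    cannot both be right, whatever confidence each one reports.
    """
    if not reads:
        return 0.0
    lowest = min(r.confidence for r in reads)
    distinct_texts = {r.text.strip() for r in reads}
    if len(distinct_texts) == 1:
        return lowest
    return min(lowest, disagreement_cap)


def needs_manual_validation(reads, threshold=90.0):
    """True when the slip should be routed to an operator."""
    return effective_confidence(reads) < threshold
```

For example, passing the value from the full-page recognition and the value from a second field-level read for the same field would flag the slip for manual validation whenever the two texts differ, even if both came back at 100%.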
