Robotic Process Automation (RPA)

 Issues with Text Classifier

Richard Aitchison posted Thu November 20, 2025 09:48 AM

Hi. I raised this same issue back in 2022 (https://community.ibm.com/community/user/discussion/machine-learning-text-classification-inaccuracy) and had no response.

We're in the process of building a bot for a customer, and we need to identify/classify 2 specific document types.

I've built a machine learning (text classifier) model from directories of files that are exemplars of the 2 document types.

When I use it, I struggle to get meaningful results. If I classify documents that belong to one of the 2 document types, I seem to get the correct choice (but the score is always the same - 88). If I classify a completely different document that has nothing to do with the 2 document types - and as such I would expect a very low score for either doc type - I still get a score of 88 against doc type 1 (always the same doc type). This means that I can have no trust in the results.
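For illustration only, here is a minimal sketch of one way a two-class text classifier can produce this pattern. It is a toy naive Bayes model written from scratch, not the product's actual algorithm, and every name and training sentence in it (`train`, `classify`, `doc_type_1`, `doc_type_2`) is made up for the example. The assumption is a closed-set model: it has no "other" class, so when a document shares no words with the training data the score falls back to the class prior, and every unrelated document gets the same label with the same score.

```python
import math
from collections import Counter

def train(docs_by_label):
    """Fit a toy multinomial naive Bayes model with add-one smoothing."""
    vocab = {w for docs in docs_by_label.values() for d in docs for w in d.lower().split()}
    total_docs = sum(len(docs) for docs in docs_by_label.values())
    model = {}
    for label, docs in docs_by_label.items():
        counts = Counter(w for d in docs for w in d.lower().split())
        model[label] = {
            "log_prior": math.log(len(docs) / total_docs),
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model, vocab

def classify(model, vocab, text):
    """Return (best_label, score_percent, known_word_ratio)."""
    words = text.lower().split()
    known = [w for w in words if w in vocab]  # words never seen in training are ignored
    log_scores = {}
    for label, m in model.items():
        score = m["log_prior"]
        for w in known:
            score += math.log((m["counts"][w] + 1) / (m["total"] + len(vocab)))
        log_scores[label] = score
    # Normalise so the class scores sum to 1 (closed set: no "other" option).
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp_scores.values())
    best = max(exp_scores, key=exp_scores.get)
    ratio = len(known) / len(words) if words else 0.0
    return best, round(100 * exp_scores[best] / z), ratio

# Hypothetical training data: three examples of type 1, one of type 2.
model, vocab = train({
    "doc_type_1": ["invoice number total amount due",
                   "invoice payment terms net thirty",
                   "invoice total vat amount"],
    "doc_type_2": ["contract agreement between parties"],
})

unrelated_a = classify(model, vocab, "weather forecast sunny tomorrow")
unrelated_b = classify(model, vocab, "football match result yesterday")
print(unrelated_a, unrelated_b)
```

In this sketch both unrelated texts come back with the same label and the same score, because neither contains a word the model has seen. The third value returned, the share of words the model recognises, is one possible guard: a caller could treat a result as "unknown" when that ratio is very low. Training a third class on unrelated documents is another option. Whether either applies to the real text classifier is an open question.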

I've tried different algorithms - but the result is always the same.
 
As the documentation is poor and there's no logging in the SaaS environment that would let us see what's happening in the model, please can someone advise whether this is a bug or whether there's some trick to training the model so that it works correctly?

------------------------------
Regards
Richard Aitchison

Senior Consultant
Insight 2 Value Ltd
------------------------------

Madhur Vashisht

Hi Richard, 

Thanks for calling this out. We are investigating this issue internally and, if required, we will schedule time with you directly. Either way, you will have an update from the team in a few days.