Robotic Process Automation (RPA)


 Issues with Text Classifier

Richard Aitchison posted Thu November 20, 2025 09:48 AM

Hi. I raised this same issue back in 2022 (https://community.ibm.com/community/user/discussion/machine-learning-text-classification-inaccuracy) and had no response.

We're in the process of building a bot for a customer, and we need to identify/classify 2 specific document types.

I've built a machine learning (text classifier) model from directories of files that are exemplars of the 2 document types.

When I use it, I struggle to get meaningful results. If I classify documents that are of the appropriate document types, I seem to be getting the correct choice (but the score is always the same - 88). If I classify a completely different document that has nothing to do with the 2 document types - and as such I would expect a very low score for either doc type - I still get a score of 88 against doc type 1 (always the same doc type). This means that I can have no trust in the results.
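To make the symptom concrete, here is a minimal sketch in plain Python that reproduces the same pattern outside IBM RPA. It is only an illustration under stated assumptions: the model is a toy bag-of-words naive Bayes classifier, the training texts and the names `train`, `classify`, `TWO_TYPES` and `WITH_OTHER` are hypothetical, and nothing here is known to match how the IBM RPA text classifier works internally. It shows one way a classifier trained on only 2 classes can return the same class and the same score for any unrelated document, and how a third "other" class changes that.

```python
import math
from collections import Counter

def train(docs_by_label):
    """Fit a tiny multinomial naive Bayes model on {label: [text, ...]}."""
    total_docs = sum(len(docs) for docs in docs_by_label.values())
    vocab, counts, priors = set(), {}, {}
    for label, docs in docs_by_label.items():
        counts[label] = Counter(w for d in docs for w in d.lower().split())
        vocab.update(counts[label])
        priors[label] = len(docs) / total_docs
    return {"priors": priors, "counts": counts, "vocab": vocab}

def classify(model, text):
    """Return (best_label, score), with score as a rounded percentage."""
    vocab = model["vocab"]
    log_scores = {}
    for label, prior in model["priors"].items():
        counts = model["counts"][label]
        total = sum(counts.values())
        log_p = math.log(prior)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Laplace-smoothed word likelihood for this class
                log_p += math.log((counts[word] + 1) / (total + len(vocab)))
        log_scores[label] = log_p
    # Normalise the log scores so they sum to 1 across the known classes.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, round(100 * weights[best] / sum(weights.values()))

# Hypothetical training data: only the 2 document types, no "other" class.
TWO_TYPES = {
    "doc_type_1": ["invoice number total amount due",
                   "invoice payment due date amount",
                   "invoice total vat amount"],
    "doc_type_2": ["contract parties agreement term",
                   "agreement signed parties clause"],
}

# Same data plus a third class of unrelated documents.
WITH_OTHER = dict(TWO_TYPES, other=["weather forecast sunny tomorrow",
                                    "football match result score"])

two_class = train(TWO_TYPES)
three_class = train(WITH_OTHER)

# Unrelated documents share no words with the training set, so the
# two-class model falls back to the class priors every time.
print(classify(two_class, "weather forecast sunny tomorrow"))
print(classify(two_class, "football match result"))
# With an explicit "other" class, unrelated text has somewhere to go.
print(classify(three_class, "weather forecast rain"))
```

In this sketch both unrelated documents come back as `doc_type_1` with an identical score, because the scores are forced to sum to 100 across the only 2 classes the model knows. Whether IBM RPA behaves this way for the same reason is an assumption, but it suggests one experiment: train a third class made up of unrelated documents and see whether the constant score disappears.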

I've tried different algorithms - but the result is always the same.
 
As the documentation is poor and there's no 'logging' in the SaaS environment that allows us to see what's happening in the model, please can someone advise whether this is a bug or whether there's some trick to training the model so that it works correctly?

------------------------------
Regards
Richard Aitchison

Senior Consultant
Insight 2 Value Ltd
------------------------------