Robotic Process Automation (RPA)

Machine Learning - Text Classification inaccuracy

    Posted Tue May 17, 2022 04:24 AM
    Hi. I'm prototyping a bot for a potential customer. Part of this prototype is to identify and classify specific document types attached to emails, specifically invoices and New Claims Advice documents.

    I've been looking at using RPA's machine learning to achieve this, but I've been struggling to get meaningful results. Whatever document I ask the bot to classify, I get pretty much the same result every time: it's an invoice with around 88% confidence. (It even classified a customer presentation I gave as an invoice, at 88%.)

    I've followed the instructions in the Knowledge Center: I set up two directories containing example documents, one per document type, and then created separate machine learning models using each of the available algorithms (Bag of Words, N-Gram, Text Classifier). The result is always the same.
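
For reference, here is a minimal sketch of what bag-of-words and n-gram features generally look like. It is plain Python and is not how the RPA product implements its algorithms, which are not documented in this post; the function names and the sample text are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bag_of_words(text):
    """Count each word, ignoring word order."""
    return Counter(tokenize(text))

def ngrams(text, n=2):
    """Count each run of n consecutive words, so some word order is kept."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical sample text, for illustration only.
sample = "Invoice number 1001. Invoice total due: 250.00"
print(bag_of_words(sample).most_common(3))
print(ngrams(sample, 2).most_common(3))
```

Either representation depends entirely on the text that reaches it, so printing the extracted text and its counts for one training document of each type is a cheap way to confirm that the two sets really do look different.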

    I've tried with the training documents in their original PDF format and also with text-only representations of these documents.

    Instead of using example documents, I then tried text files containing only the key terms you would expect in each document type, with no overlap in terms between the two types. Again, this made no difference.

    So I would really appreciate some help in understanding whether RPA's machine learning should be capable of meeting this use case and, if so, what is required to set it up in order to get meaningful results. And if not, why not? It seems to me to be quite a common use case for a bot.

    As the machine learning within RPA is really a black box, it would be very useful if there were some sort of logging from it that showed why it came up with the results it did, which could then be used to improve the training data.
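
To illustrate the kind of logging meant here, below is a minimal sketch of a word-count (naive Bayes style) classifier that reports how much each recognised term contributes to each label. It is an independent baseline in plain Python, not the RPA product's implementation; the labels, training terms, and function names are all hypothetical, and equal prior probabilities are assumed.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train(examples):
    """Build per-label word counts from {label: [training texts]}."""
    counts = {label: Counter() for label in examples}
    for label, texts in examples.items():
        for text in texts:
            counts[label].update(tokenize(text))
    vocab = set().union(*counts.values())
    return counts, vocab

def explain(text, counts, vocab):
    """Return (probabilities, per-term log contributions) for one document."""
    known = Counter(t for t in tokenize(text) if t in vocab)
    scores, contributions = {}, {}
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)  # add-one (Laplace) smoothing
        contributions[label] = {
            term: n * math.log((c[term] + 1) / total) for term, n in known.items()
        }
        scores[label] = sum(contributions[label].values())  # equal priors assumed
    top = max(scores.values())
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    norm = sum(weights.values())
    probs = {label: w / norm for label, w in weights.items()}
    return probs, contributions

# Hypothetical key-term training data with no overlap between the two types.
training = {
    "invoice": ["invoice number total amount due payment vat"],
    "claim": ["new claim advice policy incident claimant loss"],
}
counts, vocab = train(training)

for doc in ["Please find the invoice attached, total amount due 250",
            "Quarterly roadmap presentation"]:
    probs, contributions = explain(doc, counts, vocab)
    print(doc, probs)
    for label, terms in contributions.items():
        print("  ", label, {t: round(v, 2) for t, v in terms.items()})
```

In this sketch, a document containing none of the training terms scores equally for both labels, and the per-term output shows exactly which words pushed a decision one way or the other. Output along those lines from the product would make it much easier to see whether the training data or the input text is the problem.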

    ------------------------------
    Regards
    Richard Aitchison

    Senior Consultant
    Insight 2 Value Ltd
    ------------------------------