Global AI and Data Science


data cleaning

  • 1.  data cleaning

    Posted Fri May 24, 2019 08:07 AM
    Hi! I am reading text from scanned medical documents (provider notes) using Pytesseract OCR. The resulting text contains noise and misspellings. My ultimate goal is to extract useful medical information from it. Right now I'm stuck on how to correct both medical and general English misspellings. I think I need to build a dictionary that contains both medical and English words, and I'm looking for direction on what steps to take.
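
    To make the idea concrete, here is a minimal sketch of dictionary-based correction using only Python's standard library (`difflib.get_close_matches`). The word lists, the function names `correct_token` and `correct_text`, and the 0.8 similarity cutoff are all illustrative assumptions, not part of any existing pipeline. A real dictionary would be loaded from a general English word list plus a medical vocabulary.

```python
import difflib
import re

# Hypothetical, tiny word lists for illustration only. In practice these
# would be loaded from an English word list and a medical vocabulary.
ENGLISH_WORDS = {"patient", "reports", "pain", "and", "was", "prescribed", "daily", "for", "the"}
MEDICAL_WORDS = {"hypertension", "metformin", "diabetes", "lisinopril"}

# Combined dictionary: a set for fast membership tests, a sorted list for matching.
VOCAB_SET = ENGLISH_WORDS | MEDICAL_WORDS
VOCAB = sorted(VOCAB_SET)


def correct_token(token, cutoff=0.8):
    """Return the closest dictionary word, or the token unchanged if none is close enough."""
    lower = token.lower()
    if lower in VOCAB_SET:
        return token
    # cutoff is an assumed similarity threshold in [0, 1]; tune it on real OCR output.
    matches = difflib.get_close_matches(lower, VOCAB, n=1, cutoff=cutoff)
    if not matches:
        return token
    best = matches[0]
    # Keep a leading capital letter if the OCR token had one.
    return best.capitalize() if token[0].isupper() else best


def correct_text(text):
    """Correct every alphabetic run in the text; digits and punctuation are left alone."""
    return re.sub(r"[A-Za-z]+", lambda m: correct_token(m.group(0)), text)


if __name__ == "__main__":
    print(correct_text("Patiemt was prescribed metformn for hypertensoin"))
```

    Words with no close match are left unchanged, so unknown drug names or abbreviations are not silently replaced. A high cutoff is a deliberately conservative choice here, since a wrong "correction" in a medical note can be worse than a visible typo.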

    ------------------------------
    shafiqa iqbalh
    ------------------------------

    #GlobalAIandDataScience
    #GlobalDataScience