Global AI and Data Science

  • 1.  data cleaning

    Posted Fri May 24, 2019 08:07 AM
    Hi! I am reading scanned medical documents (provider notes) with Pytesseract OCR. The extracted text contains noise and misspellings. My ultimate goal is to extract useful medical information from it. Right now I'm stuck on how to correct misspellings of both medical and English words. I need to create a dictionary that contains both medical and English words. I'm looking for direction on which steps I need to perform.
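
    To make the question concrete, here is a minimal sketch of dictionary-based correction, assuming a merged English and medical word list and fuzzy matching from Python's standard `difflib`. The word lists, the function names `correct_token` and `correct_text`, and the `cutoff` and `min_len` values are all hypothetical placeholders; a real vocabulary would be loaded from larger sources.

```python
import difflib
import re

# Hypothetical, tiny word lists for illustration only; a real setup would
# load a full English word list and a medical vocabulary from files.
ENGLISH_WORDS = {"patient", "reports", "pain", "given", "chest", "with"}
MEDICAL_WORDS = {"hypertension", "metformin", "dyspnea", "ibuprofen"}

VOCAB_SET = ENGLISH_WORDS | MEDICAL_WORDS   # fast membership checks
VOCAB_LIST = sorted(VOCAB_SET)              # candidates for fuzzy matching


def correct_token(token, cutoff=0.8, min_len=4):
    """Return the closest dictionary word, or the token unchanged.

    cutoff and min_len are assumed values; short tokens are skipped
    because fuzzy matching on them tends to produce false corrections.
    """
    word = token.lower()
    if len(word) < min_len or not word.isalpha() or word in VOCAB_SET:
        return token
    matches = difflib.get_close_matches(word, VOCAB_LIST, n=1, cutoff=cutoff)
    # Note: the corrected word comes back lowercase.
    return matches[0] if matches else token


def correct_text(text):
    """Correct each alphabetic run in the text, leaving digits and punctuation alone."""
    return re.sub(r"[A-Za-z]+", lambda m: correct_token(m.group(0)), text)


if __name__ == "__main__":
    print(correct_text("Patient with hypertensoin given metfornin 500mg."))
```

    This only handles isolated word errors. OCR noise such as merged or split words, and misspellings that happen to be valid words, would need extra steps that this sketch does not cover.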

    ------------------------------
    shafiqa iqbalh
    ------------------------------

    #GlobalAIandDataScience
    #GlobalDataScience