InfoSphere Optim


Finding Sensitive Data in Free-Form Text


    Posted Fri April 10, 2020 02:16 PM
    Curious whether anyone is using, or has built, logic to find sensitive data or PII within free-form text strings. One example is a chat transcript between a customer and a service rep: the customer could enter sensitive information, which then gets saved into the repository holding all the chat data.

    If someone needs to analyze such data, any potentially sensitive values should be masked first. Coding a process to identify those values means handling a large number of permutations, though, and it starts to verge on needing some level of AI to learn all the possible patterns for things such as SSNs, credit card numbers, driver's license numbers (DLNs), and so on.
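    To make the pattern problem concrete, here is a minimal sketch in Python of the regex-plus-checksum approach for two of those data types. It is an illustration only, not an Optim exit and not an ODPP API: the names `SSN_RE`, `CARD_RE`, `luhn_ok`, `mask_card`, and `mask_text` are hypothetical, and the patterns assume US-style SSNs and 13 to 19 digit card numbers separated by spaces or hyphens. DLNs are left out because their formats vary by state and carry no common checksum.

```python
import re

# Hypothetical patterns for illustration; real chat data will need broader coverage.
SSN_RE = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card(match: re.Match) -> str:
    """Mask a card-like match, keeping the last four digits, only if Luhn passes."""
    digits = re.sub(r"\D", "", match.group())
    if not luhn_ok(digits):
        return match.group()  # probably not a card number; leave it untouched
    return "*" * (len(digits) - 4) + digits[-4:]


def mask_text(text: str) -> str:
    """Mask card numbers first, then SSN-shaped values, in free-form text."""
    text = CARD_RE.sub(mask_card, text)
    return SSN_RE.sub("***-**-****", text)


if __name__ == "__main__":
    print(mask_text("My card is 4111 1111 1111 1111 and SSN 123-45-6789."))
```

    The Luhn check is there to cut false positives on long digit runs such as order numbers. The SSN pattern has no such check, so it will also flag any nine-digit value in that shape, which is exactly the kind of ambiguity that pushes this toward something smarter than regexes.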

    We have plenty of exit code examples that parse data fields and locate data elements by tag (e.g., JSON or XML), but free-form text presents a new wrinkle for masking because it does not follow any set structure.

    Do any of our Optim users or consultants have processes they have coded or leveraged to find sensitive data in free-form fields so that it can be masked using ODPP? I'd appreciate any thoughts and input on this.

    ------------------------------
    Keith Tidball
    Progressive Insurance
    ------------------------------

    #InfoSphereOptim
    #Optim