Master Data Management

  • 1.  FPF Filter

    Posted Wed June 03, 2020 03:58 PM
    Hi, 

    I am trying to set up the False Positive Filter (FPF) using a 4DIM weight file, but I am not sure where to specify the fields that the filter should use.

    The example below gives me a 4-dimensional array, but how do I know which fields participate? Does it consider Name & Gender, DOB, birth year, and SSN by default?

    1|1|A|CMPID-FPF-DIST|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
    1|1|A|CMPID-FPF-DIST|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
    1|1|A|CMPID-FPF-DIST|0|0|2|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
    
    Inputs are highly appreciated. 

    Thanks,
    Venkat

    ------------------------------
    Venkata Ramana Mekala
    ------------------------------



  • 2.  RE: FPF Filter

    Posted Tue June 16, 2020 09:55 AM
    Venkat,

    The False Positive Filter (FPF2) is the 4-dimensional version. It requires 4 inputs (Name, DOB, SSN, and Gender). These inputs do not have to correspond to those exact fields; for example, SSN would not be an attribute in Canada.

    To make this work, the 4 elements need to be set up as follows.

    You will need an individual comparison function in your algorithm for each of these elements (keeping in mind that each project may use different attribute codes).

    • Name Compare: In the comparison function properties, the "Comparison Specification Code" must be set to: XNM
    • Gender Compare: In the comparison function properties, the "Comparison Specification Code" must be set to: SEX
    • Date of Birth Compare: In the comparison function properties, the "Comparison Specification Code" must be set to: DOB
    • SSN (or Other Identifier) Compare: In the comparison function properties, the "Comparison Specification Code" must be set to: SSN. This applies even if the identifier you are working with is not an SSN but something like an MRN, a Tax ID, or a Customer ID.

    The FPF2 Comparison Function needs to be connected to the Date of Birth process as a second Comparison Function.

    The weights for the FPF2 appear in the mpi_wgt4dim table. It is important to have all of the weight entries in the table: even though many scenarios carry no penalty from the False Positive Filter, you still need the full set of weights. There are 216 rows in the mpi_wgt4dim table for the FPF2 weights.
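
    The 216 figure matches the index ranges listed in this thread: 12 Name/Gender values (0-11), 3 birth-date edit-distance values (0-2), and 6 year-difference values (0-5), and 12 x 3 x 6 = 216. As a rough sketch in Python, the check below enumerates every index triple and reports which ones are absent. How the rows are read from the weight file or from mpi_wgt4dim is not covered; `loaded` is a hypothetical set of (name/gender, edit distance, year difference) tuples.

```python
from itertools import product

# Index ranges taken from the FPF weight index tables in this thread.
NAME_GENDER_INDEXES = range(12)   # 0-11
DOB_EDIT_INDEXES = range(3)       # 0-2
YEAR_DIFF_INDEXES = range(6)      # 0-5


def expected_rows():
    """Return every (name_gender, dob_edit, year_diff) index triple."""
    return set(product(NAME_GENDER_INDEXES, DOB_EDIT_INDEXES, YEAR_DIFF_INDEXES))


def missing_rows(loaded):
    """Return the expected index triples that are not in `loaded`, sorted."""
    return sorted(expected_rows() - set(loaded))


if __name__ == "__main__":
    print(len(expected_rows()))  # 216
    loaded = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}  # hypothetical partial load
    print(len(missing_rows(loaded)))  # 213
```

    An empty result from `missing_rows` would mean every scenario has a row, including the ones whose penalty is 0.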

    FPF Weight Indexes

    Name and Gender share the first column, where 12 indexes (0-11) represent the possible conditions.

    Index   Name Result   Gender Result
    0       Missing       Missing
    1       Exact         Missing
    2       Partial       Missing
    3       Disagree      Missing
    4       Missing       Agree
    5       Exact         Agree
    6       Partial       Agree
    7       Disagree      Agree
    8       Missing       Disagree
    9       Exact         Disagree
    10      Partial       Disagree
    11      Disagree      Disagree
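
    The Name/Gender index follows a regular pattern: the name result cycles through four values, and the gender result changes every four rows. The Python sketch below reproduces the table from that pattern. It is only a reading of the table, not the engine's actual implementation, and the function and constant names are made up for illustration.

```python
# Result labels in the order they appear in the Name/Gender index table.
NAME_RESULTS = ("Missing", "Exact", "Partial", "Disagree")
GENDER_RESULTS = ("Missing", "Agree", "Disagree")


def name_gender_index(name_result, gender_result):
    """Combine a name result and a gender result into the shared 0-11 index."""
    name_code = NAME_RESULTS.index(name_result)
    gender_code = GENDER_RESULTS.index(gender_result)
    # The gender result selects a block of four rows; the name result
    # selects the row inside that block.
    return gender_code * len(NAME_RESULTS) + name_code


if __name__ == "__main__":
    print(name_gender_index("Exact", "Agree"))        # 5
    print(name_gender_index("Disagree", "Disagree"))  # 11
```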

    Birth Date Edit Distance is in the 2nd column.

    Index   Meaning
    0       One or both Dates are Missing mm/dd
    1       The mm/dd Dates are an Exact match
    2       The mm/dd have an Edit Distance of 1

    Birth Date Year Difference is in the 3rd column.

    Index   Meaning
    0       One or both years are Missing
    1       Dates are 0-4 Years different
    2       Dates are 5-9 Years different
    3       Dates are 10-14 Years different
    4       Dates are 15-19 Years different
    5       Dates are >= 20 Years different
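
    The year-difference index amounts to 5-year buckets with a cap at 20 years. A minimal Python sketch of that bucketing follows; representing a missing year as `None` is an assumption made for the example, and the function name is hypothetical.

```python
def year_diff_index(year_a, year_b):
    """Map two birth years to the 0-5 index; None stands for a missing year."""
    if year_a is None or year_b is None:
        return 0
    diff = abs(year_a - year_b)
    # Buckets are 5 years wide; everything from 20 years up shares index 5.
    return min(diff // 5, 4) + 1


if __name__ == "__main__":
    print(year_diff_index(1980, 1983))  # 1
    print(year_diff_index(1980, 1950))  # 5
    print(year_diff_index(None, 1950))  # 0
```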

    SSN is tracked horizontally so the actual FPF penalty scores appear across the columns.

    Column  Meaning
    0       One or both SSNs are Missing
    1       SSNs are an Exact match
    2       SSNs have an Edit Distance of 1 or more
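
    Putting the four dimensions together: the first three indexes pick a row, and the SSN result picks the column within that row. The Python sketch below illustrates that lookup on an in-memory table. The dictionary layout, the function names, and the value 99 are all invented for illustration; real penalties come from your own weight data.

```python
from itertools import product


def build_zero_table():
    """Build a full table: one row per index triple, one 0 per SSN column."""
    return {key: [0, 0, 0] for key in product(range(12), range(3), range(6))}


def fpf_penalty(table, name_gender, dob_edit, year_diff, ssn_column):
    """Pick the row by the three row indexes, then the cell by SSN column."""
    return table[(name_gender, dob_edit, year_diff)][ssn_column]


if __name__ == "__main__":
    table = build_zero_table()
    # Made-up value for illustration only, not a real FPF weight.
    # Row (9, 1, 1): name Exact / gender Disagree, mm/dd exact, years 0-4 apart.
    # Column 2: SSNs have an edit distance of 1 or more.
    table[(9, 1, 1)][2] = 99
    print(fpf_penalty(table, 9, 1, 1, 2))  # 99
    print(fpf_penalty(table, 5, 1, 1, 1))  # 0
```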



    ------------------------------
    Tyson Carter
    ------------------------------



  • 3.  RE: FPF Filter

    Posted Tue June 16, 2020 09:55 AM
    Venkata,

    I've tried to reply to you a few times, so I'm hoping this attempt works.

    For the False Positive Filter (FPF2) to be invoked properly, a few things need to be in place.

    1. Correct Comparison Function Settings
          a. NAME - The comparison function that you use for Name MUST have a "Comparison Specification Code" of XNM for the False Positive Filter to read it.
          b. GENDER - The comparison function that you use for Gender MUST have a "Comparison Specification Code" of SEX for the False Positive Filter to read it.
          c. BIRTH DATE - The comparison function that you use for Birth Date MUST have a "Comparison Specification Code" of DOB for the False Positive Filter to read it.
          d. UNIQUE ID/SSN - The comparison function that you use for a Unique Identifier (or SSN) MUST have a "Comparison Specification Code" of SSN for the False Positive Filter to read it.
          NOTE: If your data does not contain an SSN attribute, any unique identifier will suffice, but you have to call it SSN in the comparison function's Comparison Specification Code.
    2. The False Positive Filter comparison function must be connected to the Birth Date's comparison role.
    3. Your weights in the mpi_wgt4dim table must contain all 216 rows of weights for the False Positive Filter scenarios. You can't just put in the ones you want. Many of the scenarios will have scores of "0" to indicate no penalty.
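
    The first item of the checklist above can be expressed as a simple set comparison. The Python sketch below assumes you have already collected the Comparison Specification Codes used in your algorithm into a list by some means of your own; it does not read them from the product, and the function name is hypothetical.

```python
# Codes the False Positive Filter looks for, per the checklist above.
REQUIRED_SPEC_CODES = {"XNM", "SEX", "DOB", "SSN"}


def missing_spec_codes(configured_codes):
    """Return the required codes that are absent from `configured_codes`."""
    return sorted(REQUIRED_SPEC_CODES - set(configured_codes))


if __name__ == "__main__":
    # Hypothetical algorithm that has no identifier comparison tagged SSN.
    print(missing_spec_codes(["XNM", "SEX", "DOB"]))  # ['SSN']
```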



    ------------------------------
    Tyson Carter
    ------------------------------