Data Governance - Knowledge Catalog

Get advice from your industry peers, communicate with data governance and quality experts on best practices, and stay up to date on product news and helpful materials.

  • 1.  Data Quality Analysis for Birth Date

    Posted Mon August 21, 2023 11:25 AM

    Hello,

    I need to analyze the birth dates of our customers in IBM Watson Knowledge Catalog. The data format is YYYY-MM-DD. To analyze it, I wrote this data quality definition:

    IF R exists and len(trim(R))<>0 and month(R) = 2 and year(R)%4 = 0 and year(R)%100 <> 0 or year(R)%400 = 0 then day(R) <= 29 else day(R) <= 28 OR IF month(R) in_reference_list {'4','6','9','11'} then day(R)<=30 OR IF month(R) in_reference_list {'1','3','5','7','8','10','12'} then day(R)<=31

    In summary, if birth_date exists, the rule first checks the February condition (for leap years) and then checks the other months. But IBM WKC does not allow this complex rule. How can I simplify it? Can I use more than one IF in a data quality definition?
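
    For reference, the intended logic of the rule above can be sketched outside WKC in Python, with the leap-year test grouped explicitly. This is only an illustration of the check, not WKC rule syntax; the function names `is_leap_year`, `max_day`, and `is_valid_birth_date` are hypothetical.

```python
def is_leap_year(year: int) -> bool:
    # Explicit parentheses: divisible by 4 and not by 100, or divisible by 400.
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


def max_day(year: int, month: int) -> int:
    # February depends on the leap-year test; the other months are fixed.
    if month == 2:
        return 29 if is_leap_year(year) else 28
    if month in (4, 6, 9, 11):
        return 30
    return 31


def is_valid_birth_date(value: str) -> bool:
    # Assumes the YYYY-MM-DD layout described in the post.
    parts = value.strip().split("-")
    if len(parts) != 3:
        return False
    try:
        year, month, day = (int(p) for p in parts)
    except ValueError:
        return False
    if not 1 <= month <= 12:
        return False
    return 1 <= day <= max_day(year, month)
```

    Written this way, each month group is a separate branch, and the leap-year condition no longer depends on how `and` and `or` are grouped.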



    ------------------------------
    Gizem Tepecik
    ------------------------------


  • 2.  RE: Data Quality Analysis for Birth Date

    Posted Mon August 26, 2024 06:36 PM

    Hi

    We have 190 samples that may provide the results you need: https://www.ibm.com/docs/en/cloud-paks/cp-data/5.0.x?topic=definitions-sample-data-quality

    You can use this simple formula to get the results you want:

    IF Field1 EXISTS THEN Field1 IS_DATE
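
    The reason a single date check can replace the month-by-month rule is that a calendar-aware date parser already rejects impossible days such as February 29 in a non-leap year. A minimal Python sketch of the same idea follows, outside WKC. The function name `check_birth_date` is hypothetical, and treating a missing or blank value as passing is an assumption about how the EXISTS guard behaves.

```python
from datetime import datetime
from typing import Optional


def check_birth_date(value: Optional[str]) -> bool:
    # Assumption: like "IF Field1 EXISTS THEN ...", a missing or blank
    # value is not evaluated and therefore does not fail the check.
    if value is None or not value.strip():
        return True
    try:
        # strptime validates month lengths and leap years for us.
        datetime.strptime(value.strip(), "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

    Note that `strptime` also accepts months and days without a leading zero (for example `2023-1-5`), so a stricter format check would need an additional pattern test.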



    ------------------------------
    HOW MING (Felix) YONG
    Senior Data & AI Technical Specialist - Information Architecture
    IBM
    Melbourne
    ------------------------------