Cloud Pak for Data

  • 1.  View all entities after matching.

    Posted Thu September 22, 2022 12:55 PM
    Hi there,
    How can we view all entities after running the matching setup on our data sets, instead of searching for each entity in the master data explorer?

    ------------------------------
    Preeti Yadav
    ------------------------------

    #CloudPakforDataGroup


  • 2.  RE: View all entities after matching.

    Posted Fri September 23, 2022 09:42 AM
    Edited by System Fri January 20, 2023 04:19 PM
    Hi Preeti,

    I wonder what your real use case is and why you need to do this?

    Anyway, the master data explorer allows you to explore all your ingested data (slice and dice it), and view the search results either at the entity level or the record level. There is simple search and advanced search. You can get more information on this in the documentation here: https://dataplatform.cloud.ibm.com/docs/content/wsj/mdm/explore.html?audience=wdp

    You can use either type of search to see all the entities. How you would do this depends on the data model you are using and how your data is mapped to it.
    For example, let's assume the attribute "Record source" is fully populated with values corresponding to two data sources, "dsA" and "dsB".
    Then in simple search you could just type both of those values into the search bar to view all the entities (multiple search terms return results based on OR logic). You can probably come up with other examples of how to do it. Remember that simple search looks across all attributes, record types and entity types (although you can filter that in the UI as well if you wish).
    You can do something similar in advanced search too, e.g. you could build a rule which says show me all the entities where "Record source does not contain Value".
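
    To make the OR logic concrete, here is a minimal sketch in plain Python. It is not how the master data explorer is implemented; it only illustrates the idea on in-memory data. The function name `simple_search`, the field names and the case-insensitive substring matching are all assumptions made for this example.

```python
def simple_search(records, terms):
    """Return records where ANY term appears in ANY attribute value (OR logic).

    Illustration only: the field names and the substring matching are assumptions.
    """
    lowered = [term.lower() for term in terms]
    return [
        rec
        for rec in records
        if any(term in str(value).lower() for value in rec.values() for term in lowered)
    ]


# Hypothetical sample records with a "record_source" attribute.
records = [
    {"record_id": "1", "record_source": "dsA", "name": "Ann Lee"},
    {"record_id": "2", "record_source": "dsB", "name": "Anne Lee"},
    {"record_id": "3", "record_source": "dsC", "name": "Bob Ray"},
]

# Searching for both source names returns every record from either source.
hits = simple_search(records, ["dsA", "dsB"])
print([rec["record_id"] for rec in hits])  # prints ['1', '2']
```

    If every record carries one of the listed source values, the combined search returns everything, which is the effect described above.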

    You can also use the Export feature built into the master data explorer. Just hit the Export button (in the top right-hand corner) and you will see all the export options, including export "All data" and ".. export data as records or entities". See this link for more info about exporting: https://dataplatform.cloud.ibm.com/docs/content/wsj/mdm/export.html?audience=wdp
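
    Once the data is exported, you could review the entities in bulk outside the UI. The following is a rough sketch, assuming the export is a CSV file with one row per record and a column that identifies the entity each record belongs to. The column names `entity_id` and `record_id` are hypothetical, so check them against your actual export file.

```python
import csv
import io
from collections import defaultdict


def group_records_by_entity(csv_file, entity_col="entity_id", record_col="record_id"):
    """Map each entity ID to the list of record IDs linked to it.

    The column names are assumptions; adjust them to match the real export.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[entity_col]].append(row[record_col])
    return dict(groups)


def entities_with_duplicates(groups):
    """Keep only the entities that have more than one member record."""
    return {eid: recs for eid, recs in groups.items() if len(recs) > 1}


# Small in-memory stand-in for an exported file.
sample = io.StringIO(
    "entity_id,record_id,record_source\n"
    "E1,1,dsA\n"
    "E1,2,dsB\n"
    "E2,3,dsA\n"
)

groups = group_records_by_entity(sample)
print(entities_with_duplicates(groups))  # prints {'E1': ['1', '2']}
```

    With a real file you would pass `open("export.csv", newline="")` instead of the `io.StringIO` sample.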

    Another way to do it would be to use the Match 360 connector and publish this to a catalog for viewing via a connected data asset. See this link for more info: https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/conn-m3.html?audience=wdp

    There may be other ways to view all the entities, and you might want to consider the data volume you have in case there are any performance implications.

    ------------------------------
    JOHN MATTHEWS
    ------------------------------



  • 3.  RE: View all entities after matching.

    Posted Mon September 26, 2022 08:02 AM
    Hi John,
    Thank you very much for reaching out and suggesting solutions. We are actually trying to solve a problem of fuzzy duplicates: matching groups the duplicates under one entity, so we can then either join or delete the duplicate records. With production data, however, there will be a huge number of records and possibly a large number of entities, so searching for one entity at a time would be tedious. Is there any way to get all the entities in one go, so that we can delete or join records just as we do when we find an entity by searching?

    ------------------------------
    Preeti Yadav
    ------------------------------



  • 4.  RE: View all entities after matching.

    Posted Mon September 26, 2022 09:12 AM
    I do not understand what you are trying to do.

    You say: "we are trying to solve problem of fuzzy duplicates, we get duplicates under one entity".

    As a result of running matching, you are supposed to get duplicate, or nearly duplicate, records under an entity (according to your match algorithm). See: https://dataplatform.cloud.ibm.com/docs/content/wsj/mdm/data-concepts.html?audience=wdp

    ------------------------------
    JOHN MATTHEWS
    ------------------------------



  • 5.  RE: View all entities after matching.

    Posted Tue September 27, 2022 02:18 AM
    We are actually doing a data quality check, so we need to remove duplicate records from the data. When we get duplicate records under an entity, we need to keep the best record and delete the others.

    ------------------------------
    Preeti Yadav
    ------------------------------



  • 6.  RE: View all entities after matching.

    Posted Tue September 27, 2022 05:27 AM
    I still do not understand exactly what you are trying to do. When you refer to deleting records, are you talking about deleting them in Match 360 or in the source systems? How do you determine the "best record"? I would have thought the best record is the entity itself, a composite view of all the contributing source records, according to the composite view rules. Let's arrange a call if you want to discuss this further, so we can find the best way forward: johnmatt@ie.ibm.com.

    ------------------------------
    JOHN MATTHEWS
    ------------------------------