Cloud Pak for Data


🎟 Connecting to your data sources


  • 1.  🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:31 PM
    Edited by System Test Fri January 20, 2023 04:25 PM
    Data can live just about anywhere. Uploading a file from your desktop is fairly straightforward, but how do you connect to data that lives in other locations?

    🎟 For 1 ticket, tell us how you determine which data to connect to. How do you get the information needed to connect to it? 
    If you're not the one setting up these connections to your data sources, tell us who does this at your company.

    Here's an example of what the "Add connection" page looks like in Cloud Pak for Data. 

    Rachel Miles Sijacic
    Sign up for the User Experience Program:

  • 2.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:34 PM
    Learn how to connect to your data in:

    JULIA Montarbo

  • 3.  RE: 🎟 Connecting to your data sources

    IBM Champion
    Posted Thu May 06, 2021 02:40 PM
    This is an interesting question: 'who does this at your company?' We do have Data Architects and also Application owners. I would not know if there is a specific person who would do it. Most of the time things are done by whoever manages the component.
    Would it be possible to set different roles in CP4D? Roles that can be given to a user ID so that it can do different things, with different privileges?

    Chiara Baldan

  • 4.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:42 PM
    Totally! We're seeing more and more that job titles don't necessarily indicate responsibilities as a rule. Check out the user management post to see how roles work!

    Rachel Miles Sijacic

  • 5.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:57 PM
    Thanks for the input, Chiara! In CP4D we do have specific permissions that can prevent or grant the ability to create connections depending on the project or catalog you are a member of. It sounds like you have a wide variety of people who are connecting to data sources, but typically that's done by the owner or administrator of a particular component. We would love to hear more about how you determine who should and who should not have the ability to connect to data sources!

    Daniel Klahn

  • 6.  RE: 🎟 Connecting to your data sources

    IBM Champion
    Posted Thu May 06, 2021 03:13 PM
    Hi Daniel,
    On the mainframe side it is usually the Application owner who tells the DB2 administrator and the RACF administrator which users or applications need to have access to data.
    Similarly, for Oracle, Teradata, MongoDB, and Big Data, you would have different security administrators and different DB administrators. It would be a challenge to have a single member/group do it.

    Chiara Baldan

  • 7.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:53 PM
    @Chiara Baldan it sounds like maybe it would make sense for you to give specific users Editor access to the Platform connections page so that they can define the connections.

    For more information, see:

    JULIA Montarbo

  • 8.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:42 PM
    Before building any product, we write up a PRD (Product Requirement Document) which is accessible to me while making database connections. PRDs has info in it.

    Rohit Goyal

  • 9.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:43 PM
    That seems very helpful!

    Rachel Miles Sijacic

  • 10.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:53 PM
    Great info Rohit! Could you expand on what you mean by "PRDs has info in it."? What kind of info does it contain? Do you have everything documented there to create a connection to your data source? Database, host, port, etc?

    Karen Gonzalez

  • 11.  RE: 🎟 Connecting to your data sources

    IBM Champion
    Posted Thu May 06, 2021 02:52 PM
    It really depends on which stage of the project you are in. In the beginning most of the admins have pretty high access rights. But later we need more granular and orthogonal access filters, so a simple "admin or not" filter is not enough.
    We would need some level at which to define whether a user qualifies to see this data or not, which depends on the business role of the user.

    Roland Schock
    Distinguished Engineer
    ARS Computer und Consulting GmbH

  • 12.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:00 PM
    Roland - some interesting insights here! Could you briefly define for me what you mean by "stage of the project"? What stages do you have? And what are the admins doing at the "beginning" when connecting to data sources that later admins are not doing? And how/what "levels" would you set for qualifications or business roles to be able to create and manage data source connections?

    Karen Gonzalez

  • 13.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 04:13 PM
    @Roland Schock​ Only users with the database credentials can access platform connections. However, Watson Knowledge Catalog might be a good solution for ensuring that only the right users see information. For details, see

    JULIA Montarbo

  • 14.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 02:54 PM
    I would like to be able to upload a plain CSV or JSON file, and to have a connector for web crawling, a connector for Twitter handles, and the possibility to add a custom connector with custom code to connect to a customer's content management system.
    We discuss with the customer which data is available that would best support the use case and how it is available: whether they can export the data for us into a system that is easy for us to access and connect to, or whether they have a specific system that we need to connect to. Usually they need to give us access to that database by creating a user with the rights to crawl the data. Sometimes it is important that the user has a specific role, so that they can only access the relevant data and do not import data that should not be visible in a system like Discovery. Alternatively, the data needs to be flagged with the user roles for which it should be visible in the end-user application, i.e. more metadata needs to be imported with the data itself.
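
    The idea of importing role metadata alongside the data could look roughly like the sketch below. This is only an illustration under stated assumptions: `Record`, `visible_to`, and `visible_records` are hypothetical names, not part of Discovery or Cloud Pak for Data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical imported item: the content plus the roles allowed to see it.
    content: str
    visible_to: frozenset = frozenset()


def visible_records(records, user_roles):
    """Keep only the records flagged for at least one of the user's roles."""
    roles = set(user_roles)
    return [r for r in records if r.visible_to & roles]


# Example data; the role names are made up for illustration.
imported = [
    Record("Public product FAQ", frozenset({"support", "sales"})),
    Record("Internal pricing notes", frozenset({"sales"})),
    Record("Unflagged draft"),  # no roles flagged, so nobody sees it
]

for_support = visible_records(imported, ["support"])
for_sales = visible_records(imported, ["sales"])
```

    In this sketch a record with no flagged roles is hidden from everyone, which is a deliberately conservative default; a real system might choose differently.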

    Dorothee Reinhard
    Data & AI Scientist
    IBM Switzerland
    0041 79 565 74 56

  • 15.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:27 PM
    Great insight into the process of determining which data, how and who! You'll be happy to hear that in CP4D we do in fact offer users the ability to upload their own drivers to create custom connections to their unique data sources, on top of the numerous data sources we natively support.

    It sounds like you become somewhat familiar with the data source before deciding whether or not to use it. When you run into problems connecting to and accessing the data, do you typically troubleshoot with the customer or are there other resources you use or wish you had?

    Daniel Klahn

  • 16.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 04:56 PM
    It also sounds like Watson Knowledge Catalog should be a part of the solution.

    JULIA Montarbo

  • 17.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:01 PM
    Depends on where the data lives. Usually, there will be some documentation (GitHub, Confluence, etc.) around the data I'm trying to connect to, such as source, host, port number, username, etc.

    Grishma Jena

  • 18.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:04 PM
    Grishma - Who is responsible for that documentation you mention? I assume this documentation is shared with others besides you? Does it contain information for other data sources? Would love to hear more from you on your process!

    Karen Gonzalez

  • 19.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:14 PM
    Data is chosen based on the business need.  Usually it is a Data Steward role or other technical lead who sets up the connection.

    Shawn Lunny

  • 20.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 03:37 PM
    Edited by System Test Fri January 20, 2023 04:43 PM
    We are certainly hearing more and more that setting up the connection is more of a data provider task, not necessarily by the consumers of the data. Thanks for the input!

    Daniel Klahn

  • 21.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 04:07 PM
    Shawn - Do you have direct access to your Data Steward? How would you discover new data sources that you need, and how would you request them from your Data Steward?

    Karen Gonzalez

  • 22.  RE: 🎟 Connecting to your data sources

    Posted Thu May 06, 2021 06:31 PM
    Too many places and not in-sync.

    Metadata Import has connection.
    Project has connection.
    Enterprise Connection.
    Discovery has connection.
    All the above connections are not consistent.


  • 23.  RE: 🎟 Connecting to your data sources

    Posted Thu October 14, 2021 01:24 PM
    We do not have a designated role for setting up these connections. One pain point that is preventing us from using this is the need to have shared credentials. An improvement would be to allow the username and password to remain blank; if they are blank when someone adds the connection, there is a prompt asking for the username and password.

    We find the correct connection by first finding who owns the data we need access to, then following their process to gain access. Once approved, we are told how to access it. For data sources internal to our team, we would already know this information.
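
    The suggestion of leaving the username and password blank and prompting for them could be sketched as follows. This is a minimal illustration, not how Cloud Pak for Data implements connections: the `Connection` class, the `resolve_credentials` helper, and the host name are all assumptions made up for the example.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Connection:
    # Hypothetical shared connection definition; credentials may be left blank.
    name: str
    host: str
    port: int
    username: Optional[str] = None
    password: Optional[str] = None


def resolve_credentials(conn: Connection, ask: Callable[[str], str]) -> Connection:
    """Return a copy of conn, asking the user for any blank credential."""
    username = conn.username or ask("username")
    password = conn.password or ask("password")
    return replace(conn, username=username, password=password)


# The shared definition is stored without credentials.
shared = Connection(name="sales-db", host="db.example.com", port=50000)

# In an interactive session, ask could wrap input() and getpass.getpass();
# here a dictionary lookup stands in for the prompt.
answers = {"username": "alice", "password": "s3cret"}
personal = resolve_credentials(shared, answers.__getitem__)
```

    The shared definition itself is never modified, so each person who adds the connection supplies their own credentials.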


  • 24.  RE: 🎟 Connecting to your data sources

    Posted Thu October 14, 2021 01:46 PM

    Thanks Dama! I am curious... what version are you using today? We are trying to address the exact "shared credentials" approach that you mentioned.

    Karen Gonzalez

  • 25.  RE: 🎟 Connecting to your data sources

    Posted Thu October 14, 2021 01:36 PM
    I found connecting to data sources somewhat confusing, as there are multiple ways to connect to the data of interest. For example, there is the Platform Connections process, then in Projects there is the "add a connection" path, and then in Catalogs there is yet another way to connect to a data source. Connections created in the Add to Catalog path may not work or even be visible when trying to use a Project (Data Quality).

    Tom Kochie

  • 26.  RE: 🎟 Connecting to your data sources

    Posted Thu October 14, 2021 01:50 PM

    Thanks Tom! I am wondering if you could take a look at this topic (linked here)? It is a question about having a platform wide view of all connections you have access to... maybe this would help? What are your thoughts?

    Karen Gonzalez