Cloud Pak for Data

  • 1.  What would you ask a room of Cloud Pak for Data users and experts?

    Posted Thu December 17, 2020 09:00 AM

    We asked our members "If you were in a room full of Cloud Pak for Data users, what question would you ask?"

    So far, we’ve posted four questions from community members to the Cloud Pak for Data Group. We will post a few more soon, but if you missed the first discussions in our “Ask the Room” series, you can read and reply to them here:



    ------------------------------
    Shannon Rouiller
    ------------------------------

    #CloudPakforDataGroup


  • 2.  RE: What would you ask a room of Cloud Pak for Data users and experts?

    Posted Wed January 06, 2021 02:30 PM

    When it comes to data modelling and storage with SQL and NoSQL, how do you decide which is the best approach for your project?

    Also, when creating AI models, do you find there is a big difference between using SQL and NoSQL databases?



    ------------------------------
    Fernanda Braga
    ------------------------------



  • 3.  RE: What would you ask a room of Cloud Pak for Data users and experts?

    Posted Fri January 08, 2021 05:53 PM
    In general, when you're selecting a data warehouse, you want to pay attention to a few things:

    • Type of the Data
      • What kind of data are you storing? Is it images, documents, hierarchical data, tabular, geospatial?
      • Is it consistent? Does the data have nulls? Does the data structure change from record to record?
      • Do you have multiple kinds of data? Is some of it columnar and some of it hierarchical or blobs?
    • Access Patterns
      • Is your data transactional?
      • Do you get one bulk update a day, or is it a real-time feed?
      • How often do you query the data? Do you need whole rows or just single columns?

    SQL databases are generally (with a few exceptions) focused on relational, tabular data. This lets you easily slice and dice the data you're looking for, as if it were a massive spreadsheet.
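
    As a rough illustration of that "slice and dice" idea, here is a minimal sketch using Python's built-in sqlite3 module. The sales table, its columns, and its values are hypothetical and exist only for this example; any relational database would behave similarly.

```python
import sqlite3

# Hypothetical table and values, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "widget", 10.0),
        ("east", "gadget", 5.0),
        ("west", "widget", 7.5),
    ],
)

# "Slice": keep only the rows and columns you care about.
east_rows = conn.execute(
    "SELECT product, amount FROM sales WHERE region = ? ORDER BY product",
    ("east",),
).fetchall()

# "Dice": aggregate along one dimension, like a pivot in a spreadsheet.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```

    The same filter and group-by operations are harder to express when every record has a different shape, which is one reason the type of data matters so much.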

    NoSQL isn't a single type of database; it is a catch-all term for anything that doesn't match my previous description of SQL. This includes everything from document stores like MongoDB or CouchDB to key-value and graph databases like Berkeley DB and Neo4j. Which of these you use is determined by the two questions I listed above: what does your data look like, and what access patterns do you have?

    As for which database is most commonly used for AI: most AI algorithms work on tabular data, and any data that is not tabular must be converted first. So if you're doing visual recognition, a common first step is to map each pixel to a column in a very wide relational table, often converting to black and white and normalizing the pixel values in the process. If you choose one of the NoSQL databases, it is highly likely that you will start by converting the data into a tabular form before training or using your model.
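
    The pixel-to-column step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the image is a tiny hypothetical 2x2 grid of RGB tuples, the image_to_row helper and the pixel_N column names are made up for this example, and the grayscale conversion uses the common BT.601 luma weights, which is one choice among several.

```python
def image_to_row(image):
    """Flatten an RGB image (rows of (r, g, b) tuples, 0-255) into one feature row."""
    row = []
    for line in image:
        for r, g, b in line:
            # Convert to a single gray value (BT.601 luma weights, an assumption).
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            # Normalize from the 0-255 range to 0-1.
            row.append(gray / 255.0)
    return row


# Hypothetical 2x2 image: black, white, pure red, pure blue.
tiny_image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]

features = image_to_row(tiny_image)
# One column per pixel, so a 2x2 image becomes a 4-column record.
column_names = [f"pixel_{i}" for i in range(len(features))]
```

    Each image becomes one row, so a set of images of the same size becomes a table with one column per pixel, whichever database the originals were stored in.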

    ------------------------------
    HANS UHLIG
    ------------------------------



  • 4.  RE: What would you ask a room of Cloud Pak for Data users and experts?

    Posted Sat January 16, 2021 06:14 PM

    I would ask Cloud Pak for Data members: given that Cloud Pak for Data is a fully integrated data and AI platform, what is the best security solution (features, protocols) that embraces modernized principles while addressing significant issues such as how a business collects and organizes data, and providing the analytics needed to properly infuse AI? Please also take into account that Cloud Pak for Data is cloud-native by design.

    Responses can include mention of Kubernetes; in fact, I welcome all comments.



    ------------------------------
    Yvonne R. McGinnis
    DevOps (hopeful), Systems Administration
    Obama Foundation, Chicago
    Chicago Cato, Illinois
    773-886-5579
    ------------------------------