Connecting your Redshift data for AI Modelling with IBM Cloud Pak for Data

By Michael Cronk posted Tue January 04, 2022 08:24 PM



Mark Brown, Sr. Partner Solutions Architect, AWS

Linh Lam, Sr. Partner Solution Architect, AWS

David Lebutsch, Distinguished Engineer, IBM



AWS and IBM recently announced that IBM Cloud Pak for Data, a unified platform for data and AI, is available on AWS Marketplace. Cloud Pak for Data comes with well-known applications in several categories, such as:


  • Built-in data governance applications (IBM Watson Knowledge Catalog)
  • Data quality and integration applications (IBM DataStage)
  • Model automation (IBM Watson Studio)
  • Purpose-built AI model risk management


Customers gain a streamlined data pipeline: data collected with the AWS services they already use feeds directly into IBM Cloud Pak for Data to generate actionable insights in real time. Customers can potentially use that data to automate other AWS services as well.


How to connect Amazon Redshift to Cloud Pak for Data

IBM Cloud Pak for Data comes ready to connect to your AWS data sources.


In this post, we will show you how to connect Amazon Redshift to IBM Cloud Pak for Data at the platform level, making your Redshift data warehouse accessible to all of its services. You will create a connection that can be used by Watson Studio and Watson Knowledge Catalog, or as a catalog asset for other projects. Any Cloud Pak user who has access to the platform can see this connection, but only users with the credentials for the data source can use it.



This post requires a moderate level of familiarity with AWS services. If you’re new to AWS, visit Getting Started with AWS and Training and Certification. These sites provide materials for learning how to design, deploy, and operate your infrastructure and applications on the AWS Cloud.


This post also assumes basic familiarity with IBM Cloud Pak for Data components and services. If you are new to IBM Cloud Pak for Data and Red Hat OpenShift, see Additional resources.


We also highly recommend reviewing the IBM Cloud Pak for Data Deployment Guide before following the steps in this post.

Required permissions: To create a platform-level connection, you must be an Editor or Administrator on Cloud Pak’s “Platform connections” catalog.

Connecting at the Platform Level

After logging in to the Cloud Pak for Data web client, open the navigation menu, select Data > Platform connections, then click New connection. The list of connection types that appears includes Amazon Redshift.

To connect to Amazon Redshift, you’ll need the following details:

  • Database
  • Hostname or IP address
  • Port number
  • Username and password
  • Credentials
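It can help to gather and sanity-check these values before opening the form. The following is a minimal sketch, not part of Cloud Pak for Data: the class name `RedshiftConnectionDetails` and its methods are hypothetical, and the host shown in the usage note is a placeholder. It assumes the cluster listens on Amazon Redshift's default port, 5439, unless you specify otherwise, and it formats the endpoint in the style of a Redshift JDBC URL purely as a readable summary.

```python
from dataclasses import dataclass, field

# Amazon Redshift listens on port 5439 by default; your cluster may differ.
REDSHIFT_DEFAULT_PORT = 5439


@dataclass(frozen=True)
class RedshiftConnectionDetails:
    """Hypothetical container for the values the connection form asks for."""

    database: str
    host: str
    username: str
    # Keep the password out of the auto-generated repr so it is not logged.
    password: str = field(repr=False)
    port: int = REDSHIFT_DEFAULT_PORT

    def validate(self) -> None:
        """Raise ValueError if a required value is blank or the port is invalid."""
        for name in ("database", "host", "username", "password"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port {self.port} is outside the range 1-65535")

    def jdbc_url(self) -> str:
        """Summarize the endpoint in the style of a Redshift JDBC URL."""
        return f"jdbc:redshift://{self.host}:{self.port}/{self.database}"
```

For example, `RedshiftConnectionDetails(database="dev", host="my-cluster.example.com", username="analyst", password="...")` followed by `validate()` would confirm that nothing is blank before you copy the values into the form.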

Enter these details in the connection form. You can also test your connection once it is created, using the button at the upper right.
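If the connection test fails, one common cause is that the database endpoint is not reachable over the network. The sketch below is an assumption-laden troubleshooting aid rather than anything Cloud Pak for Data provides: `can_reach` is a hypothetical helper that only checks whether a TCP connection to the host and port can be opened. It does not verify credentials or the database name, and it is only meaningful when run from a machine that shares the network path your Cloud Pak for Data cluster uses.

```python
import socket


def can_reach(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout.

    This checks network reachability only (security groups, routing, DNS).
    It says nothing about whether the username and password are valid.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A result of `False` suggests checking the cluster's security group rules and whether the endpoint is publicly accessible or reachable from your VPC, before revisiting the credentials.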

At this point, you have completed everything necessary for your database to be available as a platform connection, usable as a data source in Watson, Cognos, or other Cloud Pak for Data services.


To summarize, IBM and AWS have provided a unified way to deploy and use IBM Cloud Pak for Data, and we have shown here how you can easily connect it to Amazon Redshift. This solution enables your organization not only to simplify and automate how it collects, organizes, and analyzes data, but also to leverage the platform's artificial intelligence and machine learning capabilities.