watsonx.data


WatsonX.data and Netezza: Scaling Analytics in a Hybrid Cloud

By Owais Ahmad posted Wed May 21, 2025 02:42 AM

  

Scaling your Analytics with WatsonX.data and Netezza: A Hybrid Cloud Solution

Modernising your data warehouse while preserving your investments

 

Handling big data efficiently can be tedious. As someone who has worked with enterprise data systems for quite some time, I've seen this scenario play out countless times: your organization's data warehouse has served you well, but now it's struggling under the weight of ever-growing data volumes. Storage space is running low, query performance is degrading, and your analytics team is feeling the pain while costs keep climbing.

Are you facing the same struggle? Then WatsonX.data might be the solution you are looking for. I recently had the chance to explore a solution that addresses these exact pain points, and I'm excited to share what I learned.

The Best of Both Worlds with a Hybrid Strategy: WatsonX.data + Netezza

IBM's approach to this challenge is refreshingly practical: rather than forcing you to abandon your existing Netezza investment, IBM has created a seamless integration between Netezza and WatsonX.data, its cloud-native lakehouse solution.

The concept is elegantly simple: implement a tiered storage strategy where:

  • Critical, frequently accessed data remains on Netezza, leveraging its performance-optimised architecture
  • Cold, historical data moves to WatsonX.data, taking advantage of low-cost object storage
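
As a rough illustration of the tiering rule above, the sketch below assigns data to a tier by age alone. The function name, the tier labels, and the 2008 cutoff are assumptions made for this example; in practice the split would more likely follow access frequency and business criticality.

```python
def choose_tier(record_year: int, cutoff_year: int) -> str:
    """Return the tier for a record: cold data up to the cutoff year
    goes to the lakehouse, newer data stays on the warehouse."""
    if record_year <= cutoff_year:
        return "watsonx.data"  # cold, historical data on object storage
    return "netezza"           # hot, frequently accessed data


# Hypothetical example: everything up to and including 2008 is treated as cold.
years = [2005, 2008, 2009, 2024]
placement = {year: choose_tier(year, cutoff_year=2008) for year in years}
print(placement)
```
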

This hybrid approach delivers immediate benefits:

  • Free up precious storage on your Netezza appliance
  • Reduce processing congestion for your high-priority workloads
  • Maintain access to all your data through a unified query interface
  • Leverage open data formats and flexible query engines


Now let's look at a practical way to move data from Netezza catalogs into Iceberg tables.

Step 1. Make a connection between WatsonX.data and Netezza servers

To connect WatsonX.data to your Netezza Performance Server, go to the Infrastructure manager in the left-hand panel. Here you can see the architecture of all the data sources and engines that are already connected to WatsonX.data. To create a new connection, click Add component at the top right and select one of the many connectors that are available out of the box.


For example, to connect to your Netezza database, select Netezza under data sources, give the new connection a name, enter the database name exactly as it appears in Netezza, fill in the rest of the credentials, and test the connection.

You can also check the Associate catalog checkbox to create a catalog automatically. A catalog is an intermediate layer between your database and the engine; it holds the metadata that the engine uses to query the data. Give the catalog a name and then click Create.


Step 2. Executing queries in the WatsonX.data web console

Once the connection is created, hover over the catalog to associate it with an engine of your choice for querying.

Once the catalog is associated, go to the Data manager on the left to view your Netezza database and the sample table used in this demo. You can also query your Netezza data from the Query workspace on the left. For example, we can run a query to get a count of the rows in the table.
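
The count query mentioned above might look like the following sketch, which builds the SQL text in Python so it can be pasted into the query workspace. The catalog, schema, and table names (netezza_catalog, demo_schema, sample_table) are placeholders invented for this example; substitute the catalog name you chose in Step 1 and your own schema and table.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build the three-part catalog.schema.table name used to address a table."""
    return f"{catalog}.{schema}.{table}"


def count_query(table_name: str) -> str:
    """Return a SQL statement that counts the rows of the given table."""
    return f"SELECT COUNT(*) AS row_count FROM {table_name}"


# Hypothetical names, not taken from the original demo.
source = qualified_name("netezza_catalog", "demo_schema", "sample_table")
print(count_query(source))
```
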

Now let's say we would like to offload old partitions of this data, for example those from 2000 to 2008, to low-cost object storage. First we run a query to create a placeholder table for the data to be offloaded.

This query creates a schema named old data and a placeholder table named offloaded data 2000 to 2008. After that, we run another query to copy the rows whose sign year is between 2000 and 2008 into the newly created Iceberg table.
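
The original queries are not reproduced in this post, so the following is only a sketch of what the offload statements could look like. All identifiers are assumptions: the underscore spellings old_data, offloaded_data_2000_to_2008, and sign_year, the catalog names netezza_catalog and iceberg_data, and the source table demo_schema.sample_table. The placeholder table is created here with CREATE TABLE ... AS SELECT ... WITH NO DATA, which copies only the column definitions; depending on your setup, creating the schema may also require a storage location property.

```python
# Hypothetical identifiers; adjust to your own catalogs, schema, and table.
SOURCE = "netezza_catalog.demo_schema.sample_table"
TARGET_SCHEMA = "iceberg_data.old_data"
TARGET = f"{TARGET_SCHEMA}.offloaded_data_2000_to_2008"


def offload_statements(source, target_schema, target,
                       year_column, first_year, last_year):
    """Return the SQL statements for the offload, in execution order."""
    if first_year > last_year:
        raise ValueError("first_year must not be greater than last_year")
    year_filter = f"{year_column} BETWEEN {first_year} AND {last_year}"
    return [
        # 1. Schema that will hold the offloaded data.
        f"CREATE SCHEMA IF NOT EXISTS {target_schema}",
        # 2. Empty placeholder table with the same columns as the source.
        f"CREATE TABLE {target} AS SELECT * FROM {source} WITH NO DATA",
        # 3. Copy only the rows in the chosen year range.
        f"INSERT INTO {target} SELECT * FROM {source} WHERE {year_filter}",
    ]


statements = offload_statements(SOURCE, TARGET_SCHEMA, TARGET,
                                "sign_year", 2000, 2008)
for statement in statements:
    print(statement + ";")
```

Keeping the year filter in one variable makes it easy to reuse exactly the same condition later, when the offloaded rows are verified and removed from the source.
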


To make sure the data has been copied, we can query the newly offloaded table to get a count of the records. We can then run a similar query to delete the records that match the same condition from the original table. If you would like to delete a whole table, you can also do that, either by running the drop query template from the query workspace or by going to the Data manager and dropping the table from there.
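
One cautious way to sequence the verification and the delete is sketched below: compare the number of matching rows in the source with the number of rows in the offloaded table, and only produce the DELETE statement when they agree. The table and column names are the same hypothetical ones used for illustration, and the row counts would come from running the count queries yourself; the figures here are made up.

```python
def delete_statement(source, year_column, first_year, last_year):
    """Return a DELETE that removes the offloaded year range from the source."""
    return (f"DELETE FROM {source} "
            f"WHERE {year_column} BETWEEN {first_year} AND {last_year}")


def safe_to_delete(source_matching_rows: int, offloaded_rows: int) -> bool:
    """Allow the delete only if every matching source row was copied."""
    return offloaded_rows > 0 and source_matching_rows == offloaded_rows


# Made-up counts standing in for the results of the two count queries.
source_rows, copied_rows = 1200, 1200
if safe_to_delete(source_rows, copied_rows):
    print(delete_statement("netezza_catalog.demo_schema.sample_table",
                           "sign_year", 2000, 2008))
```
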

You can now access both the data that resides in your Netezza environment and the data that's been moved to the data lake through WatsonX.data. This unified access allows you to run powerful queries that join or merge data from both sources, giving you the best of both worlds: Netezza's performance for critical workloads and WatsonX.data's scalability for historical information.
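
A query that spans both sources could be as simple as a UNION ALL over the table that stayed on Netezza and the offloaded Iceberg table. The sketch below builds such a query; the table names and the sign_year column are assumptions carried over for illustration, not names from the original demo.

```python
def combined_count_by_year(hot_table: str, cold_table: str, year_column: str) -> str:
    """Return SQL that counts rows per year across both storage tiers."""
    return (
        f"SELECT {year_column}, COUNT(*) AS row_count\n"
        f"FROM (\n"
        f"    SELECT {year_column} FROM {hot_table}\n"
        f"    UNION ALL\n"
        f"    SELECT {year_column} FROM {cold_table}\n"
        f") AS all_rows\n"
        f"GROUP BY {year_column}\n"
        f"ORDER BY {year_column}"
    )


# Hypothetical table names for the hot (Netezza) and cold (Iceberg) tiers.
query = combined_count_by_year(
    "netezza_catalog.demo_schema.sample_table",
    "iceberg_data.old_data.offloaded_data_2000_to_2008",
    "sign_year",
)
print(query)
```
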

For improved productivity, you can save frequently used queries as worksheets for later reuse. This feature is particularly valuable for regular reporting and analysis tasks that span both environments.

This hybrid approach not only solves immediate storage and performance challenges but also positions your organization for future analytics capabilities. By embracing this architecture, you're creating a foundation that preserves your existing investments while opening the door to advanced analytics, machine learning integration, and AI-powered insights.

Whether you're just beginning your data modernisation journey or looking to optimise an existing environment, the WatsonX.data integration offers a practical, cost-effective path forward that delivers immediate benefits without disrupting your critical business operations.


#watsonx.data #PrestoEngine #Catalog #Bucket #Db2Connector-Presto #NetezzaPerformanceServerConnecterPresto #HiveMetastore
