Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables (MQTs) on Native Cloud

When:  May 21, 2024 from 11:00 AM to 12:00 PM (ET)

Sign up or log in through IBM Community to register for this event
Webex Link will be updated a day prior to the event here

A fundamental piece of the integration of watsonx.data and Db2 Warehouse is the new Datalake table. This new kind of table lets users define “external” Datalake tables from their Lakehouse, stored in open data formats (ODF). To access these tables, Db2 uses a Scheduler that splits the multiple objects that make up a Datalake table and distributes them across the Db2 nodes. Each time a query is executed, the corresponding Db2 node must access the external table from its source location, which incurs the full communication and data processing costs. To alleviate these costs, Db2 Warehouse 11.5.9 introduced full support for MQTs (Materialized Query Tables) over Datalake tables. With this support, you can create a column-organized MQT as a Native Cloud Object Storage (COS) MQT over a Datalake table and get the full performance benefit of both column-organized tables and Native COS tables. This webinar will present how to create these MQTs on ODF tables, offer some rules of thumb for designing them, and show some of the performance benefits obtained by using them in a workload.

Speakers: Daniel Zilio and John Poelman

Speaker Bios:

Daniel Zilio has been with IBM for 25 years and has been one of the fathers of physical database design methods within IBM, including developing methods to automatically select indexes, DPF distribution keys, as well as MQTs. He was also a senior member of the Db2 compiler team and has recently worked on introducing column organized MQTs on Native COS.
 
John Poelman has been testing the performance and scalability of relational tables stored in open data formats for nearly a decade, originally with a focus on Apache Hadoop-based offerings such as IBM Big SQL. John’s recent work focuses on optimizing the performance of the Datalake table and data virtualization capabilities integrated into the Db2 family of products.

#db2webinarseries

Location

Online Instructions:

Pricing Information

Registration Price
All Registrants: Free Event