Db2

 View Only
Expand all | Collapse all

Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables on Native Cloud Webinar

  • 1.  Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables on Native Cloud Webinar

    Posted Tue May 07, 2024 02:11 PM
    Edited by Luis Dorantes Tue May 21, 2024 01:55 PM

    If you missed the webinar, you can find the recording here.

    Join us for an insightful discussion on 21 May, 2024 at 11:00 AM ET to get an overview of Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables on Native Cloud. Daniel Zilio and John Poelman will present how to make these MQTs on ODF tables, some rules of thumb to make them, and show some performance benefits that were obtained by using them in a workload.

    Click here to register and add the event to your calendar.

    Where is the link to join? Check your email the day before the event!

    How to get the most out of this webinar:

    • Explore the Db2 Community.

    • Ask questions for the presenter by replying to this discussion post before, during and after the webinar. Daniel Zilio and John Poelman will be happy to answer your questions and share insights.

    • Check out upcoming Db2 events here.

    We look forward to seeing you there!



    ------------------------------
    Medha Parekh
    ------------------------------



  • 2.  RE: Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables on Native Cloud Webinar

    Posted Tue May 21, 2024 02:00 PM

    Hello Everyone! 

    Just wanted to say  thank you to everyone who joined us on Today's webinar, we appreciate your participation today!

    If you've got any questions about what we covered, feel free to drop them here in the thread. Our speakers @DANIEL ZILIO and @John Poelman are ready to answer any questions.

    If you missed the webinar, you can find the recording and Presentation Material here.

    Looking forward to hearing from all of you, 



    ------------------------------
    Luis Dorantes | Data Community Manager
    IBM
    ------------------------------



  • 3.  RE: Accelerating your Datalake tables with Db2 Warehouse Materialized Query Tables on Native Cloud Webinar

    Posted Tue May 21, 2024 03:30 PM

    There were a number of questions asked while John and I presented. Let me out the questions and answers here:

    Question 1. In the slides, it was shown that EXTENSIZE 4 was used. Was that meant to be recommended or just the default?

         Answer 1: EXTENTSIZE 4 happened to be the size we used but the user could change to whatever value they want.

    Question 2. Were the MQTs created in cloud or object storage in the presentation. 

         Answer 2: For this presentation, the MQTs were created on a tablespace that was on a storage group on top of native cloud object store. The MQTs could be created on block storage but storage is far cheaper and performs efficiently when storing MQTs on native cloud object store in Db2.

    Question 3: MQT refresh will not affect the base table from being used in queries, but will the MQT only be used when refresh is complete???

         Answer 3: It is correct that when the MQT is not available, the Datalake table is not prohibited from being used. When we refresh a new MQT, we only activate it after the refresh and will be available at that point.

    Question 4: In the presentation, it was stated that the MQT can cache Datalake table information, but we are not referring to caching in memory and instead caching on disk, correct?

        Answer 4: Correct. MQT caching per se is the case of taking Datalake table information and storing it into a MQT table which in turn is stored as a DB2 native cloud object store table, which when picked to be used by the query compiler for a query's execution, will access the MQT and make use of the NVMe cache and the Db2 memory buffer pool which would not be used if the query access the Datalake table directly.

    Question 5: Is there a plan to have db2advis (The Db2 Design advisor) recommend these column organized MQTs referring to Datalake tables and stored in DB2 native cloud object store?

       Answer 5: At the moment, db2advis only recommends MQTs on DB2 local base tables and not on Datalake tables. There is an option in db2advis where one can input the tablespace that would be used as the recommended tablespace to store the output MQTs, which allows a user to input a tablespace on native cloud object store (NCOS). This MQTs referring to local Db2 tables could be recommended to be stored on NCOS. There is no tooling yet in Db2 to recommend MQTs referring to Datalake tables. Providing tooling to make these recommendations is under discussion as a possible future consideration. There is a system: IBM Data Virtialization that has a cache advisor in which it recommends MQTs on Datalake tables and recommends the MQTs to be external (Hadoop) tables themselves, but these are recommended to be stored in the non-Native cloud object store at present.

    Question 6: If the base table is a column-organized table or row-organized table, would the MQT gain be the same?

       Answer 6: For non-Datalake base tables, if you make an MQT on the column-organized table, the MQT needs to be column-organized and if the base table is row-organzied, then the MQT needs to be reorganized. Executing a query with columnar access compared to row-orgainzed access is different as the columnar data engine allows for synopsis table filtering, only load only columns required by the query, has a different level of compression, and used optimized operators in the columnar data engine (CDE), and the Db2 query compiler accounts for these differences in its costing so the performance benefit for a column-organized MQT compared to a plan using a row-organized MQT will differ as a result.

    The query compiler haand execution engine 



    ------------------------------
    DANIEL ZILIO
    ------------------------------