
Significant Performance Improvements with AI Join Cardinality Estimates in Db2 version 12.1.3

By Lori Strain, Tam Minh Tran and Nicholas Ostan 

 

We are excited to announce another significant advancement in the Db2 AI Query Optimizer: join cardinality prediction. This new capability allows the optimizer to accurately predict the cardinalities of equality joins between tables, accounting for correlation between local and join predicates. While mileage may vary, tests of the AI Query Optimizer in Db2 version 12.1.3 have shown significant query performance improvements, with some individual queries executing 100 times faster! AI improves the overall execution time of all 99 queries in the TPC-DS benchmark by 17%.

 

In our previous blog posts and videos (1, 2, 3, 4, 5, 6) we've explored how the Db2 AI Query Optimizer revolutionizes database query optimization using neural networks. We've discussed our approach of gradually infusing AI into the optimizer and our goals, including:

  • reliable and stable query performance,
  • minimizing resource requirements and keeping query optimization simple,
  • the importance and challenges of cardinality estimation, and how neural networks are used to improve cardinality estimation for single tables after applying local predicates.

In this blog post, we'll review the concept of cardinality estimation and why it is critical to query performance. We'll then delve into how the Db2 AI Query Optimizer significantly improves join cardinality predictions, access plans and thus query performance without needing to train any new models, while also reducing the need for query tuning.

Cardinality Estimation 

Optimal access plans are key to query performance and resource consumption. An access plan is a sequence of operators, such as index scans, joins or group bys, used to compute the results of a query. Cardinality is the number of rows produced by a query or an operator in an access plan. AI single table cardinality estimation has improved the optimizer's ability to select efficient join orders; AI join cardinality estimation takes this to the next level. For example, it is more efficient to execute joins which produce fewer rows before joins which generate more rows. Join cardinality estimates are also key to making decisions regarding operations further up the plan. Query performance is significantly improved when a large number of rows produced by a join are not repartitioned for a subsequent join or used to build hash tables for subsequent joins which cannot be contained in memory.

 

The Traditional Approach 

Traditional query optimizers rely on basic statistics and simplifying assumptions for join cardinality estimation. The simplifying assumptions typically involve: 

  1. Independence: Assuming that column values in different tables are independent of each other.
  2. Uniformity: Assuming a uniform distribution of values.
  3. Inclusion: Assuming that all of the values in the join column of the table with the smaller number of distinct values appear in the join column of the table with the larger number of distinct values.

A traditional optimizer might estimate the number of rows (cardinality) that satisfy the local predicates on the tables to be joined and then apply a formula based on distinct values in the join column to estimate the join cardinality such as: 

cardinality of one table after applying local predicates
x cardinality of the second table after applying local predicates
x 1 / maximum number of distinct values in the join column

An example:

SELECT * FROM Orders O JOIN Customers C ON O.custid = C.custid
WHERE O.orderdate >= '2022-01-01' AND C.region = 'NA'

Relevant Statistics

|           | cardinality | min         | max          | # distinct values |
|-----------|-------------|-------------|--------------|-------------------|
| orders    | 54,995      |             |              |                   |
| customers | 1,000       |             |              |                   |
| orderdate |             | Jan 1, 2015 | Dec 31, 2024 |                   |
| region    |             |             |              | 2                 |
| custid    |             |             |              | 1,000             |

Three of the ten years, or 30% of the 54,995 rows (16,498.5 rows), would be expected to satisfy the local predicate on orders. Similarly, the cardinality estimate after applying the local predicate on region would be 500, and the join cardinality estimate would be:

(54,995 x 0.3) x (1,000 x 0.5) x 1/1,000 = 8,249

 

The Issue 

 The traditional approach misses critical insights: 

  • Customers in the NA region might place orders at different frequencies than customers in other regions 
  • Recent orders might have a different customer distribution than historical orders 
  • The combination of these predicates might create a distribution pattern that isn't captured by statistics for individual tables. 

The actual cardinality of the query is 25,524. The traditional optimizer estimate is off by 68%.   
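For reference, the actual cardinality quoted here is simply the row count of the example query, which you could verify with a plain count (assuming the example Orders and Customers tables exist in your database):

SELECT COUNT(*)
FROM Orders O JOIN Customers C ON O.custid = C.custid
WHERE O.orderdate >= '2022-01-01' AND C.region = 'NA';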

 

The AI Approach 

By leveraging AI, Db2 can understand more complex relationships between data across tables, such as correlation and non-uniformity, and therefore compute better cardinality estimates. The AI optimizer's cardinality estimate for the example above is 25,859, which is off by about one percent.

Db2 version 12.1.3 does not require any new AI models to be trained. It uses the single table cardinality models available since version 12.1 to populate a histogram for each of the tables being joined. The histograms represent the distribution of data in each table after applying the local predicates. The fact that the histograms represent the data distribution after the application of the local predicates is key to understanding the degree of correlation between the join predicate and the local predicates.

The following histogram is generated for the orders table in the example above. 

The y-axis of the histograms represents the number of rows from the tables that satisfy the local predicates and have values in the join column that fall between the lower and upper bounds of the histogram bucket boundaries on the x-axis. The cardinality of each histogram bucket is the AI cardinality estimate for the following single table query, where x is the customer ID at the lower bound of the histogram bucket and y is the customer ID at the upper bound of the bucket.

Select * from orders
where orderdate >= '2022-01-01'
and custid between x and y

A second histogram is generated for the customer table. The intersection of the two histograms can then be computed, generating insight into the relationships between data in the orders table and data in the customer table. The result is a histogram for the join distribution, and the total across all of its buckets is the estimated join cardinality. The AI cardinality estimate is 25,859, which is only about one percent above the actual cardinality of 25,524.
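To make the intersection idea concrete, here is a rough SQL sketch of the computation for the Orders and Customers example. It is illustrative only: Db2 populates the histogram buckets from its single table AI models rather than by scanning the tables, and the ten equal-width custid buckets and the simple per-bucket formula are assumptions made for this sketch.

-- Conceptual sketch only, not Db2 internals: bucket each table's qualifying rows
-- on the join column, then combine the per-bucket counts.
WITH o_hist AS (
  SELECT custid / 100 AS bucket, COUNT(*) AS o_rows   -- orders histogram after its local predicate
  FROM Orders
  WHERE orderdate >= '2022-01-01'
  GROUP BY custid / 100
),
c_hist AS (
  SELECT custid / 100 AS bucket, COUNT(*) AS c_rows   -- customers histogram after its local predicate
  FROM Customers
  WHERE region = 'NA'
  GROUP BY custid / 100
)
SELECT SUM(o_rows * c_rows / 100.0) AS est_join_card  -- roughly 100 distinct custid values per bucket
FROM o_hist
JOIN c_hist ON o_hist.bucket = c_hist.bucket;

Each common table expression plays the role of one of the histograms described above, and the final SUM corresponds to totaling the buckets of the join distribution.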

For a very visual and dynamic technical video, see the IBM Db2 AI Query Optimizer video by Nicholas Ostan on YouTube. 

The next section summarizes our results for tests comparing the accuracy of the new AI join cardinality estimates to the traditional optimizer's estimates.

Improved Cardinality Estimates 

The following box plots depict the errors associated with cardinality estimates for thousands of automatically generated join queries. Each plot compares the error in the cardinality estimates for the AI optimizer on the left with the traditional optimizer on the right. The middle box shows the error for the traditional optimizer when using automatically generated column group statistics, which capture correlation information for pairs of columns by recording the number of distinct combinations of values. An error of 0 signifies a perfect cardinality estimate, whereas an error of 1 means the estimate is one order of magnitude different from the actual (roughly, the error is log10 of the estimate divided by the actual). The error is negative if the estimate is too low and positive if it is too high.

The plots represent joins on tables in the TPC-DS schema: the one on the left represents joins between the item and store_sales tables, and the one on the right joins between the catalog_sales and date_dim tables. The join predicates for each pair of tables are always the same, while the local predicates are randomly generated on one or more of the columns defined in the model and can include equality, BETWEEN and IN predicates. Note that the error in the AI estimates is tightly clustered around 0, while even with column group statistics the traditional estimates are often much further from the actual cardinality and tend to be too low. The error in the traditional estimates also varies significantly more from one query to the next. The impact of cardinality estimation errors on plan selection, and thus performance, is amplified when the relative error for two joins in the same query differs significantly, e.g. when a query joins store_sales to both item and date_dim and the cardinality estimate for the join between store_sales and date_dim is much lower relative to the actual cardinality than the estimate for the join between store_sales and item.

[Box plots: cardinality estimation errors for the ITEM x STORE_SALES and DATE_DIM x CATALOG_SALES joins]

The next sections explain in detail how more accurate join cardinality estimates lead to superior plans and significant improvements in performance. 

Improved Performance   

TPC-DS runs 17% Faster 

The total execution time for all 99 standard queries in the 3TB TPC-DS benchmark is reduced by 17 percent using AI cardinality estimation in Db2 version 12.1.3. This includes improvements of 70 to almost 90 percent for queries 22, 72 and 84.

SAP BW Benchmark Query Execution Times Improve

Internal testing showed that the AI Query Optimizer improved the execution times of 4 of the 22 queries by 44 to 67% and did not cause any queries to regress.

Superior Cardinality Estimates, Plans and Performance

Improved cardinality estimates result in better plans which deliver superior performance. This section explains in detail how improved AI cardinality estimates impact the plan and performance of TPC-DS query 84. Similar improvements in join cardinality estimates result in partial early aggregation (PEA) and early-out join plans which improve the performance of queries 22 and 72.

The following are annotated fragments of the traditional and AI plans for TPC-DS query 84: 

 

Select c_customer_id as customer_id,
  coalesce(c_last_name, '') || ', ' || coalesce(c_first_name, '') as customername
From customer,
  customer_address,
  customer_demographics,
  household_demographics,
  income_band,
  store_returns
where
  ca_city = 'Greenfield' and
  c_current_addr_sk = ca_address_sk and
  ib_lower_bound >= 11660 and
  ib_upper_bound <= 11660 + 50000 and
  ib_income_band_sk = hd_income_band_sk and
  cd_demo_sk = c_current_cdemo_sk and
  hd_demo_sk = c_current_hdemo_sk and
  sr_cdemo_sk = cd_demo_sk
order by c_customer_id fetch first 100 rows only

Traditional Plan:

AI Plan:

The following table summarizes the traditional and AI estimates for the cardinality of key joins along with the actual cardinality and relative errors.

 

|                                     | HSJOIN(12) | HSJOIN(9)   |
|-------------------------------------|------------|-------------|
| Actual                              | 30,233     | 1.3e7       |
| Traditional estimate                | 3,196      | 1.38998e+06 |
| Traditional estimate as % of actual | 11%        | 11%         |
| AI estimate                         | 22,721     | 9.88053e+06 |
| AI estimate as % of actual          | 75%        | 76%         |

 

Unlike the traditional optimizer, the AI optimizer accounts for the impact of the local customer address city predicate on the selectivity of the join predicate between customer_address and customer. The error in the traditional estimate for HSJOIN(12) propagates to HSJOIN(9). As a result, the traditional plan contains two joins (HSJOIN(9) and HSJOIN(8)) with more than 10 million rows on the probe side. The AI optimizer recognizes that HSJOIN(9) produces significantly more rows, so it avoids this join. Instead, the AI optimizer chooses to join store_returns last, as shown below. Unlike the traditional plan, the AI plan performs only a single join with more than 10 million rows on the probe side. HSJOIN(11) in the AI plan is small and therefore quick relative to the HSJOINs in the original plan. The following chart summarizes the join costs for the AI and traditional plans.

 

|                                  | Join in trad plan | Trad plan cost | Join in AI plan | AI plan cost |
|----------------------------------|-------------------|----------------|-----------------|--------------|
| Join with STORE_RETURNS          | HSJOIN(9)         | 486.4          | HSJOIN(8)       | 454.37       |
| Join with household_demographics | HSJOIN(8)         | 49.9           | HSJOIN(11)      | 0.15         |
| Total join cost                  |                   | 536.3          |                 | 454.52       |
| TOTAL PLAN COST                  |                   | 36,362         |                 | 36,280       |

100 Times Performance Improvements 

With AI cardinality estimates, some queries execute 100 times faster. For example, the following custom query, run on a 3TB TPC-DS database, runs about 120 times faster using AI: it takes 5.12 seconds with AI compared to 623.11 seconds without AI.

select
  c_customer_id as customer_id,
  coalesce(c_last_name, '') || ', ' || coalesce(c_first_name, '') as customername
from
  customer,
  customer_address,
  customer_demographics,
  household_demographics,
  income_band,
  store_returns,
  inventory,
  date_dim,
  item
where
  ca_city = 'Greenfield' and
  c_current_addr_sk = ca_address_sk and
  ib_lower_bound >= 11660 and
  ib_upper_bound <= 11660 + 50000 and
  ib_income_band_sk = hd_income_band_sk and
  cd_demo_sk = c_current_cdemo_sk and
  hd_demo_sk = c_current_hdemo_sk and
  sr_cdemo_sk = cd_demo_sk and
  sr_item_sk = i_item_sk and
  inv_item_sk = i_item_sk and
  inv_date_sk = d_date_sk selectivity 2.818459e-05 and
  d_month_seq between 1193 and 2000
group by c_customer_id, c_last_name, c_first_name
order by c_customer_id
fetch first 100 rows only

 

Portions of the traditional and AI plans are shown below. There are two differences between the plans. The difference at the bottom of the plan, shown in less detail, greyed out and in a smaller font, is the same as the plan difference for query 84 discussed above, and it reduces the execution time by roughly 40%. The more significant plan difference is the reversal of the probe and build sides of the topmost hash join. Swapping the probe and build sides of HSJOIN(13) on its own reduces query execution time by 98% without any other plan changes.

Traditional Plan:

AI Plan:

The following table compares the traditional and AI estimates with the actual cardinalities for the joins shown in the plan fragments above. Even though the AI estimate is still significantly too low, it is improved enough to result in a far superior plan.

 

|                                     | Inv x DD join | HSJOIN(16/30) | HSJOIN(15)  |
|-------------------------------------|---------------|---------------|-------------|
| Actual                              | 9.25e+07      | 2e+07         | 5.18597e+10 |
| Traditional estimate                | 4.48935e+07   | 4.75945e+05   | 1.09795e+08 |
| Traditional estimate as % of actual | 49%           | 2%            | 0.2%        |
| AI estimate                         | 1.15741e+08   | 3.38322e+06   | 9.75508e+08 |
| AI estimate as % of actual          | 125%          | 17%           | 1.9%        |

 

The key difference between the plans is that the AI plan is early out and the traditional plan is not. HSJOIN(13) can only be early out if the customer table is accessed on the outer, because the only columns referenced above the join are from the customer table. The error in the cardinality estimate for HSJOIN(13) prevents the traditional optimizer from recognizing the true cost of the non-early-out join. Like the PEA in query 22, the early-out join improves runtime performance by reducing the number of rows output by the join and processed by the subsequent group by. The following table summarizes the differences between the traditional and AI plans. Note:

 

  1. the large difference between the traditional cost estimate (71,814) and the AI cost estimate (405,831) for the traditional plan. Cost estimates are heavily influenced by cardinality estimates, so like the traditional cardinality estimate, the traditional cost estimate is far too low. For this reason, the traditional cost estimate for one plan should not be compared to the AI cost estimate for another plan to determine which will execute faster.
  2. the large difference in the AI cost estimates of the group by in the two plans (333,180 vs 7).

 

 

|                                    | Trad plan (trad estimate) | Trad plan (AI estimate) | AI plan (AI estimate) |
|------------------------------------|---------------------------|-------------------------|-----------------------|
| 1. Customer on outer of HSJOIN(13) | NO                        |                         | YES                   |
| 2. HSJOIN(13) early out?           | NO                        |                         | YES                   |
| 3. Rows output by HSJOIN(13)       | 1.09e+08                  | 9.76e+08                | 422,902               |
| 4. Cost of HSJOIN(13)              | 517                       | 2,491                   | 951                   |
| 5. Cost of GRPBY                   | 585                       | 333,180                 | 7                     |
| 6. TOTAL COST                      | 71,814                    | 405,831                 | 72,119                |

No New Models Required 

Db2 version 12.1.3 reuses the single table cardinality models already available in version 12.1 prior to mod pack 3 to improve join cardinality estimates, access plans and query performance. Predicting join cardinalities without any new models keeps the resources required by the Db2 AI Query Optimizer low. Delivering significantly improved query performance and minimizing resources are two of the goals of the Db2 AI Query Optimizer as stated in the introductory blog in this series.

Minimal Tuning Required 

AI vs Statistical Views 

In addition to significantly improving query performance while minimizing the resources required, another goal of the Db2 AI Query Optimizer is to minimize the query tuning required to get stable performance. Traditionally, statistical views have been used to improve query performance. However, defining such views and collecting statistics on them is a time consuming process. Without statistical views, the AI total execution time for all 99 queries in our 3TB TPC-DS test was 5% faster than the traditional execution time with statistical views, as depicted in the following charts. The results with the AI optimizer are shown on the right, compared with the traditional optimizer with statistical views in the middle; the bar on the left represents the untuned traditional optimizer discussed above. Not only can AI eliminate the need to spend time defining and implementing statistical views, it can also improve the performance of workloads which have already been tuned using statistical views. In our tests, AI improved the performance of queries 64 and 84 by 63 and 70 percent respectively when compared with the performance of the traditional optimizer with statistical views.

 

Furthermore, the extra 5% execution time does not include either the roughly 14 hours it took to collect statistics on the views or the approximately 40 hours it took a database expert to define them. The time required to collect statistics on statistical views often prevents customers from using them at all. In comparison, AI delivers better performance and takes only minutes to train: the AI training time for the 3TB TPC-DS database used in our tests was 3 minutes.
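To give a sense of where that effort goes, the following is a minimal sketch of the work involved for just one statistical view. The view definition, schema name and column choices here are hypothetical; the real views used in these tests were tailored to the TPC-DS workload.

-- Hypothetical example of one statistical view (illustrative only;
-- the schema name and selected columns are placeholders)
CREATE VIEW myschema.sv_cust_addr AS
  SELECT ca.ca_city, c.c_customer_id
  FROM customer c, customer_address ca
  WHERE c.c_current_addr_sk = ca.ca_address_sk;

-- Register the view as a statistical view
ALTER VIEW myschema.sv_cust_addr ENABLE QUERY OPTIMIZATION;

-- Collect distribution statistics on it (a CLP command; this is the step that
-- accounted for the ~14 hours across all the views used in the test)
RUNSTATS ON TABLE myschema.sv_cust_addr WITH DISTRIBUTION;

Repeating this for every join worth capturing is what drives the 40 hours of design time and 14 hours of RUNSTATS reported above.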

 

|                         | Statistical views        | AI                             |
|-------------------------|--------------------------|--------------------------------|
| Effort to implement     | 40 hours                 | 0                              |
| Runstats execution time | 14 hours (for statviews) | 3 minutes (for model training) |
| TOTAL                   | 54 hours                 | 3 minutes                      |

 

Note:

  1. The runstats times in the table above do not include the time taken to collect the regular statistics on the TPC-DS tables (without model training), which is 102 minutes for both the statistical view and AI scenarios.
  2. The statistical views used were specifically tailored to the TPC-DS workload. Less time would be required to define general statistical views, which might also be more useful as workloads evolve or for ad-hoc queries. However, general statistical views would have included significantly more tables, and therefore more time would have been required to collect statistics on them.

AI vs Column Group Statistics 

At the time of writing this blog, multiple equality join predicates are not yet handled by the AI models. However, once they are, collecting column group statistics (CGS) for join cardinality estimation will also no longer be required. Note that column group statistics are already not needed for correcting local predicate estimates as of Db2 version 12.1.
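For comparison, collecting column group statistics today involves a RUNSTATS command along the following lines; the schema name is a placeholder, and the column pair is taken from the earlier Orders example (a local predicate column grouped with the join column).

-- Illustrative only: capture the number of distinct (orderdate, custid) combinations
RUNSTATS ON TABLE myschema.orders ON ALL COLUMNS AND COLUMNS ((orderdate, custid));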

Try It Today 

Experience the enhanced capabilities of the AI Query Optimizer by upgrading to Db2 version 12.1.3. If you're already using the Db2 AI Query Optimizer, you'll automatically benefit from the join cardinality enhancement: no additional configuration or tuning is required, and the feature works seamlessly with your existing workloads. If you haven't yet enabled the Db2 AI Query Optimizer, now is the perfect time to do so. To enable AI query optimization, simply set the database configuration parameter AUTO_AI_OPTIMIZER to ON. The addition of join cardinality prediction makes the Db2 AI Query Optimizer even more powerful and effective at improving query performance. These improvements translate directly to business benefits: faster insights, lower infrastructure costs, and a reduced need for performance tuning expertise.
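For example, assuming a database named MYDB, the configuration change is a single command from the Db2 command line processor:

-- MYDB is a placeholder database name
UPDATE DB CFG FOR MYDB USING AUTO_AI_OPTIMIZER ON;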

References 

  1. Zuzarte, Calisto. The AI Query Optimizer in Db2. IBM TechXchange Community. 2024-05-24.
  2. Zuzarte, Calisto & Corvinelli, Vincent. The AI Query Optimizer Features in Db2 Version 12.1. IBM TechXchange Community. 2024-08-15.
  3. Finnie, Liam. The Neural Networks Powering the Db2 AI Query Optimizer. IBM TechXchange Community. 2024-10-16.
  4. Frendo, Brandon. Simplify Query Performance Tuning with the Db2 AI Query Optimizer. IBM TechXchange Community. 2024-12-10.
  5. Finnie, Liam. Db2 AI Query Optimizer: Expanded Predicate Support and Improved Model Training. IBM TechXchange Community. 2025-09-20.
  6. IBM. IBM Db2 AI Query Optimizer. YouTube. 2025-10-23.

About the Authors 

Lori Strain is an expert in Db2 Optimization. She has worked in database optimization since joining IBM in 1989, was instrumental in the development of the original cost model for Db2 version 2.0 and is very excited to be part of the team working to build the next generation of database optimization technology based on AI. 

Tam Minh Dai Tran worked in the Db2 Optimizer from 1999 to 2017. Since then, she has been working in the Db2 Query Performance team on various projects in the areas of runtime and the query compiler.

Nick Ostan is a software developer in the Db2 AI Optimizer group. Nick studied biochemistry and biophysics at the University of Toronto before transitioning into data science for biotechnology/drug discovery, and then ultimately towards pure deep learning. 
