This Db2 for z/OS News from the Lab blog entry was originally published on 2018-03-15.
By Terry Purcell and Paul McWilliams.
In Db2 12 we built on the profile-based statistics collection that we introduced in Db2 10 and the externalization of missing statistics that the optimizer identifies that we introduced in Db2 11. In Db2 12, we enabled the optimizer to automatically update the statistics profiles to collect the recommended statistics. The result is that you can simply use RUNSTATS, or inline statistics, with USE PROFILE to collect the statistics that the Db2 optimizer recommends. This Db2 12 enhancement frees you from the burden of deciding which statistics to collect.
However, before migrating to Db2 12, you should prepare by making sure that you have a simple and clean base of statistics in the catalog. This preparation can be done in Db2 10 or Db2 11. When Db2 creates statistics profiles, it inherits the existing statistics already in the catalog. If you have old statistics, Db2 can suddenly start collecting those, even if they are unneeded and were only collected long ago. We've written before about problems that old and stale database statistics can cause for the query optimization process, and it's always been good practice to clean them up. It becomes even more important when you are likely to incur the added costs of standardizing their collection.
Here's an example: assume that for your table T1 the SYSCOLDIST catalog table contains some old HISTOGRAM statistics on columns C3 and also on C5, and some old COLGROUP statistics on the combined columns C2 and C4. These are old statistics you are not routinely collecting today. Add to this that you regularly run the following job to collect statistics:
RUNSTATS TABLE(T1) INDEX(ALL)
If the Db2 optimizer also recommends that you collect FREQVAL statistics on C1, then when Db2 creates a new statistics profile, it will include the following options:
TABLE(T1) INDEX(ALL)
COLGROUP(C1) FREQVAL COUNT 10
COLGROUP(C2, C4)
COLGROUP(C3) HISTOGRAM NUMQUANTILES 100
COLGROUP(C5) HISTOGRAM NUMQUANTILES 100
If you collect statistics with USE PROFILE, you begin collecting all of these statistics, including those that you have not routinely collected recently, and your collection costs will increase. Plus, after you first collect statistics with this profile, all of the statistics now have the same STATSTIME values, and you cannot differentiate the old ones that you were not regularly collecting before.
We suggest that you use the following SQL query to identify rows in the SYSCOLDIST catalog table for multi-column and histogram statistics that are older than the most recent statistics collection, and-single column frequency statistics that are more than a year old:
SELECT CD.TYPE, CD.NUMCOLUMNS, CD.TBOWNER,
CD.TBNAME, CD.NAME, MIN(CD.STATSTIME) AS MINSTAT,
COUNT(*) AS OCCURANCES
FROM SYSIBM.SYSCOLDIST CD
WHERE CD.STATSTIME < CURRENT TIMESTAMP - 1 MONTH
AND (CD.TYPE IN ('C', 'H')
OR CD.NUMCOLUMNS > 1
OR CD.STATSTIME < CURRENT TIMESTAMP - 1 YEAR)AND NOT EXISTS(SELECT 1
FROM SYSIBM.SYSINDEXES I
WHERE I.TBCREATOR = CD.TBOWNER
AND I.TBNAME = CD.TBNAME
AND I.STATSTIME > '0001-01-01-00.00.00.000000'
AND CD.STATSTIME BETWEEN I.STATSTIME - 8 DAYS
AND I.STATSTIME + 8 DAYS)
AND NOT EXISTS(SELECT 1
FROM SYSIBM.SYSTABLES T
WHERE T.CREATOR = CD.TBOWNER
AND T.NAME = CD.TBNAME
AND T.STATSTIME > '0001-01-01-00.00.00.000000'
AND CD.STATSTIME BETWEEN T.STATSTIME - 8 DAYS
AND T.STATSTIME + 8 DAYS)
GROUP BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
ORDER BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
WITH UR;
You can then use DELETE statements with appropriate WHERE clauses to remove the unneeded statistics from SYSCOLDIST. You might want to first insert the records that you plan to delete into a separate table, in case you encounter access path problems after the DELETE, and you need to restore them.
An alternative is to use RUNSTATS with RESET ACCESSPATH (introduced in Db2 11) for the identified objects, to remove entries from SYSCOLDIST and reset all statistics for the objects to -1. If you use this approach, be sure to collect statistics again for the objects, with your standard RUNSTATS options. However, because Db2 might choose new access paths while the statistics are set to the default values (-1), using DELETE to remove the unneeded statistics from SYSCOLDIST is possibly the less disruptive alternative.
Read more from Terry about statistics in Db2:
Terry Purcell is the lead designer for the DB2 for z/OS Optimizer at IBM's Silicon Valley Lab and is recognized worldwide for his expertise in query optimization and performance with DB2 for z/OS. Paul McWilliams is a technical writer for Db2 for z/OS.
#Db2forz/OS#db2z/os#Db2Znews