Introduction
Application developers must code efficient SQL and understand how to optimize SQL in order to develop efficient applications. But it is not just the programmer, but also the DBA who must understand SQL optimization. It is a joint responsibility and frequently requires robust tools to help code, modify, and optimize SQL effectively.
Database Administrators know that having accurate database statistics is an important component needed to achieve the best performance for any SQL query. The statistics collected and saved can be used to help estimate the costs for a plethora of possible access plans for each query and ultimately help choose the most efficient access path.
Runstats and IBM Data Virtualization Manager for z/OS (DVM)
To collect and store this key object statistical information about any virtual table and on all the data sources supported, DVM provides its own Runstats feature.
Why run Runstats?
But why gather all this statistical information? The DVM SQL engine is very powerful, but its ability to choose the best access path can be significantly improved by having current statistics giving it the best chance of delivering the best query performance. Further, the statistics gathered can form a vital component in assisting in analyzing or troubleshooting any SQL query.
What statistics are collected by Runstats?
The following statistics are collected by Runstats:
- The total number of values in a particular column
- The total number of distinct values in a column
- The percentage distribution of unique values
Where are the statistics gathered by Runstats stored?
The statistics are collected and stored in an internal format in a record within the global variable NODE : GLOBAL9 , SUBNODE; STAT for each virtual table that Runstats has been executed on.
To display the statistics, you must use the DVM RUNSTATS DISPLAY option as they are stored in an internal format and cannot be displayed by browsing the global variable above.
Executing DVM Runstats
Runstats can be executed either through the DVM Studio or in batch using the DVM DS-Client. You can use the following Runstats command syntax to collect the statistics:
SELECT RUNSTATS (‘<Virtual_Table_Name>’, ‘RUNSTAT-OPTIONS’;
Note: Use of single quote marks to delimit the virtual table name and runstats option.
The DVM Runstats options available to you are:
- RESET: Reset the previous statistics and restarts the collection of the Database
- CLEAR: Delete any Runstats data statistics previously collected and store
- DISPLAY: Display the Runstats data statistics collected
- DB2STAT: Only collect the DB2 data statistics
- CATALOG: Collect the data statistics from the catalog data
- NOVDEC: Enable the NOVDEC flag to trim trailing and leading spaces and zeros
- NOHASH: Enable the NOHASH flag
A quick guide on how to collect Runstats using the DVM Studio
After starting the DVM Studio you should be presented with the DV Data perspective where you can use the ‘Set Server’ button to select, authenticate and connect to the required DVM Server.
From here you can reuse or open a new SQL file and use the SQL editor to enter the RUNSTATS command like this:
In the example above the Runstats statements when executed will:
- Clear out any statistics that might exist for virtual table RAILSTNS
- Collect and save the statistics for the virtual table
- Display the statistics collected on virtual table RAILSTNS
- Run a query to count the number of rows in the virtual table
To execute all 4 statements above simply press PF5. The results generated follow.
From the Console Tab you can confirm what was executed:
Next, you can view outputs from the statements executed by tabbing through the SQL Results pane.
From the SQL Messages of the SQL Results pane:
Clearing out old statistics:
Populating the statistics:
Next, Displaying the statistics collected:
The optimizer parses the SQL statement and performs various phases of optimization, typically involving verification of syntactic and semantic correctness, followed by query analysis and formulation of the access paths to satisfy the query.
Now that the DVM server captures statistics of remote data source data objects, you can leverage this capability to fuel the SQL optimizer with the data distribution leading to the most efficient query plan possible.
#Db2Toolsforz/OS#DVMz