Everyone could use a few tricks up their sleeve to impress others at work. SQL tricks are usually easy to code, but the challenge is when to use what trick. Let’s examine one of the easiest SQL tricks that will provide the most gain: Code smarter WHERE clauses in the optimal sequence.
Smarter WHERE clauses can turn you into an SQL performance hero.
Four Points of Filtering
There are four points of filtering inside the DB2 engine. This
is where you start to gather the information on which clauses are the smartest. The four points are:
- Indexable: The predicate is a candidate for Matching Index access. When the optimizer chooses to use a predicate in the probe of the index, starting at the root page, the condition is named Matching (matching the index).
- Stage 1 Index Screening: The Stage 1 predicate is a candidate for filtering on the index leaf pages but it is not indexable.
- Stage 1 Data Screening: The Stage 1 predicate is a candidate for filtering on the data pages.
- Stage 2: The predicate is not listed as Stage 1 and will be applied on the remaining qualifying pages from the third point of filtering.
Now that you know the four points of filtering, the following chart shows which predicates belong in each point.
There are 32 conditions that can be processed at the root of an index. This is the first point and most powerful filtering point in DB2. This is because WHERE conditions processed here can filter large portions of data without reading every index value.
An example is the predicate WHERE CODE IN (‘K,’ ‘S,’ ‘X’). In a typically dense index (a ratio of one leaf page to thousands of leaf pages) is probed only three times, does index skipping over all the <> values in the leaf pages. If 2 percent of the index contains ‘K,’ 8 percent contains ‘S,’ and 1 percent contains, ‘X,’ roughly 11 percent of the total index leaf pages are read as show below:
Index Screening Adds Filters
The next point of filtering is called Index Screening, adding 13 more filters to the 32 prior totaling 45 that can be applied to each leaf page index entry.
For detailed notes on the predicates, see a summary of predicate processing
. The order these predicates are applied depends on the type of filter and the order they are typed in. In general, all 45 predicates can be applied at some point of the scan of the leaf pages if the column referenced exists in the optimizer selected index. For example, if the index is created as:
CREATE INDEX DSN8B10.XSALES1
USING STOGROUP DSN8G110
And the query is:
WHERE CODE IN (‘K’, ‘S’, ‘X’)
AND REGION BETWEEN ‘NAPPA’ AND ‘SAN DIEGO’
AND YEAR LIKE ‘%14’
AND PRODUCT LIKE ‘DB2%’
When the optimizer chooses to use the XSALES1 index, the IN and BETWEEN predicates are indexable will be processed on the root page of the index. The result will be the following three probes of the index:
WHERE CODE = ‘K’ AND REGION = ‘NAPPA’
WHERE CODE = ‘S’ AND REGION = ‘NAPPA’
WHERE CODE = ‘X’ AND REGION = ‘NAPPA’
Once the first occurrence of ‘NAPPA’ is found on the leaf pages for each of the probes, index screening can begin. The last two LIKE predicates are index screening predicates. The order that index screening predicates are processed is the following built in sequence:
1. =, IN
2. BETWEEN, <, <=, >,>=
4. Non Correlated Subquery
5. Correlated Subquery
Restrictive Conditions First
If there are multiple conditions in each category, the predicates are processed in the order they appear in the query. The query above has a tie in the LIKE position. DB2 will process these ties in any category in the order they were typed in. Be an SQL performance hero in this query and make sure the most restrictive LIKE is ahead of the less restrictive LIKE.
For example, if LIKE ‘%14’ qualifies 77 percent of the table and LIKE ‘DB2%’ qualifies 2 percent of the table, they should switch places in the query. This way the LIKE ‘DB2%’ of the table goes first and very little, 2 percent of the leaf page values of LIKE ‘%14’ will need to be tested. This is also referred to as an early out.
A good rule of thumb for queries, especially when you do not know what the index design will be, is to place the most restrictive conditions at the beginning of the WHERE clause and the rest below in the reverse order of their filtering capability.
Good luck, hero!
Sheryl Larsen is a World Wide DB2 for z/OS Evangelist with IBM.