Menno,
You provide not enough details of access path to make best possible answer, e.g. join method, size of table B, etc.
Optimizer has estimated that Table Space Scan is cheapest, possibly in conjunction with Merge Scan Join (maybe not).
Optimizer cannot estimate query cost accurately in cases of complex inner queries. One can use Optimizer hints as others mention or help Optimizer to understand better by SQL mod. e.g. Cursors should code Optimize for N Rows clause.
If we used OPTMIZE FOR 10 ROWS, as example, Db2 might then estimate cost of simple nested loop join via an index to be much cheaper option. It helps DB2 a lot to estimate how many rows will be in the final result set. FETCH FIRST n ROWS ONLY can also be used in final query or in sub queries to restrict rows and make estimates of cost better as well.
Deep analysis of where complex queries get estimates wrong is often not needed, using some basic techniques. The internal cost estimates by Optimizer at each access path step are very useful not to estimate actual cost but to see where DB2 misunderstands the cost or the filtering the worst or where it thinks the high costs are happening (often correctly). Many correlated subqueries can estimate filtering of 50% quite far from the true filtering, which could be very strong or very weak. Hypothetical example as I don't know if your complex query has that in there.
One hopes the table stats are somewhere in the ball park of reality too. Stats are used for cost estimates, not needing to be completely accurate, but not completely misrepresentative too.
------------------------------
Michael Hannan
------------------------------
Original Message:
Sent: Thu January 30, 2025 03:32 AM
From: Menno Avegaart
Subject: Does DB2 support optimizer hints?
I'm having a lot of problems with a rather complex query, but essentially it boils down to this simplified version:
select b.*
from (
select id
from [complex subquery]
) a
inner join b
on
b.id = a.id
I've rephrased this query in several different ways, but DB2 insists on doing a table scan on table b instead of using the primary key on id, even though the subquery just returns a handful of ids.
Is there any way I can provide optimizer hints for the query like you can do with other database vendors?
Or else, is there another way to efficiently execute the subquery first and then lookup the records in b?
------------------------------
Menno Avegaart
------------------------------