Recently one of our major business partners submitted the following feature request:
- No dependencies are recorded anymore.
So you do not know anymore with which objects the routine interacts
- SQL executes under "invoker's rights"
That is the invoker of the routine or trigger would need the authorization to execute the SQL within.
- There will be no more "conservative binding"
Every time an SQL statement gets executed it would do so being compiled based on the CURRENT PATH, CURRENT SCHEMA, and CURRENT TIMESTAMP in effect at the time.
That means there is no assurance the SQL would operate on the same table or invoke the same routines when executed multiple times.
- The execution plan may change between two invocations if statistics were updated on a table and/or the configuration of the memory is changed
There are probably more side-effects when switching from static SQL to dynamic. But these are the ones I can come up with immediately.
The answer I got was interesting:
"I'm OK with all conservative binding semantics, but I want to be sensitive to statistics changes.
Since the database is empty when the routines are created.
The execution plans will nearly guaranteed be sub-optimal."
So what this vendor wants is for SQL within routines and triggers to be recompiled whenever there is a change to any of the used objects.
This may be an update to the statistics or perhaps the addition of an index.
In case of dynamic SQL such operations cause so called "soft invalidation".
That is the cached SQL plans for the statements which depend on the changed objects cannot be reused for new invocations.
So the first new invocation of the the statement will cause a recompilation placing an updated version of the plan in the cache.
Well, if that's all you want fro static SQL, then DB2 can actually do that today.
Here I describe how.
Scenario
As always we'll work of a running example. The following schema defines two tables "EMP" and "DEPT".
We then create a procedure which fetches an employee based on the emp id.
The *cough* "complexity" lies in the fact that there is a join.
SET SCHEMA = SAMPLE;
SET PATH = SAMPLE, CURRENT PATH;
CREATE TABLE emp (id INTEGER NOT NULL, name VARCHAR(15), deptid INTEGER NOT NULL);
CREATE TABLE dept(id INTEGER NOT NULL, name VARCHAR(10));
CREATE OR REPLACE PROCEDURE get_emp(IN id ANCHOR TO emp.id,
OUT name ANCHOR TO emp.name,
OUT dept ANCHOR TO dept.name)
SPECIFIC GET_EMP
SELECT emp.name, dept.name INTO name, dept
FROM emp, dept
WHERE emp.id = get_emp.id
AND dept.id = emp.deptid;
Insert some data.
INSERT INTO dept VALUES(1, 'Marketing');
INSERT INTO emp VALUES(1, 'Jones', 1);
Test the procedure.
CALL get_emp(1, ?, ?);
Value of output parameters
--------------------------------
NAME = Jones
DEPT = Marketing
Exposing the problem
CALL PROFILE.ENABLE_ACTIVITY_MONITOR();
CALL get_emp(1, ?, ?);
CALL PROFILE.DISABLE_ACTIVITY_MONITOR();
CALL PROFILE.GET_ACTIVITY_ACTUALS(?);
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
HSJOIN
( 2)
13.5704
NA
/-----+------\
1 1
1 1
TBSCAN TBSCAN
( 3) ( 4)
6.78499 6.78516
NA NA
| |
1 1
NA NA
TABLE: SAMPLE TABLE: SAMPLE
DEPT EMP
Q1 Q2
A small problem. I had intended to start with a nested-loop join and argue that this would be a bad join choice as the number of rows increases.
DB2 outwitted me (again).
But there are less subtle ways to "encourage" a plan changes:
ALTER TABLE emp ADD PRIMARY KEY (id);
ALTER TABLE dept ADD PRIMARY KEY (id);
ALTER TABLE emp ADD FOREIGN KEY (deptid) REFERENCES dept(id);
CALL PROFILE.ENABLE_ACTIVITY_MONITOR();
CALL get_emp(1, ?, ?);
CALL PROFILE.DISABLE_ACTIVITY_MONITOR();
CALL PROFILE.GET_ACTIVITY_ACTUALS(?);
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
HSJOIN
( 2)
13.5704
NA
/-----+------\
1 1
1 1
TBSCAN TBSCAN
( 3) ( 4)
6.78499 6.78516
NA NA
| |
1 1
NA NA
TABLE: SAMPLE TABLE: SAMPLE
DEPT EMP
Q1 Q2
Nothing has changed! DB2 did not pick the new indices.
None of the changes we did to the table forced DB2 to invalidate the plan, so it didn't.
That's the premise of static SQL.
In order to get DB2 to re-consider the plan we must tell it to do so explicitly:
CALL PROFILE.REBIND_ROUTINE('SAMPLE', NULL, 'GET_EMP');
CALL PROFILE.ENABLE_ACTIVITY_MONITOR();
CALL get_emp(1, ?, ?);
CALL PROFILE.DISABLE_ACTIVITY_MONITOR();
CALL PROFILE.GET_ACTIVITY_ACTUALS(?);
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
^NLJOIN
( 2)
13.5709
NA
/-----------+-----------\
1 1
1 1
FETCH FETCH
( 3) ( 5)
6.78546 6.78546
NA NA
/----+----\ /----+----\
1 1 1 1
1 NA 1 NA
IXSCAN TABLE: SAMPLE IXSCAN TABLE: SAMPLE
( 4) EMP ( 6) DEPT
0.0149777 Q2 0.0149777 Q1
NA NA
| |
1 1
NA NA
INDEX: SYSIBM INDEX: SYSIBM
SQL120820083059340 SQL120820083059420
Q2 Q1
So in DB2's preferred mode of operation it is your responsibility to rebind static SQL after updating statistics or adding indexes.
But what if I want DB2 to take care of itself? I have already turned on the self tuning memory manager and automatic runstats collection.
I'm happy how that works for dynamic SQL and I want DB2 to take care of all my SQL PL or PL/SQL as well.
Can VALIDATE RUN do the job?
My original comeback to the vendor was to try VALIDATE RUN.
This BIND option will delay compilation of a static SQL statement until first execution - I thought.
So that will solve the problem.
Unfortunately during development of this scenario I discovered that I had missed a crucial property of VALIDATE RUN.
VALIDATE RUN is a fail-over mechanism only.
That is, only SQL that would otherwise raise a bind time error gets pushed out.
The purpose of VALIDATE RUN is to cross your fingers and hope by the time the SQL statement executes a missing object such as a user temporary table is available.
If your SQL has no such errors, which can be assumed, the SQL will be just as static as without VALIDATE RUN.
What VALIDATE RUN can't REOPT ONCE can.
The idea of using a BIND option however does hold water. We just have to pick a different option.
REOPT ONCE and
REOPT ALWAYS are options which delay optimization of an SQL statement until the input values are available.
The idea is that given a specific set of input values a better plan can be generated.
- REOPT ONCE will do this optimization only once - as the name implied.
This option is often used in OLTP environments
- REOPT ALWAYS will recompile the statement for every invocation.
This option is only useful for reporting queries where the execution time greatly eclipses the compilation time.
When defining SQL Routines, triggers or even anonymous blocks there is no direct way to set the BIND options within the syntax.
Instead you set bind options either at an instance level using the DB2_SQLROUTINE_PREPOPTS variable or on a session basis using the SET_ROUTINE_OPTS() procedure.
Let's redo our scenario using REOPT ONCE.
DROP TABLE SAMPLE.emp;
DROP TABLE SAMPLE.dept;
SET SCHEMA = SAMPLE;
SET PATH = SAMPLE, CURRENT PATH;
CREATE TABLE emp (id INTEGER NOT NULL,
name VARCHAR(15),
deptid INTEGER NOT NULL);
CREATE TABLE dept(id INTEGER NOT NULL,
name VARCHAR(10));
Enable REOPT ONCE for the following routine definitions.
Note that, despite the name, compiled triggers and anonymous blocks will also be affected.
CALL SET_ROUTINE_OPTS('REOPT ONCE');
CREATE OR REPLACE PROCEDURE get_emp(IN id ANCHOR TO emp.id,
OUT name ANCHOR TO emp.name,
OUT dept ANCHOR TO dept.name)
SPECIFIC GET_EMP
SELECT emp.name, dept.name INTO name, dept
FROM emp, dept
WHERE emp.id = get_emp.id
AND dept.id = emp.deptid;
It's good practice to clean up after ourselves.
CALL SET_ROUTINE_OPTS('');
INSERT INTO dept VALUES(1, 'Marketing');
INSERT INTO emp VALUES(1, 'Jones', 1);
CALL get_emp(1, ?, ?);
Value of output parameters
--------------------------------
NAME = Jones
DEPT = Marketing
We don't expect any changes yet. The plan should be identical to the original scenario:
CALL PROFILE.ENABLE_ACTIVITY_MONITOR();
CALL get_emp(1, ?, ?);
CALL PROFILE.DISABLE_ACTIVITY_MONITOR();
CALL PROFILE.GET_ACTIVITY_ACTUALS(?);
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
HSJOIN
( 2)
13.5704
NA
/-----+------\
1 1
1 1
TBSCAN TBSCAN
( 3) ( 4)
6.78499 6.78516
NA NA
| |
1 1
NA NA
TABLE: SAMPLE TABLE: SAMPLE
DEPT EMP
Q1 Q2
Now, this is where it gets interesting:
ALTER TABLE emp ADD PRIMARY KEY (id);
ALTER TABLE dept ADD PRIMARY KEY (id);
ALTER TABLE emp ADD FOREIGN KEY (deptid) REFERENCES dept(id);
CALL PROFILE.ENABLE_ACTIVITY_MONITOR();
CALL get_emp(1, ?, ?);
CALL PROFILE.DISABLE_ACTIVITY_MONITOR();
CALL PROFILE.GET_ACTIVITY_ACTUALS(?);
Rows
Rows Actual
RETURN
( 1)
Cost
I/O
|
1
1
^NLJOIN
( 2)
13.5709
NA
/-----------+-----------\
1 1
1 1
FETCH FETCH
( 3) ( 5)
6.78546 6.78546
NA NA
/----+----\ /----+----\
1 1 1 1
1 NA 1 NA
IXSCAN TABLE: SAMPLE IXSCAN TABLE: SAMPLE
( 4) EMP ( 6) DEPT
0.0149777 Q2 0.0149777 Q1
NA NA
| |
1 1
NA NA
INDEX: SYSIBM INDEX: SYSIBM
SQL120820081204820 SQL120820081204900
Q2 Q1
Success! The plan changed without the need to rebind.