Db2

Db2

Connect with Db2, Informix, Netezza, open source, and other data experts to gain value from your data, share insights, and solve problems.

 View Only

Scoping rules in DB2 

Tue March 10, 2020 07:48 PM

Posted by: Serge Rielau

Years ago name resolution for so called identifiers was an easy thing to do.
All there was were columns.
But then along came routines. Routines had parameters, local variables and FOR loops.
Next came global variables, then modules and ultimately pseudo-columns.
All of these objects share one and the same name space..
In this post I will try to illuminate the rules by which DB2 decides who is who.

Let's start with a simple example:

CREATE TABLE T(c1 VARCHAR(10));
INSERT INTO T VALUES 'Table T';
SELECT c1 FROM T;
C1
-----------
Table T
1 record(s) selected.
Presumably to no-one's surprise.

Let's complicate matters as bit:
CREATE TABLE S(c1 VARCHAR(10));
INSERT INTO S VALUES 'Table S';
SELECT (SELECT c1 FROM T) FROM S;

C1
-----------
Table T

1 record(s) selected.
There are two columns C1 that could be resolved here. But DB2 will always resolve to the one in the innermost scope.
The innermost scope is always the immediate FROM of a given query.
SELECT (SELECT c1 FROM T AS X(c2)) FROM S;
C1
----------
Table S

1 record(s) selected.
Two lessons are being taught by the example above.
First when the column is renamed in the FROM clause is renamed then the original name becomes inaccessible.
That is in the above example neither T nor T.c1 are accessible. They are now known as X and X.c2 respectively.
Second when c1 cannot be found within the innermost FROM clause DB2 will look in the surrounding context.

What defines "outer"? Let's try some more scenarios:
SELECT (SELECT res FROM T, (VALUES c1) AS X(res)) FROM S;
SQL0204N "C1" is an undefined name. SQLSTATE=42704
This didn't go so well.
DB2 didn't pick up T.c1 here because T is not an outer scope to the VALUES. It is "lateral" to VALUES.
But DB2 also didn't jump two levels up to S.c1. I suppose it could but, alas, it does not.

A small modification:
SELECT (SELECT res FROM T, LATERAL(VALUES c1) AS X(res)) FROM S;
RES
----------
Table T

1 record(s) selected.
By telling DB2 that we want to allow LATERAL correlation it will find T.c1.
But will it also be able to find S.c1?
SELECT (SELECT res FROM T AS Y(c2), LATERAL(VALUES c1) AS X(res)) FROM S;
RES
----------
Table S

1 record(s) selected.
Indeed! So LATERAL not only allows DB2 to refer to laterally correlated columns it also allows it to look further out in the scope.
Let's add another twist
SELECT 1 FROM T WHERE EXISTS(SELECT c1 FROM S AS X(c2));
1
-----------
1

1 record(s) selected.
So DB2 can look outside of an EXISTS to correlate it without problem.
Let's summarize:
  • WIthin an SQL Statement local columns reference are preferred over outer column references
  • To access columns in the same FROM from clause as the subquery where a column is referenced the LATERAL keyword needs to be used.
Let's introduce local variables to the mix:
SET SERVEROUTPUT ON;
--#SET TERMINATOR @
BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T));
END @

Table T
The column c1 wins over the variable because the variable is "further out" than the column.
Let's validate that we can get to the variable.
BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T AS X(c2)));
END @

Var
FOR loops also produce a form of local variables.
BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
CALL DBMS_OUTPUT.PUT_LINE((SELECT c1 FROM T AS X(c2)));
END FOR;
END @

For
So the FOR loop introduces a level of nesting here and therefore eclipses the local variable.
Note that compounds can be nested:
BEGIN 
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
INNER: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Inner';
CALL DBMS_OUTPUT.PUT_LINE(c1);
END;
END FOR;
END @

Var Inner
If we want to refer to the for loop we need to qualify:
BEGIN 
DECLARE c1 VARCHAR(10) DEFAULT 'Var';
FOR loop AS SELECT 'For' AS c1 FROM SYSIBM.SYSDUMMY1 DO
INNER: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Inner';
CALL DBMS_OUTPUT.PUT_LINE(loop.c1);
END;
END FOR;
END @

For
Let's move further out. The next level out are parameters of routines and transition variables of trigger.
I'll limit myself to demonstrating functions here:
CREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
RETURN c1;
END@

VALUES foo()@
1
----------
Var Outer

1 record(s) selected.
The pattern is becoming pretty obvious now the local variable is inside the compound block and it eclipses the parameter with the same name
If we want to reference the parameter we qualify it with the routine name
CREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
RETURN foo.c1;
END@

VALUES foo()@
1
----------
Parm Foo

1 record(s) selected.
There is one more interesting scenario within functions.
CREATE OR REPLACE FUNCTION foo(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
outer: BEGIN
inner: BEGIN
DECLARE c1 VARCHAR(10) DEFAULT 'Var Outer';
END inner;
RETURN c1;
END@
VALUES foo()@
1
----------
Parm Foo
1 record(s) selected.
It is not possible to see a variable from compound that a statement isn't itself within.

Let's summarize:
  • Within compound blocks local variables within an inner block eclipse those with the same name of an outer block
    You must use compound labels to disambiguate in the same fashion as you use table names or correlation names within SQL.
  • Still within an SQL statement columns will always eclipse variables
  • FOR loops are treated like compound blocks where the loop name is the qualifier and the column names of the cursor are the variables
  • Parameters compose the outermost block and an be referenced by the routine name.
  • In case of triggers the columns of the transition variables can be qualified by the transition variable name.
Are we done yet? Far from it. If an identifier is neither a column, nor a FOR-loop variable a local variable or a parameter then then the next possibility are module variables if teh routine is part of a module.
CREATE OR REPLACE MODULE mod@

ALTER MODULE mod ADD VARIABLE c1 VARCHAR(10) DEFAULT 'Var Module'@

ALTER MODULE mod PUBLISH FUNCTION FOO(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN c1@

VALUES mod.foo()@
1
----------
Parm Foo

1 record(s) selected.
Again we are going inside out. The parameter wins over the module variable, but the module variable can be referred to by qualifying it with the module name or unqualified if it is not eclipsed:
ALTER MODULE mod DROP FUNCTION FOO(VARCHAR())@
ALTER MODULE mod PUBLISH FUNCTION FOO(c2 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN c1@
VALUES mod.foo()@
1
----------
Var Module

1 record(s) selected.

ALTER MODULE mod DROP FUNCTION FOO(VARCHAR())@
ALTER MODULE mod PUBLISH FUNCTION FOO(c1 VARCHAR(10) DEFAULT 'Parm Foo')
RETURNS VARCHAR(10)
RETURN mod.c1@
VALUES mod.foo()@
1
----------
Var Module
1 record(s) selected.
Note that within a module context even variables which are not published can be accessed as long as they belong to the same module as the routine.

We are slowly digging digging ourselves out from the depth of name resolution to the outer layers.
Next we deal with global variables.
By now I trust you will believe me if I tell you that any local module variable, parameter, local variable or column will eclipse a global variable. No need for proof.
CREATE OR REPLACE VARIABLE schema.c1 VARCHAR(10) DEFAULT 'Var Schema'@

CREATE OR REPLACE FUNCTION FOO() RETURNS VARCHAR(10) RETURN c1@
SQL0206N "C1" is not valid in the context where it is used. LINE NUMBER=1. SQLSTATE=42703
But we are now starting to encounter a new curious beast. The PATH.
Since variables reside in schemas we can only see them if their schema is on the PATH
SET PATH = CURRENT PATH, SCHEMA@

CREATE OR REPLACE FUNCTION FOO() RETURNS VARCHAR(10) RETURN c1@

VALUES FOO()@
1
----------
Var Schema

1 record(s) selected.
OK, now as far as DB2 would be concerned we should be done now. But there are some rather curious behaviors that our beloved competition has concocted that, like it or not, DB2 must support for the sake of portability. The first one are PSEUDOCOLUMs

A pseudo column looks like a column but it isn't one at all. The three most common pseudo columns are NEXTVAL, CURRVAL and ROWNUM.

Let's investigate:
CREATE OR REPLACE VARIABLE S.NEXTVAL VARCHAR(10) DEFAULT 'Var Global'@

CREATE OR REPLACE SEQUENCE S@

VALUES (NEXT VALUE FOR S, S.NEXTVAL)@
1 2
----------- ----------
1 Var Global

1 record(s) selected.
In Oracle S.NEXTVAL has the same meaning as NEXT VALUE FOR S.
DB2 will prefer referencing to anything else first before, as a last resort try it's luck with a pseudo column
DROP VARIABLE S.NEXTVAL@

VALUES (NEXT VALUE FOR S, S.NEXTVAL)@
1 2
----------- -----------
2 2

1 record(s) selected.
OK, this post certainly making up for my six weeks of silence in sheer length. There is one last curiosity that I want to discuss for completeness.
CREATE OR REPLACE FUNCTION C1() RETURNS VARCHAR(10) RETURN 'Func'@

VALUES C1@
1
----------
Var Schema

1 record(s) selected.
Hold it.... wait for it....
DROP VARIABLE SCHEMA.C1@
VALUES C1@
1
----------
Func

1 record(s) selected.
Tadah!
Again, thanks to our friends at Redwood Shores you may think you are referring to a column, or a variable, or a parameter or whatever, but really you are referring to a function.
DB2 will only do this if it has exhausted absolutely every other possibility.

Now I could talk about the rules for qualified names, but perhaps I reserve this for another day.
Let's do a final summary:
The order of resolution for simple identifiers is as such:
  1. Column belonging to the local FROM clause
  2. Column belonging to a lateral table in the parent's FROM clause if LATERAL is specified and the reference is a subquery in that FROM clause
  3. Column belonging a FROM clause in the parent's FORM clause
  4. local variable or FOR loop variable resolving from the innermost block or FOR loop to the outside
  5. Parameter of a routine or transition variable column of a trigger
  6. Variable in the module in which the cursor or routine is defined
  7. Global variable if the variable is on the PATH
  8. Pseudo column or a function without parameters or a function with defaults for all parameters if the function is on the PATH
Well, that was fun, wasn't it? Now imagine you had to program that ;-)
#Db2

Statistics
0 Favorited
6 Views
0 Files
0 Shares
0 Downloads