Db2

Db2

Connect with Db2, Informix, Netezza, open source, and other data experts to gain value from your data, share insights, and solve problems.

 View Only

MODULEs, a primer 

Mon March 09, 2020 05:56 PM

Posted by: Serge Rielau

Motivation

Some of you may have been using DB2 as far back or even longer than DB2 7.1, or DB2 UDB V7.1 to be correct.
In that release we introduced SQL Procedures, and SQL Functions.
When we shipped the features there was a bit of a confusion because everyone from sales to marketing to customers seemed have a dire need for a name of logical constructs we supported.
It's just SQL we said. ANSI SQL comprises more than just SELECT. In includes IF, BEGIN.. END and so forth.
All we do is implement the SQL standard!
I learned and important lesson then. When you refuse to name a thing that wants to be named the market does it for you.
So we ended up with contraptions such as PL/SQL (oops, that's Oracle) and SPL (IDS).
We finally ended up with SQL Procedure Language (SQL PL) which has the unfortunate habit of being two acronyms SQL PL (hard to search) .
And it also get people confused with PL/SQL leaving us with SQL/PL.
This slash from OS/2 will haunt us to retirement I fear. 
 
Internally however we did have a name: PSM
PSM is derived from the ANSI SQL/PSM section of the standard and stands for Persistent Stored Modules.
So, why didn't we use PSM externally? Simple: DB2 was missing the M for MODULE.
This post is all about that missing M which finally joined SQL PL in DB2 9.7.

MODULE

In a nutshell a module in DB2 is an extension of the namespace for a number of objects types.
Objects in DB2 either have one part (TABLESPACE), or two parts (TABLE) to their name.
The first of the two parts is the schema.  

Elements of modules

A module allows the the following objects a third name which is called the module name.
The objects for which this is allowed presently are:
  • Scalar functions in any language including SQL.
    Table functions or source functions today are not allowed,
     
  • Procedures in any language including SQL

  • User defined distinct, row and array types
    Structured types are not allowed
     
  • Global variables of all kinds
     
  • Conditions

There is no particular reason why no other objects are supported.
Conceptually we could have sequences, views, perhaps even tables with this properties and perhaps some day we will.

What these supported objects have in common however is that they are what one would typically find in a library in a regular procedural language.
And a library is really what a module is meant to model in DB2 (and ANSI SQL).

Creating a module

In ANSI SQL/PSM a Module is one single object very much like a file in, say C which contains all sorts of functions and types..
PL/SQL in Oracle is very similar. There is a head and a body and that's it.
We in DB2 Development didn't like that very much.
It seems outdated in an age of GUis and version control systems to thing as a module as something monolithic.
We wanted the properties associated with libraries, but with finer control.
So in DB2 you assemble modules piece by piece. And you can alter them on a per-object basis.
 
Every module starts of empty:
CREATE OR REPLACE MODULE HR;
That looks a lot like a CREATE SCHEMA statement, but a module itself is part of a schema.
So you can create modules of the same name in multiple schemata.
In addition access control is managed on a module level.
Something that is generally not available for schemata.
CREATE ROLE HR; 
GRANT EXECUTE ON MODULE HR TO ROLE HR;
Now every member of the HR role will be able to reference all visible objects within the module.  
I say visible here because libraries in most languages differentiate between private objects and public objects.
E.g. in C you may define an export file which tells the compiler/linker which symbols to export and make visible to users of the library.
Modules in DB2 support the same concept.

Adding prototypes to modules

Every object in the module can be either:
  • ADDed
    This makes the object private for use within the module only.
     
  • PUBLISHed
    This exports the objects so it can be references by other modules or global objects or queries.
Further more modules support the concept of forward declaration which allows to separate the specification of  objects from their implementation.
This is useful to implement recursive calls of routines as well as to provide an external specification for reference to the user which is needed to use the module. 
Think of C-style header-files for comparison.
Since objects are added to the module we use the ALTER MODULE statement.
SET SCHEMA BLOG;
CREATE OR REPLACE MODULE HR;
ALTER MODULE HR PUBLISH TYPE emp AS ROW (empid INTEGER, name VARCHAR(20), salary INTEGER);
/
ALTER MODULE HR PUBLISH PROCEDURE HIRE(INOUT emp emp);
/
ALTER MODULE HR PUBLISH PROCEDURE FIRE(IN empid INTEGER);
/
ALTER MODULE HR PUBLISH VARIABLE HRPARTNER  VARCHAR(20) DEFAULT USER;
/
This is a valid specification for a module.

Name resolution

You cannot reference any object within the module without specifying the module itself.
Only within the module itself are references to its; own objects by simple name allowed. 
VALUES HRPARTNER;
SQL0206N  "HRPARTNER" is not valid in the context where it is used.
To reference a module you can either qualify the module with its schema name
VALUES BLOG.HR.HRPARTNER;
1                  
--------------------
SERGE              
Unlike tables modules are not resolved by CURRENT SCHEMA :
VALUES CURRENT SCHEMA;
1                                                
--------------------------------------------------
BLOG               
                             
VALUES HR.HRPARTNER;
SQL0206N  "HR.HRPARTNER" is not valid in the context where it is used.
Instead they resolve strictly by PATH.
SET PATH = CURRENT PATH, BLOG;
VALUES HR.HRPARTNER;
1                  
--------------------
SERGE
By strictly I mean that the resolution of the module is independent of matching rules for routines.
Only once the module is determined will DB2 look at the best match for a referenced routine within that module.
The first module of the right name within the path is selected.
 
Lastly you can create public synonyms on modules.
If no module with the right name is found on the path DB2 will consider a public synonym:
SET PATH = SYSTEM PATH;
CREATE PUBLIC SYNONYM HR FOR MODULE BLOG.HR;
VALUES HR.HRPARTNER; 1                   -------------------- SERGE
For more information about name resolution in general you can refer to Scoping Rules in DB2.

Adding routine bodies

While the type and variable are fully defined, procedures have no bodies.
Any attempt to execute them will yield and error:
CALL HR.FIRE(1);
SQL20496N  The routine "BLOG.HR.FIRE" cannot be invoked because it is only a routine prototype.
Before fixing that we complete our schema with an employee table and a sequence;
SET SCHEMA = BLOG;
CREATE TABLE emp(empid INTEGER NOT NULL PRIMARY KEY, name VARCHAR(20), salary INTEGER);
CREATE SEQUENCE empid;
To provide the body for a previously specified routine we must use ADD  and provide an exact match for the signature and any specified routine properties:
ALTER MODULE HR ADD PROCEDURE HIRE(INOUT emp hr.emp)
BEGIN
  SET emp.empid = NEXT VALUE FOR empid;
INSERT INTO emp VALUES emp;
END;
/
ALTER MODULE HR ADD PROCEDURE FIRE(IN empid INTEGER)
BEGIN
  DELETE FROM emp WHERE empid = fire.empid;
END;
/
Now the procedures can be executed:
VARIABLE empid INTEGER; 
BEGIN
  DECLARE emp hr.emp;
  SET emp = (NULL, 'John', 23000);
  CALL hr.hire(emp);
  SET :empid = emp.empid;
END;
/
PRINT empid;
1

SELECT * FROM emp;
      EMPID NAME                      SALARY
----------- -------------------- -----------
          1 John                       23000

Altering module elements

An advantage of modules over PL/SQL Packages is the ability to surgically change bits and pieces of it.
For example if we want to change the EMP type we can do that easily.
Currently DB2 does not support OR REPLACE fro module elements.
We need to DROP the TYPE before adding it back in.
ALTER TABLE emp ADD COLUMN country VARCHAR(20) DEFAULT 'Canada';
ALTER MODULE HR DROP TYPE emp;
/
ALTER MODULE HR PUBLISH TYPE emp AS ROW ANCHOR ROW emp;
/

BEGIN
  DECLARE emp hr.emp;
  SET emp = (NULL, 'Udo', 31000, 'Germany');
  CALL hr.hire(emp);
  SET :empid = emp.empid;
END;
/
PRINT empid; 
2
SELECT * FROM emp;
       EMPID NAME                      SALARY COUNTRY             
----------- -------------------- ----------- --------------------
          1 John                       23000 Canada              
          2 Udo                        31000 Germany
As a side note: DB2 automatically re-validated the HIRE procedure after the change to the emp data type.
So far all our objects within the module have been public.

Adding private module elements

Let's make the computation of the employee id a bit more interesting.
We want to empid to be combined with a country code.
The translation of countries to codes will be stored in a private array.
And we will define a routine to encapsulate the sequence generation
ALTER MODULE HR ADD TYPE countries AS INTEGER ARRAY[VARCHAR(20)];
/
ALTER MODULE HR ADD VARIABLE countries countries;
 /
ALTER MODULE HR ADD FUNCTION empid(country VARCHAR(20)) RETURNS INTEGER
BEGIN
   RETURN NEXT VALUE FOR empid * 1000 + countries[country]; 
END;
Note that there is no requirement to have a separate specification for the routine.
The same is true for the public routine HIRE.
When we replace it we can immediately provide the body if that's what we want.  
ALTER MODULE HR DROP PROCEDURE HIRE;
/
ALTER MODULE HR PUBLISH PROCEDURE HIRE(INOUT emp emp)
BEGIN
  SET emp.empid = empid(emp.country);
  INSERT INTO emp VALUES emp;
END;
/

BEGIN
  DECLARE emp hr.emp;
  SET emp = (NULL, 'Wim', 31000, 'Netherlands');
  CALL hr.hire(emp);
  SET :empid = emp.empid;
END;
/
SQL0407N  Assignment of a NULL value to a NOT NULL column "TBSPACEID=2, TABLEID=5, COLNO=0" is not allowed.
We forgot to fill in values for the countries!
This is an opportunity to introduce module initialization.

Module initializer

You can ADD a module procedure called SYS_INIT without parameters which is invoked by DB2 the first time an object in the module is referenced.
Note that there is no "un-init".
ALTER MODULE HR ADD PROCEDURE SYS_INIT
BEGIN 
  SET countries['Canada'] = 100;
  SET countries['Netherlands'] = 101;
  SET countries['Germany'] = 102;
END;
/
DISCONNECT;
CONNECT serge@localhost/test;
BEGIN
  DECLARE emp hr.emp;
  SET emp = (NULL, 'Wim', 31000, 'Netherlands');
  CALL hr.hire(emp);
  SET :empid = emp.empid;
END;
/
PRINT empid;
21101;
Sidenote: since my connection is the only one on my laptop and I did not activate my database explicitly DB2 deactivated the database.
As a result I lost a few sequence values from the cache.
Module initialization is rarely used since module variables provide DEFAULT clauses. But not every variable can have defaults.
Also SYS_INIT provides a better control about the timing of the initialization. 
When using external routines SYS_INIT can be useful to set up the environment. 

Dropping modules

I have shown the highlights of module creation. There is nothing left but switching of the lights. 
One thing you will learn to appreciate about modules is that all the module's objects get dropped when you drop the module.
DROP MODULE BLOG.HR;

#Db2

Statistics
0 Favorited
3 Views
0 Files
0 Shares
0 Downloads

Comments

Wed September 02, 2020 04:19 PM

Thanks for this.  I'm having a lot of trouble finding much on Modules or just current SQL/PL.  I can't really find anything since "Essentials" 2nd ed in 2004. If anyone has recommendations, other than knowledge center, I'd appreciate it!