Posted by: Serge Rielau
A few days ago I did an internal presentation on improvements to DB2's compatibility with Oracle applications.
Doing so I stumbled upon one change I actually implemented myself that had nearly slipped my mind.
The change is very minor, and, on the face of it has nothing to do with Oracle applications.
What I implemented for DB2 9.7.3 were comparison operators for "small LOBs"
A small LOB in itself is a bit of an oxymoron, given that the L stands for Large.
Also Oracle allows a lot less native operations on LOBs than DB2 to begin with. And comparison operators are not amongst them.
So, what's the motivation?
Imagine you have a table that looks like this:
CREATE TABLE emp(empid INTEGER,
name VARCHAR(4000),
firstname VARCHAR(4000),
address VARCHAR(4000),
salary DECIMAL(31, 2),
title VARCHAR(4000),
jobdesc VARCHAR(4000),
review VARCHAR(4000),
skills VARCHAR(4000),
education VARCHAR(4000),
languages VARCHAR(4000));
OK, perhaps the columns in this table are a tad silly.
But who can earnestly say they have never seen string types maxed out (4000 is Oracle's max for VARCHAR in SQL).
Does anyone have a name 4000 bytes long?
Unlikely, but just because the developer who designed this application ended up taking the easy route when defining the domain of the
column doesn't mean we can simply override it to another smaller value.
Also there are cases when strings can indeed get rather big - on occasion.
One example may a "product description" or some "remarks" column.
Now, given DB2's limit of 32KB (32677 to be exact) per row, how do you implement a schema like the above?
One way, of course is to split the table vertically. In that case a join is required to reconstitute the table.
This will be OK for selects. After all the join can be hidden in a view.
But DB2 does not allow updates through joins.
Even INSTEAD OF triggers won't help there because transition variables mustn't exceed the maximum row size either.
Another approach is to "lobify" some of the columns.
But LOB's are said to have some downsides:
- You cannot index LOBs
- You cannot compare LOBs
- LOBs are slow to access and insert since they are not subject to the buffer pool
- LOBs are said to be slow when retrieving them to the client.
I have already cleaned up with some of these myths in
"Speeding up LOBS".
So, slow access and slow retrieval of LOBs are simply not true anymore in many cases.
The most important case here, being a VARCHAR(4000) gone CLOB.
LOB comparisons
In DB2 9.7.3 we figured that the chances that someone wants to actually compare big lobs are really not that high.
We also figured that for those small LOBs which could have been VARCHARs to begin with we could simply compare them as VARCHARs.
If the actual size of the LOB is less than 32KB then the comparison will succeed. If it is bigger it will fail.
A VARCHAR(4000) that has been turned into a CLOB(4000) will therefore always be comparable.
So what DB2 does for LOBs in general is to convert:
<clob> <compare> <something>
into
CAST(<clob> AS VARCHAR(32672)) <compare> <something>
Similarly DBCLOB (NCLOB) is cast to VARGRAPHIC and BLOB to VARCHAR FOR BIT DATA.
The functions supported are:
- "="
- "<>"
- ">"
- ">="
- "<"
- "<="
- BETWEEN
Previously supported was LIKE, POSSTR, SUBSTR and most other string functions.
Not that the IN predicate is absent here for the time being.
That leaves only one limitation: You still cannot index LOBs, even small ones.
However, I think that in many cases when strings are large there is no desire to index the column to begin with.
What's the point of indexing "remarks" or "description"?
These are candidates for full text indexes which is a different topic altogether, but not for a b-tree index.
So our silly table above could be tweaked as such:
CREATE TABLE emp(empid INTEGER,
name VARCHAR(4000),
firstname VARCHAR(4000),
address CLOB(4000) INLINE LENGTH 100,
salary DECIMAL(31, 2),
title VARCHAR(4000),
jobdesc CLOB(4000) INLINE LENGTH 200,
review CLOB(4000) INLINE LENGTH 200,
skills CLOB(4000) INLINE LENGTH 100,
education CLOB(4000) INLINE LENGTH 50,
languages CLOB(4000) INLINE LENGTH 100);
I have avoid changing name, firstname and title because they could reasonably indexed.
A query such as:
SELECT jobdesc FROM emp WHERE languages LIKE '%ENGLISH%' AND education = 'University"
will
work unchanged thanks to comparison support for small LOBs
#Db2