by Serge Rielau
But first, what's up with Serge?
OK, I haven't been posting for a long time, but I do have a good excuse.
Some six months ago I switched jobs from being "SQL Architect for DB2 for LUW" to being "Chief Architect for OpenPages".
OpenPages is a product IBM acquired some two years ago that is a leader in the eGRC (enterprise Governance, Risk, and Compliance) space.
Coming from deep in the bowels of DBMS development, covering a three-tier architecture, Java, and a host of Web UI interfaces is challenging, but just what the doctor ordered to broaden my skills.
Luckily, OpenPages was not totally new to me. The product has undergone "blue washing" and now supports DB2 for LUW in Oracle compatibility mode.
You might say I have become my own - most critical - customer, praising past right choices and cursing the wrong ones.
The role of the network in application performance
With my old DB2 server hat on, performance used to be a function of reducing code path within DB2 and reducing disk I/O.
Projects like the "new SQL" that was born out of the past TPC-C benchmark wars are testimony to that thinking.
To be sure, reducing network trips between client and server was also on the agenda, and those who have seen my "SQL on Fire 1" and "SQL on Fire 2" presentations have witnessed me driving that point home.
There is one area, however, that I was totally blind to: the amount of data that actually flows on the network.
DB2 uses the open DRDA standard protocol to flow statements and bind variables from the client to the server, and result sets or out parameter values back to the client. That's just what it is.
Now, prior to helping blue wash OpenPages, I was engaged in the same effort for Tivoli Maximo.
After functional testing was complete, performance was benchmarked and tuned, and we eventually got stuck:
Maximo on DB2 was slower than Maximo on Oracle!
All telltales on the application and the DBMS told a different story, however.
CPU consumption was lower and IO was better as well.
Things looked good on paper, but that did not change the reality.
It took a while until we seriously started looking at the network as the only remaining cause.
After all, the network, like the hardware, was not a variable and was therefore too easily dismissed.
We found out that small DB2 result sets were multiple times bigger on the wire than those from Oracle.
Why does this matter?
I'm no networking expert, but the more frames are transmitted, the higher the likelihood that one of them encounters a collision.
And if a collision is encountered, I'm told, the entire stream must be resubmitted, not just the lost frame.
Extended Describe
Result sets in DRDA are largely self-describing. There is a header that describes what is flowing back.
This includes the names of the result set columns, their types, lengths, and other essentials.
But there is other information that can be of interest, such as flags for special properties of a column:
- Can the column be updated?
- Is the column an identity column?
- Is a column based on an expression?
- Is the column used for temporal tables?
Other examples include information about the makeup of the query, such as the qualified names of the tables in its FROM clause and the original names of the table columns that feed the result set columns.
All this is additional metadata (aka "Extended Describe") whose size is independent of the number of rows being returned.
In an OLTP environment where queries are fast to execute and mostly return just one row, often with few columns, this metadata can quickly become a multiple of the actual payload in size.
And this is exactly what happened here!
DB2's JDBC driver, by default, asks for Extended Describe information because it cannot know whether the data will be requested at a later point or not.
So DB2's JDBC follows a "rather be safe than sorry" approach.
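To make this concrete, here is a minimal JDBC sketch of the kind of metadata calls that are answered from that describe information rather than from the row data itself; the connection details and the table are hypothetical, made up purely for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    import java.sql.Statement;

    public class DescribeMetadataDemo {
        public static void main(String[] args) throws Exception {
            // Hypothetical host, database, credentials and table - illustration only
            Connection con = DriverManager.getConnection(
                "jdbc:db2://dbhost:50000/SAMPLE", "user", "password");
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery(
                "SELECT id, name FROM app.customer WHERE id = 1");
            ResultSetMetaData md = rs.getMetaData();
            // These answers come out of the (extended) describe metadata,
            // not out of the single row that flowed back
            System.out.println(md.getTableName(1));     // base table of column 1
            System.out.println(md.getSchemaName(1));    // its schema
            System.out.println(md.isAutoIncrement(1));  // identity column?
            rs.close();
            stmt.close();
            con.close();
        }
    }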
enableExtendedDescribe
In DB2 10.1 FP2, a new property, enableExtendedDescribe, has been introduced in the class com.ibm.db2.jcc.DB2BaseDataSource.
The following three values are possible:
- DB2BaseDataSource.NOT_SET (0)
The IBM Data Server Driver for JDBC and SQLJ requests extended describe information. This is the default.
- DB2BaseDataSource.YES (1)
The IBM Data Server Driver for JDBC and SQLJ requests extended describe information.
- DB2BaseDataSource.NO (2)
The IBM Data Server Driver for JDBC and SQLJ does not request extended describe information.
NO is what we want to set here (a sketch of how to set it follows the list below), unless we happen to call any of the following methods:
- Connection.findAutoGeneratedKeysColumn
- DB2ResultSetMetaData.getDBTemporalColumnType
- ResultSet.getMetaData on the ResultSet object that is returned by PreparedStatement.getGeneratedKeys
- ResultSet.insertRow
- ResultSet.deleteRow
- ResultSet.updateRow
- ResultSet.updateXXX methods
- ResultSetMetaData.getTableName
- ResultSetMetaData.getSchemaName
- ResultSetMetaData.getColumnName
- ResultSetMetaData.isAutoIncrement
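If none of those methods are in play, switching the property off is a one-liner. Below is a minimal sketch, assuming the usual JCC convention of passing driver properties through the java.util.Properties object handed to DriverManager; the host, port, database, and credentials are made up:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class DisableExtendedDescribe {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "user");
            props.setProperty("password", "password");
            // 2 == DB2BaseDataSource.NO: do not request Extended Describe information
            props.setProperty("enableExtendedDescribe", "2");

            // Hypothetical connection URL - illustration only
            Connection con = DriverManager.getConnection(
                "jdbc:db2://dbhost:50000/SAMPLE", props);

            // ... run the OLTP workload over this connection ...
            con.close();
        }
    }

The same value can also be set on a data source object (such as com.ibm.db2.jcc.DB2SimpleDataSource) or, in an application server like WebSphere, as a custom property on the data source definition; which route fits best depends on how your connections are handed out.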
The proof is in the pudding
In the case of Tivoli Maximo, disabling Extended Describe gathering in the JDBC driver solved the performance problem.
Last I heard, Maximo on DB2 performs better than Maximo on Oracle.
The world feels right again.
Here at OpenPages, one key performance-critical benchmark that I used to test out the property (a simple update to the WebSphere configuration) gave an instant 10% boost.
So, give it a try and release that handbrake you have been driving around with for years!
#Db2