Rajib Sarkar (rssarkar)
STSM Db2 Support
May 24, 2016.
Any well-written code should have some kind of tracing facility that allows an "inside" view of its workings, so that when problems happen the issue can be looked into while it is happening. You must have come across OS tracing tools such as DTrace ( Solaris ), SystemTap ( Linux ), and ProbeVue ( AIX ), or tcpdump ( Linux ) and iptrace ( AIX ) for network tracing. Along similar lines, DB2's tracing facility provides a view of the inner workings of the engine and can be used to find bottlenecks within it.
However, it needs to be handled with care ( as is the case with any tracing tool ): too much of it can cause huge performance degradation, and a wrong option can make a system so slow that it looks like a hang. I am going to discuss the optimal usage of the DB2 tracing facility for different kinds of scenarios, along with some of the options you should or should not use in different situations. The tool that controls DB2's tracing facility is known as db2trc.
Let me start with the different options available with the db2trc facility ( DB2 v10.1 FP5+ or v10.5 FP3+ ).
STAGE #1 - COLLECTION
Usage: db2trc [facility] <command> [-u]
[facility]
db2 - DB2 instance (default)
das - DB2 Administration Server instance
cf - CF Server
cfcli - CF Client
<command>
change - Change trace options
clear - Clear the trace buffer
dump - Generate trace dump file
info - Information
off - Disable the trace facility
on - Enable the trace facility
stop - Stop tracing
STAGE #2 - PARSING
Usage: db2trc <command> [-u]
<command>
flow - Generate control flow diagram
format - Format (includes CLI/ODBC driver traces)
formattedFlow - Generate global flow diagram
info - Information
perffmt - Format a performance trace
perfrep - Format a performance report
For more information add the "-u" option to any of the above commands
You can get the above usage by just typing "db2trc" at the command line.
Typically, when you turn on db2trc, the trace records are put into a circular buffer in memory. The size of the buffer can be found using the following command ( if the instance is up ):
$> db2pd -memsets | egrep "Address|Trace"
Name Address Id Size(Kb) Key DBP Type Unrsv(Kb) Used(Kb) HWM(Kb) Cmt(Kb) Uncmt(Kb) CmtRt(Kb) DcmtRt(Kb) HoldRt(Kb) Sngl
Trace 0x0000000000000000 14057472 33355 0x45EE9E74 0 -1 0 33355 0 33355 0 0 0 0 0
You can change the buffer size by setting the following registry variable BEFORE bringing up the instance:
db2set DB2TRC_DEF_BUFFSIZE=512m ( you can go up to a maximum of 1 GB; it accepts the size in bytes or with suffixes such as 'm' or 'g' ).
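A minimal sketch of the whole sequence ( stop the instance, set the variable, restart, then confirm; the 512m value is just the one used above ):
db2stop
db2set DB2TRC_DEF_BUFFSIZE=512m
db2start
db2pd -memsets | egrep "Address|Trace"    # confirm the Trace memory set now reflects the new size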
You can also write the trace records to a file with the -f option. However, this has to be used with caution and is recommended only for error / crash situations that happen very quickly, e.g. right after the engine is started. The default recommendation is to NEVER use it unless specifically asked to by a support engineer.
OK, now that the preamble is out of the way, let me get down to the nitty-gritty and show how db2trc can be used for different types of situations.
Overall performance is slow
A typical situation where everything is running slow. This happens very commonly right after an upgrade, but can happen at other times too.
Suggested command #1 to use:
db2trc on -l ( small L ) 512m -t
sleep 30
db2trc dmp trc.dmp.1
sleep 30
db2trc dmp trc.dmp.2
sleep 30
db2trc dmp trc.dmp.3
db2trc off
After this, format each of the trace dump files ( shown for the first dump; repeat for trc.dmp.2 and trc.dmp.3 ):
db2trc perfrep trc.dmp.1 trc.perfrep.1 -sort timeelapsed -order desc -g
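If you prefer a single copy-and-paste block, here is a minimal sketch that wraps the same collection and formatting sequence in a small shell script ( the 30-second interval, the number of dumps, and the file names simply mirror the steps above ):
#!/bin/sh
# Take three trace dumps 30 seconds apart, then format each one into a
# per-thread performance report ( same sequence as shown above ).
db2trc on -l 512m -t
for i in 1 2 3
do
    sleep 30
    db2trc dmp trc.dmp.$i
done
db2trc off
for i in 1 2 3
do
    db2trc perfrep trc.dmp.$i trc.perfrep.$i -sort timeelapsed -order desc -g
done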
This command will generate a "profile" trace for each of the threads working within the engine. The output will be something like:
Node : 0 , PID : 7771 , TID : 46912849832256
nCalls TotalElapsed AvgElapsed TotalSpent AvgSpent FunctionName
1 0.000512000 0.000512000 0.000003000 0.000003000 sqlriProcessSplitRowData
1 0.000509000 0.000509000 0.000001000 0.000001000 sqljsDrdaAsCBPutLongDataQueryCheck
1 0.000503000 0.000503000 0.000004000 0.000004000 sqlriWriteCsoIntoBuffer
1 0.000499000 0.000499000 0.000008000 0.000008000 sqlrimso
1 0.000491000 0.000491000 0.000001000 0.000001000 sqlriglf
1 0.000490000 0.000490000 0.000003000 0.000003000 sqldFetchLFD
1 0.000485000 0.000485000 0.000002000 0.000002000 sqldx_lob_read
1 0.000480000 0.000480000 0.000005000 0.000005000 sqldxReadLob
1 0.000474000 0.000474000 0.000003000 0.000003000 sqldxReadLobData
1 0.000471000 0.000471000 0.000001000 0.000001000 sqldx_diskread
1 0.000470000 0.000470000 0.000002000 0.000002000 sqlbDirectRead
1 0.000468000 0.000468000 0.000009000 0.000009000 sqlbDMSDirectRead
1 0.000328000 0.000328000 0.000028000 0.000028000 sqlbDMSMapAndRead
1 0.000295000 0.000295000 0.000002000 0.000002000 sqlbReadBlocks
1 0.000293000 0.000293000 0.000293000 0.000293000 sqloReadBlocks
2 0.000179000 0.000089500 0.000006000 0.000003000 sqlbDMScheckObjAlloc
2 0.000158000 0.000079000 0.000013000 0.000006500 sqlbIsExtentAllocated
The way to interpret the above is to look at the TotalElapsed and TotalSpent column values. TotalElapsed is the elapsed time between entering and leaving the function; TotalSpent is the time spent inside the function itself. For example, sqlriProcessSplitRowData shows a TotalElapsed of 0.000512 seconds and a TotalSpent of 0.000003 seconds, which means the function was entered at some time t1 and exited 0.000512 seconds later, but it actually spent only 0.000003 seconds inside itself; the rest of the time was spent in the functions it called. Since the command ordered the output in descending order, keep looking down for a function where TotalElapsed =~ TotalSpent. In the above example, it's the sqloReadBlocks function, which, as its name suggests, is an I/O function. So, of the 0.000512 seconds, 0.000293 seconds were spent in the I/O call. This is how you can figure out a bottleneck from this output.
This way, you can attempt to see where the bottleneck is. If it's an obvious problem ( refer to my SSL Perf blog ), it will surely bubble up here.
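On a large report it can be tedious to eyeball every thread, so as a rough helper you can flag the functions whose TotalSpent accounts for most of their TotalElapsed. A minimal sketch, assuming the six-column layout shown above and an arbitrary 80% threshold:
# Flag functions that spend >= 80% of their elapsed time inside themselves.
# Columns assumed: nCalls TotalElapsed AvgElapsed TotalSpent AvgSpent FunctionName
awk 'NF == 6 && $2 ~ /^[0-9.]+$/ && $2 + 0 > 0 && $4 / $2 >= 0.8 { printf "%-45s nCalls=%-8s elapsed=%-13s spent=%s\n", $6, $1, $2, $4 }' trc.perfrep.1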
Suggested command #2:
db2trc on -perfcount -t
sleep 180
db2trc dmp trc.perfcount
db2trc off
db2trc perffmt trc.perfcount trc.perffmt
This command counts how many times each function is called and how much time is spent in it. It doesn't have the per-thread granularity that the perfrep report provides, but it builds a nice profile of the whole engine. Typically, I use the following command to sort the trc.perffmt output:
sort -t"(" -k2nr trc.perffmt
The output is like:
1204129 (1294 sec, 118794000 nanosec) csmFetch
1204594 (1288 sec, 694681000 nanosec) sqlak_callbDrdaOutput
2815676 (1225 sec, 149336000 nanosec) csmGetCursorBuf
560574 (1223 sec, 174019000 nanosec) csmDriveFetch
560525 (1222 sec, 257363000 nanosec) sqljrDrdaArFetch
560486 (1200 sec, 247509000 nanosec) sqljrRecv
2414805 (1185 sec, 162567000 nanosec) clientboDDFMode1Conversions
2410385 (1183 sec, 317071000 nanosec) clientboWriteBytes
560425 (1173 sec, 235236000 nanosec) sqljsParse
560158 (1171 sec, 596516000 nanosec) sqljsParseRdbAccessed
559503 (1170 sec, 921528000 nanosec) sqljs_ddm_cntqry
560041 (1170 sec, 81641000 nanosec) sqlrr_fetch
560021 (1158 sec, 474592000 nanosec) sqlrr_process_fetch_request
1765631 (1135 sec, 986047000 nanosec) sqlribnoCbDrda
2953145 (1080 sec, 720990000 nanosec) sqljsDrdaAsCBPutLongDataQueryCheck
3490741 (1068 sec, 659227000 nanosec) sqlriWriteCsoIntoBuffer
3490211 (1067 sec, 508322000 nanosec) sqlrimso
3485339 (1054 sec, 156969000 nanosec) sqlriglf
3486049 (1051 sec, 759578000 nanosec) sqldFetchLFD
3485674 (1047 sec, 24618000 nanosec) sqldx_lob_read
3484393 (1039 sec, 146770000 nanosec) sqldxReadLob
3486801 (1031 sec, 887696000 nanosec) sqldxReadLobData
3483545 (1010 sec, 669271000 nanosec) sqlbDirectRead
3483541 (1009 sec, 188149000 nanosec) sqlbDMSDirectRead
3485667 (1007 sec, 485205000 nanosec) sqldx_diskread
3485702 (957 sec, 532408000 nanosec) sqlbDMSMapAndRead
3479676 (933 sec, 916670000 nanosec) sqlbReadBlocks
3470448 (926 sec, 439074000 nanosec) sqloReadBlocks
8182 (907 sec, 85306000 nanosec) sqloCSemP
8180 (906 sec, 780554000 nanosec) sqlbpfRemoveFromQ
If some call is a bottleneck, you will see it right around the top. Do not be fooled, however, by the typical waiting functions, such as those waiting for the client to send something ( sqljrRecv being one of them ). They will be at the top on a large instance with lots of threads if most of the threads are idle.
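A quick sanity check on the top entries is the average time per call, which tells you whether a function is high up because it is called millions of times or because individual calls are expensive. A rough sketch, assuming the "count (N sec, M nanosec) FunctionName" layout shown above:
# Print average milliseconds per call for each function, highest first.
awk '$1 + 0 > 0 { gsub(/\(/, "", $2); total_ms = $2 * 1000 + $4 / 1000000; printf "%14.4f ms/call %12d calls  %s\n", total_ms / $1, $1, $6 }' trc.perffmt | sort -nr | head -20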
What happens if you have a DPF system with the same issue?
The trace memory segment is SHARED by ALL the local members on a server, so you don't have to turn it on for each member separately. Usually, on a large DPF installation ( say, more than 50 nodes ), my rule of thumb is to take trace data on the coordinator node plus 2 data nodes. If you do need to turn on tracing on ALL the members, you can now run the following command from any ONE node:
db2trc on -l 512m -t -member all
If you want to dump the trace, you can also issue the command from any one node; however, the restriction is that the dump directory MUST be one that can be written to by ALL the nodes.
db2trc dmp trc.dmp -sdir <shared dir> -member all
It will automatically suffix the dump file name with the hostname of the server from which it is dumped.
If you don't have the command option ( -sdir ) available, you can do something like:
tstamp=`date "+%Y%m%d_%H%M%S"`
rah "||db2trc dmp trc.dmp.$tstamp.\`hostname -s\`"
The rest of the suggestions on perfcount and perfrep apply to DPF too.
My suggestion would be to use both methods ( I personally favor the perfrep option; however, it's available only in later releases ) to find obvious bottlenecks in an overall-slowdown situation.
A particular query is running slow
Identifying the application handle that is running the query is critical here, as you don't want to impact any other queries running on the database. You can identify the application handle via:
- a) If you have just connected and you are going to run the query within that same connection:
db2 "select mon_get_application_handle() from sysibm.dual" OR
db2 "select application_id() from sysibm.dual"
Suggested command to use:
db2trc on -l 512m -t -apphdl <application handle> ( you can specify up to 16 apphandles separated by commas ) OR
db2trc on -l 512m -t -appid <appid> ( you can specify up to 12 appids separated by commas )
sleep 60
db2trc dmp trc.dmp.1
sleep 60
db2trc dmp trc.dmp.2
db2trc off
db2trc perfrep trc.dmp.1 trc.perfrep.1
You can always verify what db2trc is tracing using:
db2trc info
The output will be like:
Marker : @TRACE@
Trace version : 7.0
Platform : Linux/X8664
Build level : s151221
maxBufferSize : 33554432 bytes (32 MB)
auxBufferSize : 0 bytes (0 MB)
allocationCount : 2
DB2TRCD pid : 0
Trace destination : <shared memory buffer>
numSuspended : 0
Trace starting time : 2016-05-24-19.02.51.280774-240
Buffer size : 33554432 bytes (32 MB)
Allow buffer to wrap : yes
Mask : *.*.*.*.*
Timestamps : enabled
PID.TID mask : all
Fixed data mask #1 : all
Fixed data mask #2 : all
Max system errors : infinite
Treat this rc as sys err: none
Member mask : none
Application handle mask : 272 << This will show you which apphandles it's tracing
Application ID mask : none
Using the same methodology I described above to find a bottleneck in the perfrep report, you can do the same for a specific application handle.
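As a minimal sketch of the whole cycle for the same-connection case ( the database name MYDB and the file slow_query.sql are placeholders; the SQL and db2trc commands are the ones shown above, and the CLP connection established by the first command is reused by the later db2 commands in the same shell ):
db2 connect to MYDB
APPHDL=`db2 -x "select mon_get_application_handle() from sysibm.dual"`
db2trc on -l 512m -t -apphdl $APPHDL
db2 -tvf slow_query.sql        # placeholder for the slow statement you want to profile
db2trc dmp trc.dmp.apphdl
db2trc off
db2trc perfrep trc.dmp.apphdl trc.perfrep.apphdl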
You can extend this to:
- a) trace a slow load process ( see the sketch after this list ).
- i) Identify the application handle running the load job using the command db2pd -db <dbname> -load
- ii) Start an apphandle trace ( db2trc on -l 512m -t -apphdl <apphdl> )
- b) trace a slow backup / restore
- i) Identify the application handle doing the backup / restore using the command db2pd -edus | grep db2med
The apphandle is embedded in the EDU name.
- ii) Start an apphandle trace ( db2trc on -l 512m -t -apphdl <apphdl> )
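For case (a), the end-to-end sequence looks like this ( SAMPLE and the file names are placeholders; read the application handle off the db2pd output manually, since its exact layout varies by release ):
db2pd -db SAMPLE -load        # note the application handle of the LOAD operation
db2trc on -l 512m -t -apphdl <apphdl>
sleep 60
db2trc dmp trc.dmp.load
db2trc off
db2trc perfrep trc.dmp.load trc.perfrep.load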
What if turning on db2trc affects your system performance adversely?
In cases where the above commands cause a performance degradation, you will have to use masks to reduce the trace footprint. Unfortunately, these masks have to come from support, as it takes some source-code knowledge and access to come up with a good mask that reduces the impact on the system. I am sharing a few common masks that we use to debug problems; you can use them if the problem description matches your scenario.
1) I/O performance
db2trc on -l 512m -t -Madd sqloReadBlocks -Madd sqloWriteBlocks
sleep 60
db2trc dmp trc.dmp
db2trc off
db2trc flw trc.dmp trc.flw -t
Run the attached script to parse the output. The syntax is:
findTimesByOperationsFromTrace.pl -operation "Read I/O" -file trc.flw -func "sqloReadBlocks enter" -func "sqloReadBlocks exit"
The output will be like:
Read I/O times between 1 and 5: 661 ( 30.1826% ) Total time(ms) = 928.960 ( 2.3804% )
Read I/O times between 5 and 9: 282 ( 12.8767% ) Total time(ms) = 2113.160 ( 5.4148% )
Read I/O times between 10 and 14: 249 ( 11.3699% ) Total time(ms) = 3093.450 ( 7.9266% )
Read I/O times between 15 and 19: 239 ( 10.9132% ) Total time(ms) = 4131.900 ( 10.5876% )
Read I/O times between 20 and 24: 182 ( 8.3105% ) Total time(ms) = 4082.750 ( 10.4616% )
Read I/O times between 25 and 49: 470 ( 21.4612% ) Total time(ms) = 16063.380 ( 41.1607% )
Read I/O times between 50 and 74: 71 ( 3.2420% ) Total time(ms) = 4231.670 ( 10.8432% )
Read I/O times between 75 and 99: 13 ( 0.5936% ) Total time(ms) = 1143.200 ( 2.9293% )
Read I/O times between 100 and 199: 20 ( 0.9132% ) Total time(ms) = 2628.130 ( 6.7343% )
Read I/O times between 200 and 299: 3 ( 0.1370% ) Total time(ms) = 609.390 ( 1.5615% )
Total number of Reads: 2190, Total time spent in Read I/O: 39025.990 (ms), Average time per Read I/O: 17.820 (ms)
**Disclaimer Note: This script is provided as-is without any support.
2) Debug slow stored procedures.
db2trc on -l 512m -t -Mfile trcMaskForStoredProcedures
sleep 60
db2trc dmp trc.dmp
db2trc off
db2trc perfrep trc.dmp trc.perfrep
<contents of file trcMaskForStoredProcedures>
add sqljcReceive.entry
add DB2.SQLJC.sqljcReceive.1178
add sqljcReceive.exit
add sqljsParse.entry
add sqljsParse.exit
add sqlrr_execute.entry
add sqlrr_execute.exit
add sqlrr_activity_exe_start.entry
add sqlrr_activity_exe_start.exit
add sqlriSectInvoke.entry
add sqlriSectInvoke.exit
add sqljcSend.entry
add DB2.SQLJC.sqljcSend.1177
add sqljcSend.exit
add PVM::run.entry
add PVM::run.exit
add sqlra_get_section.entry
add sqlra_get_section.exit
add sqlra_load_pkg.entry
add sqlra_load_pkg.exit
add sqlra_get_var.entry
add sqlra_get_var.exit
add sqlnn_cmpl.entry
add sqlnn_cmpl.exit
add DB2.SQLR.sqlrr_rds_common_pre1.200
add sqlra_search_for_pkg.entry
add sqlra_search_for_pkg.exit
add sqlra_open_pkg.entry
add sqlra_open_pkg.exit
add sqlra_pkg_lock.entry
add sqlra_pkg_lock.exit
add sqlra_sqlC_get_sibling.entry
add sqlra_sqlC_get_sibling.exit
add sqlra_sqlC_get_stmt.entry
add sqlra_sqlC_get_stmt.exit
add sqlricall.entry
add sqlricall.exit
</contents>
This mask file came out of a debugging effort for a very tricky stored-procedure problem.
As you can see, setting masks requires knowing where in the source code the potential problem could be, which is why most of the time it's advisable to contact support to help you set up a good trace mask that has the least impact on your system.
Other useful db2trc options
1) If you are dealing with millisecond-level issues, e.g. trying to match a WAS log entry to a DB2 trace entry to figure out which connection was servicing the WAS connection, you can use the -wc ( wall clock ) option when formatting the db2trc output to get wall-clock times. It then becomes much easier to correlate the entries in the DB2 trace to the application logs.
db2trc flw trc.dmp trc.flw -t -wc
2) If you have to trace to a file ( -f option ) but would like to limit the size of the file, you can use:
db2trc on -l ( small L ) 2g -f trc.dmp OR
db2trc on -i 2g -f trc.dmp
The first command will limit the file size to 2 GB and, once it hits the limit, will wrap to the front of the file and keep writing. The second command will stop writing to the file after it reaches 2 GB.
3) If you know the return code from db2diag.log for a specific issue, you can stop the trace when the error is hit by using the command:
db2trc on -l 512m -t -rc <signed integer value of the error code> -e 1 ( the -e 1 option stops the tracing right after the first occurrence of the error code ).
4) If you want to investigate BLU performance issues, you can turn on the BLU performance trace:
db2trc on -l 512m -t -Madd CDE_PERF_TRACE
Useful scripts
1) If you have the whole trace but want to focus on the times taken by a particular operation, e.g. row insert or I/O, you can use the script findTimesByOperationsFromTrace.pl
Syntax to use:
findTimesByOperationsFromTrace.pl -operation "Read I/O" -file trc.flw -func "sqloReadBlocks enter" -func "sqloReadBlocks exit" or
findTimesByOperationsFromTrace.pl -operation "RowInsert" -file trc.flw -func "sqldRowInsert enter" -func "sqldRowInsert exit" etc.
2) If the version you are using does not have the -perfrep option, you can use the analyzetrace script on a db2trc flw or fmt output to generate a profile. The output looks like:
Pid Lvl FuncName TTime(ms) HTime(ms) LTime(ms) AvgTime(ms) NCalls ERecnumHTime
---------- --- ------------------------------- --------------- --------------- --------------- --------------- --------- -------------
7771(Tid = 46913080518976, Node = 0)
0 sqloWaitEDUWaitPost 3999.644 999.952 999.875 999.911 4 1751
0 ossGetAvgCPULoad 0.432 0.126 0.044 0.086 5 2337
0 sqloGetEnvInternal 0.015 0.005 0.000 0.002 10 1
0 sqlogmt 0.009 0.003 0.001 0.002 5 3
0 sqeApplication::UpdatePITNodeI 0.006 0.002 0.001 0.001 5 1138
0 pdUpdateBaselinesIfNecessary 0.005 0.001 0.001 0.001 5 12
0 sqleRollupGetCachedCEAL 0.002 0.001 0.000 0.000 5 607
The column legends are:
Pid - Process id ( thread id ) of the traced process.
Lvl - The level of indentation of the function call, i.e. if it's 0 it's the outermost; if it's 1 it was probably called from the function at level 0.
TTime - Total time elapsed in the function
HTime - Highest time amongst ALL the calls to the function
LTime - Least time amongst ALL the calls to the function
AvgTime - Average time
NCalls - Number of calls to the function
ERecnumHTime - Entry record number of the highest-time call to this function. If you search for this record number in the flow output, you will find that call.
Things you should not do with db2trc, especially on a production system
1) Never use the -f option to start tracing to a file. It can pretty much bring the whole system to a grinding halt if the system is large or busy enough. You should use it ONLY if asked specifically by support. There are exceptional situations, such as a crash right after starting up the engine, where without the -f option support cannot really move forward. Never ever use this option for performance data collection, as it has its own overhead and skews the times so much that the whole trace becomes useless for debugging the performance issue.
2) On DPF installations, do not attempt to start db2trc on ALL members. It can cause tremendous slowness. Usually, if you do have to use db2trc on DPF systems, follow the rule of thumb of one coordinator node and 2 data nodes. There will always be exceptions to the rule, but trace ALL members only with guidance from support.
I hope you find this entry useful and helpful enough for you to start using db2trc. :-)
Please leave a comment if there are any questions or feedback.
#Db2