We have been having a great discussion in another thread (Queue Monitoring: How Are Your Tools Doing?) about Queue depth. Have you ever thought about the dynamics of queuing, queue depth, monitoring, and what this means for performance? If so, then this is the Blog for you.
Queuing
Queuing is what happens when there is an impedance mismatch between two separate processes. These two processes can generally be thought of as Producer and Consumer processes. For MQ, I like to refer to them as Writers and Readers. If the Consumer/Reader process(es) cannot keep up with the Producer/Writer process(es), then a backlog will result. This backlog is the impedance mismatch made visible. In MQ, it's called queuing, and the symptom is rising Queue depth.
This leads to an important observation. Queue depth is not a property of a queue. Rather, it's a property of the interaction between two sets of processes: The Writer process(es) and the Reader process(es). Looking at Queue depth from a static point of view we can conclude:
- A Queue depth of zero may mean that the Reader process(es) is keeping up with the Writer process(es).
- A Queue depth of zero may mean that there is not an active Writer process(es).
- A Queue depth of zero does not mean that there is an active Reader process(es).
- A Queue depth greater than zero may mean that the Reader process(es) is temporarily not able to keep up with the Writer process(es).
- A Queue depth greater than zero may mean that there is no active Reader process(es).
Queue depth, in and of itself, is a static property. It can also be viewed as a trend over time. The change in queue depth can be calculated by a simple formula:
Previous Queue depth + Enqueues (Puts) - Dequeues (Gets) = Current Queue depth
The change in Queue depth thus can be seen as the result of the interaction between the Writer and the Reader processes. This can easily be viewed by using Oliver Fisse's excellent MH04 MQ SupportPac (xmqqstat). Looking at Queue depth from a dynamic point of view we can conclude:
- An Enqueue count greater than zero does mean that there is a functioning Writer process(es).
- A Dequeue count greater than zero does mean that there is a functioning Reader process(es).
- An increase in Queue depth means that there is a functioning Writer process(es).
- A decrease in Queue depth means that there is a functioning Reader process(es).
A caveat here: some of the above conclusions assume persistent, non-expiring messages. Obviously, non-persistent and expiring messages can further complicate diagnosing behavior. The only true measures of Reader and Writer process health are the Dequeue and Enqueue statistics.
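The depth identity and the activity inferences above can be sketched in a few lines of Python. This is a toy model, not an MQ API call: the counters stand in for interval statistics such as MsgEnqCount/MsgDeqCount, and it assumes persistent, non-expiring messages:

```python
# Sketch of the depth identity: depth is a property of the Writer/Reader interaction.

def current_depth(previous_depth: int, enqueues: int, dequeues: int) -> int:
    """Previous Queue depth + Enqueues (Puts) - Dequeues (Gets) = Current Queue depth."""
    return previous_depth + enqueues - dequeues

def interval_diagnosis(enqueues: int, dequeues: int) -> str:
    """Infer process activity from one statistics interval."""
    if enqueues > 0 and dequeues == 0:
        return "Writer active, no evidence of a Reader"
    if dequeues > 0 and enqueues == 0:
        return "Reader active, no evidence of a Writer"
    if enqueues == 0 and dequeues == 0:
        return "no activity this interval"
    return "both Writer and Reader active"

# A backlog forms when Writers outpace Readers:
depth = current_depth(previous_depth=100, enqueues=500, dequeues=350)  # -> 250
```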
Queue Monitoring Data Available
MQ provides monitoring information from several different sources: MQSC commands, PCF commands, statistics, and event messages. The following table (Table 1) lists the monitoring data that is available from MQ. The second table (Table 2) describes the configuration settings required to enable all of the supported Queue monitoring data fields.
Table 1: Queue Monitoring Data Fields

| Source | Sub-Source | Data Element | Comments |
| --- | --- | --- | --- |
| MQSC Command | DIS QLOCAL | CURDEPTH | Current queue depth. |
| MQSC Command | DIS QLOCAL | IPPROCS | Input (Reader) process handles. |
| MQSC Command | DIS QLOCAL | OPPROCS | Output (Writer) process handles. |
| MQSC Command | DIS QSTATUS | LGETDATE | Last "MQ Get" date. |
| MQSC Command | DIS QSTATUS | LGETTIME | Last "MQ Get" time. |
| MQSC Command | DIS QSTATUS | LPUTDATE | Last "MQ Put" date. |
| MQSC Command | DIS QSTATUS | LPUTTIME | Last "MQ Put" time. |
| MQSC Command | DIS QSTATUS | MSGAGE | Oldest message age: how long (in seconds) the oldest message has been on the queue. |
| MQSC Command | DIS QSTATUS | QTIME | Message time on queue: average times (in microseconds) that messages remain on the queue. Two averages are presented: the first is over the most recent messages, the second is over a longer interval. The lengths of these intervals are neither published nor available. |
| PCF Command | MQCMD_RESET_Q_STATS | TimeSinceReset | Elapsed time since statistics were last reset. |
| PCF Command | MQCMD_RESET_Q_STATS | HighQDepth | Maximum number of messages on the queue since statistics were last reset. |
| PCF Command | MQCMD_RESET_Q_STATS | MsgEnqCount | Number of messages enqueued (Put) onto the queue since statistics were last reset. |
| PCF Command | MQCMD_RESET_Q_STATS | MsgDeqCount | Number of messages dequeued (Get) from the queue since statistics were last reset. |
| Queue Manager Event | Performance Event | QSVCINT | A Queue Service Interval was exceeded. |
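The MQSC fields above can be inspected interactively with runmqsc. A minimal sketch, assuming a hypothetical queue manager QM1 and local queue APP.QUEUE1 (and Queue monitoring enabled, so the QSTATUS monitoring fields are populated):

```
$ runmqsc QM1
DISPLAY QLOCAL(APP.QUEUE1) CURDEPTH IPPROCS OPPROCS
DISPLAY QSTATUS(APP.QUEUE1) TYPE(QUEUE) LGETDATE LGETTIME LPUTDATE LPUTTIME MSGAGE QTIME
END
```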
Table 2: Queue Monitoring Configuration Fields

| Object | Parameter | Comments |
| --- | --- | --- |
| Queue Manager | MONQ | Default Queue Manager wide Queue monitoring (OFF, NONE, LOW, MEDIUM, HIGH). Note that there is no performance or reporting distinction between the LOW, MEDIUM, and HIGH settings; all of these settings enable Queue monitoring. |
| Queue Manager | STATQ | Default Queue Manager wide Queue statistics (OFF, NONE, ON). |
| Queue Manager | PERFMEV | Performance Events enabled (ENABLED, DISABLED). Performance Events must be enabled in order to receive Queue Service Interval event messages. |
| Queue | MONQ | Queue monitoring (OFF, QMGR, LOW, MEDIUM, HIGH). Queue monitoring must be enabled to allow the additional QSTATUS fields described above to be populated. |
| Queue | STATQ | Queue statistics (QMGR, OFF, ON). Queue statistics must be enabled to allow the Queue Statistics fields described above to be populated in the response to the PCF command. |
| Queue | QSVCIEV | Queue Service Interval event enabled (HIGH, OK, or NONE). This event must be enabled in order to receive Queue Service Interval High event messages. |
| Queue | QSVCINT | Queue Service Interval (milliseconds). If events are enabled, another message must be processed within this interval. |
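The settings in Table 2 can be enabled with a few MQSC commands. A sketch, assuming a hypothetical queue APP.QUEUE1 and a 10-second (10000 ms) service interval; adjust the levels to your own standards:

```
ALTER QMGR MONQ(MEDIUM) STATQ(ON) PERFMEV(ENABLED)
ALTER QLOCAL(APP.QUEUE1) MONQ(QMGR) STATQ(QMGR) QSVCIEV(HIGH) QSVCINT(10000)
```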
It is important to note that most of the monitoring information potentially available requires some configuration before it can be used! These settings should be part of your standard Queue Manager setup script as well as your Application Queue definition script. You do have those, don't you? To recap, Queue monitoring data comes from four different sources:
- MQ Basic Queue Local data fields (CURDEPTH, IPPROCS, OPPROCS).
- MQ Realtime Queue Monitoring data fields (LGETDATE, LGETTIME, LPUTDATE, LPUTTIME, MSGAGE, QTIME). Queue Monitoring must be enabled!
- MQ Queue Statistics data fields (TimeSinceReset, HighQDepth, MsgEnqCount, MsgDeqCount). Queue Statistics must be enabled!
- MQ Performance Event data fields (QSVCINT). Queue Performance Events must be enabled!
Monitoring Queues
In general, we are not really interested in monitoring queues, with a caveat to follow! We are actually much more interested in monitoring the Reader, Writer, and MQ processes. As long as these processes are healthy, everything is probably working as designed. From this perspective, and from the MQ monitoring data discussed previously, we can conclude:
- Key metric: Message Dequeue rate. If this rate is in the "healthy" range, then both MQ and the Reader process(es) are performing as designed.
- Key metric: Message Enqueue rate. If this rate is in the "healthy" range, then both MQ and the Writer process(es) are performing as designed.
- Key metric: QTime. If this number is in the "healthy" range, then the system is currently processing as designed. Note that SLAs may still not be met if there is also a backlog of messages (i.e. Queue depth) to process!
- Key metric: MsgAge. If this number is not in the healthy range, then the system is not currently meeting its SLA! This may be due to a backlog of messages being processed.
- Key metric: Disk space available to MQ. MQ requires disk space for both logs and message storage. If either of these needs is not met, the queuing system will fail. This should be an essential part of any monitoring strategy and is not specific to any particular Application or Queue.
- Key metric caveat: CURDEPTH. If there is an artificial breaking point built into the system (e.g. Maximum Queue Depth), then the Application will fail when this depth is reached. Therefore, under these specific conditions, this is an essential parameter to monitor.
- If the enqueue and dequeue rates are "healthy" then any queue depth is the result of a past imbalance between the readers and writers.
- If there is a large Queue depth, then the difference between these rates determines whether the Queue depth will rise or fall in the future. Note that the Reader capacity should normally be designed to exceed the Writer capacity by a significant margin. It is this margin that allows the Readers to catch up to the current messages and return to meeting SLAs when there is a backlog of messages requiring recovery.
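The last two points can be sketched numerically. This is a toy calculation, not an MQ API: the rates would come from successive statistics samples (e.g. MsgEnqCount/MsgDeqCount divided by TimeSinceReset), and the values below are made up:

```python
# Sketch: projecting backlog recovery from the Enqueue/Dequeue rates (msgs/sec).
from typing import Optional

def drain_seconds(depth: int, enqueue_rate: float, dequeue_rate: float) -> Optional[float]:
    """Seconds until a backlog of `depth` messages clears, or None if it will not shrink."""
    net = dequeue_rate - enqueue_rate  # the Reader capacity margin
    if net <= 0:
        return None  # Readers are not out-pacing Writers; depth will hold or rise
    return depth / net

# A 20% Reader margin clears a 6,000-message backlog at 100 puts/sec in 300 seconds:
print(drain_seconds(6000, enqueue_rate=100.0, dequeue_rate=120.0))  # -> 300.0
```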
Messaging Performance
This Blog has laid down the necessary queuing concepts, and related MQ data, to next delve into MQ messaging performance. I will cover that topic in an upcoming Blog post. Until then, happy messaging.