MQ

Queue Monitoring: How are your tools doing?

  • 1.  Queue Monitoring: How are your tools doing?

    Posted Sat May 05, 2018 11:22 AM

    What are you monitoring for on your queues?  Twenty-five years ago, we had far fewer options than we do today.  Here are some of the things that can be monitored:

    • Queue depth
    • Open Input Handles
    • Open Output Handles
    • Date & Time of last Get (only if Queue monitoring enabled)
    • Date & Time of last Put (only if Queue monitoring enabled)
    • Message Age (only if Queue monitoring enabled)
    • Queue Time (only if Queue monitoring enabled)

    If you aren't familiar with these, or if you don't have real-time Queue monitoring enabled, you should definitely check out the MONQ and MONCHL parameters on the Queue Manager, Queue, and Channel objects!   

     

    Regards,

    Glen Brumbaugh
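
A monitoring script usually has to turn the attributes listed above into values it can act on. The sketch below is one possible way to do that, assuming the tool receives MQSC-style text in which each attribute appears as NAME(value). The sample text and the queue name APP.ORDERS are hypothetical and typed by hand for illustration; they are not output captured from a real queue manager.

```python
import re

# Matches MQSC-style NAME(value) pairs, e.g. CURDEPTH(50) or QTIME(1200, 3400).
ATTR_PATTERN = re.compile(r"([A-Z][A-Z0-9]*)\(([^)]*)\)")


def parse_mqsc_attributes(text: str) -> dict[str, str]:
    """Return attribute name -> raw value string for every NAME(value) pair found."""
    return {name: value.strip() for name, value in ATTR_PATTERN.findall(text)}


# Hypothetical sample, hand-written for illustration only.
SAMPLE = """
   QUEUE(APP.ORDERS)                       TYPE(QUEUE)
   CURDEPTH(50)                            IPPROCS(0)
   LGETDATE(2018-05-05)                    LGETTIME(11.22.00)
   MSGAGE(600)                             OPPROCS(2)
   QTIME(1200, 3400)
"""

if __name__ == "__main__":
    attrs = parse_mqsc_attributes(SAMPLE)
    # Values stay as strings; convert only the ones the alerting logic needs.
    print(int(attrs["CURDEPTH"]), int(attrs["IPPROCS"]), int(attrs["MSGAGE"]))
```

Values are kept as raw strings because some attributes, such as QTIME, carry more than one number and may be blank when queue monitoring is switched off.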



  • 2.  RE: Queue Monitoring: How are your tools doing?

    Posted Sun May 06, 2018 04:31 PM

    And you can monitor them in different ways too. Take queue depth for example.

    You can do a quick check any time on queue depth to get the live value right now.

    DISPLAY QLOCAL(queue-name) CURDEPTH

    Or you can set up events to push a notification to your MQ monitoring tool when the depth of the queue approaches a value that you care about.

    Look into Queue Depth Events for more on that.

    For a historical look at the way your queue was used, you also have Accounting and Statistics (written to SMF on z/OS and as PCF messages to SYSTEM queues on Distributed) where you can find high and low watermarks for the depths of your queues over the time period covered by the data.

    Cheers,
    Morag



  • 3.  RE: Queue Monitoring: How are your tools doing?

    Posted Mon May 07, 2018 09:57 AM
    Good points by both Morag and Glen, and they raise an issue familiar to most customers: how do they spot a real problem (as opposed to a system design feature) and, having spotted one, how much problem determination do they do themselves?
    Queue depth is a great example to use.
    Suppose a peek display (or possibly a performance event) reveals that an application queue has a depth of, say, 50 messages.
    Is this good, bad, or OK?
    Only application/system knowledge can answer this.
    It may be that this is a triggered queue and someone has set a depth trigger of 50, i.e. MQ will start a process to (hopefully) drain the queue when the depth reaches 50.
    Or perhaps this is a really active queue in a busy system that has "burst" activity, and what we are seeing is not a static depth of 50 messages but an average depth with messages constantly coming and going. As Glen has said, MSGAGE and QTIME will give an idea as to whether we have a healthy dynamic queue or not, and if this is a target queue (i.e. someone should be getting from it), then IPPROCS is always worth checking.

    If this is a remote queue, then perhaps there is a problem with the channel and, as Morag knows better than I do, there is rich information to be found in CHSTATUS, especially XQMSGSA and XQTIME.
    And as an aside, what would Glen and Morag advise regarding monitoring of SYSTEM.CLUSTER.TRANSMIT.QUEUE?

    Finally, an MQ practitioner I know with many years' experience has always been slightly wary of using queue depth events. Given the asynchronous nature of MQ, if a depth event is generated when the queue is at 90% of some critical value, it is possible that by the time the monitoring program has processed this information, the queue will already have reached the critical depth.

    I cannot say that I have come across this issue, but I offer it here for what it is worth.
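
The concern above comes down to simple arithmetic: how many messages of headroom remain once the event fires, and how quickly the queue is filling. The following is a rough sketch of that calculation. The function names, the critical depth of 5000, and the put/get rates are all made-up illustration values, not figures from the thread.

```python
from typing import Optional


def depth_event_headroom(critical_depth: int, event_pct: int) -> int:
    """Messages of room left between the event threshold and the critical depth."""
    threshold_depth = critical_depth * event_pct // 100
    return critical_depth - threshold_depth


def seconds_until_critical(critical_depth: int, event_pct: int,
                           put_rate: float, get_rate: float) -> Optional[float]:
    """Seconds from the event firing until the critical depth is reached.

    Rates are in messages per second. Returns None when the queue is not
    filling (gets keep up with or outpace puts).
    """
    net_rate = put_rate - get_rate
    if net_rate <= 0:
        return None
    return depth_event_headroom(critical_depth, event_pct) / net_rate


if __name__ == "__main__":
    # Hypothetical numbers: event at 90% of 5000, puts outpacing gets by 1000 msg/s.
    print(depth_event_headroom(5000, 90))
    print(seconds_until_critical(5000, 90, put_rate=3000, get_rate=2000))
```

With a fast net fill rate, the window between the event and the critical depth can be well under a second, which is one argument for setting the event percentage lower on bursty queues.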



  • 4.  RE: Queue Monitoring: How are your tools doing?

    Posted Tue May 08, 2018 01:51 AM

    Hi Dermot, all,

    Another thing that comes to mind (for z/OS in particular) is monitoring how much offloading to SMDS is occurring and how full CF structures are becoming. We have seen requests such as the one below to provide additional information to enable more timely alerting when storage is becoming critical:

    https://www.ibm.com/developerworks/rfe/execute?use_case=viewChangeRequest&CR_ID=107639

    I would be interested in your collective feedback on how valuable this would be.

    Cheers, Matt



  • 5.  RE: Queue Monitoring: How are your tools doing?

    Posted Tue May 08, 2018 04:26 AM

    Hi Matthew. I am afraid that since leaving IBM, all my MQ work has been with either Distributed systems or zLinux so I am very much out of touch with z/OS and MQ Shared Queues.

    But having read the request, I am a bit puzzled, since it talks about the Admin Structure becoming full and, unless things have changed a lot in the past 5 years, the Admin Structure does NOT contain message queue data, so how is it getting full? From memory, it contains essentially a transaction table for all the active UOWs in the QSG, so the more UOWs, the bigger the table and the more CF storage it requires, and there is a danger of it exceeding its allocation. Since the Admin Structure is critical for Shared Queue operation, having some sort of usage event warning would seem a good idea (are there any native CF usage stats that would help?), but I would still like to know what is going on that is causing the structure to fill up.



  • 6.  RE: Queue Monitoring: How are your tools doing?

    Posted Wed May 09, 2018 03:31 AM

    Hi Dermot,

    You're right, this particular requirement resulted from a scenario where there was insufficient Admin Structure storage to handle in-flight UOWs. I believe we also store information about active connections. We could potentially look at providing more information about the number of active UOWs that the CF is able to handle. Similarly, on the application structure side, we could improve the reporting messages in the build-up to storage problems.

    Cheers, Matt



  • 7.  RE: Queue Monitoring: How are your tools doing?

    Posted Wed May 09, 2018 01:36 AM

    A "peek" at CURDEPTH can be misleading, as it can go up and down by thousands of messages per second. We use IBM Tivoli situations to take multiple samples over a time frame. If the depth exceeds a threshold over a number of samples, then we alert. This reduces false positives.  We also use high/low threshold queue depth event monitoring to alert above a certain percentage of MAXDEPTH.  We also alert for input process count = 0 AND current depth > 0 on some queues.  Message age alerts are also implemented in Tivoli, typically at 10 minutes.

    Glenn Baddeley
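
The rules described above (several samples over a window, no consumer with messages waiting, and a message age limit) can be sketched independently of any particular monitoring product. The class below is only an illustration of the idea, not how Tivoli situations are implemented; the class name, alert labels, and default values are assumptions, apart from the 10-minute age limit mentioned in the post.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueueSample:
    curdepth: int        # current queue depth
    ipprocs: int         # open input handles
    msgage_seconds: int  # age of the oldest message, in seconds


class QueueAlerter:
    """Evaluates each new sample against three simple rules."""

    def __init__(self, depth_threshold: int, window: int = 5,
                 breaches_needed: int = 4, max_age_seconds: int = 600):
        self.depth_threshold = depth_threshold
        self.breaches_needed = breaches_needed
        self.max_age_seconds = max_age_seconds
        self.recent = deque(maxlen=window)  # keeps only the last `window` samples

    def observe(self, sample: QueueSample) -> list[str]:
        self.recent.append(sample)
        alerts = []
        # Rule 1: depth over threshold in enough recent samples (cuts false positives).
        breaches = sum(1 for s in self.recent if s.curdepth > self.depth_threshold)
        if breaches >= self.breaches_needed:
            alerts.append("sustained-depth")
        # Rule 2: messages waiting but nobody has the queue open for input.
        if sample.ipprocs == 0 and sample.curdepth > 0:
            alerts.append("no-consumer")
        # Rule 3: oldest message older than the limit (600 s = 10 minutes).
        if sample.msgage_seconds > self.max_age_seconds:
            alerts.append("message-age")
        return alerts


if __name__ == "__main__":
    alerter = QueueAlerter(depth_threshold=100, window=3, breaches_needed=3)
    for depth in (150, 20, 150, 150, 150):
        print(depth, alerter.observe(QueueSample(depth, ipprocs=1, msgage_seconds=5)))
```

A single spike never triggers the depth rule here; only a run of high samples within the window does.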



  • 8.  RE: Queue Monitoring: How are your tools doing?

    Posted Wed May 09, 2018 08:59 AM

    I very much like looking at message age.  Even more so for transmit queues.   The combination of message age and depth together may give an idea if a down channel is impacting many users or only a few.

     

    I'd add that one thing that may not be obvious is that uncommitted gets will affect message age.  I have one application that will show up in dspmqtrn output as having uncommitted UOWs in the middle of the night.  And MSGAGE shows up on that queue as 8,062,333.  I think that likely goes back to March 10th.

    That makes me think that they are somehow bouncing their WebSphere instances and leaving messages gotten but not committed.   Every once in a while I force the commit.
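
A large MSGAGE value is easier to reason about once it is converted into days. The helper below is a small sketch of that conversion, assuming MSGAGE is read in seconds, which is how the DISPLAY QSTATUS documentation describes it. The function names are illustrative, and the time at which the value was read is something the caller has to supply.

```python
from datetime import datetime, timedelta


def msgage_as_duration(msgage_seconds: int) -> timedelta:
    """Convert a MSGAGE reading (seconds) into a timedelta."""
    return timedelta(seconds=msgage_seconds)


def estimate_oldest_put_time(observed_at: datetime, msgage_seconds: int) -> datetime:
    """Approximate put time of the oldest message, given when MSGAGE was read."""
    return observed_at - msgage_as_duration(msgage_seconds)


if __name__ == "__main__":
    print(msgage_as_duration(8_062_333))  # 93 days, 7:32:13
```

Subtracting the duration from the time of the reading gives an approximate put time for the oldest message, which can then be compared against application restarts or outages.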



  • 9.  RE: Queue Monitoring: How are your tools doing?

    Posted Wed May 09, 2018 08:51 PM

    I thought that WebSphere Application Server had a restart process to which you need to provide a user with sufficient MQ privileges across all applications, so that it can commit or roll back messages that were in an unfinished UOW when WAS shut down. At restart, WAS would run that process and "fix" all pending units of work.

    It looks like either the user assigned to that process doesn't have enough MQ privileges, or the process has never been set up?



  • 10.  RE: Queue Monitoring: How are your tools doing?

    Posted Thu May 10, 2018 12:18 PM

    If you have a link to how to set up that process it would be great.   I don't remember seeing any 2035's in my AMQERR01.LOG or in the event queue.



  • 11.  RE: Queue Monitoring: How are your tools doing?

    Posted Thu May 10, 2018 05:27 PM

    Sorry no link, but I remember having read about it once in the WAS MQ Setup documentation. So you should be able to find it in the WAS infocenter.



  • 12.  RE: Queue Monitoring: How are your tools doing?

    Posted Thu May 10, 2018 07:04 PM

    WebSphere applications have long had MQ connection issues.  Historically, we have frequently had to recycle the applications to enable them to reconnect to MQ.  This was due to a lack of error handling in the Java code itself.  It was neither an MQ nor a WebSphere problem. For quite some time now, however, the IBM software has been able to handle the problem without any modification to the Java code.

    In days long past, when we didn't have Client Reconnect capabilities, this was normal.  In the modern era, these applications should be configured to reconnect automatically.  The failure to do so can now be blamed on both the developers and the administrators.

    If you're not aware of the MQ Client Reconnect capabilities in WebSphere, look here:  WebSphere JMS Configuration.

     

    Regards,

    Glen Brumbaugh



  • 13.  RE: Queue Monitoring: How are your tools doing?

    Posted Sat May 12, 2018 12:12 PM

    If this topic has been of interest, this link (IMUC Blog on Monitoring Queues) will take you to a blog that takes a deep dive into the monitoring information available within MQ.

     

    Regards,

    Glen Brumbaugh

    IBM Champion (Cloud)



  • 14.  RE: Queue Monitoring: How are your tools doing?

    Posted Tue November 27, 2018 07:56 AM
    Hello

    The link is broken, can you provide another one?

    Regards

    ------------------------------
    Gérard Labadie
    ------------------------------



  • 15.  RE: Queue Monitoring: How are your tools doing?

    Posted Wed November 28, 2018 11:19 AM
    Probably this:
    IBM Middleware User Community : Blogs : MQ Insights #01 - Queueing: An Operations Research Viewpoint
    Queuing is what happens when there is an impedance mismatch between two separate processes. These two processes can be generally thought of as and processes. For MQ, I like to refer to them as and . If the Consumer/Reader process(es) cannot keep up with the Producer/Writer process(es), then a backlog will result.



    ------------------------------
    John Hawkins
    CTO
    Lightwell
    ------------------------------