This is an important and interesting topic.
When setting up new integrations with MQ as the messaging backbone, how many of you are stringent about the specifications?
Do your customers know the volumes they are sending?
Do they know the message sizes?
What about periodicity (batch or time-critical)?
If so, how do you capture and document these details?
That is important information to have for an “always on” environment, of course.
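For what it is worth, we try to capture those answers in a structured intake record when an integration is onboarded. A minimal sketch in Python follows; every field name and value is purely illustrative, not any standard:

    # Hypothetical intake record for a new MQ integration;
    # all names and numbers here are illustrative assumptions.
    integration_spec = {
        "application": "ORDERS.FEED",          # assumed application name
        "expected_msgs_per_day": 250_000,      # peak daily volume agreed with the sender
        "avg_message_size_bytes": 4_096,
        "max_message_size_bytes": 1_048_576,   # feeds into MAXMSGL sizing
        "pattern": "batch",                    # "batch" or "time-critical"
        "batch_window": "02:00-04:00 UTC",     # only meaningful for batch senders
        "consumer_outage_tolerance_hours": 8,  # how long messages may accumulate
    }

    # MAXDEPTH can then be derived from the agreed volumes plus headroom,
    # rather than inherited from SYSTEM.DEFAULT.LOCAL.QUEUE.
    required_depth = (integration_spec["expected_msgs_per_day"]
                      * integration_spec["consumer_outage_tolerance_hours"] // 24) * 2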
We have “old” settings for queue depth, inherited from long ago. These get discovered along the way when migrating to new MQ versions and so forth.
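One way to make that discovery deliberate rather than accidental is to audit every local queue's MAXDEPTH before a migration. A rough sketch that drives runmqsc from Python (the queue manager name is assumed, and the output parsing is deliberately simplistic since the exact formatting varies a little between versions):

    import subprocess

    QMGR = "QM1"  # assumed queue manager name; adjust per environment

    # DISPLAY QLOCAL(*) MAXDEPTH CURDEPTH lists the configured depth limit
    # and the current depth for every local queue on the queue manager.
    result = subprocess.run(
        ["runmqsc", QMGR],
        input="DISPLAY QLOCAL(*) MAXDEPTH CURDEPTH\n",
        capture_output=True,
        text=True,
    )

    current_queue = None
    for line in result.stdout.splitlines():
        if "QUEUE(" in line:
            current_queue = line.split("QUEUE(")[1].split(")")[0]
        if "MAXDEPTH(5000)" in line and current_queue:
            # Still sitting on the value inherited from SYSTEM.DEFAULT.LOCAL.QUEUE.
            print(f"{current_queue} still has the inherited default MAXDEPTH(5000)")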
It is a good idea, as you mention, Glen, to put “badly behaving” flows on isolated queue managers.
From: Glen Brumbaugh <wsmqfam-ws@lists.imwuc.org>
Sent: Wednesday, April 25, 2018 03:25
To: WSMQFam-ws@lists.imwuc.org
Subject: [WSMQFam-ws] - Max Queue Depth: Are You at Risk?
I witnessed yet another Production Outage this week caused by an Application reaching the Maximum Queue Depth for a critical queue. As you're probably aware, the default Maximum Queue Depth on the SYSTEM.DEFAULT.LOCAL.QUEUE is 5,000 messages. Many defined queues inherit their default depth setting from this queue's properties. Is this really the depth you want?
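To see that inheritance in action, define a queue without specifying MAXDEPTH and display what it ends up with; the unspecified attributes are taken from SYSTEM.DEFAULT.LOCAL.QUEUE. A sketch with a made-up queue name and queue manager (run it on a sandbox, not in production):

    import subprocess

    QMGR = "QM1"  # assumed sandbox queue manager name

    mqsc = "\n".join([
        # No MAXDEPTH specified, so it is taken from SYSTEM.DEFAULT.LOCAL.QUEUE.
        "DEFINE QLOCAL(DEMO.INHERIT.TEST)",
        # Shows MAXDEPTH(5000) unless the default queue has been changed.
        "DISPLAY QLOCAL(DEMO.INHERIT.TEST) MAXDEPTH",
        # Clean up the demo queue.
        "DELETE QLOCAL(DEMO.INHERIT.TEST)",
    ]) + "\n"

    print(subprocess.run(["runmqsc", QMGR], input=mqsc,
                         capture_output=True, text=True).stdout)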
For 25 years now, I have warned about this setting. Part of any asynchronous messaging capability must be the ability to buffer incoming messages when they either cannot be consumed fast enough or are not being consumed at all because the reading application is down. The system should be designed for reliability and resilience, not for the "happy path". Consider the downside of setting this value to its maximum. The available disk size could be allowed to fill up, IF REQUIRED TO BY TRAFFIC VOLUMES. The alternative is to BREAK THE BUSINESS APPLICATION DUE TO YOUR ADMINISTRATIVE SETTINGS. Yes, that's right, you broke Production. Not the Application. Not the root cause of the backup. You. The administrator.
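For reference, the largest value MQSC accepts for MAXDEPTH is 999999999, at which point the practical ceiling becomes queue file and disk space rather than an arbitrary message count. A sketch with assumed names:

    import subprocess

    QMGR = "QM1"                  # assumed queue manager name
    QUEUE = "APP.ORDERS.REQUEST"  # assumed application queue

    # 999999999 is the maximum MAXDEPTH that MQSC allows; whether you actually
    # want to go that far depends on your disk sizing and isolation strategy.
    mqsc = f"ALTER QLOCAL({QUEUE}) MAXDEPTH(999999999)\n"
    subprocess.run(["runmqsc", QMGR], input=mqsc, text=True)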
Your only defense against this disaster is monitoring. Monitoring every queue (including SYSTEM queues). And actually paying attention to alerts. Even the false ones. 7 x 24 x 365. Now for some perspective. Disk has come a long way since MQ was designed and introduced. If you are constrained by disk space, then monitor disk space. We generally have much better server monitoring than middleware monitoring anyway. If you need to isolate apps from poorly behaving "run-away" apps, then isolate them. On separate servers. Problem solved. If an app wants to use up all of the space and crash, then at least the business application (through its IT arm) was responsible.
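As an illustration of the kind of depth monitoring I mean, here is a minimal polling sketch using the pymqi client library; the connection details, queue names, and 80% threshold are all assumptions to adapt:

    import pymqi

    QMGR = "QM1"                   # assumed queue manager name
    CHANNEL = "DEV.ADMIN.SVRCONN"  # assumed client channel
    CONN_INFO = "mqhost(1414)"     # assumed host(port) for the client connection
    ALERT_PCT = 80                 # assumed alert threshold, percent of MAXDEPTH

    QUEUES = ["APP.ORDERS.REQUEST", "APP.ORDERS.REPLY"]  # assumed queues to watch

    qmgr = pymqi.connect(QMGR, CHANNEL, CONN_INFO)
    try:
        for name in QUEUES:
            # Open each queue for inquiry and read current and maximum depth.
            q = pymqi.Queue(qmgr, name, pymqi.CMQC.MQOO_INQUIRE)
            current = q.inquire(pymqi.CMQC.MQIA_CURRENT_Q_DEPTH)
            maximum = q.inquire(pymqi.CMQC.MQIA_MAX_Q_DEPTH)
            q.close()
            pct = 100.0 * current / maximum
            if pct >= ALERT_PCT:
                # Hand this off to whatever alerting channel you already use.
                print(f"ALERT {name}: {current}/{maximum} ({pct:.0f}% full)")
    finally:
        qmgr.disconnect()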
If MQ causes an application to crash and take a production outage when there was plenty of space available, how do you justify that action? There really is no good reason to take a production outage in that instance. Especially when there may have been tens to hundreds of GB available. The only real remaining reason to still be using these settings, and it's not a particularly good one, is queue-full percentage monitoring. There are far better ways to accomplish this monitoring now, but I'm sure there are many legacy monitoring configurations based upon the depth setting. If that's your case, then revisit the depth you're using.
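One of those better ways is the queue manager's own queue depth high events: QDEPTHHI sets the threshold as a percentage of MAXDEPTH, and QDPHIEV(ENABLED) makes the queue manager write an event message to SYSTEM.ADMIN.PERFM.EVENT when it is crossed (PERFMEV must also be enabled on the queue manager), so the alert no longer depends on keeping MAXDEPTH artificially low. A sketch with assumed names:

    import subprocess

    QMGR = "QM1"                  # assumed queue manager name
    QUEUE = "APP.ORDERS.REQUEST"  # assumed application queue

    mqsc = "\n".join([
        # Enable performance events at the queue manager level.
        "ALTER QMGR PERFMEV(ENABLED)",
        # Raise a queue depth high event once the queue is 80% full,
        # independent of how large MAXDEPTH actually is.
        f"ALTER QLOCAL({QUEUE}) QDEPTHHI(80) QDPHIEV(ENABLED)",
    ]) + "\n"

    subprocess.run(["runmqsc", QMGR], input=mqsc, text=True)
    # The resulting event messages arrive on SYSTEM.ADMIN.PERFM.EVENT,
    # where a monitoring tool can pick them up.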
Times have changed. Have you? Check and evaluate your settings. They may still be appropriate, they may never have been appropriate, or they may need to be changed. It's spring. Perform an MQ tune-up.
Regards,
Glen Brumbaugh
-----End Original Message-----