When the UOWLOGDA/UOWLOGTI fields are blank, recycling the queue manager would remove the issue (the messages are non-persistent).
If, however, the UOWLOGDA/UOWLOGTI fields are non-blank, the queue manager will restore the transaction to exactly the same state following a queue manager restart (MQ has guaranteed that it can commit these messages if asked to do so by WAS).
If recycling the QMgr did cause any transactions with a non-blank UOWLOGDA/UOWLOGTI to be resolved, that would imply the recycle had caused WAS and the QMgr to go through XA transaction resolution (this should be driven by WAS as the transaction manager in this case). WAS should NOT require a queue manager restart for this to occur: after the failure that left the transaction in doubt, WAS should automatically have driven the XA reconnect logic when communications were next re-established between the QMgr and WAS.
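To see what the queue manager itself still thinks is outstanding, something like the following should list any externally coordinated in-doubt transactions and the state of the unit of work (QM1 is just a placeholder queue manager name; the exact attributes shown vary a little by MQ level):

  dspmqtrn -m QM1 -e

and, in runmqsc:

  DISPLAY CONN(*) TYPE(ALL) WHERE(UOWSTATE EQ UNRESOLVED)

If the entry disappears once WAS has reconnected, recovery has been driven; if it is still listed, WAS has not resolved it.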
Manually resolving the transaction risks effectively duplicating or losing a message. As discussed earlier, in the case of a non-persistent message there is always a chance of losing the message (and it would be pretty unusual to use a two-phase transaction to coordinate putting a non-persistent message). In the case of a persistent message it seems like a really bad idea to manually resolve a transaction without understanding exactly what went wrong to leave the transaction prepared. You should probably take this up with WAS support.
Original Message:
Sent: Wed March 08, 2023 06:30 AM
From: Sebastian Wilk
Subject: Uncommitted Messages on CLUSTER.TRANSMIT - How to find the cause
I'd much rather not dig that deep into the bits and bytes to figure out the culprit. The WAS logs sound like a good starting point; I will poke my colleagues whenever we get a more recent occurrence.
And that's what I meant by the messages aren't going anywhere: they will stay there until we resolve them manually or bounce the queue manager/server.
------------------------------
Sebastian Wilk
Original Message:
Sent: Wed March 08, 2023 04:02 AM
From: Andrew Hickson
Subject: Uncommitted Messages on CLUSTER.TRANSMIT - How to find the cause
A prepared transaction that has written to the log will certainly have been hardened to disk, in either (or both) the recovery log or the queue file for the SCTQ.
Depending on whether you are using linear or circular logging, and how much time and work has passed since the transaction was processed, you could find the data on disk with a combination of dmpmqlog and perhaps a hex dump of the queue file. However, this is a non-trivial task and you might be better off looking in your WAS logs at around the time reported in the UOWLOGTI/UOWLOGDA fields for signs of any errors before attempting to find the message data on disk.
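If you do want to go down that route, a raw log dump can be taken with something like the following (QM1 being a placeholder queue manager name), and the output then searched for the UOWLOGDA/UOWLOGTI timestamp and for SYSTEM.CLUSTER.TRANSMIT.QUEUE:

  dmpmqlog -m QM1 -b > QM1_log_dump.txt

Bear in mind that dmpmqlog needs the queue manager to be ended while it runs, and with circular logging the relevant extents may already have been overwritten, so this is very much a best-effort exercise.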
The messages CANNOT "go anywhere" until the transaction manager (WAS) has told the queue manager whether to commit or back out the prepared transaction (or you force the issue by manually resolving the transaction).
I would expect WAS support to be able to help you identify the WAS instance suffering the failure from the XA transaction Id.
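For reference, the XID is included in the dspmqtrn -q output (and appears as EXTURID on DISPLAY CONN), so something along these lines should pull out just the identifiers to hand to WAS support (QM1 again being a placeholder, and the exact output text varying a little by version):

  dspmqtrn -q -m QM1 | findstr /i "TranNum XID"

The formatID/gtrid/bqual values in that output are what WAS support would need to tie the transaction back to a particular application server.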
------------------------------
Andrew Hickson
Original Message:
Sent: Wed March 08, 2023 01:26 AM
From: Sebastian Wilk
Subject: Uncommitted Messages on CLUSTER.TRANSMIT - How to find the cause
Hey Andrew,
at least we can narrow it down to WAS at this point; while that's not much, it's a start.
I have another one where UOWLOGDA and UOWLOGTI are not blank.
Other than that, the symptoms are the same:
We have uncommitted messages sitting in the SCTQ. As you can see from the date, those messages aren't exactly new, and since they don't seem to be going anywhere, they will be dumped at the next bounce.
While not causing any major issues (yet), it generates alerts, as it's generally not the case for us to have messages sitting in the SCTQ unless we have maintenance going on somewhere.
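(For illustration, the kind of check that raises these alerts is something like the following in runmqsc, with UNCOM or CURDEPTH on the SCTQ going non-zero outside a maintenance window being the trigger; queue manager names omitted here:

  DISPLAY QSTATUS(SYSTEM.CLUSTER.TRANSMIT.QUEUE) CURDEPTH UNCOM
)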
Thank you for your input✌
------------------------------
Sebastian Wilk
Original Message:
Sent: Tue March 07, 2023 01:21 PM
From: Andrew Hickson
Subject: Uncommitted Messages on CLUSTER.TRANSMIT - How to find the cause
These messages are not simply "uncommitted"; they are in a prepared state. The XA format id is "WASD", which implies some WAS application put messages onto the SCTQ and then failed between the prepare and the commit (only an MQ channel should ever get messages from the SCTQ as part of a two-phase transaction).
The UOWLOGDA and UOWLOGTI fields are blank, which implies this transaction has never written anything to the MQ recovery log. This in turn implies that all of the messages in this transaction are non-persistent. It's a bit unusual to be using a two-phase transaction with a non-persistent message.
It's generally strongly advisable not to manually resolve transactions except in very extreme circumstances (typically when a resource manager or transaction manager is decommissioned following some major failure). Normally, when the transaction manager instance (WAS) next establishes a session with this queue manager, any in-doubt transactions should be automatically resolved. In this case the messages are clearly non-persistent, and so were always at risk of not being delivered, so it's probably ok to force a backout of this transaction (causing the puts to be undone).
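You've already mentioned rsvmqtrn; for a backout it is pointed at the transaction number reported by dspmqtrn (the TranNum value), roughly along these lines (QM1 and the 0 1 transaction number are placeholders - substitute the values from your own dspmqtrn output, and check the command's usage text for the exact parameter form on your level of MQ):

  rsvmqtrn -m QM1 -b 0 1

where -b backs the transaction out and -c would commit it. Re-run dspmqtrn afterwards to confirm the transaction has gone.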
As the messages haven't been committed it's not possible to view them through the MQI. As the messages are non-persistent they won't appear in the recovery log. It's quite possible the messages have never been written to disk and only exist in shared memory buffers inside the queue manager. There is no way for a customer to view the content of these buffers, and hence no way for you to view the messages.
------------------------------
Andrew Hickson
Original Message:
Sent: Mon March 06, 2023 05:11 AM
From: Sebastian Wilk
Subject: Uncommitted Messages on CLUSTER.TRANSMIT - How to find the cause
Hey fellow MQ folks,
we are unfortunately getting uncommitted messages into a couple of our SYSTEM.CLUSTER.TRANSMIT.QUEUES.
While we are able to resolve them via rsvmqtrn, we would like to figure out where these messages are coming from and why they are not being committed.
The queue managers in question run on Windows Server 2016 with MQ version 9.2.0.4.
The following info is available via dspmqtrn -q:
Is there any way to find more information about the messages? And ideally figure out what the problem is? Checking the logs yielded no results, and running a trace is problematic since it does happen at random and the frequency is not very high.
Any pointers would be greatly appreciated.
Kind regards
------------------------------
Sebastian Wilk
------------------------------