Aim to use mqclient.ini rather than setting a plethora of environment variables. You can put everything in mqclient.ini (even the equivalent of the MQSERVER environment variable). For an elaboration of this perspective, read: IBM MQ Little Gem #13: mqclient.ini
That blog post includes a table listing the mqclient.ini file attributes and their equivalent environment variables, where one exists. I have not found an environment variable equivalent for the reconnect fields.
Original Message:
Sent: Fri January 19, 2024 08:15 PM
From: Jim Creasman
Subject: Need help understanding error code AMQ9795E from MQ client
Just to close the circle on this, I was able to add a mqclient.ini file with the above options and set MQCLNTCF to its location. Setting the reconnect timeout to 30 seconds is better than 30 minutes (the default value) in our case. This allows our client code to get the MQRC_RECONNECT_FAILED error sooner and then react (e.g., log an error, stop taking requests, etc.).
I have a couple of follow-on questions:
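For anyone wanting to do something similar, here is a minimal sketch of how a Node.js client might detect the reconnect failure and react. It assumes the rejected error exposes a numeric `mqrc` property (as the MQError objects from the ibmmq package do). The names `putWithReconnectGuard`, `putFn` and `onFatal` are hypothetical and only for illustration; they are not part of any library.

```javascript
// Reason code taken from the log line quoted in this thread:
// MQRC_RECONNECT_FAILED [2548].
const MQRC_RECONNECT_FAILED = 2548;

// Returns true when an error object carries the reconnect-failed reason code.
// Assumption: the error has a numeric `mqrc` property.
function isReconnectFailed(err) {
  return Boolean(err) && err.mqrc === MQRC_RECONNECT_FAILED;
}

// Hypothetical wrapper: `putFn` stands in for whatever function performs the
// put (for example a call to Put1Promise), and `onFatal` for the application's
// own reaction (log an error, stop taking requests, and so on).
async function putWithReconnectGuard(putFn, onFatal) {
  try {
    return await putFn();
  } catch (err) {
    if (isReconnectFailed(err)) {
      onFatal(err);
    }
    throw err; // always surface the original error to the caller
  }
}
```

The wrapper rethrows in every case, so existing error handling keeps working; `onFatal` is only an extra hook for the one reason code that means the client has given up reconnecting.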
- Are there environment variables for the settings in the .ini file? Or, is the .ini file the only way to set these?
- The IBM documentation gives a good description of each, but I'm wondering if there are any blogs or other documentation on best practices for assigning the values in this file. Any recommended reads?
Thanks
------------------------------
Jim Creasman
Original Message:
Sent: Tue January 16, 2024 05:11 PM
From: Jim Creasman
Subject: Need help understanding error code AMQ9795E from MQ client
Some good news... I was able to find another occurrence of the same error (and around the same time) in our Test environment. The full explanation is:
EXPLANATION: The client channel definition location was specified as URL '<<<url here is correct>>>', however the file could not be retrieved from this location. The error returned was (16) 'HTTP response code said error'. The protocol specific response code was (503).
Given the URL is correct, and the CCDT service was running during the timeframe given, I suspect network gremlins were at work. The MQ client was unable to reach the CCDT service when needed. The client in Test was able to recover, but in Prod it eventually timed out on the retry. This put the client in a state of MQRC_RECONNECT_FAILED.
What I'm focused on now is how to properly recover when this happens in the future. In the Prod logs the client appears to try reconnecting for about 30 minutes before it finally reports PUT1: MQCC = MQCC_FAILED [2] MQRC = MQRC_RECONNECT_FAILED [2548]. This is too long. I'd like to shorten the time and have more visibility into the process. This page describes how to register for reconnect events. It also mentions using parameters in mqclient.ini to make adjustments. For example,
MQReconnectTimeout=30
ReconDelay=(1000,200)(2000,200)(4000,1000)
I'll be looking further into this, but always interested in advice/input from other groups. Our goal is to code our MQ clients to be self-healing whenever possible. When that isn't possible having the logs point directly to the issue (e.g., getting a 503) saves time in debugging.
Thanks
------------------------------
Jim Creasman
Original Message:
Sent: Tue January 16, 2024 11:32 AM
From: Jim Creasman
Subject: Need help understanding error code AMQ9795E from MQ client
Unfortunately, the AMQERR01.LOG file is the one piece of this puzzle I don't have. The service runs in Kubernetes. Since this was a production error, it was imperative to get the service back up and running. The easiest way to do that is to bring up a new instance. The old instance and file system were lost. If I can recreate, or if it happens again, I will be sure to nab the AMQ log file before recycling.
------------------------------
Jim Creasman
Original Message:
Sent: Fri January 12, 2024 11:10 PM
From: Morag Hughson
Subject: Need help understanding error code AMQ9795E from MQ client
Gave me a chuckle that my own blog post was used to answer the question, but not by me! :-D
Jim - Was the message written in the client side error log as well?
The full message would also give you the HTTP error code, as in the example that @om prakash showed, as well as the URL that was being used, which would be helpful in the diagnosis. I would definitely go and have a look in the client side error log, because you will get on better knowing the full details. Just knowing 16 = 'HTTP response code said error' isn't hugely helpful.
Cheers,
Morag
------------------------------
Morag Hughson
MQ Technical Education Specialist
MQGem Software Limited
Website: https://www.mqgem.com
Original Message:
Sent: Thu January 11, 2024 05:53 PM
From: om prakash
Subject: Need help understanding error code AMQ9795E from MQ client
Refer to the MQGem blog - https://mqgem.wordpress.com/2016/06/22/mqccdturl-and-mqclient-ini/
------------------------------
om prakash
Original Message:
Sent: Thu January 11, 2024 05:52 PM
From: om prakash
Subject: Need help understanding error code AMQ9795E from MQ client
The error returned was (16) 'HTTP response code said error'. The protocol specific response code was (404).
ACTION: Ensure that the URL is reachable and if necessary correct the details provided.
Running curl against the URL can add more info.
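If curl isn't available where the client runs (for example inside a container), the same check can be sketched from Node.js. This is only an illustration: `probeCcdtUrl` is a hypothetical helper name, and it assumes Node.js 18 or later, where a global `fetch` is available.

```javascript
// Fetches a CCDT URL once and reports what came back.
// `fetchImpl` defaults to the global fetch (Node.js 18+); it is a parameter
// so the function can be exercised without touching the network.
async function probeCcdtUrl(url, fetchImpl = globalThis.fetch) {
  try {
    const res = await fetchImpl(url);
    // An HTTP answer was received, even if it is an error such as 404 or 503.
    return { reachable: true, ok: res.ok, status: res.status };
  } catch (err) {
    // Network-level failure (DNS, refused connection, TLS): no HTTP status.
    return { reachable: false, ok: false, status: null, error: String(err) };
  }
}
```

Calling `probeCcdtUrl(yourCcdtUrl).then(console.log)` from the same host or pod as the client shows whether the failure is an HTTP status from the CCDT service or a lower-level network problem.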
------------------------------
om prakash
Original Message:
Sent: Thu January 11, 2024 05:12 PM
From: Jim Creasman
Subject: Need help understanding error code AMQ9795E from MQ client
Morag,
Unfortunately, that single line is all I see being printed. This is a NodeJS client that is using the mq-mqi-nodejs NPM package. This message is likely being produced by the underlying C libraries used by the Node code and is being output from within this code. All I know for sure is that it is produced when our code calls the Put1Promise function. This function calls MQPUT1 from the C libraries.
The error has been intermittent so it's not easy to track down. I was hoping that knowing the meaning of "16" might help.
Jim
------------------------------
Jim Creasman
Original Message:
Sent: Thu January 11, 2024 03:24 PM
From: Morag Hughson
Subject: Need help understanding error code AMQ9795E from MQ client
Hi Jim,
The full text of that error message is:-
AMQ9795E
The client channel definition could not be retrieved from its URL, error code (<insert_1>).
Severity 30 : Error
Explanation The client channel definition location was specified as URL <insert_3>, however the file could not be retrieved from this location.
The error returned was (<insert_1>) <insert_4>. The protocol specific response code was (<insert_2>).
Response Ensure that the URL is reachable and if necessary correct the details provided.
--
As you can see, there is a text version of the numeric code 16 straight after the number in the Explanation section of the message.
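If you want client logs to surface those inserts directly, a rough sketch of pulling them out of the explanation text could look like this. The pattern is based only on the message wording quoted in this thread and assumes the error text is wrapped in single quotes, as in the examples posted here; `parseAmq9795` is a hypothetical helper, not something MQ provides.

```javascript
// Matches the sentence pair from the AMQ9795E explanation:
//   The error returned was (<code>) '<text>'. The protocol specific response code was (<http>).
const AMQ9795_DETAILS =
  /The error returned was \((\d+)\)\s*'([^']*)'\.\s*The\s*protocol\s*specific response code was \((\d+)\)/;

// Returns the numeric error code, its text, and the protocol response code,
// or null when the text does not contain the expected sentences.
function parseAmq9795(text) {
  // Error log entries are line-wrapped, so collapse whitespace first.
  const flat = text.replace(/\s+/g, " ");
  const m = AMQ9795_DETAILS.exec(flat);
  if (!m) return null;
  return {
    errorCode: Number(m[1]),
    errorText: m[2].trim(),
    responseCode: Number(m[3]),
  };
}
```

Fed the explanation text from a full log entry, this yields the error code, its text, and the HTTP status as separate fields that can go into a structured log line.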
Could you paste in the whole error message from the error log, and not just the title? That will give us all the information.
Cheers,
Morag
------------------------------
Morag Hughson
MQ Technical Education Specialist
MQGem Software Limited
Website: https://www.mqgem.com
Original Message:
Sent: Thu January 11, 2024 11:58 AM
From: Jim Creasman
Subject: Need help understanding error code AMQ9795E from MQ client
I'm debugging a reconnect issue with one of our MQ clients that uses a CCDT URL to manage the connection. In this case the client side is a producer that puts messages to a queue. Occasionally, the reconnect action will be triggered and the client will then attempt to reconnect until the time limit has expired.
The initial error message I see in the logs is this one:
AMQ9795E: The client channel definition could not be retrieved from its URL, error code (16).
I haven't been able to find what "16" means in this context. Is this a code returned by the underlying TCP layer, or something IBM MQ assigns for a certain condition?
Thanks,
Jim
------------------------------
Jim Creasman