This is part of an occasional series of small blog posts where I (Morag) will write about some of the quirks in IBM MQ, in the hope of invoking the response, "Well, I didn't know that!" Read other posts in this series.
In this post I am going to write about the way IBM MQ behaves when you try to use multiple key repositories in a single MQ application process. While it is arguable whether this is a quirk or a restriction, I think it is clear that it is an unexpected behaviour that trips up users and from that perspective it is worth writing about.
This can be a confusing behaviour because it initially appears that the MQ client is sometimes randomly failing with MQRC_SSL_INITIALIZATION_ERROR (2393) and other times it works fine. When you try to diagnose the problem by isolating just the failing connection, it always works just fine.
The problem comes about when two different SSL/TLS Key Repositories are in use by a single application. This is only likely to happen when multiple queue managers in different environments are used by a single application, and is probably most often encountered with administration tools, rather than business applications.
Let's imagine a situation where I have two queue managers, DevQMgr1 and DevQMgr2, and I have SSL/TLS secured connectivity to both of them. When I connect to DevQMgr1 on its own it connects successfully. When I connect to DevQMgr2 on its own it also connects successfully. The problem occurs when I try to be connected to both of them at the same time: whichever one I connect to second fails with an MQRC_SSL_INITIALIZATION_ERROR (2393) reason code.
This error occurs because of a restriction put on IBM MQ by the GSKit sub-component. A single process cannot have two different KDB files loaded into memory at the same time. So when the second connection is made, it attempts to use the KDB already loaded into memory, and the likelihood is that this connection will fail because the certificates required to validate the partner will not be present. The error message that you will see in the client AMQERR01.LOG is:
AMQ9633E: Bad SSL certificate for channel 'MQGEM.SVRCONN.TLS'.
A certificate encountered during SSL handshaking is regarded as bad for one of the following reasons:
(a) it was formatted incorrectly and could not be validated
(b) it was formatted correctly but failed validation against the Certification Authority (CA) root and other certificates held on the local system
(c) it was found in a Certification Revocation List (CRL) on an LDAP server
(d) a CRL was specified but the CRL could not be found on the LDAP server
(e) an OCSP responder has indicated that it is revoked
(f) The keysize of the certificate is too small for the configured limit. (MinimumRSAKeySize)
Because you know that each connection works just fine on its own, you can rule out (a) and (c)-(f). This leaves only (b). It tells you that the "certificates held on the local system" - that is the ones in the KDB that your application is using - are unable to validate the certificate sent during the handshake. You know that the certificates in the intended KDB are able to do so, and therefore the MQ client must not be using that KDB in this failing case.
There's no big obvious, "you are not using the KDB you think you are using" error message to find, but the above logic should be enough to figure out that you are in this situation.
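To make the mechanics a little more concrete, here is a toy model in Python. It does not call IBM MQ or GSKit at all; the class, the KDB paths and the CA names are all made up for illustration. It simply assumes, as described above, that the first TLS connection decides which KDB the process holds in memory, and (going by the one-at-a-time workaround) that the KDB is released once every connection has been closed.

```python
# Toy model only: nothing here talks to IBM MQ or GSKit.
MQRC_SSL_INITIALIZATION_ERROR = 2393


class ClientProcess:
    """Models one application process that can hold only one KDB in memory."""

    def __init__(self, repositories):
        # Hypothetical mapping of KDB path -> set of CA names it trusts.
        self.repositories = repositories
        self.loaded_kdb = None   # the single KDB loaded for the whole process
        self.connected = set()

    def connect(self, qmgr, kdb_path, qmgr_ca):
        # The first connection loads its KDB; later connections reuse
        # whatever is already loaded, whatever kdb_path they ask for.
        if self.loaded_kdb is None:
            self.loaded_kdb = kdb_path
        trusted = self.repositories[self.loaded_kdb]
        if qmgr_ca not in trusted:
            return MQRC_SSL_INITIALIZATION_ERROR
        self.connected.add(qmgr)
        return 0

    def disconnect(self, qmgr):
        # Assumption: the KDB is released when the last connection goes away.
        self.connected.discard(qmgr)
        if not self.connected:
            self.loaded_kdb = None


repos = {
    "/keys/dev1.kdb": {"Dev1 CA"},
    "/keys/dev2.kdb": {"Dev2 CA"},
}
proc = ClientProcess(repos)
print(proc.connect("DevQMgr1", "/keys/dev1.kdb", "Dev1 CA"))  # prints 0
print(proc.connect("DevQMgr2", "/keys/dev2.kdb", "Dev2 CA"))  # prints 2393
```

Swap the order of the two connect calls and it is DevQMgr1 that fails instead, which is exactly why the problem looks random until you notice that it is always the second connection that suffers.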
So having discovered that you are in this situation, what can you do about it? I suppose there are two possible answers to that.
- Only connect to one queue manager at a time. When you are done using one queue manager, disconnect from that queue manager before connecting to the next one. If the application you are using does not have the ability to request a disconnect then you may have to restart the application to achieve this.
- Combine the certificates you need into a single KDB and use that KDB for all queue managers. Each connection can present a different certificate by ensuring the certificate label is provided as part of the connection details, so the identity you present to each queue manager does not have to change as a result of this re-factoring of KDBs at the client side. Equally, all the CA certificates that are needed to validate all the queue managers will be present in your single KDB, so nothing has to be changed at the queue manager end of things.
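If your application keeps a list of connection definitions, a quick check along the following lines might help you spot the situation before it bites. This is only a sketch: the dictionary keys (`qmgr`, `kdb`, `label`), the paths and the labels are hypothetical and not taken from any MQ API. It groups the definitions by the KDB they name, and shows what the second option looks like, where every connection points at one combined KDB but keeps its own certificate label.

```python
from collections import defaultdict


def kdbs_in_use(connections):
    """Group queue manager names by the KDB each connection definition names."""
    groups = defaultdict(list)
    for conn in connections:
        groups[conn["kdb"]].append(conn["qmgr"])
    return dict(groups)


def needs_attention(connections):
    """True when a single process would be asked to use more than one KDB."""
    return len(kdbs_in_use(connections)) > 1


def combine(connections, combined_kdb):
    """Return copies that all use one KDB, each keeping its certificate label."""
    return [dict(conn, kdb=combined_kdb) for conn in connections]


# Hypothetical connection definitions for the two queue managers above.
connections = [
    {"qmgr": "DevQMgr1", "kdb": "/keys/dev1.kdb", "label": "devuser1"},
    {"qmgr": "DevQMgr2", "kdb": "/keys/dev2.kdb", "label": "devuser2"},
]

print(needs_attention(connections))  # prints True
merged = combine(connections, "/keys/combined.kdb")
print(needs_attention(merged))       # prints False
```

Note that this only rearranges the definitions. The certificates themselves still have to be imported into the combined KDB with your usual key management tooling.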
It is worth noting here that there may be times when it is not suitable to combine KDBs. While technically there is no impediment, from a process perspective there may well be. For example, it may be unwise to have connections to both a production and a test system live and active at the same time, in the same application. You might want to keep a strict isolation between the two so that there is no opportunity for cross-talk. Some people would choose to keep these environments completely separate, even going to the lengths of having a separate instance of the application for production altogether.
Of course, this problem may not only occur between production and test environments, but may occur just as easily between different application development areas, so it is a matter of how you wish to manage your working processes as to whether you decide to combine or separate connections in this way.
Hopefully now you are aware that this specific issue can happen, and you will be able to spot it if it happens to you.