IBM MQ 9.4.4 adds support for Native HA and Cross-region replication outside of OpenShift and Kubernetes, for example on VMs, bare metal and Docker.
IBM MQ Native HA was originally developed to provide a cloud-native message availability solution for OpenShift; the latest MQ release extends support to new Linux environments.
Let's take a nostalgia trip to see how we arrived at where we are today.
MIQM (2009)
Earlier in my career at IBM, I was part of the team that originally developed the multi-instance queue manager (MIQM) support, first available in 2009 (WebSphere MQ 7.0.1). At the heart of a multi-instance queue manager is a shared filesystem that coordinates up to two instances (one active, one standby) using leased filesystem locks.
The failure of an active instance to renew its leases on the filesystem locks allows the standby instance the opportunity to take over as the active instance. The multi-instance queue manager is really quite a simple solution that builds on the resiliency of the shared filesystem, and for many MQ deployments this works perfectly well. As with any technology, the least available component determines the availability of the whole system: if a critical sub-component fails and isn't quickly restored or replaced, overall availability suffers.
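The lease-and-takeover idea can be sketched in a few lines. This is only an illustration of the general pattern, assuming a simple lease file holding a holder name and renewal timestamp; the function names, file format and TTL are hypothetical, not MQ internals (which use filesystem locks rather than timestamps in a file).

```python
# Hypothetical sketch of lease-based active/standby coordination, loosely
# modelled on the MIQM idea of leased locks on a shared filesystem.
# The lease format and TTL below are illustrative, not IBM MQ internals.
LEASE_TTL = 10.0  # seconds the active may go without renewing its lease

def renew_lease(lease_path, instance_id, now):
    """Active instance: record who holds the lease and when it was renewed."""
    with open(lease_path, "w") as f:
        f.write(f"{instance_id} {now}\n")

def try_takeover(lease_path, instance_id, now):
    """Standby instance: take over only if the lease has expired."""
    try:
        with open(lease_path) as f:
            holder, renewed = f.read().split()
    except FileNotFoundError:
        renew_lease(lease_path, instance_id, now)  # no active at all
        return True
    if now - float(renewed) > LEASE_TTL:  # active failed to renew in time
        renew_lease(lease_path, instance_id, now)
        return True
    return False
```

In this toy model the active instance calls `renew_lease` on a timer, while the standby polls `try_takeover` and becomes active only once the lease goes stale.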
The performance, redundancy and availability of the shared filesystem are critical to a multi-instance queue manager. Should any problems arise, diagnosing issues with a shared filesystem, such as NFSv4 locking problems, can involve collecting diagnostics from multiple vendors for both the filesystem client and the server, which isn't ideal.
RDQM (2018)
In 2018 (IBM MQ 9.0.4), Replicated Data Queue Manager (RDQM) was introduced to maintain multiple distributed copies of MQ data, providing redundancy and preventing a single filesystem or node from becoming a single point of failure. RDQM support is available for RHEL on x86-64 and depends upon two additional components, DRBD and Pacemaker, to provide availability.
RDQM and maintaining replicated copies of queue manager data was an evolutionary step forward in providing increased message availability for IBM MQ; however, the component dependencies and limited platform support have been prohibitive barriers to adoption for some. Significantly, container environments cannot use RDQM as a solution because of its dependency on the DRBD kernel drivers.
Native HA (2021)
Here is where Native HA picks up the container challenge. First appearing as a unique feature of Cloud Pak for Integration (CP4I) in 2021, it offered a replicated topology similar to RDQM but delivered that capability inside the core IBM MQ queue manager. No dependencies, a truly native capability. Since the initial release, license entitlements and platform coverage have broadened to include MQ Advanced and Kubernetes, and IBM MQ 9.4.4 extends the reach of Native HA even further.
One significant difference between RDQM and Native HA is that the former replicates queue manager data at a filesystem level, whilst Native HA replicates just the recovery log. It's worth looking at that difference in more detail to understand how it works.
The recovery log is the master copy of everything persistent in a queue manager, whether that be queue definitions, persistent messages, durable subscriptions or transactions; in fact, anything that you rely on MQ to recover in the event of a system crash is written to the recovery log. Log writes that must be recoverable, such as an MQCMIT, are forced to physical disk before the operation is confirmed back to the application, thus providing MQ with assured messaging. When a queue manager starts up, the recovery log is replayed to perform crash recovery from the last point of consistency, which is known as a checkpoint.
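The force-to-disk-before-confirm pattern is the classic write-ahead log. Here is a minimal sketch of the idea, assuming a simple newline-delimited record format; the class and record names are illustrative, not the IBM MQ log implementation.

```python
import os

# Minimal write-ahead log sketch: buffered appends are cheap, but a record
# only counts as recoverable once it has been forced (fsync'd) to disk.
# Illustrative only; not the IBM MQ recovery log format.
class RecoveryLog:
    def __init__(self, path):
        self.path = path
        self.f = open(path, "ab")

    def write(self, record: bytes):
        """Buffered append; fast, but not yet guaranteed durable."""
        self.f.write(record + b"\n")

    def force(self):
        """Flush and fsync so the records survive a crash
        (as a recoverable operation like MQCMIT requires)."""
        self.f.flush()
        os.fsync(self.f.fileno())

    def replay(self):
        """On restart, return every record that made it to disk."""
        with open(self.path, "rb") as f:
            return [line.rstrip(b"\n") for line in f]
```

In this sketch, a commit would call `write` followed by `force`, and only then report success to the application; uncommitted buffered writes can be lost in a crash without breaking the assurance.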
Alongside the recovery log are MQ's data files, often referred to colloquially as queue files. You should consider these files as the working copy of MQ's state, and they may contain a mix of persistent and non-persistent data: for example, when a queue exceeds its memory buffers for storing messages, the queue file steps in. Unlike the MQ recovery log, these file updates are buffered for performance. Periodically, new checkpoints are recorded to ensure that the contents of the queue files accurately reflect what was written to the recovery log, thus reducing the amount of log replay that would need to occur if the queue manager were to crash.
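The checkpoint idea above can be sketched as a snapshot plus a log position: recovery loads the snapshot and replays only the log tail written after it. Everything here, the JSON snapshot format and the function names, is a hypothetical illustration, not how MQ stores queue files.

```python
import json

# Illustrative checkpoint sketch: periodically snapshot the working state
# together with how far into the log it reflects, so crash recovery only
# replays records after the checkpoint. Not MQ's actual file formats.
def take_checkpoint(state, applied_upto, snapshot_path):
    """Write a queue-file-like snapshot plus the log position it reflects."""
    with open(snapshot_path, "w") as f:
        json.dump({"state": state, "applied_upto": applied_upto}, f)

def recover(log_records, snapshot_path):
    """Load the snapshot, then replay only the log tail beyond it."""
    with open(snapshot_path) as f:
        snap = json.load(f)
    state = snap["state"]
    for record in log_records[snap["applied_upto"]:]:
        state.append(record)  # "apply" a record: here, just accumulate it
    return state
```

The more frequently checkpoints are taken, the shorter the replay at restart, at the cost of more snapshot I/O during normal running, which is exactly the trade-off the buffered queue-file writes are balancing.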
Native HA builds on top of the most solid of foundations, the IBM MQ recovery logs and queue files.
Native HA works by deploying three replica copies of the recovery log for each queue manager, applying a Raft-based consensus algorithm to provide distributed consistency. The replicas elect a leader based on which one has the best copy of the data, and the remaining replicas follow the leader by replicating the recovery log, continually replaying the log data to reconstruct the queue files.
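The essence of quorum-based replication across three replicas can be shown in a toy model: the leader appends an entry locally, replicates it to the followers, and treats it as committed once a majority of replicas hold it. This hedged sketch deliberately omits elections, terms and persistence, and is not the IBM MQ implementation.

```python
# Toy majority-quorum replication across three replicas, in the spirit of
# Raft-style log replication. Names are illustrative, not MQ internals.
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []

    def append(self, entry):
        self.log.append(entry)
        return True  # acknowledge successful replication

def replicate(leader, followers, entry):
    """Leader appends locally, then commits once a majority holds the entry."""
    leader.append(entry)
    acks = 1  # the leader's own copy counts toward the quorum
    for follower in followers:
        if follower.append(entry):
            acks += 1
    quorum = (1 + len(followers)) // 2 + 1  # 2 of 3 replicas
    return acks >= quorum  # committed only with majority acknowledgement
```

With three replicas the quorum is two, which is why a Native HA queue manager can keep running with one replica down: the leader plus one follower still form a majority.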
Native HA Cross-region replication (2025)
So a Native HA queue manager works by replicating MQ recovery logs and replaying their contents to rebuild the queue files. This replication is built into MQ itself, with no external dependencies, removing the need for the storage layer to replicate data. Earlier this year, in 2025 (IBM MQ 9.4.2), we announced an extension to Native HA in the form of Cross-region replication, adding asynchronous replication to provide an out-of-region disaster recovery option.
The latest continuous delivery (CD) release this year (IBM MQ 9.4.4) marks a further significant milestone for Native HA: no new capabilities this time, but a broadening of supported Linux environments to include VMs, bare metal and other container deployments.