The IBM MQ Appliance includes built-in support for both high availability (HA) and disaster recovery (DR). In HA and DR configurations, multiple appliances are used to provide a resilient solution for running queue managers, which supports rolling maintenance and which can protect against software and hardware failures, or data centre outages. Queue manager data is replicated between the appliances so that queue managers can fail over from one appliance to another. Fail over is automatic for HA and administratively orchestrated for DR.
The 9.3.2 firmware release enhances this capability to support an HA group of appliances on both sides of a DR link, matching the equivalent DR support that is available for RDQM on Linux. Prior to 9.3.2, an HA group of appliances could only have a single appliance at a remote location for disaster recovery.
Multiple queue managers can be defined in this configuration, and each queue manager can fail over between appliances independently. Queue managers can have a floating IP address within each HA group, but IP addresses cannot float across the DR link from one HA group to another. Applications can use MQ connectivity options, such as a CCDT or a connection name list, to try to connect to each site in turn, or they can be routed to the correct site by using a global load balancer, DNS entry, or an equivalent network routing capability.
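As a rough illustration of the client-side option, the following Python sketch builds a connection name list and a JSON-format CCDT that each list one endpoint per site, so that a client tries site 1 first and then site 2. The IP addresses, port, channel name, and queue manager name are placeholders invented for this example, and the helper functions are not part of any MQ API; treat this as a sketch of the idea and check the generated CCDT against the MQ client documentation before using it.

```python
import json


def conname_list(endpoints):
    """Format (host, port) pairs as an MQ connection name list: host(port),host(port)."""
    return ",".join(f"{host}({port})" for host, port in endpoints)


def json_ccdt(channel, queue_manager, endpoints):
    """Build a JSON CCDT structure with one client connection channel.

    The channel lists one connection per site, in the order they should be tried.
    """
    return {
        "channel": [
            {
                "name": channel,
                "type": "clientConnection",
                "clientConnection": {
                    "connection": [
                        {"host": host, "port": port} for host, port in endpoints
                    ],
                    "queueManager": queue_manager,
                },
            }
        ]
    }


# Hypothetical values: the floating IP address of the queue manager in each
# HA group (site 1 first, then site 2). Replace with your own addresses.
SITE_ENDPOINTS = [("198.51.100.10", 1414), ("203.0.113.10", 1414)]

if __name__ == "__main__":
    print(conname_list(SITE_ENDPOINTS))
    print(json.dumps(json_ccdt("APP.SVRCONN", "QM1", SITE_ENDPOINTS), indent=2))
```

Because a floating IP address only moves within an HA group, each site contributes its own entry; the client then finds the queue manager at whichever site it is currently running.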
To establish this configuration you use either the MQ Console or the command line interface (CLI) to perform the following steps:
1. Configure an HA group of appliances at site 1
2. Configure an HA group of appliances at site 2
3. For each queue manager:
a. Create an HA queue manager at one of the sites (e.g. site 1)
b. Configure the queue manager as the DR primary instance at that site (e.g. by using the crtdrprimary command)
c. Configure the queue manager as the DR secondary instance at the other site (e.g. by using the crtdrsecondary command)
These steps are the same as those for configuring an HA group to have a single remote appliance for DR. However, for DR between HA groups, additional information is provided to the crtdrprimary and the crtdrsecondary commands to indicate that an HA group is to be used at both sites, and to provide information about each appliance.
Replication of queue manager data is performed from the current HA primary appliance (that is, where the queue manager is running) to the HA primary appliance at the DR site (that is, where the queue manager will run after a DR fail over).
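To make the replication path concrete, here is a small conceptual model in Python. It is not appliance code and does not call any MQ API: the class names, appliance names, and the assumption that a DR fail over simply swaps the primary and secondary roles are simplifications chosen for illustration. It tracks, for one queue manager, which appliance is the HA primary at each site and derives the source and target of DR replication from that.

```python
from dataclasses import dataclass


@dataclass
class HAGroup:
    """One site's pair of appliances, as seen by a single queue manager."""
    site: str
    appliances: tuple   # names of the two appliances in the HA group
    active: int = 0     # index of the current HA primary for this queue manager

    @property
    def ha_primary(self):
        return self.appliances[self.active]

    def ha_failover(self):
        # Automatic HA fail over: the other appliance in the group takes over.
        self.active = 1 - self.active


@dataclass
class QueueManagerDR:
    """DR relationship for one queue manager between two HA groups."""
    name: str
    dr_primary: HAGroup
    dr_secondary: HAGroup

    def replication_link(self):
        # Data flows from the HA primary at the DR primary site
        # to the HA primary at the DR secondary site.
        return (self.dr_primary.ha_primary, self.dr_secondary.ha_primary)

    def dr_failover(self):
        # Simplified: an administrator swaps the DR roles of the two sites.
        self.dr_primary, self.dr_secondary = self.dr_secondary, self.dr_primary


if __name__ == "__main__":
    qm = QueueManagerDR(
        "QM1",
        HAGroup("site1", ("appl1a", "appl1b")),
        HAGroup("site2", ("appl2a", "appl2b")),
    )
    print(qm.replication_link())   # replication before any fail over
    qm.dr_primary.ha_failover()    # HA fail over within site 1
    print(qm.replication_link())
    qm.dr_failover()               # DR fail over to site 2
    print(qm.replication_link())
```

In this model an HA fail over changes only the source end of the replication link, while a DR fail over reverses its direction; each queue manager would have its own instance, reflecting that queue managers fail over independently.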
Prior to 9.3.2, it was possible to emulate this pattern by configuring DR between an HA group and a single appliance, whereby the single DR appliance is a member of an independent HA group. This approach is now referred to as the legacy solution in the official documentation. In this configuration, queue managers can fail over to the single DR appliance, then the DR configuration can be removed to add the queue managers to HA. The DR configuration can then be re-established with one of the appliances at the main site. The new 9.3.2 capability is significantly simpler from an operations perspective than the legacy solution because it avoids the need to reconfigure HA and DR after each DR fail over. Customers who have implemented the legacy solution can migrate to the new 9.3.2 capability by removing and recreating the DR configuration, without needing to recreate their queue managers.
For more information about the new support in 9.3.2 for disaster recovery between HA groups, please see the official documentation at https://www.ibm.com/docs/en/mq-appliance/9.3?topic=cha-configuring-disaster-recovery-fail-over-another-high-availability-group
For more information about the IBM MQ 9.3.2 continuous delivery release on the appliance and on other platforms, see Ian Harwood's blog article.