
Patching with active RDQM installations/configuration

By Alex Chatt posted 3 days ago

  

As all RDQM customers/users know, RDQM relies on both Pacemaker and DRBD to work, which requires the installation of prerequisite packages. Although some of these Pacemaker packages can be found in common Red Hat repositories, only the packages that come with the MQ Advanced media are supported.

The Issue + diagnostics

With a few of our customers, we have seen issues occur when an RDQM node is patched. If a user runs “yum/dnf update” while an enabled repository contains later versions of the pacemaker packages that were installed as part of RDQM, yum/dnf can end up “updating” those packages to the versions it finds.

This often results in that node being unable to run any RDQM HA/HA-DR queue managers, and the rdqmstatus command reporting that the node is unavailable or that the RDQM subsystem can’t be started.

$ rdqmstatus

AMQ3785E: The replicated data subsystem has not been started.

If we then run the pacemaker command “crm status” on a couple of our HA nodes, we get a bit more information about what has gone wrong:

[node1]$ crm status

From Node: Node1

Cluster Summary:

  * Stack: corosync (Pacemaker is running)

  * Current DC: node3 (version 2.1.8-3.el9-3980678f0) - MIXED-VERSION partition with quorum

  * Last updated: Mon Feb  3 12:06:19 2025

  * Last change:  Wed Jan 29 13:44:40 2025 by root via root on node1

[node2]$ crm status

WARNING: could not get the pacemaker version, bad installation?

WARNING: list index out of range

Cluster Summary:

  * Stack: corosync (Pacemaker is running)

  * Current DC: node3 (version 2.1.8-3.el9-3980678f0) - MIXED-VERSION partition with quorum

  * Last updated: Mon Feb  3 12:16:11 2025 on node2

  * Last change:  Wed Jan 29 13:44:40 2025 by root via root on node1

From the output, we can see a few concerning messages: the “MIXED-VERSION” partition and the warnings about a bad pacemaker installation. At this point, the best thing to do is check which pacemaker package you have installed, using the “rpm -qa pacemaker” command:

[user@node1 ~]$ rpm -qa pacemaker

pacemaker-2.0.5.linbit-2.0.el8.x86_64

The RDQM pacemaker packages will all have “linbit” within the name, but if you have grabbed a newer pacemaker from a remote repo, you would expect to see something like:

[user@node1 ~]$ rpm -qa pacemaker

pacemaker-2.1.9-2.el8.x86_64

This rule applies to all packages found within the “/Advanced/RDQM/PreReqs/el{ver}/pacemaker-{ver}” folder, so it is best to check all of these packages to ensure none were updated or changed.
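As a rough sketch of that check, the helper below filters a list of installed package names and prints any that lack “linbit” in the name. The function name `flag_non_linbit` is made up for illustration, and the package patterns in the usage comment are an assumption borrowed from the exclude list later in this post; adjust them to match the files in your own PreReqs folder.

```shell
#!/bin/sh
# Hypothetical helper (not part of MQ): reads package names on stdin,
# one per line, and prints those WITHOUT "linbit" in the name.
# "|| true" keeps the exit status at 0 when every package is a linbit build.
flag_non_linbit() {
    grep -v 'linbit' || true
}

# Example use on an RDQM node (the patterns are an assumption; match them
# to the packages shipped in your pacemaker-{ver} PreReqs folder):
#   rpm -qa 'pacemaker*' 'corosync*' 'libqb*' 'resource-agents*' 'cluster*' | flag_non_linbit
```

If the command prints nothing, every matched package is still a linbit build. Any name it does print is a candidate for having been replaced from a remote repo.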

Solution and prevention

At this point, to fix the issue you will have to uninstall RDQM and the new pacemaker packages, and then reinstall the packages that came with the installation media.

This is a process that I’m sure many would like to avoid, so the question we should ask is “how do we prevent this from happening?”. If you only ever use RPM to install these RDQM pre-req packages, you can add the following line to “/etc/yum.conf” to stop yum/dnf from picking up any of the pre-req RDQM packages from any repository:

exclude=cluster* corosync* drbd kmod-drbd libqb* pacemaker* resource-agents*

We are aware, though, that many customers like to use yum/dnf to install these packages. For those customers, it is best to add the above line to the definition of any repo that could contain pacemaker packages (e.g. appstream). These .repo files can be found under “/etc/yum.repos.d”.
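To illustrate the per-repo approach, here is a minimal sketch that prints a copy of a .repo file with the exclude line added directly under a chosen section header. The function name `add_rdqm_exclude` and the file name in the usage comment are hypothetical, and the sketch assumes the section does not already have its own exclude line (if it does, merge the lists by hand instead). It only writes to stdout, so you can review the result before replacing anything.

```shell
#!/bin/sh
# Exclude list taken from the yum.conf example above.
RDQM_EXCLUDE='exclude=cluster* corosync* drbd kmod-drbd libqb* pacemaker* resource-agents*'

# Hypothetical helper: print the given .repo file with the exclude line
# inserted immediately after the "[repo_id]" section header.
# The original file is left untouched.
add_rdqm_exclude() {
    repo_file=$1
    repo_id=$2
    awk -v header="[$repo_id]" -v line="$RDQM_EXCLUDE" '
        { print }                      # copy every input line through
        $0 == header { print line }    # add the exclude under the header
    ' "$repo_file"
}

# Example use (the file name is an assumption; check /etc/yum.repos.d):
#   add_rdqm_exclude /etc/yum.repos.d/example.repo appstream
```

Compare the printed output with the original file, then copy it into place once you are happy with it.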

You can also just include this list within the yum/dnf command itself, like so:

yum update --exclude='cluster*,corosync*,drbd,kmod-drbd,libqb*,pacemaker*,resource-agents*'

Most automation tools, such as Ansible, also provide the ability to exclude certain packages.

With this exclude list in place, you should no longer be in danger of pulling in incompatible pacemaker (or DRBD) packages when performing node maintenance.
