Original Message:
Sent: Wed October 16, 2024 07:27 AM
From: Abu Davis
Subject: IBM MQ NativeHA on Kubernetes with ibm-messaging/mq-helm chart
Is Native HA now supported on Kubernetes platforms such as AKS or Rancher for the MQ SC-2 release? Is the MQ Operator supported as well, or is Helm the only way for now?
------------------------------
Abu Davis
Original Message:
Sent: Tue March 28, 2023 05:48 AM
From: Arthur Barr
Subject: IBM MQ NativeHA on Kubernetes with ibm-messaging/mq-helm chart
Hi,
As you say, IBM MQ uses the approach of having only one of the three Pods pass its readiness probe. There are some disadvantages to the current approach, as you've noticed, where some tools equate "readiness" with success. We did consider other alternatives; however, they seemed to offer more disadvantages, for example:
- Have all instances be "ready", and have any clients or queue managers that connect try each of the instances in turn. This would (in theory) be handled automatically by an MQ client application inside the same Kubernetes cluster, but could take three TCP/IP connections (and TLS handshakes) to find the active instance. The clients would obviously need all three addresses. Connections from outside the cluster would generally need to use a single address (e.g. of a Router or load balancer), and have application logic to retry up to three times for as long as it's not randomly connected to the active instance. This should theoretically work in the MQ software product, but is not explicitly tested for Native HA.
- Have a custom MQ-aware router running in another Pod, which could route traffic to the active instance. This would require a new component, and would certainly complicate Native HA deployments. The new component would need to be HA itself, so potentially two stateless Pods for the routing, plus three queue manager Pods.
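To make the first alternative concrete, here is a minimal sketch of the "try each instance in turn" logic. It is illustrative only: `find_active_instance` and the `try_connect` callback are hypothetical names, not part of any MQ client API, and a real client would perform an actual connection (and TLS handshake) inside `try_connect`.

```python
from typing import Callable, Iterable, Optional


def find_active_instance(
    addresses: Iterable[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first address that accepts a connection, or None.

    `try_connect` is a hypothetical callback: it should return True when the
    instance at `address` accepts the connection, and return False or raise
    OSError when it does not (for example, because it is a standby replica).
    """
    for address in addresses:
        try:
            if try_connect(address):
                return address
        except OSError:
            # Treat a refused or failed connection like a standby instance
            # and move on to the next address.
            continue
    return None


# Usage with a fake connector: only "qm-2" is the active instance here.
attempts = []


def fake_connect(address: str) -> bool:
    attempts.append(address)
    return address == "qm-2"


active = find_active_instance(["qm-0", "qm-1", "qm-2"], fake_connect)
```

In the worst case shown here, the client makes three attempts before reaching the active instance, which is the cost described in the first bullet.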
Each option has downsides, but the current solution seemed the best fit. I don't think ArgoCD should be equating readiness with success, so would perhaps argue that they could improve here.
Rolling updates are discussed in "Considerations for performing your own rolling update of a Native HA queue manager". In particular, any rolling update that has no regard for which Pod is the elected leader is not desirable for Native HA, as it has a 66% chance of resulting in an extended outage (two fail-overs). For example, if the leader is the first or second Pod to be updated, then the new leader will itself be updated very soon. You could even get very unlucky and require three fail-overs. For this reason, the MQ Operator implements custom logic to perform the rolling update. This is a complex piece of code, and is closely tied both to how operators work and to the exact method of deployment (e.g. a StatefulSet of three replicas), so it is not a generic tool.
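The fail-over arithmetic can be checked by enumeration. The sketch below is a simplified model, not a description of how Native HA actually elects a leader: it assumes the Pods are restarted in a fixed order, that the initial leader is equally likely to be any Pod, and that the election policy is one of two hypothetical choices selected by the `prefer_not_updated` flag. The function name and the flag are made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def failover_distribution(order, prefer_not_updated=False):
    """Return {number_of_failovers: probability} for a leader-unaware update.

    Modelling assumptions (not taken from the MQ documentation):
    - Pods are restarted one at a time, in the given `order`.
    - The initial leader is equally likely to be any Pod.
    - When the leader restarts, the new leader is picked uniformly from the
      other Pods, or, if `prefer_not_updated` is True, uniformly from the
      other Pods that have not been restarted yet (when any exist).
    """
    pods = list(order)
    dist = defaultdict(Fraction)

    def walk(step, leader, updated, failovers, prob):
        if step == len(pods):
            dist[failovers] += prob
            return
        pod = pods[step]
        updated = updated | {pod}
        if pod != leader:
            # Restarting a standby does not cause a fail-over.
            walk(step + 1, leader, updated, failovers, prob)
            return
        candidates = [p for p in pods if p != pod]
        if prefer_not_updated:
            pending = [p for p in candidates if p not in updated]
            candidates = pending or candidates
        for new_leader in candidates:
            walk(step + 1, new_leader, updated, failovers + 1,
                 prob / len(candidates))

    for leader in pods:
        walk(0, leader, frozenset(), 0, Fraction(1, len(pods)))
    return dict(dist)


pessimistic = failover_distribution([0, 1, 2], prefer_not_updated=True)
uniform = failover_distribution([0, 1, 2])
```

Under the pessimistic policy, the model gives a probability of 1/3 for one fail-over, 1/2 for two, and 1/6 for three, so two or more fail-overs occur in 2/3 of cases, in line with the 66% figure. With a uniformly random election the split is 1/2, 5/12, and 1/12. Either way, a leader-unaware update frequently causes more than one fail-over, whereas restarting the leader last causes exactly one.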
There aren't any plans to change either of these areas at the moment. If you feel strongly that one of the alternatives mentioned would work better in your environment, you could submit a feature request at https://ibm.biz/mqideas
------------------------------
Arthur Barr
Container Architect, IBM MQ
IBM
Original Message:
Sent: Fri March 24, 2023 06:06 AM
From: Christoph Kuenzle
Subject: IBM MQ NativeHA on Kubernetes with ibm-messaging/mq-helm chart
Hi,
We have an IBM MQ NativeHA setup running on Kubernetes with the sample Helm chart ibm-messaging/mq-helm (github.com).
As it is implemented, the failover depends on the readinessProbe, using a Service in front of the 3 replicas to switch in case of failover.
This has 2 disadvantages:
- In ArgoCD the application is shown as "progressing", and the standby replicas constantly emit "not ready" events.
- Standard rolling updates are not possible. Instead, a script has to be executed that "ripples" through the replicas and deletes them one by one, waiting for each to become available before progressing to the next one.
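For readers unfamiliar with such a "ripple" script, the following is a minimal sketch of the idea, with the added refinement of restarting the current leader last so that only one failover occurs. It is not the script referred to above and not the MQ Operator's logic. The three callbacks (`find_leader`, `delete_pod`, `wait_until_recovered`) are hypothetical placeholders for whatever mechanism an environment uses to query and control its Pods.

```python
from typing import Callable, List, Sequence


def ripple_restart(
    pods: Sequence[str],
    find_leader: Callable[[], str],
    delete_pod: Callable[[str], None],
    wait_until_recovered: Callable[[str], None],
) -> List[str]:
    """Restart every Pod once, one at a time, leaving the leader until last.

    All three callbacks are hypothetical and must be supplied by the caller:
    - find_leader() returns the name of the currently active Pod,
    - delete_pod(name) deletes the Pod so that it is recreated,
    - wait_until_recovered(name) blocks until the recreated Pod has rejoined.
    Returns the order in which the Pods were restarted.
    """
    pending = list(pods)
    done: List[str] = []
    while pending:
        # Re-check the leader on every pass, in case it changed unexpectedly.
        leader = find_leader()
        standbys = [p for p in pending if p != leader]
        pod = standbys[0] if standbys else pending[0]
        delete_pod(pod)
        wait_until_recovered(pod)
        pending.remove(pod)
        done.append(pod)
    return done


# Usage with fakes: "qm-0" starts as leader; deleting it moves leadership.
state = {"leader": "qm-0", "failovers": 0}


def fake_delete(name: str) -> None:
    if name == state["leader"]:
        state["leader"] = "qm-1"
        state["failovers"] += 1


restart_order = ripple_restart(
    ["qm-0", "qm-1", "qm-2"],
    find_leader=lambda: state["leader"],
    delete_pod=fake_delete,
    wait_until_recovered=lambda name: None,
)
```

Note that with the chart's readiness probe a standby replica never reports "ready", so `wait_until_recovered` would have to rely on some other signal for standbys; which signal to use is left open here.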
Are there plans to provide a better solution, that allows for automatic rolling updates?
Thanks,
Chris
------------------------------
Christoph Kuenzle
------------------------------