Thank you very much Andrew,
Problem determination is in progress. It seems something was changed in the network between the HQ and DR sites, which increased the iowait and consumed CPU resources. I have stopped the replication between the HQ and DR sites for now, which helped bring the business processes back to normal. I had already opened a case with IBM before posting the question here, as also advised by Om Prakash; thank you, Om Prakash.
I also want to know whether I can change the configuration so that data is replicated between the two sites from one of the passive nodes in HQ, to reduce the overhead on the active node of the RDQM HA group at HQ... just thinking about how to avoid a similar issue in the future.
Thank you,
Regards,
Rajesh
------------------------------
RAJESH VERMA
------------------------------
Original Message:
Sent: Fri January 12, 2024 05:06 AM
From: Andrew Hickson
Subject: RDQM DR/HA -- Performance Impact
You have to be very careful with what you read into IOWAIT in an MQ environment. What follows is mostly generic MQ advice, rather than RDQM-specific advice.
In most MQ environments nearly all of the forced IO should be to the MQ recovery log. The way the recovery log works is that all the active hConns essentially append to the log buffer. Each time some hConn requires its IO to be guaranteed, as much as can be efficiently written from the log buffer is written in a single forced write. When that write completes, the logger checks whether any other hConn has requested further IO to be forced and, if so, immediately schedules another write (again the biggest write that can be efficiently scheduled, based upon what data other tasks have appended to the log buffer). The overall effect is a batching effect where a small number of large writes is issued, rather than a high number of small writes. The algorithm works well with a wide variety of IO latencies, as might be expected given MQ's long history and therefore its exposure to different IO technologies.
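To illustrate that batching effect, here is a toy model I've sketched in C (this is not MQ's actual logger code; the record size, arrival rate and latencies are made-up numbers purely for illustration):
/* Toy model of log-buffer batching: records that arrive while one
 * forced write is in flight are flushed together by the next write,
 * so higher write latency => fewer, larger writes. Not MQ code. */
#include <stdio.h>
int main(void)
{
    const int records = 10000;      /* log appends requested by hConns */
    const int record_bytes = 600;   /* assumed average record size */
    const int arrival_us = 50;      /* assumed gap between appends */
    const int latencies_us[] = { 100, 500, 2000, 8000 };
    for (size_t i = 0; i < sizeof latencies_us / sizeof latencies_us[0]; i++) {
        int lat = latencies_us[i];
        int per_write = lat / arrival_us;   /* records batched per forced write */
        if (per_write < 1)
            per_write = 1;
        int writes = (records + per_write - 1) / per_write;
        printf("latency %5d us -> ~%5d forced writes, ~%6d bytes/write\n",
               lat, writes, per_write * record_bytes);
    }
    return 0;
}
The point is simply that the same workload pushed over a higher-latency (replicated) path shows up as a few big writes with long IOWAIT, which is not necessarily a problem in itself.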
In an HA/DR environment there tends to be more IO latency (as the IO has to be replicated to a remote node), and thus the tendency is towards a smaller number of larger writes (assuming sufficient concurrency in the application workload to keep appending to the log buffer). In such a situation very high IOWAIT times would be expected.
Have you run amqsrua to look at the LOG statistics? In particular the write sizes and the IO latency.
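If useful, something along these lines should show them (QMGR1 is just a placeholder for your queue manager name; the DISK/Log class and type come from the standard amqsrua resource-usage sample, and the exact metric names can vary slightly by MQ level):
amqsrua -m QMGR1 -c DISK -t Log
That publishes metrics each interval such as the log write latency, log write size and log bytes written, which is exactly what you want to compare before and after replication is enabled.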
Regarding the high load average, have you looked at the high-level MQI statistics to compare the number of MQI calls of different types? If you compare the number of successful MQPUTs with the total number of MQI calls in any interval, you'll get some idea of the efficiency of your applications. For example, an application that per message does MQCONN; MQOPEN(request); MQOPEN(reply); MQPUT(request); MQGET(reply); MQCLOSE(request); MQCLOSE(reply); MQDISC will use MUCH more CPU time than one which does
MQCONN; MQOPEN(request); MQOPEN(reply)
while(X)
    MQPUT(request)
    MQGET(reply)
end-while
MQCLOSE(request)
MQCLOSE(reply)
MQDISC
Looking at the high-level MQI stats would be a good first step in looking at unexpectedly high CPU usage.
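The same amqsrua sample can report those MQI counts per interval (again, QMGR1 is a placeholder and the metric names may differ slightly by MQ level):
amqsrua -m QMGR1 -c STATMQI -t CONNDISC
amqsrua -m QMGR1 -c STATMQI -t OPENCLOSE
amqsrua -m QMGR1 -c STATMQI -t PUT
amqsrua -m QMGR1 -c STATMQI -t GET
Comparing the MQCONN/MQDISC and MQOPEN/MQCLOSE counts against the MQPUT and MQGET counts gives a quick view of the ratio described above.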
------------------------------
Andrew Hickson
Original Message:
Sent: Thu January 11, 2024 05:39 PM
From: RAJESH VERMA
Subject: RDQM DR/HA -- Performance Impact
Hello,
Please help me find a remedy for a slow-responding queue manager running in an RDQM DR/HA environment. The message logs show the following errors:
Jan 11 09:51:59 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.200000
Jan 11 09:52:29 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.380000
Jan 11 09:52:59 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.310000
Jan 11 09:53:13 txulmqprd2 su[72087]: (to mqm) root on pts/0
Jan 11 09:53:29 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.410000
Jan 11 09:53:59 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.180000
Jan 11 09:54:05 txulmqprd2 kernel: drbd qm_mqp1_uv.dr _remote: [drbd_s_qm_mqp1_/10522] sending time expired, ko = 6
Jan 11 09:54:29 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.250000
Jan 11 09:54:59 txulmqprd2 pacemaker-controld[1817]: notice: High CPU load detected: 1.270000
Jan 11 09:55:01 txulmqprd2 systemd[1]: Configuration file /usr/lib/forescout/daemon/SecureConnector.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Jan 11 09:55:02 txulmqprd2 kernel: drbd qm_mqp1_uv.dr _remote: [drbd_s_qm_mqp1_/10522] sending time expired, ko = 6
@@@
It's impacting the applications that connect to the queue manager big time. The iowait time is too high. I will be grateful if anybody can advise me.
Thank you,
Rajesh
------------------------------
RAJESH VERMA
------------------------------