Performance of Native HA MQ with Red Hat ODF Storage
Based on research conducted by the IBM team, we have identified that ODF serves as a general-purpose replicated data solution for OpenShift. It ensures data integrity for single volumes by replicating data to other nodes, which makes it highly valuable for many general-purpose workloads and a sensible starting point. However, it is not suitable for every scenario, particularly for workloads that demand low disk latency, such as IBM MQ in certain configurations. IBM MQ can be deployed in a Native HA configuration, in which it independently replicates data across nodes for high availability, making ODF's own replication redundant in this regard.
It is advisable to use one of the following (a minimal deployment sketch follows this list):
a) Any non-replicating block storage for Native HA MQ.
or
b) The Red Hat ODF "non-resilient" storage class for Native HA MQ.
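As a minimal sketch of how this looks in practice with the IBM MQ Operator, a Native HA queue manager can simply be pointed at the chosen storage class. The storage class, namespace, license, and version values below are placeholders to substitute for your environment; the CPU/memory figures mirror the test configuration used later in this document.

oc apply -f - <<EOF
apiVersion: mq.ibm.com/v1beta1
kind: QueueManager
metadata:
  name: eipmqmha
  namespace: mq                    # placeholder namespace
spec:
  license:
    accept: true
    license: <license-id>          # license ID for your MQ version
    use: NonProduction
  version: <mq-version>            # e.g. a 9.3.x.x-rN release
  queueManager:
    name: EIPMQMHA
    availability:
      type: NativeHA               # MQ keeps three replicas of its own data
    resources:
      requests:
        cpu: 2
        memory: 16Gi
      limits:
        cpu: 2
        memory: 16Gi
    storage:
      defaultClass: <block-storage-class>   # option (a) or (b) above
EOF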
After testing the environment, we found that, for this particular workload, we can significantly reduce latency and eliminate timeout issues by omitting ODF for MQ, while still maintaining reliability and data integrity. The objective of performance testing is to determine the optimal configuration for maximum throughput, and we have the flexibility to choose from various storage types to achieve that goal.
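For reference, the storage classes available to choose from on the cluster can be listed with:

oc get storageclass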
Testing architecture employed:
Multi-instance application ----msg----> ACE1 ----msg----> ACE2 ----msg----> ACE3 ----msg----> ACE4 ----msg----> destination system
|
|
DB
MQ is present at every step between ACE1, ACE2, ACE3, ACE4, and the destination system. All communication between ACE flows occurs through MQ Input and Output nodes. Both ACE and MQ run in pods on the worker nodes shown in green in the architecture diagram.
TEST 1 (Conducted with NativeHA MQ using ODF storage)
Configurations :-
Max No. of Connections - 500/port (Total 9 ports used)
Timeout on TCPServerInput Node, TCPServerReceive Node and TCPServerOutput Node - 5s
Native HA MQ CPU - 2Core
Native HA MQ Memory - 16Gi
replica=3 is set in ODF configuration
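The replica=3 setting can be cross-checked against the Ceph block pool that backs the ODF storage class; the pool name below is the typical ODF default and may differ on your cluster:

oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage -o jsonpath='{.spec.replicated.size}'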
Results of Testing :-
a) Total Records Hit by UPI Switch - 12000
b) Total Records Received by ACE - 1455
c) ./amqsrua -m EIPMQMHA -c DISK -t Log
Publication received PutDate:20240228 PutTime:08105414 Interval:4.261 seconds
Log - bytes in use 16777216
Log - bytes max 1677721600
Log file system - bytes in use 1173692416
Log file system - bytes max 52576092160
Log - physical bytes written 442368 103809/sec
Log - logical bytes written 368640 86508/sec
Log - write latency 53425 uSec
Log - write size 6025
Log - current primary space in use 0.72%
Log - workload primary space utilization 1.42%
Log - bytes required for media recovery 21MB
Log - bytes occupied by reusable extents 0MB
TEST 2 (Conducted with Single Resilient QM using ODF storage)
Configurations :-
Max No. of Connections - 500/port (Total 9 ports used)
Timeout on TCPServerInput Node, TCPServerReceive Node and TCPServerOutput Node - 5s
Single Resilient MQ CPU - 2Core
Single Resilient MQ Memory - 16Gi
replica=3 is set in ODF configuration
Results of Testing :-
a) Total Records Hit by UPI Switch - 12000
b) Total Records Received by ACE - 2165
c) ./amqsrua -m EIPMQMSI -c DISK -t Log
Publication received PutDate:20240228 PutTime:08382303 Interval:9.644 seconds
Log - bytes in use 1006632960
Log - bytes max 1677721600
Log file system - bytes in use 1111490560
Log file system - bytes max 53687091200
Log - physical bytes written 1077248 111694/sec
Log - logical bytes written 983804 102006/sec
Log - write latency 46177 uSec
Log - write size 6622
Log - current primary space in use 1.71%
Log - workload primary space utilization 1.71%
TEST 3 (Conducted with VM QM)
Configurations :-
Max No. of Connections - 500/port (Total 9 ports used)
Timeout on TCPServerInput Node, TCPServerReceive Node and TCPServerOutput Node - 5s
VM CPU - 1 Core
Results of Testing :-
a) Total Records Hit by UPI Switch - 12000
b) Total Records Received by ACE - 6168
c) ./amqsrua -m EIPMQM -c DISK -t Log
Publication received PutDate:20240228 PutTime:09405414 Interval:1 minutes,4.785 seconds
Log - bytes in use 2013265920
Log - bytes max 3355443200
Log file system - bytes in use 2944016384
Log file system - bytes max 17169383424
Log - physical bytes written 26169344 403939/sec
Log - logical bytes written 4382523 67647/sec
Log - write latency 526 uSec
Log - write size 4612
Log - current primary space in use 3.07%
Log - workload primary space utilization 3.07%
Comparing the "Log - write latency" across the different environments, the variation is clear:
Log - write latency: 526 uSec (VM QM) < 46177 uSec (Single Resilient QM) < 53425 uSec (NativeHA QM)
Although the "Log - write latency" does not differ greatly between the Single Resilient QM and the NativeHA QM, the recorded values are notably high in both cases. This points to a potential issue with disk performance, the storage used, or the extra replication performed by the container storage layer.
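To watch this metric while a test is running, the same amqsrua subscription can simply be filtered for the latency line, for example:

./amqsrua -m EIPMQMHA -c DISK -t Log | grep -i "write latency"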
TEST 4 (Conducted with VM QM using mqldt tool)
The mqldt tool writes files using various block sizes and measures metrics such as total writes, bytes written, and write latency.
./mqldtTest.sh
Executing test for write blocksize 16384 (16k). Seconds elapsed -> 60/60
Total writes to files : 81717
Total bytes written to files : 1338851328
Max bytes/sec written to files (over 1 sec interval) : 23379968
Min bytes/sec written to files (over 1 sec interval) : 21217280
Avg bytes/sec written to files : 22321743
Max latency of write (ns) : 17236448 (#23599) (17236 uS)
Min bytes/sec (slowest write) : 950543
Min latency of write (ns) : 364030 (#68893)
Max bytes/sec (fastest write) : 45007280
Avg latency of write (ns) : 713826 (713.826 uS)
mqldt (VM):
Total bytes written to files : 1338851328 (1.3 GB)
Avg latency of write (ns) : 713826 (713.826 uSec)
Avg bytes/sec written to files : 22321743
Comparison:
a) Total Writes:
a.1) amqsrua: 26,169,344 bytes (physical) + 4,382,523 bytes (logical) ≈ 30,551,867 bytes
a.2) mqldt: 1,338,851,328 bytes
b) Latency:
b.1) amqsrua: 526 microseconds
b.2) mqldt: 713.826 microseconds
c) Write Throughput Rate:
c.1) amqsrua: 403,939 bytes/second (physical) + 67,647 bytes/second (logical) ≈ 471,586 bytes/second
c.2) mqldt: 22,321,743 bytes/second
While the two tests measure different aspects of disk performance, the comparison suggests that write throughput and latency on this disk are broadly consistent. The mqldt test wrote far more bytes in total, while amqsrua recorded a lower write latency. Both tests indicate good disk performance on the VM compared with the container storage tested earlier.
Please note that we have already conducted a comprehensive analysis by comparing the amqsrua output across the three types of queue manager (VM QM, Single Resilient QM, and NativeHA QM). The mqldt/mqldt-c run was carried out solely as additional verification, and it broadly confirms the amqsrua results.
Illustration Demonstrating How RHODF (with a Replication Factor of 3) Increases Latency for NativeHA MQ
MQ Native HA is designed for cloud-native RWO/block storage and performs its own replication. If ODF replication is layered underneath it, additional latency is introduced, because every write must also wait for ODF to complete its replication. As per the diagram above, the ODF replications, represented by the lines, add unnecessary lag.
Performance is diminished because the data is replicated eightfold:
- Systems such as MQ Native HA or API Connect, which have built-in replication, send two copies of the data to additional replicas for resilience.
- Once that data has been written to the local block storage volume of each instance, a replicating storage provider then copies it to two further locations per volume. With three Native HA instances and three ODF copies per volume, that is nine physical copies in total, i.e. the original write plus eight replicas.
- The system normally has to wait for all of these synchronous writes to complete before returning control to the original requester, which degrades performance during operation.
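Where ODF has to remain the storage provider, the extra replication can be avoided with a non-replicated (replica-1) pool and a dedicated storage class, which is essentially what the "non-resilient storage class" recommendation above amounts to. The names below (mq-nonreplicated-pool, mq-nonreplicated-rbd) are hypothetical, and the existing ODF RBD storage class is used as the template because its provisioner and secret parameters vary by installation:

oc apply -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: mq-nonreplicated-pool        # hypothetical pool name
  namespace: openshift-storage
spec:
  failureDomain: host
  replicated:
    size: 1                          # single copy; Native HA already keeps three
EOF

# Clone the default ODF RBD StorageClass, rename it, and point it at the new pool
oc get storageclass ocs-storagecluster-ceph-rbd -o yaml > mq-nonreplicated-rbd.yaml
#   edit: metadata.name -> mq-nonreplicated-rbd, parameters.pool -> mq-nonreplicated-pool,
#   and remove server-generated metadata (uid, resourceVersion, creationTimestamp)
oc apply -f mq-nonreplicated-rbd.yaml

Volumes created from such a class are not protected by the storage layer; that is acceptable here only because Native HA itself holds the data on three separate nodes.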