MQ for z/OS on z15 - What performance benefits might you see?

By Anthony Sharkey posted Thu January 09, 2020 10:18 AM

IBM announced the new IBM z15 mainframe server in 2019. What follows is an overview of the performance expectations we had for MQ for z/OS, plus the actual results of moving our MQ performance sysplex from z14 to z15.

MQ performance reports are generally posted on the mqperf github repository and the MQ for z/OS on z15 performance report is available here.

The z15 offers many improvements over z14, but of particular interest to us were:
  1. Improved processor performance.
  2. Increased number of processors, both in total for the machine and on each CPC drawer.
  3. Improvements from the on-chip compression accelerator as a replacement for the zEnterprise Data Compression (zEDC) Express feature.

When setting expectations for performance improvements on z/OS, the first place to look is typically the IBM Large System Performance Reference (LSPR) website. Consideration should be given to the complexity of the workloads being run. The measurements that we run vary significantly, from very simple to complex, so there wasn't a single percentage value that we could apply to all of our workloads. Despite this, we used a rule of thumb of an 11 to 16% reduction in transaction cost, accepting that some measurements could improve by more and some by less.
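As a rough illustration of how that rule of thumb can be applied, the sketch below turns a measured z14 transaction cost into an expected z15 range. The function name and the 100-microsecond input are hypothetical; only the 11 to 16% range comes from the text above.

```python
def expected_z15_cost_range(z14_cost, low_pct=11.0, high_pct=16.0):
    """Return (best_case, worst_case) expected z15 cost for a z14 cost.

    low_pct/high_pct are the rule-of-thumb reductions quoted above.
    The cost unit is whatever z14_cost was measured in.
    """
    best_case = z14_cost * (1 - high_pct / 100)   # largest reduction
    worst_case = z14_cost * (1 - low_pct / 100)   # smallest reduction
    return best_case, worst_case


# Hypothetical example: a transaction costing 100 CPU microseconds on z14.
best, worst = expected_z15_cost_range(100.0)
print(f"Expected z15 cost: {best:.1f} to {worst:.1f} CPU microseconds")
```

A measurement landing outside this range is not necessarily a problem; as noted above, the range is a guide rather than a guarantee.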

Improved processor performance and increased number of processors on each CPC drawer

The z15 can hold up to five CPC drawers, where each CPC drawer may contain five single chip modules (SCMs).

Each SCM on z15 is designed with 12 cores, up from 10 cores on z14.

In our measurements we typically found no more than 10 cores allocated as general purpose processors from each SCM - which is still an increase on the 8 that we typically saw on the z14.

Generally, MQ for z/OS performed in line with the expectations we had set from LSPR. There were some exceptions, but these were typically measurements that were not limited solely by CPU. For example, measurements limited by disk I/O response time performed at the lower end of the expected range.

Scalability of our workload showed a significant improvement when running on LPARs with higher numbers of processors. The following chart shows the transaction rate for a workload using non-persistent out-of-syncpoint messaging.

In this chart we see that, for both z14 and z15, peak throughput is attained with 20 processors. The z15 achieves 19% higher peak throughput than the z14, with 761,000 transactions/second, or 1.52 million messages per second.
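The figures in that paragraph can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check: it assumes the 19% is measured relative to the z14 peak, and the two-messages-per-transaction figure is inferred from the ratio of the quoted numbers rather than stated in the text.

```python
# Figures quoted above.
z15_peak_tps = 761_000      # z15 peak transactions per second
uplift = 0.19               # z15 improvement over z14 peak

# Assumption: inferred from 1.52 million messages / 761,000 transactions.
messages_per_transaction = 2

# Implied z14 peak, assuming the 19% is relative to the z14 figure.
z14_peak_tps = z15_peak_tps / (1 + uplift)

z15_messages_per_second = z15_peak_tps * messages_per_transaction

print(f"Implied z14 peak: {z14_peak_tps:,.0f} transactions/second")
print(f"z15 message rate: {z15_messages_per_second:,} messages/second")
```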

[Chart: Scalability: Transaction Rate - Non-persistent out-of-syncpoint]

Improvements from the on-chip compression accelerator

Starting with z13, IBM introduced the zEnterprise Data Compression (zEDC) Express PCIe feature, which helps with software costs for compression/decompression operations (by offloading these operations) as well as providing efficiency to data encryption (compression before encryption).

With z15, the zEDC Express functionality has been moved from the PCIe infrastructure into the processor nest. This relocation significantly improves the performance of compression and decompression.

This on-chip compression is implemented in 2 modes:
  1. Synchronous execution for problem state
  2. Asynchronous optimization for large operations under z/OS.

MQ supports the use of compression, whether using hardware via on-chip (z15), zEDC (z13 onwards) or software, for channel compression with the setting of the channel attribute COMPMSG(ZLIBFAST).

MQ channel compression is performed using synchronous execution. A significant factor in the improved performance is reduced latency, because the z15 avoids the switch to and from the PCIe-based zEDC feature that is required on z14.

Additionally the threshold for performing compression in hardware on z15 has been lowered from 4KB to 1KB.

With highly compressible messages, we have seen a decrease in transaction cost of up to 42% when compared with the equivalent workload on z14, with an increase to the transaction rate of up to 90% on a low-latency network.

The following chart compares the transaction cost on z14 and z15 as the message payload increases in compressibility.

The chart additionally shows the equivalent transaction cost when COMPMSG(NONE) is specified, i.e. no compression is attempted.
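How the test payloads were built is not described here. As a minimal sketch of what "N% compressible" can mean, the code below constructs a payload from a run of zero bytes plus pseudo-random bytes and measures how much zlib shrinks it. The function names are hypothetical, and using zlib level 1 as a stand-in for ZLIBFAST is an assumption, not a statement of how MQ implements the option.

```python
import random
import zlib


def make_payload(size, compressible_pct, seed=0):
    """Build `size` bytes: a zero-filled (highly compressible) part
    followed by pseudo-random (effectively incompressible) bytes."""
    rng = random.Random(seed)
    n_fill = size * compressible_pct // 100
    noise = bytes(rng.getrandbits(8) for _ in range(size - n_fill))
    return bytes(n_fill) + noise


def measured_compressibility(payload):
    """Fraction by which zlib (fastest level) shrinks the payload.
    Can be slightly negative for incompressible data."""
    compressed = zlib.compress(payload, 1)
    return 1 - len(compressed) / len(payload)


for pct in (0, 20, 40, 60, 80):
    payload = make_payload(32 * 1024, pct)
    print(f"target {pct:3d}%  measured {measured_compressibility(payload):6.1%}")
```

The measured value tracks the target only approximately, because the compressed stream carries a small header and the zero-filled region does not shrink to exactly nothing.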

[Chart: ZLIBFAST channel compression - transaction cost]

Notes on transaction cost:
  1. z15 costs are up to 43% lower than the equivalent z14 measurement, with more compressible messages demonstrating the largest difference.
  2. z15 costs for “incompressible” messages are higher than the equivalent z14 measurement. This is due to the z14 measurement being unable to compress the payload, whereas on z15, the message payload was 3% compressible and therefore needed to be inflated by the receiving channel initiator.
  3. The significant increase in z14 cost between 60 and 80% compressible is due to the compressed message being too small to inflate in hardware, so it has to be inflated in software instead. The lower threshold on z15 means that all of the payload is compressed and inflated in hardware.
  4. The measurements with no compression always demonstrated lower transaction cost than compressing the message payload. However, compressing highly compressible messages on z15 showed parity with uncompressed messages on z14.
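Note 3 can be illustrated with a small model of the hardware thresholds mentioned earlier (4KB on z14, 1KB on z15). This is a sketch under stated assumptions: the function names are hypothetical, the 10KB message size is chosen purely for illustration, and treating the threshold as inclusive is a guess rather than documented behaviour.

```python
KB = 1024

# Minimum size for hardware compression, as quoted in the text.
HW_THRESHOLD = {"z14": 4 * KB, "z15": 1 * KB}


def path(size_bytes, machine):
    """Assumption: data at or above the threshold is handled in hardware."""
    return "hardware" if size_bytes >= HW_THRESHOLD[machine] else "software"


def channel_paths(message_bytes, compressible_pct, machine):
    """Where the sender deflates and the receiver inflates.

    The receiver sees the compressed size, so a highly compressible
    message can drop below the threshold on the inflate side only.
    """
    compressed = message_bytes * (100 - compressible_pct) // 100
    return {
        "deflate": path(message_bytes, machine),
        "inflate": path(compressed, machine),
    }


# Hypothetical 10KB message at increasing compressibility.
for pct in (20, 40, 60, 80):
    for machine in ("z14", "z15"):
        print(pct, machine, channel_paths(10 * KB, pct, machine))
```

With these illustrative numbers, the z14 inflate step moves to software between 60% and 80% compressible, while on z15 both steps stay in hardware.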

The performance report "MQ for z/OS on z15" contains detail on these and other performance benefits that we have observed on the IBM z15.