Performance improvement in block IO throughput due to Bulk DMA unmap feature in AIX 7.3 TL3

By KOKIL DEURI posted Fri December 13, 2024 03:02 AM


Prerequisites:

OS version: AIX 7.3 TL3 or above

System Firmware version: FW1060 or above

Introduction:

While doing read/write operations to block IO devices over the PCI bus, AIX uses DMA (Direct Memory Access). As per the AIX design, the buffers used in an IO operation are not pre-DMA-mapped; instead, they are DMA mapped and unmapped on the fly. In a typical IO flow, once the device driver initiates an IO request, the relevant buffer is DMA mapped using a hypervisor call, the actual transfer is performed by the device's DMA controller, and then the buffer's DMA mapping is removed using another hypervisor call. This is done for every IO request: if the IOPS is X, the mapping/unmapping happens X times per second. Because the relevant hypervisor calls are costly operations, this can become a performance bottleneck that limits IOPS.
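
To make the classic flow concrete, here is a minimal C sketch of the per-request pattern described above. It is purely illustrative and is not actual AIX kernel code: h_dma_map() and h_dma_unmap() are hypothetical stand-ins for the hypervisor calls made on behalf of the device driver.

```c
/*
 * Illustrative sketch only (not AIX kernel code): models the classic
 * per-request DMA map/unmap pattern. h_dma_map()/h_dma_unmap() are
 * hypothetical stand-ins for the hypervisor calls.
 */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    void     *buffer;      /* IO buffer in partition memory       */
    uint64_t  dma_addr;    /* bus address the device DMAs to/from */
} dma_mapping_t;

/* Hypothetical hypervisor call: create a DMA mapping for one buffer. */
static dma_mapping_t h_dma_map(void *buffer)
{
    dma_mapping_t m = { buffer, (uint64_t)(uintptr_t)buffer };
    return m;   /* a real call would program the IO address translation */
}

/* Hypothetical hypervisor call: tear the mapping down again. */
static void h_dma_unmap(dma_mapping_t *m)
{
    m->dma_addr = 0;
}

/* Stand-in for the device performing the transfer with its DMA engine. */
static void device_do_transfer(dma_mapping_t *m) { (void)m; }

/* Classic flow: two hypervisor calls for every single IO request. */
static void issue_io(void *buffer)
{
    dma_mapping_t m = h_dma_map(buffer);   /* hypervisor call #1 */
    device_do_transfer(&m);                /* DMA by the adapter */
    h_dma_unmap(&m);                       /* hypervisor call #2 */
}

int main(void)
{
    char buf[4096];
    for (int i = 0; i < 8; i++)   /* X IOPS => 2*X hypervisor calls/sec */
        issue_io(buf);
    printf("8 IOs issued, 16 hypervisor calls in the classic flow\n");
    return 0;
}
```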

Solution:

Starting with AIX 7.3 TL3, this design has been enhanced so that the DMA unmapping hypervisor call at the end of every IO completion can be avoided. This is possible because, at any instant, if sufficient resources are available to create new DMA mappings for incoming requests, there is no need to perform the costly unmapping operation after every IO. Instead, the mapping information is added to a "pending list", and once enough entries have accumulated, they are unmapped in bulk using a new hypervisor call. This new hypervisor call (available on systems with firmware FW1060 and above) allows the OS to pass a list of disjoint DMA-mapped areas to be unmapped in one shot.
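
The deferred approach can be pictured with the following sketch. Again, this is only illustrative: the batch size of 256 and the h_dma_bulk_unmap() call are assumptions standing in for the new FW1060 hypervisor interface, whose real name and parameters are not described in this article.

```c
/*
 * Illustrative sketch only: models the "pending list" approach where
 * unmaps are deferred and then done in bulk. h_dma_bulk_unmap() is a
 * hypothetical stand-in for the new FW1060 hypervisor call.
 */
#include <stdio.h>
#include <stdint.h>

#define BULK_BATCH 256   /* assumed batch size, purely for illustration */

typedef struct {
    uint64_t dma_addr;
} dma_mapping_t;

static dma_mapping_t pending[BULK_BATCH];
static int           npending;

/* Hypothetical bulk hypervisor call: one call unmaps the whole list
 * of disjoint DMA-mapped areas. */
static void h_dma_bulk_unmap(dma_mapping_t *list, int count)
{
    for (int i = 0; i < count; i++)
        list[i].dma_addr = 0;
}

/* On IO completion, defer the unmap instead of calling the hypervisor. */
static void io_done(dma_mapping_t m)
{
    pending[npending++] = m;               /* add to the pending list   */
    if (npending == BULK_BATCH) {          /* enough have accumulated...*/
        h_dma_bulk_unmap(pending, npending); /* ...unmap them in bulk   */
        npending = 0;
    }
}

int main(void)
{
    for (uint64_t i = 0; i < 1024; i++) {
        dma_mapping_t m = { .dma_addr = 0x1000 * (i + 1) };
        io_done(m);   /* 1024 completions -> only 4 bulk unmap calls */
    }
    printf("1024 IO completions, 4 bulk unmap hypervisor calls\n");
    return 0;
}
```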

This enhancement provides a significant boost to IO throughput because it sharply reduces the frequency of DMA unmapping hypervisor calls, which are a major cost in the DMA transfer path.

Performance Test Results:

In our in-house performance testing, depending on the workload, we have observed more than 60% IOPS gain for block IO on supported devices. 

Sample results:

[Figure: Sample performance data]
Apart from the 4K block size, we also see performance gains at larger block sizes (for example, 8K and 16K), but the gain may taper off as the block size increases.

Supported Devices:

This feature is supported only on physical devices where the partition owns the physical adapter slot. The following two categories of devices are supported:

  1. All disks with a path through Fibre Channel (FC) adapters with speeds of 16 Gbps or higher. For a list of supported FC adapters, see Section 2.6.2 of the IBM Redbook "IBM Power E1080 Technical Overview and Introduction" (https://www.redbooks.ibm.com/redpapers/pdfs/redp5649.pdf).
  2. NVMe devices. For a complete list, refer to Table 2-13 of the same Redbook.

Note: This feature is enabled by default when all prerequisites are met. If, for any reason, you want to disable this feature, please contact IBM support.
