
IO Adapter Enlarged Capacity Improvements in PowerVM 2.2.5

By Tom Sand posted Mon June 22, 2020 12:02 PM


All Power servers from Power7 onward can improve I/O adapter performance by reserving additional memory for high-speed adapters. This additional memory is used to enlarge the I/O Address Translation Tables (ATT) in the Hypervisor, which in turn increases the maximum DMA window a device driver can map for I/O at any one time. See the Background on I/O Address Translation section later in this document to better understand I/O ATTs. The correct setting for I/O Adapter Enlarged Capacity on your server depends on the operating system, the device driver capabilities, the adapter, and the slot used for the adapter. The following sections explain how to configure your system for I/O Adapter Enlarged Capacity and describe the changes in PowerVM 2.2.5 that reduce the amount of memory reserved for these high-speed adapters.

AIX and IBM i
The AIX and IBM i operating systems do not require or use the very large DMA windows that consume the memory set aside for I/O Adapter Enlarged Capacity. If your server is not going to run Linux, you can significantly reduce the amount of memory set aside by the firmware by disabling the I/O Adapter Enlarged Capacity feature. To disable this feature, first power off the server, because changes are only allowed when the server is powered off. Next, from the HMC, in the Operations section, select the Launch Advanced System Management (ASM) option. After you sign on to the ASM screen, select the System Configuration option and then select the I/O Adapter Enlarged Capacity option. A panel similar to the following will be presented:

[Image: ASMI panel for I/O Adapter Enlarged Capacity]
To disable the I/O Adapter Enlarged Capacity feature, uncheck the Enable I/O Adapter Enlarged Capacity option and press the Save settings button. When the system is powered on, the firmware will set aside no additional space for these adapters.

Linux
For Linux on Power, the device drivers for certain I/O adapters are optimized for very large DMA windows, which use the additional space that is set aside for I/O adapters. There are a couple of different strategies for allocating this memory.
One strategy, if you are not concerned with the additional memory requirement, is to enable the I/O Adapter Enlarged Capacity feature for every slot in the server. To follow this strategy, choose the largest possible value for the Enlarged IO Capacity Slot Count for Node for each node. The firmware will then set aside memory for every slot in the server, so every adapter will have additional memory regardless of the specific I/O slot hosting it. The Linux device drivers will use this additional memory to improve the performance of I/O adapters that support very large DMA windows. Depending on the number of nodes and the amount of memory physically installed in the server, reserving memory for all I/O slots can significantly increase the amount of memory reserved by the firmware. (See PowerVM 2.2.5 Update below for improvements in FW860.)
For many customers, it is better to tailor these settings to balance flexibility against additional memory consumption. To tailor the configuration, first identify the number of adapters that will benefit from the additional reserved memory. Refer to the table of supported adapters and assignment order, which lists the adapters that support 64-bit DMA and very large DMA windows. Be sure to allocate additional space for any adapter that is listed as required or highly recommended. The next step is to ensure that the physical adapters are plugged into slots where the reserved memory is allocated.
The following table, which is derived from the table of supported adapters and assignment order, shows the mapping of the slots in an I/O drawer and the enablement of IO Adapter Enlarged Capacity:
[Image: table mapping I/O drawer slots to I/O Adapter Enlarged Capacity enablement]
Because these values can only be changed when the server is powered off, you may want to reserve an extra slot or two for hot plugging additional adapters in the future. For example, if three adapters currently need the additional memory and you reserve one more slot for future growth, specify 4 for the number of I/O slots on the I/O Adapter Enlarged Capacity panel and place the adapters in slots C1, C3 and C5. Slot C6 is then left reserved for hot plugging a future adapter that supports 64-bit DMA and very large DMA windows.
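
The planning step in this example can be sketched in a few lines of Python. This is only an illustration: the assignment order below is an assumption taken from the single example in this article (C1, C3, C5, then C6), the helper names are hypothetical, and the sketch assumes that a slot count of N enables the first N slots in the assignment order. Consult the adapter table for the real order on your drawer.

```python
# Hypothetical assignment order, inferred only from the example in the text.
# The real order for your I/O drawer comes from the supported-adapter table.
ASSIGNMENT_ORDER = ["C1", "C3", "C5", "C6"]


def enabled_slots(slot_count, order=ASSIGNMENT_ORDER):
    """Slots assumed to receive reserved memory for a given slot count."""
    if not 0 <= slot_count <= len(order):
        raise ValueError("slot count is outside the range covered by the order list")
    return list(order[:slot_count])


def misplaced_adapters(adapter_slots, slot_count, order=ASSIGNMENT_ORDER):
    """Adapters sitting in slots that would not get reserved memory."""
    enabled = set(enabled_slots(slot_count, order))
    return [slot for slot in adapter_slots if slot not in enabled]


adapters = ["C1", "C3", "C5"]    # three adapters that need the extra memory
slot_count = len(adapters) + 1   # plus one spare slot for future hot plugging

enabled = enabled_slots(slot_count)
spare = [slot for slot in enabled if slot not in adapters]

print("value to enter on the panel:", slot_count)
print("enabled slots:", enabled)
print("spare for hot plug:", spare)
print("misplaced adapters:", misplaced_adapters(adapters, slot_count))
```

A check like `misplaced_adapters` is useful before powering the server back on, since an adapter in a slot outside the enabled set would not get the larger DMA window until the next power-off.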

PowerVM 2.2.5 Update
Prior to PowerVM 2.2.5, the PowerVM hypervisor set aside memory based on a 4K I/O page size. Starting with PowerVM 2.2.5, the PowerVM hypervisor sets aside memory based on a 64K I/O page size. The overall effect of this change is that the amount of memory set aside for I/O Adapter Enlarged Capacity is reduced to 1/16 of its previous size. For servers with a large amount of installed memory, this is a significant reduction in the amount of memory reserved by the firmware. This benefit happens automatically when you install FW860 on your Power8 server. Existing versions of Linux that are capable of running on Power8 will automatically adjust to use the 64K I/O page size.
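
The 1/16 figure follows directly from the page sizes: a table with one entry per I/O page needs 16 times fewer entries when each page is 64K instead of 4K. The sketch below works through that arithmetic. The 8-byte entry size and the 256 GiB window are assumptions chosen for illustration, not documented PowerVM values; the ratio does not depend on either.

```python
# Illustrative arithmetic only. ENTRY_BYTES is an assumed table-entry size,
# not a documented PowerVM value; the 16x ratio holds for any entry size.
ENTRY_BYTES = 8
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def att_bytes(window_bytes, io_page_bytes, entry_bytes=ENTRY_BYTES):
    """Table space needed to map a DMA window, one entry per I/O page."""
    entries = -(-window_bytes // io_page_bytes)  # ceiling division
    return entries * entry_bytes


window = 256 * GIB  # hypothetical DMA window size

old = att_bytes(window, 4 * KIB)    # 4K I/O pages (before PowerVM 2.2.5)
new = att_bytes(window, 64 * KIB)   # 64K I/O pages (PowerVM 2.2.5 and later)

print(old // MIB, "MiB with 4K pages")    # 512 MiB with 4K pages
print(new // MIB, "MiB with 64K pages")   # 32 MiB with 64K pages
print("ratio:", old // new)               # ratio: 16
```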

Background on I/O Address Translation
[Image: I/O address translation diagram]

The diagram above is provided to help explain I/O Address Translation Tables (ATT), DMA windows, platform memory addresses and DMA addresses.  The Hypervisor must allocate space at platform IPL time for an ATT for each I/O Adapter present or targeted to be plugged in the future.  The size of this ATT directly corresponds to the maximum DMA window the I/O device driver in the partition may allocate. The larger the ATT, the more pages that can be mapped, and hence the larger the DMA window.  If a DMA window is large enough, it is possible for the I/O device driver to map every page in partition memory at partition IPL time.  This removes the latency of mapping and remapping individual pages as they are used during normal OS operations.

Here is the general flow of how the I/O ATT works. When the I/O device driver wishes to give the I/O adapter an address to DMA to or from, it must call the Hypervisor to have the Hypervisor update an ATT entry (ATTE) with the real page number (RPN) of the main storage address. From this Hypervisor call, the device driver receives a 64-bit DMA address to pass to the I/O adapter for the DMA. At a later point in time, the I/O adapter initiates a DMA by placing the 64-bit DMA address on the PCIe link. The PCI Host Bridge (PHB) picks up this 64-bit DMA address and splits it into the ATTE index and the page offset. The PHB uses the ATTE index to fetch the RPN from the ATT. This RPN is merged with the page offset to form the 64-bit platform memory address, which is then used to perform the DMA operation.
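
The flow above can be modeled with a toy table in Python. This is a simplified sketch, not the Hypervisor or PHB implementation: the class and method names are hypothetical, the table is a plain list, the DMA window is assumed to start at DMA address 0, and the 64K page size matches PowerVM 2.2.5 (a shift of 12 would model 4K pages).

```python
PAGE_SHIFT = 16                     # 64K I/O pages; use 12 to model 4K pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1   # low bits that hold the page offset


class AddressTranslationTable:
    """Toy ATT: a list of real page numbers (RPNs) indexed by ATTE index."""

    def __init__(self, entries):
        self.rpn = [None] * entries

    def map_page(self, index, platform_addr):
        """Stand-in for the Hypervisor call: store the RPN, return the DMA address."""
        self.rpn[index] = platform_addr >> PAGE_SHIFT
        return (index << PAGE_SHIFT) | (platform_addr & PAGE_MASK)

    def translate(self, dma_addr):
        """Stand-in for the PHB: split the DMA address, fetch the RPN, merge."""
        index = dma_addr >> PAGE_SHIFT
        offset = dma_addr & PAGE_MASK
        rpn = self.rpn[index]
        if rpn is None:
            raise LookupError(f"ATTE {index} is not mapped")
        return (rpn << PAGE_SHIFT) | offset


att = AddressTranslationTable(entries=1024)
buffer_addr = 0x12_3456_7890        # hypothetical platform memory address

# The device driver asks for a mapping and hands the result to the adapter.
dma_addr = att.map_page(index=5, platform_addr=buffer_addr)
print(hex(dma_addr))                # 0x57890

# The adapter later issues the DMA; the bridge resolves it back to memory.
print(hex(att.translate(dma_addr)))  # 0x1234567890
```

The number of entries in the list plays the role of the ATT size: with more entries, more pages can be mapped at once, which is the larger DMA window that I/O Adapter Enlarged Capacity provides.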

The I/O Adapter Enlarged Capacity setting is not for everyone. AIX and IBM i customers can save memory by disabling it. Linux customers using high-performance adapters can use I/O Adapter Enlarged Capacity to maximize I/O performance. The changes in PowerVM 2.2.5 allow customers to enable this performance improvement while using less memory for the Hypervisor than previous releases. With the information provided in this article, you should now be able to review your system configuration and correctly adjust the I/O Adapter Enlarged Capacity setting.

Contacting the PowerVM Team
Have questions for the PowerVM team or want to learn more? Follow our IBM PowerVM discussion group on LinkedIn or join the IBM Community Discussions.