PowerVM

NVMe device support in Virtual IO Server (VIOS)

By NINAD PALSULE posted Sat July 25, 2020 03:14 PM

  

What is NVMe?

Non-Volatile Memory Express (NVMe) was developed as an industry specification for accessing non-volatile storage over the PCIe interface. NVMe is a ground-up specification intended to capitalize on the internal characteristics of flash storage: it minimizes latency, maximizes performance and efficiency, and simplifies device management.

NVMe, like SATA or USB, allows multiple vendors to develop products that comply with the specification and are all supported by the same host device driver, which removes software compatibility as an inhibitor to adoption.

Key areas of improvement in the NVMe specification:

  • increased queue depth
  • reduced register access per command
  • lightweight protocol requiring minimal path length
  • support for multiple MSI-X interrupts

Overview of NVMe support

NVMe devices do not support the SCSI architecture model, so a new device driver was added to the AIX operating system. VIOS added support for NVMe devices in version 2.2.6. As with other disk types, NVMe devices are presented as block storage devices, so no changes are required in upper-level components such as LVM and file systems (as shown in Figure 1 below). The initial NVMe solutions for Power are PCIe attached and local to the system.

 

NVMe device use-cases in the PowerVM environment

The common use cases for NVMe devices are shown in Figure 2 below.

 

Note: Each NVMe drive is a separate PCIe endpoint and can be assigned individually to a unique AIX, VIOS, or Linux logical partition (LPAR).

 

The above configuration (Figure 2) is similar to that of other locally attached storage devices. As with other SSD technology, NVMe has limited write endurance and may not be suitable for write-intensive workloads.
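As a rough illustration of the endurance concern above, the sketch below compares the write volume a workload generates with a drive's rated endurance, expressed in drive writes per day (DWPD, the number of times the full capacity can be written each day over the warranty period). The function names, the sample numbers, and the 20% safety margin are assumptions made for illustration, not IBM guidance.

```python
def required_dwpd(daily_writes_gb: float, capacity_gb: float) -> float:
    """Return how many full-drive writes per day the workload demands."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return daily_writes_gb / capacity_gb


def fits_endurance(daily_writes_gb: float, capacity_gb: float,
                   rated_dwpd: float, margin: float = 0.2) -> bool:
    """Check the workload against the rated DWPD, keeping a safety margin.

    margin=0.2 (an assumed value) means only 80% of the rating is budgeted.
    """
    budget = rated_dwpd * (1.0 - margin)
    return required_dwpd(daily_writes_gb, capacity_gb) <= budget


if __name__ == "__main__":
    # Hypothetical 800 GB drive rated at 1.0 DWPD.
    print(required_dwpd(400, 800))                     # 0.5
    print(fits_endurance(400, 800, rated_dwpd=1.0))    # True
    print(fits_endurance(1600, 800, rated_dwpd=1.0))   # False
```

The daily write volume would have to come from measurements of the actual workload; the rating comes from the feature description of the specific device.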

 

  • The user can assign one or more NVMe devices to the VIOS partition:
    • A device can be used as a VIOS boot device.
    • The user can configure devices as a local read cache in the Shared Storage Pool (SSP).
    • The user can carve out logical volumes (using the Logical Volume Manager (LVM)) and assign them to client partitions as LV-backed virtual SCSI (vSCSI) devices. The client partition can use them for any purpose, e.g. boot image, disk caching, etc.
      • Note: VIOS does not allow assigning an NVMe disk to a client as a physical volume (PV) backed virtual SCSI (vSCSI) device.
  • The user can assign the NVMe device directly to a client LPAR, as shown in Figure 2 (left side). The client partition can use it for any purpose, e.g. boot image, disk caching, etc.

Best practices

  • NVMe is high-speed flash storage that comes in various levels of write endurance. Check the write endurance rating of your NVMe device to ensure it is suitable for the intended workload. Consult the IBM feature description of your specific NVMe device to determine its drive writes per day (DWPD) rating.
  • As with any storage technology, NVMe may be subject to failure. It is advisable to mirror the VIOS boot device. In some POWER9 systems, expansion cards are used to hold M.2 form factor NVMe devices. It is advisable to mirror across expansion cards to protect against expansion card failure, as shown in Figure 3 below.

 

 

  • Transferring the VIOS image to an NVMe disk: The user can use LVM mirroring to migrate an existing boot image to an NVMe disk by adding an NVMe mirror copy to rootvg and removing the old copy after the sync is done.
  • LVM allows commingling of flash storage with rotating magnetic storage in a volume group. This type of configuration may not provide optimal performance.
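The mirror-then-remove migration described in the list above can be sketched as an ordered command plan. The helper below only builds the command strings; it does not run anything. It assumes the standard AIX LVM commands (extendvg, mirrorvg, bosboot, bootlist, lsvg, unmirrorvg, reducevg) issued from the root shell, and the disk names hdisk0 and hdisk1 are hypothetical. Treat it as an outline to check against the AIX and VIOS documentation for your level, not as a verified procedure.

```python
from typing import List


def rootvg_migration_plan(old_disk: str, new_disk: str) -> List[str]:
    """Build (but do not execute) an AIX command sequence that moves rootvg
    from old_disk to new_disk by mirroring and then dropping the old copy."""
    if old_disk == new_disk:
        raise ValueError("old_disk and new_disk must differ")
    return [
        f"extendvg rootvg {new_disk}",                # add the NVMe disk to rootvg
        f"mirrorvg rootvg {new_disk}",                # create and sync the mirror copy
        f"bosboot -ad /dev/{new_disk}",               # write a boot image to the new disk
        f"bootlist -m normal {new_disk} {old_disk}",  # boot from the new disk first
        "lsvg rootvg",                                # confirm there are no stale partitions
        f"unmirrorvg rootvg {old_disk}",              # drop the copy on the old disk
        f"reducevg rootvg {old_disk}",                # remove the old disk from rootvg
        f"bootlist -m normal {new_disk}",             # leave only the new disk in the list
    ]


if __name__ == "__main__":
    for step in rootvg_migration_plan("hdisk0", "hdisk1"):
        print(step)
```

Keeping the old copy in place until the mirror is fully synced and a boot from the new disk has been confirmed leaves a fallback if the NVMe copy turns out to be unusable.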

Key points to note

  • VIOS does not support NVMe devices in the following cases:
    1. As with any other locally attached device, client LPARs that use virtual SCSI (vSCSI) devices backed by locally attached NVMe cannot take part in Live Partition Mobility (LPM) operations.
    2. NVMe disks cannot be used in the Shared Storage Pool (SSP).
    3. VIOS does not allow mapping NVMe devices as physical volume (PV) backed devices in a virtual SCSI (vSCSI) configuration.
    4. NVMe devices are not supported as an Active Memory Sharing (AMS) device.
    5. In the initial releases, the default values of the NVMe adapter attributes are the same as the AIX defaults, so no VIOS rules support is provided for them. Attribute support for the NVMe adapter will be added based on feedback from the field.

NVMe device examples

This section provides examples of how to identify NVMe devices and their attributes. The same commands are used to display information for any other device, but the output shows some NVMe-specific differences. NOTE: For more details, refer to the AIX/PowerVM documentation on NVMe [https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.ktechrf2/nvme.htm].

NVMe adapter information

  • An NVMe adapter is a PCIe device and is listed as a child of its parent PCIe device.


 

  • Number of channels (nchan): Each channel is an independent kernel thread with dedicated facilities to process I/O.
  • Maximum size of DMA transfer (max_dma_window): This can be tuned if the size and number of I/Os issued at a time are known or predictable.
  • The user can use the “-vpd” option to find the physical location of the device, as shown below.

 

NVMe disk information

  • The user can use the “-child” option to list the disks that belong to a specific NVMe adapter.

  • The user can use the “-attr” option to find more information about an NVMe disk, as with SCSI disks.

  • Platform-specific information can be displayed using the “-vpd” option.
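The three options mentioned in this section can be collected into a small helper that builds the corresponding command lines. This is a sketch that assumes the options belong to the VIOS `lsdev -dev <device>` command run from the padmin shell; the device names nvme0 and hdisk0 are hypothetical and no output is shown, because it depends on the system.

```python
# Maps a short view name to the lsdev option discussed in this section.
LSDEV_VIEWS = {
    "child": "-child",  # devices below an adapter, e.g. the disks of nvme0
    "attr": "-attr",    # attributes of an adapter or a disk
    "vpd": "-vpd",      # vital product data, including the physical location
}


def lsdev_command(device: str, view: str) -> str:
    """Build (but do not run) a VIOS lsdev command line for one device."""
    if view not in LSDEV_VIEWS:
        raise ValueError(f"unknown view {view!r}; expected one of {sorted(LSDEV_VIEWS)}")
    return f"lsdev -dev {device} {LSDEV_VIEWS[view]}"


if __name__ == "__main__":
    print(lsdev_command("nvme0", "child"))  # lsdev -dev nvme0 -child
    print(lsdev_command("hdisk0", "attr"))  # lsdev -dev hdisk0 -attr
    print(lsdev_command("nvme0", "vpd"))    # lsdev -dev nvme0 -vpd
```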

 

NVMe diagnostic information

VIOS does not provide any RBAC role for the diag command, so the user has to log in to the root shell to run diag. The user can use diag to configure NVMe devices and check their health. The following figures show an example of a health check for an NVMe device.

 

  • Run the “diag” command -> Press Enter -> Select “Task Selection” -> Select “NVMe general health information” -> Select the specific NVMe adapter and press Enter.

  • The health information shows the life used, read/write statistics, and errors.

Additional Information

Summary

The PowerVM environment supports NVMe use cases such as VIOS boot device, backing device in the VIOS for exporting vSCSI storage to client LPARs, and server-side flash caching of storage data. NVMe and its device drivers are designed for flash storage, which results in more efficient use of system resources. Additionally, NVMe may provide a more cost-effective solution than SAS, since NVMe does not require PCIe controllers separate from the storage devices and the NVMe backplane is already included in the base price of some POWER9 servers.

Contacting the PowerVM Team

Have questions for the PowerVM team or want to learn more? Follow our discussion group on LinkedIn (IBM PowerVM) or in IBM Community Discussions.
