Storage Space Unmap Support in AIX

By NINAD PALSULE posted Wed September 23, 2020 10:58 AM

  

Overview:

Thin provisioning is a technology that storage products use to optimize resource utilization. Under a thin-provisioned scheme, a disk is created with the specified size, but none or only part of its storage space is pre-allocated. The underlying blocks are allocated only when the host issues a write to the disk. This improves the utilization of the storage subsystem when space is pooled across multiple disks. To derive the full value of thin provisioning, the host must release disk blocks that are no longer in use.
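The allocate-on-write and release behaviour described above can be illustrated with a toy model. This is only a sketch: the ThinVolume class and its methods are hypothetical names invented for illustration, and it does not represent how any particular storage product or AIX itself is implemented.

```python
class ThinVolume:
    """Toy model of a thin-provisioned disk (illustrative only)."""

    def __init__(self, virtual_blocks):
        self.virtual_blocks = virtual_blocks  # size advertised to the host
        self.backing = {}                     # block number -> data, allocated lazily

    def write(self, block, data):
        if not 0 <= block < self.virtual_blocks:
            raise IndexError("block outside the advertised disk size")
        self.backing[block] = data            # the first write allocates the block

    def unmap(self, first, count):
        """The host reports that blocks [first, first + count) are free."""
        for block in range(first, first + count):
            self.backing.pop(block, None)     # return the block to the shared pool

    def allocated(self):
        return len(self.backing)


vol = ThinVolume(virtual_blocks=1_000_000)
for b in range(100):
    vol.write(b, b"data")
print(vol.allocated())  # 100 blocks consumed after the writes

vol.unmap(0, 60)        # e.g. a logical volume covering blocks 0..59 is removed
print(vol.allocated())  # 40 blocks remain allocated
```

Without the unmap step, the 60 freed blocks would stay allocated in the storage pool even though the host no longer uses them.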

The following figure shows the typical AIX input/output (I/O) stack. Traditionally, the stack could not return freed blocks to the storage subsystem because the disk device driver (DD) did not support returning this space.

 

As shown in figure 2, information about free blocks did not flow beyond the filesystem layer and the information about free partitions did not flow beyond the logical volume manager (LVM) layer.

Starting with AIX 7.2 Technology Level 1, AIX returns freed blocks to the storage subsystem using the SCSI standard WRITE_SAME operation. Blocks are returned when the user performs one of the following operations:

  • Removal of logical volume.
  • Removal of logical volume mirror copies.
  • Removal of JFS2 filesystem which initiates removal of logical volume.
  • Reduction of JFS2 filesystem size through shrink operation which reduces the size of the logical volume.

As shown in figure 3, upon request from LVM, the disk device driver informs the storage subsystem that blocks have been freed. Note that shrinking the filesystem is the only way to unmap free block space held by the filesystem.

 



Functional Overview:

The AIX storage space unmap functionality is implemented in the LVM and the disk device driver. An important goal was to make the operation automatic, without adding any new command or command option. The unmap operation is performed asynchronously and does not block the initiating command.

 

  • Logical Volume Manager (LVM):

LVM determines whether the device is thinly provisioned by using the ioctl(IOCINFO) command provided by the disk device driver. If the device is thinly provisioned, LVM informs the disk device driver whenever physical partitions on the device are freed. Physical partitions are freed by the LVM and filesystem commands listed below; as part of command execution, LVM now takes an extra step to inform the disk device driver about the freed space.

  • The following LVM commands initiate storage space reclamation:
    • rmlv command: Removes a logical volume from the volume group. All the physical partitions allocated to the logical volume are freed.
    • rmlvcopy command: Removes a mirror copy of a logical volume. All the physical partitions allocated to the mirror copy are freed.

 

  • The following filesystem commands initiate storage space reclamation with the help of LVM:
    • rmfs command: Removes the filesystem and its logical volume. All the physical partitions allocated to the logical volume are freed.
    • chfs (shrink fs): The chfs command extends or shrinks a filesystem. For a JFS2 filesystem shrink operation, once the space is reduced, the filesystem asks LVM to reduce the logical volume size, which frees the associated physical partitions.

 

Space reclamation is supported on all types of volume groups: scalable volume groups (SVG), big volume groups, small volume groups, concurrent volume groups, and the root volume group.

 

The lvmstat command is enhanced to provide space reclamation information for the physical volumes in a volume group. The new "-r" option shows information about space reclamation. The "-r -L" options together provide more details about failures. Some of the important fields are as follows:

 

 

Reclaim:

A reclaim state of "on" indicates that the storage and the AIX disk driver support reclaiming space on this device.
 

Mb_freed:

Amount of physical partition space, in megabytes, freed from logical volumes by commands such as rmlv, rmfs, rmlvcopy, and chfs.

 

Mb_pending:

Amount of physical volume space, in megabytes, for which space reclamation is pending.

 

Mb_success:

Amount of space, in megabytes, for which space reclamation requests succeeded at the disk driver.

 

Mb_failed:

Amount of space, in megabytes, for which space reclamation requests were failed by the disk driver.

 

Mb_reused:

Amount of free physical partition space, in megabytes, that was reused for a logical volume without space reclamation being requested.

 

 

Examples:

  • Find the thin-provisioned disks in the volume group using the lvmstat command. A reclaim state of "on" means the storage can reclaim the space if informed.

  • Initiate space unmap using a filesystem shrink operation. Here the filesystem size is reduced by 5 GB. Check the reclaim information using the lvmstat command.

                      

  • Initiate space unmap using filesystem removal. The filesystem removal operation removes the logical volume, which frees the associated physical partitions. Because space unmap is performed asynchronously, a pending count can be seen on hdisk10.

                    

                    

  • Initiate space reclamation by removing a logical volume.

  • Disk Device driver:

The disk device driver interacts with the storage subsystem using standard SCSI commands. It uses the SCSI INQUIRY command to find out whether the device is thinly provisioned and, upon receiving an unmap request from LVM, sends a SCSI command to the storage subsystem to reclaim the space.

Related Tunable(s)

Some aspects of the space release function can be managed through system tunables that are available via the ioo command (or the corresponding SMIT panel). This includes:

  • Enabling/disabling the function without requiring a reboot
  • Controlling the amount of memory resource used for this functionality

To process requests to release blocks back to storage, the AIX disk driver uses a dedicated, system-wide pool of buffers. The number of buffers in this pool is one of the factors that determine the maximum number of requests the AIX disk driver can process in parallel.

The following tunables can be changed:

  • dk_lbp_enabled

Setting this tunable to 0 (zero) disables the functionality on the AIX node. The change takes effect without a reboot. By default, this tunable is set to 1 (one), which means the feature is enabled on the AIX node.

To query the current setting, display the tunable with the ioo command: ioo -o dk_lbp_enabled

To disable this function, set the tunable to 0: ioo -o dk_lbp_enabled=0

  • dk_lbp_num_bufs

To process requests from LVM to release blocks back to storage, the AIX disk driver uses a dedicated, system-wide pool of buffers. The number of buffers in this pool limits the maximum number of requests that the AIX disk driver processes in parallel. To check whether any requests were aborted because the number of buffers was too low, the AIX administrator can view the /proc/sys/disk/lbp/statistics file. A non-zero Out-of-Memory counter suggests that the pool size may need to be increased.

This tunable accepts any value between 1 and 1024 and can be changed without disrupting ongoing requests.

To query the current value of this tunable, display it with the ioo command: ioo -o dk_lbp_num_bufs

To check whether some space-release requests have failed because of insufficient buffers, view the statistics file: cat /proc/sys/disk/lbp/statistics

To set the number of buffers in the pool to 128, set the tunable with the ioo command: ioo -o dk_lbp_num_bufs=128


 

  • dk_lbp_buf_size

Traditionally, most disks use a sector size of 512 bytes, but some newer storage products also support 4096-byte sectors. For space release requests to work on a thin-provisioned disk that uses 4096-byte sectors, the buffer size of the pool must be set to 4096. Note that a buffer size of 4096 bytes also works with thin-provisioned disks that use a block size of 512 bytes.

To query the current buffer size used for the space-release pool, display the tunable with the ioo command: ioo -o dk_lbp_buf_size

To set the buffer size for the space-release pool to 4096 bytes, set the tunable with the ioo command: ioo -o dk_lbp_buf_size=4096
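The query and set operations for the three tunables can be sketched as a small helper that only builds ioo command lines and checks values before anything is run. The helper names and the ALLOWED table are hypothetical. The 1 to 1024 range comes from the text above, while restricting dk_lbp_buf_size to 512 and 4096 is an assumption based on the two sector sizes mentioned; the real tunable may accept other values. Nothing is executed, so the sketch is safe to try on any system.

```python
# Hypothetical helper: builds ioo command lines, does not run them.
ALLOWED = {
    "dk_lbp_enabled": {0, 1},           # 1 = enabled (default), 0 = disabled
    "dk_lbp_num_bufs": range(1, 1025),  # range stated in the article
    "dk_lbp_buf_size": {512, 4096},     # assumption: only the sizes the article mentions
}


def ioo_query(name):
    """Return the argument list that displays one tunable."""
    if name not in ALLOWED:
        raise KeyError(f"unknown tunable: {name}")
    return ["ioo", "-o", name]


def ioo_set(name, value):
    """Return the argument list that sets one tunable, after validating the value."""
    if name not in ALLOWED:
        raise KeyError(f"unknown tunable: {name}")
    if value not in ALLOWED[name]:
        raise ValueError(f"{value} is not accepted for {name} in this sketch")
    return ["ioo", "-o", f"{name}={value}"]


print(" ".join(ioo_query("dk_lbp_num_bufs")))       # ioo -o dk_lbp_num_bufs
print(" ".join(ioo_set("dk_lbp_num_bufs", 128)))    # ioo -o dk_lbp_num_bufs=128
print(" ".join(ioo_set("dk_lbp_buf_size", 4096)))   # ioo -o dk_lbp_buf_size=4096
```

On an AIX system, the resulting list could be handed to subprocess.run by an administrator with the required privileges; validating first avoids submitting a value outside the documented range.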


Key points to note:

  • Virtual storage support in PowerVM: Storage space reclamation is supported through NPIV (N_Port ID Virtualization) for supported storage. It is not supported for storage attached in virtual SCSI mode.
  • Supported storage: In the initial release, this feature is supported on the following storage products at firmware levels that include the thin-provisioned disk feature.
    • IBM DS8000
    • IBM XIV
    • IBM FlashSystem A9000
    • IBM SVC
    • EMC Symmetrix Family
  • Space reclamation is best effort: Per the SCSI specification, space reclamation is a best-effort function, so different storage products implement it differently. They use different block sizes, and each request must be aligned to the correct block size. In addition, the LVM physical partition size is fixed at volume group creation, and the reclaim block size may not align with the partition size or the partition start. Some storage subsystems use a reclaim block size that is much bigger than the LVM partition size, and these subsystems might not support partial block reclamation. In this scenario, LVM might not be able to accumulate enough contiguous free partitions to reclaim a whole block. Therefore, deleting multiple LVM partitions might not reclaim an equivalent amount of space in the storage subsystem.
  • Performance considerations: To minimize the impact on I/O performance, the disk driver treats space release operations as low priority compared to regular read/write operations, and LVM tries to minimize the number of requests submitted per device.
  • Device thin provision capability detection: LVM and the disk driver attempt to detect the capability dynamically. If they cannot, the user should vary off and vary on the volume group after turning on the capability at the storage.
  • Unmapping free partition space from an old VG or from an interrupted request: Volume groups created before the supported version may have free partition space on their physical volumes. This free space is not eligible for automatic unmap after upgrading to the supported version. To initiate unmap of this space, the administrator has to create and delete a dummy logical volume on those free partitions. Space is reclaimed automatically for partitions that are freed after the supported version is installed. There are also cases where an asynchronous unmap operation is interrupted by varyoffvg or a system crash. In that case, to initiate the remaining unmap operations, the administrator has to create and delete a dummy logical volume on the partitions that were freed before the interruption.
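The alignment issue described under "Space reclamation is best effort" can be made concrete with a little arithmetic. The sketch below is not the algorithm used by LVM or the disk driver; it assumes a storage subsystem that reclaims only whole, aligned blocks, and the function name and the sizes (64 MB partitions, 256 MB reclaim block) are hypothetical values chosen for illustration.

```python
def reclaimable_range(start, length, reclaim_block):
    """Return (aligned_start, aligned_length) of the whole reclaim blocks that
    fit inside the freed extent [start, start + length).
    All arguments use the same unit (megabytes in the examples below)."""
    if start < 0 or length < 0 or reclaim_block <= 0:
        raise ValueError("start and length must be >= 0, reclaim_block > 0")
    first = -(-start // reclaim_block) * reclaim_block          # round start up
    last = (start + length) // reclaim_block * reclaim_block    # round end down
    if last <= first:
        return first, 0          # no whole block fits inside the extent
    return first, last - first


# One freed 64 MB partition at offset 320 MB: no whole 256 MB block fits.
print(reclaimable_range(320, 64, 256))    # (512, 0)

# Four contiguous freed partitions starting on a block boundary: all reclaimable.
print(reclaimable_range(256, 256, 256))   # (256, 256)

# Five contiguous freed partitions starting at 192 MB: only 256 of 320 MB qualify.
print(reclaimable_range(192, 320, 256))   # (256, 256)
```

The first and third cases show why the space reclaimed at the storage subsystem can be smaller than the space freed at the LVM level.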

Additional Information

  • Refer to the man pages of commands for more details.
