Mainframe Storage

Enhancing performance, reliability, and security ensuring the availability of critical business workloads

z/OS Compression Estimation Tool - Example Results

By Nick Clayton posted Mon February 03, 2025 04:58 AM

  

Following on from my last post on this topic, one of our clients got in touch with results from the z/OS Compression Estimation Tool that they had obtained in their environment. They have kindly agreed that the high-level statistical results can be shared, provided that nothing shared contains any identifiable information.

Data reduction comprises both compression and thin provisioning, and both matter when sizing an environment, so we will look at both here. Performing space release on a regular basis ensures that deleted datasets no longer consume logical or physical capacity, and it should be automated based on physical or logical capacity threshold alerts. If you are comparing the Estimation Tool results with the actual system data, this is best done immediately after performing a space release so that the two results match.

Many z/OS clients maintain a pool of spare volumes which can be quickly added to DFSMS storage groups. In a thin provisioned environment these volumes consume only a single extent for the VTOC and VVDS, so it is possible to pre-allocate a larger number of spare volumes than would be practical without thin provisioning. This can be seen in the graph below, where the large amount of capacity with more than 90% free space is the spare volumes.

Excluding the spare volumes, a significant number of storage groups have relatively limited free space and hence limited thin provisioning benefit, but the overall saving due to thin provisioning is still around 26%.

As discussed in the previous post on this topic, compression rates can vary widely depending on how compressible the data is, and especially on whether it has already been compressed or encrypted by z/OS. In the production environment there is a high proportion of database data, and some of this is already compressed with coprocessor compression. Because of this, a compression percentage of between 30% and 40% is estimated for the majority of data in the environment, with an overall average of 33%.

The development environment at this client is very different from the production environment, with higher compression percentages. It has a lot more code, load libraries and other data that is not already compressed and is highly compressible, with large portions of the data achieving between 60% and 80% compression. The overall average data compression percentage is 64%.

The development environment has more free space, as shown below, and so would save more capacity with thin provisioning: 55% of the configured capacity would be saved. This is something that I’ve seen in many development/non-production environments, where multiple teams and varying activities mean that there is more stranded capacity in different DFSMS storage groups than in a well-managed production environment.

The chart below shows the overall breakdown of the configured, logical and physical capacity estimations using the data from the Compression Estimation Tool, excluding the spare volumes. This clearly shows the greater percentage savings in the development environment from both thin provisioning and compression.

*The grey part of the bar is the configured capacity that is saved with thin provisioning and the blue part is the logical capacity saved with compression. The purple part is then the physical capacity consumed on the drives after both thin provisioning and compression savings.
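The breakdown described above can be approximated from the headline percentages in this post. The following is a minimal sketch rather than anything the Compression Estimation Tool produces: the names `CapacityBreakdown` and `estimate` are hypothetical, the 100 units of configured capacity are an arbitrary normalisation, and it assumes that the average compression percentage applies evenly to the logical capacity left after thin provisioning.

```python
from dataclasses import dataclass


@dataclass
class CapacityBreakdown:
    """Capacity figures for one environment, in arbitrary units."""
    configured: float  # capacity defined on the volumes
    logical: float     # capacity remaining after thin provisioning
    physical: float    # capacity remaining after compression as well

    @property
    def thin_saving(self) -> float:
        # Grey part of the bar: configured capacity that is never allocated.
        return self.configured - self.logical

    @property
    def compression_saving(self) -> float:
        # Blue part of the bar: logical capacity removed by compression.
        return self.logical - self.physical


def estimate(configured: float, thin_saving_pct: float,
             compression_pct: float) -> CapacityBreakdown:
    """Apply thin provisioning first, then compression to what is left.

    Assumption: the average compression percentage applies uniformly
    to all of the logical capacity.
    """
    logical = configured * (1 - thin_saving_pct / 100)
    physical = logical * (1 - compression_pct / 100)
    return CapacityBreakdown(configured, logical, physical)


# Percentages quoted in this post; 100 units is a hypothetical baseline.
production = estimate(100.0, thin_saving_pct=26, compression_pct=33)
development = estimate(100.0, thin_saving_pct=55, compression_pct=64)

for name, b in (("production", production), ("development", development)):
    print(f"{name}: thin saving {b.thin_saving:.1f}, "
          f"compression saving {b.compression_saving:.1f}, "
          f"physical {b.physical:.1f}")
```

Under these assumptions, production ends up consuming roughly half of its configured capacity physically, while development consumes roughly a sixth. The actual chart is built from per-volume data, so its figures will not match this simplified calculation exactly.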

These results from a large-scale use of the Compression Estimation Tool in a real-world environment clearly show the potential benefits of FlashCore Modules in a z/OS environment. They also show that it is dangerous to assume a specific compression ratio without knowledge of the data that is present in the environment. In this example the development environment had very different compression estimation results from production, so a projection for production based on the development environment would not have been correct.
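To put a number on that risk, the short sketch below compares the physical capacity needed for the same amount of logical data under the two average compression percentages quoted in this post. The 100 TB of logical data and the function name `physical_needed` are hypothetical and chosen only for illustration.

```python
def physical_needed(logical_tb: float, compression_pct: float) -> float:
    """Physical capacity required once compression has been applied."""
    return logical_tb * (1 - compression_pct / 100)


logical_tb = 100.0  # hypothetical amount of production logical data

# Sizing with the production average (33%) versus wrongly reusing
# the development average (64%) for the same production data.
actual = physical_needed(logical_tb, 33)
projected = physical_needed(logical_tb, 64)
shortfall = actual - projected

print(f"needed {actual:.0f} TB, projected {projected:.0f} TB, "
      f"shortfall {shortfall:.0f} TB")
```

With these inputs, a production system sized on the development figure would be provisioned with a little over half of the physical capacity it actually needs.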
