Storage


Sizing and Understanding DS8000 FlashCore Module compression in z/OS environments

By Nick Clayton posted Wed January 15, 2025 05:36 AM

  

FlashCore Modules (FCM) provide a compression capability for primary storage on z/OS that is independent of dataset type and is applied to all data on the DS8000. They do not provide the same performance and other ancillary benefits as zEDC compression or coprocessor compression for Db2 tables, and so should not be seen as a replacement for them. However, they can provide data reduction for those datasets that are not supported by dataset compression capabilities.

Given that many clients use both zEDC for sequential data and coprocessor compression for their Db2 tables, we expect to see highly variable compression across different volumes in a z/OS environment. However, there are likely to be significant volumes of non-compressed datasets, such as Db2 indexes or VSAM databases, that will benefit from FCM compression.

If dataset encryption is used, then these datasets will not be compressed at all on the storage system, and if encryption is pervasive then the overall compressibility at the storage level may be very limited.

Another factor to consider is that, with CKD track formatting, the actual data stored on a 56KB track is less than 56KB. With 4KB records there is 48KB of data on a track, and compression will be able to recover some of this formatting overhead even if the data itself is not compressible. Because of these various factors it is important to understand the range of compression ratios that would be experienced in a z/OS environment in order to determine the average compression ratio.
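As a rough illustration of the formatting effect alone, the short sketch below assumes the approximate 56KB track and 48KB of user data quoted above, and assumes that the formatting overhead compresses to almost nothing; it is purely illustrative rather than an exact model of the DS8000.

# Illustration of the CKD track formatting effect described above.
# Assumptions: a ~56KB 3390 track holding twelve 4KB records (48KB of user
# data), and that the formatting overhead compresses to almost nothing.

track_capacity_kb = 56      # approximate formatted track size
user_data_kb = 12 * 4       # twelve 4KB records = 48KB of user data

# Even if the user data itself is incompressible, removing the formatting
# overhead gives roughly this reduction per track:
formatting_only_ratio = track_capacity_kb / user_data_kb
print(f"Reduction from track formatting alone: {formatting_only_ratio:.2f}:1")
# ~1.17:1 before any compression of the data itself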

DFSMS Compression Estimation Tool

The DFSMS Compression Estimation Tool enables the user to obtain an estimate of the compressibility of data on a z/OS volume by using the zEDC compression functionality with a set of parameters that emulate how FlashCore Module compression is performed. It is provided as a new utility within the DFSMSdss product. It performs the same processing as a dump of a volume but, rather than creating a backup, it compresses the allocated tracks and calculates an average compression ratio for them. As well as the compression statistics, the tool also provides free space statistics for the volume, so that both the thin provisioning and compression benefits can be calculated from its output.

Comparing Estimations and FlashCore Module compression

The compression estimation tool is not a precise match for what happens inside the DS8000 and so there will always be some difference between the results of the tool and what is seen in the DS8000. The following graph shows a comparison of the Compression Estimation Tool results and the actual FlashCore Module compression for a real-world environment. 

Compression Estimation Tool sample results


 
The results show that the estimation tool was within 10% of the actual compression ratio for the majority of samples, and for volumes with a low compression ratio the FlashCore Modules tended to achieve a slightly better compression ratio than the tool predicted. This provides a high degree of confidence that the tool can be used for planning purposes.

Using the Compression Estimation Tool

To obtain a good overall estimate of the compressibility of data in a z/OS environment it is important to run the Compression Estimation Tool against a statistically significant and representative set of data.

Compression Estimation Tool volume breakdown


The chart above shows the variation of compression in a real-world environment, with the bars showing the amount of capacity at different compression ratios. Different DFSMS storage groups and groupings of NONSMS volumes are shown in different colours; they show that there is variation both between and within storage groups, even though definite groupings can be seen.

Where possible, the best results will be obtained by running the tool against all volumes in an environment. However, this is unlikely to be practical in many cases, so the guidelines below help identify a representative subset of volumes to run the tool against.

The following should be taken into consideration (a simple sketch of the volume selection logic follows the list):

  • Perform the estimation on volumes that together comprise at least 10% of the capacity of the environment. A larger sample would be better, but this should be considered a minimum.
  • Include volumes from all DFSMS storage groups and a sample of significant NONSMS volumes. Aim for 10% of volumes within each storage group with a minimum of 10 volumes and a maximum of 100 if the storage group is very large.
  • When selecting volumes, choose those with the least free space, as free space is not considered when doing the compression analysis.
  • Select both large and small volumes within the same storage group, as DFSMS will tend to skew new allocations towards smaller volumes, so they may contain different data than larger volumes.
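The sketch below is purely illustrative and applies the per-storage-group selection rule from the list above (roughly 10% of volumes, minimum 10, maximum 100, preferring volumes with the least free space). The input structure and field names are hypothetical; the volume and free-space information would come from your own reporting.

def select_sample(volumes_by_group):
    """Pick roughly 10% of the volumes in each storage group (minimum 10,
    maximum 100), preferring the volumes with the least free space."""
    sample = {}
    for group, volumes in volumes_by_group.items():
        target = max(10, min(100, round(len(volumes) * 0.10)))
        target = min(target, len(volumes))   # small groups: take what exists
        # Least free space first, so the analysed tracks are mostly real data
        ordered = sorted(volumes, key=lambda v: v["free_space_cyls"])
        sample[group] = [v["volser"] for v in ordered[:target]]
    return sample

# Example (hypothetical) input shape:
# {"DB2PROD": [{"volser": "DB2001", "free_space_cyls": 120}, ...], ...}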

For each storage group or volume grouping, you could assume that the data compresses by the weighted average of the compression ratios of the volumes evaluated. If there is very significant variation within a particular grouping, it may be advisable to increase the number of volumes evaluated to make sure that the sample is representative.
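As a minimal sketch of that weighted-average calculation (field names and sample figures are invented purely for illustration), the overall ratio for a grouping is the total allocated capacity divided by the total capacity after compression, so larger volumes carry more weight:

def weighted_average_ratio(samples):
    """samples: one entry per evaluated volume, for example
       {"allocated_gb": 450.0, "compression_ratio": 2.4}"""
    total_allocated = sum(s["allocated_gb"] for s in samples)
    total_compressed = sum(s["allocated_gb"] / s["compression_ratio"] for s in samples)
    return total_allocated / total_compressed

db2_samples = [
    {"allocated_gb": 450.0, "compression_ratio": 2.4},
    {"allocated_gb": 300.0, "compression_ratio": 1.6},
    {"allocated_gb": 150.0, "compression_ratio": 1.2},
]
print(f"Weighted average compression ratio: {weighted_average_ratio(db2_samples):.2f}:1")
# 900GB allocated compresses to 500GB, a 1.80:1 weighted average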

Using results for sizing a new storage system

With FlashCore Modules it is mandatory to use thin provisioned volumes, and compression is always performed on the data on the FCMs. Therefore, in the majority of environments the system will be overprovisioned, with the total capacity of the volumes greater than the physical capacity of the drives. When sizing the system, you need to consider both the savings provided by thin provisioning and those provided by compression, as together these determine the required physical capacity of the system.

For example, assume that 15% of the space in the environment is not used thanks to thin provisioning and that the used space compresses to 50% of its original size. If the existing storage capacity is 500TB, this would consume 212.5TB of FlashCore Module capacity.

Typically, users who overprovision will plan for the utilisation of the system to stay below 80% and will purchase additional capacity if this threshold is reached. On that basis, around 266TB (212.5TB / 0.8) would meet the 500TB required today. If the user had 10% year-on-year growth and wanted to provide for 4 years of growth, then the capacity required would be 389TB.
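The short sketch below simply reproduces the arithmetic of this example; the inputs are the assumptions stated above, and you would substitute your own tool results and growth expectations for a real sizing exercise.

provisioned_tb   = 500.0   # total volume capacity in the environment
thin_unused      = 0.15    # 15% of provisioned space never written (thin provisioning)
compression_rate = 0.50    # compressed data occupies 50% of its original size
max_utilisation  = 0.80    # plan to keep the FCMs below 80% full
growth_per_year  = 0.10    # 10% year-on-year growth
years_of_growth  = 4

physical_now    = provisioned_tb * (1 - thin_unused) * compression_rate
needed_now      = physical_now / max_utilisation
physical_future = physical_now * (1 + growth_per_year) ** years_of_growth
needed_future   = physical_future / max_utilisation

print(f"Physical capacity consumed today : {physical_now:.1f} TB")    # 212.5 TB
print(f"Capacity to stay under 80% today : {needed_now:.1f} TB")      # 265.6 TB
print(f"Capacity for 4 years of growth   : {needed_future:.1f} TB")   # 388.9 TB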

It is highly likely that the granularity of Flash drives today does not exactly match the capacity requirements of a particular environment and so the capacity should be rounded to the nearest possible configuration.

In this case a sensible configuration might be 4 drive sets of 9.6TB FCMs, although in many cases it is likely that the performance requirements could also be met by 2 drive sets of 19.2TB FCMs, as shown below. Both of these configurations would provide 408TB of capacity.

Example DS8000 drive configurations
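As a final illustrative step, the sketch below picks the smallest candidate configuration that meets the projected requirement. The two candidates and their 408TB usable capacities are taken from the example above; a real sizing should use current DS8000 configuration options and also weigh the performance requirements.

required_tb = 389.0
candidates = [
    {"name": "4 drive sets of 9.6TB FCMs",  "usable_tb": 408.0},
    {"name": "2 drive sets of 19.2TB FCMs", "usable_tb": 408.0},
]

viable = [c for c in candidates if c["usable_tb"] >= required_tb]
best = min(viable, key=lambda c: c["usable_tb"])   # equal capacities: let performance decide
print(f"Smallest viable configuration: {best['name']} ({best['usable_tb']}TB usable)")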





Comments

Thu January 23, 2025 10:29 AM

That's great - thanks for the additional details.  We'll look forward to the future posts :)

Thu January 23, 2025 06:05 AM

Hi Joel,

If the host data is already compressed with zEDC we would expect minimal compression benefit from the FCMs. However if it is compressed with coprocessor compression (eg Db2 tables or VSAM) then we would expect some additional benefits from the FCMs. I'll be posting another entry with some more details on this.

The data in Safeguarded Copy is the same as the data on the source, except it only contains the data that is being updated, so in general it is going to see the same compression as the source volume. Perhaps a little less if the data that is regularly updated is more the zEDC data than otherwise.

However the main benefit of FCMs for SGC is the ability to make use of large extents and get the much larger virtual capacity this provides. If you have a large system then this will enable easier setting of the Safeguarded Multiplier without running out of virtual capacity. I suspect this should also be the subject of another post...

Wed January 22, 2025 11:52 AM

Hi Nick,

Thank you for an excellent write up on the sizing utility.  Very helpful for planning.  Could you expand on this to provide guidance on using this method for sizing FCMs for Safeguarded Copy capacity?  We see this as a great first step for Z customers that are hesitant to use FCMs for production workload given the capacity variability.  If the SGC source data is already compressed (via Z host compression or FCMs), would you expect to see any additional benefit with FCMs for SGC use?

Thanks!

Fri January 17, 2025 05:39 AM

Hi Gavin,

It would be a little difficult to provide enough information to StorM for it to reliably calculate an average compression percentage for an entire DS8000 from results from a subset of volumes. 

However the idea of taking a set of volume results and then providing a projection for a system/sysplex is a good one and definitely something we might consider on the z/OS side, where we have knowledge of the DFSMS storage groups etc and could more reliably provide a projection.

Fri January 17, 2025 04:24 AM

Does Storage Modeller allow importing the DFSMS results for capacity sizing?