Automate and Reduce Workload Costs While Preserving Critical Data

By Destination Z posted Mon December 23, 2019 03:25 PM

Now it’s possible to automate and reduce workload cost while preserving critical data. Innovation in mainframes is happening right here in Silicon Valley. While everyone is chasing the next big, disruptive unicorn, Silicon Valley mainframers are quietly disrupting the industry with new customer-driven innovation.

Curiously, most non-mainframe technology professionals are unaware of opportunities to innovate on mainframe software. They think the platform is full of age-old dinosaurs that occupy entire rooms, use punch-card input, and require bits-and-bytes programming. Things have changed a lot over the years. Today's mainframes, now the size of a refrigerator, run the most critical work of our top industries. These include banks and other large financial institutions, insurance companies, healthcare providers, utilities, government, the military, and a multitude of other public and private enterprises. Mainframes are a multi-billion dollar industry, so innovations here have a large positive impact on the industry and the global economy.

The Challenge

First, let's take a look at one key challenge that the mainframe industry faces today. Many surveys have revealed that the biggest challenge for companies with mainframes is rising cost. Mainframe costs include several components, but one to be aware of is the IBM monthly license charge (MLC).

For decades, mainframe customers have manually managed workload capping strategies, constantly shifting workload allocations to manage costs and meet business demand. But these techniques are manual, risky, and error-prone, and they often don't provide the desired results.

Mainframe customers are often charged for the software that runs on a mainframe through an MLC based on peak usage, measured in millions of service units (MSUs). To determine the MLC, z/OS generates monthly reports that determine the customer's system usage (in MSUs) during every hour of the previous month using a rolling average (e.g., a 4-hour rolling average) recorded by each LPAR or capacity group for the customer. The hourly usage metrics are then aggregated to derive the customer's total monthly and hourly peak utilization, which is used to calculate the bill.
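
To make the rolling-average idea concrete, here is a minimal sketch in Python. It assumes one MSU sample per hour for a single LPAR, which is a simplification for illustration; the function names and the sample values are hypothetical and are not part of any IBM reporting tool.

```python
from collections import deque


def rolling_average(samples, window=4):
    """Return the trailing average over up to `window` samples at each step.

    Assumption: `samples` holds one MSU reading per hour. Early entries
    average over fewer samples until the window fills.
    """
    recent = deque(maxlen=window)
    averages = []
    for value in samples:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


def peak_rolling_average(samples, window=4):
    """Return the highest rolling average, the figure a peak-based bill keys on."""
    return max(rolling_average(samples, window))


# Hypothetical hourly MSU readings with a two-hour spike.
hourly_msu = [100, 120, 400, 380, 150, 110]
peak = peak_rolling_average(hourly_msu)
print(f"Peak 4-hour rolling average: {peak} MSU")
```

In this sketch a two-hour spike to roughly 400 MSUs raises the peak rolling average well above the typical hourly load, which is why short bursts of work can have an outsized effect on the monthly charge.
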

To control costs, a customer might assign each LPAR or capacity group a consumption limit (e.g., a Defined Capacity or Group Capacity Limit), so that it can't use more MSUs than its limit allows. But this may result in some work not receiving the CPU resources it needs, in effect slowing the execution and completion of that work. This can have very undesirable effects on important workloads. Because meeting the performance objectives of high-importance work is a necessary part of shifting resources, customers tend to raise capacity limits to meet demand and avoid outages for their clients. But raising the capacity limit for as little as an hour can increase MLC costs substantially.

Today’s Solution

The ideal solution should be dynamic: it should automate the reduction of MLC cost while also mitigating the risk to critical business workloads. Let’s discuss one such approach.

The solution system should dynamically change LPAR defined capacity (DC) and group capacity limit (GCL) values, taking into account the changing importance of the workload. One way to do this is to query the Workload Manager (WLM) component of the operating system that runs on each LPAR for a breakdown of MSU use by WLM service class, period, and importance class. The system then groups that usage by importance class and aggregates it across the multiple LPARs and sysplexes whose CPU capacity is being managed.
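
The grouping step described above can be sketched as follows. This is only an illustration: the record layout, field names, and values are assumptions, and in a real system the records would come from WLM data on each LPAR rather than from a hard-coded list.

```python
from collections import defaultdict


def msu_by_importance(records):
    """Sum MSU use per WLM importance class across all LPARs and sysplexes.

    Assumption: each record is a dict with hypothetical keys "sysplex",
    "lpar", "service_class", "period", "importance", and "msu".
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec["importance"]] += rec["msu"]
    # Sort by importance class so the summary reads from class 1 upward.
    return dict(sorted(totals.items()))


# Hypothetical samples from two sysplexes and three LPARs.
records = [
    {"sysplex": "PLEX1", "lpar": "LPAR1", "service_class": "ONLINE",
     "period": 1, "importance": 1, "msu": 120.0},
    {"sysplex": "PLEX1", "lpar": "LPAR2", "service_class": "ONLINE",
     "period": 1, "importance": 1, "msu": 80.0},
    {"sysplex": "PLEX1", "lpar": "LPAR2", "service_class": "BATCH",
     "period": 2, "importance": 5, "msu": 60.0},
    {"sysplex": "PLEX2", "lpar": "LPAR3", "service_class": "REPORTS",
     "period": 1, "importance": 3, "msu": 40.0},
]

for importance, msu in msu_by_importance(records).items():
    print(f"importance {importance}: {msu} MSU")
```
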

The solution system requires intelligent use of several components of z Systems:

  1. Base Control Program internal interface (BCPii), which allows z/OS system applications to access the Hardware Management Console (HMC) to modify DCs and GCLs.
  2. A continuous data collection system that collects 4-hour rolling average (4HRA) and WLM importance data.
  3. A continuous monitoring system that dynamically reallocates CPC capacity using BCPii, based on the real-time data gathered by the continuous data collection system. This system dynamically adjusts the DCs and GCLs to favor high-importance work and limit low-importance work, achieving maximum high-importance throughput at the lowest possible cost across all billing entities.
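
As a rough illustration of the reallocation idea in the third component, the sketch below divides a fixed MSU budget among LPARs, satisfying the most important work first. It is a simplified stand-in rather than the algorithm of any product: the function name, the input layout, and the proportional-sharing rule are all assumptions, lower importance numbers are assumed to mean more important work (as on WLM's 1 to 5 scale), and actually applying the resulting values would go through BCPii, which is not shown.

```python
def plan_defined_capacities(demand, budget):
    """Split an MSU budget across LPARs, most important work first.

    demand: {lpar_name: {importance_class: msu_demand}} (hypothetical layout)
    budget: total MSUs available to the group
    Returns {lpar_name: granted_msu}, a candidate defined capacity per LPAR.
    """
    granted = {lpar: 0.0 for lpar in demand}
    remaining = float(budget)
    # Assumption: a lower importance number means more important work.
    levels = sorted({imp for per_lpar in demand.values() for imp in per_lpar})
    for level in levels:
        wanted = {lpar: per_lpar.get(level, 0.0)
                  for lpar, per_lpar in demand.items()}
        total = sum(wanted.values())
        if total == 0:
            continue
        # Grant this level in full if it fits; otherwise share what is
        # left in proportion to each LPAR's demand at this level.
        share = min(1.0, remaining / total)
        for lpar, msu in wanted.items():
            granted[lpar] += msu * share
        remaining -= total * share
        if remaining <= 0:
            break
    return granted


# Hypothetical demand: both LPARs run importance 1 work plus discretionary batch.
demand = {
    "LPAR1": {1: 100, 3: 50, 5: 80},
    "LPAR2": {1: 60, 5: 120},
}
plan = plan_defined_capacities(demand, budget=250)
for lpar, msu in plan.items():
    print(f"{lpar}: {msu:.1f} MSU")
```

With these made-up numbers, all importance 1 and importance 3 demand is covered, and only the importance 5 work is throttled to fit within the budget. A monitoring loop would rerun a calculation like this whenever fresh 4HRA and importance data arrives.
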

The net effect is reduced MLC cost, while also meeting the goals for important workloads. Another post will look more at various performance and cost management tips and tricks.

Hemanth Rama is a senior software engineer at BMC Software. He has 11+ years of working experience in IT. He holds 1 patent and 2 pending patent applications. He works on the BMC Mainview for z/OS, CMF Monitor, and Sysprog Services product lines and has led several projects. More recently he has been working on the BMC Intelligent Capping for zEnterprise (iCap) product, which optimizes MLC cost. He holds a master's degree in computer science from Northern Illinois University. He writes regularly on LinkedIn Pulse, BMC Communities, and his personal blog.