
IBM Z OMEGAMON AI for z/OS takes significant strides in mainframe monitoring

  

In March 2024, the IBM Z OMEGAMON AI for z/OS 6.1 Fix Pack 2 release rolled out significant enhancements to the product's already robust z/OS monitoring capabilities. The update introduces expanded reporting attributes and enriched workspace views. These additions provide deeper visibility into CPU usage, memory utilization, and SMF-related activities, helping organizations monitor and optimize performance across their z/OS environments.

Analyzing address space bottlenecks

On mainframes, address space bottlenecks are one of the main causes of performance degradation and potential system instability. They often occur when multiple processes or tasks within the same address space compete for CPU or memory.

To analyze address space bottlenecks, you can capture samples of performance data, covering areas such as CPU usage, memory utilization, and I/O activity at regular intervals. The samples reflect snapshots in time of various performance parameters or attributes.

By analyzing these samples, using AI wherever possible, IBM Z OMEGAMON AI for z/OS 6.1 can reveal underlying issues and forecast potential bottlenecks.

Because the attributes number in the hundreds, they are divided for convenience into groups of related attributes, called attribute groups. Three of the new attributes added in the March update belong to the Address space bottlenecks group: Total CPU Samples, Total Wait Samples, and CPU Loop Index Importance.

You can use these new attributes to estimate how often programs within an address space are either using the CPU or waiting for it to become available. From these counts, a metric known as CPU Loop Index Importance is derived, which quantifies the degree to which an address space is waiting for a free CPU.

CPU Loop Index Importance complements the existing CPU Loop Index attribute. Jobs that show a high CPU Loop Index are not necessarily looping. Only when you combine the Using CPU and CPU Wait numbers does the full effect become visible.
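
The sample-counting idea described above can be illustrated with a short sketch. It is not OMEGAMON code and does not reproduce the product's formula for CPU Loop Index Importance, which this article does not specify. The state labels, the `SampleSummary` class, and the `summarize` function are hypothetical names chosen for illustration; the sketch only shows how a series of samples might be tallied into "using CPU" and "waiting for CPU" counts and percentages.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical state labels for one sample of an address space.
# These are illustrative only, not OMEGAMON attribute values.
USING_CPU = "using_cpu"
WAITING_CPU = "waiting_cpu"
OTHER = "other"


@dataclass
class SampleSummary:
    """Tallies of samples taken for one address space (assumed structure)."""
    total_cpu_samples: int
    total_wait_samples: int
    total_samples: int

    @property
    def using_pct(self) -> float:
        # Share of samples in which the address space was using the CPU.
        if self.total_samples == 0:
            return 0.0
        return 100.0 * self.total_cpu_samples / self.total_samples

    @property
    def waiting_pct(self) -> float:
        # Share of samples in which the address space was waiting for a CPU.
        if self.total_samples == 0:
            return 0.0
        return 100.0 * self.total_wait_samples / self.total_samples


def summarize(samples):
    """Count how many samples fall into each state."""
    counts = Counter(samples)
    return SampleSummary(
        total_cpu_samples=counts[USING_CPU],
        total_wait_samples=counts[WAITING_CPU],
        total_samples=len(samples),
    )


# Ten made-up samples: 6 using the CPU, 3 waiting for it, 1 in another state.
samples = [USING_CPU] * 6 + [WAITING_CPU] * 3 + [OTHER]
summary = summarize(samples)
print(f"using CPU: {summary.using_pct:.1f}%  waiting for CPU: {summary.waiting_pct:.1f}%")
```

Looking at the two percentages side by side, rather than at either one alone, is the kind of combination the paragraph above refers to.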

Monitoring dedicated memory requests

The update also introduces new table views in the Tivoli Enterprise Portal (TEP) that show details about dedicated memory requests. Administrators can use these views to monitor memory usage across an LPAR as well as within specific address spaces. Dedicated memory requests are allocations of memory reserved for particular jobs or started tasks. They are important to monitor because they directly affect system performance and resource allocation. By providing insights into dedicated memory, the new Dedicated Memory table views supplement the existing memory monitoring capabilities of IBM Z OMEGAMON AI for z/OS 6.1.

Managing SMF data

Because SMF records are central to monitoring system performance, it is important to ensure that SMF itself is configured correctly. To facilitate this, four new attribute groups, along with e3270UI workspace views, have been introduced: SMF exits installed, SMF DSN, SMF record types, and SMF global data. These attribute groups and workspace views enable administrators to monitor the installed exits, verify configuration accuracy, and troubleshoot SMF record processing errors.

Summary

The new features offer insights that can significantly improve understanding of processor capacity needs and resource allocation efficiency. They enrich IBM Z OMEGAMON AI for z/OS 6.1 with enhanced visibility into SMF-related activities, CPU usage, and memory utilization, supporting more comprehensive performance monitoring and optimization across z/OS environments.

See also:

Blog: IBM® Z OMEGAMON® AI for z/OS® 6.1 Fix Pack 2

Documentation: What’s new in APAR OA65892, PTF UJ94887
