This past October, IBM held a virtual user group event (playback still available) covering all facets of IBM Z, including AI Ops, z/OS Academy, Dev Ops, CICS, and IMS. A great deal of strong content was presented over the two-week event. In the AI Ops track, we discussed how many of our customers are on the journey to AI Ops and are looking for solutions to help them become more resilient in their operations. The development teams here at IBM have been hard at work delivering new capabilities to give our customers the right tools to become more resilient.
We cannot gain new insights or drive any kind of analytics without first having access to data. IBM Z is one of the most well-instrumented platforms in the world today, generating vast volumes of operational data. IBM Z Common Data Provider continues to be the solution for getting near-real-time access to your IBM Z operational data and making it available wherever you want to run analytics.
Earlier this month, IBM released enhancements to Common Data Provider that will benefit a large number of our customers. The first extends the breadth of supported data by adding new data types and updating existing data types for currency. The data enhancements delivered in this release are:
- New RMF III reports for STORC, STORF, and STORR
- New records for the z/OS Workload Interaction Correlator data for CICS and IMS
- Currency support for CICS, VTAM, and MQ SMF data
- New support for ICSF SMF 82 records
The types of data collected are just one piece of the puzzle. Another is the targets that consume the IBM Z data streams. One of the most common targets of operational data that our customers use is Splunk®. Since Splunk is extremely prevalent among our customers, Common Data Provider contains several optimizations for ingesting IBM Z data that reduce overhead and data volumes. Many customers use the data receiver provided with the solution to take advantage of these optimizations. With this latest delivery, the data receiver can now be run as a service in the operating system, further simplifying how the data receiver is managed in the environment.
Another mechanism for ingesting data into Splunk is the HTTP Event Collector. This latest delivery contains multiple enhancements for customers who want to use this path, including:
- The ability to configure the Splunk source type for data streams in the Common Data Provider configuration tool (streams no longer require the suffix '_kv')
- The ability to stream more log data, including CICS EYULOG, CICS MSGUSER, WebSphere SYSOUT and SYSPRINT, and many others
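For readers unfamiliar with the HTTP Event Collector path, the following is a minimal sketch of what a single event posted to Splunk's `/services/collector/event` endpoint looks like, with the source type set explicitly on the event. It is not how Common Data Provider itself is implemented or configured. The host name, token, source type name `zos-cics-msguser`, and the helper `build_hec_request` are all hypothetical placeholders chosen for illustration.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype, host=None, time=None):
    """Build (but do not send) a POST request for Splunk's HTTP Event Collector.

    All argument values used below are illustrative assumptions, not values
    taken from Common Data Provider.
    """
    payload = {"event": event, "sourcetype": sourcetype}
    if host is not None:
        payload["host"] = host
    if time is not None:
        payload["time"] = time  # epoch seconds
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC expects the token in this form: "Splunk <token>"
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical endpoint, token, and source type name.
req = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"message": "sample CICS MSGUSER line"},
    sourcetype="zos-cics-msguser",
)

# Sending is left commented out so the sketch runs without a Splunk instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Because the source type travels with each event, a configurable source type on the sending side decides which Splunk parsing rules and dashboards the data lands in.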
Lastly, this latest delivery for Common Data Provider includes additional enhancements around data collection startup, troubleshooting, and configuration tool security. Full details of all of the enhancements in this and past deliveries can be found on the Common Data Provider knowledge center.
With such an abundance of data, a challenge our customers still face is determining what insights they can gain from the data that they didn't have before. IBM Z Operations Analytics provides these insights through integrations with IBM Watson Machine Learning for z/OS and as dashboards on multiple analytics platforms. Additional key enhancements that expand these insights are now available for IBM Z Operations Analytics.
First, IBM Z is a core component in our customers' hybrid clouds and hybrid applications, which often depend on solutions like IBM z/OS Connect. In this latest delivery, new dashboards are available to provide real-time information on the health of your API economy. Dashboards with drill-down capability have been delivered for both Splunk and Elastic® Stack, giving customers insight into z/OS Connect APIs, Services, and Request URIs. The following screen capture is an example of one of the dashboards now available, the Request URI dashboard showing metrics such as transaction counts, elapsed time, data rates, and error counts.
Over the past year, the team has been extremely busy enhancing the IBM Z Operations Analytics integration with Watson Machine Learning for z/OS in order to detect anomalous behavior on IBM Z subsystems. Detecting operational anomalies before they lead to broader business impacts makes operations more resilient and can help prevent costly outages. One of the misconceptions I have seen is how simple some of the machine learning and AI solutions appear. The amount of data science that goes into these types of solutions is quite amazing, and it has been fun to watch the brilliant work the development team has done to expand machine learning and AI to IBM Z.
With this latest delivery, Operations Analytics provides new ways to interact with the valuable data derived from scoring real-time operational KPI data against a model of your system's normal behavior. It includes the following capabilities:
- The ability to zoom into the scorecard view to get a more granular look at a specific time period
- The ability to quickly hide non-anomalous data in order to focus on the most relevant information
- The ability to visualize multiple subsystems on the same scorecard in order to visually correlate anomalous behavior
- The ability to navigate quickly across time periods to see trends and patterns
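To make the idea of scoring KPI data against a model of normal behavior more concrete, here is a deliberately simplified sketch. It is not the method used by Operations Analytics or Watson Machine Learning for z/OS. It assumes the "model" is just the mean and standard deviation of a baseline window, and the function names, the sample values, and the threshold of 3.0 are all illustrative assumptions.

```python
from statistics import mean, stdev


def score_kpi(baseline, observations, threshold=3.0):
    """Score each observation as a z-score against a baseline window.

    Simplifying assumption: normal behavior is summarized by the baseline's
    mean and sample standard deviation. Real solutions use far richer models.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    results = []
    for timestamp, value in observations:
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        results.append(
            {
                "time": timestamp,
                "value": value,
                "score": z,
                "anomalous": abs(z) > threshold,
            }
        )
    return results


def hide_non_anomalous(results):
    """Keep only the intervals flagged as anomalous."""
    return [r for r in results if r["anomalous"]]


# Hypothetical transaction-rate samples for one subsystem.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
observations = [("10:00", 101), ("10:05", 99), ("10:10", 140)]

scored = score_kpi(baseline, observations)
flagged = hide_non_anomalous(scored)
for row in flagged:
    print(row["time"], row["value"], round(row["score"], 1))
```

Filtering the scored intervals down to the flagged ones is the same idea as hiding non-anomalous data on a scorecard: the normal intervals are still computed, they are just kept out of view.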
Advance warning of potential issues is one of the most valuable benefits of the solution. To surface these anomalies quickly, Operations Analytics now supports sending event notifications that can be integrated into existing event management systems. Many event management systems are already configured to create tickets or notify operations of these system events. The integration becomes even more interesting when these events are fed into a hybrid cloud solution that can correlate information from across the enterprise, specifically IBM Watson AI Ops.
Lastly, this latest delivery for Operations Analytics includes additional enhancements around IBM Watson Machine Learning for z/OS Platform support, tuning and training enhancements, and Splunk HTTP Event Collector support for existing dashboards. Full details of all of the enhancements in this and past deliveries can be found on the Operations Analytics knowledge center.