
IBM Z Anomaly Analytics (ZAA) provides a platform to collect and analyze Key Performance Indicators (KPIs). In the latest continuous delivery release, several KPIs are marked as deprecated, meaning that we plan to remove them in a future release. The result will be more focused anomaly detection and reduced noise in the anomalies detected.
The first release of ZAA supported anomaly detection only for Db2. We have added support for CICS regions, MQ Queue Managers, IMS regions and RMF data from z/OS. We now support 5 subsystems and over 400 KPIs. Our customers needed help understanding the meaning of the individual KPIs and the significance of an anomaly in the KPI so they could take action to mitigate or fix the situation.
The ZAA development team reviewed every KPI in the product. We spent many hours with our development and service counterparts in each of the product areas to understand every KPI, what an anomaly in it would indicate, and its predictive value. For some of the KPIs, we determined that the KPI had little value in indicating an undetected problem or in providing additional insight into a problem.
In some cases, the values were constant or were derived from configuration parameters. The information is available from other sources and will not produce an anomaly. For example, VTAM_ACB_DYN_OPENS will change only on recovery from CICS or VTAM termination, so it will never indicate a condition that warrants investigation. The MQCPL_STRUCT_CNT in the MQ Coupling Facility group contains the number of objects defined in the coupling facility. It will change only when an object is added or removed, and therefore will not indicate any unusual condition.
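To illustrate why a constant KPI cannot produce an anomaly, here is a minimal Python sketch that flags KPIs whose collected values never vary. It is not the review process described above, which was a manual review with product experts, and it is not part of ZAA. The function name low_information_kpis, the sample values, and HYPOTHETICAL_BUSY_KPI are all invented for illustration.

```python
def low_information_kpis(samples):
    """Return the names of KPIs whose collected values never vary.

    `samples` maps a KPI name to a list of collected values.
    A KPI with fewer than two distinct values has no variation
    for an anomaly detector to score.
    """
    flagged = []
    for name, values in samples.items():
        if len(set(values)) < 2:
            flagged.append(name)
    return sorted(flagged)


# Invented sample data: the first two KPIs are named in the text,
# but the values shown here are made up for the example.
samples = {
    "VTAM_ACB_DYN_OPENS": [3, 3, 3, 3, 3, 3],
    "MQCPL_STRUCT_CNT": [12, 12, 12, 12, 12, 12],
    "HYPOTHETICAL_BUSY_KPI": [40, 55, 47, 90, 52, 61],
}

print(low_information_kpis(samples))
```

A check like this only catches values that are strictly constant over the sampled window. KPIs that change rarely, such as a count that moves only when an object is added or removed, still need the kind of expert judgment described above.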
In other cases, the function is still supported but product enhancements and usage patterns have changed over time. All KPIs in the IMS Checkpoint Format Buffer Statistics group fall into this category. IMS still supports the function and generates the log record, but the record no longer provides any predictive or diagnostic value. The function monitored by the READ_AHEAD_BPOOL and READ_AHEAD_IO KPIs in the MQ Queue Manager group is no longer functional after MQ 9.0. The fields are still part of the SMF 115 record, but they are not used.
The Db2 Latch group contains KPIs that are deeply tied to the internals of Db2. Anomalies in latch usage would be the effect of other problems in Db2 and not the root cause. The latch counts only have meaning in association with other metrics or events that are not recorded in SMF records. They may be useful when debugging other issues, but ZAA does not have the proper context to evaluate the meaning of an anomaly in these metrics.
All of this results in a tighter focus on KPIs that have predictive value. It will reduce the “noise” from KPIs that are anomalous but don’t indicate a looming problem. It also reduces the data collected and stored, which should reduce the CPU cycles used for both streaming and scoring and the storage required for the EDW.
To learn more about the supported KPIs and which are to be deprecated, visit our product documentation.
To learn more about IBM Z Anomaly Analytics and its ability to proactively identify operational issues, visit our product page.