By Christoph Theisen
Introduction
IBM Db2 Query Monitor for z/OS is a performance management tool for monitoring SQL executions on Db2 for z/OS. In addition to monitoring, it has the capability to generate alerts and launch automated actions in an out-of-line situation. This includes generating WTOs and emails, running scripts or batch jobs, canceling Db2 threads, etc.
Many Db2 for z/OS Administrators appreciate the monitoring capabilities of Db2 Query Monitor but are not aware of the alerting function or have not explored its concepts in depth.
This series of blogs is intended to provide more insight into the alerting function of Db2 Query Monitor. It covers:
- Basic concepts and practical examples of setting up alerts and automated actions.
- Tips and tricks on how to work with the message board.
Prerequisites for alerts
Before you configure alerts and automated actions, you must set up the CAE infrastructure of Db2 Query Monitor: the browser-based user interface, the web server (the CAE server), and the CAE agents, which run as started tasks on z/OS. You need at least one instance of a CAE server (running on Windows or z/OS UNIX System Services) and one CAE agent running locally on every LPAR where a Db2 Query Monitor data collector is running. For more information on the installation and configuration of the CAE components, refer to Installing CAE components.
From the alerting perspective, it does not matter whether the CAE server runs on Windows or z/OS UNIX System Services. The platform makes a difference only for actions that launch scripts or batch files, because the CAE server runs them locally on its own server instance.
If you already have the CAE infrastructure in place, alerts and actions require no additional installation work and no re-customization with IBM Tools Customizer.
Setting up alerts
Db2 Query Monitor can initiate automated actions when an alert is generated. Alerts have much in common with Exceptions in Db2 Query Monitor. Like Exceptions, alerts for Db2 SQL statements fall into three basic categories:
- Threshold alerts
  These are generated when a defined static threshold is reached for any of the following:
  - In-Db2 CPU time
  - In-Db2 elapsed time
  - Number of Getpages
  - Number of SQL calls
- Anomaly-based alerts
  These are generated when an anomaly threshold is reached for any of the following:
  - In-Db2 CPU time
  - In-Db2 elapsed time
  - Number of Getpages
- SQLCODE alerts
  These are generated based on the SQLCODEs that Db2 SQL statements return.
Alerts are configured per workload, where a workload is one entry in a monitoring profile. Multiple workloads can be configured in parallel for alert processing.
The screenshot shows an alert configuration (taken from the ISPF user interface) for a workload that generates threshold alerts for In-Db2 CPU time, In-Db2 elapsed time, and Getpages, as well as alerts for negative SQLCODEs.
Because the Alert Threshold for the number of SQL calls is set to 0, no alerts are generated from SQL call counts. Alerts based on anomaly thresholds are also excluded.
An Administrator can exclude certain negative SQLCODEs from alert processing and include a list of positive SQLCODEs.
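The evaluation described above can be pictured with a small sketch. This is not how Db2 Query Monitor is implemented or configured; the class names, field names, threshold values, and SQLCODE lists are hypothetical and only illustrate the rules stated in this section: a threshold of 0 disables that check, negative SQLCODEs raise alerts unless excluded, and positive SQLCODEs raise alerts only if explicitly included.

```python
from dataclasses import dataclass


@dataclass
class AlertConfig:
    # Hypothetical names and values; a threshold of 0 disables that check.
    cpu_seconds: float = 2.0
    elapsed_seconds: float = 10.0
    getpages: int = 20_000
    sql_calls: int = 0
    excluded_negative_sqlcodes: frozenset = frozenset({-803})
    included_positive_sqlcodes: frozenset = frozenset({100})


@dataclass
class Execution:
    cpu_seconds: float
    elapsed_seconds: float
    getpages: int
    sql_calls: int
    sqlcode: int = 0


def evaluate(execution, config):
    """Return one alert label per criterion that the execution satisfies."""
    alerts = []
    checks = [
        ("In-Db2 CPU time", execution.cpu_seconds, config.cpu_seconds),
        ("In-Db2 elapsed time", execution.elapsed_seconds, config.elapsed_seconds),
        ("Getpages", execution.getpages, config.getpages),
        ("SQL calls", execution.sql_calls, config.sql_calls),
    ]
    for name, value, threshold in checks:
        # A threshold of 0 means "no alerts for this metric".
        if threshold > 0 and value >= threshold:
            alerts.append(f"threshold: {name}")

    code = execution.sqlcode
    if code < 0 and code not in config.excluded_negative_sqlcodes:
        alerts.append(f"SQLCODE {code}")
    elif code > 0 and code in config.included_positive_sqlcodes:
        alerts.append(f"SQLCODE {code}")
    return alerts
```

With these assumed thresholds, an execution that uses 4 CPU seconds and 50,000 Getpages yields two alert labels, one for CPU time and one for Getpages, while its SQL call count is ignored because that threshold is 0.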
Note:
An SQL statement execution that satisfies multiple alert criteria also generates multiple alerts (for example, an execution with 50,000 Getpages and 4 CPU seconds would generate two alerts).
Also, Db2 Query Monitor is not intended to process a high number of alerts in a short period of time. For information on setting the thresholds for alerts and exceptions, refer to Guidelines.
The configuration parameter ALERT_LIMIT specified for a Db2 Query Monitor data collector controls the number of alerts a Db2 Query Monitor subsystem can queue before discarding subsequent alerts.
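As a rough mental model of ALERT_LIMIT, the behavior resembles a bounded queue that rejects new entries once it is full. The sketch below is an illustration under that assumption only; the class and method names are hypothetical, and the real data collector may differ in details such as when the queue is drained.

```python
from collections import deque


class AlertQueue:
    """Hypothetical bounded queue: alerts beyond the limit are discarded."""

    def __init__(self, alert_limit):
        self.alert_limit = alert_limit
        self._queue = deque()
        self.discarded = 0

    def offer(self, alert):
        """Queue the alert if there is room; otherwise count it as discarded."""
        if len(self._queue) >= self.alert_limit:
            self.discarded += 1
            return False
        self._queue.append(alert)
        return True

    def drain(self):
        """Hand over all queued alerts and empty the queue."""
        items = list(self._queue)
        self._queue.clear()
        return items
```

The model shows why thresholds that are set too low are a problem: once the queue is full, later alerts are lost no matter how important they are.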
Difference between Exceptions and Alerts
Though the configuration looks similar, there is a significant difference in the processing of Exceptions and Alerts.
- Exceptions, like SQLCODEs, Db2 metrics, and Db2 commands, are stored in the interval data sets of Db2 Query Monitor and can be reviewed as long as the interval data is available.
- Alerts are not stored in the interval data sets but are shown on a message board, which is only available from the Db2 Query Monitor browser client.
Configuring automated actions
As a user of the Db2 Query Monitor browser client, you see the main options Alerts, Configuration, and Tools. These are the options that are relevant in the context of alerts and actions.
Important: Viewing or managing alerts on the message board and configuring actions both require an appropriate user role in Db2 Query Monitor.
The Configuration tab contains a list of items that can be configured. Most of them (but not all) are related to automated actions. The first three items (Actions, Scopes, and Responses) must be configured when you want to set up automated actions for a Db2 Query Monitor alert.
- Click the Configuration tab.
- Click the item that you want to configure in the drop-down list (for example, Scopes).
The left pane displays the list of configurable items, and the rest of the window displays the configuration panel for the selected item.
Note: You can switch from one configuration item to another (for example, from Scopes to Actions) by clicking the link on the left pane.
Scope
A scope defines the criteria that need to match for a certain type of alert. Every alert for which Db2 Query Monitor should initiate an automated action needs to fit into at least one scope. Alerts without a matching scope are still generated and displayed on the message board, but no automated action is taken.
There are two basic types of scopes: those based on Domain Elements and those based on Events.
- After selecting Scopes from the Configuration drop-down list or the navigation tree, click Elements or Events for a list of configured Domain Elements and Events.
Domain Element scopes describe the elements that are subject to the monitoring process.
For example,
- Db2 SQL Statement
- Dynamic Db2 SQL Statement (as a sub-type of Db2 SQL Statement)
- Db2 Query Monitor Subsystem
- Db2 Subsystem
- Db2 Buffer Pool
Db2 Query Monitor ships with a predefined list of Domain Element scopes. You can take these Domain Elements as a template and define your own Domain Element scopes, for example,
- Db2 SQL Statements from distributed clients (that is, running under the plan name DISTSERV)
- Db2 SQL Statements running on a certain Db2 Subsystem ID
When you start working with alerts and actions, you will use the predefined Domain Elements. Our practical examples do not require customized or user-specific Domain Elements. Db2 SQL Statement will usually be the Domain Element that you primarily work with.
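Conceptually, a Domain Element scope acts like a filter on the monitored element. The sketch below models the two custom scopes mentioned above as simple predicates. It is purely illustrative: in the product, scopes are defined in the browser client rather than in Python, and the class names, field names, and the subsystem IDs DB2A and DB2B are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SqlStatement:
    plan: str       # plan name under which the statement runs
    db2_ssid: str   # Db2 subsystem ID (values here are hypothetical)


@dataclass(frozen=True)
class ElementScope:
    name: str
    matches: Callable[[SqlStatement], bool]


# Generic scope: every Db2 SQL statement matches.
ANY_SQL = ElementScope("Db2 SQL Statement", lambda stmt: True)

# Narrower scope: only statements from distributed clients.
DISTRIBUTED = ElementScope("Distributed SQL", lambda stmt: stmt.plan == "DISTSERV")


def on_subsystem(ssid):
    """Build a scope for statements running on one Db2 subsystem."""
    return ElementScope(f"SQL on {ssid}", lambda stmt: stmt.db2_ssid == ssid)


def matching_scopes(stmt, scopes):
    """Return the names of all scopes that the statement fits into."""
    return [scope.name for scope in scopes if scope.matches(stmt)]
```

A statement running under DISTSERV on subsystem DB2A would match both the generic scope and the distributed scope, but not a scope built with `on_subsystem("DB2B")`.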
Event scopes are probably more relevant for the alerting and monitoring process. An event scope defines the type of event that must match the alert before an automatic action can be taken.
For example,
- A threshold for Getpages, In-Db2 CPU time, In-Db2 elapsed time, or SQL calls was reached.
- A negative SQLCODE was encountered.
- An anomaly threshold was reached, as determined by Db2 Query Monitor anomaly detection.
Event scopes are organized hierarchically (similar to Domain Element scopes). For example, every event that is generated because a Getpages threshold is reached will match the event scope Get Page Count Exceeded Problem and, at the same time, the more generic scopes Alert Threshold Problem and Sql Problem.
- Click Monitored Information Types in the Configuration tab of the browser client or the navigation tree on the left pane to view this hierarchy.
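The hierarchical matching of event scopes can be sketched as a walk up a parent chain. The three scope names below come from the example above; the assumption that each one is the direct parent of the previous one is a simplification, since the real hierarchy may contain intermediate levels.

```python
# Child -> parent relationships (simplified; intermediate levels may exist).
PARENT = {
    "Get Page Count Exceeded Problem": "Alert Threshold Problem",
    "Alert Threshold Problem": "Sql Problem",
    "Sql Problem": None,
}


def scope_chain(event_type):
    """Return the event type followed by all of its more generic ancestors."""
    chain = []
    current = event_type
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)
    return chain


def matches(event_type, scope):
    """An event matches a scope if the scope is the event type or an ancestor."""
    return scope in scope_chain(event_type)
```

The practical consequence is that a response attached to a generic scope such as Sql Problem also reacts to every more specific event below it, while a response attached to a specific scope reacts only to that type.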
Db2 Query Monitor ships with predefined event scopes that you can use for the specific types of alerts that should trigger automated actions.
Note: More event scopes are available than you normally need for your daily work. To begin with, focus on the event scopes under the Db2 Sql Event node in the hierarchy.
Actions
You need to define the actions that are triggered when an alert is raised that fits at least one of your scopes. Db2 Query Monitor has two main categories of actions:
- CAE Agent-based actions
These are initiated by the CAE agent process. The CAE agent process is a started task running on a z/OS system, so operator commands, WTOs, batch JCL submissions, and Db2 commands typically fall into this category.
- CAE Server-based actions
These actions are initiated by the CAE server, which can run on Windows or z/OS Unix System Services. Sending an email from an alert is a typical example of a CAE Server-based action.
Db2 Query Monitor ships with templates for CAE Agent-based and CAE Server-based actions, which you can clone for your specific use case.
Setting up an action, regardless of whether it is an email, a WTO, or another type of action, requires the specification of a Subject Type and an Event Type.
- The Event Type describes what kind of event is relevant for that action.
- The Subject Type relates to the Domain Elements that you can see in the Scopes configuration item. The Subject Type of a specific action must match the Domain Element of the corresponding event scope.
Db2 Query Monitor also ensures that you can specify only Event Types that are suitable for the selected Subject Type.
To check which combinations of Subject Type and Event Type are supported:
- Click Actions in the Configuration tab of the browser client or the navigation tree on the left pane.
- Click CAE-Agent based actions, then the + icon to create a new action.
- Select WTO Action from the pop-up menu.
- Enter a name for the new action, then click OK. The browser client opens the action editor. View the drop-down menus for Subject Type and Event Type.
- Select Db2SqlStatement from the Subject Type drop-down menu and check the list of possible event types (for example, AlertThresholdProblem).
- Change the subject type to Db2Dbms. The list of possible event types changes (AlertThresholdProblem is no longer shown).
Depending on the combination of Subject Type and Event Type, Db2 Query Monitor can display context information (for example, SQL statement text, authorization ID, plan name, and metrics) in the Message field.
- Click the Discard changes icon to leave the action editor without changes.
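The behavior observed in these steps amounts to a lookup from Subject Type to the set of Event Types it permits. The sketch below captures only what the steps show: Db2SqlStatement offers AlertThresholdProblem and Db2Dbms does not. The event type listed for Db2Dbms is a made-up placeholder, and the function name is hypothetical.

```python
# Subject Type -> Event Types offered in the action editor (incomplete sketch).
ALLOWED_EVENT_TYPES = {
    "Db2SqlStatement": {"AlertThresholdProblem"},
    "Db2Dbms": {"ExampleDbmsProblem"},  # placeholder, not a real event type
}


def is_supported(subject_type, event_type):
    """Return True if the editor would offer this Event Type for the Subject Type."""
    allowed = ALLOWED_EVENT_TYPES.get(subject_type)
    if allowed is None:
        raise ValueError(f"unknown subject type: {subject_type}")
    return event_type in allowed
```

Checking the drop-down menus in the action editor remains the authoritative way to find the supported combinations for your installation.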
Response
A response combines a scope and one or more event types with one or more actions. When a specific alert matches a scope and that specific scope is combined with an action in a response, either the CAE Server or CAE Agent can launch that action. In addition to scopes and actions, it is also mandatory to specify which event state changes should trigger that action.
For example,
- An event of that type is posted for the first time.
- The repetition counter for that type of event is incremented (that is, the event repeats).
- The event is (un)acknowledged or cleared from the message board.
You can define responses and disable or enable them. Only responses in the Enabled state can initiate actions.
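To make the interplay concrete, the sketch below models a response as a record that ties one scope and a set of event state changes to a list of actions, and selects the actions to launch for a given alert. All names are hypothetical, and the model ignores whether an action runs on the CAE Server or a CAE Agent. Actual responses are defined in the browser client, not in Python.

```python
from dataclasses import dataclass
from enum import Enum


class StateChange(Enum):
    POSTED = "posted"                  # event posted for the first time
    REPEATED = "repeated"              # repetition counter incremented
    ACKNOWLEDGED = "acknowledged"
    UNACKNOWLEDGED = "unacknowledged"
    CLEARED = "cleared"


@dataclass
class Response:
    name: str
    scope: str             # scope that the alert must match
    triggers: frozenset    # state changes that fire this response
    actions: tuple         # names of the actions to launch
    enabled: bool = True


def actions_to_launch(alert_scopes, change, responses):
    """Collect actions from enabled responses whose scope and trigger both match."""
    launched = []
    for response in responses:
        if (response.enabled
                and response.scope in alert_scopes
                and change in response.triggers):
            launched.extend(response.actions)
    return launched
```

In this model, disabling a response or removing a state change from its triggers silences its actions without touching the underlying scope or action definitions.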
Note: Db2 Query Monitor uses Knowledge Base Management Language (KBML) for the definition of Scopes, Actions, and Responses. This blog and the practical examples do not cover KBML in detail, but they show examples of how KBML expressions can be used to generate actions based on alerts. For more details, refer to the Db2 Query Monitor documentation.
Summary
This blog has shown the basic concepts of alerting and automated actions in Db2 Query Monitor. Make sure your CAE infrastructure is set up and you have the necessary permissions for the Configuration options in the Db2 Query Monitor browser interface. In the next article, we will show a practical example of an automated WTO action generated from a threshold alert.