WebSphere Health Policies are a powerful, proactive monitoring solution for maintaining the health of your server environment. In this post, we'll take you through the steps of creating and applying a health policy. If you're new to Health Management in WebSphere, check out this prior post for an overview of Health Management.
The first step in policy creation is navigating to the Health Policies page (Operational Policies -> Health Policies).
From this page, you can create, view, and modify health policies. To create a policy, click the "New..." button.
This brings you to the page below, where you can define the health policy name, provide a description of the policy for later reference, and select the health condition for the policy.
There are two options when selecting the health condition: you can choose one of several predefined conditions or create a custom health condition.
Note: The predefined conditions Excessive response time, Excessive request timeout, Storm drain detection, and Workload based require the use of an ODR. If using the plugin ODR, only the Excessive response time and Excessive request timeout are available. The Age based, Memory condition: excessive memory usage, Memory condition: memory leak, and Garbage collection percentage conditions do not require an ODR.
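The note above can be summarized as a small lookup. This is only an illustrative sketch in plain Python, not a WebSphere API: the routing labels "none", "plugin", and "odr" are made up for the example, and it assumes that the conditions which do not require an ODR stay available in every setup.

```python
# Illustrative lookup only; none of these names are WebSphere APIs.
# True means the predefined condition requires an ODR, per the note above.
REQUIRES_ODR = {
    "Excessive response time": True,
    "Excessive request timeout": True,
    "Storm drain detection": True,
    "Workload based": True,
    "Age based": False,
    "Memory condition: excessive memory usage": False,
    "Memory condition: memory leak": False,
    "Garbage collection percentage": False,
}

# ODR-dependent conditions that the note lists as available with the plugin ODR.
PLUGIN_ODR_CONDITIONS = {"Excessive response time", "Excessive request timeout"}


def available_conditions(routing):
    """Return the predefined conditions usable for a routing setup.

    routing is a hypothetical label: "none", "plugin", or "odr".
    """
    if routing not in ("none", "plugin", "odr"):
        raise ValueError("unknown routing setup: %r" % (routing,))
    result = []
    for name, needs_odr in REQUIRES_ODR.items():
        if not needs_odr:
            result.append(name)  # assumed available in every setup
        elif routing == "odr":
            result.append(name)
        elif routing == "plugin" and name in PLUGIN_ODR_CONDITIONS:
            result.append(name)
    return result
```

Calling available_conditions("plugin") returns the four conditions that need no ODR plus the two response-based conditions, which matches the note.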
Let's use a predefined condition. In this case, we've selected the Age-based condition. After choosing the health condition, click "Next".
The next health policy page shows options for how the health condition is monitored, whether the reaction mode is automatic or supervised, and what action(s) will be taken when the health condition is breached.
Depending on the health condition chosen on the prior screen, you'll have various options on how the condition is triggered. In this example, we've chosen 1 minute for the maximum age (age is how long the member of a policy has been in a "started" state).
Next, choose the reaction mode for the policy. In automatic reaction mode, the health controller creates and executes tasks following a policy breach. In supervised reaction mode, the health controller creates the tasks but does not execute them; they require user approval before any action is taken. For this policy, we've chosen "Supervised".
We can then define which action(s) the health controller takes on a policy breach. If more than one action is chosen, the actions will execute in the order they are defined. You can add, remove, and move actions in the action list. Here we've chosen "Restart Server".
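To make the difference between the two reaction modes concrete, here is a minimal model in plain Python. It is a sketch of the behavior described above, not how the health controller is implemented; HealthPolicy, RuntimeTask, and on_breach are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HealthPolicy:
    """Hypothetical stand-in for a health policy definition."""
    name: str
    reaction_mode: str  # "automatic" or "supervised"
    actions: List[str] = field(default_factory=list)  # kept in defined order


@dataclass
class RuntimeTask:
    """Hypothetical stand-in for the task created after a breach."""
    policy_name: str
    pending: List[str]   # actions waiting for user approval
    executed: List[str]  # actions already run, in order


def on_breach(policy: HealthPolicy) -> RuntimeTask:
    """Model what happens when a policy's health condition is breached."""
    if policy.reaction_mode == "automatic":
        # Task is created and its actions run in the order they were defined.
        return RuntimeTask(policy.name, pending=[], executed=list(policy.actions))
    if policy.reaction_mode == "supervised":
        # Task is created, but nothing runs until a user approves it.
        return RuntimeTask(policy.name, pending=list(policy.actions), executed=[])
    raise ValueError(f"unknown reaction mode: {policy.reaction_mode!r}")
```

For the policy built in this post, on_breach(HealthPolicy("age-policy", "supervised", ["Restart Server"])) yields a task with "Restart Server" pending and nothing executed (the policy name "age-policy" is made up here).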
Note: If using a custom action, it must be defined before creating the health policy. Custom actions can then be accessed via the "Add Action" button.
Once you've finished defining the condition properties and actions, click "Next".
The next page allows you to specify the members of a health policy. Use the "Filter by" drop-down menu to view available members for a policy. You can filter by Servers/nodes, Clusters, Dynamic Clusters, On Demand Routers, and Cells.
After selecting a filter, the left pane will display available members. To add a member to the policy, select it and click "Add >>".
You should then see the member appear in the right pane.
Once you've finished defining policy members, click "Next".
This will take you to the last page in policy creation, where you can review a summary of the health policy before saving it. If everything looks good, click "Finish".
After creating a policy, you'll see it listed on the Health Policies page. There will be a message alerting you that your configuration has changed. Click "Save" if you want to save this policy to the master configuration.
From here, if using a supervised policy, you'll need to approve health actions from the Runtime Tasks page. If using an automatic policy, there will be an informational entry in Runtime Tasks with details on when the policy was triggered. The Runtime Tasks page can be found by going to System Administration -> Task Management -> Runtime Tasks.
Runtime Tasks: Supervised Policy
For a supervised policy, we can check whether it has been triggered and choose whether to accept the health action from the Runtime Tasks page. Navigate to Runtime Tasks (System Administration -> Task Management -> Runtime Tasks).
Actions for supervised health policies will appear in this pane.
If you've followed along with this post and created a policy with an age-based condition of 1 minute, you might notice it takes more than 1 minute for the action to appear in the Runtime Tasks pane. This is because the default control cycle length of the Health Controller is 5 minutes. The control cycle length defines the amount of time between environment checks initiated by the health controller. So, while our age-based condition is met quickly, we have to wait until the controller wakes up and evaluates policies. This brings up an important point: Health Management offers default settings that suit most environments, but in some cases you will want to tune the controller to suit your environment and needs (for example, the control cycle length can be configured). For more information on tuning Health Management, refer to the Knowledge Center article on that topic.
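The timing can be worked through with a short calculation. The function below is a simplified sketch in plain Python, not controller code: it assumes checks happen at fixed intervals starting at a known offset after the server starts, and that the condition counts as breached once the age is at least the maximum age. The function and parameter names are made up for illustration.

```python
import math


def detection_time_minutes(max_age, cycle_length=5, first_check_at=0):
    """Minutes after server start at which a check first sees the breach.

    Assumes checks run at first_check_at, first_check_at + cycle_length, and
    so on, and that the age condition is breached once age >= max_age.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    if max_age <= first_check_at:
        # The very first check already sees a server that is old enough.
        return first_check_at
    # Number of further cycles needed before a check lands at or past max_age.
    cycles = math.ceil((max_age - first_check_at) / cycle_length)
    return first_check_at + cycles * cycle_length
```

With a 1 minute maximum age and the default 5 minute cycle, detection_time_minutes(1) gives 5 in this model, so the task can show up as much as several minutes after the condition is actually met. Shortening the cycle length narrows that gap at the cost of more frequent checks.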
Turning our attention back to the Runtime Tasks pane: after the Health Controller detects the violation of a supervised policy, a task appears with the options to Accept, Deny, or Close it.
Click on the radio button under the "Select" column, select how you want to handle the task, and then click "Submit" to execute the task.
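As a rough sketch of what submitting a decision means for a supervised task, the plain Python function below returns the actions that would run for each choice. It is a hypothetical model, not a WebSphere API. This post does not describe what Close does in detail, so the sketch simply assumes that only Accept leads to actions being executed.

```python
def actions_to_run(pending_actions, decision):
    """Return the actions that run for a supervised task after a decision.

    Assumption for illustration: only "Accept" executes the pending actions,
    in their defined order; "Deny" and "Close" run nothing.
    """
    if decision not in ("Accept", "Deny", "Close"):
        raise ValueError("unknown decision: %r" % (decision,))
    if decision == "Accept":
        return list(pending_actions)
    return []
```

For the policy in this post, accepting the task would run "Restart Server", while denying it leaves the server untouched.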
Runtime Tasks: Automatic Policy
Runtime Tasks displays information on automatic health tasks, including when the health condition was breached and the status of the health action. There is no option to accept or deny tasks for automatic health policies.
If you've followed along with this article, you should now know how to create and configure basic health policies and check on the status of their execution.
To learn more about Health Management in general, visit the following Knowledge Center page: