Hello Scott
Yesterday I deleted the RFE that I had opened earlier this week.
The Appliance DOES record in its Event Log, or send as SNMP traps, events indicating the expected behavior, as can be observed in this event trail from our back-end monitoring system:
Major someserver 2019-10-23 07:49:32.0 [ISS]LogData: WGAWA0643E High CPU utilization: 100% (CPUUtilizationState)[name=system,priority=high] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM IM317009 N
Minor someserver 2019-10-23 07:48:32.0 2019-10-23 08:05:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 81% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM N
Minor someserver 2019-10-23 07:25:08.0 2019-10-23 07:26:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 85% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM Y
Minor someserver 2019-10-23 07:16:04.0 2019-10-23 07:17:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 88% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM Y
Minor someserver 2019-10-23 07:01:08.0 2019-10-23 07:01:12.0 [ISS]LogData: WGAWA0043W High CPU utilization: 89% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM N
Major someserver 2019-10-23 06:47:24.0 2019-10-23 07:26:08.0 [ISS]LogData: WGAWA0643E High CPU utilization: 90% (CPUUtilizationState)[name=system,priority=high] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM Y
Indeterminate someserver 2019-10-23 06:43:43.0 2019-10-23 06:43:47.0 [ISS]LogData: WGAWA0650I The CPU utilization has fallen below the configured threshold: 79% (CPUUtilizationState)[name=system,priority=low]***Clear*** logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM N
Minor someserver 2019-10-23 06:41:43.0 2019-10-23 06:50:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 86% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM Y
Indeterminate someserver 2019-10-23 06:27:57.0 2019-10-23 06:30:01.0 [ISS]LogData: WGAWA0650I The CPU utilization has fallen below the configured threshold: 76% (CPUUtilizationState)[name=system,priority=low]***Clear*** logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM Y
Minor someserver 2019-10-23 06:20:13.0 2019-10-23 06:31:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 82% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM Y
Indeterminate someserver 2019-10-23 06:06:46.0 2019-10-23 06:11:31.0 [ISS]LogData: WGAWA0650I The CPU utilization has fallen below the configured threshold: 75% (CPUUtilizationState)[name=system,priority=low]***Clear*** logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM Y
Minor someserver 2019-10-23 06:05:46.0 2019-10-23 06:11:11.0 [ISS]LogData: WGAWA0043W High CPU utilization: 80% (WGAWA0043W)[name=system,priority=medium] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_WGAWA0043W IS-AAM Y
Major someserver 2019-10-22 19:09:21.0 2019-10-22 19:21:04.0 [ISS]LogData: WGAWA0643E High CPU utilization: 92% (CPUUtilizationState)[name=system,priority=high] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM Y
Indeterminate someserver 2019-10-22 19:04:59.0 2019-10-22 19:05:05.0 [ISS]LogData: WGAWA0643E High CPU utilization: 93% (CPUUtilizationState)[name=system,priority=high] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM N
Indeterminate someserver 2019-10-22 18:49:00.0 2019-10-22 18:49:04.0 [ISS]LogData: WGAWA0650I The CPU utilization has fallen below the configured threshold: 72% (CPUUtilizationState)[name=system,priority=low]***Clear*** logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM N
Indeterminate someserver 2019-10-22 18:41:30.0 2019-10-22 18:44:31.0 [ISS]LogData: WGAWA0643E High CPU utilization: 94% (CPUUtilizationState)[name=system,priority=high] logdatatrap SNMPTRAP-iss-ISS-MIB-logdatatrap_CPUUtilizationState IS-AAM Y
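For anyone who wants to derive the duration of a high-CPU condition from a trail like this one, here is a minimal Python sketch. It rests on assumptions: the regular expression is inferred only from the lines pasted above, the first timestamp on each line is taken as the event time, WGAWA0650I is treated as the clear event, and each clear is paired with the first alert seen since the previous clear. The names `parse_event` and `high_cpu_durations` are hypothetical and not part of any product.

```python
import re
from datetime import datetime

# Pattern inferred from the pasted trail: first timestamp, message ID, CPU percentage.
EVENT_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d\s.*?LogData: (WGAWA\d{4}[EWI]) .*?: (\d+)%"
)
CLEAR_ID = "WGAWA0650I"  # assumption: this ID marks "fallen below the configured threshold"


def parse_event(line):
    """Return (timestamp, message_id, cpu_percent), or None if the line does not match."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return ts, m.group(2), int(m.group(3))


def high_cpu_durations(lines):
    """Return (start, end, duration) for each alert episode closed by a clear event."""
    # The trail is listed newest first, so sort into chronological order.
    events = sorted(e for e in map(parse_event, lines) if e is not None)
    durations = []
    start = None  # time of the first alert since the last clear
    for ts, msg_id, _pct in events:
        if msg_id == CLEAR_ID:
            if start is not None:
                durations.append((start, ts, ts - start))
            start = None
        elif start is None:
            start = ts
    return durations
```

Feeding it the lines of the trail would return one tuple per episode that ended with a clear event; episodes still open at the end of the trail are not reported.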
Cheers
------------------------------
Sylvain Gilbert
------------------------------
Original Message:
Sent: Tue October 22, 2019 02:18 AM
From: Scott Exton
Subject: wga_notifications and it's watchdogs
Sylvain,
This is just a limitation with the current event framework which we are using. Feel free to raise an RFE if you need the event framework to raise an event when the system recovers from a prior alert.
Thanks.
Scott A. Exton
Senior Software Engineer
Chief Programmer - IBM Security Access Manager
IBM Master Inventor
Phone: 61-7-5552-4008
E-mail: scotte@au1.ibm.com
L11 & L7 Seabank, Southport, QLD 4215, Australia
Original Message------
Thanks Scott
While we are on the subject, I have received questions from teammates asking why the Appliance does not send an SNMP trap (or just write to the Event Log, depending on the System Alert configured) when the CPU usage, for instance, drops below a warning or critical level.
Currently, one can only observe in the Appliance Event Log the occurrences where the CPU usage rises above one of the warning/critical thresholds. Because the Event Log does not show when the CPU usage drops back below a threshold, one cannot assess how long the condition lasted.
Although one can use the LMI/REST API to query CPU usage data points and visualize the CPU usage pattern (trend analysis), from a pure Event Management standpoint, knowing when the CPU usage returns to a more normal pattern could help an external monitoring system auto-resolve incidents or, conversely, delay the automatic opening of an incident ticket when the high CPU usage is only a very short one-time spike.
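To make the idea concrete, here is a rough Python sketch of what an external monitoring system could do if it received both alert and recovery events. Everything in it is hypothetical: the class `CpuIncidentTracker`, the five-minute hold time, and the returned action strings are illustrative assumptions, not the behavior of any existing monitoring product.

```python
from datetime import datetime, timedelta


class CpuIncidentTracker:
    """Hypothetical correlator: delays ticket opening and auto-resolves on recovery."""

    def __init__(self, hold=timedelta(minutes=5)):
        self.hold = hold            # how long an alert must persist before a ticket opens
        self.pending_since = None   # time of the first alert not yet ticketed
        self.ticket_open = False

    def on_alert(self, ts):
        """Record a high-CPU alert; returns "open" once the hold time has elapsed."""
        if self.pending_since is None and not self.ticket_open:
            self.pending_since = ts
        return self.check(ts)

    def on_clear(self, ts):
        """Handle a recovery event: resolve an open ticket or drop a short spike."""
        if self.ticket_open:
            self.ticket_open = False
            return "resolve"
        if self.pending_since is not None:
            self.pending_since = None
            return "suppress"
        return None

    def check(self, now):
        """Call periodically; opens a ticket when the condition has outlasted the hold."""
        if self.pending_since is not None and now - self.pending_since >= self.hold:
            self.pending_since = None
            self.ticket_open = True
            return "open"
        return None
```

With such logic, an alert followed by a recovery one minute later would yield "suppress" and no ticket, while an alert that persists past the hold time would yield "open" and later "resolve" when the recovery event arrives.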
I am open to hearing from others in the field as well: what are the best practices for this aspect of event management?
Thanks
------------------------------
Sylvain Gilbert
------------------------------