Hi @Brian Robertson,
You can combine the following:
(1) QRadar notification rules for specific QIDs from log source type = "System Notification", such as QID 38750092 "Disk storage unavailable".
(2) "Basic" SNMP monitoring: memory used, storage, etc. Depending on your monitoring solution, you could also have an HTTP sensor for the GUI, which basically checks whether Tomcat is responding to requests. From what you mentioned, the basic SNMP polling is already in place.
(3) "Advanced" SNMP monitoring: process/service monitoring for critical services like ecs-ec-ingress, ecs-ec, hostcontext, etc.
(4) Anomaly rules that check for a lack of events over a certain period of time. You could use these to double-check that ecs, flow collection, and related services are working OK.
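For the "service stopped for more than a certain period" case, here is a minimal Python sketch of the idea behind (3). It is not a QRadar feature and not an SNMP check: it assumes you can run a script on the appliance (or over SSH) and that the services are systemd units with the names used in this thread, which you should verify on your own version. `CRITICAL_SERVICES`, `service_state`, `DowntimeTracker` and `poll_once` are hypothetical names made up for illustration.

```python
import subprocess

# Assumption: unit names as mentioned in this thread; adjust per appliance type.
CRITICAL_SERVICES = ["hostcontext", "hostservices", "tomcat", "ecs-ec-ingress", "ecs-ec"]


def service_state(name):
    """Return what `systemctl is-active` prints, e.g. 'active', 'inactive' or 'failed'."""
    result = subprocess.run(
        ["systemctl", "is-active", name],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or "unknown"


class DowntimeTracker:
    """Remembers when each service was first seen down, so that an alert
    is only raised once it has stayed down for a grace period."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.down_since = {}  # service name -> time it was first seen down

    def update(self, states, now):
        """Record the latest states; return names down for >= grace_seconds."""
        overdue = []
        for name, state in states.items():
            if state == "active":
                # Service is healthy again: forget any earlier outage.
                self.down_since.pop(name, None)
                continue
            first_seen = self.down_since.setdefault(name, now)
            if now - first_seen >= self.grace_seconds:
                overdue.append(name)
        return sorted(overdue)


def poll_once(tracker, now):
    """Check every critical service once and return the overdue ones."""
    states = {name: service_state(name) for name in CRITICAL_SERVICES}
    return tracker.update(states, now)
```

You would keep one `DowntimeTracker(grace_seconds=300)` alive in a long-running process, call `poll_once(tracker, time.monotonic())` every minute or so, and forward any returned names to your central monitoring system. Short restarts then do not page anyone, which seems to be what you are after.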
------------------------------
Cheers,
Damian Zinni
------------------------------
Original Message:
Sent: Mon July 08, 2019 11:33 PM
From: Brian Robertson
Subject: Health Monitoring of QRadar Appliances
Hi All,
I'm part of an MSSP in NZ and we currently have numerous QRadar deployments. One of our biggest pain points is monitoring the health of the various QRadar appliances we have deployed. We do have a centralised monitoring system that can poll basic metrics (like CPU, memory utilization, disk space, etc.) via snmpwalk, but not QRadar-specific items.
As an example, we'd like to be alerted if a core QRadar service (hostcontext, hostservices, tomcat, etc.) has been stopped for more than a certain period of time. I've been looking for specific events in the system that show this but haven't been able to find anything yet. There are heaps of events against the health metrics log source, but none seem to show exactly what I need.
Keen to hear what others have done in this space.
Thanks & Regards
Brian
------------------------------
Brian Robertson
------------------------------