Hi,
It starts with defining what "up" and "down" mean. The web server will either return something or nothing; if it returns something, the most common error is HTTP status code 500. Most monitoring solutions consider a resource healthy if it is alive and responds with status code 200 or 302 (redirect).
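To make the up/down rule concrete, here is a minimal sketch in Python using only the standard library. The names `probe`, `classify` and `HEALTHY_CODES` are mine for illustration, and the URL you pass in is whatever your Maximo entry point happens to be; treat it as a starting point, not as how any particular monitoring product works.

```python
import urllib.error
import urllib.request

# Assumption: 200 and 302 count as healthy, everything else (or no answer) as down.
HEALTHY_CODES = {200, 302}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so a 302 is reported as a 302."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def classify(status):
    """Map an HTTP status code (or None for 'no response') to 'up' or 'down'."""
    if status is None:
        return "down"
    return "up" if status in HEALTHY_CODES else "down"


def probe(url, timeout=10):
    """Return the HTTP status code for url, or None if nothing came back."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with 3xx/4xx/5xx
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...
```

You would call it as `classify(probe(url))` with your own URL. Note that this only tells you the web server answers, which is exactly the limitation I describe next.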
From an IT perspective that's nice, but it does not cover everything. One example is administrative mode: if you leave that on by accident, your monitor won't pick up that the end customer can't log in :). The same goes if you use AD/OAuth and the end customer can't authenticate; you get where I'm going...
Additionally, you could be behind a strict firewall. In that case cloud-based solutions might still be used to orchestrate the response, but the actual detection of the problem should be done from within your network.
It all depends on your use case; plan accordingly. For example, we use a Docker image for this: it runs a script that performs an actual login every 15 minutes from within our network and sends the results to our telemetry platform. Our public edge is not a simple FW > Maximo web server configuration, so we also monitor the health of the edge components, network routes, infrastructure and other dependent services individually and continuously.
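Roughly, the shape of such a login check could look like the sketch below. This is not our actual script: `login_fn` and `send_fn` are hypothetical callables you would supply yourself (one that performs a real login against your Maximo and returns True/False, one that ships a JSON string to whatever telemetry platform you use), and the field names in the record are made up for illustration.

```python
import json
import time
from datetime import datetime, timezone

CHECK_INTERVAL_SECONDS = 15 * 60  # every 15 minutes, as described above


def run_login_check(login_fn):
    """Run one synthetic login and return a metric record as a dict."""
    started = time.monotonic()
    try:
        ok = bool(login_fn())
        error = None
    except Exception as exc:  # any failure in the login counts as "not ok"
        ok = False
        error = str(exc)
    return {
        "check": "maximo_login",  # hypothetical metric name
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "success": ok,
        "duration_seconds": round(time.monotonic() - started, 3),
        "error": error,
    }


def main_loop(login_fn, send_fn, interval=CHECK_INTERVAL_SECONDS):
    """Check, ship the result, sleep, repeat. Runs until the process is stopped."""
    while True:
        send_fn(json.dumps(run_login_check(login_fn)))
        time.sleep(interval)
```

Keeping the login and the sending as plug-in functions means the same loop works whether you authenticate natively, via AD or via OAuth, and you can test it with a fake login before pointing it at production.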
The telemetry platform then makes decisions based on the metrics and notifies the appropriate people via a paging service. Please keep in mind that we monitor the health of services and microservices ourselves, not just up/down, so our use case might not apply to you.
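As an illustration of the kind of decision such a platform makes, here is a small sketch of a "page only if it has been down long enough" rule, using the 30 minutes from the original question as the threshold. The function name `should_page` and the input format (a list of timestamp/success pairs) are assumptions of mine; a real telemetry platform will have its own way of expressing this.

```python
from datetime import datetime, timedelta

# Assumption: page once the service has been failing for at least 30 minutes.
DOWN_THRESHOLD = timedelta(minutes=30)


def should_page(results, now, threshold=DOWN_THRESHOLD):
    """Decide whether to notify, given check results sorted oldest first.

    results is a list of (timestamp, success) tuples. We page only when the
    most recent check failed and the unbroken run of failures leading up to
    it started at least `threshold` ago.
    """
    first_failure = None
    for ts, ok in results:
        if ok:
            first_failure = None  # a success resets the outage window
        elif first_failure is None:
            first_failure = ts  # start of a new run of failures
    return first_failure is not None and now - first_failure >= threshold
```

The point of the threshold is to avoid paging people for a single failed check, for example during a short restart.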
So if your Maximo is deployed internet-facing and you allow ICMP (ping), a simple tool will do part of the job. If you have a strict security posture or your use case is more complex, you will probably need more than one tool ;) and more than one source of data to verify that your service is available.
If you need any help just PM me ;)
Christiaan
Maxlogic
------------------------------
Christiaan Lok
------------------------------
Original Message:
Sent: Mon May 24, 2021 10:16 AM
From: User1971
Subject: Automatically text & email me if Maximo is down
My organization had an incident recently where Maximo went down (after hours) due to Windows updates on a server. We didn't know about it until Monday morning - when users started emailing us about the issue.
Is there an industry-standard mechanism/product that can be used to reliably notify IT if Maximo is down/unreachable?
For example, is there a product you can recommend that would check every hour if Maximo is up, and if it's down for > 30 mins, then it would text & email a list of IT staff?
Thanks.
#Maximo
#AssetandFacilitiesManagement