Hi Charlie,
A long time ago, with the old XI50b and XG45, which had much less memory than IDGs, the customer had issues with memory utilization... sometimes an actual service 'memory leak' was detected, but as the support team, what I see most often is badly sized caching.
Of course, it is frustrating that someone prefers to reboot to clean up memory rather than fix the cache sizing, but that's the way they opted to go :( . It is also frustrating that the customer refuses to update their procedures even when issues are fixed or no longer present, but that is out of my control.
------------------------------
Renato Zancoper
DataPower & MQ Admin
IBM
------------------------------
Original Message:
Sent: Thu August 13, 2020 04:24 PM
From: Charlie Sumner
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
Sorry to take this even further off-topic with this question, but why are you restarting your DataPower after a secure-backup? Are you experiencing anything that is causing you to reboot regularly?
------------------------------
Charlie Sumner
Original Message:
Sent: Thu August 13, 2020 09:54 AM
From: Renato Zancoper
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
Thanks, August;
LB Scenario would certainly be a use-case.
Another one is that we have a weekly maintenance routine that takes a secure-backup and then reboots the appliance ... we just want to move on to the next appliance once the first one is completely configured. Currently, we set a 'long enough' sleep time in the script so that it doesn't trigger the next reboot before the first is done, but that leads to unnecessary delays if the sleep time is too long, or the risk of triggering the next one too early if a problem happens at startup or the sleep time is too short.
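As a rough illustration of replacing the fixed sleep, the script could poll a readiness check with an upper time limit instead. This is only a sketch: `wait_until_ready` and the `is_ready` callable are hypothetical names, and how readiness is actually determined on the appliance (log messages, a status provider, etc.) is left to the caller.

```python
import time


def wait_until_ready(is_ready, timeout_s=600.0, interval_s=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout_s has elapsed.

    is_ready is a caller-supplied (hypothetical) function that returns True
    once the appliance is considered fully configured.
    Returns True if the appliance became ready, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The maintenance script would then move on to the next appliance only when this returns True, and stop or alert when it returns False, so a slow or failed startup no longer depends on guessing the right sleep time.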
I will review the domain-availability option and get back in case additional info is needed.
Regards
------------------------------
Renato Zancoper
DataPower & MQ Admin
IBM
Original Message:
Sent: Thu August 13, 2020 09:25 AM
From: August Ziesemer
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
Thanks a lot for the detailed reply, Peter! (yes, sounds like RFE area and you know how fast or not those can go...)
Renato, per my understanding those messages are per domain, so you would need to track all of them individually.
Another approach to your problem with the large, slow-loading domains has its own risks: there is the domain-availability feature which, if enabled, can keep a domain quiesced until all services are ready. However, that functionality seems to bring mixed results depending on the individual use case, and I saw varying discussions/results with it in the past year, so careful testing is needed if you are interested. This approach could allow you to monitor via the "show domains" status provider whether a domain is ready yet. May I guess that you want to know when your load balancer can shoot at the box again?
------------------------------
August Ziesemer
DataPower Gateways L2 Support
IBM
Original Message:
Sent: Thu August 13, 2020 09:03 AM
From: Renato Zancoper
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
I don't want to hijack PPotkay's question, but I find myself in a very similar scenario; what I want to identify is exactly when the appliance has finished configuring all domains after the reboot.
I have some physical appliances with huge domains which take a couple of minutes each to complete ... I see the message about the completion of each domain, like the one presented by August:
20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.
... but is there an indicator that all of them have completed? Or do I need to check each domain individually?
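For the "check each domain individually" route, a minimal sketch of what that could look like against collected log lines is below. It assumes every domain logs the same 0x8100003b message with its own name in `domain(...)`, and that the caller passes only lines written after the reboot; `pending_domains` and the domain name `big-domain-1` are made up for illustration.

```python
import re

# Matches lines such as:
# 20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.
DOMAIN_OK = re.compile(
    r"^(?P<ts>\d{8}T\d{6}\.\d{3}Z) \[0x8100003b\]\[mgmt\]\[notice\] "
    r"domain\((?P<domain>[^)]+)\): Domain configured successfully\."
)


def pending_domains(log_lines, expected_domains):
    """Return the expected domains that have not yet logged a successful configuration."""
    seen = set()
    for line in log_lines:
        match = DOMAIN_OK.match(line.strip())
        if match:
            seen.add(match.group("domain"))
    return set(expected_domains) - seen


sample = [
    "20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.",
]
# "big-domain-1" is a hypothetical domain that has not reported yet.
print(pending_domains(sample, ["default", "big-domain-1"]))  # prints {'big-domain-1'}
```

An empty result would mean every expected domain has reported, which is about as close to an "all done" indicator as the per-domain messages allow.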
------------------------------
Renato Zancoper
DataPower & MQ Admin
IBM
Original Message:
Sent: Wed August 12, 2020 06:03 AM
From: August Ziesemer
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
The best log indicator is the "Booting" message in the onbox-only AuditLog. However, it comes so early that you cannot subscribe to it in a custom log target that writes to your destination of choice.
So the main challenge for logs is that the custom log target configs come up with a short delay and so may miss a good indicator.
In a quick test I made an "all" "debug" log target in the "default" domain to see which events might be significant and get captured.
The results may vary slightly depending on the exact order of execution. In general the "default" domain *.cfg file is the first one executed and is processed sequentially.
# AuditLog shows time of system start (after "shutdown reload" in my test)
20200812T094510.695Z [sys][success][0x82400013] (SYSTEM:default:*:*): Booting device serial #0000000 DPOS running 1.1, installed 1.1
20200812T094510.695Z [sys][success][0x82400015] (SYSTEM:default:*:*): Booting product id 5725 revision None firmware IDG.2018.4.1.10
20200812T094510.695Z [sys][success][0x82400017] (SYSTEM:default:*:*): Booting build 318002 on 2020/02/21 11:09:49 count 100. Uptime 6221494
# a while later my custom log target "all-debug" in "default" domain starts
# if it would be subscribed to only 0x8040006b and one would consider this the truth and no impact by admin-state changes of it, one may create an event trigger (with regex on exactly this uniquely named log target) to send e.g. an alert to a SMTP log target?
20200812T094523.271Z [0x8040006b][system][notice] logging target(all-debug): tid(111): Logging started.
20200812T094523.271Z [0x00360013][mgmt][info] logging target(all-debug): tid(111): Configured.
# similarly, one might be lucky to catch the SSH service starting as you already mentioned, also with the risk of it being a false positive if its admin-state would be flipped e.g. by an administrator
20200812T094523.284Z [0x00350014][mgmt][notice] ssh(SSH Service): tid(111): Operational state up
20200812T094523.284Z [0x8240001c][audit][info] : tid(111): (admin:default:system:*): ssh 'SSH Service' - Operational state up
# another good indicator could be this message telling you that "default" domain is configured
# if you don't restart "default" domain against best practices, but only refresh it by reload or shutdown, this message would be pretty unique to a system restart
20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.
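If the AuditLog is pulled off the box for analysis, the "Booting device" entry (0x82400013) from my test can be picked out roughly like this. It is only a sketch: `last_boot_time` is a made-up helper name, the pattern is based solely on the sample lines from my test, and the trailing "Z" in the timestamp is assumed to mean UTC.

```python
import re
from datetime import datetime, timezone

# Based on the sample AuditLog line:
# 20200812T094510.695Z [sys][success][0x82400013] (SYSTEM:default:*:*): Booting device serial ...
BOOT = re.compile(
    r"^(?P<ts>\d{8}T\d{6}\.\d{3}Z) \[sys\]\[success\]\[0x82400013\] .*Booting device"
)


def last_boot_time(audit_lines):
    """Return the timestamp of the last 'Booting device' entry, or None if there is none."""
    latest = None
    for line in audit_lines:
        match = BOOT.match(line.strip())
        if match:
            latest = datetime.strptime(
                match.group("ts"), "%Y%m%dT%H%M%S.%fZ"
            ).replace(tzinfo=timezone.utc)
    return latest
```

Only entries logged after that timestamp would then be relevant when looking for the later indicators such as the "Domain configured successfully" message.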
In general, the recommendation for detecting system restarts is rather to subscribe to the "DateTimeStatus" status provider in your monitoring solution and set alerts in that tool if the uptime (reload or reboot) has reset (e.g. to below 2 minutes or whatever).
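The alert rule itself is simple enough to sketch. The snippet below assumes the monitoring tool has already read the uptime from "DateTimeStatus" and converted it to seconds (how it is retrieved and in which format is not shown here); `detect_restart` and the 120-second default are illustrative only.

```python
def detect_restart(previous_uptime_s, current_uptime_s, threshold_s=120):
    """Return True if the uptime indicates a reload or reboot since the last poll.

    A restart is assumed when the uptime went backwards between two polls,
    or when it is below threshold_s (e.g. 2 minutes).
    previous_uptime_s may be None on the very first poll.
    """
    if previous_uptime_s is not None and current_uptime_s < previous_uptime_s:
        return True
    return current_uptime_s < threshold_s


print(detect_restart(6221494, 35))       # True: uptime dropped after a reload
print(detect_restart(6221494, 6221554))  # False: uptime keeps increasing
```

Comparing against the previous sample in addition to the threshold catches a restart even when the polling interval is longer than the threshold.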
------------------------------
August Ziesemer
DataPower Gateways L2 Support
IBM
Original Message:
Sent: Tue August 11, 2020 05:52 PM
From: Peter Potkay
Subject: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot
I'm looking for a reliable indicator in the DataPower logs that announces the appliance is starting back up after a reload or reboot.
It looks like the SSH service starts up early in the reload/reboot scenario, so I'm searching for the following
"'SSH Service' - Operational state up"
But it's not always the first log message on a reload/reboot.
Is there something better to search for in the logs that equates to "Hello everyone, I'm a DataPower appliance and I am now starting"?
------------------------------
Peter Potkay
------------------------------