DataPower


DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

  • 1.  DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    IBM Champion
    Posted Tue August 11, 2020 05:53 PM
    I'm looking for a reliable indicator in the DataPower logs that announces the appliance is starting back up after a reload or reboot.

    It looks like the SSH service starts up early in the reload/reboot scenario, so I'm searching for the following
    "'SSH Service' - Operational state up"
    But it's not always the first log message on a reload/reboot.

    Is there something better to search for in the logs that equates to "Hello everyone, I'm a DataPower appliance and I am now starting"?


  • 2.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Wed August 12, 2020 06:03 AM
    The best log indicator is the "Booting" message in the on-box-only AuditLog. However, it is written so early that you cannot subscribe to it in a custom log target that writes to a destination of your choice.
    So the main challenge is that custom log target configurations come up slightly later and may therefore miss a good indicator.

    As a quick test, I created an "all" / "debug" log target in the "default" domain to see which events are significant and get captured.
    The results may vary slightly depending on the exact order of execution. In general, the "default" domain *.cfg file is executed first and is processed sequentially.
    # AuditLog shows time of system start (after "shutdown reload" in my test)
    20200812T094510.695Z [sys][success][0x82400013] (SYSTEM:default:*:*): Booting device serial #0000000 DPOS running 1.1, installed 1.1
    20200812T094510.695Z [sys][success][0x82400015] (SYSTEM:default:*:*): Booting product id 5725 revision None firmware IDG.2018.4.1.10
    20200812T094510.695Z [sys][success][0x82400017] (SYSTEM:default:*:*): Booting build 318002 on 2020/02/21 11:09:49 count 100. Uptime 6221494
    
    # a while later my custom log target "all-debug" in "default" domain starts
    # if it were subscribed only to 0x8040006b, and you treat that event as the truth (assuming nobody changes the target's admin-state), you could perhaps create an event trigger (with a regex on exactly this uniquely named log target) that sends e.g. an alert to an SMTP log target
    20200812T094523.271Z [0x8040006b][system][notice] logging target(all-debug): tid(111): Logging started.
    20200812T094523.271Z [0x00360013][mgmt][info] logging target(all-debug): tid(111): Configured.
    
    # similarly, you might be lucky enough to catch the SSH service starting, as you already mentioned, though it risks a false positive if its admin-state is flipped, e.g. by an administrator
    20200812T094523.284Z [0x00350014][mgmt][notice] ssh(SSH Service): tid(111): Operational state up
    20200812T094523.284Z [0x8240001c][audit][info] : tid(111): (admin:default:system:*): ssh 'SSH Service' - Operational state up
    
    # another good indicator could be this message telling you that "default" domain is configured
    # if you never restart the "default" domain on its own (doing so is against best practices) and it only gets refreshed by a reload or shutdown, this message would be pretty unique to a system restart
    20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.
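
    For anyone searching exported log text for these events, here is a minimal Python sketch. It assumes the lines keep the layout shown in the excerpts above (a leading timestamp, then the bracketed event code). The function name `find_restart_markers` and the marker labels are made up for illustration and are not part of any DataPower tooling.

```python
import re
from datetime import datetime, timezone

# Event codes copied from the log excerpts in this post.
BOOT_CODE = "0x82400013"             # AuditLog: "Booting device serial ..."
DEFAULT_DOMAIN_CODE = "0x8100003b"   # "Domain configured successfully."

# Assumed timestamp layout, as in the excerpts: 20200812T094510.695Z
LINE_RE = re.compile(r"^(\d{8}T\d{6}\.\d{3})Z\s+(.*)$")


def find_restart_markers(lines):
    """Return (timestamp, kind) tuples for lines that look like restart markers."""
    markers = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines and anything without a timestamp
        stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S.%f")
        stamp = stamp.replace(tzinfo=timezone.utc)  # trailing "Z" means UTC
        rest = match.group(2)
        if f"[{BOOT_CODE}]" in rest:
            markers.append((stamp, "boot"))
        elif f"[{DEFAULT_DOMAIN_CODE}]" in rest and "domain(default)" in rest:
            markers.append((stamp, "default-domain-configured"))
    return markers
```

    Note that the "boot" marker can only match if the AuditLog text is part of what you search. As said above, a custom log target will not deliver it.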

    In general, the recommended way to detect system restarts is to poll the "DateTimeStatus" status provider from your monitoring solution and alert in that tool when the uptime (since reload or reboot) has reset, e.g. dropped below 2 minutes or whatever threshold suits you.
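
    The uptime check itself can be sketched in a few lines of Python. This assumes your monitoring tool already gives you the uptime as a number of seconds. How you fetch it from DateTimeStatus is left out on purpose, and the function name and the 120-second default are only illustrative.

```python
def restart_detected(previous_uptime_s, current_uptime_s, fresh_boot_window_s=120):
    """Return True if two uptime samples suggest a reload or reboot.

    previous_uptime_s may be None when no earlier sample exists.
    """
    # A very small uptime means the appliance came up recently.
    if current_uptime_s < fresh_boot_window_s:
        return True
    # An uptime that went backwards means the counter was reset between polls.
    if previous_uptime_s is not None and current_uptime_s < previous_uptime_s:
        return True
    return False
```

    The second condition catches restarts that fall between two polls when the polling interval is longer than the threshold. The first condition stays true for every poll inside the window, so a real alert would need de-duplication.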


    ------------------------------
    August Ziesemer
    DataPower Gateways L2 Support
    IBM
    ------------------------------



  • 3.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    IBM Champion
    Posted Wed August 12, 2020 07:25 AM
    Hi August, thanks for looking into this.

    We do monitor and alert with our BMC monitoring tools for DataPower Firmware Boot Count changing as well as if SecondsSinceReboot or SecondsSinceReload is < some value.

    Some additional context for my use case. We use a Log Target of type "syslog" to send DataPower Log Events to our central Enterprise Logging facility. When I am searching in our Enterprise Logging Tool for DataPower related entries, I'm looking for the log entry that can take me to the log message that announces the appliance restarting. Only now as I type this do I realize the problem with an off-the-appliance logging solution - after a DataPower restart I also have to wait until this specific Log Target comes up before I can start looking for any log messages it may have sent to the central logging facility. Obviously I won't see any log entries between the time the appliance first starts and the time that Log Target starts. And who knows how early or late in the reload process DataPower decides to enable Log Targets.

    Am I getting into Request For Enhancement territory? An RFE for a Log Target that is one of the very first things the appliance starts upon reload/reboot and to include clear event messages announcing the start?

    Regarding your findings, I'm liking this one:
    [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.

    I have to assume this will only come out after all Log Targets in the default domain are enabled. True?
    It won't take me to the first log entry after a DataPower restart that is available in the central logging tool, but it should be a reliable way to get to within seconds of where those first entries will be. 



  • 4.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 09:04 AM

    I don't want to hijack PPotkay's question, but I am in a very similar scenario. What I want to identify is exactly when the appliance has finished configuring all domains after the reboot.
    I have some physical appliances with huge domains that each take a couple of minutes to complete. I see the message about the completion of each domain, like the one presented by August:
    20200812T094523.528Z [0x8100003b][mgmt][notice] domain(default): Domain configured successfully.

    ... but is there an indicator that all of them have completed, or do I need to check each domain individually?



    ------------------------------
    Renato Zancoper
    DataPower & MQ Admin
    IBM
    ------------------------------



  • 5.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 09:25 AM

    Thanks a lot for the detailed reply, Peter! (yes, sounds like RFE area and you know how fast or not those can go...)


    Renato, as I understand it, those messages are per domain, so you would need to track each one individually.
    Another approach to your problem with the large, slow-loading domains has its own risks: there is a domain-availability feature which, if enabled, can keep a domain quiesced until all of its services are ready. However, that functionality seems to bring mixed results depending on the individual use case, and I have seen varying discussions and results with it over the past year, so careful testing is needed if you are interested. This approach would let you monitor via the "show domains" status provider whether a domain is ready yet. May I guess that you want to know when your load balancer can start sending traffic to the box again?
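
    Tracking the per-domain messages individually could look like the Python sketch below. It assumes every domain logs the same 0x8100003b message with its own name in place of "default", and that you pass in only the lines written since the last restart. `pending_domains` is a made-up helper name.

```python
import re

# Pattern built from the "Domain configured successfully." line quoted in this thread.
# Assumption: other domains log the same text with their own name inside domain(...).
CONFIGURED_RE = re.compile(
    r"\[0x8100003b\].*\bdomain\(([^)]+)\): Domain configured successfully\."
)


def pending_domains(expected, lines):
    """Return the expected domain names that have no 'configured' message yet."""
    done = set()
    for line in lines:
        match = CONFIGURED_RE.search(line)
        if match:
            done.add(match.group(1))
    return set(expected) - done
```

    An empty result would mean every expected domain has reported in. The list of expected domain names has to come from your own inventory.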



    ------------------------------
    August Ziesemer
    DataPower Gateways L2 Support
    IBM
    ------------------------------



  • 6.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 09:54 AM

    Thanks, August;

    LB Scenario would certainly be a use-case.
    Another one is that we have a weekly maintenance routine that takes a secure-backup and then reboots the appliance. We only want to move on to the next appliance after the first one is completely configured. Today we set a 'long enough' sleep time in the script so that it doesn't trigger the next reboot before the first is done, but that leads to unnecessary delays if the sleep time is too long, or risks triggering the next one too early if a problem happens at startup or the sleep time is too short.
    I will review the domain-availability option and get back in case additional info is needed.
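
    A polling loop with a timeout is one way to replace the fixed sleep. This is only a sketch in Python: `is_ready` is a hypothetical callable you would have to supply (for example, a check that all domains report as configured), and the default timeout and interval are arbitrary.

```python
import time


def wait_until_ready(is_ready, timeout_s=900.0, interval_s=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout_s has elapsed.

    Returns True when ready, False on timeout. sleep and clock are
    injectable so the loop can be tested without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

    The script would move to the next appliance only when this returns True, and stop for investigation when it returns False, instead of guessing a sleep time.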

    Regards



    ------------------------------
    Renato Zancoper
    DataPower & MQ Admin
    IBM
    ------------------------------



  • 7.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    IBM Champion
    Posted Thu August 13, 2020 03:29 PM
    My RFE will be for a new send-off-appliance Log Target capability that allows it to be one of the first things up on restart, one of the last things down on planned shutdown and to include some buffering capability so that it can capture and eventually send start up related messages that occur before the outbound network interfaces are fully up. This info should not be locked away in the Audit Log only (my opinion).

    I will ask for the Log Target to be able to capture reliable, precise and consistent messages that announce:
    A. the very start of an appliance
    B. the full completion of starting including all app domains (I got you, Renato)
    C. the start of a shutdown
    D. the very last message possible before the shutdown completes.
    (Reboot, shutdown and reload are the same in this context.)



  • 8.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 04:52 PM
    Thanks PPotkay,

    Whenever you have the RFE opened, pls share the link so we can upvote it!

    ------------------------------
    Renato Zancoper
    DataPower & MQ Admin
    IBM
    ------------------------------



  • 9.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 04:25 PM
    Sorry to take this even further off-topic with this question, but why are you restarting your DataPower after a secure-backup? Are you experiencing anything that is causing you to reboot regularly?

    ------------------------------
    Charlie Sumner
    ------------------------------



  • 10.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    Posted Thu August 13, 2020 04:50 PM
    Hi Charlie,

    In the past, a long time ago, with the old XI50b and XG45, which had much less memory than IDGs, the customer had issues with memory utilization. Sometimes it was detected as a service 'memory leak', but as the support team, what I see most often is badly sized caching.

    Of course, it is frustrating that someone prefers to reboot to clean up memory rather than fix the cache sizing, but that's the way they opted to go :( . It is also frustrating that the customer refuses to update their procedures even when the issues are fixed or no longer present, but that is out of my control.


    ------------------------------
    Renato Zancoper
    DataPower & MQ Admin
    IBM
    ------------------------------



  • 11.  RE: DataPower - is there an event that is always logged when an appliance is coming up from a reload or reboot

    IBM Champion
    Posted Sun August 16, 2020 12:03 PM
    Request for Enhancement Headline:
    DataPower Log Target Enhancements for better off appliance logging

    Bookmarkable URL:
    http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=144639


    Description:
    We are looking for enhancements to DataPower Log Targets that send Log Events off the appliance to a central logging facility.

    To ensure all Log Messages generated by the appliance can be sent, we need Log Target capability that allows it to be one of the first things up on restart, one of the last things down on planned shutdown and to include some buffering capability so that it can capture and eventually send start up related messages that occur before the outbound network interfaces are fully up. This info should not be locked away in the Audit Log only (my opinion).

    We ask for the Log Target to be able to capture reliable, precise and consistent messages that announce:
    A. the very start of an appliance coming up
    B. the full completion of starting including all app domains
    C. the start of a shutdown
    D. the very last message possible before the shutdown completes.

    Reboot, shutdown and reload are the same in this context.


    Use case:
    We use Log Targets of type syslog to send Log Events to our central enterprise logging tool. When appliances restart, the Log Targets are not the last thing to come down and not the first thing to come up, so consequently we don't have all possible DataPower log messages in our central logging tool.

    When we search our central logging tool for Log Messages related to appliances starting up or shutting down, the results are inconclusive. The best messages related to start up only show up in the Audit Log. We would like these to be available in regular Log Targets so that we can send them.