Avoid transactions running for more than 2 minutes, or accept the risk of a DataPower watchdog reload

    Posted Mon January 15, 2024 03:21 AM
    Edited by Hermann Stamm-Wilbrandt Mon January 15, 2024 11:55 AM

    DataPower is designed for high throughput and high concurrency.

    Any transaction that runs for 120 seconds or more risks triggering a watchdog reload.
    In the backtrace file that is then generated in the temporary folder of the default domain, you will see the following near the end of the top paragraph:
    ...

    Thread 0x........: WATCHDOG RESTART is detected.
    ...
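If you have the backtrace content available as text, checking for that marker can be sketched in a few lines of plain JavaScript. This is only a minimal sketch: the function name `findWatchdogLines` is hypothetical, and it matches only the literal marker text shown above, nothing else about the backtrace format is assumed.

```javascript
// Hypothetical helper, not a DataPower API: return all lines of a backtrace
// text that contain the watchdog marker quoted in the posting.
function findWatchdogLines(backtraceText) {
  const marker = 'WATCHDOG RESTART is detected';
  return backtraceText
    .split(/\r?\n/)                           // handle both LF and CRLF line ends
    .filter((line) => line.includes(marker)); // keep only marker lines
}
```

An empty result only means the marker text was not found in the given text. Decoding the rest of the backtrace is still a job for support.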

    As reported in other postings, transactions can run for more than 2 minutes for different reasons.
    If an "exclusive event" occurs inside DataPower while such a transaction is running, a watchdog restart will happen.

    There is a long list of possible causes for an exclusive event. As an example, let me show how a CLI exit command triggers one.

    Create a simple service with a GatewayScript action that executes this "run forever" script:
    for(;;){}

    Send a curl request to that service to trigger execution of the script.

    After 120s, log in to the CLI and execute these commands:

    config
    httpserv foobar
    exit

    After the "exit" command is executed, the config update requests an exclusive event and the watchdog reload happens immediately.

    The curl command will then report:
    curl: (52) Empty reply from server

    This watchdog mechanism has been part of the DataPower product since the early 2000s and works as designed.

    If something bad (long running) happens that ties up resources, the goal is to get DataPower back into a good state as fast as possible, with a watchdog reload.

    If a watchdog reload happens, and you don't know the reason, create a support ticket and provide the backtrace file.
    Support can decode the backtrace and give more details on the reload.

    A reason can be long-running code as in the example above, or a complex regular expression applied to large input data.
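For long-running code, one defensive pattern is to give the loop a time budget and stop before the 120s mark is anywhere near. The following is a minimal sketch in plain JavaScript, not a DataPower feature: `processWithBudget`, `BUDGET_MS` and the 60 second value are all assumptions made for illustration, and the `now` parameter exists only so the clock can be replaced in a test.

```javascript
// Assumed budget, chosen to stay well below the 120 s mentioned above.
const BUDGET_MS = 60 * 1000;

// Hypothetical helper: handle items one by one, but stop once the budget is used up.
function processWithBudget(items, handleItem, budgetMs = BUDGET_MS, now = Date.now) {
  const deadline = now() + budgetMs;
  let done = 0;
  for (const item of items) {
    if (now() >= deadline) {
      break; // give up instead of running on toward the watchdog threshold
    }
    handleItem(item);
    done += 1;
  }
  // completed is false when the budget ran out before all items were handled
  return { done, completed: done === items.length };
}
```

The deadline is only checked between items, so a single `handleItem` call that itself runs for minutes is not interrupted by this pattern.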

    If a regular expression is too complex, this technote helps you analyze the complexity of your regexps:
    https://www.ibm.com/support/pages/websphere-datapower-soa-appliances-and-interaction-complex-regular-expressions
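To illustrate what "too complex" can mean, here is a small sketch that is not taken from the technote. In a backtracking regex engine, nested quantifiers such as `(a+)+` can need exponentially many steps on input that almost matches, while a flat pattern that accepts the same strings does not. The patterns and the name `sameVerdict` are my own illustration.

```javascript
// Nested quantifiers: prone to heavy backtracking on long, almost-matching input.
const nested = /^(a+)+$/;
// Flat pattern that accepts exactly the same strings without nesting.
const flat = /^a+$/;

// Hypothetical check: do both patterns agree on a given (short!) input?
function sameVerdict(input) {
  return nested.test(input) === flat.test(input);
}

// Keep the inputs short here: a long run of "a" followed by "b" is exactly
// the kind of input that makes the nested pattern slow.
console.log(sameVerdict('aaaa'), sameVerdict('aaab'));
```

If both patterns give the same verdict on your test inputs, the flat one is the safer choice for large input data.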


    P.S.:
    Time spent in an async wait does not count toward the transaction duration.
    I was not able to trigger a watchdog reload with the CLI method above while

    await wait(300000)

    was waiting for 5 minutes:
    https://stamm-wilbrandt.de/en/blog/sync_wait.js%20and%20wait.xsl.html
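For readers who just want the shape of such an async wait, here is a minimal sketch built on a Promise and `setTimeout`. It is an assumption of how a `wait` helper can look, not necessarily the code behind the link above, and `handler` is a hypothetical name.

```javascript
// Assumed implementation of a wait helper: resolves after ms milliseconds
// without keeping the thread busy while waiting.
function wait(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical usage inside an async function.
async function handler(ms) {
  const start = Date.now();
  await wait(ms);            // control returns to the event loop here
  return Date.now() - start; // elapsed wall-clock time in milliseconds
}
```

In contrast to for(;;){}, nothing executes while the timer is pending.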


    ------------------------------
    Hermann Stamm-Wilbrandt
    Compiler Level 3 support, IBM DataPower Gateways
    IBM
    Boeblingen Germany
    ------------------------------