Informix


Informix processing "freezes"

  • 1.  Informix processing "freezes"

    Posted Mon November 20, 2023 11:05 PM

    Hi all, I'd like some advice on Informix "freezes" we've experienced recently. On two occasions Informix has frozen for several minutes, which also froze the business application. The situation resolved itself after a few minutes, but has understandably frustrated our users. These incidents occurred during normal trading hours, at busier times.

    On the latest occasion, our investigation indicates that a database checkpoint took a long time to complete (Total Time = 745) with very low Avg/Sec for the physical and logical logs (70 and 62, respectively). We also see very little other DB activity during this period, consistent with the DB freeze (log rolls take almost twice as long as other log rolls around that time), and CPU activity drops during the 'freeze' (perhaps indicating that no non-DB activity was maxing out the CPU).

    Our assumption is that Informix halted other DB activity to complete the checkpoint, or perhaps to perform a rollback?

    We're running a bunch of onstat commands to get a baseline, e.g. -x, -k, -p, -g ses, -u, but it's hard to see anything that sticks out.

    There has to be an underlying cause. Any suggestions as to where to dig deeper or what to review?

    Many thanks. Mark



    ------------------------------
    Mark Clayton
    ------------------------------


  • 2.  RE: Informix processing "freezes"

    IBM Champion
    Posted Tue November 21, 2023 02:25 AM
    Edited by Henri Cujass Tue November 21, 2023 02:25 AM

    Hi Mark,

    Did you check what's going on with the checkpoints?
    I suggest checking the status line in the onstat header, which may show that the server is blocked on a checkpoint, and monitoring the dirty-page write-back at checkpoint time with "onstat -R | grep dirty".
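
    The dirty-page check suggested above can be sampled from a script. This is a minimal Python sketch, assuming the summary line matched by "onstat -R | grep dirty" has the shape "N dirty, N queued, N total, ..." (verify against your own output). The function names and the sample line are illustrative and not taken from a real system.

```python
import re

# Assumed shape of the LRU summary line printed by `onstat -R`;
# adjust the pattern if your version formats it differently.
SUMMARY = re.compile(r"(\d+)\s+dirty,\s*(\d+)\s+queued,\s*(\d+)\s+total")


def dirty_summaries(onstat_r_text):
    """Return one (dirty, queued, total) tuple per summary line found."""
    return [tuple(int(n) for n in m.groups())
            for m in SUMMARY.finditer(onstat_r_text)]


def dirty_ratio(summary):
    """Fraction of buffers that are dirty for one summary tuple."""
    dirty, _queued, total = summary
    return dirty / total if total else 0.0


# Made-up sample line, for illustration only.
SAMPLE = "1250 dirty, 5000 queued, 5000 total, 8192 hash buckets, 2048 buffer size\n"

for summary in dirty_summaries(SAMPLE):
    print(summary, dirty_ratio(summary))
```

    Sampling this every few seconds around a checkpoint shows whether the number of dirty buffers is high when the checkpoint starts.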

    Regards
    Henri



    ------------------------------
    Henri Cujass
    leolo IT, CTO
    Germany
    IBM Champion 2021 2022 2023
    IIUG Board Member
    henri.cujass@leolo.com
    ------------------------------



  • 3.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 02:56 PM

    Hi Henri, thanks for the prompt reply.  Will check out that 'dirty page' suggestion.  Appreciate the advice.  Cheers. Mark



    ------------------------------
    Mark Clayton
    ------------------------------



  • 4.  RE: Informix processing "freezes"

    IBM Champion
    Posted Tue November 21, 2023 02:40 AM

    Hi,

    In my experience this can occur when a really huge rollback is in progress (or several at the same time).
    This can result in a stuck engine that still responds to onstat but is permanently blocked in very long checkpoints.

    You can monitor this with onstat -x (which gives you an "estimation" of the remaining rollback time).
    Look for lines with very large values in the locks column that began many logical logs ago.
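
    Picking out such lines can be scripted. This is a rough Python sketch, assuming that onstat -x data rows are whitespace-separated with address, flags, userthread and locks as the first four fields, and that an "R" in the third flag position marks a transaction that is rolling back (check the onstat -x documentation for your version). The sample rows are invented for illustration.

```python
def rolling_back(onstat_x_text):
    """Return transactions that appear to be rolling back, most locks first.

    Assumption: the first four fields of a data row are
    address, flags, userthread, locks; the remaining fields are kept as-is.
    """
    rows = []
    for line in onstat_x_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or not parts[3].isdigit():
            continue  # banner, column header or summary line
        flags = parts[1]
        if len(flags) >= 3 and flags[2] == "R":
            rows.append({
                "address": parts[0],
                "flags": flags,
                "locks": int(parts[3]),
                "rest": parts[4:],  # log positions, estimate, etc., unparsed
            })
    return sorted(rows, key=lambda r: r["locks"], reverse=True)


# Invented rows; real output has a banner and more detail.
SAMPLE_X = """\
address flags userthread locks begin_logpos current_logpos isol
7000000a1 A-R-- 7000000b1 250000 120:0x1a2018 219:0x4c018 COMMIT
7000000a2 A-B-- 7000000b2 12 219:0x3f018 219:0x4b018 COMMIT
7000000a3 A-R-- 7000000b3 900000 118:0x22018 219:0x4d018 COMMIT
 3 active, 128 total, 5 maximum concurrent
"""

for txn in rolling_back(SAMPLE_X):
    print(txn["address"], txn["locks"], txn["rest"][0])
```

    Comparing the begin log position of each hit with the current log number gives a feel for how many logs the rollback has to work through.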

    The ugly thing here is that a rollback of a very long transaction (we recently encountered one with
    a lot of sblobs involved) can take even longer than it took to produce the data.

    Our situation occurred because there were >5 parallel rollbacks in progress.
    (User did not get a response and retried multiple times ;))
    There were about 100 logs to roll back, which took ages.

    We decided to kill the engine (onmode no longer worked) and restart it, because any other activity
    was mostly blocked anyway.
    The transactions were then rolled back very quickly, because at startup
    a number of parallel cleaners run, which speeds up the rollback.
    In our situation, the rollback time was initially displayed as 2h, and the
    engine bounce resolved it in 4min.

    Running onmode -z against the long transactions does not help, because they are typically already in rollback,
    which needs to complete.


    Best,

    MARCUS HAARMANN






  • 5.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 02:59 PM

    Hi Marcus, thanks for the reply and suggestions. We have wondered about rollbacks too. On a previous occasion it needed a restart but the last one cleared after about 7 mins.  We'll dig a bit more about potential rollbacks.  We're running onstat -x, but good to have the additional guidance on the 'estimation'.  Appreciate the reply and suggestions.  Cheers, Mark



    ------------------------------
    Mark Clayton
    ------------------------------



  • 6.  RE: Informix processing "freezes"

    IBM Champion
    Posted Tue November 21, 2023 05:52 AM

    Was this 745 seconds checkpoint a blocking checkpoint?

    You'd see this immediately from an asterisk next to the Trigger column in 'onstat -g ckp', but only if you captured the output no more than 20 checkpoints after the incident.

    To look further into the past, there's sysadmin:mon_checkpoint, which should have (at least) all the checkpoints since the last restart.

    If it was 'Blocking', then what was the trigger/caller?  And then, of course, slow disk i/o combined with volume of dirty pages, logical and physical log buffers to flush would have been the primary reason for the duration and, since blocking, the freeze.

    If it was not blocking, then it still could've been some session in "critical section" for a very long time, blocking the checkpoint from even starting ... and everyone else from entering into new "critical sections", i.e. from doing any modifying/transactional work.  The culprit would then be that first session now buried in the past.

    Without further details, we can only speculate ...

     Andreas



    ------------------------------
    Andreas Legner
    ------------------------------



  • 7.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 06:03 AM

    Just to check:

     

    Have you seen any warning message in the Informix log file stating that, for performance reasons, the physical log should be enlarged?
    This kind of behaviour often happens when the physical log is too small for the load at a given time.

    I know, simple idea, but relevant in many cases.

     

    My 0.01 cent ....

    Eric

     

    Eric Vercelletto
    Data Management Architect and Owner / Begooden IT Consulting
    KandooERP Founder and CTO
    IBM Champion 2013,2014,2015,2016,2017,2018,2019,2020

    Tel:     +33(0) 298 51 3210
    Mob : +33(0)626 52 50 68
    skype: begooden-it
    Google Hangout: eric.vercelletto@begooden-it.com
    Email:
    eric.vercelletto@begooden-it.com
    www :
    http://www.vercelletto.com
    www  https://kandooerp.org

     

     






  • 8.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 06:07 AM

    And sorry if this is too trivial, but did you check whether the logical logs filled up? (A long checkpoint doesn't seem to be evidence of that, but there can be more than one issue involved.)

     

    Eric Vercelletto

     

     






  • 9.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 03:07 PM

    Hi Eric, thanks for the reply.  Ah yes, physical log size is a good thought and possibility as system activity has grown quite a bit in recent years, so older settings may not be enough for the current transaction volumes. Re: your additional thoughts on space, our logical logs auto backup when each log completes, so we don't think we have a problem with logical logs being full, and we've not seen any issues with disk space, so far.  Will explore physical log further.  Appreciate the advice. Cheers. mark



    ------------------------------
    Mark Clayton
    ------------------------------



  • 10.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 03:02 PM

    Hi Andreas, thanks for the reply and advice.   Will add the -g command you suggest to a script so we can quickly execute during freezes.  Hopefully we get a better capture of data the next time and can identify the culprit.  Appreciate the advice.  Cheers. mark



    ------------------------------
    Mark Clayton
    ------------------------------



  • 11.  RE: Informix processing "freezes"

    IBM Champion
    Posted Wed November 22, 2023 06:58 AM

    Brief warning:  onstat -g ckp only shows completed checkpoints, so if an ongoing/pending checkpoint is part of the problem, it won't show up in this onstat's body (it would in its header) if captured while the problem is still ongoing.

    -> be sure this gets captured shortly after such incident - or resort to sysadmin:mon_checkpoint.



    ------------------------------
    Andreas Legner
    ------------------------------



  • 12.  RE: Informix processing "freezes"

    Posted Sun November 26, 2023 05:01 PM

    Hi, thanks for that.. noted...



    ------------------------------
    Mark Clayton
    ------------------------------



  • 13.  RE: Informix processing "freezes"

    IBM Champion
    Posted Tue November 21, 2023 08:22 AM

    Hi Mark,

    You have already received lots of advice here, but I'd like to add something else.  During one of these "freezes", run onstat -g bth and onstat -g BTH to see if there are any blocking threads. They normally take a while to run, but may help you identify what is preventing other things from running.  Also onstat -g wmx and onstat -g lmx to see if there are any locked or waiting mutexes.

    If the checkpoint is blocked, then you will start to see sessions with the first flag of "C" in onstat -u.  You may find that some people are unaffected because they are just doing reads, but writes may be paused, while they wait for that checkpoint to complete.
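
    Counting the sessions that are waiting on the checkpoint can be automated. This is a small Python sketch, assuming onstat -u data rows are whitespace-separated with address, flags, sessid and user as the first four fields. The function name and the sample rows are illustrative only.

```python
def checkpoint_waiters(onstat_u_text):
    """Return (sessid, user) for threads whose first flag character is 'C'."""
    waiters = []
    for line in onstat_u_text.splitlines():
        parts = line.split()
        # Skip the banner, column header and blank lines.
        if len(parts) < 4 or not parts[2].isdigit():
            continue
        if parts[1].startswith("C"):
            waiters.append((int(parts[2]), parts[3]))
    return waiters


# Invented rows for illustration; real output starts with a banner.
SAMPLE_U = """\
address flags sessid user tty wait tout locks nreads nwrites
7000000c1 C--P--- 101 erpuser - 0 0 3 120 45
7000000c2 Y--P--- 102 erpuser - 7000000d9 0 1 300 0
7000000c3 C--P--- 103 batch - 0 0 9 15 210
 3 active, 128 total, 18 maximum concurrent
"""

print(checkpoint_waiters(SAMPLE_U))
```

    A growing list across successive samples would be consistent with writers piling up behind a blocked checkpoint while readers carry on.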

    Good luck!

    Mike



    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 14.  RE: Informix processing "freezes"

    Posted Tue November 21, 2023 03:12 PM

    Hi Mike, thanks for the reply and suggestions.   We'll add those commands to the 'freeze' script.   Noted your comment about the different user experiences too - unfortunately the 'freezes' hit us quite hard as Informix is part of an ERP stack so a lot of posting and writing going on - but an interesting point for us to note.  See how we go and fingers crossed!  Cheers. Mark



    ------------------------------
    Mark Clayton
    ------------------------------



  • 15.  RE: Informix processing "freezes"

    Posted Wed November 22, 2023 04:53 AM

    Hi Mark,

    It would be useful to know the version of Informix deployed.

    A possible cause of long delays which eventually free up is the db engine searching for free space to insert a row in a large table where there are variable length columns. The bitmap page may only say the pages are partially full which results in an inspection of these pages. It is possible for the engine to scan a large portion of a table before it finds suitable space.

    Some older defects related to this: IC98053, IC98529, IT00781, IT02028. I think all are addressed in the 12.10 code line and above.

    If this is happening, obtaining multiple onstat -g ppf outputs and comparing them to spot activity can pinpoint the tables/fragments/indices involved. It is probably better to get the perfstat.sh script from technical support, which you can run during an incident to collect many different onstat outputs repeatedly. -g ath, -g lmx, -X, -g con would also be useful.
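
    Comparing two onstat -g ppf snapshots by hand is tedious, so here is a hedged Python sketch of the idea. It assumes the output has a column header row beginning with "partnum" followed by one numeric row per partition, and it reads the column names from that header rather than hard-coding them. The sample snapshots are trimmed and made up.

```python
def parse_ppf(text):
    """Parse `onstat -g ppf` style text into {partnum: {column: int}}."""
    columns, stats = None, {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "partnum":
            columns = parts[1:]  # take column names from the header row
        elif columns and len(parts) == len(columns) + 1:
            try:
                stats[parts[0]] = {c: int(v) for c, v in zip(columns, parts[1:])}
            except ValueError:
                continue  # not a numeric data row
    return stats


def busiest(before, after, column, top=5):
    """Rank partitions by how much `column` grew between two snapshots."""
    deltas = {p: after[p].get(column, 0) - before[p].get(column, 0)
              for p in after if p in before}
    ranked = sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
    return [(p, d) for p, d in ranked[:top] if d > 0]


# Trimmed, invented snapshots taken some seconds apart.
SNAP_1 = "partnum isrd iswrt bfrd bfwrt\n0x100001 5 0 20 0\n0x200005 100 40 1000 300\n"
SNAP_2 = "partnum isrd iswrt bfrd bfwrt\n0x100001 6 0 30 0\n0x200005 100 41 91000 310\n"

print(busiest(parse_ppf(SNAP_1), parse_ppf(SNAP_2), "bfrd"))
```

    A partition with a very large jump in buffer reads but almost no additional writes would fit the free-space search pattern described above; the partnum can then be mapped back to a table or index.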

    Ben.



    ------------------------------
    Benjamin Thompson
    ------------------------------



  • 16.  RE: Informix processing "freezes"

    Posted Sun November 26, 2023 05:15 PM

    Hi Benjamin, we're on 12.10 FC13. Will review your suggestions with the team... Many thanks



    ------------------------------
    Mark Clayton
    ------------------------------



  • 17.  RE: Informix processing "freezes"

    IBM Champion
    Posted Wed November 22, 2023 02:22 PM

    Hi,

    When this happens what is the header from onstat?

    Is it "(CKPT REQ)" or "(CKPT INP)"?
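
    The two header states can be told apart in a script. This is a minimal Python sketch that looks only for the two strings quoted above; the banner text in the example is an assumed layout, not copied from a real instance.

```python
import re

# Matches only the two markers named in this post.
CKPT_STATE = re.compile(r"\(CKPT (REQ|INP)\)")


def checkpoint_state(header_text):
    """Return 'REQ', 'INP', or None when no checkpoint marker is present."""
    match = CKPT_STATE.search(header_text)
    return match.group(1) if match else None


# Assumed banner layout, for illustration only.
EXAMPLE_HEADER = ("IBM Informix Dynamic Server Version 12.10.FC13 "
                  "-- On-Line (CKPT REQ) -- Up 12 days 04:10:33")

print(checkpoint_state(EXAMPLE_HEADER))
```

    Logging this value with a timestamp every few seconds during an incident shows how long the instance sat in each state.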

    When the instance appears to freeze capture an "onstat -g stk all" that will allow us to tell what it is waiting on.
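
    Since several onstat outputs need to be captured quickly while the freeze is happening, a small collector helps. This is a Python sketch under stated assumptions: onstat is on the PATH of the user running it, the option list is just a selection of commands mentioned in this thread, and the directory layout and file names are hypothetical choices.

```python
import subprocess
import tempfile
import time
from pathlib import Path

# A selection of onstat options mentioned in this thread; extend as needed.
OPTIONS = [["-"], ["-u"], ["-x"], ["-g", "ckp"], ["-g", "ath"],
           ["-g", "lmx"], ["-g", "wmx"], ["-g", "stk", "all"]]


def run_onstat(args):
    """Run one onstat command and return its combined text output."""
    done = subprocess.run(["onstat", *args], capture_output=True,
                          text=True, timeout=120)
    return done.stdout + done.stderr


def capture(out_dir, options=OPTIONS, runner=run_onstat, stamp=None):
    """Write one file per onstat command into a timestamped subdirectory."""
    stamp = stamp or time.strftime("%Y%m%d_%H%M%S")
    target = Path(out_dir) / stamp
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for args in options:
        # ["-g", "stk", "all"] -> onstat_g_stk_all.txt, ["-"] -> onstat_header.txt
        name = "onstat_" + "_".join(a.lstrip("-") or "header" for a in args) + ".txt"
        path = target / name
        path.write_text(runner(args))
        written.append(path)
    return written


if __name__ == "__main__":
    # Dry run with a stand-in runner, so nothing is executed against a server.
    with tempfile.TemporaryDirectory() as tmp:
        for path in capture(tmp, runner=lambda args: "onstat " + " ".join(args) + "\n"):
            print(path.name)
```

    Calling capture() in a loop with a short sleep gives repeated snapshots; the runner parameter exists only so the script can be tried without a live instance.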

    As Ben said, which version is this?

    NOTE:  Version 14.10.xC9 has an enhancement

    IT40986 Enhancement for checkpoint processing

    this reduces the number of writes to reserved pages during the checkpoint by a factor of over 100!

    If you have slow storage then this helps a lot!

    Also, assuming you have the scheduled task that populates it, what does the mon_checkpoint table in the sysadmin database show as the long wait time for the checkpoint at the time of the issue?


    Regards,
    David.



    ------------------------------
    David Williams
    ------------------------------



  • 18.  RE: Informix processing "freezes"

    Posted Sun November 26, 2023 05:21 PM

    Hi David, thanks for the reply and suggestions. Will check as you suggest on the next occasion. We're on 12.10 FC13 so not able to take advantage of that enhancement, unfortunately, however storage is good so no problems with slow disk I/O.  Cheers. Mark



    ------------------------------
    Mark Clayton
    ------------------------------