Informix

12.10.FC4 known bug on shared memory/physical log/ontape?

  • 1.  12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Thu December 03, 2020 08:36 AM

    Hello everyone,

     

    Is anyone aware of a bug, or has anyone come across unusual behavior, with the 12.10.FC4 engine on Linux where, usually while an ontape is running, shared memory usage sometimes jumps sharply and the physical log sometimes keeps growing?

     

    These issues occurred on several occasions on this particular server where:

     

    It crashed on multiple occasions due to shared memory growing to the max allowed by the operating system (that one has been addressed largely by a better configuration approach),

     

    Physical log grew to 75% full and got "stuck" there on a checkpoint required status, holding up all updates/inserts/deletes.  This particular physical log is sized sufficiently in my opinion, about the same size as the total buffer pool – it is not tiny at all.

     

    These problem scenarios seem to present themselves when ontape level zero archive starts, but not every time ontape starts of course.  There does not seem to be any "major" application processes starting/running at the same time.

     

    This is very problematic behavior that I have not seen in any other setup I have worked with over the years.  I am trying to decide whether an upgrade is in order (there is resistance to it), but if so, I need some concrete proof.

     

    Thank you,

     

    Hal Maner

    M Systems International, Inc.

     



  • 2.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Thu December 03, 2020 10:21 AM

    Probably not related, but wasn't FC4 the first release that allowed ontape to use differing UID/GID combinations?

     

    Cheers

    Paul

     






  • 3.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Thu December 03, 2020 05:57 PM
    Could there be any correlation with how fast the system timestamp advances?  Look at the timestamp value in the checkpoint messages over a longer period of time.

    This 'timestamp' is circular, in the range of a 4-byte integer, and, among other things, it serves to mark the last 'time' (relative to this circle) a page got modified.  When there are periods where this stamp moves very quickly, a backup might encounter extraordinarily many 'very old' pages which it then needs to update, to enable proper incremental backups (whether they are being used or not).
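
    To illustrate what "circular" means here, a minimal Python sketch of comparing stamps that wrap around a 4-byte range. This is not the engine's actual logic: the unsigned 32-bit modulus, the function names and the "very old" threshold of half the circle are all assumptions made for illustration.

```python
# Illustration only: how a circular 4-byte stamp can be compared.
# STAMP_MODULUS, stamp_age, is_very_old and the threshold are hypothetical,
# not taken from Informix internals.
STAMP_MODULUS = 2 ** 32  # assumption: the stamp uses the full unsigned 4-byte range


def stamp_age(current: int, page_stamp: int) -> int:
    """Distance travelled around the circle from page_stamp to current."""
    return (current - page_stamp) % STAMP_MODULUS


def is_very_old(current: int, page_stamp: int,
                threshold: int = STAMP_MODULUS // 2) -> bool:
    """True if the page was last stamped at least `threshold` ticks ago."""
    return stamp_age(current, page_stamp) >= threshold


if __name__ == "__main__":
    # The stamp wrapped past zero: the page is only 8 ticks old.
    print(stamp_age(5, 2 ** 32 - 3))
    # The stamp moved more than half the circle since the page was touched.
    print(is_very_old(2 ** 31 + 10, 5))
```

    The faster the stamp advances between backups, the more pages end up on the "old" side of such a comparison.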

    Just a thought...

    BR,
     Andreas

    ------------------------------
    Andreas Legner
    ------------------------------



  • 4.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Sun December 13, 2020 03:19 PM

    Seeing more of the same behavior.  Engine hung with "CKPT REQ".

     

    Do the following messages in the log have any explanation other than abnormal behavior by the engine (the physical log is sized at 8 GB on this instance)?

     

    20:30:06  Performance Advisory: Unable to extend Physical Log.

    20:30:06   Results: Attempt to extend physical log dbspace FAILED.

    20:30:06   Action: Please notify IBM Informix Technical Support.
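
    To see how often this advisory shows up and whether it lines up with the ontape runs, something like the following rough Python sketch could scan the message log. It matches only the exact advisory text quoted above; the function name and the way the log text is supplied are assumptions.

```python
import re

# Matches the advisory line exactly as quoted in this post and captures
# its HH:MM:SS prefix. find_plog_extend_failures is a hypothetical helper.
ADVISORY = re.compile(
    r"^\s*(\d{2}:\d{2}:\d{2})\s+Performance Advisory: "
    r"Unable to extend Physical Log\."
)


def find_plog_extend_failures(log_text: str) -> list:
    """Return the time prefix of every 'Unable to extend Physical Log' line."""
    times = []
    for line in log_text.splitlines():
        match = ADVISORY.match(line)
        if match:
            times.append(match.group(1))
    return times


if __name__ == "__main__":
    sample = (
        "20:30:06  Performance Advisory: Unable to extend Physical Log.\n"
        "20:30:06   Results: Attempt to extend physical log dbspace FAILED.\n"
        "20:30:06   Action: Please notify IBM Informix Technical Support.\n"
    )
    print(find_plog_extend_failures(sample))
```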

     






  • 5.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Sun December 13, 2020 04:15 PM
    Yes. The inability to extend the physical log may indeed be why the engine is hung.

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 6.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Mon December 14, 2020 01:49 AM

    Yes but:

     

    1. Why is it trying to extend the physical log on its own in the first place?  
    2. If it decides to extend it then why is it unable – the 8 GB physical log is in its own dedicated 24 GB dbspace with 16 GB more space available.
    3. Is all this because of the  wonderful autonomic features that are enabled on this engine – AUTO_CKPTS etc. are set to 1.  Do you think this problem will go away if those are zero?

     

    Thank you,

     

    Hal

     






  • 7.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Mon December 14, 2020 08:53 AM

    Hal:

    Here are your questions with my comments:

    1. Why is it trying to extend the physical log on its own in the first place?  
      Because the engine has determined that the physical log is too small, so it wants to make it bigger.
    2. If it decides to extend it then why is it unable – the 8 GB physical log is in its own dedicated 24 GB dbspace with 16 GB more space available.
      The physical log can only be auto-expanded if it resides in a PLOG type dbspace or in an extendable chunk, not in a normal, non-extendable chunk. PLOG dbspaces are auto-expanding by default, so expanding the physical log in a PLOG space entails extending the initial chunk of the PLOG space (the physical log has to reside in a single chunk). Creating a PLOG dbspace automatically fills the entire space with a physical log the same size as that PLOG space. Since your physical log is in a dbspace that is bigger than the log, it must be in a "normal" dbspace, and I will assume that the chunk in which it lives is not extendable, so the engine is not able to auto-expand it.
    3. Is all this because of the wonderful autonomic features that are enabled on this engine – AUTO_CKPTS etc. are set to 1.  Do you think this problem will go away if those are zero?
      No. The auto-expanding physical log feature was introduced as part of the introduction of the PLOG dbspace type. There is probably an undocumented way to turn it off, but ... BTW, normally the engine just warns you that you need to make the physical log bigger unless it REALLY thinks that it needs to be expanded.


    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 8.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Mon December 14, 2020 12:53 PM

    Hi Art,

     

    Strange thing is this *is* in a PLOG dbspace.

     

    Here is the excerpt from onstat -d:

     

    56b4ba60         2        0x1040001  2        1        2048     N PBA    informix plog

     

    59039028         2      2      0          12000000   7999947               PO-BED /ilink/prod/plog
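
    As a sanity check on the numbers in that excerpt, here is a small Python sketch of the arithmetic. It assumes the 2048 in the dbspace line is the page size in bytes and that 12000000 and 7999947 in the chunk line are the chunk size and free space in pages of that size; the variable and function names are made up for illustration.

```python
# Assumptions: page size 2048 bytes (dbspace line), chunk size and free
# space given in pages (chunk line). pages_to_gb is a hypothetical helper.
PAGE_SIZE_BYTES = 2048


def pages_to_gb(pages: int, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Convert a page count to decimal gigabytes (10**9 bytes)."""
    return pages * page_size / 1e9


chunk_pages = 12_000_000
free_pages = 7_999_947
used_pages = chunk_pages - free_pages  # pages taken by the physical log etc.

if __name__ == "__main__":
    print(f"chunk: {pages_to_gb(chunk_pages):.2f} GB")  # chunk: 24.58 GB
    print(f"free:  {pages_to_gb(free_pages):.2f} GB")   # free:  16.38 GB
    print(f"used:  {pages_to_gb(used_pages):.2f} GB")   # used:  8.19 GB
```

    Under those assumptions the figures agree with the roughly 8 GB physical log in a 24 GB space with 16 GB free described earlier.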

     

    Thank you for all your feedback.

     

    Hal

     






  • 9.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted 26 days ago

    To close the loop on this: IBM support confirmed a defect in this version where the auto_tune thread causes the misbehavior on checkpoints/physical log.  As the shortest-term fix, we did indeed turn off AUTO_CKPTS and everything else "AUTO".  I also found out that it is not enough to comment out AUTO_CKPTS in the onconfig; it has to be present and set to zero!
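
    For anyone wanting to double-check their own onconfig, a rough Python sketch that flags AUTO_* entries that are commented out or set to something other than 0. Matching on the AUTO_ prefix is my own simplification and may catch parameters unrelated to this defect; the function name and the sample text are hypothetical.

```python
def auto_params_not_disabled(onconfig_text: str) -> list:
    """Return AUTO_* names that are commented out or not explicitly 0."""
    active = {}        # name -> value for uncommented entries
    commented = set()  # names that appear only as '#NAME <digits>'
    for raw in onconfig_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            parts = line.lstrip("#").split()
            # Only treat '#NAME 1' style lines as commented-out settings,
            # not descriptive comments that merely mention the name.
            if (len(parts) >= 2 and parts[0].startswith("AUTO_")
                    and parts[1].isdigit()):
                commented.add(parts[0])
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("AUTO_"):
            active[parts[0]] = parts[1]
    flagged = {name for name, value in active.items() if value != "0"}
    flagged |= {name for name in commented if name not in active}
    return sorted(flagged)


if __name__ == "__main__":
    sample = (
        "# AUTO_CKPTS - descriptive comment line\n"
        "AUTO_CKPTS 0\n"
        "#AUTO_LRU_TUNING 1\n"
        "AUTO_AIOVPS 1\n"
    )
    print(auto_params_not_disabled(sample))
```

    A commented-out entry is flagged because, as noted above, commenting the parameter out did not turn the feature off.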

     

    The defect is apparently fixed in 12.10.FC12/14.10.FC1.

     

    Thank you,

     

    Hal

     






  • 10.  RE: 12.10.FC4 known bug on shared memory/physical log/ontape?

    Posted Mon December 14, 2020 11:14 AM
    Under the conditions you're describing, I can see only one reason for the error you're getting:  the server wasn't able to allocate more space to the physical log right at the end of the existing physical log.
    -> look at "oncheck -pe <dbspace_containing_plog>" and see whether there's a decent amount of FREE space right after the physical log.

    ------------------------------
    Andreas Legner
    ------------------------------