Informix


Nasty Redhat issue ..

  • 1.  Nasty Redhat issue ..

    IBM Champion
    Posted Mon February 20, 2023 10:37 AM
    If you are using RH and ext4 without journaling, it is possible for oninit (and oncheck) to deadlock in the kernel when it calls the disk with KAIO active.

    How to spot it:

    Is the instance blocked in a checkpoint (CKPT INP)?


    onstat -g ath | egrep "IO Wait"

    Look for flush_sub(x)

    Run onstat -g stk on that thread; the stack will look like this (or similar):

    0x00000000014e225b (/informix/ids/bin/oninit) yield_processor_mvp
    0x00000000014a2239 (/informix/ids/bin/oninit) mt_aio_wait.part.8
    0x000000000149933e (/informix/ids/bin/oninit) mt_aio_start
    0x0000000000e9b6aa (/informix/ids/bin/oninit) aclean_chunk
    0x0000000000e9aa97 (/informix/ids/bin/oninit) flush_sub
    0x00000000014f6c1f (/informix/ids/bin/oninit) startup


    Use onstat -g cpu to find the cpu for the thread above, and from that the PID. That PID cannot be attached with ptrace, pstack or gdb, and it ignores kill -9.
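A process that ignores kill -9 and cannot be traced is typically in uninterruptible sleep (state D in ps). As a minimal sketch, assuming a procps-style ps, the filter below picks oninit processes in that state out of ps output and prints the PID with the kernel function each one is waiting in. The function name filter_dstate_oninit is made up for illustration.

```shell
#!/bin/sh
# Keep only oninit processes in uninterruptible sleep (STAT starts with D).
# Expects lines of the form: PID STAT WCHAN COMMAND
# The ps header line is skipped naturally because "STAT" does not start with D.
filter_dstate_oninit() {
    awk '$2 ~ /^D/ && $4 == "oninit" { print $1, $3 }'
}

# Live usage (commented out so the sketch has no side effects):
# ps -eo pid,stat,wchan:32,comm | filter_dstate_oninit
```

If this prints a PID whose wait channel is an ext4 function, that would be consistent with the hang described here, but the dmesg trace remains the stronger evidence.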


    Dmesg will show

    [83280.333456] oninit D ffff889aebed8000 0 1892 1728 0x00000084
    [83280.333477] Call Trace:
    [83280.333516] [<ffffffff9578c3f9>] schedule+0x29/0x70
    [83280.333576] [<ffffffffc07847e2>] ext4_unwritten_wait+0x93/0xbe [ext4]
    [83280.333598] [<ffffffff950c7080>] ? wake_up_atomic_t+0x30/0x30
    [83280.333617] [<ffffffffc07316f9>] ext4_file_write+0x479/0x600 [ext4]
    [83280.333636] [<ffffffff95353d19>] ? __blk_run_queue+0x39/0x50
    [83280.333653] [<ffffffffc0731280>] ? ext4_write_checks.isra.8+0x150/0x150 [ext4]
    [83280.333672] [<ffffffff952a61e3>] do_io_submit+0x3e3/0x8a0
    [83280.333687] [<ffffffff952a66b0>] SyS_io_submit+0x10/0x20
    [83280.333702] [<ffffffff95799f92>] system_call_fastpath+0x25/0x2a
    [83280.333720] INFO: task dio/dm-3:2264 blocked for more than 120 seconds.
    [83280.333737] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

    The bad news (from RH)

    Running ext4 without journal is not a supported configuration, which is one of the reasons the above Bug xxxxxx was closed with "WONTFIX" tag.

    This leaves you with 2 options how to solve your problem:

    1. Move from ext4 to another type of filesystem, for example xfs.

    2. Enable journaling (ordered) on your ext4 filesystem to use it in a supported fashion.

    There is no 100% guarantee you won't hit the problem even with journaling enabled, but in cases where this problem was observed, there was no journal.
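To find out whether a given ext4 filesystem is affected, the feature list printed by tune2fs -l can be checked for has_journal. The following is a rough sketch only: the function name has_journal and the device path in the usage comment are placeholders, not anything from the Red Hat case.

```shell
#!/bin/sh
# Read `tune2fs -l <device>` output on stdin and succeed (exit 0) only if the
# "Filesystem features" line lists has_journal.
has_journal() {
    awk -F: '
        /^Filesystem features/ { found = ($2 ~ /(^|[ \t])has_journal([ \t]|$)/) }
        END { exit !found }
    '
}

# Live usage (needs root; /dev/sdX1 is a placeholder device name):
# tune2fs -l /dev/sdX1 | has_journal && echo "journal present" || echo "NO journal"
```

If no journal is present, tune2fs -O has_journal on the device is the usual way to add one. Check the tune2fs man page for your release first, and treat it as a change to make in a maintenance window rather than on a busy instance.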

    Cheers
    Paul
    --
    Paul Watson
    Oninit
    www.oninit.com
    Tel: +1 913 364 0360
    Cell: +1 913 387 7529
    Oninit® is a registered trademark of Oninit LLC
    If you want to improve, be content to be thought foolish and stupid
    Failure is not as frightening as regret


  • 2.  RE: Nasty Redhat issue ..

    Posted Mon February 20, 2023 10:56 AM

    Thanks for the heads up!

    Was it in a VM environment or on bare metal? Red Hat 7 or 8? (uname -a?) Any details about storage?



    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 3.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Mon February 20, 2023 01:04 PM
    VM, and RH7



  • 4.  RE: Nasty Redhat issue ..

    Posted Mon February 20, 2023 01:13 PM

    Thanks for sharing!



    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 5.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Wed June 28, 2023 01:44 AM

    Hi Paul,


    Which kernel version?

    Regards,

    David.



    ------------------------------
    David Williams
    ------------------------------



  • 6.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Wed June 28, 2023 09:30 AM
    Linux 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 12 10:45:03 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux



  • 7.  RE: Nasty Redhat issue ..

    Posted Thu July 13, 2023 09:55 AM

    Hi Paul, 

    Did you get any update from RHEL about this issue?

    We have also been hitting this bug for the last 12 months: our database has frozen a few times, including twice this week alone.

    I'm now planning to migrate to another FS, probably XFS. However, it would be nice if there were any hope of keeping ext4.

    In my last case with RHEL they asked for a kernel dump, which I have no opportunity to set up in our production environment. However, it appears the problem is already known; I found this: https://access.redhat.com/solutions/7006119



    ------------------------------
    Cesar Martins
    ------------------------------



  • 8.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Thu July 13, 2023 10:27 AM
    The official line is Redhat are going to fix it  as unjournalled ext4 is not supported.

    I just moved people to XFS via Informix mirroring and the swap mirror API. Note: I hit another issue using swap mirror on multi-chunk dbspaces; the solution was to use swap mirror on chunks, not dbspaces.
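To illustrate the per-chunk approach, here is a dry-run sketch that only prints one SQL administration API call per chunk number. It assumes the argument is spelled "modify chunk swap_mirror" and takes a single chunk number; confirm both against the documentation for your Informix version before running anything. The function name gen_swap_mirror_sql and the chunk numbers are hypothetical.

```shell
#!/bin/sh
# Print one SQL administration API statement per chunk number given as an argument.
# ASSUMPTION: the argument name "modify chunk swap_mirror" and its single
# chunk-number parameter; verify against your Informix version's documentation.
gen_swap_mirror_sql() {
    for chunk in "$@"; do
        printf 'EXECUTE FUNCTION task("modify chunk swap_mirror", %s);\n' "$chunk"
    done
}

# Dry run: print the statements for chunks 3, 4 and 5 (hypothetical numbers).
gen_swap_mirror_sql 3 4 5
```

Once reviewed, the printed statements could be run against the sysadmin database, for example by piping them to dbaccess sysadmin -. Going chunk by chunk also makes it easy to check onstat -d between steps.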

    Cheers
    Paul



  • 9.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Thu July 13, 2023 10:32 AM
    Whoops ... The official line is Redhat are NOT going to fix it as unjournalled ext4 is not supported.







  • 10.  RE: Nasty Redhat issue ..

    Posted Thu July 13, 2023 01:20 PM

    Ok, thank you for the update. 

    I was hopeful because the RH article has the status: SOLUTION IN PROGRESS ("This solution is in progress and will be completed soon for Red Hat customers.")



    ------------------------------
    Cesar Martins
    ------------------------------



  • 11.  RE: Nasty Redhat issue ..

    IBM Champion
    Posted Thu July 13, 2023 02:12 PM
    My ticket with RH was closed with a 'WONT FIX'
