AIX Open Source

Recurring bash_64 core dumps on AIX

  • 1.  Recurring bash_64 core dumps on AIX

    Posted Wed August 14, 2019 03:12 PM

    Originally posted by: hanya


    Hey guys,


    We've been having recurring bash_64 core dumps on our servers running AIX 7.1 since updating from bash 4.2-3 to bash 4.4-3 from the AIX Toolbox for Linux Applications. We had no problems on bash 4.2-3.

     

    We opened a case with IBM Support, and here is what they came back with after debugging one of the core files:

     

    david.gray (IBM)

    25 Mar 2019 02:11 PM

     

    It appears bash is looping in code that is setting terminal attributes in response to a signal, perhaps a SIGWINCH or something similar, and it loops in this forever until it runs out of stack.

     

    itcaix16 $ dbx -p /=./ ./usr/opt/freeware/bin/bash_64 ./home/fcroot/core

    Type 'help' for help.

    warning: The core file is not a fullcore. Some info may

    not be available.

    [using memory image in ./home/fcroot/core]

    reading symbolic information ...warning: Unable to access the stab file. Some info may not be available

    warning: no source compiled with -g

     

    Segmentation fault in __ioctl at 0x9000000000336d4

    0x9000000000336d4 (__ioctl+0xd4) e8410028 ld r2,0x28(r1)

    ioctl(0x0, 0x540300005403, 0x11001b4c0, 0x800000000000d032, 0x3b68, 0x0, 0xf1000a0150839800, 0x8000000000001032) at 0x900000000033dac

    tcsetattr(??, ??, ??) at 0x90000000017f834

    _set_tty_settings(0x0, 0x11001b4c0) at 0x1000d79a8

    set_tty_settings(0x0, 0x11001b4c0) at 0x1000d7a84

    rl_deprep_terminal() at 0x1000d8268

    rl_cleanup_after_signal() at 0x1000d699c

    ...

    rl_cleanup_after_signal() at 0x1000d699c

    _rl_handle_signal(0x100000001) at 0x1000d6ea8

    _rl_signal_handler(0x100000001) at 0x1000d7010

    _rl_release_sigint() at 0x1000d7100

    rl_deprep_terminal() at 0x1000d8274

    (dbx) proc rlimit

    rlimit name: rlimit_cur rlimit_max (units)

    RLIMIT_CPU: (unlimited) (unlimited) sec

    RLIMIT_FSIZE: (unlimited) (unlimited) bytes

    RLIMIT_DATA: (unlimited) (unlimited) bytes

    RLIMIT_STACK: 33554432 4294967296 bytes

    RLIMIT_CORE: 1073741312 (unlimited) bytes

    RLIMIT_RSS: 33554432 (unlimited) bytes

    RLIMIT_AS: (unlimited) (unlimited) bytes

    RLIMIT_NOFILE: 2000 (unlimited) descriptors

    RLIMIT_THREADS: (unlimited) (unlimited) per process

    RLIMIT_NPROC: (unlimited) (unlimited) per user
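
The repeated rl_cleanup_after_signal and rl_deprep_terminal frames in the backtrace above are what the looping looks like in practice. As a rough sketch only: assuming the stack trace has been saved to a text file (bt.txt here is a hypothetical name, and count_frames is a helper made up for illustration), something like this counts how often each function appears, so the repeating frames stand out:

```shell
# Count how many times each function name appears in a saved stack trace.
# Reads the trace on stdin and prints "<count> <function>", highest first.
# Only lines that begin with an identifier followed by "(" are counted,
# so banner lines, addresses and the rlimit table are ignored.
count_frames() {
    awk '
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*\(/ {
            name = $0
            sub(/^[ \t]*/, "", name)   # drop leading indentation
            sub(/\(.*/, "", name)      # keep only the function name
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' | sort -rn
}

# Example (bt.txt is a hypothetical file holding the saved trace):
#   count_frames < bt.txt | head
```

On a full trace from a process that recursed until the stack was exhausted, the looping functions should dominate the top of the list by a wide margin.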

     

    We tried increasing the stack ulimit as David Gray, the case engineer, had advised and it did not make a difference in the number or frequency of the core dumps.
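
For anyone following along, this is roughly how the stack limit can be checked and raised for a single session. It is only a sketch: the function names are made up for illustration, and it changes the limit for the current shell and its children only, not the persistent per-user settings. Note that the shell reports the limit in kilobytes while dbx above shows bytes (33554432 bytes is 32768 KB). If the signal handling really recurses without end, as support described, a larger stack would only delay the crash, which matches what we saw.

```shell
# Print the soft and hard stack limits of the current shell (in KB).
show_stack_limits() {
    echo "soft: $(ulimit -S -s)"
    echo "hard: $(ulimit -H -s)"
}

# Raise the soft stack limit to the hard limit for this shell and any
# process started from it. Nothing is written to system configuration.
raise_stack_limit() {
    ulimit -S -s "$(ulimit -H -s)"
}

show_stack_limits
```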

     

    We then upgraded to bash 5.0-1, the latest version available in the AIX Toolbox for Linux Applications, and also updated AIX from 7100-05-02 to 7100-05-04. Neither change made a difference.

     

    I opened another case with IBM but they said there is nothing further they can do and pointed me to this forum. I would greatly appreciate any help to figure out what might be causing this issue and how to resolve it.

     

    Here are the oslevel and bash level, and a recent core dump error from one of the servers. I also attached the core file to this thread.

    paris(root)/home/root#>oslevel -s
    7100-05-04-1914


    paris(root)/home/root#>lslpp -L bash
    Fileset        Level    State  Type  Description (Uninstaller)
    ----------------------------------------------------------------------------
    bash           5.0-1      C     R    The GNU Bourne Again shell
                                         (bash) version 5.0 (/bin/rpm)

     

    ---------------------------------------------------------------------------
    LABEL: CORE_DUMP
    IDENTIFIER: A924A5FC

    Date/Time: Mon Aug 12 13:31:34 EDT 2019
    Sequence Number: 2398
    Machine Id: 00FAD7524C00
    Node Id: paris
    Class: S
    Type: PERM
    WPAR: Global
    Resource Name: SYSPROC

    Description
    SOFTWARE PROGRAM ABNORMALLY TERMINATED

    Probable Causes
    SOFTWARE PROGRAM

    User Causes
    USER GENERATED SIGNAL

    Recommended Actions
    CORRECT THEN RETRY

    Failure Causes
    SOFTWARE PROGRAM

    Recommended Actions
    RERUN THE APPLICATION PROGRAM
    IF PROBLEM PERSISTS THEN DO THE FOLLOWING
    CONTACT APPROPRIATE SERVICE REPRESENTATIVE

    Detail Data
    SIGNAL NUMBER
    11
    USER'S PROCESS ID:
    32112722
    FILE SYSTEM SERIAL NUMBER
    7
    INODE NUMBER
    12288
    CORE FILE NAME
    /home/fcroot/core
    PROGRAM NAME
    bash_64
    STACK EXECUTION DISABLED
    0
    COME FROM ADDRESS REGISTER

    PROCESSOR ID
    hw_fru_id: N/A
    hw_cpu_id: N/A

    ADDITIONAL INFORMATION
    Unable to generate symptom string.

    Too many stack elements.
    ---------------------------------------------------------------------------
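
Since these entries pile up quickly, here is a small sketch for condensing detailed error log output into one line per core dump. It assumes the detailed report has the same layout as the entry above (a "Date/Time:" line, and the program name on the line after "PROGRAM NAME"); summarize_core_dumps is a hypothetical helper name, not an AIX command.

```shell
# Read detailed error log text on stdin and print one line per entry:
#   <date/time> | <program name>
summarize_core_dumps() {
    awk '
        # Remember the timestamp of the entry currently being read.
        /^[ \t]*Date\/Time:/ {
            sub(/^[ \t]*Date\/Time:[ \t]*/, "")
            when = $0
        }
        # The program name is on the line right after "PROGRAM NAME".
        prev ~ /^[ \t]*PROGRAM NAME[ \t]*$/ {
            gsub(/^[ \t]+|[ \t]+$/, "")
            print when " | " $0
        }
        { prev = $0 }
    '
}

# Example, using the error identifier from the entry above:
#   errpt -a -j A924A5FC | summarize_core_dumps
```

Piping the result through `sort` or `wc -l` gives a quick picture of how often the dumps occur on each server.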

    Please let me know if I should provide any other info. : )

     

    Thanks!
    Lavanya Herbert

    Unix Administrator



  • 2.  Re: Recurring bash_64 core dumps on AIX

    Posted Wed August 14, 2019 03:35 PM

    Originally posted by: hanya


     

    I also want to add that IBM Support confirmed that other customers have opened cases about this issue (recurring bash_64 core dumps), so it's not specific to our environment.

     

    Note: The core file is attached to this comment.

     

    Thanks!

    Attachment(s): core.gz (2.06 MB)


  • 3.  Re: Recurring bash_64 core dumps on AIX

    Posted Fri August 16, 2019 10:31 AM

    Originally posted by: AyappanP


    Thanks for reporting the issue. We will check on this.

    BTW, is there an easy way to recreate it?



  • 4.  Re: Recurring bash_64 core dumps on AIX

    Posted Fri August 16, 2019 02:31 PM

    Originally posted by: hanya


    Hi AyappanP,

     

    Thanks for your response. There is no way for me to reproduce the issue on demand, but the core dumps happen very frequently in our environment so I should be able to capture any data you need.

    If you want to try and recreate it in your environment... I would suggest leaving some bash sessions (using the newer version of bash) running on your servers for an extended period (maybe a week or longer).

    Does that help?

    Thanks.
    Lavanya

     



  • 5.  Re: Recurring bash_64 core dumps on AIX

    Posted Mon August 19, 2019 02:26 AM

    Originally posted by: AyappanP


    Okay, I will try to recreate it.



  • 6.  Re: Recurring bash_64 core dumps on AIX

    Posted Thu October 31, 2019 05:29 PM

    Originally posted by: hanya


    Hi Ayappan,

     

    Could you please provide an update on this?

     

    Thanks!

    Lavanya



  • 7.  Re: Recurring bash_64 core dumps on AIX

    Posted Mon November 04, 2019 02:18 AM

    Originally posted by: AyappanP


    We don't have an update on this right now.

    We will probably post an update in a week or two.



  • 8.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Mon June 08, 2020 08:42 AM
    The team is currently working on it. In the meantime, we have rebuilt bash with the official upstream patches.
    I suggest you try the bash_64 binary available at the link below and check whether the core dump still happens.

    https://testcase.boulder.ibm.com/fromibm/aix/bash_64_testfix

    ------------------------------
    Ayappan P
    ------------------------------



  • 9.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Tue June 09, 2020 12:30 AM
    Thanks Ayappan. I will try the interim fix in our test environment.

    Do you know when the permanent fix is scheduled to be released?

    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 10.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Tue June 09, 2020 04:40 AM
    If this really fixes the issue, then it won't take more than one day to push it into AIX Toolbox.

    ------------------------------
    Ayappan P
    ------------------------------



  • 11.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Mon June 15, 2020 12:16 PM
    Hi Ayappan,

    You mentioned that the file you provided is the bash_64 binary. Do we just replace the existing binary in /opt/freeware with it? It's not an efix, right?

    -rwxr-xr-x    1 root     system      1633554 Sep 28 2017  /opt/freeware/bin/bash_64

    Can you please provide instructions if we need to do anything else?

    Thanks.

    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 12.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Mon June 15, 2020 12:43 PM
    Yes, you just need to replace the binary.
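
A cautious way to do the swap might look like the sketch below. It is only an illustration: replace_binary is a made-up helper, the ".orig" and ".new" suffixes are arbitrary choices, and the mode 755 simply mirrors the -rwxr-xr-x shown in the listing above. Keeping a copy of the original makes it easy to roll back if the test binary misbehaves.

```shell
# Replace TARGET with NEW, keeping a copy of the original next to it.
# Usage: replace_binary NEW TARGET
replace_binary() {
    new=$1
    target=$2
    [ -f "$new" ] || { echo "not found: $new" >&2; return 1; }
    [ -f "$target" ] || { echo "not found: $target" >&2; return 1; }

    # Keep the original, but never overwrite an existing backup.
    [ -e "$target.orig" ] || cp -p "$target" "$target.orig" || return 1

    # Copy to a temporary name first, then rename into place, so the
    # target path never points at a half-written file.
    cp "$new" "$target.new" &&
        chmod 755 "$target.new" &&
        mv -f "$target.new" "$target"
}

# Example (as root), with the paths from this thread:
#   replace_binary ./bash_64_testfix /opt/freeware/bin/bash_64
```

Shells that are already running keep using the old file, so only sessions started after the swap exercise the new binary.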

    ------------------------------
    Ayappan P
    ------------------------------



  • 13.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Wed July 08, 2020 04:15 PM
    Hi Ayappan,

    The bash_64_testfix binary is still core dumping. We installed it on a couple of test servers about three weeks ago, and it has core dumped a few times on both.

    I've attached one of the core files for your review.

    Could you please let us know how to proceed?

    Thanks.


    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 14.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Wed July 08, 2020 04:22 PM
    ---------------------------------------------------------------------------
    LABEL: CORE_DUMP
    IDENTIFIER: A924A5FC

    Date/Time: Sat Jun 20 12:53:34 EDT 2020
    Sequence Number: 2499
    Machine Id: 00FAD7524C00
    Node Id: paris
    Class: S
    Type: PERM
    WPAR: Global
    Resource Name: SYSPROC

    Description
    SOFTWARE PROGRAM ABNORMALLY TERMINATED

    Probable Causes
    SOFTWARE PROGRAM

    User Causes
    USER GENERATED SIGNAL

    Recommended Actions
    CORRECT THEN RETRY

    Failure Causes
    SOFTWARE PROGRAM

    Recommended Actions
    RERUN THE APPLICATION PROGRAM
    IF PROBLEM PERSISTS THEN DO THE FOLLOWING
    CONTACT APPROPRIATE SERVICE REPRESENTATIVE

    Detail Data
    SIGNAL NUMBER
    11
    USER'S PROCESS ID:
    21954622
    FILE SYSTEM SERIAL NUMBER
    2
    INODE NUMBER
    93
    CORE FILE NAME
    /usr/bin/core
    PROGRAM NAME
    bash_64_testfix
    STACK EXECUTION DISABLED
    0
    COME FROM ADDRESS REGISTER
    ??
    PROCESSOR ID
    hw_fru_id: 0
    hw_cpu_id: 3

    ADDITIONAL INFORMATION
    rl_signal 0
    ??
    Unable to generate symptom string.
    Stack is unusable.
    ---------------------------------------------------------------------------

    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 15.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Wed July 08, 2020 04:24 PM
    For some reason I am not able to upload the 3 MB compressed core file in my reply. I get a "Failed to upload" error.

    Is there another way I can send it to you?

    Thanks.

    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 16.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Thu July 09, 2020 02:56 AM
    You can try uploading it in the library section.

    ------------------------------
    Ayappan P
    ------------------------------



  • 17.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Fri July 10, 2020 07:27 AM
    What is the specific error you get?
    I think this platform only supports specific file extensions for upload.
    Can you zip the core file so that it has a .zip extension and try uploading it again?

    ------------------------------
    SANKET RATHI
    ------------------------------



  • 18.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Fri July 10, 2020 04:06 PM
    Hi Sanket,

    I tried uploading it as a .zip file but still got the "Failed to upload" error. Please see the attached screenshot.

    Thanks.



    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 19.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Fri July 10, 2020 04:07 PM
    Never mind, I can't upload the screenshot either. Is there any other way I can send you the file? By email perhaps? It's only 3 MB.

    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 20.  RE: Re: Recurring bash_64 core dumps on AIX

    Posted Tue July 14, 2020 08:06 AM
    Edited by SANKET RATHI Tue July 14, 2020 08:06 AM
    I have sent you a direct message with the email address. If you did not receive it, please let me know.

    ------------------------------
    SANKET RATHI
    ------------------------------



  • 21.  RE: Recurring bash_64 core dumps on AIX

    Posted Thu July 16, 2020 09:47 AM
    Hello,

    We have a similar problem with AIX 7200-04-02-2016 and bash-5.0-1, and we would appreciate any help with it.
    Please let me know if you need any additional info, such as a snapcore.
    Best regards,
    Mikhail

    ------------------------------
    Mikhail Parshev
    ------------------------------



  • 22.  RE: Recurring bash_64 core dumps on AIX

    Posted Thu July 16, 2020 11:07 AM
    We are looking into this issue.

    ------------------------------
    Ayappan P
    ------------------------------



  • 23.  RE: Recurring bash_64 core dumps on AIX

    Posted Tue September 29, 2020 11:17 AM
    The AIX Toolbox has been updated with Bash 5.0.18.
    It contains a fix for the core dump issue.

    ------------------------------
    Ayappan P
    ------------------------------



  • 24.  RE: Recurring bash_64 core dumps on AIX

    Posted Thu October 15, 2020 10:59 AM
    Hi Ayappan,

    I only see bash 5.0-1 in the AIX Toolbox. There is no bash 5.0.18.

    We already have 5.0-1 installed on our AIX servers and this version has the issue.

    Please advise.

    Thanks.

    Package          Version  License  Binary RPM  Source RPM  Description
    bash-completion  2.9      License  RPM         SRPM        'Programmable completion for Bash'
    bash-doc         4.3.30   License  RPM         SRPM        'Documentation for the GNU Bourne Again shell (bash).'
    bash             5.0      License  RPM         SRPM        'The GNU Bourne Again shell (bash) version 5.0'


    ------------------------------
    Lavanya Herbert
    ------------------------------



  • 25.  RE: Recurring bash_64 core dumps on AIX

    Posted Thu October 15, 2020 11:16 AM
    Looks like the web page is not updated.
    You can get it from here --> https://public.dhe.ibm.com/aix/freeSoftware/aixtoolbox/RPMS/ppc/bash/
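
Once the new package is installed, it may be worth confirming that each server actually reports 5.0.18 or later. The sketch below is an assumption-laden example: version_at_least is a hypothetical helper that compares dotted version numbers field by field, and it assumes the RPM version string for the fixed package reads 5.0.18.

```shell
# Succeed if dotted version INSTALLED is greater than or equal to REQUIRED.
# Usage: version_at_least INSTALLED REQUIRED
version_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # Missing fields count as 0, so "5.0" compares as "5.0.0".
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example, querying the installed RPM version:
#   installed=$(rpm -q --qf '%{VERSION}\n' bash)
#   if version_at_least "$installed" 5.0.18; then
#       echo "bash $installed should include the fix"
#   else
#       echo "bash $installed predates the fix"
#   fi
```

The comparison is done numerically per field, so 5.0.18 is correctly treated as newer than 5.0.2, which a plain string comparison would get wrong.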

    ------------------------------
    Ayappan P
    ------------------------------