AIX

  • 1.  IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Sat July 17, 2010 05:13 AM

    Originally posted by: SumitGoyal


    Hi,

    I am getting a core dump on an AIX 5.3 machine while running a 32-bit application.
    The call stack is as follows.

    stat.lstat64x(??, ??) at 0xd035f8d4
    getwd(??) at 0xd04b223c
    getcwd(??, ??) at 0xd04b1db8

    The signal received was:
    IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a] at 0xd035f8d4 ($t19)
    0xd035f8d4 (lstat64x+0x18) 80410014 lwz r2,0x14(r1)

    The disassembly around the faulting address is:
    (dbx) listi 0xd035f8bc
    0xd035f8bc (lstat64x) 38a000b0 li r5,0xb0
    0xd035f8c0 (lstat64x+0x4) 9421ffc0 stwu r1,-64(r1)
    0xd035f8c4 (lstat64x+0x8) 7c0802a6 mflr r0
    0xd035f8c8 (lstat64x+0xc) 38c00011 li r6,0x11
    0xd035f8cc (lstat64x+0x10) 90010048 stw r0,0x48(r1)
    0xd035f8d0 (lstat64x+0x14) 480000fd bl 0xd035f9cc (statx)
    0xd035f8d4 (lstat64x+0x18) 80410014 lwz r2,0x14(r1)
    0xd035f8d8 (lstat64x+0x1c) 81810048 lwz r12,0x48(r1)
    0xd035f8dc (lstat64x+0x20) 38210040 addi r1,0x40(r1)

    Please advise whether this is a known problem on AIX and whether a patch has been released for it.

    Thanks in advance,
    Sumit Goyal
    #AIX-Forum
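
For readers following the disassembly in the post above, the faulting word `80410014` can be decoded by hand. The sketch below assumes the standard 32-bit PowerPC D-form layout (a 6-bit primary opcode, two 5-bit register fields, and a 16-bit signed displacement). The helper name `decode_dform` and the small opcode table are illustrative only and are not part of dbx.

```python
# Minimal sketch: decode a PowerPC D-form word such as 0x80410014.
# decode_dform and MNEMONICS are hypothetical helpers, not a dbx feature.

# Primary opcodes that appear in the listing (assumed subset, not exhaustive).
MNEMONICS = {32: "lwz", 36: "stw", 37: "stwu"}


def decode_dform(word: int) -> str:
    opcode = (word >> 26) & 0x3F   # top 6 bits: primary opcode
    rt = (word >> 21) & 0x1F       # next 5 bits: target/source register
    ra = (word >> 16) & 0x1F       # next 5 bits: base register
    disp = word & 0xFFFF           # low 16 bits: displacement
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    name = MNEMONICS.get(opcode, f"op{opcode}")
    return f"{name} r{rt},{disp}(r{ra})"


print(decode_dform(0x80410014))  # instruction at lstat64x+0x18
print(decode_dform(0x9421FFC0))  # stack-frame push at lstat64x+0x4
```

The first result agrees with the `lwz r2,0x14(r1)` shown by dbx (0x14 is 20 in decimal). Note that this is the instruction immediately after `bl statx`, so the reported address is where the thread would resume once the call returns.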


  • 2.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Mon July 19, 2010 02:06 AM

    Originally posted by: SumitGoyal


    Here is some more register-level information that I was able to gather:

    (dbx) where
    stat.lstat64x(??, ??) at 0xd035f8d4
    getwd(??) at 0xd04b223c
    getcwd(??, ??) at 0xd04b1db8

    (dbx) listi 0xd04b1c98,0xd04b1db8
    0xd04b1c98 (getcwd) 93e1fffc stw r31,-4(r1)
    0xd04b1c9c (getcwd+0x4) 7c9f2379 mr. r31,r4
    0xd04b1ca0 (getcwd+0x8) 7c0802a6 mflr r0
    0xd04b1ca4 (getcwd+0xc) 93c1fff8 stw r30,-8(r1)
    0xd04b1ca8 (getcwd+0x10) 93a1fff4 stw r29,-12(r1)
    0xd04b1cac (getcwd+0x14) 90010008 stw r0,0x8(r1)
    0xd04b1cb0 (getcwd+0x18) 9421fbb0 stwu r1,-1104(r1)
    0xd04b1cb4 (getcwd+0x1c) 40820074 bne 0xd04b1d28 (getcwd+0x90)
    0xd04b1cb8 (getcwd+0x20) 80020004 lwz r0,0x4(r2)
    0xd04b1cbc (getcwd+0x24) 2c000000 cmpi cr0,0x0,r0,0x0
    0xd04b1cc0 (getcwd+0x28) 41820044 beq 0xd04b1d04 (getcwd+0x6c)
    0xd04b1cc4 (getcwd+0x2c) 83e2069c lwz r31,0x69c(r2)
    0xd04b1cc8 (getcwd+0x30) 817f0008 lwz r11,0x8(r31)
    0xd04b1ccc (getcwd+0x34) 2c0b0000 cmpi cr0,0x0,r11,0x0
    0xd04b1cd0 (getcwd+0x38) 41820034 beq 0xd04b1d04 (getcwd+0x6c)
    0xd04b1cd4 (getcwd+0x3c) 4be91a7d bl 0xd0343750 (_ptrgl)
    0xd04b1cd8 (getcwd+0x40) 80410014 lwz r2,0x14(r1)

    0xd04b1d9c (getcwd+0x104) 90030000 stw r0,0x0(r3)
    0xd04b1da0 (getcwd+0x108) 38600000 li r3,0x0
    0xd04b1da4 (getcwd+0x10c) 480000e0 b 0xd04b1e84 (getcwd+0x1ec)
    0xd04b1da8 (getcwd+0x110) 3ba00000 li r29,0x0
    0xd04b1dac (getcwd+0x114) 48000008 b 0xd04b1db4 (getcwd+0x11c)
    0xd04b1db0 (getcwd+0x118) 607e0000 ori r30,r3,0x0
    0xd04b1db4 (getcwd+0x11c) 38610040 addi r3,0x40(r1)
    0xd04b1db8 (getcwd+0x120) 48000131 bl 0xd04b1ee8 (getwd)
    (dbx) listi 0xd04b1ee8, 0xd04b223c
    0xd04b1ee8 (getwd) bea1ffd4 stmw r21,-44(r1)
    0xd04b1eec (getwd+0x4) 3ba00000 li r29,0x0
    0xd04b1ef0 (getwd+0x8) 7c0802a6 mflr r0
    0xd04b1ef4 (getwd+0xc) 90010008 stw r0,0x8(r1)
    0xd04b1ef8 (getwd+0x10) 38000000 li r0,0x0
    0xd04b1efc (getwd+0x14) 9421f4f0 stwu r1,-2832(r1)
    0xd04b1f00 (getwd+0x18) 388101a8 addi r4,0x1a8(r1)
    0xd04b1f04 (getwd+0x1c) 83e21af0 lwz r31,0x1af0(r2)
    0xd04b1f08 (getwd+0x20) 3bc106be addi r30,0x6be(r1)
    0xd04b1f0c (getwd+0x24) 90610ac4 stw r3,0xac4(r1)
    0xd04b1f10 (getwd+0x28) 63e30000 ori r3,r31,0x0
    0xd04b1f14 (getwd+0x2c) 90010044 stw r0,0x44(r1)
    0xd04b1f18 (getwd+0x30) 90810040 stw r4,0x40(r1)
    0xd04b1f1c (getwd+0x34) 980106be stb r0,0x6be(r1)
    0xd04b1f20 (getwd+0x38) 48000d59 bl 0xd04b2c78 (opendir64)
    0xd04b21e4 (getwd+0x2fc) 4182001c beq 0xd04b2200 (getwd+0x318)
    0xd04b21e8 (getwd+0x300) 4bfa413d bl 0xd0456324 (readdir64_r)
    0xd04b21ec (getwd+0x304) 60000000 ori r0,r0,0x0
    0xd04b21f0 (getwd+0x308) 607d0000 ori r29,r3,0x0
    0xd04b21f4 (getwd+0x30c) 80610040 lwz r3,0x40(r1)
    0xd04b21f8 (getwd+0x310) 2c830000 cmpi cr1,0x0,r3,0x0
    0xd04b21fc (getwd+0x314) 48000018 b 0xd04b2214 (getwd+0x32c)
    0xd04b2200 (getwd+0x318) 62a30000 ori r3,r21,0x0
    0xd04b2204 (getwd+0x31c) 4bfa3fdd bl 0xd04561e0 (readdir64)
    0xd04b2208 (getwd+0x320) 60000000 ori r0,r0,0x0
    0xd04b220c (getwd+0x324) 2c830000 cmpi cr1,0x0,r3,0x0
    0xd04b2210 (getwd+0x328) 90610040 stw r3,0x40(r1)
    0xd04b2214 (getwd+0x32c) 2c1d0000 cmpi cr0,0x0,r29,0x0
    0xd04b2218 (getwd+0x330) 38830014 addi r4,0x14(r3)
    0xd04b221c (getwd+0x334) 62c30000 ori r3,r22,0x0
    0xd04b2220 (getwd+0x338) 418601d0 beq cr1,0xd04b23f0 (getwd+0x508)
    0xd04b2224 (getwd+0x33c) 408201cc bne 0xd04b23f0 (getwd+0x508)
    0xd04b2228 (getwd+0x340) 4beddb59 bl 0xd038fd80 (strcpy)
    0xd04b222c (getwd+0x344) 60000000 ori r0,r0,0x0
    0xd04b2230 (getwd+0x348) 60760000 ori r22,r3,0x0
    0xd04b2234 (getwd+0x34c) 386106c0 addi r3,0x6c0(r1)
    0xd04b2238 (getwd+0x350) 388100f8 addi r4,0xf8(r1)
    0xd04b223c (getwd+0x354) 4bead681 bl 0xd035f8bc (lstat64x)
    (dbx) listi 0xd035f8bc,0xd035f8d4
    0xd035f8bc (lstat64x) 38a000b0 li r5,0xb0
    0xd035f8c0 (lstat64x+0x4) 9421ffc0 stwu r1,-64(r1)
    0xd035f8c4 (lstat64x+0x8) 7c0802a6 mflr r0
    0xd035f8c8 (lstat64x+0xc) 38c00011 li r6,0x11
    0xd035f8cc (lstat64x+0x10) 90010048 stw r0,0x48(r1)
    0xd035f8d0 (lstat64x+0x14) 480000fd bl 0xd035f9cc (statx)
    0xd035f8d4 (lstat64x+0x18) 80410014 lwz r2,0x14(r1)
    (dbx) registers
    $r0:0xffffffff $stkp:0x314398c0 $toc:0xffffffff $r3:0x00000000
    $r4:0xffffffff $r5:0xffffffff $r6:0xffffffff $r7:0xffffffff
    $r8:0xffffffff $r9:0xffffffff $r10:0xffffffff $r11:0xffffffff
    $r12:0xffffffff $r13:0x00000000 $r14:0x00000000 $r15:0x00000000
    $r16:0x00000000 $r17:0x00000000 $r18:0x00000000 $r19:0x00000000
    $r20:0x00000000 $r21:0x31724f98 $r22:0x31439fd4 $r23:0x00000011
    $r24:0x00000002 $r25:0x00000000 $r26:0x00000002 $r27:0x00000000
    $r28:0x00000001 $r29:0x00000000 $r30:0x31439f9d $r31:0xf0271d30
    $iar:0xd035f8d4 $msr:0x0000d032 $cr:0x00000000 $link:0xffffffff
    $ctr:0x0005fe88 $xer:0xffffffff $mq:0xffffffff
    $fr0:0xfff8000082064000 $fr1:0x4030000000000000 $fr2: 0x4000000000000000
    $fr3:0x4024000000000000 $fr4:0x4030000000000000 $fr5: 0x3c7abc9e3b39803f
    $fr6:0x4338000000000402 $fr7:0x3f56c16cd5a49dee $fr8: 0x3f811111b4af3c1a
    $fr9:0x3fa5555555555235 $fr10:0x4008000000000000 $fr11: 0x3fe0000000000000
    $fr12:0x3fc5555555555150 $fr13:0x4000a2b23f3bab73 $fr14: 0x0000000000000000
    $fr15:0x0000000000000000 $fr16:0x0000000000000000 $fr17: 0x0000000000000000
    $fr18:0x0000000000000000 $fr19:0x0000000000000000 $fr20: 0x0000000000000000
    $fr21:0x0000000000000000 $fr22:0x0000000000000000 $fr23: 0x0000000000000000
    $fr24:0x0000000000000000 $fr25:0x0000000000000000 $fr26: 0x0000000000000000
    $fr27:0x0000000000000000 $fr28:0x0000000000000000 $fr29: 0x0000000000000000
    $fr30:0x0000000000000000 $fr31:0x0000000000000000 $fpscr: 0x0000000082004000

    vector registers are not valid
    in stat.lstat64x [/usr/lib/libc.a] at 0xd035f8d4 ($t19)
    0xd035f8d4 (lstat64x+0x18) 80410014 lwz r2,0x14(r1)
    (dbx) thread info 19
    thread state-k wchan state-u k-tid mode held scope function
    >$t19 run running 2502723 k no sys lstat64x

    general:
    pthread addr = 0x3143dcf0 size = 0x290
    vp addr = 0x31440ab0 size = 0x2d8
    thread errno = 0
    start pc = 0xf0f59fe0
    joinable = yes
    pthread_t = 1213
    scheduler:
    kernel =
    user = 1 (other)
    nice = 60
    event :
    event = 0x0
    cancel = enabled, deferred, not pending
    stack storage:
    base = 0x31425000 size = 0x18000
    limit = 0x3143db98
    sp = 0x314398c0
    (dbx) print $r1
    0x314398c0
    (dbx) print 0x314398c0
    826513600
    (dbx) 0x314398c0 /c
    0x314398c0: '1'
    (dbx) print $r2
    0xffffffff
    (dbx) 0xd035f9cc /c
    0xd035f9cc: ''
    (dbx) print 0xd035f8d4
    -801769260
    (dbx) 0xd035f8d4 /c
    0xd035f8d4: '€'
    (dbx) print 0xd04b1cd8
    -800383784
    (dbx) 0xd04b1cd8 /c
    0xd04b1cd8: '€'

    Could this have something to do with a corrupted buffer, or with the value '€' we are getting?

    Thanks in advance.

    Regards,
    Sumit
    #AIX-Forum
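
On the question about '€': the values dbx printed above can be reproduced without a debugger. `print` shows an address as a signed 32-bit integer, and `/c` shows the single byte stored at that address as a character. The sketch below assumes the terminal rendered bytes as Windows-1252, in which 0x80 is '€' and 0xFF is 'ÿ'. Under that assumption the '€' is simply the first byte of the instruction word `80410014`, and is not by itself evidence of a corrupted buffer. The helper names are hypothetical.

```python
# Minimal sketch: reproduce the dbx "print <addr>" and "<addr> /c" outputs.
# The cp1252 rendering is an assumption about the terminal, not a dbx fact.


def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def first_byte_char(word: int, encoding: str = "cp1252") -> str:
    """Return the most significant byte of a big-endian word as a character."""
    return bytes([(word >> 24) & 0xFF]).decode(encoding)


print(to_signed32(0xD035F8D4))      # address of lstat64x+0x18
print(to_signed32(0x314398C0))      # stack pointer value
print(first_byte_char(0x80410014))  # instruction word stored at 0xd035f8d4
```

The same reasoning covers the '1' shown for 0x314398c0: the word at the stack pointer is a back-chain address in the 0x31...... range, whose first byte is 0x31, the character '1'.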


  • 3.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Mon July 19, 2010 05:39 AM

    Originally posted by: SumitGoyal


    I have dug further into the machine-level instructions and found the following.

    The problem in lstat64x occurs at:
    0xd035f8d0 (lstat64x+0x14) 480000fd bl 0xd035f9cc (statx)
    0xd035f8d4 (lstat64x+0x18) 80410014 lwz r2,0x14(r1)

    The statx function called above consists of the following instructions:
    0xd035f9cc (statx) 8182075c lwz r12,0x75c(r2)
    0xd035f9d0 (statx+0x4) 90410014 stw r2,0x14(r1)
    0xd035f9d4 (statx+0x8) 800c0000 lwz r0,0x0(r12)
    0xd035f9d8 (statx+0xc) 804c0004 lwz r2,0x4(r12)
    0xd035f9dc (statx+0x10) 7c0903a6 mtctr r0
    0xd035f9e0 (statx+0x14) 4e800420 bctr

    in which it assigns a value to register r2 at
    0xd035f9d8 (statx+0xc) 804c0004 lwz r2,0x4(r12)

    Also, the value of r12 is
    (dbx) print $r12
    0xffffffff
    and the value of r2 is
    (dbx) print $r2
    0xffffffff
    Both evaluate as follows:
    (dbx) print 0xffffffff
    -1
    (dbx) 0xffffffff /c
    0xffffffff: warning: Unable to access address 0xffffffff from core
    'ÿ'

    Please let me know if I am doing something wrong or interpreting something incorrectly.
    #AIX-Forum
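
The `bl` words in the listings above encode their targets relative to the instruction's own address. The sketch below assumes the standard PowerPC I-form branch layout (primary opcode 18, a 24-bit word offset, then the AA and LK bits) and checks that `480000fd` at 0xd035f8d0 does reach statx at 0xd035f9cc. `branch_target` is an illustrative helper name, not a dbx command.

```python
# Minimal sketch: compute the target of a PowerPC I-form branch (b/bl).
# branch_target is a hypothetical helper used only for this illustration.


def branch_target(word: int, address: int) -> int:
    if (word >> 26) & 0x3F != 18:
        raise ValueError("not an I-form branch instruction")
    offset = word & 0x03FFFFFC     # LI field, already a byte offset
    if offset & 0x02000000:        # sign-extend the 26-bit byte offset
        offset -= 0x04000000
    absolute = bool(word & 0x2)    # AA bit: target is absolute, not relative
    base = 0 if absolute else address
    return (base + offset) & 0xFFFFFFFF


print(hex(branch_target(0x480000FD, 0xD035F8D0)))  # bl at lstat64x+0x14
print(hex(branch_target(0x4BEAD681, 0xD04B223C)))  # bl at getwd+0x354
```

Both results match the symbols dbx printed (statx and lstat64x), which suggests the code bytes in the core are intact along this call path.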


  • 4.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Mon July 19, 2010 07:49 PM

    Originally posted by: dukessd


    What program does the AIX errpt list for this core?
    It is all very well posting when you have run out of ideas, but we were not there when you first noticed the problem...
    #AIX-Forum


  • 5.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Tue July 20, 2010 10:23 AM

    Originally posted by: SumitGoyal


    Please find the errpt log below:

    Class: S
    Type: PERM
    Resource Name: SYSPROC

    Description
    SOFTWARE PROGRAM ABNORMALLY TERMINATED

    Probable Causes
    SOFTWARE PROGRAM

    User Causes
    USER GENERATED SIGNAL

    Recommended Actions
    CORRECT THEN RETRY

    Failure Causes
    SOFTWARE PROGRAM

    Recommended Actions
    RERUN THE APPLICATION PROGRAM
    IF PROBLEM PERSISTS THEN DO THE FOLLOWING
    CONTACT APPROPRIATE SERVICE REPRESENTATIVE

    Detail Data
    SIGNAL NUMBER
    6
    FILE SYSTEM SERIAL NUMBER
    15
    INODE NUMBER
    12369
    PROGRAM NAME
    httpd
    STACK EXECUTION DISABLED
    0
    COME FROM ADDRESS REGISTER

    PROCESSOR ID
    hw_fru_id: N/A
    hw_cpu_id: N/A

    ADDITIONAL INFORMATION
    lstat64x 18
    ??

    Symptom Data
    REPORTABLE
    1
    INTERNAL ERROR
    0
    SYMPTOM CODE
    PCSS/SPI2 FLDS/httpd SIG/6 FLDS/lstat64x VALU/18

    The core occurs after about an hour of running the IBM HTTP Server on the AIX platform. The process that dumped core is httpd, and it crashes in my module, where I call the OS function getcwd to get the current working directory.
    #AIX-Forum
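
The SYMPTOM CODE line in the errpt entry above packs several fields into one string. A minimal sketch of splitting it into key/value pairs follows. The function name `parse_symptom` is hypothetical, and reading `VALU/18` as the hexadecimal offset that matches `lstat64x+0x18` in the dbx output is an assumption.

```python
# Minimal sketch: split an errpt SYMPTOM CODE string into (key, value) pairs.
# parse_symptom is a hypothetical helper; errpt itself offers no such API here.
from typing import List, Tuple


def parse_symptom(code: str) -> List[Tuple[str, str]]:
    pairs = []
    for token in code.split():
        key, _, value = token.partition("/")  # split on the first slash only
        pairs.append((key, value))
    return pairs


fields = parse_symptom("PCSS/SPI2 FLDS/httpd SIG/6 FLDS/lstat64x VALU/18")
signal_number = int(dict(fields)["SIG"])
offset = int(dict(fields)["VALU"], 16)  # assumed to be hexadecimal

print(fields)
print(signal_number, hex(offset))
```

Signal 6 is the abort signal, which matches the "IOT/Abort trap" text dbx reported for the core.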


  • 6.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Mon July 26, 2010 10:11 AM

    Originally posted by: SumitGoyal


    More cores are occurring at the same getcwd call, but they point to different functions:

    CORE 1:
    fcntl.__fcntl(??, ??, ??) at 0xd0387024 untrusted: /usr/lib/libc.a(shr.o)
    fcntl.__fcntl(??, ??, ??) at 0xd03871b0 untrusted: /usr/lib/libc.a(shr.o)
    getlogin_r(??, ??) at 0xd04d6c70 untrusted: /usr/lib/libc.a(shr.o)
    opendir64(??) at 0xd04d5b14 untrusted: /usr/lib/libc.a(shr.o)
    _getwd32(??) at 0xd04d5498 untrusted: /usr/lib/libc.a(shr.o)
    locateRenditionFile_7_5_0() at 0xd19ea38c
    CORE 2:
    readdir64.readdir64_r(??, ??, ??) at 0xd0479214 untrusted: /usr/lib/libc.a(shr.o)
    getlogin_r(??, ??) at 0xd04d6b2c untrusted: /usr/lib/libc.a(shr.o)
    opendir64(??) at 0xd04d5b14 untrusted: /usr/lib/libc.a(shr.o)
    _getwd32(??) at 0xd04d5498 untrusted: /usr/lib/libc.a(shr.o)

    One thing I noticed in the cores is that the crash happens after about an hour, and that the ABORT signal is raised by another thread. That thread shows the following OS call stack: while creating an fstream object, it goes into _XOpenLocale internally and finally raises an abort signal.
    abort._abort() at 0xd03ea78c untrusted: /usr/lib/libc.a(shr.o)
    .() at 0xffffffff
    myabort()() at 0xd01fa8ac
    terminate()() at 0xd01f8bf0
    terminate()() at 0xd01fa04c
    __DoThrowV6() at 0xd01fc6f0
    _XOpenLocale(int,const char*)(??, ??) at 0xd1370238
    _Getcvt__FPCc(??, ??) at 0xd13704bc
    _Getcvt() const(??, ??) at 0xd1381b2c
    _Init(const std::_Locinfo&)(??, ??) at 0xd1381a80
    locale.codecvt(const std::_Locinfo&,unsigned long)(??, ??, ??) at 0xd13a4a14
    _Makeloc(const std::_Locinfo&,int,std::locale::_Locimp*,const std::locale*)(??, ??, ??, ??) at 0xd13757d8
    locale._Locimp(const std::locale::_Locimp&)(??, ??) at 0xd13a4f5c
    Page.__ct__Q2_3std6localeGQ2_3std7codecvtXTcTcTPc__RCQ2_3std6localePQ2_3std7codecvtXTcTcTPc_(0x3052afc4, 0x3052b19c, 0x3005a178) at 0xde8604c0
    Page._Initcvt__Q2_3std13basic_filebufXTcTQ2_3std11char_traitsXTc__Fv(0x3052b14c) at 0xde860214
    open__Q2_3std13basic_filebufXTcTQ2_3std11char_traitsXTc__FPCci(0x3052b14c, 0x30a28aa9, 0x1) at 0xde8600b4
    __ct__Q2_3std13basic_fstreamXTcTQ2_3std11char_traitsXTc__FPCci(0x3052b130, 0x3052b1a4, 0xf1259a00, 0x30a28aa9, 0x1) at 0xde9226a4

    Please let me know if you have any comments on it.

    Regards,
    Sumit
    #AIX-Forum
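
The observation above, that another thread raised the abort, would explain why the getcwd thread's stack looks innocent: an abort terminates the whole process, so every thread is captured wherever it happened to be. The sketch below illustrates this on a POSIX system, using Python's `os.abort()` in a worker thread as a stand-in for the uncaught C++ exception reaching `terminate()`. It is an analogy under those assumptions, not a reproduction of the httpd module.

```python
# Minimal sketch: an abort raised in one thread ends the entire process.
# The child script is a stand-in for the real module, not a reproduction.
import signal
import subprocess
import sys
import textwrap

CHILD = textwrap.dedent("""
    import os, threading, time

    def worker():
        os.abort()          # plays the role of terminate() -> abort()

    threading.Thread(target=worker).start()
    time.sleep(5)           # plays the role of the thread inside getcwd
    print("main thread finished normally")
""")


def run_child() -> subprocess.CompletedProcess:
    return subprocess.run([sys.executable, "-c", CHILD],
                          capture_output=True, text=True, timeout=30)


result = run_child()
print("killed by SIGABRT:", result.returncode == -signal.SIGABRT)
print("main thread output:", repr(result.stdout))
```

If this picture applies to the cores in this thread, the stack to investigate is the one ending in `_XOpenLocale` and `__DoThrowV6`, rather than the getcwd path.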


  • 7.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Fri August 06, 2010 02:23 PM

    Originally posted by: SumitGoyal


    It seems the problem is solved by the AIX patch upgrade. Earlier we were using libc version 5.3.9.0, but we have now upgraded to 5.3.9.5, and that has solved our issue for some time.
    #AIX-Forum


  • 8.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Thu August 12, 2010 02:58 PM

    Originally posted by: SumitGoyal


    It seemed that the problem was solved by the library upgrade, but it has come back.

    The httpd cores have started occurring again, repeatedly, with the same call stack.
    #AIX-Forum


  • 9.  Re: IOT/Abort trap in stat.lstat64x [/usr/lib/libc.a]

    Posted Thu January 13, 2011 06:12 PM

    Originally posted by: SystemAdmin


    SumitGoyal wrote:
    seems the problem is solved with the aix patch upgrade..earlier we were using libc version 5.3.9.0 but now we have upgraded it to 5.3.9.5 . and it solved our issue for sometime..

    This is good for reference. Thanks for the information!
    #AIX-Forum