AIX

  • 1.  Kernel Memory Leak ?

    Posted Fri July 21, 2006 09:39 AM

    Originally posted by: SystemAdmin


    I am having some strange problems with a P550Q. Occasionally I start getting messages like those below, which seem to indicate that I am short of "kernel memory". However, at these times NMON and TOPAS show no problems.

    I have cleared down processes using shared memory, and there is currently no load on the server other than membership of a Veritas Cluster (the Oracle database is still to be built). While the problem is occurring, even attempting to call up man pages fails. A reboot solves the problem, at least temporarily.

    Any pointers ? thanks.

    Could not load program ./smbpasswd:
    Symbol resolution failed for smbpasswd because:
    There is not enough kernel memory. Try again later.
    #AIX-Forum


  • 2.  Some details please

    Posted Mon July 24, 2006 06:21 AM

    Originally posted by: nagger


    I think some facts would be good for trying to help:

    Which AIX versions? i.e. oslevel -r
    Is this a LPAR or a whole machine?
    How much memory have you got in LPAR and whole machine?
    Have you "tuned" AIX memory settings?
    How much paging space have you got and how much of it is in use? lsps -a
    What percentage of memory are you allocating to shared memory segments?

    Which Samba version?

    Veritas includes (I think) kernel internal extensions and memory allocation functions, but that might be the Veritas filesystem only. Can you run without Veritas for a while? Have you "over cooked" the Veritas options and taken up too much memory that way? Have you checked with Veritas support or the documentation for recommended settings?

    My first pass guess would be you are running low on paging space.

    Hope this helps, N
    #AIX-Forum


  • 3.  Re: Similar Problem

    Posted Mon July 24, 2006 02:32 PM

    Originally posted by: eichher


    We got similar messages running an HACMP 5.2 cluster with AIX 5.3 ML03 CSP. Nine days after we migrated from AIX 5.1/HACMP 5.1 we had a crash on one server: we couldn't start any new process. Support first suspected a paging space problem, but we then had another crash and took a complete dump. A kernel heap problem is now being investigated. We have some kernel extensions running (ORACLE 8.1.7.4, TSM SpaceManager). Do you have a comparable environment?
    #AIX-Forum


  • 4.  Kernel Memory leak ?

    Posted Thu July 27, 2006 11:12 AM

    Originally posted by: SystemAdmin


    Hi Nigel

    Some details

    AIX version 5.3 maintenance level 4 I believe (oslevel -r command not working at present (see below)).

    Server is a P550Q running a single partition (No VIOS).

    It has 32 GB of memory with 4 GB set for swap, and 8 CPUs.

    Server is in a Veritas 4.0 cluster with a similar single-partition P550Q (although that server has half the memory and CPU of the main server).

    Server is running an Oracle 10g database and applications but is still in the test stages rather than production.

    I have done no memory tuning.

    # lsps -a
    Page Space  Physical Volume  Volume Group  Size    %Used  Active  Auto  Type
    hd6         hdisk0           rootvg        4096MB  1      yes     yes   lv
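
    The lsps output above shows hd6 at 1% used, which does not point to a paging space shortage. As a minimal sketch of scripting that check (the function name `paging_over` and the default threshold of 80 are my own assumptions, not anything that ships with AIX), the `lsps -a` output can be piped through awk:

    ```shell
    #!/bin/sh
    # Hypothetical helper: reads `lsps -a` output on stdin and prints every
    # paging space whose %Used is at or above the threshold (default 80,
    # an arbitrary choice). Exit status is 0 only if at least one was found.
    paging_over() {
        threshold=${1:-80}
        awk -v t="$threshold" '
            NR == 1 { next }                      # skip the header line
            NF >= 5 && $5 + 0 >= t {              # field 5 is the %Used column
                printf "%s %s%%\n", $1, $5
                found = 1
            }
            END { exit found ? 0 : 1 }
        '
    }

    # Usage on a live system (not run here):
    #   lsps -a | paging_over 80 && echo "paging space is getting full"
    ```

    It only reads the data rows, so it does not care how the column headers are spaced.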

    Samba version 3.0.22.

    Errors are now occurring on both servers and are intermittent. See below:

    servername:/cms# df -g
    Could not load program df:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# df -g
    Could not load program df:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# who
    Could not load program who:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# top
    ksh: top: not found
    servername:/cms# nmon
    Could not load program /usr/bin/ksh:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# lsps -a
    Could not load program lsps:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# df
    Could not load program df:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# man man
    Could not load program man:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# who
    Could not load program who:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    servername:/cms# lsps -a
    Page Space Physical Volume Volume Group Size %Used Active Auto Type
    hd6 hdisk0 rootvg 4096MB 1 yes yes lv

    # oslevel -r
    Could not load program /usr/bin/id:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program uname:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    /usr/bin/oslevel598: test: argument expected
    Could not load program expr:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    /usr/bin/oslevel667: test: argument expected
    Could not load program /usr/bin/rm_mlcache_file:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program /usr/sbin/inuumsg:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program /usr/sbin/inuumsg:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program /usr/bin/rm:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program /usr/bin/rm:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
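
    Because the failures are intermittent, it can help to save a session transcript and tally which programs failed to load. A rough sketch, assuming the transcript is plain text on stdin (the function name `load_failures` is hypothetical):

    ```shell
    #!/bin/sh
    # Hypothetical helper: reads a saved terminal transcript on stdin and
    # prints "program count" for every "Could not load program X:" line.
    load_failures() {
        sed -n 's/^[[:space:]]*Could not load program \(.*\):$/\1/p' |
            sort |
            uniq -c |
            awk '{ print $2, $1 }'    # uniq -c prints the count first; swap the order
    }

    # Usage (not run here):
    #   load_failures < session.log
    ```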

    #AIX-Forum


  • 5.  Re: Kernel Memory leak ?

    Posted Thu July 27, 2006 12:27 PM

    Originally posted by: SystemAdmin


    Managed to get some stats out of nmon immediately after the following command failed:

    # hastatus
    Could not load program hastatus:
    System error: Not enough space

    nmon startup screen:
    Version v10r for AIX53
    16 - CPUs currently
    16 - CPUs configured
    1498 - MHz CPU clock rate
    PowerPC_POWER5 - Processor
    64 bit - Hardware
    64 bit - Kernel
    Dynamic - Logical Partition
    5.3.0.40 - AIX Kernel Version
    - Hostname

    nmon-v10r   1=Top-Basics   Host= -ibm   Refresh=2 secs   17:22.32
    Memory-Use                        Paging                       Stats
               Physical   PagingSpace  pages/sec        In   Out   FileSystemCache
    % Used     12.7%      0.3%         to Paging Space  0.0  0.0   (numperm)  0.7%
    % Free     87.3%      99.7%        to File System   0.0  0.0   Process    4.7%
    MB Used    4049.7MB   11.9MB       Page Scans       0.0        System     7.3%
    MB Free    27822.3MB  4084.1MB     Page Cycles      0.0        Free       87.3%
    Total(MB)  31872.0MB  4096.0MB     Page Steals      0.0        ------
                                       Page Faults      0.0        Total      100.0%
    Min/Maxperm     6102MB (19%)  24406MB (77%)   note: % of memory
    Min/Maxfree     960   1088    Total Virtual     35.1GB          User    2.0%
    Min/Maxpgahead  2     8       Accessed Virtual  3.7GB   10.4%   Pinned  8.3%


    #AIX-Forum


  • 7.  Re: Kernel Memory Leak ?

    Posted Thu July 27, 2006 12:58 PM

    Originally posted by: RichardRoss


    This is a known problem: APAR IY84780. The fix is not yet available, so you'll need to contact the support center and get the efix. As a workaround, you can turn on MODS (Memory Overlay Detection System), which stops the problem from occurring but can affect system performance.

    How to Enable MODS
    1. bosdebug -M
    2. bosboot -a -d /dev/ipldevice
    3. reboot the system

    How to Disable MODS
    1. bosdebug -o
    2. bosboot -a -d /dev/ipldevice
    3. reboot the system

    How to check the current status of MODS
    1. bosdebug -L
    Memory debugger off <<<<< MODS is off
    Memory sizes 0
    Network memory sizes 0
    Kernel debugger off
    Real Time Kernel off
    Backtracking fault log off
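
    The status check above can be scripted. This is only a sketch: it assumes the "Memory debugger" line keeps the layout shown in the listing and that the enabled case prints "on" in the same position, which I have not confirmed. The function name `mods_state` is hypothetical.

    ```shell
    #!/bin/sh
    # Hypothetical helper: reads `bosdebug -L` output on stdin and prints the
    # third word of the "Memory debugger" line ("off" in the listing above),
    # or "unknown" if no such line is present.
    mods_state() {
        awk '
            /^[[:space:]]*Memory debugger/ { state = $3; exit }   # exit still runs END
            END { print (state == "" ? "unknown" : state) }
        '
    }

    # Usage on a live system (not run here):
    #   bosdebug -L | mods_state
    ```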
    #AIX-Forum


  • 8.  Re: Kernel Memory Leak ?

    Posted Tue August 01, 2006 02:15 PM

    Originally posted by: SystemAdmin


    Thanks !!! Applied the workaround and have now had an efix from IBM.

    #AIX-Forum


  • 9.  Re: Kernel Memory Leak ?

    Posted Mon October 26, 2009 11:53 AM

    Originally posted by: GUDDLU


    Hi,

    Good to hear that the MODS workaround worked for you. You also mentioned that you got an efix from IBM.
    Could you please share some info on that efix? How can we obtain it? We are stuck with the same problem.

    We are observing the same/similar errors as below.
    #snap -c -a
    Could not load program sed:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program awk:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program sed:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program awk:
    Dependent module libc.a(shr.o) could not be loaded.
    Could not load module libc.a(shr.o).
    Could not load program sed:
    Dependent module libc.a(shr.o) could not be loaded

    Sreedhar
    #AIX-Forum


  • 10.  Re: Kernel Memory Leak ?

    Posted Mon October 26, 2009 07:43 PM

    Originally posted by: dukessd


    If your system matches this thread - AIX 5.3 TL04 - you NEED to update big time !

    The fix, APAR IY84780, was released in August 2006!

    Update to AIX 5.3 TL05 to install the fix:

    http://www-01.ibm.com/support/docview.wss?uid=isg1IY84780

    http://www-933.ibm.com/eserver/support/fixes/fixcentral/pseriesfixpackinformation?fp=5300-05

    Better still, update to a TL that still has defect support like AIX 5.3 TL07, 8, 9, 10 or 11...
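
    To see where a system stands relative to TL05, the `oslevel -r` string (of the form 5300-04) can be compared against the target level. A minimal sketch, assuming an AIX 5.3 level string; the function name `tl_at_least` is hypothetical. On the system itself, `instfix -ik IY84780` should also report whether the APAR is installed.

    ```shell
    #!/bin/sh
    # Hypothetical helper: succeeds if an `oslevel -r` style string such as
    # "5300-04" is at or above the wanted technology level of AIX 5.3.
    # Returns 2 for anything that is not a 5300-NN string.
    tl_at_least() {
        level=$1
        want=$2
        case $level in
            5300-[0-9][0-9]) ;;
            *) return 2 ;;
        esac
        tl=${level#5300-}     # keep only the two-digit level
        tl=${tl#0}            # drop one leading zero, e.g. "05" becomes "5"
        [ "$tl" -ge "$want" ]
    }

    # Usage on a live system (not run here):
    #   tl_at_least "$(oslevel -r)" 5 || echo "below TL05: IY84780 fix not included"
    ```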
    #AIX-Forum