AIX

  • 1.  Reboot During mksysb creation!

    Posted Tue June 05, 2007 04:33 PM

    Originally posted by: SystemAdmin


    Hello,

    I'm trying to create a mksysb image from a NIM server with the following command:
    nim -o define -t mksysb -F -a server=master -a location=$mksysb_location/${box}_sysb -a source=$box -a mk_image=yes -a mksysb_flags=e ${box}_sysb

    This is how the nim man page shows it. The problem is that a few minutes into the creation of the backup the box gets rebooted, and errpt shows the following error:

    2BFA76F6 0605225507 T S SYSPROC SYSTEM SHUTDOWN BY USER

    errpt -a -j 2BFA76F6

    LABEL: REBOOT_ID
    IDENTIFIER: 2BFA76F6

    Date/Time: Tue Jun 5 22:55:13 WET 2007
    Sequence Number: 2043043
    Machine Id: 00C97B4E4C00
    Node Id: boxname
    Class: S
    Type: TEMP
    Resource Name: SYSPROC

    Description
    SYSTEM SHUTDOWN BY USER

    Probable Causes
    SYSTEM SHUTDOWN

    Detail Data
    USER ID
    0
    0=SOFT IPL 1=HALT 2=TIME REBOOT
    0
    TIME TO REBOOT (FOR TIMED REBOOT ONLY)
    0

    Does anyone have any idea why this happens? I'm pretty sure the cause is NIM. That box also runs an Oracle RAC database, and I have the same problem on the other RAC node. Both nodes have 5 GB of RAM and 3 GB of swap space.

    This is the output of topas -i1 during the NIM operation:

    Tue Jun  5 22:54:31 2007   Interval:  1         Cswitch    1147  Readch  2079.6K3
                                                    Syscall    6926  Writech 2023.1K9
    Kernel    5.5   |##                          |  Reads       736  Rawin         0
    User      6.9   |##                          |  Writes      337  Ttyout     1007
    Wait      4.1   |##                          |  Forks        30  Igets         0
    Idle     83.6   |########################    |  Execs        30  Namei      1180
    Physc =  0.28                   %Entc=  14.0    Runqueue    2.0  Dirblk        0
                                                    Waitqueue   0.0
    Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out
    en2    1038.8    367.0   711.0    25.7  1013.0  PAGING           MEMORY
    en5       3.0      8.0    10.0     1.0     2.0  Faults     4115  Real,MB    5120
    lo0       0.1      1.0     1.0     0.0     0.0  Steals     1808  % Comp     43.2
                                                    PgspIn       16  % Noncomp  57.1
    Disk    Busy%     KBPS     TPS KB-Read KB-Writ  PgspOut       9  % Client   57.1
    hdisk0   81.0   3008.0   280.0  2968.0    40.0  PageIn      831
    hdisk1    0.0     82.0     7.0    33.5    48.5  PageOut     525  PAGING SPACE
    hdisk2    0.0      0.5     1.0     0.5     0.0  Sios       1077  Size,MB    3072
                                                                     % Used     17.4
    Name            PID  CPU%  PgSp Owner           NFS (calls/sec)  % Free     82.5
    backbyna     897208   0.9   0.2 root            ServerV2       0
    kbiod        213096   0.1   0.1 root            ClientV2       0   Press:
    topas        937988   0.1   2.7 ivan            ServerV3       0   "h" for help
    lrud          16392   0.1   0.1 root            ClientV3      40   "q" to quit
    sh           516096   0.0   0.6 root
    ocssd.bi     540796   0.0  26.3 oracle
    oracle       389364   0.0  18.1 oracle
    oracle       446538   0.0  18.1 oracle
    oracle       377084   0.0  18.1 oracle
    oracle       442458   0.0  18.1 oracle
    oracle       520244   0.0  17.9 oracle
    sleep        835702   0.0   0.1 root
    oracle       397448   0.0  18.2 oracle
    oracle       335956   0.0   4.4 oracle
    rtcmd        168072   0.0   0.1 root
    gil          114744   0.0   0.1 root
    getty        286900   0.0   0.4 root
    racgimon     393468   0.0  51.3 oracle
    oracle       425988   0.0   3.9 oracle
    oracle       634990   0.0  13.7 oracle

    The only thing I noticed is that all of the memory is in use and there is some swap usage. I haven't found any memory requirements for NIM in the NIM docs. Maybe I missed something... Any ideas? Thanks in advance.
    #AIX-Forum
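
To put the topas figures from this post into absolute terms, here is a rough sketch that converts the percentages into megabytes. The helper name pct_of_mb is made up for illustration; the inputs are the Real,MB, % Comp, % Noncomp, Size,MB and % Used values from the snapshot.

```shell
#!/bin/sh
# Hypothetical helper: convert a percentage reported by topas into megabytes.
# $1 = total size in MB, $2 = percentage
pct_of_mb() {
    awk -v total="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", total * pct / 100 }'
}

pct_of_mb 5120 43.2   # computational memory: % Comp of Real,MB
pct_of_mb 5120 57.1   # non-computational (file cache): % Noncomp of Real,MB
pct_of_mb 3072 17.4   # paging space in use: % Used of Size,MB
```

On these numbers more than half of real memory is non-computational pages rather than process memory, while paging space usage is still modest.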


  • 2.  Re: Reboot During mksysb creation!

    Posted Wed June 06, 2007 03:20 AM

    Originally posted by: SystemAdmin


    You should look into tuning AIX's memory buffers and not just blindly add resources.
    #AIX-Forum


  • 3.  Re: Reboot During mksysb creation!

    Posted Fri June 15, 2007 03:49 PM

    Originally posted by: SystemAdmin


    Did you try mounting the NFS directory from the NIM server and running the mksysb backup manually, to see what happens during mksysb creation?
    #AIX-Forum
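
The manual test suggested in this reply could look roughly like the sketch below. The server name, export path and mount point are placeholders, not values from the thread, and the step and manual_mksysb helpers are invented for illustration. By default the script only prints the commands; it runs them only when RUN=1 is set.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to the real NIM master and export.
NIM_SERVER=${NIM_SERVER:-nimmaster}
EXPORT_DIR=${EXPORT_DIR:-/export/mksysb}
MOUNT_POINT=${MOUNT_POINT:-/mnt/mksysb}

# Print each command; execute it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# $1 = client name, used for the image file name as in the NIM command.
# mksysb -e honours /etc/exclude.rootvg (same as mksysb_flags=e);
# -i regenerates /image.data first (an addition, not in the original command).
manual_mksysb() {
    box=$1
    step mkdir -p "$MOUNT_POINT" &&
    step mount "$NIM_SERVER:$EXPORT_DIR" "$MOUNT_POINT" &&
    step mksysb -e -i "$MOUNT_POINT/${box}_sysb" &&
    step umount "$MOUNT_POINT"
}
```

Calling manual_mksysb boxname prints the four steps for review. Running it as root on the client with RUN=1 would perform the backup outside NIM, which helps separate a NIM problem from a general NFS or memory-pressure problem.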


  • 4.  Re: Reboot During mksysb creation!

    Posted Fri June 15, 2007 07:14 AM

    Originally posted by: SystemAdmin


    Well, I checked the memory tunables and they seem OK. Funny thing is, we have other clusters running Oracle RAC on GPFS, just without Oracle Clusterware, and we have no problems there, though they have more memory configured. Another thing I noticed is that the same behaviour shows up when you cp from NFS and the system uses a lot of memory for cache: when memory fills up it starts stealing pages, and after a minute or two the box gets rebooted. Currently the main suspect is Oracle Clusterware, since if it were AIX it would have logged something in errpt as the reason for the shutdown. Still looking for a solution to the problem.
    #AIX-Forum


  • 5.  Re: Reboot During mksysb creation!

    Posted Mon June 18, 2007 07:41 AM

    Originally posted by: SystemAdmin


    Since I've had no luck finding the reason for this shutdown, I was wondering if there is a way to find out who DID the shutdown. I know it says UID 0, but that tells me nothing, since only root can request a shutdown and both the AIX processes and the Oracle Clusterware processes run with root authority. Is there a way to find out whether AIX itself caused the reboot, or whether it was some process, identified by PID or name? That way I could at least get some direction on where to troubleshoot. Thanks in advance.
    #AIX-Forum


  • 6.  Re: Reboot During mksysb creation!

    Posted Thu July 12, 2007 09:48 AM

    Originally posted by: SystemAdmin


    My customer has a similar problem. Did you get a system dump for the reboot? If not, oprocd in Oracle could have triggered the reboot if it was not scheduled to run within (timeout + timeout margin) ms. Do you know if your system was overloaded when you ran nim?
    #AIX-Forum
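
The (timeout + timeout margin) rule mentioned in this reply can be written down as a small check. This only mirrors the rule as described in the post; it is not oprocd's actual code, and the function name and the example values are assumptions chosen for illustration.

```shell
#!/bin/sh
# Illustrative only: decide whether a late wake-up would exceed the window.
# $1 = observed interval between wake-ups in ms
# $2 = timeout in ms, $3 = timeout margin in ms
oprocd_would_reboot() {
    if [ "$1" -gt $(( $2 + $3 )) ]; then
        echo "reboot"
    else
        echo "ok"
    fi
}
```

For example, with an assumed timeout of 1000 ms and margin of 500 ms, oprocd_would_reboot 1600 1000 500 reports a reboot while oprocd_would_reboot 1200 1000 500 does not. The real values should be read from the running oprocd process on the node rather than taken from this sketch.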


  • 7.  Re: Reboot During mksysb creation!

    Posted Fri July 13, 2007 01:25 PM

    Originally posted by: unixgrl


    Definitely check the Oracle logs. On a RAC system, if communication fails or it can't get time on the kernel, the clustering software will halt the system to avoid a split-brain situation.
    #AIX-Forum


  • 8.  Re: Reboot During mksysb creation!

    Posted Thu July 19, 2007 04:20 AM

    Originally posted by: SystemAdmin


    Hi Guys,
    We have now opened a PMR with IBM, and the investigation found that it is definitely not an AIX or GPFS related issue. We are now in talks with Oracle to see what they can do about it. The box is not loaded at all when the mksysb is created; it's a scheduled cron job from our NIM server that runs at night! The box does not dump, it just gets rebooted really quickly, something like halt -q. The Oracle logs show no indication of any loss of communication, e.g. failing to access the voting disk (in which case the box dumps, as we have seen in other configurations we have). So now we will be working with Oracle to fix the issue. Funny thing is, there is a workaround for this problem: adding memory to the LPAR during the backup operation. But that's not a proper way to solve the issue. IBM support told us they have had more cases like this, and that it was some bug in Oracle. We just have to find out if that is the case here.
    #AIX-Forum


  • 9.  Re: Reboot During mksysb creation!

    Posted Thu July 26, 2007 01:44 PM

    Originally posted by: SystemAdmin


    We have found that the vmstat logs (collected with OSWatcher) show very high avm (active virtual pages) values for at least 1 hour and 16 minutes before the reboot by oprocd at our customer site. These values are ~15% greater than the physical memory that the node has. An IBM analyst is looking into this, and the VM Manager may need some tuning.
    #AIX-Forum
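
The comparison described in this reply (avm versus physical memory) can be sketched as follows. The avm column of vmstat on AIX counts 4 KB pages, so it has to be converted before comparing it with real memory in MB. The function name and the sample numbers are hypothetical; the thread does not give the customer's actual values.

```shell
#!/bin/sh
# Hypothetical helper: express avm as a percentage of real memory.
# $1 = avm in 4 KB pages (from vmstat), $2 = real memory in MB
avm_pct_of_real() {
    awk -v avm="$1" -v real_mb="$2" 'BEGIN {
        avm_mb = avm * 4 / 1024              # 4 KB pages -> MB
        printf "%.1f\n", avm_mb / real_mb * 100
    }'
}

# Made-up sample: 1507328 pages (5888 MB) on a node with 5120 MB of RAM.
avm_pct_of_real 1507328 5120
```

A result above 100 means the active virtual memory no longer fits in RAM, so the node depends on paging space, which is the situation reported here.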