Shutdown Issue On Node With Resource Group(s) Online

  • 1.  Shutdown Issue On Node With Resource Group(s) Online

    Posted Sat December 10, 2011 07:52 AM

    Originally posted by: usaix


    There is a PowerHA installation with two nodes participating. Executing shutdown on any node that has resource group(s) online fails. I suspect that it is an enhanced concurrent volume group or SDDPCM issue.

    Environment Information

    • AIX 6.1 technology level 7 service pack 1
    • PowerHA 6.1 Standard Edition (service pack 6)
    • Storwize V7000 (SDDPCM, NPIV)
    • Virtual I/O Server 2.2.1.1 (FixPack 25)

    Thanks in advance.
    #PowerHAforAIX
    #PowerHA-(Formerly-known-as-HACMP)-Technical-Forum


  • 2.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sat December 10, 2011 01:33 PM

    Originally posted by: SystemAdmin


    just a hint: if you want free help, you'd better offer a few more details (and an actual question shouldn't be too hard to come up with)


  • 3.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sun December 11, 2011 01:52 AM

    Originally posted by: usaix


    Executing shutdown on any node that has resource group(s) online fails. The shutdown procedure is initiated on the server, but after stopping various services, and just before unmounting the filesystems and removing the network interfaces, it hangs. It is a "fresh" PowerHA installation, meaning there is no application installed that could keep the filesystems from being unmounted. The filesystems that are part of the resource groups do not contain any data. Once this has occurred, the only way to shut down or restart the server is to force it through the HMC. When the server is restarted, the following message is recorded in the error log among the other usual messages:

    C1348779 1210145611 I O SYSJ2 LOG I/O ERROR

    LABEL: J2_LOG_EIO
    IDENTIFIER: C1348779

    Date/Time: Sat 10 Dec 2011 14:56:28
    Sequence Number: 223
    Machine Id: 00F6FC804C00
    Node Id: ******
    Class: O
    Type: INFO
    WPAR: Global
    Resource Name: SYSJ2

    Description
    LOG I/O ERROR

    Probable Causes
    ADAPTER HARDWARE OR MICROCODE
    DISK DRIVE HARDWARE OR MICROCODE
    SOFTWARE DEVICE DRIVER
    STORAGE CABLE LOOSE, DEFECTIVE, OR UNTERMINATED

    Recommended Actions
    CHECK CABLES AND THEIR CONNECTIONS
    INSTALL LATEST ADAPTER AND DRIVE MICROCODE
    INSTALL LATEST STORAGE DEVICE DRIVERS
    IF PROBLEM PERSISTS, CONTACT APPROPRIATE SERVICE REPRESENTATIVE

    Detail Data
    JFS2 LOG MAJOR/MINOR DEVICE NUMBER
    0065 0001
    ERROR CODE
    0000 0006
    BUF STRUCTURE B_FLAGS
    080C 0404
    BLOCK NUMBER
    0000 0028

    Do you think that it is a configuration issue (SDDPCM, NPIV, PowerHA) or probably a bug?
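
    The summary line of the error report entry above carries its timestamp in the compact MMDDhhmmyy form. A minimal Python sketch for turning it into a readable date, so it can be lined up with the time of the forced restart (the function name `parse_errpt_timestamp` is made up for illustration):

```python
from datetime import datetime


def parse_errpt_timestamp(stamp: str) -> datetime:
    """Parse an errpt summary timestamp given as MMDDhhmmyy."""
    if len(stamp) != 10 or not stamp.isdigit():
        raise ValueError(f"expected 10 digits (MMDDhhmmyy), got {stamp!r}")
    return datetime.strptime(stamp, "%m%d%H%M%y")


# Summary line copied from the post above.
summary = "C1348779 1210145611 I O SYSJ2 LOG I/O ERROR"
identifier, stamp = summary.split()[:2]
print(identifier, parse_errpt_timestamp(stamp).isoformat())
# prints: C1348779 2011-12-10T14:56:00
```

    The result agrees, to the minute, with the Date/Time field of the detailed entry (Sat 10 Dec 2011 14:56:28).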


  • 4.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sun December 11, 2011 11:08 AM

    Originally posted by: Holgervk


    are there scsi_reserves on the disks?
    are the filesystems mounted on both servers? (recent hacmp-bug, however, only after manual rg-failover)
    aren't there more I/O-related errors?


  • 5.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sun December 11, 2011 06:06 PM

    Originally posted by: usaix


    The reserve_policy attribute is set to no_reserve by default for all hdisks that are members of the volume groups belonging to resource groups. I have not noticed any filesystems being mounted on both nodes. No other relevant messages are recorded in the log.


  • 6.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sun December 11, 2011 11:57 AM

    Originally posted by: SystemAdmin


    if the logs give no clues, I'd rebuild the cluster with aix tl6 and see if it still fails.
    the tls that were released in late october (6.1tl7 and 7.1tl1) bring quite a few problems. their release notes mention library changes made to comply better with some version of the single unix specification. those changes already break localtime() in combination with specific TZ variables...


  • 7.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Sun December 11, 2011 06:18 PM

    Originally posted by: usaix


    I considered it may be an AIX 6.1 technology level 7 service pack 1 issue. However, downgrading is not an option.


  • 8.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Mon December 12, 2011 07:26 AM

    Originally posted by: SystemAdmin


    > downgrading is not an option
    who says this? what kind of project is this?
    i am not suggesting "downgrading"; i suggest trading "bleeding edge" for "somewhat stable".


  • 9.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Mon December 12, 2011 08:30 AM

    Originally posted by: usaix


    I agree with your approach but the project is in progress. We cannot change the specifications of the project.


  • 10.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Mon December 12, 2011 11:33 AM

    Originally posted by: RosieK


    The filesystem affected (from your errpt entry) is 101,0 - have you checked /dev to find out which FS this is?
    Also, have you tried taking HA out of the picture, mounting the filesystems manually and trying the shutdown - what results?
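
    The device numbers come from the "JFS2 LOG MAJOR/MINOR DEVICE NUMBER" field of the errpt entry. A minimal Python sketch of the conversion, assuming both values in that field are hexadecimal (0065 hex is 101 decimal, which matches the major number quoted here); the function name `decode_major_minor` is made up for illustration:

```python
def decode_major_minor(detail: str) -> tuple[int, int]:
    """Convert an errpt 'MAJOR/MINOR DEVICE NUMBER' pair to decimal.

    Assumption: both fields are hexadecimal, e.g. "0065 0001".
    """
    major_hex, minor_hex = detail.split()
    return int(major_hex, 16), int(minor_hex, 16)


# Value copied from the Detail Data section of the errpt entry.
major, minor = decode_major_minor("0065 0001")
print(f"major={major} minor={minor}")
# prints: major=101 minor=1
```

    The decimal pair can then be compared with the major and minor numbers that `ls -l /dev` prints for each device file to find the matching device.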


  • 11.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Mon December 12, 2011 03:05 PM

    Originally posted by: usaix


    Actually, 101 and 0 do not belong to a filesystem but to the volume group that is part of the resource group of the PowerHA installation. Shutting down the server after mounting the filesystems manually completes normally. However, there is a major difference between the two cases. When the filesystems are mounted through PowerHA, the volume group to which they belong is activated in enhanced concurrent mode. When I mount the filesystems manually with cluster services inactive, the volume group is not activated in enhanced concurrent mode. That is why I first suspected enhanced concurrent mode along with SDDPCM.


  • 12.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Thu December 15, 2011 09:57 AM

    Originally posted by: usaix


    Issue has been resolved by upgrading to AIX 6.1 technology level 7 service pack 2 and PowerHA 6.1 service pack 7 (both just released). Once again, thanks everyone for the help.


  • 13.  Re: Shutdown Issue On Node With Resource Group(s) Online

    Posted Tue April 10, 2012 04:27 AM

    Originally posted by: HM2M_Abdul-Azeez_Musa


    Hello,

    I am experiencing exactly the same problem as yours.
    You said you installed "PowerHA 6.1 service pack 7", but I cannot find it on Fix Central. The latest one there is PowerHA 6.1 service pack 5, which is what I am running.

    Kindly assist