PowerHA for AIX

  • 1.  Please help me to check the PowerHA 6.1 cross-site LVM mirroring

    Posted Fri August 03, 2012 07:41 AM

    Originally posted by: OTIT


    I have configured the cluster following all of the steps in the Redbook (SG24-7841), and I have some questions about LVM mirroring.
    1. When I synchronize the cluster, I see the following warning message. What can I do to resolve it?

    WARNING: The disk with PVID 00f7c002dcacfd4d0000000000000000 is a part of the volume
    group sapvg which participates in resource group sapprd_RG on
    node: sappr1 and site: HOsappr1.
    This PVID is duplicated on node: sappr2, site: DRsappr2 the PVID should not be
    the same PVID as node sappr1.
    and for all PVs on the same VG.

    For more information:

    # lspv
    hdisk1 00f7c002dcacfd4d sapvg concurrent
    hdisk2 00f7c002dcad1f7b sapvg concurrent
    hdisk3 00f7c002dcad2f6d None
    hdisk4 00f7c004dcae07b0 sapvg concurrent
    hdisk5 00f7c004dcae15ed sapvg concurrent
    hdisk0 00f7c002d749ca50 rootvg active
    hdisk6 00f7c004dcae226c None
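
    The lspv listing above is from sappr1 only; with cross-site LVM mirroring both nodes are zoned to the storage of both sites and therefore see the same PVIDs, which appears to be why the verification warning is raised. A quick cross-node comparison sketch, assuming the default PowerHA 6.1 cl_rsh location and the node names shown in cltopinfo below:

    # compare the sapvg PVIDs as seen from each node
    for n in sappr1 sappr2
    do
        echo "=== $n ==="
        /usr/es/sbin/cluster/utilities/cl_rsh $n "lspv | grep sapvg"
    done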

    ###############
    # lsvg -P sapvg
    Physical Volume Mirror Pool
    hdisk1 MP_sappr1HO
    hdisk2 MP_sappr1HO
    hdisk4 MP_sappr2DR
    hdisk5 MP_sappr2DR

    lsmp -A sapvg
    VOLUME GROUP: sapvg Mirror Pool Super Strict: yes
    MIRROR POOL: MP_sappr1HO Mirroring Mode: SYNC
    MIRROR POOL: MP_sappr2DR Mirroring Mode: SYNC
    ################
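
    For context, a per-site mirror pool layout like this is typically built with chpv, chvg, and mklvcopy. A minimal sketch of the commands involved, assuming the disk and pool names above and the LV name saplv_mp that appears later in this thread (none of these commands are from the original post):

    # assign each shared PV to the mirror pool of its site
    chpv -p MP_sappr1HO hdisk1 hdisk2
    chpv -p MP_sappr2DR hdisk4 hdisk5
    # enforce super-strict mirror pools on the volume group
    chvg -M s sapvg
    # add a second LV copy, pinned to the DR-site mirror pool
    mklvcopy -p copy2=MP_sappr2DR saplv_mp 2
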
    #####################
    # lssrc -ls clstrmgrES
    Current state: ST_STABLE
    sccsid = "@(#)36 1.135.6.1 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 53haes_r610, 1135G_hacmp610 11/30/11 08:50:54"
    i_local_nodeid 0, i_local_siteid 1, my_handle 1
    ml_idx[1]=0 ml_idx[2]=1
    There are 0 events on the Ibcast queue
    There are 0 events on the RM Ibcast queue
    CLversion: 11
    local node vrmf is 6108
    cluster fix level is "8"
    The following timer(s) are currently active:
    Current DNP values
    DNP Values for NodeId - 1 NodeName - sappr1
    PgSpFree = 5239550 PvPctBusy = 0 PctTotalTimeIdle = 99.912996
    DNP Values for NodeId - 2 NodeName - sappr2
    PgSpFree = 5240552 PvPctBusy = 0 PctTotalTimeIdle = 99.75896
    #####################
    # cltopinfo
    Cluster Name: SAPprd_Cluster
    Cluster Connection Authentication Mode: Standard
    Cluster Message Authentication Mode: None
    Cluster Message Encryption: None
    Use Persistent Labels for Communication: No
    There are 2 node(s) and 4 network(s) defined

    NODE sappr1:
    Network net_diskhb_DR
    hdisk6_01 /dev/hdisk6
    Network net_diskhb_HO
    hdisk3_01 /dev/hdisk3
    Network net_ether_private
    sappr1 192.168.0.28
    Network net_ether_public
    sapprd 192.168.0.29
    sappr1_boot 172.16.10.28

    NODE sappr2:
    Network net_diskhb_DR
    hdisk6_02 /dev/hdisk6
    Network net_diskhb_HO
    hdisk3_02 /dev/hdisk3
    Network net_ether_private
    sappr2 192.168.0.30
    Network net_ether_public
    sapprd 192.168.0.29
    sappr2_boot 172.16.10.30

    Resource Group sapprd_RG
    Startup Policy Online On Home Node Only
    Fallover Policy Fallover To Next Priority Node In The List
    Fallback Policy Never Fallback
    Participating Nodes sappr1 sappr2
    Service IP Label sapprd

    Total Heartbeats Missed: 0

    ##################################################################
    2. In the case of a DR-site failure (I simply un-mapped the disks from both nodes on the V7000 at the DR site):

    # lsvg -p sapvg
    sapvg:
    PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
    hdisk1 active 1999 1939 400..340..399..400..400
    hdisk2 active 1999 1819 400..220..399..400..400
    hdisk4 missing 1999 1939 400..340..399..400..400
    hdisk5 missing 1999 1819 400..220..399..400..400
    My problem: when the V7000 at the DR site comes back to normal, what is the correct way to recover the VG from the missing-disk state?

    I tried using # smit hacmp
    > System Management (C-SPOC)
    > Storage
    > Volume Group
    > Synchronize LVM Mirrors
    > Synchronize by Volume Group

    Type or select values in entry fields.
    Press Enter AFTER making all desired changes.

    Entry Fields
    VOLUME GROUP name sapvg
    Resource Group Name sapprd_RG
    * Node List sappr1,sappr2

    Number of Partitions to Sync in Parallel [2] +#
    Synchronize All Partitions no +
    Delay Writes to VG from other cluster nodes during this Sync   no +

    An error occurs after the syncvg:

    ###########################################################################################
    sappr1: 2012-08-03T18:32:14.255602
    sappr1: 2012-08-03T18:32:14.261521
    sappr1: Reference string: Fri.Aug.3.18:32:14.ICT.2012.cl_disk_available.hdisk4hdisk5.ref
    sappr1: Aug 3 2012 18:32:14 Starting execution of /usr/es/sbin/cluster/events/utils/cl_disk_available with parameters: -s -v hdisk4hdisk5.
    sappr1: ***************************
    sappr1: Aug 3 2012 18:32:14 !!!!!!!!!! ERROR !!!!!!!!!!
    sappr1: ***************************
    sappr1: Aug 3 2012 18:32:14 cl_disk_available : Undefined disk device hdisk4hdisk5 (May have been duplicate).
    sappr1: ***************************
    sappr1: Aug 3 2012 18:32:14 !!!!!!!!!! ERROR !!!!!!!!!!
    sappr1: ***************************
    sappr1: Aug 3 2012 18:32:14 cl_disk_available : Unable to make device hdisk4hdisk5 available. Check hardware connections.
    sappr1: 0516-1396 getlvodm: The physical volume hdisk4hdisk5, was not found in the
    sappr1: system database.
    sappr1: 0516-722 chpv: Unable to change physical volume hdisk4hdisk5.
    sappr1: cl_rsh had exit code = 2, see cspoc.log and/or clcomd.log for more information
    sappr1: cl_rsh had exit code = 2, see cspoc.log and/or clcomd.log for more information
    error executing clvaryonvg sapvg on node sappr1
    Error detail:
    sappr1: can't clvaryonvg a vg which is already varied on
    sappr1: RETURN_CODE=2
    sappr1: cl_rsh had exit code = 2, see cspoc.log and/or clcomd.log for more information
    sappr1: 0516-938 /usr/sbin/syncvg: One option must be entered.
    sappr1: Usage: /usr/sbin/syncvg -i -f -H -P NumParalleLPs {-l|-p|-v} Name
    sappr1: Synchronize logical partition copies.
    sappr1: cl_rsh had exit code = 1, see cspoc.log and/or clcomd.log for more information
    cl_syncvg: Error executing syncvg P dc -v sapvg on node sappr1
    ###########################################################################################

    Something looks wrong in these messages: the disk names are passed as a single concatenated argument, "hdisk4hdisk5", instead of "hdisk4 hdisk5".

    Please suggest a solution. Thank you.


  • 2.  Re: Please help me to check the PowerHA 6.1 cross-site LVM mirroring

    Posted Thu August 30, 2012 06:50 AM

    Originally posted by: OTIT


    ## My solution for syncing the VG after a mirror pool comes back from a site failure.
    FYI >>>
    sappr1:/# lsvg -P sapvg
    Physical Volume Mirror Pool
    hdisk1 MP_HO
    hdisk3 MP_DR

    sappr1:/# lsmp -A sapvg
    VOLUME GROUP: sapvg Mirror Pool Super Strict: yes
    MIRROR POOL: MP_HO Mirroring Mode: SYNC
    MIRROR POOL: MP_DR Mirroring Mode: SYNC

    sappr1:/# lspv
    hdisk1 00f7c002dcacfd4d sapvg concurrent
    hdisk2 00f7c002dcad2f6d mndhb_vg_01 concurrent
    hdisk3 00f7c004dcae07b0 sapvg concurrent
    hdisk4 00f7c004dcae226c mndhb_vg_02 concurrent
    hdisk0 00f7c002d749ca50 rootvg active

    ## 1. Assume the V7000 at the DR site fails (the cluster keeps running); the PV goes into the missing state and cl_syncvg_vg cannot be used:
    sappr1:/# lsvg -p sapvg
    sapvg:
    PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
    hdisk1 active 1999 1799 400..200..399..400..400
    hdisk3 missing 1999 1799 400..200..399..400..400

    ## 2. After the V7000 at the DR site has been repaired and is back to normal, change the PV state from missing to removed:
    sappr1:/# chpv -v r hdisk3
    sappr1:/# lsvg -p sapvg
    sapvg:
    PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
    hdisk1 active 1999 1799 400..200..399..400..400
    hdisk3 removed 1999 1799 400..200..399..400..400

    ## 3. Next, change the PV state back to active:
    sappr1:/# chpv -v a hdisk3
    0516-1010 chpv: Warning, the physical volume hdisk3 has open logical
    volumes. Continuing with change.
    sappr1:/# lsvg -p sapvg
    sapvg:
    PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
    hdisk1 active 1999 1799 400..200..399..400..400
    hdisk3 active 1999 1799 400..200..399..400..400

    ## 4. The LV is now in a stale state (verify the stale PPs with # lsvg sapvg; here STALE PPs: 26):
    sapprd:/# lsvg -l sapvg
    sapvg:
    LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
    saplv_mp jfs2 200 400 2 open/stale /sapdata

    sappr1:/# lsvg sapvg
    VOLUME GROUP: sapvg VG IDENTIFIER: 00f7c00200004c000000013976ce373d
    VG STATE: active PP SIZE: 256 megabyte(s)
    VG PERMISSION: read/write TOTAL PPs: 3998 (1023488 megabytes)
    MAX LVs: 256 FREE PPs: 3598 (921088 megabytes)
    LVs: 1 USED PPs: 400 (102400 megabytes)
    OPEN LVs: 1 QUORUM: 2 (Enabled)
    TOTAL PVs: 2 VG DESCRIPTORS: 3
    STALE PVs: 1 STALE PPs: 26
    ACTIVE PVs: 2 AUTO ON: no
    Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
    VG Mode: Concurrent
    Node ID: 1 Active Nodes: 2
    MAX PPs per VG: 32768 MAX PVs: 1024
    LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
    HOT SPARE: no BB POLICY: relocatable
    MIRROR POOL STRICT: super
    PV RESTRICTION: none INFINITE RETRY: no
    ## 5. Sync the VG through C-SPOC while the resource group stays online (afterwards STALE PPs: 0):
    sapprd:/# smit cl_syncvg_vg

    Next>>>
    sapprd:/# lsvg -l sapvg
    sapvg:
    LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
    saplv_mp jfs2 200 400 2 open/syncd /sapdata
    sapprd:/# lsvg sapvg
    VOLUME GROUP: sapvg VG IDENTIFIER: 00f7c00200004c000000013976ce373d
    VG STATE: active PP SIZE: 256 megabyte(s)
    VG PERMISSION: read/write TOTAL PPs: 3998 (1023488 megabytes)
    MAX LVs: 256 FREE PPs: 3598 (921088 megabytes)
    LVs: 1 USED PPs: 400 (102400 megabytes)
    OPEN LVs: 1 QUORUM: 2 (Enabled)
    TOTAL PVs: 2 VG DESCRIPTORS: 3
    STALE PVs: 0 STALE PPs: 0
    ACTIVE PVs: 2 AUTO ON: no
    Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
    VG Mode: Concurrent
    Node ID: 1 Active Nodes: 2
    MAX PPs per VG: 32768 MAX PVs: 1024
    LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
    HOT SPARE: no BB POLICY: relocatable
    MIRROR POOL STRICT: super
    PV RESTRICTION: none INFINITE RETRY: no
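
    Putting the steps above together, a minimal recovery sketch (disk and VG names as above; step 4 shows the plain-AIX equivalent of the C-SPOC sync, but on a concurrent VG the smit cl_syncvg_vg path used above is the safer, cluster-coordinated route):

    # 1-2. after the DR storage returns, clear the missing state on the PV
    chpv -v r hdisk3          # mark the missing PV as removed
    chpv -v a hdisk3          # then bring it back to active
    # 3. confirm the PV is active and the LV shows open/stale
    lsvg -p sapvg
    lsvg -l sapvg
    # 4. resynchronize the stale partitions (plain-AIX equivalent of smit cl_syncvg_vg)
    syncvg -v sapvg
    # 5. verify that the stale counters are back to 0
    lsvg sapvg | grep -i stale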


  • 3.  Re: Please help me to check the PowerHA 6.1 cross-site LVM mirroring

    Posted Thu August 30, 2012 07:05 AM

    Originally posted by: OTIT


    Here is the /etc/snmpdv3.conf configuration that I have verified works with the clstat command.

    cp /etc/snmpdv3.conf /etc/snmpdv3.conf.orig
    vi /etc/snmpdv3.conf

    Edit the file and insert the following:
    #########################################################################################################

    VACM_GROUP group1 SNMPv1 public -

    VACM_VIEW defaultView internet - included -

    # exclude snmpv3 related MIBs from the default view
    VACM_VIEW defaultView snmpModules - excluded -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.4 - included -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.5 - included -

    # exclude aixmibd managed MIBs from the default view
    VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -

    VACM_ACCESS group1 - - noAuthNoPriv SNMPv1 defaultView - defaultView -

    NOTIFY notify1 traptag trap -

    TARGET_ADDRESS Target1 UDP 127.0.0.1 traptag trapparms1 - - -

    TARGET_PARAMETERS trapparms1 SNMPv1 SNMPv1 public noAuthNoPriv -

    COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0 -

    DEFAULT_SECURITY no-access - -

    logging file=/usr/tmp/snmpdv3.log enabled
    logging size=100000 level=0

    smux 1.3.6.1.4.1.2.3.1.2.1.2 gated_password

    smux 1.3.6.1.4.1.2.3.1.2.3.1.1 muxatmd_password
    smux 1.3.6.1.4.1.2.3.1.2.1.5 clsmuxpd_password
    #########################################################################################################

    stopsrc -s snmpd
    startsrc -s snmpd
    sleep 30
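
    After restarting snmpd, a quick verification sketch; the snmpinfo MIB dump and the one-shot clstat call assume a default PowerHA install, and restarting clinfoES is an extra step that may be needed so it re-registers with the new snmpd settings:

    # optional: restart clinfoES so it picks up the new snmpd configuration
    stopsrc -s clinfoES; startsrc -s clinfoES
    # dump the PowerHA cluster MIB to confirm snmpd is serving it
    snmpinfo -m dump -v -o /usr/es/sbin/cluster/hacmp.defs cluster
    # one-shot, non-interactive cluster status display
    /usr/es/sbin/cluster/clstat -o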


  • 4.  Re: Please help me to check the PowerHA 6.1 cross-site LVM mirroring

    Posted Wed December 12, 2012 02:51 PM

    Originally posted by: bodily


    Though you have already figured this out, here are my two comments.

    1) On your PVID warnings: it's odd, but it is normal when defining sites. Even replicated resources with Enterprise Edition give this warning, although I believe some additional intelligence to check UUIDs has been added.

    2) Your problem was that the disks were still in the missing state. You can simply run the appropriate varyonvg command or, as you did, run chpv; then you can sync them back up.
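
    For completeness, a sketch of the varyonvg route mentioned in point 2; the exact invocation depends on how the VG is varied on, and for an enhanced-concurrent VG under PowerHA control the C-SPOC / clvaryonvg path is normally preferred:

    # re-running varyonvg on an already-active VG attempts to bring missing PVs back online
    varyonvg sapvg
    # then resynchronize the stale copies
    syncvg -v sapvg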