AIX

  • 1.  HACMP - network failure testing

    Posted Wed October 18, 2006 11:34 AM

    Originally posted by: SystemAdmin


    While testing my HACMP configuration, I'm trying to simulate a full network failure on a node. The node is an LPAR with virtual Ethernet connections set up with 4 virtual I/O adapters split across two EtherChannel devices. When I run ifconfig down on both EtherChannel adapters, they are recovered and brought back up. When I run chdev -l <adapter> -a state=down on both adapters, they stay down. However, the resources never fail over, and clstat shows the service IP interface as UP (boot and standby are correctly shown as down). ifconfig shows both interfaces as down.

    Using: IPAT (no aliases)
    SAN disk heartbeat
    Startup - online on home node only
    AIX 5.3
    HACMP 5.3
    Two resource groups, one is home on each node

    All other tests so far are fine.
    On a related note, how can I test failure of the disk heartbeat without affecting other LPARs on the server?


  • 2.  Re: HACMP - network failure testing

    Posted Thu October 19, 2006 10:27 AM

    Originally posted by: SystemAdmin


    FYI - after a call to IBM (and talking to a confused person who finally got an answer from someone else), the issue is virtual Ethernet adapters in an LPAR. Apparently HACMP cannot correctly identify this type of "failure" when using virtual Ethernet adapters, and you must use IPAT with aliases. The documentation is a bit fuzzy to me: it refers to virtual Ethernet as VLAN, but not VLAN at the physical network level. You can also do VLANs on a p5 using virtual Ethernet, which is what I took it to mean. Either way, you can't really test it using ifconfig down or chdev because of the way the LPAR communicates with HACMP. HACMP will not show the adapter as down, even though bringing down the adapter that holds the service IP does cause the IP to move to the standby adapter. Bringing down both adapters with ifconfig causes them to recover and come back online. Even worse, if you chdev them down and they come back up, ifconfig shows them up but HACMP verification shows them down (they are still down in the ODM). Without aliases (I'm just now starting testing after converting to aliases), clstat can show the service IP as up, or even service and boot as up, when all adapters are chdev'd down and no networking is active. Really irritating.

    Her answer is "that's the way it works" and that a true network failure will work correctly. I'm a bit uneasy about this answer but after a couple hours on the phone and her speaking with someone else, it's the only answer I could get.
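For anyone who wants to repeat the test described above, here is a rough sketch of the steps as a script. It is only an illustration: the interface names en4 and en5 are hypothetical placeholders for the two EtherChannel interfaces, and the DRY_RUN switch and function names are my own additions. By default it only prints the commands it would run. The point is to compare the running state (ifconfig) with the state stored in the ODM (lsattr), since those two disagreed in the tests above.

```shell
#!/bin/sh
# Sketch only. en4/en5 are hypothetical interface names; substitute your own.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
IFACES="${IFACES:-en4 en5}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark each interface down with chdev, which also records the state in the ODM.
take_down() {
    for ifc in $IFACES; do
        run chdev -l "$ifc" -a state=down
    done
}

# Show the running state and the ODM state side by side for each interface.
show_state() {
    for ifc in $IFACES; do
        run ifconfig "$ifc"
        run lsattr -El "$ifc" -a state
    done
}

take_down
show_state
```

Setting DRY_RUN=0 would run the commands for real, so only do that on a test node with console access, since it takes the node off the network. Compare the output against what clstat reports on the other node.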


  • 3.  Re: HACMP - network failure testing

    Posted Thu October 19, 2006 10:29 AM

    Originally posted by: SystemAdmin


    Almost forgot. They claim there is no way to test heartbeat failure short of pulling the SAN connections, which isn't going to happen with other production LPARs on this p5.

    Hope that helps someone (or someone has a better answer for heartbeat OR the networking stuff).