Problems with verify or cspoc communication

1. Problems with verify or cspoc communication

Posted Fri April 10, 2009 11:54 AM

Originally posted by: Casey_B

I am having a problem with verification or cspoc, with messages that indicate communication errors,
like the following:

rshexec: cannot connect to node NodeA

Or

WARNING: Unable to communicate with the remote node: NodeA.
Please check that node: hacmp11 has the /usr/es/sbin/cluster/etc/rhosts
file configured and the clcomdES subsystem running.

What could be causing this?
2. Re: Problems with verify or cspoc communication

Posted Fri April 10, 2009 11:58 AM

Originally posted by: Casey_B

Cluster communication is handled by the clcomd daemon (the clcomdES subsystem).
This includes all of the cspoc commands as well as the verification
commands.

clcomd uses a cluster-specific, host-based access method.
It also provides a reliable way of contacting
other nodes even after the topology has changed
because of fallovers or IP label moves.

For more details, see the following
section in the manual:

Understanding the /usr/es/sbin/cluster/etc/rhosts file
in the HACMP Administration Guide for 5.5 or 5.4.

First some background:

The general flow of operation is that when node A
wants to talk to node B, it sends
an ICMP echo request (a "ping")
to all of the addresses on node B
that should be node bound.

This includes all of the boot addresses, standby addresses,
and the communication path configured when the node was added.

Node A then listens for a response. Whichever IP label
responds first is used for initiating communication.

A connection request is then sent to the chosen IP label on node B.
Node B checks whether the source address used by node A is in its
access-allowed list (which is in /usr/es/sbin/cluster/etc/rhosts).
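
The selection step described above can be sketched in shell. This is only an illustration, not how clcomd is implemented: the function names probe and first_responder are hypothetical, and the loop tries the candidates one at a time instead of pinging them all and taking whichever answers first.

```shell
# Hypothetical helpers for illustration; not part of HACMP or clcomd.

# probe ADDR: succeed if ADDR answers a single ICMP echo request.
probe() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# first_responder ADDR...: print the first candidate address that
# answers, trying them in the order given. Returns 1 if none answer.
first_responder() {
    for addr in "$@"; do
        if probe "$addr"; then
            printf '%s\n' "$addr"
            return 0
        fi
    done
    return 1
}
```

For example, running first_responder 192.0.2.11 192.0.2.12 on node A (placeholder addresses standing in for node B's boot addresses and communication path) prints the first address that replies. If nothing is printed, none of the candidate addresses answered a ping from this node, which is worth sorting out before looking at the rhosts file.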

So, here are some ideas for approaching this problem, along with common configuration
problems:

1) If there is an IP label that is not defined as part of the cluster but is on the
same subnet as any of the cluster IP labels, it could reply to the ICMP message
or be the source address for the connection request.

It is safer to make sure that all IP labels that could be used to talk between
the nodes are listed in ../cluster/etc/rhosts.

2) The permissions are wrong on ../cluster/etc/rhosts.
They need to be mode 600, with owner root and group system (root.system).

3) The communication path is not node bound, for instance because it is a service label.
A service label is not guaranteed to be on a
particular node. If it is on the wrong node, most communication to the
node associated with that communication path will fail.

4) One workaround is to truncate /usr/es/sbin/cluster/etc/rhosts on
a node and restart clcomd. (A file that exists but is zero length is
interpreted to mean that the cluster is in initial configuration, and clcomd will
accept incoming connection requests from any IP label.)

This is different from removing the file! A nonexistent file tells
clcomdES not to accept any incoming connections.

To restart clcomd:

stopsrc -s clcomdES
startsrc -s clcomdES

This may not work if the cluster is at HACMP 5.4.1 or above and you have
changed the cluster topology.
The reason is that if there is a configured cluster, the ODM entries are checked
even if there is no ../cluster/etc/rhosts information.
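
The file checks from the list above can be sketched as two small shell functions. Both names (rhosts_perms_ok and rhosts_decision) are hypothetical, and the sketch only inspects a file you point it at. It assumes one address per line, matches the address exactly, and ignores the ODM check mentioned for HACMP 5.4.1 and above, so treat its answer as a hint about the file rather than as what clcomd will actually decide.

```shell
# Hypothetical helpers for illustration; they only read a file and
# never talk to clcomd.

# rhosts_perms_ok FILE [OWNER] [GROUP]
# Succeed if FILE is a regular file with mode 600 and the expected
# owner and group (root and system unless told otherwise).
rhosts_perms_ok() {
    file=$1 want_owner=${2:-root} want_group=${3:-system}
    [ -f "$file" ] || return 1
    # ls -ld prints: mode links owner group ...
    # Keep only the 10 mode characters so ACL markers are ignored.
    perms=$(ls -ld "$file" | awk '{print substr($1, 1, 10), $3, $4}')
    case $perms in
        "-rw------- $want_owner $want_group") return 0 ;;
    esac
    return 1
}

# rhosts_decision FILE ADDR
# Print what the rules described in this post imply for ADDR:
#   missing file     -> deny-missing
#   zero-length file -> allow-empty (initial configuration)
#   ADDR on a line   -> allow-listed
#   otherwise        -> deny-unlisted
rhosts_decision() {
    file=$1 addr=$2
    if [ ! -e "$file" ]; then
        echo deny-missing; return 1
    elif [ ! -s "$file" ]; then
        echo allow-empty; return 0
    elif grep -Fqx -- "$addr" "$file"; then
        echo allow-listed; return 0
    fi
    echo deny-unlisted; return 1
}
```

On a cluster node you would run something like rhosts_perms_ok /usr/es/sbin/cluster/etc/rhosts, and then rhosts_decision with the same path and each address the other node might use as its source address. On AIX, lssrc -s clcomdES shows whether the subsystem itself is active.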

Has anyone else seen any other similar problems?

If none of these hints help you, then I would suggest calling IBM support.
They would love to help you. :)

Hope this helps,
Casey
3. Re: Problems with verify or cspoc communication

Posted Thu July 14, 2016 08:31 AM

Originally posted by: UQ6M_Hubert_Samm

Let me throw my 2 cents in. We recently upgraded to PowerHA 7.2.0 SP1, which now throws WARNINGS about kernel parameters. According to IBM, if you go into customized verification and choose Details=NO, you should not get these messages. Well, that doesn't work at all; the messages are still there. Plus, the parameters that the warnings are raised on are STATIC and cannot be changed. I.M.H.O. PowerHA has enough complexity, so why clutter it up with things like this, which we cannot control?
