High Performance Computing Group


RSH vs SSH within the cluster

  • 1.  RSH vs SSH within the cluster

    Posted Mon April 11, 2016 11:39 AM

    Originally posted by: thloeber


    The Platform Conductor documentation says that either RSH or SSH is required for remote start, stop and restart from the master host to the other hosts in the cluster.  I have added a compute host to my cluster and the egoconfig join command succeeded.  We also have passwordless SSH enabled for egoadmin between the master host and the compute host.

    Everything appeared to install and configure correctly, but when I run egosh resource list on the master host I see only the one host. The PMC web UI also shows only one host.

    I also get this error when trying to start all hosts in the cluster from the master:

    [egoadmin@scmonspark01 ~]$ egosh ego start all
    Do you really want to start up LIM on all hosts ? [y/n]y
    Start up LIM on <scmonspark01.raleigh.ibm.com> ...... rsh: No such file or directory
    rsh failed; please ensure correct operation of rsh.

    scmonspark01 is our master host.

    My question: will ssh work for remote commands, or is rsh required?  The Managing Your Cluster section in the doc says you must use rsh to remotely start the hosts in the cluster.


    #SpectrumComputingGroup


  • 2.  Re: RSH vs SSH within the cluster

    Posted Tue April 19, 2016 04:00 PM

    Originally posted by: 6K3S_John_Nguyen


    Hi Thomas,

    Thanks for your question.

    You see only one entry (the master host) in the output of 'egosh resource list' on the master host because EGO has not yet been started on the compute hosts for the first time.  You will need to run 'egosh ego start' manually on each compute host.  This starts a daemon called LIM, which then communicates with the LIM running on the master host and dynamically joins the compute host to the cluster.  Running 'egoconfig join <master_host>' only configured the compute host to communicate with the master host; the actual communication had not taken place yet.  After you run 'egosh ego start' on each compute host, you can periodically run 'egosh resource list' and watch for additional host entries in the output.  Once a compute host's entry appears, you will be able to remotely start and stop that host from the master host.

    Starting and stopping hosts remotely requires that 'rsh' is set up.  From the Knowledge Center:
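
    As a rough sketch of the first-time startup described above: because passwordless SSH is already enabled for egoadmin, one could log in to each compute host over SSH and run 'egosh ego start' there, then check 'egosh resource list' on the master.  This is not EGO's own remote start mechanism, just a plain SSH loop.  The host names compute01 and compute02 are hypothetical, and the sketch assumes that the EGO environment is loaded for non-interactive shells on the compute hosts and that egoadmin is permitted to start EGO there.  By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: start EGO locally on each compute host over SSH, then list hosts.
# COMPUTE_HOSTS holds hypothetical host names; replace with your own.
# DRY_RUN=1 (the default) prints the commands instead of running them.
COMPUTE_HOSTS="${COMPUTE_HOSTS:-compute01 compute02}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Start the local LIM on every compute host so it can join the cluster.
start_compute_hosts() {
    for host in $COMPUTE_HOSTS; do
        run ssh "egoadmin@$host" "egosh ego start"
    done
}

start_compute_hosts

# On the master host: the new hosts should appear here after a short while.
run egosh resource list
```

    Set DRY_RUN=0 once the printed commands look right for your cluster.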

    "You must log on with root permissions and rsh on all hosts in the cluster. To start a host specified by name, you must be able to run rsh across all hosts in the cluster without having to enter a password; see your operating system documentation for information about configuring rsh"

    Here's some additional info on it from the official documentation:

    https://www.ibm.com/support/knowledgecenter/SSVH2B_1.1.0/management_sym/host_management_win.dita
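
    The "rsh: No such file or directory" message in the original post may simply mean that no rsh client is installed on the master host.  A minimal sketch for checking that, using only standard shell features, is below.  The helper name check_client and the host name in the commented-out line are illustrative assumptions, not part of EGO.

```shell
#!/bin/sh
# Report whether a given remote-shell client is available on this host.
check_client() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

check_client rsh || true
check_client ssh || true

# Once rsh is installed and configured, a passwordless check against a
# hypothetical compute host could look like this (commented out on purpose):
# rsh compute01 hostname
```

    If rsh is reported as not found, installing and configuring it per your operating system documentation is the first step before retrying 'egosh ego start all'.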


    #SpectrumComputingGroup