High Performance Computing Group

  • 1.  about 10.1.0.14 version

    Posted Fri March 01, 2024 04:35 AM

    I installed LSF with the new installation package and see that there is now a child service file for each LSF daemon (lsfd-lim.service, lsfd-res.service, and lsfd-sbd.service), so that each daemon is handled separately.

    LSF has been upgraded to the latest version, 10.1.0.14.

    1. Is it possible to go back to the old setup, where we had only one service, lsfd.service? Please help here.

    2. Also, badmin reconfig intermittently throws the message below:

    getClusterData: ls_getclustername() failed. LIM is down; try later.
    minit: getClusterData() failed. LIM is down; try later.
    main(): Failed to contact LIM: LIM is down; try later; quit master



    ------------------------------
    Nitin Gizare
    ------------------------------


  • 2.  RE: about 10.1.0.14 version

    Posted Fri March 01, 2024 03:10 PM

    LSF 10.1.0.14 adds lsfd-lim, lsfd-res, and lsfd-sbatchd service unit files to help automatically restart lim/res/sbatchd if they exit unexpectedly. However, this implementation does not seem to be compatible with LSF daemon management through the lsadmin/badmin/bctrld commands. You may want to install the following patch, which eliminates the lsfd-lim, lsfd-res, and lsfd-sbatchd unit files and works properly with lsadmin/badmin/bctrld.

    http://www.ibm.com/support/fixcentral/swg/selectFixes?product=ibm/Other+software/IBM+Spectrum+LSF&release=All&platform=All&function=fixId&fixids=lsf-10.1-build601849&includeSupersedes=0
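
    Until the patch can be installed, one possible stopgap for the intermittent "LIM is down; try later" failure is to retry the command after a short pause. The sketch below is only an illustration: retry_cmd is a hypothetical helper (not part of LSF), and it assumes that badmin reconfig exits with a non-zero status when it cannot contact LIM, which should be verified on your cluster. It does not address the underlying unit file incompatibility.

```shell
#!/bin/sh
# retry_cmd MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0 or MAX_TRIES is reached.
# Assumption: the wrapped command signals failure through its exit status.
retry_cmd() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # Run the command exactly as given; stop at the first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_tries failed: $*" >&2
        # Pause only if another attempt is still to come.
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (commented out so the sketch has no side effects when sourced):
# retry_cmd 5 10 badmin reconfig
```

    The try count and delay in the example are arbitrary; choose values that give LIM enough time to come back up in your environment.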



    ------------------------------
    YI SUN
    ------------------------------