Chatting with your LSF cluster by Slack

By Xun Pan posted Tue April 20, 2021 10:02 PM

  

Have you ever thought about talking to your LSF cluster through Slack on your cell phone? Now it is possible with the LSF ChatOps tool.


LSF ChatOps is an open-source tool that helps LSF administrators and users interact with LSF clusters from any Slack client. You can manage your LSF cluster, or your own jobs, by sending LSF commands directly from the Slack chat box. Here is a short story showing how lsfadmin and an LSF user might use LSF ChatOps.


"As an LSF user, I found that my job was still pending after a long wait. I created a Slack group with the LSF administrator and sent him a message."

user 10:00 pm

Hi lsfadmin, my jobs are pending because of a resource shortage. Is it possible to get more resources?



"lsfadmin adds the LSF-robot and creates a chat group."

lsfadmin 10:00 pm

!bjobs -user user


LSF-robot 10:00 pm

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
148     lsfadmi PEND  normal     bimbo1.fyre             *eep 10000 Apr 19 00:50
 Job slot limit reached: 1 host;
149     lsfadmi PEND  normal     bimbo1.fyre             *eep 10000 Apr 19 00:50
 Job slot limit reached: 1 host;
150     lsfadmi PEND  normal     bimbo1.fyre             *eep 10000 Apr 19 00:50
 Job slot limit reached: 1 host;


lsfadmin 10:00 pm

!bhosts


LSF-robot 10:00pm

HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
bimbo1.fyre.ibm.co closed          -      4      4      4      0      0      0

 

lsfadmin 10:01pm

@user, this is a test cluster. Let me just increase the number of job slots on the host so your jobs can run.


lsfadmin 10:01pm

!bconf update host=bimbo1 MXJ=8


LSF-robot 10:01pm

bconf: Request for update host <bimbo1> accepted


user 10:01pm

!bjobs


LSF-robot 10:01pm

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
148     lsfadmi RUN   normal     bimbo1.fyre bimbo1.fyre *eep 10000 Apr 19 00:50
149     lsfadmi RUN   normal     bimbo1.fyre bimbo1.fyre *eep 10000 Apr 19 00:50
150     lsfadmi RUN   normal     bimbo1.fyre bimbo1.fyre *eep 10000 Apr 19 00:50


user 10:01pm

@lsfadmin thank you for your help!

 

Besides basic LSF commands, you may also want job notifications. Suppose you submit a job in the lab and want Slack to notify you when it finishes. Just ask the LSF robot to register a job notification: `!register job-id @my-slack-account`. When your job finishes, the notification is sent to you right away.
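To make the idea concrete, here is a minimal sketch of how such a notification could work. It is not the plugin's actual implementation: the function names `query_status` and `wait_for_job`, the polling approach, and the 30-second interval are all assumptions made for illustration. The only LSF-specific piece is the `bjobs -noheader -o stat` call, which prints just the state of the job.

```python
import subprocess
import time

# Job states after which LSF will not run the job any further.
FINAL_STATES = {"DONE", "EXIT"}


def query_status(job_id):
    """Return the job state (for example PEND, RUN, DONE) reported by bjobs."""
    result = subprocess.run(
        ["bjobs", "-noheader", "-o", "stat", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_job(job_id, notify, get_status=query_status,
                 interval=30.0, sleep=time.sleep):
    """Poll until the job reaches a final state, then call notify(message).

    notify is any callable that delivers a text message, for example a
    function that posts to the Slack account given at registration time
    (hypothetical here). get_status and sleep are injectable so the loop
    can be exercised without a real cluster.
    """
    while True:
        state = get_status(job_id)
        if state in FINAL_STATES:
            notify(f"Job {job_id} finished with status {state}")
            return state
        sleep(interval)
```

For example, `wait_for_job(148, print)` would print a single line once job 148 is done or has exited. Inside a chat bot, a loop like this would have to run in the background so that the bot stays responsive to other messages.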

 

How does it work? LSF ChatOps is a plugin for `Errbot`, which supports multiple chat front ends. You need to run an `Errbot` instance with the LSF plugin installed on any LSF server host. Access rules can be configured in `config.json` to give the LSF plugin tighter access control.
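The flow can be sketched in plain Python. This is a simplified illustration, not the plugin's code. The `!` prefix comes from the chat session earlier in this post, but the `allowed_commands` layout used for the access rules is hypothetical and does not reflect the real `config.json` schema, so check the repository for the actual format.

```python
import json
import shlex
import subprocess

# Hypothetical access rules; the real config.json schema may differ.
EXAMPLE_RULES = json.loads("""
{
  "allowed_commands": {
    "lsfadmin": ["bjobs", "bhosts", "bconf"],
    "*": ["bjobs", "bhosts"]
  }
}
""")


def parse_chat_command(text, prefix="!"):
    """Split a chat message such as '!bhosts -w' into an argv list.

    Returns None when the message is ordinary chat, not a bot command.
    """
    text = text.strip()
    if not text.startswith(prefix):
        return None
    argv = shlex.split(text[len(prefix):])
    return argv or None


def is_allowed(sender, argv, rules):
    """Check the command name against the sender's allow-list."""
    allowed = rules["allowed_commands"]
    permitted = allowed.get(sender, allowed.get("*", []))
    return argv[0] in permitted


def handle_message(sender, text, rules, run=subprocess.run):
    """Return the bot's reply for one chat message, or None to stay silent."""
    argv = parse_chat_command(text)
    if argv is None:
        return None
    if not is_allowed(sender, argv, rules):
        return f"{sender} is not allowed to run {argv[0]}"
    # argv is passed as a list, so no shell interprets the chat text.
    result = run(argv, capture_output=True, text=True)
    return result.stdout or result.stderr
```

With these example rules, `handle_message("user", "!bconf update host=bimbo1 MXJ=8", EXAMPLE_RULES)` returns a refusal without running anything, while the same message from `lsfadmin` would be passed on to `bconf`. Checking the command name before execution and avoiding a shell are the two design choices worth keeping in any real deployment.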

Figure: LSF ChatOps software structure


You can now deploy LSF ChatOps by following the instructions in our open-source repository:
https://github.com/IBMSpectrumComputing/lsf-utils/tree/master/chatops/errbot

