Using the new LSF Web Service

By Bill McMillan posted Fri May 03, 2024 04:12 AM

  

In the inaugural blog of this series, I explored the driving factors for this new service and the installation process. This second blog will delve into its usage, and the third and concluding blog will examine the OpenAPI specification of the service.


In the words of Mark Twain, “Supposing is good, but finding out is better”, so let’s just dive right in and look at the client-side CLI that we’ve provided:

billmc@billvm1:/home/billmc$ ls -l
-rwxr-xr-x  1 billmc   billmc    29876840 Apr 24 14:21 lsf

Indeed, your eyes are not misleading you; it is a single binary. We did aim for it to be lightweight and straightforward to install, marking a significant shift from the usual 100+ binaries found in $LSF_BINDIR.


billmc@billvm1:/home/billmc$ ./lsf version
version:  v0.0.1
commit:   0.0.1 Preview

But remember, we did say we also wanted to preserve the existing user experience – which is all still there under the covers:

billmc@billvm1:/home/billmc$ ./lsf 
NAME:
  lsf, lsf - Interact with LSF.

USAGE:
  lsf [global_options...] command [arguments...] [options...]

GLOBAL OPTIONS:
  --cluster value  Set a target LSF cluster by its name
                   Alias: -c value
  --env            Submit job with user's local environment variables
                   Alias: -e
COMMANDS:
  cluster, cl    Manage LSF clusters.
  config, conf   Manage configuration.
  file, f        Manage LSF File.
  help, h        Show help.
  version, v     Display the 'lsf' command-line interface version.
  $LSFCOMMAND    Execute LSF CLI command.

AVAILABLE LSFCOMMAND:
  bacct, bapp, bbot, bchkpnt, bconf, bentags, bgadd, bgdel, bgmod, bgpinfo, bhist, bhosts, bhpart, bjdepinfo, bjgroup, bjobs, bkill, blaunch, blimits, bmgroup, bmig, bmod, bparams, bpeek, bpost, bqueues, bread, brequeue, bresize, bresources, brestart, bresume, brlainfo, brsvadd, brsvdel, brsvmod, brsvs, brun, bsla, bslots, bstatus, bstop, bsub, bswitch, btop, bugroup, busers, lsacct, lsacctmrg, lsclusters, lseligible, lsrun, lsgrun, lshosts, lsid, lsinfo, lsload, lsloadadj, lsltasks, lspasswd, lsplace, lsrtasks, qdel, qsub
Enter 'lsf help command' for more information about a command.

As you can see, the majority of LSF commands are supported.   For those using IBM Cloud, this binary integrates directly with the “ibmcloud” command line framework: 

$ ibmcloud lsf 

NAME:
  lsf, lsf - Interact with LSF.

USAGE:
  lsf [global_options...] command [arguments...] [options...]
etc

The “lsf” client is currently available for Linux (x86, aarch64, ppc64le) and for 64-bit Windows. macOS on Arm will be added later, but you can always create your own, as I’ll explain in the third part of this series.

Before I explore this further, let’s take a quick look at the interaction between the client and the server.
 

Authentication

As mentioned in the previous blog, https is enabled by default.   So the first line in authentication is the client presenting a valid https certificate.

Next we need to actually log on to the service. The default method is username/password, but we can also use an API_KEY, which I will talk about in the final blog of this series. Today, LSF Application Center provides support for several popular Single Sign On solutions, and we will be adding these to the web service.

Assuming authentication is successful, the client is granted a unique access token which must be presented with every subsequent connection, meaning every single command sent to the LSF cluster is subject to an authentication check.
The use of HTTPS with verified SSL certificates (TLSv1.2) ensures a high level of security against man-in-the-middle and replay attacks. We have rigorously adhered to IBM's Security and Privacy by Design principles, conducting thorough static application security testing (SAST) and dynamic application security testing (DAST). However, as this is a web server capable of executing specific commands for authenticated users, caution is advised when making it accessible over the internet. It is strongly advised to position it behind the primary firewall and mandate a VPN for access.

Let’s go

Unlike the traditional LSF CLI, with this new “lsf” client we first need to authenticate to the service. In the first blog I turned off https during the installation (not recommended!), but it makes this example easier to follow as I don't need a client certificate. With https disabled, the service runs on port 8088 (8448 if https is enabled).


billmc@billvm1:/home/billmc$ ./lsf cluster logon --username billmc --url http://lwshostA:8088
Password> *********
OK

All good, so now we can execute LSF commands as normal:

billmc@billvm1:/home/billmc$ ./lsf lsid
IBM Spectrum LSF Standard 10.1.0.14, Apr 24 2023
Copyright International Business Machines Corp. 1992, 2016.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

My cluster name is clusterA
My master name is lsfmgrA


billmc@billvm1:/home/billmc$ ./lsf bsub sleep 100
Job <20770> is submitted to default queue <normal>.


billmc@billvm1:/home/billmc$ ./lsf bjobs 20770
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
20770   billmc  RUN   normal     lwshostA    lsfexecA1    sleep 100  Apr  9 15:56

As we can see, the “FROM_HOST” is the host where the LSF Web Service is running (lwshostA), not the machine where I ran the client (billvm1). We will be adding an option to display the FROM_HOST as the web service end point.
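
If you drive the client from a script, the job ID can be pulled out of the confirmation line that bsub prints. Here is a minimal Python sketch, assuming the message keeps the "Job <N> is submitted to ..." form seen in this post. The helper names parse_job_id and submit are mine, not part of the client, and submit assumes the "lsf" binary sits in the current directory and that you have already logged on.

```python
import re
import subprocess

# Matches the confirmation line printed after a submission, e.g.
# "Job <20770> is submitted to default queue <normal>."
_SUBMITTED = re.compile(r"Job <(\d+)> is submitted to")


def parse_job_id(bsub_output: str) -> int:
    """Return the numeric job ID found in bsub's confirmation message."""
    match = _SUBMITTED.search(bsub_output)
    if match is None:
        raise ValueError(f"unrecognised bsub output: {bsub_output!r}")
    return int(match.group(1))


def submit(command: list, lsf_binary: str = "./lsf") -> int:
    """Run '<lsf_binary> bsub <command...>' and return the new job ID.

    Both output streams are searched, because this sketch does not assume
    which one the client writes the confirmation to.
    """
    result = subprocess.run(
        [lsf_binary, "bsub", *command],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return parse_job_id(result.stdout + result.stderr)
```

With that in place, submit(["sleep", "100"]) would return the job ID as an integer, ready to pass to a later "bjobs" call.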

Command Execution

The above examples also illustrate that because the command is ultimately invoked “in” the cluster, all your existing esubs, wrappers and so on will still be invoked. This was a conscious design decision to ensure that the behaviour of the web service and the existing CLI is consistent. We didn't want users to be able to bypass any administrator-configured filters by using the web service directly. Furthermore, the commands that can be run are encoded into the client, and double-checked on the server side.
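
The double check on commands can be pictured as the same allowlist applied on both sides. The Python sketch below is only an illustration of that idea, not the service's actual code. The set holds a handful of entries from the AVAILABLE LSFCOMMAND list shown earlier, and is_allowed and server_side_check are hypothetical names.

```python
# A few entries from the AVAILABLE LSFCOMMAND list printed by './lsf'.
# The real client and server would carry the full list.
ALLOWED_COMMANDS = frozenset(
    {"bsub", "bjobs", "bkill", "bhosts", "bqueues", "bhist", "lsid", "lshosts", "lsload"}
)


def is_allowed(argv: list) -> bool:
    """Accept a request only if its first word is a known LSF command."""
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


def server_side_check(argv: list) -> list:
    """Repeat the check on the server rather than trusting the client's filter."""
    if not is_allowed(argv):
        raise PermissionError(f"command not permitted: {argv[:1]}")
    return argv
```

Checking again on the server matters because anyone can build their own client against the API, so the client-side list is a convenience rather than a control.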


And a one, and a two....

We did want this new interface to be able to do more than the existing CLI/API, such as to be able to communicate with multiple clusters.

I’ve already set up a second cluster, and that one has been installed with https enabled.  So before I can use it, I need to import the certificate into the “lsf” client.  The default certificate can be found in the server installation package.

billmc@billvm1:/home/billmc$ ./lsf config set --cacert cacert.pem
OK

I can now connect to the second cluster, on port 8448 this time:

billmc@billvm1:/home/billmc$ ./lsf cluster logon --username billmc --url https://lwshostB:8448
Password> ********
OK

So I’m now authenticated to both clusters:

billmc@billvm1:/home/billmc$ ./lsf cluster list
Default   Name        Version                                            URL
*         clusterA    IBM Spectrum LSF Standard 10.1.0.14, Apr 24 2023   http://lwshostA:8088
          clusterB    IBM Spectrum LSF Standard 10.1.0.14, Apr 24 2023   https://lwshostB:8448

The “conf” command allows you to specify a default cluster to use (if you are connected to more than one) and whether you want queries to be sent to all connected clusters. Settings are persisted in $HOME/.bluemix/plugins/lsf/config.json.

billmc@billvm1:/home/billmc$ ./lsf conf show
Key               Value
DefaultCluster    clusterA
DefaultQueryAll   true
Cacert            cacert.pem
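
For scripting, the two-column table above is easy to turn into a dictionary. This is a small Python sketch that parses the text exactly as printed in this post. It does not read config.json directly, since I haven't described that file's layout here. The name parse_conf_show is hypothetical.

```python
def parse_conf_show(text: str) -> dict:
    """Turn the two-column 'Key  Value' table into a dictionary."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2 or parts[0] == "Key":
            continue  # skip blank lines, the header row, and keys with no value
        settings[parts[0]] = parts[1].strip()
    return settings


SAMPLE = """\
Key               Value
DefaultCluster    clusterA
DefaultQueryAll   true
Cacert            cacert.pem
"""

settings = parse_conf_show(SAMPLE)
print(settings["DefaultCluster"])                      # clusterA
print(settings["DefaultQueryAll"].lower() == "true")   # True
```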

Now let’s submit two more jobs

billmc@billvm1:/home/billmc$ ./lsf bsub sleep 1000
Job <10101> is submitted to default queue <normal>.
billmc@billvm1:/home/billmc$ ./lsf --cluster clusterB bsub sleep 2000
Job <10102> is submitted to default queue <normal>.

As my DefaultCluster is clusterA, I didn’t need to specify --cluster for the first job; only the second job needed --cluster clusterB. I can query the jobs in both clusters:

billmc@billvm1:/home/billmc$ ./lsf bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
10101   billmc  PEND  normal     lwshostA                 sleep 1000  Apr 22 10:38
10102   billmc  PEND  normal     lwshostB                 sleep 2000  Apr 22 10:39

As DefaultQueryAll is set to true, bjobs will contact both clusters and get the job listing. I have LSF’s “Global ID” capability enabled in both clusters, ensuring jobs have unique IDs. This also means that I don’t need to specify the cluster when issuing a query, as the client will contact the appropriate cluster(s) automatically. If you are using multiple clusters, you don’t need to remember which job is in which cluster.

billmc@billvm1:/home/billmc$ ./lsf bjobs -l 10102
Job <10102>, User <billmc>, Project <default>, Status <RUN>, Queue <normal>, Command
                     <sleep 2000>, Share group charged </billmc>
Mon Apr 22 10:39:06: Submitted from host <lwshostB>, CWD <$HOME>;
Mon Apr 22 10:39:17: Started 1 Task(s) on Host(s) <lsfexecB1>, Allocated 1
                     Slot(s) on Host(s) <lsfexecB1>, Execution Home </home/billmc>, 
                     Execution CWD </home/billmc>;
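
The routing behaviour seen in these examples can be summarised in a few lines of Python. This is my reading of what the sessions above show, not the client's source. The names target_clusters and QUERY_COMMANDS are hypothetical, the set of commands treated as queries is an assumption, and the Global ID lookup of a specific job is not modelled.

```python
# Assumption: read-only commands such as these count as "queries".
QUERY_COMMANDS = frozenset({"bjobs", "bhosts", "bqueues"})


def target_clusters(command, connected, default_cluster, query_all, explicit=None):
    """Pick the cluster(s) a command is sent to.

    command         -- the LSF command name, e.g. "bsub"
    connected       -- names of the clusters we are logged on to
    default_cluster -- DefaultCluster from 'lsf conf show'
    query_all       -- DefaultQueryAll from 'lsf conf show'
    explicit        -- value of --cluster, if one was given
    """
    if explicit is not None:
        if explicit not in connected:
            raise ValueError(f"not logged on to cluster {explicit!r}")
        return [explicit]
    if query_all and command in QUERY_COMMANDS:
        return list(connected)
    return [default_cluster]


clusters = ["clusterA", "clusterB"]
print(target_clusters("bsub", clusters, "clusterA", True))              # ['clusterA']
print(target_clusters("bsub", clusters, "clusterA", True, "clusterB"))  # ['clusterB']
print(target_clusters("bjobs", clusters, "clusterA", True))             # ['clusterA', 'clusterB']
```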

Across the great divide

Obviously one of the key benefits of accessing the cluster via web services is that the client side can be virtually anywhere. As long as you can open a secure connection to the cluster, you can submit and manage your jobs.

Oh yes, data. Well, let's just assume that it's in the cluster already, which makes things nice and simple... and for many people that will be true. Companies will want to keep their IP secure within their firewalls and not want ingress or egress. But if you did... the LSF client has some options to manage data transfer over the same secure connection.

billmc@billvm1:/home/billmc$ ./lsf file
NAME:
  file, f - Manage LSF File.

USAGE:
  file COMMAND

COMMANDS:
  list, ls          List files from the LSF Web Services host.
  upload            Upload the file from the source to the destination on LSF Web Services host.
  download          Download the source file from the LSF Web Services host to the destination file.
  delete, rm        Delete the file from the LSF Web Services host.
  repository, repo  List all configured repositories from a specific LSF cluster.

The "repo" option shows where the web service administrator allows files to be uploaded to or downloaded from. In this case I am only allowed to transfer to and from my home directory; any other path will fail.

billmc@billvm1:/tmp$ ./lsf file repo

Name   Path
all    /home/billmc/

billmc@billvm1:/tmp$ ./lsf file upload input.file /home/lsfadmin/input.file
FAILED
Permission denied, cannot upload file to /home/lsfadmin/input.file
Run 'lsf file repository' to determine the available path.

billmc@billvm1:/tmp$ ./lsf file upload input.file /home/billmc/input.file

OK
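
The path restriction seen in the two uploads above boils down to a containment test. Here is a Python sketch of that test, assuming POSIX-style paths on the web service host. The helper in_repository is hypothetical, and a real server would also need to resolve symbolic links, which this sketch does not do.

```python
import posixpath


def in_repository(destination: str, repositories: dict) -> bool:
    """Return True if destination lies inside one of the allowed paths."""
    if not posixpath.isabs(destination):
        return False  # only absolute destinations are considered here
    target = posixpath.normpath(destination)  # collapses 'a/../b' segments
    for root in repositories.values():
        root = posixpath.normpath(root)
        # commonpath compares whole path components, so '/home/billmcx'
        # is not mistaken for a child of '/home/billmc'.
        if posixpath.commonpath([root, target]) == root:
            return True
    return False


repos = {"all": "/home/billmc/"}  # as printed by './lsf file repo'
print(in_repository("/home/billmc/input.file", repos))     # True
print(in_repository("/home/lsfadmin/input.file", repos))   # False
print(in_repository("/home/billmc/../lsfadmin/x", repos))  # False
```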


I trust this overview has provided a glimpse into the capabilities of the new client and its ability to directly submit and manage workloads across multiple clusters.  In the concluding part of this series, I will explore the direct use of the web service API and the creation of bindings for additional languages like Python.

Should you wish to access the package before the release of part three, you may request it here.

