IBM Spectrum Computing Group

LSF with PowerAI docker from dockerhub

  • 1.  LSF with PowerAI docker from dockerhub

    Posted Wed February 26, 2020 10:28 AM


    I can do this without issues.
    $docker run -ti --env LICENSE=yes ibmcom/powerai:1.7.0-snap-ml-ubuntu18.04-py36-x86_64 bash

    I have been trying to get this working via LSF.

    I followed the steps and configured Docker in lsb.applications:

    Begin Application
    NAME = powerai
    DESCRIPTION = Example PowerAI application
    CONTAINER = docker[image( \
    options(--rm --net=host --ipc=host --env LICENSE=yes \
    -v /opt/mldl:/opt/mldl \
    /opt/mldl/scripts/ \
    ) starter(root) ]
    EXEC_DRIVER = context[user(gilbert)] \
    starter[/opt/ibm/lsf/10.1/linux3.10-glibc2.17-x86_64/etc/] \
    controller[/opt/ibm/lsf/10.1/linux3.10-glibc2.17-x86_64/etc/] \
    End Application

    I also set up LSF as specified here -

    But when I submit a job with the command below, it exits immediately with exit code 127, as the bjobs output shows:

    $bsub -Is -app powerai bash

    (base) gilbert@gilbert-aa:/opt/mldl/scripts$ bjobs -d -l 418

    Job <418>, User <gilbert>, Project <default>, Application <powerai>, Status <EX
    IT>, Queue <interactive>, Interactive pseudo-terminal shel
    l mode, Command <bash>, Share group charged </gilbert>
    Wed Feb 26 23:02:30: Submitted from host <gilbert-aa>, CWD </opt/mldl/scripts>;
    Wed Feb 26 23:02:30: Started 1 Task(s) on Host(s) <gilbert-aa>, Allocated 1 Slo
    t(s) on Host(s) <gilbert-aa>;
    Wed Feb 26 23:02:30: Exited with exit code 127. The CPU time used is 0.0 second
    Wed Feb 26 23:02:30: Completed <exit>.

                r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
    loadSched      -     -     -     -     -     -     -     -     -     -     -
    loadStop       -     -     -     -     -     -     -     -     -     -     -

    Combined: select[(defined(docker)) && (type == any)] order[r15s:pg]
    Effective: select[(defined(docker)) && (type == any)] order[r15s:pg]

    What am I doing wrong? 

    gilbert is also the LSF admin.
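
A note on the exit code reported above: in POSIX shells, status 127 conventionally means the command could not be found, and `docker run` documents the same code for a container command that cannot be found. That suggests something in the launch path was not found, though the bjobs output alone does not say what. A minimal demonstration in plain shell, unrelated to LSF itself (the command name is made up so that it does not exist):

```shell
# Run a command that does not exist and capture its exit status.
# "no_such_command_for_demo" is a hypothetical name chosen so it is not on PATH.
status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?

# The shell reports "command not found" as status 127.
echo "exit status: $status"
```

Seeing the same code from an LSF job is a hint to check every path and command the application profile refers to.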


  • 2.  RE: LSF with PowerAI docker from dockerhub

    Posted Wed February 26, 2020 12:38 PM
    Be sure to perform the setup step related to the EXEC_DRIVER at this URL:

    Also try the docker run command as lsfadmin:

     docker run -ti bash

    1) lsfadmin must be able to run Docker commands on each machine with Docker. Consult the Docker documentation on Managing Docker as a non-root user at the following website:

    2) Check and make sure the docker scripts in $LSF_SERVERDIR are owned by lsfadmin and have permission 700 or 500:
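
The ownership and permission check in step 2 could be scripted roughly as follows. This is only a sketch: `check_owner_mode` is a hypothetical helper, not part of LSF, and it assumes a `stat` that supports the GNU-style `-c` format option. The script names under $LSF_SERVERDIR are not given in this thread, so the path is passed in as an argument.

```shell
# Hypothetical helper: succeed only if FILE is owned by OWNER
# and its permission bits are exactly 700 or 500.
# Assumes GNU-style `stat -c`; BSD/macOS stat uses different flags.
check_owner_mode() {
  file=$1
  owner=$2

  # Read the owner name and octal mode; return 2 if stat itself fails.
  actual_owner=$(stat -c '%U' "$file") || return 2
  actual_mode=$(stat -c '%a' "$file") || return 2

  if [ "$actual_owner" != "$owner" ]; then
    echo "$file: owned by $actual_owner, expected $owner" >&2
    return 1
  fi

  case "$actual_mode" in
    700|500) return 0 ;;
    *)
      echo "$file: mode $actual_mode, expected 700 or 500" >&2
      return 1
      ;;
  esac
}
```

For example, `check_owner_mode "$LSF_SERVERDIR/<script>" lsfadmin` would print a message and return non-zero if the owner or mode does not match; substitute the real script name and your LSF administrator account.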


    John Welch

  • 3.  RE: LSF with PowerAI docker from dockerhub

    Posted Thu February 27, 2020 01:25 PM
    You need an "@" in front of the script path:
    @/opt/mldl/scripts/ \
    Also, you can try simply mounting /etc/passwd and /etc/group into the container.
    For example:
    -v /etc/passwd:/etc/passwd \
    -v /etc/group:/etc/group \
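
As a sketch of how the two suggestions above might fit together: as I understand the LSF documentation, a script named with "@" in options() is run before the container starts and its output is used as extra docker options. Under that assumption, such a script could emit the bind mounts itself. The function name `emit_mount_options` is hypothetical, and this is not the script referenced in the original configuration.

```shell
#!/bin/sh
# Hypothetical helper: print one "-v path:path" bind-mount option
# for each argument that exists on this host, separated by spaces.
emit_mount_options() {
  opts=""
  for f in "$@"; do
    # Skip missing files so docker is not handed a mount source that is absent.
    if [ -e "$f" ]; then
      opts="${opts:+$opts }-v $f:$f"
    fi
  done
  printf '%s\n' "$opts"
}

# Emit mounts for the two files mentioned above.
emit_mount_options /etc/passwd /etc/group
```

Generating the options from a script rather than hard-coding them keeps the application profile short, at the cost of one more file whose ownership and permissions have to be right.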

    For more details on the subject, please see this article/blog:

    John Welch