High Performance Computing Group

  • 1.  LSF working with Conda environments

    Posted Fri May 31, 2019 09:03 AM

    Hi,

    I am trying to get LSF working with conda environments on a single node.

    When I run my script standalone, it works fine, and nvidia-smi shows it running on the GPU.

    But when I submit it with bsub, it fails:

    -------------
    (pytorch1.10) [root@newell1 benchmark-dso]# more out.txt
    THCudaCheck FAIL file=../aten/src/THC/THCGeneral.cpp line=52 error=100 : no CUDA-capable device is detected
    Traceback (most recent call last):
      File "./EDSR/src/main.py", line 35, in <module>
        main()
      File "./EDSR/src/main.py", line 25, in main
        _model = model.Model(args, checkpoint)
      File "/opt/benchmark-dso/EDSR/src/model/__init__.py", line 26, in __init__
        self.model = module.make_model(args).to(self.device)
      File "/opt/anaconda3/envs/pytorch1.10/lib/python3.6/site-packages/torch/nn/modules/module.py", line 386, in to
        return self._apply(convert)
      File "/opt/anaconda3/envs/pytorch1.10/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
        module._apply(fn)
      File "/opt/anaconda3/envs/pytorch1.10/lib/python3.6/site-packages/torch/nn/modules/module.py", line 199, in _apply
        param.data = fn(param.data)
      File "/opt/anaconda3/envs/pytorch1.10/lib/python3.6/site-packages/torch/nn/modules/module.py", line 384, in convert
        return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
      File "/opt/anaconda3/envs/pytorch1.10/lib/python3.6/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
        torch._C._cuda_init()
    ----------------------------

    What am I doing wrong? 



    ------------------------------
    GILBERT THOMAS
    ------------------------------

    #SpectrumComputingGroup


  • 2.  RE: LSF working with Conda environments

    Posted Mon June 03, 2019 01:53 AM
    I resolved it. I needed to request GPUs in my bsub command.
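
    For anyone landing here later, a minimal sketch of what such a submission might look like. The post does not say which option was actually used; this assumes the `-gpu "num=N"` form of bsub, which depends on the LSF version and configuration (older setups may need a resource requirement string instead). The helper name `build_bsub_command` and the output file name are illustrative, and the conda environment is assumed to be already activated in the submitting shell, as in the prompt shown in the first post.

```python
import shlex


def build_bsub_command(script, num_gpus=1, output_file="out.txt"):
    """Return an argv list for a bsub submission that requests GPUs.

    Hypothetical helper: it only assembles the command line and
    does not contact LSF.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "bsub",
        "-gpu", f"num={num_gpus}",  # assumption: -gpu syntax is enabled on the cluster
        "-o", output_file,          # file that receives the job's stdout/stderr
        "python", script,
    ]


if __name__ == "__main__":
    cmd = build_bsub_command("./EDSR/src/main.py")
    # Print the command for review; pass cmd to subprocess.run to submit it.
    print(shlex.join(cmd))
```

    Printing the command first makes it easy to check the GPU request before anything is submitted.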

    ------------------------------
    GILBERT THOMAS
    ------------------------------



  • 3.  RE: LSF working with Conda environments

    Posted Mon June 03, 2019 02:01 PM
    Based on your solution, I guess you have LSF GPU enforcement enabled.
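
    One way to check this from inside a job is a small diagnostic like the sketch below. It assumes that LSF exports `LSB_JOBID` to the job and that, for jobs that request GPUs, `CUDA_VISIBLE_DEVICES` is set to the allocated devices; whether that holds depends on the cluster configuration. The function name `gpu_visibility_report` is made up for illustration.

```python
import os


def gpu_visibility_report(environ=None):
    """Summarize the GPU-related environment a job actually sees.

    Hypothetical helper: pass a dict to inspect a captured
    environment, or nothing to inspect the current process.
    """
    env = os.environ if environ is None else environ
    visible = env.get("CUDA_VISIBLE_DEVICES")
    return {
        "lsf_job_id": env.get("LSB_JOBID"),  # None when not running under LSF
        "cuda_visible_devices": visible,     # None when the variable is unset
        "devices_listed": [d for d in (visible or "").split(",") if d],
    }


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
    try:
        import torch  # optional: only present inside the conda environment
        print("torch.cuda.is_available():", torch.cuda.is_available())
    except ImportError:
        print("torch is not installed in this environment")
```

    Running it once with a GPU request and once without should show whether the job sees any devices, which is consistent with the "no CUDA-capable device is detected" error in the first post.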

    ------------------------------
    YI SUN
    ------------------------------


