High Performance Computing Group

  • 1.  GPUs: MIG mode and accounting information

    Posted 16 days ago

    Hi,

    to increase the throughput on our GPU cluster, we run some GPUs with 'MIG mode' enabled.  When it comes to accounting, we face some problems here:

    • when using MIG mode, the GPU usage information of a job, i.e. energy and memory usage, is lost (we are aware that this is a limitation in Nvidia's DCGM implementation)
    • we lack a simple way to extract information about the assigned MIG in bacct (or via the C API), so that we can tell the user that the job didn't run on a full GPU, but only on a part of it.  If we were to do "real" billing, users would probably not want to pay for a full H100 if their jobs ran on only 1/7 of it, or on some other partition size!

    We are aware that we can get some information from bacct via the '-gpu' option, e.g. per-task information like this:

    GPU_ALLOCATION:
     HOST             TASK GPU_ID  GI_PLACEMENT/SIZE    CI_PLACEMENT/SIZE    MODEL        MTOTAL  FACTOR MRSV    SOCKET NVLINK/XGMI
     hostA            0    0       4/3                  4/3                  NVIDIAH100PC 79.6G   9.0    0M      0      -

    We don't have a 'JSON' option for bacct, and parsing the line-based output above can be difficult.
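
    As a rough illustration of the parsing problem, here is a minimal C sketch that pulls the leading columns out of one GPU_ALLOCATION row with sscanf. The struct and function names (gpu_alloc_row, parse_gpu_alloc_row) are hypothetical, and the column layout is assumed from the single sample row above only; rows without MIG values in the GI/CI columns would simply be rejected.

```c
#include <stdio.h>

/* Hypothetical record for the leading columns of one GPU_ALLOCATION row.
 * The layout is assumed from the sample row in this thread, not from
 * any LSF documentation. */
struct gpu_alloc_row {
    char host[64];
    int  task;
    int  gpu_id;
    int  gi_placement, gi_size;   /* GI_PLACEMENT/SIZE column */
    int  ci_placement, ci_size;   /* CI_PLACEMENT/SIZE column */
    char model[64];
};

/* Parse one line of the table. Returns 0 if all eight leading fields
 * were read, -1 otherwise (header line, non-MIG row, malformed line). */
int parse_gpu_alloc_row(const char *line, struct gpu_alloc_row *row)
{
    int n = sscanf(line, "%63s %d %d %d/%d %d/%d %63s",
                   row->host, &row->task, &row->gpu_id,
                   &row->gi_placement, &row->gi_size,
                   &row->ci_placement, &row->ci_size,
                   row->model);
    return n == 8 ? 0 : -1;
}
```

    The header line fails at the TASK column and is rejected, so the function can be applied to every line following GPU_ALLOCATION:. This stays fragile, since any change in column order or spacing conventions breaks it, which is why structured output or direct API access would be preferable.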

    We have our own in-house implementation of 'bacct', though, that produces 'JSON' output, which we can then feed into our accounting setup.  However, there is no API documentation on how to access this GPU/MIG information via the C API!

    BTW, it would also be nice to get this information while the job is running, i.e. via bjobs!

    Is anybody else having this issue?  We can't be the only HPC site using MIG mode with LSF and having this problem.

    Any hints from the IBM LSF experts/developers on how to access this information via the C API, or where that is documented?

    Thanks!



    ------------------------------
    Bernd Dammann
    ------------------------------


  • 2.  RE: GPUs: MIG mode and accounting information

    Posted 15 days ago

    Regarding bjobs and the C API, you may look into struct jobinfoEnt->(struct extJobInfoEnt). Details can be found in lsbatch.h, shipped with the LSF package.



    ------------------------------
    YI SUN
    ------------------------------



  • 3.  RE: GPUs: MIG mode and accounting information

    Posted 15 days ago

    Thanks!  I have already looked at the lsbatch.h file, and dug my way through the nested structures.  I also found two variables in the migJobInfo struct, giIdSize and ciIdSize, but when I print them, their values are sometimes in the range of 1025-1028, and not in the range I would expect, e.g. when comparing with the bjobs/bhosts output.  Is there some mask that needs to be applied?  That's not at all clear from looking at the header file alone!



    ------------------------------
    Bernd Dammann
    ------------------------------



  • 4.  RE: GPUs: MIG mode and accounting information

    Posted 14 days ago

    Try converting the data this way, e.g.

    (giIdSize >> 8) & 0xFF



    ------------------------------
    YI SUN
    ------------------------------



  • 5.  RE: GPUs: MIG mode and accounting information

    Posted 13 days ago

    Thanks!  That did the trick!  ">>8 & 0xFF" for the placement, and "& 0xFF" for the size, respectively! 
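
    A minimal C sketch of that decoding, assuming the layout described in this thread: placement in bits 8-15 and size in the low 8 bits of giIdSize / ciIdSize. The layout is inferred from the replies above rather than from official documentation, and the names mig_slice, decode_mig_field and format_mig_field are hypothetical helpers, not part of the LSF API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical decoded view of a packed giIdSize / ciIdSize value. */
struct mig_slice {
    unsigned placement;   /* bits 8..15 of the packed value */
    unsigned size;        /* bits 0..7 of the packed value  */
};

/* Split a packed value into placement and size, using the masks
 * quoted in this thread. */
struct mig_slice decode_mig_field(int packed)
{
    struct mig_slice s;
    s.placement = ((unsigned)packed >> 8) & 0xFFu;
    s.size      = (unsigned)packed & 0xFFu;
    return s;
}

/* Render the value as "placement/size", the form used in the
 * GI_PLACEMENT/SIZE column of 'bacct -gpu'. Returns snprintf's result. */
int format_mig_field(int packed, char *buf, size_t len)
{
    assert(buf != NULL);
    struct mig_slice s = decode_mig_field(packed);
    return snprintf(buf, len, "%u/%u", s.placement, s.size);
}
```

    With this layout, a raw value of 1027 (0x403) decodes to placement 4 and size 3, i.e. "4/3", and the whole 1025-1028 range corresponds to placement 4 with sizes 1 to 4.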

    Now we can get information about the MIG sizes at runtime.  We still lack this information in the accounting, but can probably use the EFFECTIVE_GPU_REQ field in the accounting information to get it.  So far, so good - but what we really lack is more detailed information, like GPU memory usage, power consumption, etc., when in MIG mode.  The latter might be an issue, as Nvidia doesn't provide job-based power usage for MIGs, but what about the memory usage?  This is accessible via nvidia-smi, and thus via libnvml, too.  Can this be added as a feature to LSF?



    ------------------------------
    Bernd Dammann
    ------------------------------



  • 6.  RE: GPUs: MIG mode and accounting information

    Posted 12 days ago

    LSF can collect GPU job memory usage and power consumption through its DCGM integration. But it seems Nvidia doesn't yet have a workable solution for MIG in DCGM mode.



    ------------------------------
    YI SUN
    ------------------------------