Hi,
We are running one of our A100 nodes in a MIG setup and wanted to look at GPU resource usage with 'bacct -gpu -l JOBID'. All the GPU information we get looks like this (sorry for the formatting, but this editor doesn't allow indenting):
Host based accounting information about this job:
HOST CPU_T MEM SWAP
hostA 409.00 76M 0M
GPU ID: 259:1
Total Execution Time: -
Energy Consumed: -
SM Utilization (%): -
Memory Utilization (%): -
Max GPU Memory Used: -
GPU ID: 1
Total Execution Time: -
Energy Consumed: -
SM Utilization (%): -
Memory Utilization (%): -
Max GPU Memory Used: -
GPU ID: 149:10:n
Total Execution Time: -
Energy Consumed: -
SM Utilization (%): -
Memory Utilization (%): -
Max GPU Memory Used: -
GPU Energy Consumed: -
First of all: why do we see three GPUs when the job was running in only one CI? It also looks like we do not get any usage information. Are we missing something here?
------------------------------
Bernd Dammann
------------------------------