Jobs pending despite requested resources being available

  • 1.  Jobs pending despite requested resources being available

    Posted Wed March 10, 2021 02:26 PM

    Hi,

    I'm running Spectrum LSF Community Ed 10.1.0 and a number of my users are seeing their jobs pending (often for an hour or more) despite the requested resources (typically memory) being available on one or more hosts in the cluster.

    Running bjobs -lp on one of the affected job IDs just shows the following:

    PENDING REASONS:

    Job requirements for reserving resource (mem) not satisfied: 4 hosts;

    despite sufficient memory being available on a number of hosts (determined as physical RAM minus max(RAM in use, sum of LSF requested RAM)) to satisfy their resource requirement.

    How can I debug the issue further? The vast majority of jobs submitted with resource requirements run correctly with little or no time spent pending. It's a seemingly sporadic issue that I'm struggling to diagnose.

    Any help would be appreciated,

    Mike


    #SupportMigration
    #Spectrum
    #Support


  • 2.  RE: Jobs pending despite requested resources being available

    Posted Wed March 10, 2021 04:08 PM

    One possibility is that a running job requested memory with rusage[mem=n] in its resource requirement. In that case LSF reserves memory (the requested amount minus the amount currently in use) for the running job, even though the job is not currently consuming that much. Reserved memory cannot be used by other jobs. You can run bhosts -l <host name> to check whether memory is reserved on the host and how much memory is actually available for scheduling.
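
    To make the arithmetic concrete, here is a minimal Python sketch comparing the aggregate estimate from the original post with a per-job reservation model based on the description above. It is a simplified illustration, not LSF's actual scheduling logic: the Job class, the function names and the sample numbers are all hypothetical, and it assumes that all memory in use on the host belongs to LSF jobs.

```python
from dataclasses import dataclass


@dataclass
class Job:
    requested_mb: int  # hypothetical: value given in rusage[mem=...]
    in_use_mb: int     # hypothetical: memory the job currently consumes


def aggregate_estimate(physical_mb, jobs):
    """Estimate from the original post:
    physical RAM minus max(RAM in use, sum of requested RAM)."""
    in_use = sum(j.in_use_mb for j in jobs)
    requested = sum(j.requested_mb for j in jobs)
    return physical_mb - max(in_use, requested)


def per_job_estimate(physical_mb, jobs):
    """Simplified model of the reply: each running job reserves
    (requested - in use), never less than zero, on top of what is in use."""
    in_use = sum(j.in_use_mb for j in jobs)
    reserved = sum(max(j.requested_mb - j.in_use_mb, 0) for j in jobs)
    return physical_mb - in_use - reserved


# Made-up host: 64 GB of RAM, one job over its request, one far under it.
physical_mb = 65536
jobs = [
    Job(requested_mb=10240, in_use_mb=20480),
    Job(requested_mb=20480, in_use_mb=2048),
]

print(aggregate_estimate(physical_mb, jobs))  # 34816
print(per_job_estimate(physical_mb, jobs))    # 24576
```

    Under these assumptions the aggregate estimate reports about 10 GB more than the per-job model, because one job's overuse is netted against another job's unused reservation. A new job asking for, say, 30 GB would look like it fits under the first calculation but not under the second, which is the kind of gap that bhosts -l <host name> can help confirm on a real host.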


    #Spectrum
    #Support
    #SupportMigration