IBM Spectrum Computing Group

Sending signal only to top-level process

  • 1.  Sending signal only to top-level process

    Posted 15 days ago
    We have a tool stack in which the top-level process launched by users is a wrapper that launches various vendor tools.

    We need bkill to deliver a signal only to the top-level wrapper and not to any of its child processes. We don't control the signal handler behavior of the underlying tool stack, but we need the wrapper to be able to take action based on the signal it receives. For example, if the vendor tool receives an INT it may exit abruptly, so we want our wrapper to receive the INT instead and instruct the vendor tool to exit gracefully.
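
    To make the intended wrapper behavior concrete, here is a minimal sketch of it, not our actual wrapper. The function names (launch_isolated, stop_gracefully, run_wrapper) are hypothetical, and using SIGTERM as the "exit gracefully" request is an assumption, since each vendor tool has its own shutdown mechanism. The child is started in its own session so that a signal aimed at the wrapper's process group does not reach it. Whether LSF would still signal the child through its own process tracking is exactly the open question, and the sketch does not address it.

```python
import signal
import subprocess


def launch_isolated(cmd):
    """Start cmd in a new session (and so a new process group).

    A signal sent to the wrapper's process group will not reach this child.
    """
    return subprocess.Popen(cmd, start_new_session=True)


def stop_gracefully(child, grace_seconds=30.0):
    """Ask the child to exit, then force it if it does not comply in time.

    SIGTERM stands in for the tool's real shutdown request (an assumption).
    """
    if child.poll() is None:
        child.send_signal(signal.SIGTERM)
        try:
            child.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            child.kill()
            child.wait()
    return child.returncode


def run_wrapper(cmd, grace_seconds=30.0, poll_seconds=0.2):
    """Run cmd; on INT or TERM, shut the child down in a controlled way."""
    pending = []

    def on_signal(signum, frame):
        # Only record the signal here; the main loop below does the work.
        pending.append(signum)

    signal.signal(signal.SIGINT, on_signal)
    signal.signal(signal.SIGTERM, on_signal)

    child = launch_isolated(cmd)
    while True:
        try:
            # Returns the exit status as soon as the child finishes.
            return child.wait(timeout=poll_seconds)
        except subprocess.TimeoutExpired:
            if pending:
                return stop_gracefully(child, grace_seconds)
```

    A wrapper script would call run_wrapper(sys.argv[1:]) and exit with the returned status. The handler only records the signal, and the main loop performs the shutdown, which avoids waiting on the child from inside the signal handler.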

    Is there any way for us to configure LSF (ideally on a per-job basis) so that bkill sends signals only to the tool launched via the bsub command, and not to that tool's child processes?

    ------------------------------
    Sam Huffman
    ------------------------------


  • 2.  RE: Sending signal only to top-level process

    Posted 14 days ago
    Sam,

    What version and patch level of LSF are you using today? Are you trying to provide an opportunity to checkpoint prior to exit? Have you looked into Application Profile Job Controls? Inside the Job Controls you have complete control over which signal is sent to which processes.

    Additionally, you have to consider LSF's default signal escalation and whether or not you are using cgroup process containment. I believe the default is to always signal the leader pgid first, but if the leader process does not exit within a definable time period, LSF starts performing signal escalation. Once that starts, all bets are off. It's been a while since I read the documentation, but there is at least a global LSF setting for the timing between escalation steps. I'm not sure whether it's definable at the queue, job, or application profile level, though.

    ------------------------------
    Larry Adams
    ------------------------------