IBM Spectrum LSF User Group


Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

  • 1.  Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Thu January 30, 2020 10:31 AM
    Dear all,

    we are considering implementing shared temporary directories (by setting TMPDIR via LSB_JOB_TMPDIR and LSB_SET_TMPDIR) in our cluster.  However, I fear that this could lead to data corruption when processes on different hosts, belonging to the same application, create their temporary data in the common TMPDIR.

    A common naming pattern for temporary files is a name containing the process ID (PID).  When two processes belonging to the same job and the same application by chance have the same PID on two (or more) nodes, they might overwrite each other's temporary data in the shared directory.

    This situation may be statistically rare, but it is not impossible.
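
    To make the failure mode concrete, here is a hypothetical application "myapp" that names its temp file after its PID:

        hostA$ ./myapp    # PID 123 -> writes $TMPDIR/myapp.123.tmp
        hostB$ ./myapp    # by chance also PID 123 -> writes the same $TMPDIR/myapp.123.tmp, clobbering hostA's data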

    Does anyone have working experience with shared $TMPDIR in an LSF cluster (good or bad), especially with the situation outlined above?

    Cheers
    Frank


    ------------------------------
    Frank Thommen
    ------------------------------


  • 2.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 11:17 AM
    The actual job temp directory will be $LSB_JOB_TMPDIR/<jobID>.tmpdir.  The job ID is unique within a cluster, and the job's data files are stored under <jobID>.tmpdir.  Further, you can add the %H host name pattern as part of LSB_JOB_TMPDIR.
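
    For reference, a minimal lsf.conf sketch of that (the Lustre path is only a placeholder; please check the exact pattern syntax against your LSF version's documentation):

        # lsf.conf
        LSB_JOB_TMPDIR=/lustre/lsftmp/%H    # %H = host name pattern, one subdirectory per host
        LSB_SET_TMPDIR=Y                    # export the per-job directory as $TMPDIR for the job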

    One concern I have with using a shared TMPDIR is that it may overload the filer.

    ------------------------------
    YI SUN
    ------------------------------



  • 3.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 12:02 PM
    Further to Sun Yi's (Tropical Sun) response, I would suggest that if the application is writing a potentially identical file per thread, it should gather its PID using function calls like 'getmypid()' or similar and then append that as a file suffix to avoid complications around duplicate file names.
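
    In a shell wrapper that would look something like this ("myapp" is just a placeholder name):

        # $$ is the shell's PID; each process writes to its own file instead of one fixed name
        TMPFILE="$TMPDIR/myapp.$$.tmp"
        ./myapp > "$TMPFILE"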

    Otherwise, LSF provides several replacement variables, as mentioned by Tropical Sun.  Some file systems such as Spectrum Scale prefer separate directories for TMPDIR and other things like the LSF spool directory (which defaults to $HOME/.lsbatch, a bad thing actually) to increase resiliency and decrease lock contention.  So, it's best to set LSB_STDOUT_DIRECT=Y in lsf.conf.  That way output and error files will be written directly to the location requested by the user, which can be controlled by a pattern such as "bsub -cwd /scratch/%U/%J.%I/ -o %J.%I.out -e %J.%I.err ./a.out".
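
    Spelling that example out (please verify the replacement variables against your LSF version's documentation):

        # lsf.conf
        LSB_STDOUT_DIRECT=Y

        # %U = user name, %J = job ID, %I = job array index
        bsub -cwd /scratch/%U/%J.%I/ -o %J.%I.out -e %J.%I.err ./a.out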

    If you are planning on having islands of shared scratch, this is very commonplace, and you should be fine so long as the file system can support the IOPS.

    ------------------------------
    Larry Adams
    ------------------------------



  • 4.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 12:32 PM
    Thank you Sun Yi and Larry,

    we have no control over what is run on the cluster.  The applications come from various sources; sometimes they are self-programmed by our bioinformaticians.  However, in my use case the issue of duplicate filenames cannot be avoided by using the PID, but only by having random elements in the filename: an application with two processes with the same PID 123 (one on each host) would write myapp.123.tmp from both hosts (and create corrupted data).  With random elements it would e.g. write myapp.123.tmp.dyh5 from one node and myapp.123.tmp.2y6x from the other.

    Unfortunately I see only very few applications implementing such a naming scheme for their temporary data.
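
    For illustration, the kind of scheme I mean, which an application or wrapper script could implement itself ("myapp" is just a placeholder):

        # mktemp's XXXXXX suffix is replaced by random characters, so two processes
        # never get the same file even if they share a PID across hosts
        TMPFILE=$(mktemp "$TMPDIR/myapp.$$.XXXXXX")

    But of course we cannot retrofit this into third-party applications.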

    Also, the hostname pattern would not help in this case, as it just adds one level of hierarchy but no "randomness".  For a given job, $TMPDIR will still be the same on all hosts where that job runs.

    We have already separated the spool directory from the home directories (via JOB_SPOOL_DIR in lsb.params), and the shared tmp space would be on a high-performance Lustre filesystem.  There shouldn't be issues with performance or disk space ;-).

    Have you implemented shared tmp space and what are your experiences with it?

    frank




  • 5.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 02:09 PM
    But I thought LSB_JOB_TMPDIR can be used to change TMPDIR.

    ------------------------------
    YI SUN
    ------------------------------



  • 6.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 02:50 PM
    Yes, that is the point: LSB_JOB_TMPDIR sets a temporary directory, and with LSB_SET_TMPDIR this tempdir can be set as $TMPDIR for all processes of a job.  If it is shared, I fear that there can be data corruption due to data overwrites.

    frank




  • 7.  RE: Experiences with shared $TMPDIR (LSB_JOB_TMPDIR) in the cluster?

    Posted Fri January 31, 2020 05:00 PM
    Edited by Larry Adams Fri January 31, 2020 05:12 PM
    If you are running LSF 10.1.0.7+ you can enforce the LSF_SERVERDIR and, from there, force the running of an ESUB, for example esub.default, which will run for every job.  In that ESUB, you can inspect the job's submission environment and the entire job's submission arguments, and change or augment any of them.

    The only place this would break is if one of your users were doing something foolish in a wrapper script.  I suspect that the ESUB should be your last line of defense against someone hurting themselves or others.
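
    A rough esub.default sketch of that idea (untested; the scratch path is a placeholder, and you should verify the LSB_SUB_* variable names against the esub documentation for your LSF version):

        #!/bin/sh
        # esub.default sketch: force a unique per-job CWD under shared scratch
        # when the user did not specify one.

        . "$LSB_SUB_PARM_FILE"     # makes the LSB_SUB_* submission parameters available

        if [ -z "$LSB_SUB_CWD" ]; then
            # Assuming LSB_SUB_CWD corresponds to "bsub -cwd"; %U/%J.%I expand at runtime
            echo 'LSB_SUB_CWD="/scratch/%U/%J.%I/"' >> "$LSB_SUB_MODIFY_FILE"
        fi

        exit 0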

    Most of the consulting work I've done over the years drove the users to interact with either a web- or CLI-based submission framework, where all application integrations were under tight Engineering control.  So the likelihood of problems happening would generally be the fault of the Engineering team.  I'm not sure that helps here, though.

    In several cases there were shared scratch directories, and we always made sure that the CWD was unique under the shared scratch, as shown below and as I stated in the prior response.

    /scratch/user/jobid.jobindex/my_temp_stuff_here

    The above is very common.  I hope that helps.

    ------------------------------
    Larry Adams
    ------------------------------