List of Contributions

Larry Adams

This individual is no longer active. Application functionality related to this individual is limited.

My Content

1 to 20 of 29 total
Posted By Larry Adams Fri August 27, 2021 12:38 PM
Found In Egroup: High Performance Computing Group
Just tested, that did not go over too well:
[lsfadmin@vmhost6 configdir]$ bhist -l 101 | grep "Running with" | wc -l
80
But as soon as I set the max value in the queue, the job went suspended. I would suggest you talk to the LSF admin and suggest they add that setting as it will prevent the ...
Posted By Larry Adams Fri August 27, 2021 12:32 PM
Found In Egroup: High Performance Computing Group
Romain, That would be easy enough to test. Just have the script exit. Make the script command be "exit 1". That should cause a re-queue loop if the setting is taken. I know that several variables can be re-read from the environment, but I'm not too certain about this one as it's more of an mbatchd ...
Posted By Larry Adams Thu August 26, 2021 11:30 AM
Found In Egroup: High Performance Computing Group
Roman, It's simple to do. Here is a section from the man pages for lsb.queues:
MAX_JOB_REQUEUE
Specifies the maximum number of times to requeue a job automatically.
Syntax
MAX_JOB_REQUEUE=integer
Valid values
0 < MAX_JOB_REQUEUE < INFINIT_INT
INFINIT_INT is defined in lsf.h. ...
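For illustration only, a queue stanza in lsb.queues that uses this parameter might look like the sketch below. The queue name and both values are hypothetical, and REQUEUE_EXIT_VALUES is included on the assumption that a job exit code of 1 is what should trigger the automatic requeue; check the lsb.queues reference for your LSF version before using it.

```
# Hypothetical queue definition; the name and values are assumptions
Begin Queue
QUEUE_NAME          = requeue_test
# Exit codes that cause LSF to requeue the job automatically
REQUEUE_EXIT_VALUES = 1
# Upper bound on automatic requeues, so a failing job cannot loop forever
MAX_JOB_REQUEUE     = 3
End Queue
```

A change to lsb.queues normally takes effect only after the LSF admin reconfigures the batch system, for example with "badmin reconfig".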
Posted By Larry Adams Thu August 26, 2021 11:15 AM
Found In Egroup: High Performance Computing Group
Frank, What I would recommend is that you open an RFE. Bill McMillan, the Offering Manager, will pass some final determination. Also, he may simply ask support to provide you the two scripts. My suggestion is that they should have done this for Explorer from the beginning. I have not installed ...
Posted By Larry Adams Thu August 26, 2021 08:45 AM
Found In Egroup: High Performance Computing Group
Actually, Frank's statement is only partially true. In the LSF Suites, there is a unit file called 'acd'. The 'acd' unit file will manage all services as a single service. I would expect that IBM support could provide this unit file upon request if this is a standalone version of the Application Center ...
Posted By Larry Adams Mon June 21, 2021 01:57 PM
Found In Egroup: High Performance Computing Group
Another approach is to use Guarantee Pools. They provide a nice policy that allows you to delegate not only CPUs but, more importantly, MEMORY to the various consumer groups, including queues. This is what I recommend to all customers these days. ------------------------------ Larry Adams ------ ...
Posted By Larry Adams Tue April 27, 2021 09:18 AM
Found In Egroup: High Performance Computing Group
Sam, What version and patch level of LSF are you using today? Are you trying to provide an opportunity to checkpoint prior to exit? Have you looked into Application Profile Job Controls? Inside the Job Controls you have complete control of what signal is sent to what processes. Additionally, you ...
Posted By Larry Adams Mon January 04, 2021 09:37 AM
Found In Egroup: High Performance Computing Group
The short answer is yes. You simply need to know the correct syntax. It would be nice if we could support YAML in the longer term though. Personally, I don't like XML, but use it where I have to. ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Thu December 17, 2020 04:18 PM
Found In Egroup: High Performance Computing Group
If you use the schedule by slot setting in lsb.params, you can set the per host scheduling slots as large as you want. I have customers that set it to 1000, and the only thing that controls scheduling is memory reservations. It works perfectly. Now, if you use Affinity "bsub -R 'affinity[core|thr ...
Posted By Larry Adams Tue December 01, 2020 09:57 AM
Found In Egroup: High Performance Computing Group
Nothing since 10.2.0.6. ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Tue December 01, 2020 09:26 AM
Found In Egroup: High Performance Computing Group
Jamie, We have not yet released fix pack 11. The CE releases are updated periodically at offering management discretion. At this moment, I don't know when the next release is planned. Larry ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Mon August 10, 2020 11:26 AM
Found In Egroup: High Performance Computing Group
Abhishek, Application Center uses OS authentication. You should look through your syslog and audit log to find the reason. If you have further issues, open a support case with IBM Support. ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Mon June 22, 2020 05:42 PM
Found In Egroup: High Performance Computing Group
Please read the following: https://docs.nvidia.com/deploy/mps/index.html ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Mon June 22, 2020 05:40 PM
Found In Egroup: High Performance Computing Group
To my knowledge, the MPS server is for a single user and not multiple users, but I could be wrong, and have been known to be in the past. The Knowledge Center is a good place to start. ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Mon June 22, 2020 05:16 PM
Found In Egroup: High Performance Computing Group
Abhishek, The only way to share a GPU with other users is to use the GPU option: bsub -gpu "num=x:mode=shared" ./a.out You can also reserve memory on the GPU, but it's not enforced. Then it becomes important to have the memory reservation align with what your code is doing. If you ...
Posted By Larry Adams Thu May 14, 2020 02:04 PM
Found In Egroup: High Performance Computing Group
LSF works on pretty much anything Intel, AMD, and now the latest ARM processors as well. We support other chipsets like Sparc and Power too. ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Thu May 14, 2020 02:02 PM
Found In Egroup: High Performance Computing Group
Both AMD and ARM are giving Intel a run for their money presently. The planned Marvell ThunderX3 ARMv8 chip is going to be a killer with 384 threads and 128 PCIe Gen4 lanes, power scaling that you would expect from an ARM chip and great L1 - L3 cache. Though not an Intel command set which presents a blocker ...
Posted By Larry Adams Tue March 24, 2020 12:09 PM
Found In Egroup: High Performance Computing Group
Roni, Right now, you can use the GPU ELIM Template Graphs and also make sure that you enable GPU Collection in Console > Configuration > RTM Settings > Poller. If you do that second part, you should have some GPU rusage after the jobs are complete. Starting in RTM 10.2.0.1, you will be able to ...
Posted By Larry Adams Tue March 10, 2020 11:09 AM
Found In Egroup: High Performance Computing Group
I'm sorry, all of this is so refreshing ;) ------------------------------ Larry Adams ------------------------------
Posted By Larry Adams Tue March 10, 2020 10:44 AM
Found In Egroup: High Performance Computing Group
Ben, You had better grab the latest version of CE and install it over your existing install. The one you are using is very old. I think the version out there now is LSF 10.1.0.6, and we are planning on having 10.1.0.9 released in the next month or so. Of course, there is no support. If you want that, ...