Decision Optimization


Delivers prescriptive analytics capabilities and decision intelligence to improve decision-making.


How the threads are used

Archive User, Fri February 20, 2015 01:05 PM

ALEX FLEISCHER, Sun February 22, 2015 05:48 AM
  • 1.  How the threads are used

    Posted Fri February 20, 2015 01:05 PM

    Originally posted by: Phoebe_Qi


    Dear all:

    Hi. 

    I have a question about how the threads are used in CPLEX when solving a MILP. 

     

    When CPLEX does branch and bound, does it distribute the nodes of the tree to different threads and process them in parallel?

    By that, I mean: if I can get a computing resource with more than 16 processors and I set the thread parameter to a larger number, does that help speed up the solving process?

     

    I am running some MILP problems that take days to solve, so I would really appreciate it if anyone could answer this question.

     

    Phoebe


    #CPLEXOptimizers
    #DecisionOptimization


  • 2.  Re: How the threads are used



  • 3.  Re: How the threads are used

    Posted Tue February 24, 2015 01:31 AM

    As the documents quoted by Alex explain, CPLEX uses processors in the way you suspected: CPLEX creates one thread for each processor and then performs the tree search in parallel.

    However, blindly increasing the number of threads beyond 16 may not necessarily result in the expected performance improvement. So it might be useful to first take a quick look at the model and try to understand why it takes so long. Does it require a lot of search tree nodes? Is node throughput very small? Can you show a log output of such a run that takes multiple days?
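One rough way to judge node throughput from a log is to take two progress readings and divide the increase in processed nodes by the increase in elapsed time. The sketch below assumes the node counts and elapsed seconds were copied by hand from the log. `LogSnapshot` and `nodeThroughput` are hypothetical names used for illustration only; they are not part of CPLEX.

```cpp
#include <cassert>

// One reading taken by hand from a solver log: nodes processed so far
// and elapsed wall-clock time in seconds. (Hypothetical helper type.)
struct LogSnapshot {
    long long nodes;
    double seconds;
};

// Average number of nodes processed per second between two readings.
double nodeThroughput(const LogSnapshot& earlier, const LogSnapshot& later) {
    // The later reading must really be later, otherwise the rate is meaningless.
    assert(later.seconds > earlier.seconds);
    return static_cast<double>(later.nodes - earlier.nodes) /
           (later.seconds - earlier.seconds);
}
```

As a loose rule of thumb (an assumption, not a statement from the CPLEX documentation): a rate that stays very low suggests that each node is expensive to process, while a high rate with little bound movement suggests that the tree itself is very large.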




  • 4.  Re: How the threads are used

    Posted Tue March 10, 2015 03:46 PM

    Originally posted by: Phoebe_Qi


    Daniel:

    You are absolutely right about this. I tried increasing the number of threads but, unfortunately, the program still takes forever to run. I think you raise an excellent point: the search may be slow for two different reasons, and increasing the number of threads does not magically solve the problem.

    I know that because I linearize the MINLP into a MILP, the number of variables may increase significantly and thus result in lots of branches in CPLEX.

    But I am not clear about how to identify the reason in the log file. Could you please give me some suggestions on how to tell whether it is because of small node throughput?

    Attached is the output file from a program that ran for 4 days and still has no optimal result.

    I apologize in advance if this is not the log file you are looking for.




  • 5.  Re: How the threads are used

    Posted Thu March 12, 2015 02:53 AM

    As far as I can tell from your log file, progress on both the primal and the dual bound is rather slow. Have you tried playing with CPX_PARAM_EMPHASIS? Do any of the possible settings for this parameter give you a smaller relative gap in a shorter time? Also, is it easy for you to compute a good feasible (not necessarily optimal) solution that you could provide to CPLEX as a MIP start? The initial solutions that CPLEX finds look rather poor, so if you can easily come up with something better, this may help.
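To compare runs with different emphasis settings, it helps to compute the relative gap from the best integer solution and the best bound reported in each log. The sketch below uses the definition given in the CPLEX documentation for the relative MIP gap, |best bound - best integer| / (1e-10 + |best integer|). The function names are hypothetical, and the values are assumed to be copied by hand from the log.

```cpp
#include <cassert>
#include <cmath>

// Relative MIP gap: |bestBound - bestInteger| / (1e-10 + |bestInteger|).
// bestInteger is the objective of the incumbent, bestBound the dual bound.
double relativeGap(double bestInteger, double bestBound) {
    // A gap is only defined once an incumbent and a bound exist.
    assert(std::isfinite(bestInteger) && std::isfinite(bestBound));
    return std::fabs(bestBound - bestInteger) /
           (1e-10 + std::fabs(bestInteger));
}

// True if run A ended with a strictly smaller relative gap than run B.
bool runAIsCloser(double bestIntegerA, double bestBoundA,
                  double bestIntegerB, double bestBoundB) {
    return relativeGap(bestIntegerA, bestBoundA) <
           relativeGap(bestIntegerB, bestBoundB);
}
```

For a fair comparison, both runs should be stopped after the same time limit, so that a smaller gap really means faster progress.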




  • 6.  Re: How the threads are used

    Posted Thu March 12, 2015 04:08 PM

    Originally posted by: Phoebe_Qi


    Daniel:

    It is not easy to find a feasible solution, since this is a complex network optimization problem.

    Thanks for your advice. I will try changing the CPX_PARAM_EMPHASIS parameter and see if it can accelerate the solving process.




  • 7.  Re: How the threads are used

    Posted Fri March 20, 2015 03:40 AM

    Note that for a MIP start you don't need a complete solution. If you can only provide values for some of the integer variables (the more the better) then CPLEX will automatically attempt to find values for the other variables. This may of course still be too complex in your case.




  • 8.  Re: How the threads are used

    Posted Tue March 17, 2015 03:03 PM

    Originally posted by: Phoebe_Qi


    Daniel:

    Hi. Sorry for bothering you again. 

    I am using a computing resource from my university: a 408-node Cray CS-300 cluster in which each node has two octa-core Intel Sandy Bridge CPUs and 64 GB of memory.

    So I assume that on each node I have 16 threads that CPLEX can run in parallel.

    To accelerate the solving process, I requested three nodes per job, which means I have 48 processor cores, and I therefore set the thread parameter to 48 in my C++ program as follows: "cplex.setParam(IloCplex::Threads, 48);"

    As you mentioned earlier, it did not work.

    Then I realized that maybe CPLEX does not actually run 48 threads, so I am wondering: is there any way to make CPLEX output the number of threads it actually uses to run the program?

    Thanks.

    I have searched the web for this question, but I could only find how to output the parameter setting, not the actual thread usage. I would really appreciate your help with this.




  • 9.  Re: How the threads are used

    Posted Fri March 20, 2015 03:22 AM

    CPLEX prints the number of threads used in the log. The lines look something like this:

    Parallel mode: deterministic, using up to 8 threads.

    In your case I guess something else is in play: by default, CPLEX is a shared-memory application. So even if you allocate 3 nodes in your cluster, CPLEX will only run on a single node. Setting threads to 48 in this case will just create 48 threads on this 16-core node. The node will thus be heavily oversubscribed, which probably results in a performance degradation.

    We have recently added a distributed memory variant of CPLEX. You could try that on your cluster. The details about using that can be found in the user's manual.
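The two points above (reading the thread count from the log, and avoiding oversubscription of a single node) can be sketched in a few lines. This is a minimal illustration, not CPLEX functionality: the helper names are hypothetical, the log line format is assumed to match the example quoted in the post, and `std::thread::hardware_concurrency()` reports logical CPUs, which may exceed the number of physical cores when hyper-threading is enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>
#include <thread>

// Extracts N from a log line such as
// "Parallel mode: deterministic, using up to 8 threads."
// Returns -1 if the line does not contain that phrase.
int threadsFromLogLine(const std::string& line) {
    static const std::regex pattern(R"(using up to (\d+) threads?)");
    std::smatch match;
    if (std::regex_search(line, match, pattern)) {
        return std::stoi(match[1].str());
    }
    return -1;
}

// Caps a requested thread count at the number of cores on one node,
// so that a shared-memory solver is not oversubscribed.
int cappedThreads(int requested, int coresOnNode) {
    assert(requested >= 1 && coresOnNode >= 1);
    return std::min(requested, coresOnNode);
}

// Logical CPUs visible on the current machine. The standard allows
// hardware_concurrency() to return 0 when unknown, so fall back to 1.
int coresOnThisNode() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : static_cast<int>(n);
}
```

With these helpers, the call from the earlier post could become `cplex.setParam(IloCplex::Threads, cappedThreads(48, coresOnThisNode()));`, which on a 16-core node would request 16 threads instead of 48.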




  • 10.  Re: How the threads are used

    Posted Mon March 23, 2015 09:51 PM

    Originally posted by: Phoebe_Qi


    Daniel:

    Hi. 

    I studied how to use Open MPI to run the program as a distributed parallel MIP.

    The VMC file requires me to specify the host machines that will be used as workers. However, the cluster I am using takes a shell script and then uses a scheduler to assign machines to me, so I won't know which machines I can use when I submit my job.

    Is there any way to solve this issue?

    The distributed parallel approach is really complicated, but I would like to learn and try it, since having more processors and memory is very attractive.

    Sorry for bothering you and thanks for your help.




  • 11.  Re: How the threads are used

    Posted Mon March 23, 2015 10:16 PM

    Originally posted by: Laci Ladanyi


    The cluster probably allows you to start an MPI world on the nodes you were assigned. In that case you can start the distributed MIP without specifying the nodes in the VMC file; CPLEX will figure out everything automatically. All you have to do is issue "set distmip config mpi" in the interactive optimizer (or set the corresponding parameter in your code). See the "convenient shortcut" section at http://www-01.ibm.com/support/knowledgecenter/SSSA5P_12.6.1/ilog.odms.cplex.help/CPLEX/UsrMan/topics/parallel_optim/distribMIP/06_openMPI.html

    --Laci




  • 12.  Re: How the threads are used

    Posted Mon March 23, 2015 10:47 PM

    Originally posted by: Phoebe_Qi


    Thanks for your reply.

    I read about the convenient shortcut for an interactive session using the command "set distmip config mpi",

    but is there a way to get the VMC generated by the cluster in C++ code?

    I can read in the configuration file in C++ code, but I cannot find the C++ equivalent of "set distmip config mpi".

    Thanks. 




  • 13.  Re: How the threads are used

    Posted Tue March 24, 2015 06:04 AM

    If by "C++ code" you mean Concert Technology for C++, then there is. In Concert Technology for C++, a VMC is loaded via

    IloCplex::copyVMConfig(char const *);
    IloCplex::readVMConfig(char const *);

    The equivalent of the shortcut in the interactive optimizer should be

    cplex.copyVMConfig("mpi");

    Note that you will have to link your application with additional libraries if you use distmip features in your code; see the corresponding chapter in the user's manual.




  • 14.  Re: How the threads are used

    Posted Tue March 24, 2015 10:07 PM

    Originally posted by: Phoebe_Qi


    Daniel:

    Hi.
    I have two things to discuss with you.

    First, as I mentioned, I am using the HPC resource at my university to do networking research.
    Since my optimization model is complex, I would like to try the distributed parallel mode of CPLEX using Open MPI.
    I talked with the computational scientist who manages the HPC resource about how amazing CPLEX is for operations research, and this distributed parallel feature is a perfect fit for distributed-memory HPC.
    Since CPLEX has an academic initiative program and the HPC resources can only be accessed by academic researchers, he would like to discuss with you whether IBM is interested in letting the university set up distributed parallel CPLEX with OpenMPI as a module on our HPC resource, so that researchers in operations research can use it to solve large-scale optimization problems.

    We have both distributed-memory and shared-memory systems; here is a link to the HPC resources and the software list that is now available to our researchers.
    I am now using "Blueridge", which is a distributed-memory 408-node Cray CS-300 cluster.
    If you are interested, can I send an email that includes you and the computational scientist at my university, so that we can set something up?
     
    Second, I have some questions about how to modify my conventional local-machine CPLEX C++ program and makefile.
    I studied the documentation about using Open MPI with distributed parallel MIP:
     

    http://www-01.ibm.com/support/knowledgecenter/SSSA5P_12.6.1/ilog.odms.cplex.help/CPLEX/UsrMan/topics/parallel_optim/distribMIP/06_openMPI.html

     
    According to my understanding, the modification process should be:
    1. Add C++ code to read in the configuration file:
    cplex.readCopyVMConfig(vmconfig);

    2. Change the makefile to link with -lcplexdistmip and -ldl.

    3. Set the environment variable to run the application: LD_LIBRARY_PATH=(shared bin directory of CPLEX):(path to the OpenMPI library)
     
    Now I have the following questions:
    1. In the example makefile CPLEX provides (I attached it), it shows how to run the application with process transport:
     "
     DMEXEC = LD_LIBRARY_PATH=../../../bin/x86-64_linux:$$LD_LIBRARY_PATH
     dm_execute_cpp_process: $(DM_CPP_EX)
      $(DMEXEC) ./ilodistmipex1 $(EXDATA)/process.vmc $(EXDATA)/p0033.mps
    "
    However, if I want to run the application with Open MPI, how should LD_LIBRARY_PATH be used to run the executable file?

    2. Since we submit the job with a shell script, I do not know which machines are going to be assigned to me.
    Therefore, I cannot set up the VMC file in advance. Last time you replied about using "cplex.copyVMConfig("mpi");"
    so that CPLEX can figure out everything.
    Does that mean I don't have to read in a VMC file using "cplex.readCopyVMConfig(vmconfig);"?

    3. What does the "rank" property in the VMC file mean? If I let CPLEX generate the VMC file, is the rank also going to be assigned properly?

    4. There is a script to specify the master and workers. Do I need to do this if I use the C++ API and let CPLEX generate the VMC file?

    I apologize in advance: since I have never worked with CPLEX remote objects and Open MPI, I have lots of questions about them.
    Thanks for your patience and help, I really appreciate it.



  • 15.  Re: How the threads are used

    Posted Fri March 27, 2015 05:59 AM

    Can you please drop me an email at daniel(dot)junglas(at)de(dot)ibm(dot)com so that we can talk about putting CPLEX on your HPC system and get you set up with distributed MIP and MPI?

    The instructions about distributed MIP and MPI assume that you are familiar with MPI, so they may be a little short for people who are not. No problem asking questions about that. Let me try to answer some of the questions you asked:

    1. That depends on the mpirun binary you are using. Are you going to use OpenMPI or MPICH?
    2. Correct, cplex.copyVMConfig("mpi") will automatically create a VMC that tells CPLEX to use all machines that are currently available in your MPI setup. There is no need to explicitly load a VMC via cplex.readVMConfig(...).
    3. The "rank" property specifies the MPI rank. In MPI, machines (more precisely, processes) are identified by their rank.
    4. No. If you use cplex.copyVMConfig("mpi"), then CPLEX will choose master and workers automatically.

    It is probably easier to discuss this via email than through this Forum.

