Decision Optimization

Delivers prescriptive analytics capabilities and decision intelligence to improve decision-making.


CPLEX out of memory although nodeFile on disk

Archive User, Fri March 23, 2018 06:29 AM

  • 1.  CPLEX out of memory although nodeFile on disk

    Posted Thu June 02, 2016 01:16 PM

    Originally posted by: DavidGravot


    Hi

    I'm running a CPLEX model (through a DOC optimization server). Occasionally we get an out-of-memory error, even though the following settings are meant to prevent it:

    • WorkMem set to 2048 (MB)
    • TreLim set to 100000 (MB)
    • NodeFileInd set to 2 (node storage on disk)

    Given these safeguards, what else could cause this intermittent out-of-memory error?

    David

     

     


    #CPLEXOptimizers
    #DecisionOptimization


  • 2.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri June 03, 2016 01:13 AM

    At which point do you run out of memory? Is this during B&B or even before that?




  • 3.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri June 03, 2016 03:07 AM

    Originally posted by: DavidGravot


    [6/2/16 6:11:13:952 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \n 23446  9501    36126.7294  1016    37085.9456    36122.2377   821924    2.60%
    [6/2/16 6:11:13:952 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         * 24730 10652      integral     0    36736.8029    36122.2377   842788    1.67%
    [6/2/16 6:11:13:952 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \n 24730 10652      integral     0    36736.8029    36122.2377   842788    1.67%
    [6/2/16 6:11:13:952 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         * 24730 10651      integral     0    36736.8029    36122.2377   842788    1.67%
    [6/2/16 6:11:13:952 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \n 24730 10651      integral     0    36736.8029    36122.2377   842788    1.67%
    [6/2/16 6:11:23:812 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \nThere may be further error information in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \nThere may be further error information in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         GUB cover cuts applied:  54\ninformation in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Clique cuts applied:  718\n4\ninformation in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Cover cuts applied:  1558\n4\ninformation in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Implied bound cuts applied:  130\nmation in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Flow cuts applied:  2771\nd:  130\nmation in the clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Mixed integer rounding cuts applied:  4020\nthe clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Flow path cuts applied:  7\n applied:  4020\nthe clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Zero-half cuts applied:  179\npplied:  4020\nthe clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Lift and project cuts applied:  10\n:  4020\nthe clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:387 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Gomory fractional cuts applied:  141\n 4020\nthe clone logs.\n9 MB 821241    2.64%n.\n
    [6/2/16 6:11:25:418 COT] 000000d5 IloCplex      W [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Warning: MIP starts not constructed because of out-of-memory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:418 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         \nRoot node processing (before b&c):\nbecause of out-of-memory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:418 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite   Real time             =   88.30 sec. (33929.63 ticks)\nmory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Parallel b&c, 16 threads:\n  88.30 sec. (33929.63 ticks)\nmory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite   Real time             = 1206.36 sec. (460166.01 ticks)\nory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite   Sync time (average)   =  142.65 sec.\n(460166.01 ticks)\nory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite   Wait time (average)   =  155.17 sec.\n(460166.01 ticks)\nory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite                           ------------\n(460166.01 ticks)\nory status.\n1    2.64%n.\n
    [6/2/16 6:11:25:434 COT] 000000d5 IloCplex      I [PROC-74] com.decisionbrain.utils.OutputStreamSlf4jAdapter innerWrite         Total (root+branch&cut) = 1294.65 sec. (494095.64 ticks)\nory status.\n1    2.64%n.\n
    [6/2/16 6:11:26:620 COT] 000000d5               I [PROC-74] java.util.logging.LogManager$RootLogger log         syserr: ilog.concert.IloException: CPLEX Error  1001: Out of memory.
    [6/2/16 6:11:26:635 COT] 000000d5               I [PROC-74] java.lang.Throwable appendLnTo         syserr: 
    [6/2/16 6:11:26:635 COT] 000000d5               I [PROC-74] java.lang.Throwable appendLnTo         syserr:      at ilog.cplex.cppimpl.cplex_wrapJNI.IloCplex_solve__SWIG_0(Native Method)
    [6/2/16 6:11:26:635 COT] 000000d5               I [PROC-74] java.lang.Throwable appendLnTo         syserr:      at ilog.cplex.cppimpl.IloCplex.solve(IloCplex.java:1142)
    [6/2/16 6:11:26:635 COT] 000000d5               I [PROC-74] java.lang.Throwable appendLnTo         syserr:      at ilog.cplex.IloCplex.solve(IloCplex.java:11798)
    [
    

    Here is the console output. The out-of-memory error seems to occur during the solve: the time limit was 3600 seconds, and the log shows Real time = 1206.36 sec. when the exception is thrown.

    David




  • 4.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri June 03, 2016 11:05 AM

    OK, the out-of-memory seems to happen during the tree search. Does it help to either reduce the number of threads or further reduce the WorkMem parameter?




  • 5.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri June 03, 2016 12:03 PM

    Originally posted by: DavidGravot


    I can try that, although the crash does not reproduce on every run.

    What is the rationale for reducing WorkMem or the number of threads? In other words, don't my current settings already ensure that node files are written to disk quite early? I don't see how I can still run out of memory.




  • 6.  Re: CPLEX out of memory although nodeFile on disk

    Posted Tue June 07, 2016 01:03 AM

    The question is whether "quite early" is early enough. The WorkMem parameter basically only applies to the size of the tree, i.e., the memory required for search tree nodes. For example, it does not account for the memory required to store the model (or the per-thread copies of the model). So, depending on how close the WorkMem parameter is to the amount of RAM actually available to your process, your setting may still be too high.
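
    To make the point above concrete, here is a back-of-the-envelope sketch of a memory budget. It is not CPLEX's actual accounting: the formula (one master copy of the model plus one copy per thread, on top of the in-memory tree) and the names `estimate_peak_mb`, `max_safe_workmem_mb`, `model_mb` and `reserve_mb` are assumptions made for illustration. The 2048 MB WorkMem and the 16 threads are taken from the thread; the 500 MB model size and the RAM figure are made-up example values.

```python
def estimate_peak_mb(workmem_mb, model_mb, threads, overhead_mb=0.0):
    """Rough estimate of process memory during parallel branch-and-cut.

    Assumption (not CPLEX's real accounting): one master copy of the model
    plus one copy per thread, on top of the in-memory tree capped by WorkMem.
    """
    return workmem_mb + model_mb * (1 + threads) + overhead_mb


def max_safe_workmem_mb(ram_mb, model_mb, threads, reserve_mb=2048.0):
    """Largest WorkMem that keeps the rough estimate within RAM minus a reserve."""
    budget = ram_mb - reserve_mb - model_mb * (1 + threads)
    return max(budget, 0.0)


if __name__ == "__main__":
    # 500 MB per model copy and 30 GB of RAM are hypothetical example values.
    print(estimate_peak_mb(2048, 500, 16))
    print(max_safe_workmem_mb(30 * 1024, 500, 16))
```

    The estimate grows with the thread count even when WorkMem is unchanged, which is why reducing threads can help as much as reducing WorkMem.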




  • 7.  Re: CPLEX out of memory although nodeFile on disk

    Posted Tue September 06, 2016 08:26 AM

    Originally posted by: DavidGravot


    Hi

    The target configuration is a Xeon with 8 cores (16 logical processors), 30 GB of RAM, and about 32 GB of disk space available for node files.

    I know that if I run this model in memory through the CPLEX Interactive Optimizer on an even more powerful machine (64 GB), I hit an out-of-memory error (CPLEX error 1001) after 3 hours.

     

    Here are the settings I currently use to switch to another strategy on the target machine:

    • WorkMem set to 2048 MB
    • TreLim set to 15,000 MB (15 GB)
    • Node file on disk and compressed (NodeFileInd=3)

    What happens is that after one hour of run time, CPLEX stops at a 40% gap with the status CPLEX MemLimFeas ("Limit on tree memory has been reached, but an integer solution exists").

    So it seems that my node file exceeded 15 GB, right?

     

    How can I improve these settings? My intuition would be to raise WorkMem, for instance to 28 GB instead of 2048 MB, so that I use most of the available memory before writing to file. Is that a good idea? Any other advice?

     

    By the way, is there a command in the Interactive Optimizer that shows memory usage?

    Thanks

    David

     




  • 8.  Re: CPLEX out of memory although nodeFile on disk

    Posted Tue September 06, 2016 04:01 PM

    Assuming that the problem lies with the size of the tree, there is (as with all things computing) a potential tradeoff between time and space. If you force the algorithm to dive deeper before backtracking, you should get fewer persistent nodes. The ultimate change in this direction would be to change the search strategy to depth-first search, which minimizes the tree size (but may maximize the search time, although you never know).

    Besides the nuclear option (DFS), there are a few parameters you can tweak to maintain a hybrid between DFS and breadth-first while diving deeper and thus hopefully shrinking the tree.




  • 9.  Re: CPLEX out of memory although nodeFile on disk

    Posted Tue September 06, 2016 05:12 PM

    Originally posted by: DavidGravot


    Thanks for your feedback, Paul.

    I'm actually trying to figure out what values to give these two parameters (WorkMem and TreLim) so that we avoid out-of-memory errors (by writing node files to disk as soon as we consume more than WorkMem) without reaching TreLim too soon. Since TreLim refers to the size of the uncompressed tree, I have no idea how to set it. By the way, does the TreLim value refer to memory usage (GB of RAM) or to disk storage (GB on disk)?

     

    We don't want to go for DFS, but we may also consider strong branching. If you know of any other parameters, I'd be curious to hear about them.

    Thanks

    David




  • 10.  Re: CPLEX out of memory although nodeFile on disk

    Posted Mon September 19, 2016 02:35 AM

    It is hard to relate the TreLim parameter to the amount of data actually stored on disk if you use compressed files (which is recommended). If you use a WorkMem setting that makes sure CPLEX starts swapping out nodes before you hit a memory limit, then the only remaining limit is disk capacity. If CPLEX fills up your disk, then you may indeed counter this by setting TreLim, but I would think that in this case you really should do some of the things Paul suggested.

    The TreLim parameter applies to the uncompressed size of the whole tree (including nodes in memory and nodes on disk).

    The Interactive Optimizer has no command to show memory consumption, but this information should be easily available from the operating system. On Linux, for example, you can use 'top' or look at the VmHWM field in /proc/<pid>/status.
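
    As an illustration of the /proc approach, here is a minimal Linux-only sketch that reads the peak resident set size (VmHWM) of a process. The function names `parse_status_kb` and `peak_rss_mb` are made up for this example, and the code assumes the usual "VmHWM:   <number> kB" layout of /proc/<pid>/status.

```python
import os
import re


def parse_status_kb(status_text, field="VmHWM"):
    """Return the value in kB of a field such as VmHWM or VmRSS, or None."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB"
    match = re.search(pattern, status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def peak_rss_mb(pid=None):
    """Peak resident set size of a process in MB (Linux only), or None."""
    target = os.getpid() if pid is None else pid
    try:
        with open(f"/proc/{target}/status") as handle:
            kb = parse_status_kb(handle.read())
    except OSError:
        # No /proc file system, or the process is gone or not readable.
        return None
    return None if kb is None else kb / 1024.0


if __name__ == "__main__":
    print(peak_rss_mb())
```

    Passing the pid of the process that runs CPLEX to `peak_rss_mb` gives the high-water mark since that process started, which is the number to compare against the RAM actually available.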

    Something that just came to my mind: Does your server implement any per-process memory limits? Even if the server has 30 GB memory, it may enforce much stricter limits on individual processes. This would imply that you need to use smaller values for WorkMem than you would guess from the total amount of RAM available.
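
    One way to look for such a limit on a Unix-like system is to query the address-space limit of the running process. The sketch below uses Python's standard `resource` module; the function name `address_space_limit_mb` is made up for this example. It only reports the limit of the process that executes the snippet, so it is an assumption that the optimization server runs under the same limits, and other mechanisms (containers, application server settings) are not covered.

```python
import resource


def address_space_limit_mb():
    """Soft RLIMIT_AS of the current process in MB, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft / (1024 * 1024)


if __name__ == "__main__":
    limit = address_space_limit_mb()
    print("unlimited" if limit is None else f"{limit:.0f} MB")
```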




  • 11.  Re: CPLEX out of memory although nodeFile on disk

    Posted Mon November 20, 2017 12:06 AM

    Originally posted by: Evandro Ferreira


    I am using the CPLEX 12.7 student version.
    I would like to know whether a 64-bit version is available for students, because I read in a publication that there is a difference in memory, and my Windows installation is 64-bit (http://www-01.ibm.com/support/docview.wss?uid=swg21399920).




  • 12.  Re: CPLEX out of memory although nodeFile on disk



  • 13.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 06:29 AM

    Originally posted by: DavidGravot


    Hello

     

    I am facing this problem again: our customer ended up filling their disk with 100 GB of compressed node files.

    This shows pathological behavior of the branch & cut, and I agree that I have to work on that by reducing the number of threads, using DFS or strong branching, or reviewing my model...

    But in the meantime, I'm looking for a safeguard first: how can I set TreLim to a level that guarantees the optimization stops before the disk fills up?

     

    thanks

     

    David




  • 14.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 06:44 AM

    What is your current value of TreLim? CPLEX is supposed to stop if the total size of the tree (including nodes swapped to disk!) exceeds that limit.




  • 15.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 06:49 AM

    Originally posted by: DavidGravot


    When the job filled the disk, there was simply no limit set for TreLim.

    Now my question is: what value should I set for TreLim to guarantee that their disk will not fill up completely?

    For instance, they have 100 GB of disk storage that they could use. If I use 'node file on disk and compressed', I guess TreLim=100,000 MB would end up using even less than that on disk (because of compression). Is that correct? However, it seems crazy to me to imagine a branch & cut tree of 100 GB! But I guess this is what happened... (although I don't have the log file to confirm it).




  • 16.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 07:06 AM

    Correct. As you can see in the reference documentation, the tree limit is the "size of uncompressed tree in megabytes". If they are willing to allow 100000 MB for storage of the node files, then I suggest setting TreLim=100000 and the node file indicator to "compressed". This will usually use less than 100000 MB of disk space before CPLEX stops. On the other hand, CPLEX does not check too frequently whether the tree size limit is exceeded, so it may overshoot the limit a little. So leaving some elbow room makes sense.

    Whether you want to accept a search tree that big is a different question. I guess the more important question is how many nodes that tree actually had. Is it a large tree with small nodes or a small tree with very large nodes? Depending on this, you may take further actions to limit the search tree. Or ultimately improve/adjust the model so that it solves faster :-)




  • 17.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 08:00 AM

    Originally posted by: DavidGravot


    Thanks Daniel

    Can you elaborate on what you mean by small vs. large nodes? To me, a node is simply a choice point where you select a fractional variable to branch left or right.




  • 18.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri March 23, 2018 11:44 AM

    CPLEX stores more data in a node than just the branching decision. For example, it may store an optimal simplex basis so that it can quickly resume that node after jumping to it later. It may also store norms for steepest-edge pricing. The latter can, for example, be avoided by choosing a pricing strategy that does not require norms.




  • 19.  Re: CPLEX out of memory although nodeFile on disk

    Posted Thu March 29, 2018 08:34 AM

    Originally posted by: DavidGravot


    Thanks for your advice. We were able to stop the search by setting TreLim to 60,000 MB. The log shows lines such as:

    tree = 88981.81 MB
    ...
    CPLEX SOLVE STATUS = Optimal|
    CPLEX STATUS = MemLimFeas|
    

    So the tree size was measured well beyond 60 GB when the search stopped, which means the tree had already exceeded the limit earlier. I think this is because CPLEX does not check the limit continuously.
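
    The numbers in this log give a rough idea of how much elbow room to leave. The sketch below computes the observed overshoot factor and scales a storage budget by it. The helper names are made up for this example, the 100,000 MB budget is the figure discussed earlier in the thread, and treating one observed overshoot as representative is an assumption: it is a heuristic, not a guarantee.

```python
def overshoot_factor(observed_mb, limit_mb):
    """Ratio between the tree size reported in the log and the TreLim setting."""
    return observed_mb / limit_mb


def trelim_with_margin(budget_mb, factor):
    """TreLim value that would stay within the budget if the same overshoot repeats."""
    return budget_mb / factor


if __name__ == "__main__":
    factor = overshoot_factor(88981.81, 60000.0)    # values from the log above
    print(f"overshoot factor: {factor:.2f}")         # about 1.48
    print(f"suggested TreLim: {trelim_with_margin(100000.0, factor):.0f} MB")
```

    Because TreLim measures the uncompressed tree while the node files are compressed, a value chosen this way should be conservative with respect to disk usage, but only as long as the overshoot stays close to the one observed here.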




  • 20.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri April 06, 2018 08:20 AM

    Thanks a lot for your feedback. Overshooting the limit by 20 GB seems a lot in this case. I won't go as far as calling this a bug but I have still filed a user wish to better meet that limit. No guarantee that this is going to change (soon) but at least the issue is now recorded in our system.




  • 21.  Re: CPLEX out of memory although nodeFile on disk

    Posted Fri October 12, 2018 02:04 AM

    I finally found that this is indeed a bug in CPLEX :-( When checking the tree size limit, there are some parts of the tree that CPLEX does not correctly account for. This leads to overshooting the limit, sometimes by a lot.

    Potential workarounds include these:

    1. Parse the CPLEX log. CPLEX regularly issues log lines that include the current tree size. Abort the search if these messages indicate a large tree. This can be done by redirecting CPLEX output to a function (or stream in higher level APIs) for parsing and then abort from there.
    2. Use an info callback and in that callback check the size of the current process. If that gets too big then abort from the callback.
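
    The first workaround can be sketched as a file-like object that scans log text for the "tree = ... MB" entries seen earlier in this thread. The class name `TreeSizeWatcher` and the `on_exceed` hook are made up for this example. How the log is redirected into the object and how the solve is actually aborted depend on the API in use and are left out here, so treat this as a starting point under those assumptions rather than a finished implementation.

```python
import re

# Matches entries such as "tree = 88981.81 MB" in a CPLEX log line.
TREE_SIZE = re.compile(r"tree\s*=\s*([0-9]+(?:\.[0-9]+)?)\s*MB")


class TreeSizeWatcher:
    """File-like sink that tracks the largest tree size seen in log text."""

    def __init__(self, limit_mb, on_exceed=None):
        self.limit_mb = limit_mb
        self.on_exceed = on_exceed   # hypothetical hook, e.g. request an abort
        self.max_seen_mb = 0.0
        self.exceeded = False
        self._buffer = ""

    def write(self, text):
        # Log text may arrive in arbitrary chunks, so only scan complete lines.
        self._buffer += text
        *lines, self._buffer = self._buffer.split("\n")
        for line in lines:
            self._scan(line)
        return len(text)

    def flush(self):
        pass

    def _scan(self, line):
        for match in TREE_SIZE.finditer(line):
            size = float(match.group(1))
            self.max_seen_mb = max(self.max_seen_mb, size)
            if size > self.limit_mb and not self.exceeded:
                self.exceeded = True
                if self.on_exceed is not None:
                    self.on_exceed(size)


if __name__ == "__main__":
    watcher = TreeSizeWatcher(60000.0, on_exceed=lambda mb: print("abort at", mb))
    watcher.write("Elapsed time = 10.00 sec. (tree = 88981.81 MB, solutions = 5)\n")
```

    The hook fires only once, on the first line that reports a tree larger than the limit, so the caller can use it to set a flag that its own abort mechanism checks.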
