Originally posted by: EhsanN
Thanks for the clarifications on the thread-safety issue. I compared the performance of the code with one thread and with eight threads. For one instance that requires little branching, the difference was almost negligible. However, for another instance that requires more branching, the difference is no longer negligible. The corresponding logs are as follows:
One Thread (adding 103 lazy cuts and exploring 23 nodes):
MIP emphasis: balance optimality and feasibility.
MIP search method: traditional branch-and-cut.
Parallel mode: none, using 1 thread.
Root relaxation solution time = 0.02 sec. (6.08 ticks)
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap Variable B NodeID Parent Depth
0 0 38651.1003 2 25483.7557 684
0 2 38651.1003 2 38651.8808 684 0 0
Elapsed time = 0.56 sec. (89.87 ticks, tree = 0.00 MB, solutions = 0)
* 7 7 integral 0 42426.7201 38858.0453 1324 8.41% Z(1) U 7 5 5
* 8 4 integral 0 41368.6433 38858.0453 1330 6.07% Z(1) D 8 5 5
13 5 cutoff 41368.6433 39581.2747 1849 4.32% Z(8) D 13 10 4
* 16 6 integral 0 40529.4060 39597.4238 2080 2.30% Z(7) U 16 15 4
User cuts applied: 103
Root node processing (before b&c):
Real time = 0.56 sec. (89.94 ticks)
Sequential b&c:
Real time = 25.37 sec. (1889.67 ticks)
------------
Total (root+branch&cut) = 25.93 sec. (1979.61 ticks)
Eight Threads (adding 156 lazy cuts and exploring 16 nodes):
MIP emphasis: balance optimality and feasibility.
MIP search method: traditional branch-and-cut.
Parallel mode: deterministic, using up to 8 threads.
Root relaxation solution time = 0.06 sec. (7.94 ticks)
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap Variable B NodeID Parent Depth
0 0 38651.1003 2 25483.7557 684
0 2 38651.1003 2 38651.1003 684 0 0
Elapsed time = 5.09 sec. (93.75 ticks, tree = 0.01 MB, solutions = 0)
* 9 9 integral 0 42165.1593 38880.0334 2372 7.79% Z(1) U 48 24 3
10 6 cutoff 42165.1593 38880.0334 2867 7.79% Z(1) D 41 25 4
13 5 40567.1511 1 42165.1593 38880.0334 3446 7.79% Z(6) U 49 16 2
* 15 2 integral 0 40529.4060 38880.0334 3801 4.07% Z(1) D 35 3 4
User cuts applied: 156
Root node processing (before b&c):
Real time = 5.02 sec. (90.63 ticks)
Parallel b&c, 8 threads:
Real time = 547.36 sec. (2062.52 ticks)
Sync time (average) = 424.42 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 552.38 sec. (2153.15 ticks)
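One way to read the two logs side by side is to compare wall-clock time with deterministic ticks. The sketch below only redoes that arithmetic with totals copied from the logs above; the struct and function names are made up for illustration, and treating the average sync time as a share of the parallel b&c time is an assumption about how that figure should be read.

```cpp
#include <cmath>
#include <cstdio>

// Totals copied from the two logs above (names are illustrative only).
struct RunTotals {
    double wallSeconds;  // real time on the "Total (root+branch&cut)" line
    double ticks;        // deterministic ticks on the same line
};

const RunTotals kOneThread{25.93, 1979.61};
const RunTotals kEightThreads{552.38, 2153.15};

// How many times larger `later` is than `base`.
double timesLarger(double base, double later) { return later / base; }

// Average sync time as a fraction of the parallel b&c real time.
double syncShare(double syncSeconds, double parallelSeconds) {
    return syncSeconds / parallelSeconds;
}

// Print the three ratios for the runs above.
void printLogComparison() {
    std::printf("wall-clock ratio: %.2f\n",
                timesLarger(kOneThread.wallSeconds, kEightThreads.wallSeconds));
    std::printf("tick ratio:       %.3f\n",
                timesLarger(kOneThread.ticks, kEightThreads.ticks));
    std::printf("sync share:       %.3f\n", syncShare(424.42, 547.36));
}
```

With these figures the eight-thread run takes roughly 21 times the wall-clock time for only about 9% more ticks, and the average sync time is about 78% of the parallel b&c time.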
In addition, I've included the time profiler report for both cases (times in seconds):
One Thread:
Total Time: 26.05
Time Spent outside Callback: 5.02
Time Spent in Lazy Constraint Callback (minus the time spent in separation method): 0.51
Time Spent in the Separation Method: 20.51
Time Spent Solving the Separation Problem: 15.90
Eight Threads:
Total Time: 549.87
Time Spent outside Callback: 7.86
Time Spent in Lazy Constraint Callback (minus the time spent in separation method): 102.85
Time Spent in the Separation Method: 439.16
Time Spent Solving the Separation Problem: 27.38
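To make the two reports easier to compare, the time inside the separation method that is not spent solving the separation problem can be computed directly. The following is only that arithmetic, with illustrative names; it assumes the solve time is nested inside the separation-method time, which is consistent with the three top-level figures adding up to the total.

```cpp
#include <cmath>
#include <cstdio>

// Profiler figures copied from the two reports above, in seconds.
struct Profile {
    double total;
    double outsideCallback;
    double callbackOnly;      // lazy constraint callback minus separation method
    double separationMethod;
    double separationSolve;   // assumed to be part of separationMethod
};

const Profile kProfileOne{26.05, 5.02, 0.51, 20.51, 15.90};
const Profile kProfileEight{549.87, 7.86, 102.85, 439.16, 27.38};

// Time in the separation method that is not the solve itself.
double separationOverhead(const Profile& p) {
    return p.separationMethod - p.separationSolve;
}

// Sum of the three top-level buckets; should be close to p.total.
double accountedTime(const Profile& p) {
    return p.outsideCallback + p.callbackOnly + p.separationMethod;
}

// Print overhead and accounted time for one report.
void printProfile(const char* label, const Profile& p) {
    std::printf("%s: overhead %.2f s, accounted %.2f of %.2f s\n",
                label, separationOverhead(p), accountedTime(p), p.total);
}
```

On these numbers the non-solve part of the separation method is 4.61 seconds with one thread and 411.78 seconds with eight threads, so the increase is almost entirely outside the separation solve.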
As one can see, with eight threads most of the time is spent in the lazy constraint callback (102.85 vs. 0.51) and in the separation method (439.16 vs. 20.51), while solving the separation problem itself accounts for only 27.38. Perhaps there is some kind of conflict in accessing the IloCplex object for the separation problem. In the code, I define the separation problem's IloCplex object and its related objects (i.e., variables, model, and constraints) once, and then only update its objective function each time the separation method is called (a similar approach to the ilobendersatsp example).
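One way to test the shared-object hypothesis would be to give each thread its own separation object instead of sharing one. The sketch below shows the two layouts using only the C++ standard library. `SeparationWorker` is a hypothetical stand-in for the separation IloCplex object and its model, the "solve" is a placeholder, and how the thread id is obtained inside the callback is left out and would need to be checked against the CPLEX documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the separation subproblem. In the real code this
// would own the separation IloCplex object with its model and variables.
struct SeparationWorker {
    std::vector<double> objective;  // overwritten on every call
    long calls = 0;

    // Update the objective from the master solution, then "solve".
    // The solve here is a placeholder that just sums the coefficients.
    double separate(const std::vector<double>& masterSolution) {
        objective = masterSolution;
        ++calls;
        double value = 0.0;
        for (double c : objective) value += c;
        return value;
    }
};

// Option 1: one shared worker. The lock keeps the shared state consistent,
// but callbacks arriving from different threads run one at a time.
class SharedSeparation {
public:
    double separate(const std::vector<double>& x) {
        std::lock_guard<std::mutex> guard(lock_);
        return worker_.separate(x);
    }
    long calls() const { return worker_.calls; }

private:
    std::mutex lock_;
    SeparationWorker worker_;
};

// Option 2: one worker per thread, indexed by a thread id in [0, numThreads).
// No lock is needed because no two threads touch the same worker.
class PerThreadSeparation {
public:
    explicit PerThreadSeparation(std::size_t numThreads) : workers_(numThreads) {}

    double separate(std::size_t threadId, const std::vector<double>& x) {
        assert(threadId < workers_.size());
        return workers_[threadId].separate(x);
    }
    long calls(std::size_t threadId) const { return workers_.at(threadId).calls; }

private:
    std::vector<SeparationWorker> workers_;
};

// Drive both variants from `numThreads` threads, `callsPerThread` calls each.
void runDemo(SharedSeparation& shared, PerThreadSeparation& perThread,
             std::size_t numThreads, int callsPerThread) {
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < numThreads; ++t) {
        pool.emplace_back([&shared, &perThread, t, callsPerThread] {
            const std::vector<double> x{1.0, 2.0, static_cast<double>(t)};
            for (int i = 0; i < callsPerThread; ++i) {
                shared.separate(x);
                perThread.separate(t, x);
            }
        });
    }
    for (std::thread& th : pool) th.join();
}
```

The per-thread layout costs more memory, since every thread holds a full copy of the separation model, but it removes any need to serialize access. Whether it would also remove the sync time reported in the eight-thread log is not something this sketch can show.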