Originally posted by: EdKlotz
Python 3.6 should be fine. I did write a DoCplex version of your model, which is attached. It did not result in the 1217 error, so cvxpy looks like the culprit. I'll get into the program below, but first let me get back to the 1217 error you reported with your code. I asked a colleague familiar with the cvxpy github repositories to look into this, and he concluded that the 1217 error is superfluous and harmless. When you solve a MILP or MIQP, there is no associated LP, basis matrix, dual values, or other information that pertains only to LPs. The cvxpy API code appears to be calling an LP-only solution query routine, which results in the 1217 error. But this does not stop anything from proceeding; it just generates a confusing but superfluous error message. You can check this by verifying that the following segment of your code still executes:
if prob.status == 'optimal':
    ww = np.vstack((Shorts.value, Longs.value)).T
    wt = np.sum(ww, axis=1)
    summ = [prob.solver_stats.solve_time, sum(wt), sum(abs(wt)), obj11]
If that executes, you are getting solution values, which shows that a solution actually is available. And if that code does not execute, what status do you get from the code just below it that does?
else:
    print(prob.status)
If we are talking about the same behavior, I bet you don't see any status here with a 1217 value.
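An even quicker check, which should drop straight into your script since it uses the same prob object and standard cvxpy attributes, would be something like:
print(prob.status, prob.value)       # the objective value is available
print(prob.solver_stats.solve_time)  # and so are the solver statistics
If those print real numbers rather than None, CPLEX did return a solution and the 1217 message can safely be ignored.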
Returning to the DoCplex version: it's a bit more general in that you specify the file containing the return data on the command line, followed by the multiplier for the quadratic objective term. I ran it with your return data, and no 1217 error occurs. The program and your sample data are both attached; you would run it with
python longshort.py 5.0
to reproduce your data. As a disclaimer, I am not an extremely Pythonic programmer, so if you see that my code uses a for loop where you used a more vector-oriented operation, do not assume that means DoCplex lacks the more vector-oriented approach you used.
I did learn one other thing that might be of interest regarding faster performance. Your code builds the sum of squares of each long + short term for an asset as follows:
cp.sum_squares(Shorts+Longs)
In other words, you are taking the sum of squares of linear expressions, which causes the model-building code in the cvxpy API (and probably the DoCplex API as well, although I didn't test it) to create extra variables and constraints to represent those linear expressions. It is unaware that in fact only one of the short or long activities for a particular asset can take on a nonzero value (and you should not expect that level of preprocessing from any modeling language or modeling API). So when I look at the LP file of the model you attached to the forum, I see these constraints and variables, where x1,...,x20 appear as squared terms in the objective:
+ [ 10 x1 ^2
+ 10 x2 ^2 + 10 x3 ^2 + 10 x4 ^2 + 10 x5 ^2 + 10 x6 ^2 + 10 x7 ^2
+ 10 x8 ^2 + 10 x9 ^2 + 10 x10 ^2 + 10 x11 ^2 + 10 x12 ^2 + 10 x13 ^2
+ 10 x14 ^2 + 10 x15 ^2 + 10 x16 ^2 + 10 x17 ^2 + 10 x18 ^2 + 10 x19 ^2
+ 10 x20 ^2 ] / 2
Subject To
c1: - x1 + x21 + x22 = 0
c2: - x2 + x23 + x24 = 0
c3: - x3 + x25 + x26 = 0
c4: - x4 + x27 + x28 = 0
c5: - x5 + x29 + x30 = 0
c6: - x6 + x31 + x32 = 0
c7: - x7 + x33 + x34 = 0
c8: - x8 + x35 + x36 = 0
c9: - x9 + x37 + x38 = 0
c10: - x10 + x39 + x40 = 0
c11: - x11 + x41 + x42 = 0
c12: - x12 + x43 + x44 = 0
c13: - x13 + x45 + x46 = 0
c14: - x14 + x47 + x48 = 0
c15: - x15 + x49 + x50 = 0
c16: - x16 + x51 + x52 = 0
c17: - x17 + x53 + x54 = 0
c18: - x18 + x55 + x56 = 0
c19: - x19 + x57 + x58 = 0
c20: - x20 + x59 + x60 = 0
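For context, here is a stripped-down sketch of the pattern being described. This is not your model; the data and dimensions are placeholders I made up, and the real model's constraints are omitted:
import cvxpy as cp
import numpy as np

n = 20                              # placeholder number of assets
rets = np.linspace(-0.01, 0.01, n)  # placeholder expected returns
lambd = 5.0                         # placeholder quadratic multiplier

Longs = cp.Variable(n)              # the real model's bounds and
Shorts = cp.Variable(n)             # constraints are omitted here

# Squaring the sum of two variables per asset is what leads the transcription
# layer to add one auxiliary variable and one equality row per asset
# (the x21..x60 and c1..c20 in the LP file above).
risk = cp.sum_squares(Shorts + Longs)
prob = cp.Problem(cp.Minimize(lambd * risk - rets @ (Shorts + Longs)))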
My formulation doesn't have this because I exploit the complementarity condition on the long/short activities when building the objective:
#
# Constraints are done; build the objective
#
objexpr = model.quad_expr()
objexpr += lambd*(model.sumsq(Long) + model.sumsq(Short))
objexpr -= (model.scal_prod(Long, retcoeffs) + model.scal_prod(Short, retcoeffs))
model.set_objective("min", objexpr)
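As an aside, in case it helps to see what I mean by the complementarity condition: a generic way to enforce it is one binary switch per asset. This is only a sketch, not the code from the attached program, and the unit upper bounds (doing double duty as big-M values) are an assumption:
from docplex.mp.model import Model

model = Model(name='longshort_sketch')
n = 20
Long = model.continuous_var_list(n, lb=0, ub=1, name='long')
Short = model.continuous_var_list(n, lb=0, ub=1, name='short')
golong = model.binary_var_list(n, name='golong')
for i in range(n):
    # the upper bound of 1 doubles as the big-M value
    model.add_constraint(Long[i] <= golong[i])
    model.add_constraint(Short[i] <= 1 - golong[i])
# so Long[i]*Short[i] = 0 holds in every feasible solution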
In other words, for any asset i, if you know that Long[i]*Short[i] = 0, then you know (Long[i] + Short[i])^2 = Long[i]^2 + Short[i]^2. This permits building the sum of squares of individual variables rather than of linear expressions, and avoids the need for the auxiliary variables and associated constraints in your model. I mention this because the performance difference was surprisingly large. With CPLEX 12.9 and 12 threads, your version of the model took 14-15 seconds and 700000 nodes, whereas my formulation took about 0.1 seconds:
MIP - Integer optimal solution: Objective = -1.7131737013e-01
Solution time = 0.10 sec. Iterations = 406 Nodes = 146
Deterministic time = 3.33 ticks (35.00 ticks/sec)
I don't think I made a mistake in the formulation, but you can easily test this by modifying the way you build the sum of squares term. This is independent of the Python modeling API used. I'm still a bit puzzled why this makes such a huge difference (particularly regarding node count), and we'll look into that and see whether presolve could pick this up automatically. But in the meantime, try this change in your code and see if it helps performance.
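Concretely, the change in your cvxpy code would be something along these lines (variable names taken from your snippet; I'm not claiming this is the only way to write it):
# before: squares of linear expressions, which triggers the auxiliary
# variables and equality rows shown earlier
# risk = cp.sum_squares(Shorts + Longs)

# after: squares of individual variables; equivalent because your model
# already forces Longs[i] and Shorts[i] to never both be nonzero
risk = cp.sum_squares(Shorts) + cp.sum_squares(Longs)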
#CPLEXOptimizers#DecisionOptimization