Decision Optimization

 View Only
  • 1.  Sorting or k-largest value for array of expressions

    Posted Wed May 12, 2021 08:58 AM
    Hi all,

    I'm currently trying to implement a scheduling objective function from a paper (Algorithm 5 of https://ars.els-cdn.com/content/image/1-s2.0-S0377221720307451-mmc1.pdf, which is the appendix of https://doi.org/10.1016/j.ejor.2020.08.032).
    For this I either need to know the k-largest value of an array of integer expressions or I need to sort the integer expressions.

    I have come up with an O(n^2) solution that gives me the indices that would sort the array and handles tie breaks:


    where 1 is the indicator function and C is an array of integer expressions.
    For instance, if C = [5, 5, 15, 10]; then S = [0, 1, 3, 2].

    In Python:
    indices = []
    for i, ci in enumerate(expr):
        nr_cj_smaller_than_ci = cp.sum(cj < ci + (cj == ci) * (j > i) for j, cj in enumerate(expr))
        indices.append(nr_cj_smaller_than_ci)

    This kind of works but I have the feeling this slows me down. Is there a more efficient implementation? For us, M < 20 usually.

    Thanks and best regards
    Sebastian

    ------------------------------
    Sebastian Bayer
    ------------------------------

    #DecisionOptimization


  • 2.  RE: Sorting or k-largest value for array of expressions

    IBM Champion
    Posted Wed May 12, 2021 11:59 AM
    A tree sort would be O(n log(n)). Whether that is faster in your case is an empirical matter. More complicated sorts such as tree sort tend to have some "fixed costs", so they outperform pairwise comparison for "large" n but not for "small" n.

    Also, if you only need the k-th largest entry (or the k largest entries), it might be faster to sort the first k entries (using something like a linked list). For each subsequent integer, compare it to the largest entry in the linked list and, if smaller, insert it into the list in its sorted position and drop the last (largest) entry of the list. So you have O(k^2) work to sort the initial list, followed by O(n) comparisons taking either constant time (the new value is bigger than the current k-th biggest entry) or O(k) work (if you have to insert the new value in the list), for O(n*k) overall effort. Again, whether this beats the simple approach for M < 20 is an empirical question.

    I assume you anticipate doing this repeatedly. If you only need to find the k-th largest entry once, any time savings by getting fancy would be too small to notice.

    ------------------------------
    Paul Rubin
    Professor Emeritus
    Michigan State University
    ------------------------------



  • 3.  RE: Sorting or k-largest value for array of expressions

    Posted Fri May 14, 2021 01:26 AM
    Dear Paul,

    many thanks for your reply! Yes indeed, as this is supposed to be an objective function, we will have to call it many times.
    I think I forgot to say in my first post that this is about CPLEX CPO. Do you see a way how to implement tree sort in CPO?

    Regards
    Sebastian

    ------------------------------
    Sebastian Bayer
    ------------------------------



  • 4.  RE: Sorting or k-largest value for array of expressions

    Posted Fri May 14, 2021 02:34 AM
    I think that you should try using 'element' expressions for sorting the array of values. The formulation will be in O(n) instead of O(n^2). Here is an example:

    model = CpoModel()
    n=10
    
    # MODEL
    c = [integer_var(min=10,max=10+n)  for i in range(n)]
    
    model.add(all_diff(c))
    model.add(c[0]>c[1])
    model.add(c[4]<c[5])
    # ...
    
    # PART OF THE MODEL DEDICATED TO SORT
    s = [integer_var(min=0,  max=n-1)  for i in range(n)] # position of variable c[i] in sort
    v = [integer_var(min=10, max=10+n) for j in range(n)] # value of variable at position j
    
    for i in range(n):
        model.add(element(v,s[i])==c[i])
        if 0<i:
            model.add(v[i-1]<=v[i])
            
    # SOLVE
    sres = model.solve()
    
    print("c = {}".format([sres.get_var_solution(c[i]).get_value() for i in range(n)]))
    print("s = {}".format([sres.get_var_solution(s[j]).get_value() for j in range(n)]))
    ​


    ------------------------------
    Philippe Laborie
    ------------------------------



  • 5.  RE: Sorting or k-largest value for array of expressions

    Posted Fri May 14, 2021 04:29 AM
    Edited by System Fri January 20, 2023 04:15 PM

    Hi Philippe,

    many thanks for the hint with 'element'! I did not know this function existed, this looks really promising.

    When I was trying to adapt your example to my model I have spotted some possible issues.
    The major deviation from your example is the way how I construct the 'c' array - in my model it will eventually be the end times of parallel machines, i.e. the time when the last job on a machine is finished.

    As a first step I have set  'c' to the end times of a list of jobs.

    from docplex.cp import model as cp
    
    n = 10
    
    model = cp.CpoModel()
    
    jobs = [cp.interval_var(size=1) for _ in range(n)]
    
    c = [cp.end_of(jobs[j]) for j in range(n)]
    
    # PART OF THE MODEL DEDICATED TO SORT
    s = [cp.integer_var(min=0, max=n - 1) for i in range(n)]  # position of variable c[i] in sort
    v = [cp.integer_var(min=1, max=10+n) for j in range(n)]  # value of variable at position j
    
    for i in range(n):
        model.add(cp.element(v, s[i]) == c[i])
        if 0 < i:
            model.add(v[i - 1] <= v[i])
    
    sres = model.solve()


    This model works.

    However, if I replace the min number in
    v = [cp.integer_var(min=1, max=10+n) for j in range(n)]  # value of variable at position j​

    with

    v = [cp.integer_var(min=0, max=10+n) for j in range(n)]  # value of variable at position j​​
    the model can not be solved, I'm getting this message:

    docplex.cp.solver.solver.CpoSolverException: Nothing to read from local solver process. Process seems to have been stopped (rc=-11).​



    A second variation that leads to the same error is when I add a machine together with a no-overlap constraint:

    machine = cp.sequence_var(jobs)
    model.add(cp.no_overlap(machine))


    Am I doing anything wrong?




    ------------------------------
    Sebastian Bayer
    ------------------------------



  • 6.  RE: Sorting or k-largest value for array of expressions

    Posted Fri May 14, 2021 05:20 AM
    You are doing nothing wrong. You found a bug in CP Optimizer ! We will fix it.

    An easy work-around is to use search parameter SearchType=Restart as the bug occurs in a strategy used by the automatic search but not in the 'Restart' context.

    I also realize that using only the 'element' constraint may not propagate well and may result in a lot of branching in the search. Here is your example slightly modified with two ideas:
    1- Avoid tie-breaks on jobs ending at the same date (I guess that in this case, you can decide in advance which one is preferred). That is why in the example I'm not sorting on the end time 'end' but on 'N*end+i' where 'N' is the number of jobs and 'i' the index of the current job. This way, you can add a strict lower than constraint (<) on the sorted values.
    2- Also use an inverse array [p[j]...] representing for each value at position j in array v the index of the job. In terms of constraints, p is the inverse array of s (that is: p[s[i]]=s[p[i]]=i)

    from docplex.cp.model import *
    
    n = 10
    H = 1000
    
    model = CpoModel()
    
    jobs = [interval_var(size=1, end=(0,H),name="j{}".format(i)) for i in range(n)]
    model.add(no_overlap(jobs))
    
    c = [(n*end_of(jobs[j]))+j for j in range(n)]
    
    # PART OF THE MODEL DEDICATED TO SORT
    s = [integer_var(min=0, max=n-1, name='s{}'.format(i)) for i in range(n)]  # position of variable c[i] in sort
    p = [integer_var(min=0, max=n-1, name='p{}'.format(j)) for j in range(n)]  # id i of variable at position j in sort
    v = [integer_var(min=0, max=n*H, name='v{}'.format(j)) for j in range(n)]  # value of variable at position j
    
    for i in range(n):
        model.add(element(v, s[i]) == c[i])
        model.add(element(c, p[i]) == v[i])
        if 0 < i:
            model.add(v[i-1] < v[i])
    model.add(inverse(s,p))
    
    sres = model.solve(SearchType="Restart")​


    ------------------------------
    Philippe Laborie
    ------------------------------



  • 7.  RE: Sorting or k-largest value for array of expressions

    Posted Tue May 25, 2021 04:14 AM
    Many thanks for your help, Philippe! That looks really promising. I will give it a shot later today.

    ------------------------------
    Sebastian Bayer
    ------------------------------