Decision Optimization

  • 1.  Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Wed October 21, 2020 04:03 AM
    Edited by System Fri January 20, 2023 04:14 PM
    Hi all,

    I'm occasionally getting this error when using docplex.cp:

    LocalSolverException: Nothing to read from local solver process. Check its availability.

    The weird thing is that this appears randomly, and when trying again in the same Python script, the second time is successful. I.e., this works:

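    from docplex.cp.solver.solver_local import LocalSolverException

    try:
        solution = solve(constraints)
    except LocalSolverException:
        # The first attempt failed to read from the solver process; try once more
        solution = solve(constraints)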


    Here solve() is my function that creates a model using CpoModel, adds constraints, and calls CpoSolver on it.
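If the retry has to stay for a while, a bounded retry with a short pause is a bit safer than a single blind second attempt. The following is only a sketch: solve_with_retry and its parameters are hypothetical names, not part of docplex, and the exception type is passed in so the helper itself does not depend on the library.

```python
import time


def solve_with_retry(solve_fn, retry_on, attempts=3, delay=0.5):
    """Call solve_fn() and retry when it raises `retry_on`.

    solve_fn: zero-argument callable that builds and solves the model.
    retry_on: exception class (or tuple of classes) that triggers a retry.
    attempts: maximum number of calls, including the first one.
    delay:    pause in seconds between two attempts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return solve_fn()
        except retry_on:
            # Out of attempts: let the last error propagate unchanged
            if attempt == attempts:
                raise
            time.sleep(delay)
```

With the names used in this post, the call would be solve_with_retry(lambda: solve(constraints), LocalSolverException).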

    I did some manual modifications to the docplex library and added debug prints.
    The error is coming from here:

    def _read_frame(self, nbb):
        """ Read a byte frame from input stream
        Args:
            nbb: Number of bytes to read
        Returns:
            Byte array
        """
        # Read data
        data = self.pin.read(nbb)
        if len(data) != nbb:
            if len(data) == 0:
                # Check if first read of data
                if self.process_infos.get(CpoProcessInfos.TOTAL_DATA_RECEIVE_SIZE, 0) == 0:
                    if IS_WINDOWS:
                        raise LocalSolverException("Nothing to read from local solver process. Possibly not started because cplex dll is not accessible.")
                    else:
                        raise LocalSolverException("Nothing to read from local solver process. Check its availability.")


    This code reads data from STDOUT of the subprocess, which is running "cpoptimizer -angel".
    I can verify that indeed this is the first byte being read from the process, and that the process is not returning anything.
    With some further changes, I can see that the process is no longer alive and that its return code is -9.

    So I guess my question is: why is "cpoptimizer -angel" exiting with return code -9, and why is this happening randomly?
    Can I somehow enable more debug prints to gather more info?

    Thanks.



    ------------------------------
    Tomer Vromen
    ------------------------------
    #DecisionOptimization


  • 2.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Thu October 22, 2020 02:23 PM

    Hi Tomer,

    This is very strange and unusual. There are several possible explanations. Could you please check the following?

    1) Check whether the solver subprocess is being killed by some external cause: lack of memory, a problem with anti-virus software, etc.

    2) Run the cpoptimizer process standalone and post the start banner it prints in this thread (to get the exact solver version).
    If the solver is an academic version, there may be limitations on the size of problems that can be solved.

    3) Check if this problem occurs on a particular model, or on all of them.
    If it is on a single model, can you please export it in CPO format using the method mdl.export_model(out=<file_name>)?
    You can then try to solve it directly with standalone cpoptimizer (commands "read <file>" and then "optimize"). If it fails, it is possibly a bug in the solver.
    In any case, it would be helpful if you could send us your model in CPO format.

    Thanks a lot. I hope we will find the issue.



    ------------------------------
    Olivier Oudot
    ------------------------------



  • 3.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Thu October 22, 2020 02:24 PM
    Hi Tomer,

    Your problem is very strange and unusual. Could you please check the following.

    1) Check that no external cause (lack of memory, anti-virus, etc.) could explain the process being killed.

    2) Check the version of your solver.
    To do this, start the cpoptimizer process standalone and copy-paste the complete banner that is displayed into this thread.
    If your version is an academic version, large problems cannot be solved and are rejected.

    3) Verify whether the problem occurs for every model you solve, or only for a single one.

    4) If it occurs only for a single model, it could be a bug in the solver that occurs randomly.
    Export your model in CPO format using the method mdl.export_model(out=<filename>).
    Try to solve this model directly with the standalone cpoptimizer process. The commands are "read <filename>" and then "optimize".
    If possible, post your model in this topic.

    I hope we will find your problem. Sorry for the inconvenience.

    ------------------------------
    Olivier Oudot
    ------------------------------



  • 4.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Sun October 25, 2020 09:58 AM
    Hi Olivier,

    After further investigation, I found that docplex - the Python library - is the one killing the cpoptimizer process.

    The first clue was the Python documentation for Popen.returncode. It states:
    "A negative value -N indicates that the child was terminated by signal N (POSIX only)." Signal 9 is SIGKILL, so something was sending a SIGKILL to the process.

    After some looking around I found this logic in the constructor in solver_local.py:

    # Read initial version info from process
    self.version_info = None
    timer = threading.Timer(1, lambda: self.process.kill() if self.version_info is None else None)
    timer.start()
    evt, data = self._read_message()
    timer.cancel()

    This 1-second timer is quite arbitrary, and it seems to be too short in my case.
    I run the tests on a corporate cloud environment where not all disks are always mounted, and therefore the first access can be slow.
    Specifically, launching the cpoptimizer process can take more than 1 second sometimes, and this causes the timer to be triggered before even reading the first message.
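The race can be reproduced without docplex. The sketch below mimics the same timer pattern against a stand-in child process (a Python one-liner that sleeps before writing, not cpoptimizer). read_first_byte and its parameters are hypothetical names chosen for this illustration.

```python
import subprocess
import sys
import threading


def read_first_byte(child_delay, start_timeout):
    """Start a child that waits `child_delay` seconds before writing one byte,
    and kill it if nothing has been read after `start_timeout` seconds.

    Returns (first_byte, returncode).
    """
    child_code = (
        "import sys, time; time.sleep({}); "
        "sys.stdout.write('x'); sys.stdout.flush()".format(child_delay)
    )
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdout=subprocess.PIPE)
    state = {"first_byte": None}

    # Same idea as the library code: kill the child if no data arrived in time
    timer = threading.Timer(
        start_timeout,
        lambda: proc.kill() if state["first_byte"] is None else None,
    )
    timer.start()
    # Blocks until one byte arrives, or returns b"" at EOF (child was killed)
    state["first_byte"] = proc.stdout.read(1)
    timer.cancel()

    proc.stdout.close()
    proc.wait()
    return state["first_byte"], proc.returncode
```

Calling read_first_byte(child_delay=5, start_timeout=0.3) yields an empty read, which is the same symptom as "Nothing to read from local solver process", while a generous start_timeout lets the byte through.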

    I think there are 2 issues here:
    1. Arbitrary time limit of 1 second for launching the process.
    2. Using SIGKILL causes a very confusing error message.
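To make the second point less confusing while debugging, the negative return code can be decoded into a signal name. This is a small sketch using only the standard library; describe_returncode is a hypothetical helper, not something docplex provides.

```python
import signal


def describe_returncode(returncode):
    """Turn a Popen.returncode value into a readable description."""
    if returncode is None:
        return "still running"
    if returncode >= 0:
        return "exited with status {}".format(returncode)
    # On POSIX, -N means the child was terminated by signal N
    signum = -returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown signal"
    return "terminated by signal {} ({})".format(signum, name)
```

On a POSIX system, passing the -9 seen here reports SIGKILL, which points at something actively killing the process and not at a crash inside the solver.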

    Right now my solution is to catch the LocalSolverException error and try again, since only the first access is too slow. But that's an ugly workaround...

    ------------------------------
    Tomer Vromen
    ------------------------------



  • 5.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Mon October 26, 2020 07:01 AM
    Hi Tomer,

    You are totally right, your workaround works but is ugly. 

    I will change this code as soon as possible to:
     - Make this timer delay configurable,
     - Throw a more explicit exception if this case occurs.

    The fix will be available in the next release on PyPI (in a few days).

    Thanks a lot for your investigation, and sorry for the inconvenience.

    ------------------------------
    Olivier Oudot
    ------------------------------



  • 6.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Thu November 26, 2020 09:31 AM
    Hi Olivier,

    I saw that there's a new release with a fix for this (and for other smaller issues I reported). Thank you for the quick resolution!

    ------------------------------
    Tomer Vromen
    ------------------------------



  • 7.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Thu November 26, 2020 12:19 PM

    Hi Tomer,

    Thank you! You have helped improve the product.

    Don't hesitate to report any other problem you may find, even small. 



    ------------------------------
    Olivier Oudot
    ------------------------------



  • 8.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Tue December 29, 2020 12:47 PM

    Hi Olivier, I ran into the same issue with docplex. My Python code runs well in my Windows development environment, but when I deployed it to Linux, I got the error "docplex.cp.solver.solver_local.LocalSolverException: Nothing to read from local solver process. Check its availability".

    It seems there is a fix for this issue. Could you please provide a download link?

    Many thanks



    ------------------------------
    KE ZHANG
    ------------------------------



  • 9.  RE: Getting random "Nothing to read from local solver process" from docplex.cp

    Posted Mon April 03, 2023 09:57 AM

    Hi,

    I am adding this for reference, because I faced the same issue and this page is the first result on Google.
    This timeout is now configurable using:

    from docplex.cp.config import context
    context.solver.local.process_start_timeout = 10


    Best,

    Pierre Tassel



    ------------------------------
    Pierre Tassel
    ------------------------------