MQ

  • 1.  Client retry attempts

    IBM Champion
    Posted Wed August 16, 2023 04:10 AM

    Hi Folks,

    I've got a connection to a multi-instance QM. At the moment the standby instance is the active one. In my client connection list I have the primary and then the secondary, in that order. The client takes about 20 seconds to recognise that the primary QM is down before moving on to connect to the secondary QM. I think this behaviour is due to the Connect_Timeout value in the client:

    Connect_Timeout = number

    The number of seconds before an attempt to connect the socket times out. The default value of zero specifies that there is no connect timeout.

    IBM MQ channel processes connect over nonblocking sockets. Therefore, if the other end of the socket is not ready, connect() returns immediately with EINPROGRESS or EWOULDBLOCK. Following this, connect will be attempted again, up to a total of 20 such attempts, when a communications error is reported.

    The Connect_Timeout definition says that it will ALWAYS try 20 times on a failing connection, so I assume my client is taking around a second to fail and then retrying 20 times. Is there any way that I can reduce this retry number?

    Many thanks!
    John.

     



    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------


  • 2.  RE: Client retry attempts

    IBM Champion
    Posted Fri August 18, 2023 01:29 AM

    I suspect you are actually waiting for the TCP layer to time out the connect() call, which is normally 20 seconds. In this situation I assume the IP address of the primary (the first one in the list) either doesn't exist in the system at all or cannot be reached. To change this, I think you need to look for TCP/IP settings, which would be machine-wide.

    Cheers,
    Morag



    ------------------------------
    Morag Hughson
    MQ Technical Education Specialist
    MQGem Software Limited
    Website: https://www.mqgem.com
    ------------------------------



  • 3.  RE: Client retry attempts

    IBM Champion
    Posted Fri August 18, 2023 04:11 AM

    Hi Morag,

    The customer changed this value to 2 seconds, and it now takes around 2 seconds before the client tries the connection to the secondary QM (the one that is now alive). So I'm a little confused as to what this documentation is actually telling me.

    Maybe it's trying to say that it will try both connections 20 times before failing? I'm still not clear on why, if we don't set this, it takes 20 seconds to fail and try the second address.

    However, we have a solution that works for us :-)
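
    For anyone following along, the change described above would look roughly like the fragment below. This is only a sketch: it assumes the setting is made in the TCP stanza of the client configuration file (mqclient.ini) and uses the 2-second value mentioned in this post. The file location and any other attributes in the stanza depend on the installation.

```
TCP:
   Connect_Timeout=2
```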



    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------



  • 4.  RE: Client retry attempts

    Posted Fri August 18, 2023 12:33 PM
    Edited by Tim Zielke Tue August 22, 2023 08:27 AM

    You should ask IBM to clarify this doc, but I read it as follows.

    Here is the complete doc:

    Connect_Timeout = number

    The number of seconds before an attempt to connect the socket times out. The default value of zero specifies that there is no connect timeout.

    This attribute can be read by C, unmanaged .NET, IBM MQ classes for Java, and IBM MQ classes for JMS clients.

    IBM MQ channel processes connect over nonblocking sockets. Therefore, if the other end of the socket is not ready, connect() returns immediately with EINPROGRESS or EWOULDBLOCK. Following this, connect will be attempted again, up to a total of 20 such attempts, when a communications error is reported.

    If Connect_Timeout is set to a non-zero value, IBM MQ waits for the stipulated period over select() call for the socket to get ready. This increases the chances of success of a subsequent connect() call. This option might be beneficial in situations where connects would require some waiting period, due to high load on the network.

    There is no relationship between the Connect_Timeout, ClntSndBuffSize, and ClntRcvBuffSize parameters.

    Here is how I read it:

    This section applies when Connect_Timeout=0:

    IBM MQ channel processes connect over nonblocking sockets. Therefore, if the other end of the socket is not ready, connect() returns immediately with EINPROGRESS or EWOULDBLOCK. Following this, connect will be attempted again, up to a total of 20 such attempts, when a communications error is reported.

    This section applies when Connect_Timeout>0:

    If Connect_Timeout is set to a non-zero value, IBM MQ waits for the stipulated period over select() call for the socket to get ready. This increases the chances of success of a subsequent connect() call. This option might be beneficial in situations where connects would require some waiting period, due to high load on the network.

    I could be misunderstanding it, but that makes more sense to me based on how a TCP connect call works over a nonblocking socket. When a TCP application connects over a nonblocking socket, it gets control back right away and does not block. This means that the TCP three-way handshake could still be in progress. For the client application to then know whether the connection was successful, a select call is usually made on the socket.

    It sounds like, for the default behavior of Connect_Timeout=0, the socket is polled up to twenty times with the select call until it is flagged as a failed connection. For the use case of Connect_Timeout>0, select is called with the provided timeout value, and the connection attempt fails if that time is exhausted. That is at least how TCP code is typically written for these use cases and what it sounds like the documentation is describing.
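
    To make the pattern above concrete, here is a minimal Python sketch of a nonblocking connect followed by a select call with a timeout, then a fallback to the next address in a list. It is not IBM MQ's actual code, only an illustration of how this kind of TCP logic is commonly written. The function names try_connect and connect_first_available are hypothetical, and the sketch assumes a Unix-like platform where a failed connect is reported through SO_ERROR once the socket becomes writable.

```python
import errno
import select
import socket


def try_connect(host, port, timeout_s):
    """Attempt one nonblocking TCP connect, waiting up to timeout_s seconds.

    Returns a connected socket, or None if the attempt failed or timed out.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # On a nonblocking socket connect_ex() returns immediately, usually
    # with EINPROGRESS while the three-way handshake is still in flight.
    rc = sock.connect_ex((host, port))
    if rc == 0:
        return sock
    if rc not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        return None
    # Wait for the socket to become writable, but no longer than timeout_s.
    _, writable, _ = select.select([], [sock], [], timeout_s)
    if not writable:
        # Timed out: nothing answered within the allowed period.
        sock.close()
        return None
    # Writable does not mean connected; SO_ERROR holds the real outcome.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        return None
    return sock


def connect_first_available(addresses, timeout_s):
    """Try each (host, port) in order; return the first that connects."""
    for host, port in addresses:
        sock = try_connect(host, port, timeout_s)
        if sock is not None:
            return (host, port), sock
    return None, None
```

    With a list such as [primary, secondary], the time spent on an unreachable primary is bounded by timeout_s before the secondary is tried, which matches the roughly 2-second failover reported earlier in the thread once the timeout was set to 2.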



    ------------------------------
    Tim Zielke
    ------------------------------