MQ

  • 1.  Spring not connecting correctly to HA multi-instance

    Posted Tue September 21, 2021 05:18 AM
    Hi Folks,
    I'm using a Spring Boot application with an asynchronous-receive JMS listener (various recent versions of Spring), working against a Windows multi-instance queue manager (MQ 9.2). I see two things:

    1) The connection never reappears after fail-over. This happens occasionally, usually around two fail-overs after an exception has been registered at the JMS listener.

    2) The connection reappears, but its asynchronous state is inactive (as if Spring is using a stale connection, perhaps?). I can reproduce this regularly with just a few fail-overs.

    I've tried multiple Spring listener containers, from the SimpleJms container through to the default container.

    I can't reproduce the issue when I use the ibm-mq-spring-boot library. However, this lib is not supported (hmmm).

    My worry is one of these:
    1) If I use the ibm-mq-spring-boot lib, it is only masking the issue, and the issue will reappear later.
    2) There is an underlying issue with the base MQ JMS auto-reconnect that Spring Boot just happens to bring to the surface more quickly than if we weren't using it.

    Can anyone confirm that they have thoroughly tested the MQ JMS auto-reconnect behaviour within a Spring environment?
    Presumably asynchronous reconnection is supported by IBM with Spring? And how come the ibm-mq-spring-boot lib works? The code doesn't look that complex, but it is clearly doing enough to avoid this issue.

    Thanks for any help on this!
    John.




    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------


  • 2.  RE: Spring not connecting correctly to HA multi-instance

    Posted Wed September 22, 2021 02:57 AM
    An important feature of ibm-mq-spring-boot is that it increases the DefaultMessageListenerContainer receive timeout (spring.jms.listener.receiveTimeout) from 1 second to 30 seconds. Maybe this is the reason for the differences you see in your tests.

    If you are using the DefaultMessageListenerContainer, the JavaDoc says you should not use a CachingConnectionFactory, which is the default in Spring Boot. That may also be a source of reconnection problems.


    ------------------------------
    Daniel Steinmann
    ------------------------------



  • 3.  RE: Spring not connecting correctly to HA multi-instance

    Posted Wed September 22, 2021 04:53 AM
    Thanks Daniel,
    I was using the caching connection factory but removed it. Interesting about the default receive timeout. I'll try the non-IBM lib with that setting and see if it has any effect.

    However, having discussed this further with my customer, I'm finding that they see the first issue (where MQ doesn't connect at all) in their non-Spring environment as well. I'm coming to the conclusion that this may be an underlying issue with either the MQ libs or their network. I doubt it's their network, as I see no netstat data when it's failing, which suggests to me that MQ just isn't seeing the connection loss.

    Thanks for the tip though.

    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------



  • 4.  RE: Spring not connecting correctly to HA multi-instance

    Posted Fri September 24, 2021 04:42 AM
    Posting this message here on behalf of Josh McIver:

    What is the SVRCONN HBINT set to? By default it is 300, which means that a heartbeat is only initiated by the client every 5 minutes. If the connection passes through some sort of network device that will time it out (a firewall or load balancer, for instance), you may need to increase that timeout or lower the HBINT.

    Also, with an HBINT of 300 the client may take up to 360 seconds, and the queue manager up to 370 seconds, to notice that the connection was broken.

    If the HBINT is 60 or more the timeout is HBINT + 60.  If the HBINT is less than 60 then the timeout is HBINT*2.

    I have found 15 is a good HBINT: not too much traffic, and broken connections take only up to 30 seconds on the client side and 40 seconds on the queue manager side to notice. The extra ten is because the client is always supposed to initiate the heartbeat. The queue manager waits HBINT + 10; if it does not receive a heartbeat from the client, it sends one to the client, and its timeout of HBINT starts at that point. So with an HBINT of 15, the queue manager waits 25 seconds, sends a heartbeat, then waits 15 more to receive a response.
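
    The timeout arithmetic described in this post can be sketched in a few lines of Java. This is only an illustration of the rule as stated here, not anything taken from the MQ libraries: the class and method names are hypothetical, and the "+ 10" on the queue manager side is inferred from the two worked figures in the post (360/370 seconds for an HBINT of 300, and 30/40 seconds for an HBINT of 15).

```java
// Hypothetical helper, not part of any IBM MQ API: it only encodes the
// rule of thumb given in this thread for how long each side may take to
// notice a broken connection for a given SVRCONN HBINT (in seconds).
class HeartbeatTimeouts {

    // Client side: HBINT + 60 when HBINT is 60 or more, otherwise HBINT * 2.
    static int clientDetectSeconds(int hbint) {
        if (hbint <= 0) {
            // The rule quoted in the thread does not cover HBINT <= 0.
            throw new IllegalArgumentException("HBINT must be positive: " + hbint);
        }
        return hbint >= 60 ? hbint + 60 : hbint * 2;
    }

    // Queue manager side: assumed to be the client figure plus 10 seconds,
    // matching the 370 s and 40 s examples given in the post.
    static int queueManagerDetectSeconds(int hbint) {
        return clientDetectSeconds(hbint) + 10;
    }

    public static void main(String[] args) {
        // Compare the default HBINT (300) with the suggested value (15).
        for (int hbint : new int[] {300, 15}) {
            System.out.printf("HBINT=%d: client up to %d s, queue manager up to %d s%n",
                    hbint, clientDetectSeconds(hbint), queueManagerDetectSeconds(hbint));
        }
    }
}
```

    Running it for 300 and 15 gives the same figures quoted above, which makes it easy to try other candidate HBINT values before changing the channel definition.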

    Hope this may help.

    ------------------------------
    Josh McIver
    ------------------------------

    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------



  • 5.  RE: Spring not connecting correctly to HA multi-instance

    Posted Fri September 24, 2021 04:46 AM
    Hi Josh,

    Been there, unfortunately. Changing the heartbeat does not affect the case where the connection never happens: at that point there is no connection at all to break according to netstat, and I've left it for 15 minutes with nothing happening. It did seem to affect the speed at which fail-over happened when all went well, as expected.

    cheers,
    john.

    ------------------------------
    John Hawkins
    Integration Consultant
    ------------------------------