We recently upgraded our MQ servers from 9.2 to 9.3. Since then we've noticed issues with keeping the client applications balanced across the cluster. Before 9.3 this was not an issue. We probably need to tweak some of the options, either at the qmgr, client or both. I've been reviewing the latest IBM documentation around application balancing and have a few questions about some of the newer states/options.

First, some background:
-- We have a two-node uniform cluster, each node running a single qmgr.
-- Clients use a CCDT to connect.
-- Suspect clients are all MQ consumers.
-- Four instances of each client are running.
-- After the initial server restart the DIS APSTATUS command shows all instances of each client are balanced (i.e., 2 connected to each qmgr).

Question 1
Running DIS APSTATUS('app-name') type(local) shows the following output:

AMQ8932I: Display application status details.
   APPLNAME(app-name)        CONNS(2)
   IMMREASN(INTRANS)         IMMCOUNT(0)
   IMMDATE( )                IMMTIME( )
   MOVABLE(NO)               TYPE(LOCAL)
   BALTYPE(SIMPLE)           BALOPTS(NONE)
   BALTMOUT(10)

I'm concerned that it has MOVABLE(NO). From the docs, the reason is INTRANS, meaning that it's inside a transaction. This client is using transactions. At what point does this client become "movable"? The docs indicate there are parameters that can be set to allow the qmgr to preempt the INTRANS state to maintain the balance. What is the best practice for this? The code is written to respond to rollbacks, so having one occur due to a rebalance is acceptable.

Question 2
One of our clients showed as balanced right after the qmgrs were restarted, but a few minutes later all four instances had reconnected to one of the qmgrs. The IMMREASN had changed to NOREDIRECT. This is also a new state with 9.3, and indicates the client has informed the qmgr it cannot process redirect hints. It is using the CCDT, so what else could cause this change?

Thanks,
Jim
Just looking at Question 1 in this reply. The best practice for clients that work with transactions is the default. The default is to only ask a client to rebalance when it is NOT in the middle of a transaction. This seems fairly sensible to me.
However, if you want to change this default setting and go back to the behaviour in V9.2 whereby the client could be asked to rebalance even while it is in the middle of a transaction, then you can set up your mqclient.ini file thus:-

ApplicationDefaults:
   Type=Simple
   BalanceTimeout=Never
   BalanceOptions=IgnTrans

It is necessary to supply all three of these attributes even though we only want to change the third one. If you want different values for the first two, go ahead and change them, but it's the third attribute that affects the transaction behaviour.
I too am a little surprised by the output showing MOVABLE(NO). It should be fairly easy to test whether your clients are told to move when they finish a transaction. One easy way to test is just to SUSPEND one of the QMGRs from the CLUSTER, and watch what happens to the client applications. Then once you have seen enough, RESUME the suspended QMGR again.
Thanks, Morag. I will try your suggestion about suspending one of the qmgrs. Regarding a change to the settings, do you know if there is a way to implement this on the server side? It will take some time to modify and test all the clients. I am hoping there is a way to tell a 9.3 server to behave like 9.2 in regards to balancing, at least until we can do further research.
Jim
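The suspend/resume test Morag describes could be run roughly as follows from runmqsc on one of the two queue managers. This is only a sketch: the cluster name UNICLUS1 is a placeholder (the thread does not give the real cluster name), and 'app-name' stands in for the real application name, as in the original post.

```
* Run on the queue manager you want clients to move away from.
* UNICLUS1 is a hypothetical cluster name; substitute your own.
SUSPEND QMGR CLUSTER(UNICLUS1)

* Per-queue-manager view of where the application instances are connected.
DISPLAY APSTATUS('app-name') TYPE(QMGR)

* Local view, as in the original post (MOVABLE, IMMREASN, and so on).
DISPLAY APSTATUS('app-name') TYPE(LOCAL)

* Once you have seen enough, put the queue manager back into the cluster.
RESUME QMGR CLUSTER(UNICLUS1)
```

Repeating the two DISPLAY commands a few times while the queue manager is suspended should show whether the instances move across once their in-flight transactions complete, or only after the balance timeout.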
For question 2, the full set of root causes for a NOREDIRECT immovable reason (at least to the best of my knowledge) are:
Of these I would guess the most likely to catch you unawares would be accidentally setting MQSERVER when a CCDT has been provided.
It is true that, prior to recent changes, applications which could not correctly honour redirect requests (for one of the above reasons) would sometimes be asked to rebalance. The effect of this would be that the app would reconnect 'randomly' to the Uniform Cluster (which might over time result in an even spread, but could also have undesirable effects, e.g. 'bouncing' around the cluster). If you have specific applications that fall foul of one of the above checks but which you would still want the cluster to attempt to rebalance, that may be possible - probably worth emailing me directly with further details.

Thanks,
Anthony
Hi again Jim

Following on from your and Morag's comments on Question 1:
As Morag says, the new default is nearly always 'sensible' behaviour - we don't make changes in default behaviour lightly (so this was considered carefully in the design stages), and the reason we did so here was that the 9.2 behaviour seemed 'unusual' to the point of being almost outright incorrect. It is nearly always preferable to allow a client application to complete its work naturally and close out the transaction, rather than interrupting the unit of work immediately. As a 'backstop' for badly behaved applications, the default behaviour is to interrupt after 10s to rebalance (if required - i.e. we haven't already managed to find better candidates to restore balance).

I'm afraid there is no way to modify this behaviour purely from the server side. It should also be noted that the ini settings Morag suggests will only affect MQI (e.g. C) client applications. You don't mention which client libraries/runtime are involved here?
I'm slightly surprised/disappointed that you are not seeing an IMMDATE/IMMTIME for the INTRANS application, as the general intent is that for temporary immovable states these fields should indicate how long that situation is expected to persist. There may be some fundamental reason I've forgotten why that is not possible in this case, but at the least the documentation in that area should probably be improved. You could raise a ticket to look into this if it's a significant issue for you, but I'll take the action in the background to investigate further.

Regards
Anthony