The Flow Analysis view of the web UI from IIB v9.0 can be used to identify a number of performance-related problems. You can use the view in a tight cycle: make an iterative change, redeploy, and check the next set of statistics (usually 20 seconds later) to see whether the change has made a difference. Alternatively, you can first do an in-depth analysis of the flow before deciding whether to make any changes to it.
Ready?
As explained in an earlier post, you first need to enable the statistics collection for a flow. This can be done from the WebUI:
Figure 1. Enable statistics for a flow from the WebUI
or from the command line, for example:
mqsichangeflowstats
Log in to the WebUI (if you haven’t already done so), typically on a URL like http://
Figure 2. Select a flow and open its statistics tab
Here are the tools you get on the Statistics page:
- three line charts, each showing a flow statistic of your choice from a set of some 26 metrics over a time interval of your choice. You can change the metric independently on each line chart, or you can choose a different time range. “Session” is the time interval since this view was opened.
Figure 3. Line charts for a flow’s statistics
- a tabular view of node statistics
Figure 4. Statistics for each node in the flow
- a static, read-only flow profile view, with the nodes and the connections between them
That’s what is available. Let’s look at how you can use these tools.
What is normal anyway?
It is extremely helpful to establish the expected patterns of behaviour for message flow metrics when the message flow is running well. This would be data such as message rate, response time and Average CPU/message values. This can then help you quickly determine whether the current behaviour that is being observed is as expected or unusual in some way.
In some cases, it may be possible to determine the cause of a change in behaviour or resolve a problem from a single set of statistics. In other cases, you may need to observe behaviour over several periods to see whether there is a trend or change in trend developing.
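The comparison against a known-good baseline can be sketched in a few lines of Python. This is only an illustration, not part of IIB: the function name `is_unusual`, the tolerance of three standard deviations and the sample message rates are all assumptions, and the values would have to be noted by hand from the line charts or taken from your own monitoring.

```python
from statistics import mean, stdev

def is_unusual(baseline, current, tolerance=3.0):
    """Return True if `current` lies more than `tolerance` standard
    deviations away from the mean of the baseline samples."""
    centre = mean(baseline)
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any difference counts as unusual.
        return current != centre
    return abs(current - centre) > tolerance * spread

# Hypothetical message rates (messages/second) recorded while the flow
# was known to be running well.
baseline_rates = [118.0, 122.0, 120.0, 119.0, 121.0]

print(is_unusual(baseline_rates, 120.5))  # close to the baseline
print(is_unusual(baseline_rates, 60.0))   # far below the baseline
```

The same check can be applied to response time or CPU per message; the tolerance is a judgment call that depends on how variable the flow normally is.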
This blog post will focus on a common performance problem, that of a low message rate. Future blog posts will look at different problems.
Your problem: Message rate that is low or dropping relative to expectations.
The first step is to select Message Rate in one of the drop-down boxes next to the graphs on the Flow Analysis view and analyse its shape over time.
Possible causes
- No messages to be processed
Let’s first eliminate the simplest reason behind this problem: the flow has nothing to do.
Look at the shape of the graph, including the maximum value for the message rate. Is this as high as you’d expect or is it flatlining on zero? You can also select Total Input Message or look at the Number of invocations column for the flow in the Flow Comparison view.
Messages coming in? Good, let’s see other possibilities.
- A high number of backouts in message flow processing
Try selecting “Message rate”, “Total number of commits” and “Total number of backouts”, one on each of the three graphs, and see if there is any correlation.
You’d like a high number of commits, correlating closely with message rate. You would expect the number of backouts to be zero. Any backouts should be investigated.
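If you note down the commit and backout totals for a few intervals, the check described above can be sketched as follows. The helper `backout_ratio` and the sample figures are hypothetical, and treating commits plus backouts as the total number of transactions is a simplification made for this illustration.

```python
def backout_ratio(commits, backouts):
    """Fraction of transactions in an interval that were backed out."""
    total = commits + backouts
    return backouts / total if total else 0.0

# Hypothetical (commits, backouts) pairs, one per statistics interval.
snapshots = [(400, 0), (395, 0), (210, 180)]

for interval, (commits, backouts) in enumerate(snapshots, start=1):
    ratio = backout_ratio(commits, backouts)
    # Any backout at all is worth investigating.
    verdict = "investigate" if backouts > 0 else "ok"
    print(f"interval {interval}: commits={commits} "
          f"backouts={backouts} ratio={ratio:.2f} {verdict}")
```

In this made-up data the third interval shows commits dropping while backouts rise, which is the kind of correlation to look for on the graphs.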
- Another flow is the culprit
You want to include connected flows. Often more than one message flow is involved in processing messages. Examples would be request and reply processing or an aggregation flow in which requests are sent to external systems for processing. In this case look at all of the flows that form part of the processing of the application.
You can use the Flow Comparison view at the right level (maybe application) to focus on related flows, or you can open the Statistics page for each of these related flows in multiple browser tabs or windows and analyse the statistics that way.
- A high response time for a synchronous call within the message flow
It is a good idea to check the Average Elapsed Time/Invocation and Total Elapsed time metrics and order the Nodes Table by Average Elapsed time to find the slowest node and its type. Does that node make a slow synchronous call? Or is there an MQGET node in the middle of a flow waiting for a response message that has not yet arrived? Expand the flow profile section to see how the nodes are connected.
Processing rates may have slowed because the response time of a call to an external service has substantially increased due to a problem with that service. In such a situation the flow would be running slowly because of the performance of another application or service, not as a result of a problem with the flow itself.
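Ordering the nodes by average elapsed time is a one-click sort in the Nodes Table, but the idea can also be sketched on data you have copied out of it. The `NodeStats` class, the node names and types, and the timings below are all hypothetical; the only assumption about units is that every node uses the same one (microseconds here).

```python
from dataclasses import dataclass

@dataclass
class NodeStats:
    name: str
    node_type: str
    invocations: int
    total_elapsed_us: int  # total elapsed time, assumed to be in microseconds

    @property
    def avg_elapsed_us(self):
        """Average elapsed time per invocation (0.0 if never invoked)."""
        if self.invocations == 0:
            return 0.0
        return self.total_elapsed_us / self.invocations

def slowest_first(nodes):
    """Return the nodes ordered by average elapsed time, slowest first."""
    return sorted(nodes, key=lambda n: n.avg_elapsed_us, reverse=True)

# Hypothetical figures for a three-node flow.
nodes = [
    NodeStats("MQ Input", "MQInput", 1000, 150_000),
    NodeStats("Call Service", "SOAPRequest", 1000, 42_000_000),
    NodeStats("Transform", "Compute", 1000, 900_000),
]

for node in slowest_first(nodes):
    print(f"{node.name:<14} {node.node_type:<12} {node.avg_elapsed_us:>10.1f} us")
```

Here the node that calls the external service dominates the elapsed time, which would point to the service rather than the flow logic.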
- Lack of resources (CPU, IO, memory) for the message flow to process with
The best way to check this quickly is to use a system-level monitoring tool such as Task Manager on Windows, nmon on AIX or top on Linux to see how busy the system as a whole is. If the system is busy processing other components like an application server or database, then the flows are not going to be able to process messages as required.
To see whether a flow is heavily dependent on the availability of CPU, that is CPU bound, look at metrics such as Average Elapsed Time/Invocation, Average CPU Time/Invocation, Total CPU Time, Total Elapsed time, Number of Threads in Pool, Times Maximum Number of Threads Reached and analyse correlation between the shapes of these graphs.
Flows that are CPU bound will have Average CPU Time/Invocation values that are very similar to Average Elapsed Time/Invocation. Flows which have a significant difference in those times are not CPU bound.
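The rule of thumb above can be sketched as a ratio of the two averages. The function names and the 0.9 threshold are assumptions chosen for illustration; what counts as “very similar” is a judgment call, and the sample timings are made up.

```python
def cpu_to_elapsed_ratio(avg_cpu, avg_elapsed):
    """Share of the elapsed time per invocation that was spent on CPU.
    Both values must be in the same unit."""
    if avg_elapsed <= 0:
        raise ValueError("average elapsed time must be positive")
    return avg_cpu / avg_elapsed

def looks_cpu_bound(avg_cpu, avg_elapsed, threshold=0.9):
    """True if CPU time accounts for at least `threshold` of elapsed time.
    The default threshold is an arbitrary choice for this sketch."""
    return cpu_to_elapsed_ratio(avg_cpu, avg_elapsed) >= threshold

# Hypothetical averages per invocation, in microseconds.
print(looks_cpu_bound(950, 1000))    # CPU time close to elapsed time
print(looks_cpu_bound(1200, 42000))  # most of the time is spent waiting
```

A low ratio suggests the flow spends most of its time waiting, for example on a synchronous call or on I/O, so adding CPU is unlikely to raise the message rate.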
In future posts, we’ll look at other common problems with flow performance that the Flow Analysis view can help identify.
Acknowledgements: Many thanks to Tim Dunn for reviewing this post.