Introduction
The Business Transaction Monitoring (BTM) capability provided in IBM Integration Bus V10.0.0.3 allows a business user to track the life-cycle of a business transaction that has been processed by multiple message flows. To do this, BTM exploits and builds on the existing message flow monitoring functionality to capture and correlate events that are published by the message flows involved in a business transaction and record this information to a database.
This article explores the performance considerations when using the BTM feature of IIB. It is suggested you also review these other articles in this series:
- Business Transaction Monitoring – Why, what, how
- Business Transaction Monitoring in IIB
- Advanced usages of Business Transaction Monitoring
- Archiving Business Transaction Monitoring Data
- Business Transaction Monitoring vs Record and Replay
Overview
BTM builds on the existing Event Monitoring and Record/Replay features of IIB. Monitoring event messages are published from the message flow threads using IBM MQ PubSub. By configuring a subscription on the monitoring topic, these messages are put to a queue, from which they are read by a nominated Integration Server and written to a database.
The Integration Node also maintains an in-memory cache, which is used to correlate the events of a Business Transaction Definition (BTD). This cache is optimised to ensure viewing of the business transactions and their state performs well. There are no configuration options for the in-memory cache, and therefore it is not a focus of this article.
Instead, this article focuses on the performance considerations of recording the monitoring events.
The following diagram shows the different components required to record business transactions, which are:
- IBM Integration Bus
- IBM MQ
- A database (IBM DB2 or Oracle)
As a message flow processes transactions, it publishes (using IBM MQ PubSub) monitoring event messages to a topic. These publications are written to the SYSTEM.BROKER.DC.RECORD queue via a subscription. The ‘BTM Recorder’ gets these messages from the queue and records them to the configured database, as well as maintaining the in-memory cache.
For simplicity, the diagram shows the ‘BTM Recorder’ in a separate Integration Server, but it can be a part of any Integration Server. Also, one or many message flows, deployed to one or many Integration Servers within the same Integration Node can publish monitoring event messages as part of a business transaction.
A single subscription queue (SYSTEM.BROKER.DC.RECORD) and BTM Recorder exist per Integration Node to record and manage all monitoring event messages.
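The recording path described above can be pictured with a small toy model. This is only an illustrative sketch and not how IIB is implemented: a Python `queue.Queue` stands in for SYSTEM.BROKER.DC.RECORD, an in-memory SQLite table stands in for the DB2 or Oracle database, and the table, column and function names are all hypothetical.

```python
import queue
import sqlite3

# Stand-in for the single SYSTEM.BROKER.DC.RECORD queue (hypothetical model).
record_queue = queue.Queue()

def publish_event(transaction_id, event_name):
    """A message flow publishes an event; the subscription puts it on the queue."""
    record_queue.put((transaction_id, event_name))

def run_recorder(db, cache):
    """Toy 'BTM Recorder': get each event, write it to the database, update the cache."""
    while not record_queue.empty():
        transaction_id, event_name = record_queue.get()
        db.execute("INSERT INTO events VALUES (?, ?)", (transaction_id, event_name))
        cache.setdefault(transaction_id, []).append(event_name)
    db.commit()

db = sqlite3.connect(":memory:")  # stand-in for the configured database
db.execute("CREATE TABLE events (transaction_id TEXT, event_name TEXT)")
cache = {}                        # stand-in for the in-memory cache

publish_event("order-1", "Start")
publish_event("order-1", "End")
run_recorder(db, cache)
```

The point of the model is that every event, from every flow, passes through one queue and one recorder, which is why those two components set the ceiling on recording throughput.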
Performance Considerations
Use the following guidance to ensure that the Integration Node and the BTM feature perform optimally.
Monitoring events
Each event message that a message flow constructs and publishes increases the CPU work within the Integration Server. Consider how many event messages are really needed to record a business transaction, and publish and record only the critical events of that transaction.
Monitoring events from common message flows may also be included in more than one business transaction definition. If this is the case, the events will be recorded multiple times in the database.
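As a rough illustration of this multiplication effect, the following sketch estimates the number of recorded events. It assumes one recorded copy per business transaction definition that includes the event; the event names and counts are hypothetical.

```python
def rows_recorded(events_emitted, btd_count_by_event):
    """Estimate recorded events, assuming one copy per BTD that includes each event."""
    return sum(
        count * btd_count_by_event.get(event, 0)
        for event, count in events_emitted.items()
    )

# Hypothetical example: an event from a common flow belongs to two BTDs.
emitted = {"CommonFlow.Start": 1000, "OrderFlow.End": 1000}
memberships = {"CommonFlow.Start": 2, "OrderFlow.End": 1}
print(rows_recorded(emitted, memberships))  # prints 3000
```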
In addition to the number of events, there are several configuration options available when defining a monitoring event on a node within the IBM Integration Bus Toolkit, which can also have an impact on performance.
The first of these is the event filter:
Specifying an event filter increases the amount of work needed to publish a monitoring event, especially if the expression refers to data that has not yet been parsed. However, if the filter reduces the overall number of emitted event messages, there could be a net benefit. Remember also that events that are part of a business transaction definition cannot be optional.
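The trade-off can be reasoned about with a simple cost model. The sketch below is an assumption made for illustration, not a measured IIB cost: the filter is evaluated for every candidate event, and the cost of building and publishing the event is paid only when the filter passes. The cost units and values are hypothetical.

```python
def cost_per_candidate_event(event_cost, filter_cost=0.0, pass_fraction=1.0):
    """Average cost per candidate event: filter always runs, event built only on a pass."""
    return filter_cost + pass_fraction * event_cost

def break_even_pass_fraction(event_cost, filter_cost):
    """The filter is a net benefit only if it passes less than this fraction of events."""
    return 1.0 - filter_cost / event_cost

# Hypothetical costs in arbitrary units of CPU work.
unfiltered = cost_per_candidate_event(event_cost=1.0)
filtered = cost_per_candidate_event(event_cost=1.0, filter_cost=0.2, pass_fraction=0.5)
print(filtered < unfiltered)  # True: half the events are suppressed
```

Under this model, a filter that passes nearly every event only adds cost, which is consistent with the guidance above.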
The second option is the inclusion of a payload in the event:
Including payload data may require the Integration Server to parse input data, add the extra data to the monitoring event message and subsequently serialise a larger message.
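To give a feel for how much larger the event message can become, the following sketch compares the size of a raw payload with its text-encoded forms. It assumes the bitstream is carried inside the event as base64 or hexadecimal text; which encoding applies depends on the options chosen for the event, and the 3000-byte payload is a hypothetical example.

```python
import base64
import binascii

def encoded_sizes(payload):
    """Return the payload size in bytes when raw, base64 encoded and hex encoded."""
    return {
        "raw": len(payload),
        "base64": len(base64.b64encode(payload)),
        "hex": len(binascii.hexlify(payload)),
    }

sizes = encoded_sizes(b"\x00" * 3000)  # hypothetical 3000-byte message
print(sizes)
```

Base64 adds roughly one third to the payload size and hexadecimal doubles it, before counting the rest of the event message.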
It is therefore recommended that care is taken when selecting which monitoring event messages to emit, and whether a filter should be applied, as well as whether payload data is really needed.
SYSTEM.BROKER.DC.RECORD queue
All monitoring events emitted by any Integration Servers for a single Integration Node are subscribed and put to the single MQ queue SYSTEM.BROKER.DC.RECORD.
Depending on the expected load, tune this queue so that it performs well. In particular, if the queue depth is likely to increase during busy periods, tune the queue buffer sizes to avoid disk I/O.
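One way to check the depth of this queue is to issue the MQSC command `DISPLAY QLOCAL(SYSTEM.BROKER.DC.RECORD) CURDEPTH` through `runmqsc`. The sketch below wraps that command in Python and extracts the `CURDEPTH` value from the output. It assumes `runmqsc` is on the path and the user is authorised to run it; the function names are illustrative and the queue manager name must be supplied by the caller.

```python
import re
import subprocess

QUEUE = "SYSTEM.BROKER.DC.RECORD"

def parse_curdepth(mqsc_output):
    """Extract the CURDEPTH(n) value from runmqsc output, or None if absent."""
    match = re.search(r"CURDEPTH\((\d+)\)", mqsc_output)
    return int(match.group(1)) if match else None

def current_depth(queue_manager, queue=QUEUE):
    """Run the DISPLAY QLOCAL command via runmqsc and return the queue depth."""
    command = f"DISPLAY QLOCAL({queue}) CURDEPTH\n"
    result = subprocess.run(
        ["runmqsc", queue_manager],
        input=command, capture_output=True, text=True, check=True,
    )
    return parse_curdepth(result.stdout)
```

Calling `current_depth` at a regular interval during a test run gives a series of depth samples that can be compared over time.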
BTM Recorder
The BTM Recorder gets messages from the SYSTEM.BROKER.DC.RECORD queue, and correlates them as part of a business transaction, whilst maintaining the in-memory cache and writing the event messages to the database.
By default, the Integration Node dynamically determines on start-up which Integration Server will host the BTM Recorder. It is possible to specify an Integration Server for the BTM Recorder, which is discussed in one of the other articles in this BTM series.
Database
IIB provides the DDL to create the necessary tables and indexes to support the record/replay and BTM features. To ensure the database works optimally, it is recommended that any network latency between the BTM Recorder and the database be kept to a minimum. Also ensure the database has sufficient CPU resources, and is located on the fastest disks possible.
Performance Testing
As with all performance testing, it is recommended to start small and gradually increase the workload, whilst observing key metrics such as: transaction rate, CPU utilisation, network bandwidth and I/O.
When performance testing BTM, it may be preferable to test different types of business transactions separately, before combining them in a mixed workload test. This is to establish the cost for each of the different types.
The maximum number of monitoring events that a single Integration Node can handle is determined by the combined performance of publishing the messages to the MQ queue, the BTM Recorder getting them from the queue, and inserting them into the database.
If monitoring events are emitted faster than the BTM Recorder can process them, the depth of the SYSTEM.BROKER.DC.RECORD queue is likely to grow. This is acceptable for short periods, provided that an equivalent period of reduced workload follows to allow the BTM Recorder to catch up, but it is not sustainable permanently. The performance of the BTM Recorder therefore depends on the design of the application, the number of monitoring events that are part of a business transaction definition, and the throughput.
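The burst and catch-up behaviour can be estimated with simple arithmetic. The sketch below assumes constant emit and recorder rates, which is a simplification; the recorder rate of 2000 events per second is taken from the test results later in this article, and the other figures are hypothetical.

```python
def backlog_after_burst(emit_rate, recorder_rate, burst_seconds):
    """Events left on the queue after a burst (rates in events per second)."""
    return max(0.0, (emit_rate - recorder_rate) * burst_seconds)

def seconds_to_drain(backlog, emit_rate, recorder_rate):
    """Time for the recorder to clear a backlog while events keep arriving."""
    spare = recorder_rate - emit_rate
    if spare <= 0:
        return float("inf")  # the recorder never catches up
    return backlog / spare

# Hypothetical burst: 2500 events/s for 10 minutes against a 2000 events/s recorder.
backlog = backlog_after_burst(emit_rate=2500, recorder_rate=2000, burst_seconds=600)
# Quiet period afterwards at 1500 events/s.
drain = seconds_to_drain(backlog, emit_rate=1500, recorder_rate=2000)
print(backlog, drain)  # prints 300000.0 600.0
```

In this example the quiet period has to last as long as the burst, because the spare capacity during the quiet period equals the shortfall during the burst.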
As each system on which the BTM solution is deployed is likely to perform differently, it is essential to monitor the queue during the performance test phase, because a steadily growing queue depth indicates that the BTM Recorder has reached its maximum throughput.
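A minimal sketch of how sampled queue depths might be checked for sustained growth is shown below. The rule used here (a fixed number of consecutive rises) and its threshold are assumptions chosen for illustration; a real test may need a different rule to separate a brief spike from a recorder that has reached its limit.

```python
def is_backlog_growing(depths, min_rises=3):
    """Return True if the depth rose in at least min_rises consecutive samples."""
    rises = 0
    for previous, current in zip(depths, depths[1:]):
        rises = rises + 1 if current > previous else 0
        if rises >= min_rises:
            return True
    return False

# Hypothetical samples taken at a fixed interval during a test run.
print(is_backlog_growing([0, 5, 0, 3, 0]))   # False: spikes that drain again
print(is_backlog_growing([0, 10, 40, 90]))   # True: depth keeps climbing
```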
Performance Results
Configuration
To demonstrate the guidance above, the following 3 tests were run:
- Message flow emitting 2 monitoring events (Start and End).
- The same message flow as in the first test, with an XPath filter added to the events.
- The same message flow as in the first test, with the message payload bitstream included as binary.
In all 3 cases the message flow consisted of an MQ Input node connected to an MQ Output node (MQInput -> MQOutput).
Monitoring events were added to both nodes. For the MQ Input node the monitoring event source was ‘Transaction Start’, and for the MQ Output node it was the ‘In terminal’.
The filter test included the following XPath filter expression for both events:
The event payload bitstream test included the following options:
The business transaction definition of the events (BTD) was:
Hardware/Software
All tests were conducted on the following hardware and software:
The hardware consisted of:
- IBM xSeries x3850 X6 with 1 x Intel(R) Xeon(R) CPU E7-4820 v2 @ 2.00GHz processor with HyperThreading turned off
- ServeRAID M5210 SAS/SATA Controller with 4GB Flash/RAID 5 Upgrade option (47C8668)
- 136GB 15K 6.0Gbps SFF Serial SCSI / SAS Hard Drive – ST9146853SS x2 (mounted directly)
- IBM 120GB 2.5in G3HS SATA MLC Enterprise Value SSD – 00AJ395 – x2 (Configured in RAID0)
- IBM 200GB SAS 2.5in MLC SS Enterprise SSD – 49Y6144 – x2 (Configured in RAID0)
- 32 GB RAM
- Emulex Dual Port 10GbE SFP+ VFA IIIr
The software consisted of:
- Red Hat Enterprise Linux Server release 7.2
- WebSphere MQ V7.5.0.5
- IBM Integration Bus V10.0.0.3
- DB2 v10.5.0.7
Results
The results show the throughput as transactions per second (TPS) of the main flow (MQInput -> MQOutput), and the amount of CPU utilisation of the DataFlowEngine process (Integration Server).
Each transaction through the flow emits 2 monitoring event messages.
CPU utilisation is approximately the same across all 3 tests, at 170-180% of 1 CPU. This figure covers only the DataFlowEngine process, which includes both the main flow and the BTM Recorder. IBM MQ and DB2 CPU utilisation is not included.
What can be observed from the above graph is the effect on throughput by including either a filter or an event payload bitstream.
Whilst running these tests, the depth of the SYSTEM.BROKER.DC.RECORD queue was continuously monitored.
It was found that at approximately 1000 TPS on the main flow (2000 monitoring event messages per second) the queue depth would start to increase as the BTM Recorder was at its limit.
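The relationship between flow throughput and recorder load in this result can be written as a small calculation. The sketch below uses the figure of about 2000 events per second observed in this test; that limit is specific to this hardware and event configuration, so treat it as an input to be measured on each system rather than a constant.

```python
def max_sustainable_tps(recorder_events_per_second, events_per_transaction):
    """Flow throughput at which the recorder is exactly saturated."""
    if events_per_transaction <= 0:
        raise ValueError("events_per_transaction must be positive")
    return recorder_events_per_second / events_per_transaction

print(max_sustainable_tps(2000, 2))  # prints 1000.0, matching the test above
print(max_sustainable_tps(2000, 4))  # prints 500.0 for a hypothetical 4-event flow
```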
Summary
This article has reviewed the performance characteristics of the BTM function available in IIB 10.0.0.3. It has given guidance on how best to configure monitoring events for performance, and shown the difference in results.