Back in 2015, with v10 of IBM App Connect Enterprise (at that time known as IBM Integration Bus), the hard dependency on a locally installed IBM MQ server was removed. Since then we have been able to install IBM App Connect Enterprise completely independently of MQ, making the installation and the resultant topology lighter and simpler, and since the v11.0.0.7 fix pack (December 2019) it has been possible to use MQ client connections for virtually all interactions that previously required a local server. With the move toward more granular installation of integrations, especially in containers, knowing exactly when we can exploit this more lightweight installation has become all the more relevant.
This post explores when we can install IBM App Connect Enterprise independently and, conversely, under what circumstances we might still need a locally installed IBM MQ server. Put another way, when do we really need a local IBM MQ server, and when will connecting to remote MQ servers suffice?
There are two very different reasons App Connect Enterprise makes use of IBM MQ:
- As an asynchronous messaging provider
- As a coordinator for global (two-phase commit) transactions
As explained later in this article, it is only the latter use of MQ which is the real driver for a locally installed MQ server, but read on as this use may be less common than you thought.
Benefits of being dependency-free in container-based environments
The question of whether IBM App Connect Enterprise requires a local MQ server becomes more pertinent as we move to container-based deployment and we explore installation of integrations in a more granular way. Rather than having a single installation containing every integration, we can move to deploying small groups of integrations in isolated containers.
Container technologies such as Docker combined with container orchestration facilities such as Kubernetes make it possible to rapidly stand up a discrete set of integrations within a container in a scalable, secure, and highly available configuration. This represents a radically different approach to the implementation of integrations, potentially providing greater agility, scalability and resilience.
This fine-grained deployment is part of what we refer to as agile integration. It is much broader than just containerization, also encompassing changes to the ownership of integrations (decentralization) and aligning with cloud-native principles, but it begins by breaking up the traditional heavily centralized ESB pattern into discrete groups of integrations.
In the more traditional centralized architecture, the infrastructure has to have a local MQ server, even if only a small number of the integrations require it.
In the more modern fine-grained architecture, we can decide whether a local IBM MQ server is required based on the specific needs of the small set of integrations it contains.
There are significant benefits in being able to stand up an integration server that does not need a local IBM MQ server, and these become particularly pronounced in a container-based environment such as Docker:
- The size of the installation is dramatically reduced, and thereby the size of the Docker image. This reduces build times due to the reduced image creation time, and reduces deployment times as a smaller image is transported out to the environments.
- The running container uses significantly less memory as it has no processes associated with the MQ server. Cloud infrastructure used for container-based deployment is often charged based on memory rather than CPU so this can have a significant impact on running cost.
- Start-up of a container is much faster as only one operating system process is started – that of the integration engine. This improves agility by reducing test cycle time, and improves resilience and elastic scalability by being able to introduce new runtimes into a cluster more rapidly.
- MQ holds its message data on persistent volumes, and specific servers need access to specific volumes within the MQ topology. If IBM App Connect Enterprise has a local MQ server, it becomes locked into this topology. This makes it more complex to elastically add new servers to handle demand dynamically. Once again, this makes it harder to take advantage of the cost benefits of elastic cloud infrastructure.
When can we manage without a local MQ server?
Clearly, the simplest case where a local MQ server is not needed is where we have a flow that does not put or get IBM MQ messages. Given there are around 100 nodes that can be used to create flows in IBM App Connect Enterprise, and only perhaps a dozen of those involve MQ, there are plenty of integrations that can be created without MQ at all. We can expose and invoke RESTful APIs and SOAP-based web services, perform database transactions, read and write files, connect to enterprise applications like SAP and Siebel, and much more, without the need for a local MQ server. Indeed, since v11, IBM App Connect Enterprise also provides connectivity to over 100 cloud-based SaaS applications.
What if we are using IBM MQ, or another transactional resource such as a database? We should start by making a clear statement: we do not need a local queue manager for IBM App Connect Enterprise to communicate with IBM MQ queues. IBM MQ provides an MQ client that enables messages to be placed on and retrieved from remote queues using single-phase commit transactions, without the need for a local MQ server.
A single-phase commit is the transactional protocol used when our transaction only spans one transactional resource. For example, we only want to work with messages on a single queue manager, or perform actions on a single database. With a single-phase commit transaction, the actual transaction is essentially happening within the one resource we are talking to.
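The single-phase commit semantics described above can be seen with any single transactional resource. Here is a minimal sketch using Python's standard-library sqlite3 module; the table and values are purely illustrative:

```python
import sqlite3

# One resource (one database) means one transaction log and a simple
# single-phase commit: either every statement below is made durable
# by commit(), or none of them are after rollback().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

try:
    conn.execute("INSERT INTO orders (item) VALUES ('widget')")
    conn.execute("INSERT INTO orders (item) VALUES ('gadget')")
    conn.commit()        # single phase: the one resource alone decides
except sqlite3.Error:
    conn.rollback()      # undo both inserts as one unit of work

count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)  # → 2: both rows were committed together
```

The same shape applies to a single queue manager reached over an MQ client connection: the transaction happens entirely inside the one resource we are talking to.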
Using client-based connections when creating flows in IBM App Connect Enterprise is simply a matter of choosing the client setting in the configuration of the MQInput, MQOutput, MQGet, or MQReply nodes. Alternatively, the use of client-based connections can be applied consistently across multiple nodes in flows, along with other MQ properties, using an MQEndpoint policy, and as of v11.0.0.7 this policy can be used as a default for virtually all nodes, including the aggregation nodes, the timeout nodes, and the Collector, Sequence and Resequence nodes. The SAP Input node has been able to use client connections since v7, though it does not use MQEndpoint policies directly.
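As a sketch of what such a policy looks like, here is an MQEndpoint policy file based on the documented format; the policy name, host, port, and channel are placeholder assumptions, and the exact property names should be checked against the documentation for your ACE version:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<policies>
  <!-- Illustrative MQEndpoint policy: a client connection to a
       remote queue manager, with no local MQ server needed. -->
  <policy policyType="MQEndpoint" policyName="RemoteQmPolicy"
          policyTemplate="MQEndpoint">
    <connection>CLIENT</connection>
    <destinationQueueManagerName>QM1</destinationQueueManagerName>
    <queueManagerHostname>mq.example.com</queueManagerHostname>
    <listenerPortNumber>1414</listenerPortNumber>
    <channelName>ACE.SVRCONN</channelName>
  </policy>
</policies>
```

Referencing a policy like this from the MQ nodes (or as a server-wide default) keeps the connection details out of the individual flows.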
So, if a given flow uses only client connections to MQ, no local MQ server is required, as shown in diagram a).
As can be seen from diagram b) the same is true of any interaction with a single transactional resource, such as a database.
Indeed, we can even interact with multiple transactional resources as shown in c) so long as we don’t need them to be combined into a single unit of work (i.e. if one fails, they all rollback to where we started).
If we do want to combine multiple resources into a single unit of work as shown in diagram d), that’s when we will require a local MQ server, to co-ordinate the required two-phase commit transaction. We’ll talk more about this later.
Can I talk to multiple queues in the same transaction without a local MQ server?
Starting with the simplest case, we can perform multiple MQ updates in the same transaction via a client connection as long as they are all on the same queue manager as in diagram e).
The complete set of interactions with all queues can be committed (or rolled back) together. This is because this interaction is performed using only a single-phase commit since the queue manager is itself only a single resource manager.
To be clear, multiple updates to queues on the same queue manager do not necessarily require a local queue manager, and can be done over a client connection.
If we do have to talk to two separate queue managers there are three options as shown in f), g) and h). Let’s look at each one separately.
In diagram f) we update each queue individually. Because each transaction is a separate single-phase commit, no local MQ server is required.
However, there is of course a small risk, just as there was in diagram c), that something could fail in the middle of the flow such that the first transaction occurs but the second one doesn't. If this risk is a concern and we really need to treat the updates as a single unit of work, then we will need to consider one of the other methods as shown in g) and h).
It is possible, as shown in diagram g), to configure an MQ topology such that IBM App Connect Enterprise can perform actions across queues that reside on multiple different queue managers, still without needing a local MQ server.
The technique here is to have one queue manager directly connected to the Integration Server (we can call this a “concentrator”), and make all of the queues involved in the transaction available on that. This is done by setting up remote queue definitions on the concentrator queue manager for the queues that reside on other queue managers. IBM App Connect Enterprise can then “see” all the queues via the concentrator queue manager, and perform the multi-queue interaction over a single-phase commit which can be done with the client connection. The concentrator queue manager then takes on the task of performing the more complex transactional and persistent behaviour across multiple queue managers, using standard MQ channels to achieve the desired coordinated results.
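The remote queue definitions on the concentrator might look like the following MQSC sketch; the queue manager names, queue names, and transmission queues are all hypothetical, and the channels and transmission queues between the queue managers are assumed to already exist:

```
* On the concentrator queue manager (QMCONC), make queues that really
* live on QM2 and QM3 visible locally as remote queue definitions.
DEFINE QREMOTE(ORDERS.Q)   RNAME(ORDERS.Q)   RQMNAME(QM2) XMITQ(QM2.XMITQ)
DEFINE QREMOTE(SHIPPING.Q) RNAME(SHIPPING.Q) RQMNAME(QM3) XMITQ(QM3.XMITQ)
```

The integration server then puts to ORDERS.Q and SHIPPING.Q on QMCONC in one single-phase commit over its client connection, and QMCONC's channels take care of assured delivery onward to QM2 and QM3.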
Co-ordinating a two-phase commit requires a local MQ server
If you look back at diagram d), you will see that a local MQ server is required because we want IBM App Connect Enterprise to combine changes to two (or more) separate resources, such as the queue and database in d) or between two separate queue managers as in diagram h) below.
Something needs to act as a transaction manager across both resources. IBM MQ is capable of being a transaction manager in a two-phase commit transaction on behalf of IBM App Connect Enterprise, but only if it is locally installed.
The transaction manager then performs what is known as a two-phase commit, where the overall transaction is broken down into a prepare phase – where each of the resources makes a promise to complete the work if asked – and a separate commit phase, where all the resources are requested to complete their individual units of work. If any one of the resources is unable to complete in a reasonable time, then the transaction manager can request a rollback of all the involved resources.
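The prepare/commit protocol just described can be sketched in a few lines of Python. This toy coordinator and its in-memory resources are purely illustrative of the protocol's shape, not of how MQ implements it internally:

```python
class Resource:
    """A toy transactional resource that can promise (prepare) work."""
    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.state = "idle"

    def prepare(self):
        # Phase 1: persist enough to guarantee a later commit succeeds.
        if self.fail_on_prepare:
            return False
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(resources):
    # Phase 1: every resource must promise to complete the work.
    if all(r.prepare() for r in resources):
        for r in resources:
            r.commit()       # Phase 2: all promised, so commit all
        return True
    for r in resources:
        r.rollback()         # any refusal rolls back every resource
    return False


queue, database = Resource("queue"), Resource("database")
print(two_phase_commit([queue, database]))   # → True
print(queue.state, database.state)           # → committed committed
```

A real transaction manager must also log its decisions durably so that it can resolve in-doubt resources after a crash, which is exactly the recovery burden discussed below.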
It is for this two-phase commit transaction coordination that an IBM MQ server must be installed locally to IBM App Connect Enterprise for scenarios d) and h).
What are the alternatives to a two-phase commit transaction?
It would be fair to say that two-phase commit is nowhere near as common as you might expect, given the apparent advantages a truly atomic transaction across two or more resources should have in terms of data consistency. The reality is that:
- It is complex to set up, requiring an additional transaction manager to coordinate the overall transaction
- It requires the resources involved to trust a totally independent transaction manager. They have to trust that it will be efficient in its use of locks against the resource between the prepare and commit phases.
- It introduces considerable complexity to architect for disaster recovery, and even high availability configurations. Two-phase commit requires complete consistency across both the transaction logs of the transaction manager, and all the distributed resources involved. Architecting this consistency even in disaster recovery situations can be very difficult.
It is very rare to see two-phase commit between two separate systems owned (and funded) by fundamentally different parts of an organisation. Where it is found, it is typically within a single solution, where all resources (e.g. database and queues) are owned by the same team or are part of the same product. Even in this situation, designers are often looking for alternatives.
Furthermore, modern RESTful APIs, which are increasingly becoming the predominant way that distributed systems talk to one another, work over the HTTP(S) protocol. They are not transactional on any level, let alone able to take part in two phase commit. So, if in the future we are able to design complex solutions involving multiple systems that are only available over RESTful APIs, we will need alternative approaches to distributed transactions.
For many circumstances, it is preferable therefore to look at alternative designs to two-phase commit. Here are a couple of commonly used designs:
- Re-tries and idempotence. For many scenarios, it is possible to ensure that the operations on the target systems are idempotent. So, if for example the request was to process a payment, then if the same payment were submitted twice, it would still only result in one payment being processed. With idempotent target systems, we may be able to remove the need for two phase commit, by simply ensuring we perform re-tries until success occurs. This provides eventual consistency as opposed to absolute consistency – which may or may not be appropriate for your use case.
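The retry-with-idempotence pattern can be sketched as follows. The payment system here is a toy stand-in made idempotent with a client-supplied key; the names and failure behaviour are invented for illustration:

```python
import time

class PaymentSystem:
    """Toy target system made idempotent via a client-supplied key."""
    def __init__(self, fail_first_n=0):
        self.processed = {}           # idempotency key -> amount
        self.fail_first_n = fail_first_n
        self.calls = 0

    def process(self, key, amount):
        self.calls += 1
        if self.calls <= self.fail_first_n:
            raise ConnectionError("transient network failure")
        # Replaying the same key is harmless: the payment happens once.
        self.processed.setdefault(key, amount)
        return self.processed[key]


def submit_with_retries(system, key, amount, attempts=5, delay=0.0):
    for _ in range(attempts):
        try:
            return system.process(key, amount)
        except ConnectionError:
            time.sleep(delay)         # back off, then retry the SAME key
    raise RuntimeError("gave up after retries")


payments = PaymentSystem(fail_first_n=2)
submit_with_retries(payments, key="order-42", amount=10.0)
print(len(payments.processed))   # → 1: one payment despite two failures
```

Because the caller cannot tell a lost request from a lost reply, it simply retries until it gets an acknowledgement; the idempotency key guarantees the retries converge on exactly one payment, giving eventual rather than absolute consistency.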
- Saga/compensation pattern. The saga pattern was introduced in 1987 as a way to handle distributed transactions across independent systems. It works by bringing together atomic actions, either by chaining them, or by having a central orchestrator. If one of the systems should fail, then a compensating action is performed on all those systems which have already processed the event, in a non-transactional way. This pattern has been implemented in many forms over the intervening years. Business Process Execution Language (BPEL) is an example of an orchestration based saga implementation. Equally, chains of message flows – each one asynchronously leading into the next – could also be set up to perform a choreography style saga implementation. There is a resurgence of interest in the saga pattern in recent years due to the highly distributed nature of microservice applications.
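An orchestrator-style saga can be sketched as below; the step names, the in-memory log, and the simulated payment failure are all hypothetical:

```python
class SagaStep:
    """An atomic action paired with its compensating (undo) action."""
    def __init__(self, name, action, compensate):
        self.name, self.action, self.compensate = name, action, compensate


def run_saga(steps):
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):
                done.compensate()     # non-transactional undo
            return False
    return True


log = []

def decline_payment():
    raise RuntimeError("payment declined")   # simulated downstream failure

steps = [
    SagaStep("reserve stock",
             lambda: log.append("reserved"),
             lambda: log.append("unreserved")),
    SagaStep("take payment",
             decline_payment,
             lambda: log.append("refunded")),
]
print(run_saga(steps))   # → False
print(log)               # → ['reserved', 'unreserved']
```

Unlike two-phase commit, nothing holds locks between steps: each action commits immediately, and consistency is restored after a failure by running the compensations, which is why compensating actions must themselves be safe to apply.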
So there certainly are alternatives, both by design (such as those mentioned above) and via more practical product-specific mechanisms (such as that discussed earlier for handling updates to queues across multiple queue managers). Ultimately, whether we choose an alternative to two-phase commit will depend on how much benefit we believe we stand to gain from the increased simplicity of the IBM App Connect Enterprise topology.
When else do I need a local MQ server?
In addition to the use cases discussed so far, at the time of writing a small number of features in IBM App Connect Enterprise still require a local MQ server: the ConnectDirect nodes, and the content-based filtering PubSub capability integrated with the MQ queue manager. These remain tied architecturally to the local MQ server.
As mentioned above, the previous restrictions on stateful nodes (such as the Collector and aggregation nodes) were lifted at v11.0.0.7, and these nodes now work remotely; the state queues used by the nodes can be configured with different names for each server to avoid conflicts on queues such as SYSTEM.BROKER.EDA.COLLECTIONS. As of v12.0.7, the FTE nodes can also use remote default queue managers.
So, in short, the existing dependency on local MQ for these nodes is much less important than before.
Why do we have so many integrations using server connections?
It is common to find older integration flows that use server connections to a local MQ server even when they are not required. This is largely because, in the past, a local MQ server could be assumed to be present. It also provided some extra comfort, at a time when network outages were more common, to know that a message would at least reach the local queue and the messaging system would eventually take care of delivery. Today, networks are generally more reliable and the need for local MQ servers is much reduced. Indeed, many customers are significantly simplifying their MQ topologies, moving to more centralised options such as MQ Appliances to host all their queue managers in a high availability pair. So, perhaps with some minor re-configuration, many existing integrations could simply use client-based connections instead of server connections, and they would no longer have a dependency on a local queue manager.
As we have seen, many scenarios can be achieved without the need for a local queue manager. The single main exception is where IBM App Connect Enterprise is required to perform a two-phase commit transaction. However, counter that with the fact that for many of the circumstances that might have used two-phase commit in the past (multiple queue managers, combined database and queue updates, etc.) there are viable alternatives. The benefits of being able to independently administer and scale our IBM App Connect Enterprise and IBM MQ topologies may well make those alternative patterns very attractive. Add to that the fact that with a more fine-grained integration deployment approach, we don't have to build a one-size-fits-all topology. We can instead introduce a local MQ server only for the integrations where it is most needed.
Thanks to Kim Clark, Hugh Everett, Trevor Lobban, Claudio Tagliabue, and Ben Thompson for their assistance in the preparation of this post.