
12-factor integration

Thu July 09, 2020 04:25 PM

12-factor apps have become a key yardstick by which components are measured to establish whether they are truly ready for cloud-native deployment. It therefore seems reasonable that if we pursue lightweight integration, we should apply these principles there too, to create a 12-factor integration. Before we start, many thanks to Rob Nicholson, Ben Thompson, and Carsten Bornert for their significant contributions to this post.

The 12 factors ensure an application can take advantage of cloud infrastructure to deliver agility and scalability benefits. Although 12factor.net does not refer specifically to microservice architecture (MSA), it is typically used to define the appropriate characteristics for applications built as microservices. Applications built using the microservice approach typically need to integrate with data sources outside of the microservice architecture: for example, systems of record built decades ago, or services provided by a partner. One approach would be to write these integrations in code. In this article we make a bold assertion: within a microservice architecture, it is perfectly reasonable to implement integration logic using an integration engine. We are still writing ‘code’, but the code we are writing is expressed in the domain-specific language of the integration engine, which has been designed specifically to perform integration tasks in an efficient and clear way. Furthermore, we assert that this is a sensible approach, given that integration engines have been specialized over decades to do exactly this job.

However, if we are to use an integration engine inside our microservice architecture, it had better function as a compliant 12-factor app; otherwise we would be giving up flexibility compared to the ‘just code’ approach. In this post we expand on our previous discussion of using IIB for lightweight integration and show how integrations built on IBM Integration Bus (IIB) can comply with the 12-factor approach. Indeed, they can be used to implement 12-factor integration.

Let’s take each of the factors in turn and see how best to achieve them using IIB’s features:
  
1. Codebase – One codebase per app, tracked in revision control, many deploys. 

In the case of IIB, the “codebase” refers to all the artifacts that together make up the definition of a given integration – essentially all the files that end up in the Broker Archive (BAR) file when you build the integration. This consists mostly of XML-based files describing the integration flow itself, plus any related descriptors such as OpenAPI (Swagger) documents describing REST interfaces, DFDL files defining data formats, and so on. These essentially make up the domain-specific language for integration. There may also be language-based files if, for example, you are using Java. A set of these files is gathered together into an “application”, which then provides the complete “codebase” for a self-contained integration that can be built and deployed independently to an IIB runtime. The tooling integrates well with common source control software, so the codebase can be tracked in version control and incorporated into build pipelines for continuous integration.
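As a minimal sketch, an application under version control might contain files along these lines (all project and file names here are purely illustrative):

    customer-api/
        application.descriptor      (defines the application and its contents)
        CustomerAPI.msgflow         (the integration flow itself, XML based)
        customerapi.json            (OpenAPI/Swagger document for the REST interface)
        CustomerRecord.xsd          (DFDL/XML schema defining the data format)

Everything that ends up in the BAR file lives in this one repository, giving one codebase per integration.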

2. Dependencies – Explicitly declare and isolate dependencies

The primary dependency of the codebase is of course the IIB runtime itself. An IIB installation these days is nothing more than a set of binaries laid out in a directory structure. This suits layered filesystem-based container runtimes such as Docker well, as an image containing the relevant version of the binaries can be created and used as the basis of builds and as the target for the deployment.
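As a sketch of that approach, a base image might be created along the following lines. The paths, versions, and image names are illustrative; IBM publishes worked examples in the ot4i/iib-docker repository on GitHub.

    # Build a reusable base image containing just the IIB binaries
    cat > Dockerfile <<'EOF'
    FROM ubuntu:16.04
    # An IIB install is just files on disk: unpack the binaries and accept the license
    COPY iib-10.0.0.x-linuxx64.tar.gz /tmp/
    RUN mkdir /opt/ibm \
        && tar -xzf /tmp/iib-10.0.0.x-linuxx64.tar.gz -C /opt/ibm \
        && /opt/ibm/iib-10.0.0.x/iib make registry global accept license silently
    EOF
    docker build -t iib-base:10.0.0.x .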

Any further dependencies relate to the codebase itself. IIB enables packaging of code into libraries. These libraries can either be contained within the application or handled as a separate shareable dependency. Regardless of the scope at which the libraries are defined, in the build phase they are combined into the executable, ensuring it is a fully self-contained capability. The application/library structure fully defines all dependencies required by the running integration.

3. Configuration – Store configuration in the environment

IIB provides extensive support for externalizing environment-specific configuration from code. This includes both the configuration needed to access resources / backing services (anything from URLs and data source names to access credentials) and user-defined properties (for example, to implement custom log levels). Most configuration can be done by providing a deployment-environment-specific overrides file that is combined with the BAR file using mqsiapplybaroverride when a release is created.

There are other mechanisms for configuration, such as configurable services and policy files. These still provide separation of configuration from the codebase, but ideally we want to use a single technique, and the overrides file provides good coverage and is currently the most appropriate from a 12-factor point of view.
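For example, a per-environment properties file can be applied to the same BAR file at release time. A minimal sketch, where the flow, node, and property names are illustrative (the format is flow#node.property=value):

    cat > dev.properties <<'EOF'
    CustomerAPI#HTTP Input.URLSpecifier=/dev/customer
    CustomerAPI#Backend Request.URLSpecifier=http://backend.dev.internal/api
    EOF
    # Produce an environment-specific BAR from the environment-neutral one
    mqsiapplybaroverride -b customer-api.bar -p dev.properties -o customer-api-dev.bar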

4. Backing services – Treat backing services as attached resources

In straightforward scenarios it could be said that IIB has no backing services at all. It doesn’t require a database, or even file storage, and in most cases it no longer even has the dependency on MQ that was present prior to v10.

If we look at the concept of “backing services” more broadly, you might consider that anything IIB connects to is effectively a resource. So if IIB is, for example, connecting SAP to Salesforce, then SAP and Salesforce are certainly “resources” required for the integration to run effectively. This is one area where a component that primarily performs integration (whether written in code or in an integration engine) differs somewhat from a self-contained application. An integration component by definition makes its living through its ability to connect to sources of data beyond its own runtime. We are clearly not going to be able to eliminate an integration’s dependency on the systems between which it is integrating.

How dependent an integration is on the systems it connects to has more to do with the design choices made for the individual integration itself: for example, whether we choose a messaging-based protocol rather than a real-time API, or how the protocol handles temporary loss of the network. These choices have little to do with IIB’s ability to run as a 12-factor app, and much more to do with the fundamental challenges of integration solution design. So long as we adhere to good principles on those connectivity points – such as separating environment-specific connection properties from the codebase, and choosing the appropriate protocols and interaction patterns for the service levels required – we will minimize the effect that the connected systems have on our integration.

5. Build, release, run – Strictly separate build, release and run stages

For a repeatable headless build process, IIB’s mqsipackagebar or mqsicreatebar command line tools should be used in a build pipeline defined using tools such as Ant, Git, and Jenkins, enabling continuous integration. This creates a deployment artefact (a BAR file) which can then be used for all environments.
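A minimal sketch of the build step (the workspace path, BAR name, and application name are illustrative):

    # Headless build: package the application into a BAR file without a GUI
    mqsipackagebar -w /var/jenkins/workspace -a customer-api.bar -k CustomerAPI

Note that mqsicreatebar is needed instead when resources must be compiled (Java projects, for example), as it drives a headless Eclipse build.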

As described in the Configuration section, the environment-specific aspects are either picked up from environment variables or provided in an “overrides” file. The final release artefact can either be deployed as part of the start-up script of the container using mqsideploy, or a pre-deployed image can be prepared for faster start-up times.
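As a sketch, a container start-up script for the deploy-at-start option might look like this (node, server, and path names are illustrative, and we assume the release BAR and overrides file have been baked or mounted into the image):

    # Start the runtime, then deploy the release artefact into it
    mqsistart MYNODE
    mqsiapplybaroverride -b /bars/customer-api.bar -p /config/${ENVIRONMENT}.properties -o /tmp/release.bar
    mqsideploy MYNODE -e server1 -a /tmp/release.bar -w 60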

6. Processes – Execute the app as one or more stateless processes

An IIB runtime holds no runtime state. Should an integration server need to be replaced, a brand new runtime can be started in its place and immediately begin taking on work.

In line with our recent post on “lightweight integration”, IIB is a lightweight engine that can be used in a very different way to the large centralized ESB patterns of the past. For 12-factor integration, we would want to deploy our integrations in a more fine-grained way: for instance, using Docker containers and Kubernetes to quickly deploy and orchestrate many lightweight IIB instances. Each container would consist of only one isolated integration server. We could then consider deploying only a small number of integrations to each IIB runtime, perhaps even just one integration per runtime. This plays well into the idea of a single decoupled codebase for each “app”, such that the integration can be scaled independently and changes can be made quickly without affecting other integrations – making this style of deployment well suited to the cloud-native deployments that 12-factor apps are designed for.

It should be noted that this is exactly the approach taken by our fully managed IIB on Cloud service. Behind the scenes we use containers, each running a single BAR file, enabling us to provide features such as true elastic scaling, and simplifying things like the introduction of new product versions through rolling upgrades without downtime.
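As a sketch, running one integration per container is then a single command (the image and container names are hypothetical):

    # One container = one integration server = one deployed integration
    docker run -d --name customer-api -p 7800:7800 iib-customer-api:1.0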

7. Port binding – Export services via port binding

There are multiple ways to expose IIB over HTTP, with listeners at several different levels. In our 12-factor integration style, to keep the container lightweight, we would have only a single integration server per container. In this situation the most appropriate mechanism is the embedded HTTP listener within the integration server itself. The embedded listener reduces the potential points of failure and simplifies the relationship between whether a runtime is up and whether it is listening. HTTP load balancing across the servers is delegated to the cloud platform, so that it has complete control over introducing or removing servers to adjust for load.
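The embedded listener is enabled per integration server. As a sketch (node and server names are illustrative), HTTP input nodes can be directed to it as follows; the embedded listener defaults to port 7800, which the container then exposes:

    # Route HTTP nodes through the integration server's embedded listener
    mqsichangeproperties MYNODE -e server1 -o ExecutionGroup \
        -n httpNodesUseEmbeddedListener -v true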

8. Concurrency – Scale out via the process model

Due to the stateless nature of IIB, it naturally scales out horizontally, making it straightforward to take advantage of the elastic scaling provided by cloud infrastructures.

Although multiple versions of IIB can be run and managed concurrently on the same machine by the IIB master processes, in a more 12-factor style, we would instead run multiple Docker containers and delegate to an orchestration framework such as Kubernetes to perform the scaling.
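With Kubernetes in charge, scaling becomes a one-line operation against the platform rather than against IIB itself (the deployment name is hypothetical):

    # Ask the orchestrator for five identical, stateless IIB replicas
    kubectl scale deployment customer-api --replicas=5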

9. Disposability – Maximize robustness with fast start-up and graceful shutdown

The aim here is to ensure that a collection of runtimes can be treated as “cattle, not pets”. Rather than keeping a runtime alive and nurturing it with runtime changes, you simply shut it down and start a new one containing the required changes. An IIB runtime containing only a small number of moderately complex flows starts up in seconds. Where extremely fast start times are required, a Docker image can be prepared that has the BAR file already deployed. As such, a new version of an integration can be pre-imaged, ready for rapid canary tests or a full rolling update. IIB runtimes can simply be stopped via the command line or a script, and these commands can be run from a remote machine. The shutdown process is designed to ensure any running integrations complete gracefully.
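A sketch of both halves of this factor (all names are illustrative; the build-time deploy assumes the mqsiprofile environment has been sourced in the image):

    # At image build time: deploy the BAR so containers start 'pre-deployed'
    mqsistart MYNODE && mqsideploy MYNODE -e server1 -a /bars/customer-api.bar -w 120 && mqsistop MYNODE

    # At run time: a graceful shutdown is just a stop command
    docker stop customer-api   # SIGTERM lets in-flight work complete before exit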

10. Dev/prod parity – Keep development, staging, and production as similar as possible

Early environments such as development attract no license fee, and there is no difference between the development and production binaries. The product install is nothing more than binaries laid out on the file system, so it is very easy to replicate across environments, or to pre-build a container image to ensure they are identical. Tools such as Chef or Puppet are also often used for consistent environment provisioning.

The deployable artefact is the BAR file, and it is used across all environments without change, other than the overriding of environment specifics discussed under “Configuration”. The release/deploy process should also be scripted, and can and should be identical in all environments.
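As a sketch, promotion then becomes a loop over environments in which only the overrides file varies (file names are illustrative):

    # One build artefact, many releases: only the overrides differ per environment
    for env in dev staging prod; do
        mqsiapplybaroverride -b customer-api.bar -p ${env}.properties -o customer-api-${env}.bar
    done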

11. Logs – Treat logs as event streams

IIB outputs logs to stdout by default, so that they can be aggregated across runtimes to provide a consolidated view. As an out-of-the-box experience, IIB can be configured to send logs to a centralized location on Bluemix, where they are collated and shown in a set of pre-built Kibana dashboards; this relates to IBM’s broader strategy of providing product insights across all products. Where custom log indexing and analysis is in place, standard patterns for log collation can be used, for example by redirecting the stream to a private ELK stack using Filebeat. This in itself satisfies the logging needs of 12-factor.
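As a sketch of the private ELK option, a Filebeat configuration might pick up the container’s stdout from the Docker log files and forward it for indexing (the hosts and paths are illustrative, and the exact syntax varies by Filebeat version):

    cat > filebeat.yml <<'EOF'
    filebeat.prospectors:
    - input_type: log
      paths:
        - /var/lib/docker/containers/*/*.log
      json.message_key: log
    output.elasticsearch:
      hosts: ["elk.internal.example:9200"]
    EOF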

There are many other forms of logging information available from IIB: Accounting & Statistics logs that provide specific information on mediation flows as they run, and Resource Statistics that provide a continuous view of how the underlying resources (CPU, memory) are being used, for performance monitoring and capacity planning. Event Monitoring provides logs based on the business data passing through IIB. Of course, flows can perform their own explicit logging by pushing data to files, or to messaging systems such as MQ or Kafka. For testing purposes, it is also worth noting that there is a Record & Replay mechanism, such that events passing through the flows can be captured in a way that makes it easy to store them for future regression testing.

It is also possible to integrate with other logging technologies: for example, log4j can be used from a Java Compute node or via a plug-in node SupportPac, and logs can be visualized in tools such as Grafana. For Splunk we have provided a starting point on GitHub, and there are plenty of other examples available in the wider community.

12. Admin processes – Run admin/management tasks as one-off processes

The principal objective here is to ensure one-off admin processes are run in the same environment as the application, against the same codebase, to ensure consistent access to the underlying models. IIB comes with a suite of administrative commands for managing the live environment, and these are a versioned part of the product binaries for any given release. This enables consistent administrative access to the runtime.
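For example, a one-off administrative task can be run inside the same container as the runtime, using exactly the same versioned binaries (the container, node, and server names are hypothetical, and we assume the image sources the mqsiprofile environment):

    # List what is deployed to the running integration server, one-off, in situ
    docker exec customer-api mqsilist MYNODE -e server1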

Much of the description of this factor on 12factor.net, especially with regard to a REPL shell, centers on admin processes that work with the application’s persistent storage. IIB is a state-free engine handling only transitory data as it passes between systems. As such it does not have a persistent data store of its own, so aspects such as how to fire off a data migration, or how to inspect the application’s runtime data model against that on disk, don’t really apply to IIB.

Conclusion

So there it is: IIB can be run in the style of a lightweight runtime to implement 12-factor integration. We’re seeing a lot of focus on this approach, and for IIB on Cloud we are using it ourselves. For details on our current thinking in this space and our future plans, we encourage interested parties to sign up to the IIBvNext Early Access Program, and of course to watch out for further posts and articles on this topic.

5 comments on "12-factor integration"

  1. Roberto de Gasperi July 09, 2018

    Excellent article, with regards to point 2…

There is a lot to think about in terms of the granularity of Applications / Integration Servers going forwards. If anything, there is an abundance of methods available for “grouping” resources together (deployed standalone, grouped within an Application / Service project, grouped within an Integration Server, grouped under a Node / nodeless deployment).

    Of course it would be great if such decisions were based on philosophical choices alone (MSA granularity / ideal levels of sharing and isolation, change control practices etc).

But, given the scoping issues that occur with regard to IIB Applications (alone) and their Static Library counterparts when working with message definition files, and the fact that scoping is not changing for ACEv11, what do you see as the strategic vision for how to structure / package up user code in a Microservices Architecture – are there any choices, or will everyone be forced to work with user code packaged within Application / Service projects and using Shared Libraries ALWAYS in order to avoid scoping issues and contention with IIB / ACEv11?

I ask this because I see a contradiction in the movement towards “sharing-less (or sharing nothing)” and the inevitable forced use of IIB / ACE Shared Libraries, which may mean more granular microservices components in order to mitigate the impact of that sharing.

    I’d rather have a more fully flexible product that enables users to make the correct choices according to their circumstances.

    • kim.clark@uk.ibm.com July 25, 2018

Roberto, thanks for your insightful comment. Applications and libraries (shared and static) remain a strategic part of the way we see users packaging integrations going forward. These provide suitable packaging to enable suitably decoupled, fine-grained integration deployment. Generally speaking, we observe that the increasing popularity of containers means there is a gradual trend for users to opt for fewer artifacts per integration server, and then to use separate servers to isolate workloads, as opposed to using very large server “pets” with multiple artifacts (all isolated from one another within the server). Having said this, long term we have no desire to “force” users to take this approach, nor to force users to adopt shared libraries. So, if users would like to use static libraries packaged into an application for isolation purposes, then we absolutely aim to continue that support. We certainly hear your concerns regarding message definition files, and we’re aware of the specific discussions we’ve had with you on this topic under NDA as part of our Beta program; we’ll continue to work with you on improvements in this area. Thanks again for taking the time to comment and for your deep analysis.

      • Roberto de Gasperi August 07, 2018

        Thanks Kim, always good to hear your thoughts…

        I suspect that granularity will continue to be a key discussion point, one that may never be fully thrashed out.

More recently I have seen calls for “less granular” integration services… one of the prime movements towards REST over the last few years has been the desire for more granular, “chatty” interfaces – at first, this seemed to fit in nicely as microservices popularity grew, but coupled with the portability of microservices (to a cloud environment), the challenge is sometimes reversed: to construct less granular interfaces in order to reduce chattiness (due to latency etc. between cloud and non-cloud environments, and multi-cloud environments).

        It’s a wonderful contradiction in many ways… micro(component) developers occasionally desire “macro” interfaces (for their own consumption) in order to compensate for the environment that they find themselves deployed within.

        The key thing as an integration community is having the flexibility to utilise product features as needed at the time, free from unnecessary restrictions and workarounds.

        Many thanks once again!

  2. hirschel April 24, 2018

re point 3: I disagree with the use of a “deployment environment specific configuration overrides file with mqsiapplybaroverride”. I much prefer configurable services for environment-specific information. The reason is visibility. With a configurable service, one can query the settings and know immediately the assigned value. If it’s done with a BAR override, the values are not visible through tools like IBExplorer or mqsireportproperties. With the override, one has to look externally at the override file and the (Jenkins) deploy job to verify exactly what was overridden.

    • kim.clark@uk.ibm.com June 06, 2018

A very perceptive comment. It’s certainly true that some properties do not report their new values via mqsireportproperties until after a restart. We do have plans to improve the visibility of property values in future releases. However, if we take the 12-factor concept to its logical extreme, and deploy flows as containers based on immutable Docker images as in our “cattle not pets with IIB” article, we would expect there to be less need to query property values at runtime, as they would be fixed at design time in the Docker image. A change to a property value would imply a new image. Ultimately the source code repository could be trusted as the source of truth for the property values. Clearly this ignores properties that change by environment – these would be stored in a platform-specific mechanism such as Kubernetes secrets. However, we recognize this is a very different way of working and many customers may not move to this any time soon. As such we will also be improving the way that values are both set and reported at runtime. A key move in this direction in ACEv11.0.0.0 was to unify IIBv10 configurable services and IIBv10 policies as just ACEv11 policies. We would encourage customers to get involved in our Beta programme if they would like more visibility into future plans.


#AgileIntegrationArchitecture
#Integration