App Connect

Are you hitting flow memory limits in your App Connect flows? Here are the whys

By Chengxuan Xing posted Thu September 03, 2020 03:39 PM

  

App Connect on IBM Cloud uses instance IDs to logically separate the data and App Connect flows of different App Connect instances. This is known as soft isolation in a multi-tenant architecture. Compared to hard isolation, which assigns a physically isolated resource to each tenant, this approach is more cost-efficient to run and maintain because only a single service cluster is needed to serve all the tenants.

However, because the computation resources within a soft-isolated multi-tenant system are shared between all tenants, it is a challenge to ensure that the service quality of each tenant is not impacted by the workload of other tenants. When it comes to processing an App Connect flow, one of the methods used to achieve this goal is to limit the size of the flow context of each flow.

What is the flow context?

In a short sentence: a flow context is a JavaScript object that stores basic flow information and the output data of each node in a flow.

In an App Connect service cluster, flows are processed by different types of flow engines (e.g. API flow engine and event-driven flow engine).

A node in an App Connect flow is executed as a task. A task is the smallest unit of a flow process and is implemented using node.js events. Based on the capacity of a flow engine, a certain number of tasks will be handled in parallel at the same time by fetching and queueing them from and onto the node.js event loop. Tasks from different flows, or even different instances, can be interleaved with each other on the event loop of the same flow engine. The task processors are stateless and rely on the flow context, a stateful object that stores and passes flow-specific data between the tasks of a flow process.

The size of the flow context is checked whenever its content changes. An error is thrown if the size of a flow context object exceeds the allocated memory limit of a given flow process. As a result, the flow process is stopped before it can overload the flow engine, so it won't affect other flow processes running inside the same flow engine instance. The memory limits for each type of flow are documented in our FAQ: https://developer.ibm.com/integration/docs/app-connect/faq-2/#faq_operational-limits.
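The check described above can be sketched as follows. This is an illustrative model only, not the actual App Connect implementation: the function names, the serialisation-based size measure, and the explicit `limit` parameter are all assumptions made for the example.

```javascript
// Illustrative sketch of the limit check; names are assumptions,
// not the actual App Connect implementation.
const FLOW_MEMORY_LIMIT = 100 * 1024 * 1024; // 100 MB for a normal flow

// Simplified measure: serialise the whole context and count the bytes.
function measureContextSize(context) {
  return Buffer.byteLength(JSON.stringify(context), 'utf8');
}

function writeNodeOutput(flowContext, nodeName, output, limit = FLOW_MEMORY_LIMIT) {
  flowContext.outputs[nodeName] = output;
  const size = measureContextSize(flowContext); // re-checked on every change
  if (size > limit) {
    // stop the flow process before it can overload the shared flow engine
    throw new Error(`Flow context of ${size} bytes exceeds the ${limit} byte limit`);
  }
  return size;
}
```

The key point is that the check runs on every content change, so a flow fails fast at the node that pushed it over the limit rather than degrading the whole engine.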

How does a flow context work?

To build a flow that efficiently uses the allocated memory limit, we need to understand the basics of how flow context is measured and used:

How is the size of a flow context measured?

A flow context is a JavaScript object. The Buffer.byteLength function is used to calculate the size of string values; non-string values are converted into strings before measuring. Object references are tracked during the calculation to avoid double counting.
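A minimal re-implementation of that measuring approach might look like the following. The real App Connect code is not public, so the recursion strategy and the WeakSet-based reference tracking are assumptions; only the use of Buffer.byteLength and the string conversion are taken from the description above.

```javascript
// Illustrative size measure: strings are counted with Buffer.byteLength,
// other primitives are stringified first, and a WeakSet prevents the same
// object from being counted twice when it is referenced in multiple places.
function measureSize(value, seen = new WeakSet()) {
  if (typeof value === 'string') {
    return Buffer.byteLength(value, 'utf8');
  }
  if (value === null || typeof value !== 'object') {
    // non-string primitives are converted to strings before measuring
    return Buffer.byteLength(String(value), 'utf8');
  }
  if (seen.has(value)) return 0; // same object referenced twice: count once
  seen.add(value);
  let total = 0;
  for (const [key, child] of Object.entries(value)) {
    total += Buffer.byteLength(key, 'utf8') + measureSize(child, seen);
  }
  return total;
}
```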

What can cause the content of a flow context to change?

The size of the basic flow information (flow id, flow settings, etc.) can almost be ignored compared to the flow memory limit. The output data of the nodes in a flow is the main contributor to the overall size of a flow context. We can group App Connect nodes into the following categories based on how they write data into a flow context:

Simple nodes

These nodes usually perform a simple task. Using application nodes as an example: they are used to perform CRUD operations against connected SaaS systems (e.g. IBM Cloudant, IBM COS, etc.). Apart from the retrieve (READ) operation of an application node, these tasks usually return a minimal amount of data. You should avoid deep copying (explained later) the output data of a retrieve node into the flow context.

Nodes with branches

Nodes in this category are the If node and the For each node. Both have branches in which the output data of the nodes is scoped. Once a branch has been processed, the output data generated by the nodes in that branch is removed from the flow context and is not accessible from the downstream nodes of the main flow. If you want to keep data from a branch in the flow context, you need to define an output schema and map the data from the nodes in the branch into the output object. The branch-scoped output data contributes to the size of the flow context until the branch has been processed, at which point the scoped output data is removed. The flow memory limit may become lower for a branch process when a For each node runs in parallel mode; this is explained in detail in a later section.
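The branch-scoping lifecycle can be sketched like this. It is a toy model, not App Connect internals: the node and mapping shapes are invented for illustration.

```javascript
// Illustrative sketch of branch scoping: the branch's node outputs live in a
// scope that is discarded when the branch finishes; only the explicitly
// mapped output object survives for downstream nodes.
function runBranch(branchNodes, mapOutput) {
  const branchScope = {}; // contributes to the flow context size while the branch runs
  for (const node of branchNodes) {
    branchScope[node.name] = node.run(branchScope);
  }
  // apply the output schema mapping, if one was defined
  const output = mapOutput ? mapOutput(branchScope) : undefined;
  return output; // branchScope is discarded here; downstream sees only `output`
}
```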

Nodes with customised output object

Apart from the If node and For each node, which need an output object to collect data from their branches, there are also some helper nodes that allow you to generate an output object after data transformation. Examples are the Set Variable node and the CSV/JSON/XML parser nodes.

It’s important to know that if you map a field from the output data of an upstream node as the input value of one of these nodes, the value of the field will be deep copied into the output object. Therefore, the same value will be duplicated in the flow context. This is not a problem for simple fields (strings, numbers, booleans, etc.), but it can become a problem for large objects (e.g. the binary content of a file retrieved from a file system).
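The duplication effect can be demonstrated with a small sketch. The function names and context shape here are invented for illustration; only the deep copy behaviour itself is taken from the description above.

```javascript
// Sketch of the deep copy behaviour: mapping an upstream field into a
// Set Variable style output stores a second, independent copy of the
// value in the flow context (names are illustrative).
function deepCopy(value) {
  return JSON.parse(JSON.stringify(value));
}

function applyOutputMapping(flowContext, nodeName, mapping) {
  // the mapped upstream value is deep copied into the new output object,
  // so large values are duplicated rather than referenced
  flowContext.outputs[nodeName] = { value: deepCopy(mapping(flowContext)) };
}
```

After the mapping runs, the flow context holds the large value twice, so its measured size roughly doubles for that field.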

Claim check support for large file data transfer

The claim check pattern is widely used to avoid sending large messages through a message bus. The idea is that the sender of a large message provides a claim check that the receivers can use to retrieve the large message directly. Therefore, only the claim check needs to be passed over the message bus, not the large message itself.

To use this pattern, both the sender and the receiver of a message must support the claim check mechanism. In App Connect, this pattern is supported by many of the connectors (connectors are the task executors of application nodes). If a claim check field is directly mapped into a field of a node that supports this pattern, the flow engine passes the claim check to the connector, and the connector downloads the data from the sender directly. Therefore, only the size of the claim check is counted in the flow context. On the other hand, if a claim check field is mapped into a field of a node that doesn't support claim checks, or transformation mappings are applied to the claim check field, the flow engine redeems the claim check, which retrieves the large message from the sender and replaces the claim check with the redeemed content in that referenced field. This often results in a dramatic increase in the size of the flow context and can cause the flow memory limit to be exceeded.
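The decision the flow engine makes can be sketched as follows. The real flow engine and connector interfaces are internal, so the function name, parameters, and claim check shape here are all assumptions.

```javascript
// Hedged sketch of the claim check decision described above.
function resolveClaimCheckField(claimCheck, targetSupportsClaimCheck, redeem) {
  if (targetSupportsClaimCheck) {
    // pass only the small claim check along; the connector downloads the
    // large message from the sender directly, so the flow context stays small
    return claimCheck;
  }
  // the target cannot redeem the claim check, so the flow engine must redeem
  // it here, pulling the full content into the flow context
  return redeem(claimCheck);
}
```

This is why an apparently harmless transformation on a claim check field can suddenly exceed the flow memory limit: it forces the redemption path.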

What can cause the memory limit of a flow to change?

Batch process node

A batch process node is used to process a large volume (millions) of records over a long period of time (30 days). A batch process has three phases: extract records, process each extracted record using a batch process flow, and execute a post-action when all extracted records have been processed. The second phase can be treated as an asynchronous branch, compared to the synchronous branch of a For each node: a batch process flow is triggered for each record, and it has a different memory limit than a normal flow process. This limit is lower because the number of batch process flows is significantly higher than the number of normal flows. At the time of writing, normal flows have a 100 MB memory limit while batch process flows have a 15 MB memory limit.

Memory limit split in the for-each node

Apart from the lower memory limit in batch flow processes, there is another situation in which the size limit of a flow context is reduced.

When you choose the option “Process items in parallel in any order (optimized for best performance)” on a For each node, multiple sub-flow processes are spawned in parallel to increase processing speed. To make sure the total memory usage of the spawned sub-flow processes does not exceed the overall flow memory limit, each sub-flow process is assigned a memory limit calculated by dividing the overall flow memory limit by the number of spawned sub-flow processes. The number of concurrent sub-flow processes is capped at 50.

For example, if the overall flow memory limit is 100 MB and there are 25 sub-flow processes running in parallel, each sub-flow process is allocated a 4 MB memory limit.
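The calculation above can be expressed in a few lines. This is a sketch of the arithmetic described in this section, not App Connect code; the function name is invented.

```javascript
// Sketch of the split described above: the overall limit is divided evenly
// among the parallel sub-flow processes, which are capped at 50.
const MAX_PARALLEL_SUBFLOWS = 50;

function subFlowMemoryLimit(overallLimitBytes, requestedSubFlows) {
  const subFlows = Math.min(requestedSubFlows, MAX_PARALLEL_SUBFLOWS);
  return Math.floor(overallLimitBytes / subFlows);
}
```

Note the practical consequence: the more items you process in parallel, the less memory each iteration's branch can use, down to 2 MB per sub-flow at the cap of 50.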

Summary

To sum up, flow context is used by flow engines to ensure fairness in a soft isolated multi-tenant system. By understanding how it works, you can organise your App Connect flow better to use the allocated memory limit more efficiently.

If you have any questions or suggestions, please share them in the comments!


