
High level thoughts about BAW on container from a performance point of view (BAW Performance and Scalability Series)

By Torsten Wilms posted Thu January 07, 2021 09:41 AM

This article offers high-level performance considerations for Business Automation Workflow (BAW) on container, with a focus on the BPMN process runtime.
It is not a comprehensive tuning guide. Existing monitoring and tuning articles are linked in this article, and more will follow over time.

1) The BAW on Container runtime architecture is now based on Kubernetes and WebSphere Liberty
The simplified figure below shows the BAW on container architecture based on WebSphere Liberty servers:

a) Threadpools
In the Liberty-based BAW implementation, all threads run in a single threadpool whose size is unlimited by default. This differs from BAW on traditional WebSphere Application Server (tWAS).
As a result, incoming work is not throttled by a threadpool size limit.
In our performance tests, we relied on the self-tuning capabilities of WebSphere Liberty's default executor and did not need to change the defaults. For more details, refer to https://www.ibm.com/support/knowledgecenter/SSEQTP_liberty/com.ibm.websphere.wlp.doc/ae/twlp_tun.html

b) Connection Pools
In contrast to the threadpool, the database connection pool has a size limit.
The BAW database connection pool defaults to 200 connections. With a well-tuned database, this is enough to serve around 100 BAW threads per pod without wait time on the connection pool. The connection pool size of the jdbc/TeamWorksDB datasource can be changed by editing the BAW configmaps.
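
The sizing figures above can be turned into a rough back-of-the-envelope estimate. The following is a minimal sketch, not an official sizing formula: it assumes that the ratio implied by the defaults (200 connections for about 100 threads, so roughly 2 connections per thread) holds for other thread counts, and that every pod opens its own pool against the same database. The function names are hypothetical and exist only for this illustration.

```python
import math

# Figures taken from the text above: 200 connections serve about 100 threads per pod.
DEFAULT_POOL_SIZE = 200
THREADS_SERVED_BY_DEFAULT = 100

# Assumption: the ratio stays roughly constant for other thread counts.
CONNECTIONS_PER_THREAD = DEFAULT_POOL_SIZE / THREADS_SERVED_BY_DEFAULT


def pool_size_for_threads(busy_threads: int) -> int:
    """Estimate the per-pod pool size for a given number of busy BAW threads."""
    if busy_threads < 0:
        raise ValueError("busy_threads must be non-negative")
    return math.ceil(busy_threads * CONNECTIONS_PER_THREAD)


def total_db_connections(pod_replicas: int,
                         pool_size_per_pod: int = DEFAULT_POOL_SIZE) -> int:
    """Upper bound on the connections the database may see from all BAW pods."""
    if pod_replicas < 0 or pool_size_per_pod < 0:
        raise ValueError("arguments must be non-negative")
    return pod_replicas * pool_size_per_pod


if __name__ == "__main__":
    print(pool_size_for_threads(150))  # estimated pool size for 150 busy threads
    print(total_db_connections(4))     # worst-case connections from 4 pods
```

The second function is a reminder that the per-pod pool size multiplies with the number of replicas, so the database has to be configured to accept the combined total.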

c) Java Heap
To monitor and, if necessary, tune the Java heap, refer to the article "Spot Garbage Collection Overhead in BAW on container" listed in the references at the end of this article.

2) Task Indexing & Queries with Process Federation Server (PFS) and Elasticsearch
PFS and Elasticsearch are installed and enabled by default in BAW on container. Both are a substantial and required part of task processing.
In contrast to BAW on-premises, task searches and queries are now served by Elasticsearch, which handles complex queries more efficiently. By design, there is a time lag between task creation and task availability in the index: tasks are not instantly queryable or visible to users after they are created. The lag depends on the indexing interval (default 5s), which triggers the index update.
The index is implemented with Elasticsearch. For Elasticsearch tuning, the following article is recommended: https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-search-speed.html
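
Because of the indexing lag described above, a client or test script that creates a task and queries for it right away may get an empty result. One way to cope is to poll until the task shows up. The sketch below is a generic polling helper, not part of any BAW or PFS API: `wait_until_indexed` and its parameters are hypothetical names, and the `query` callable stands in for whatever task search call the client actually uses. The 15 second default timeout is an assumption chosen to cover a few indexing intervals.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until_indexed(
    query: Callable[[], Optional[T]],
    timeout_s: float = 15.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[T]:
    """Call query() until it returns a non-None result or the timeout expires.

    Returns the query result, or None if nothing was found in time.
    """
    deadline = clock() + timeout_s
    while True:
        result = query()
        if result is not None:
            return result
        if clock() >= deadline:
            return None
        sleep(poll_interval_s)


if __name__ == "__main__":
    # Stand-in for a real task search: the task becomes visible on the third poll.
    attempts = {"count": 0}

    def fake_task_search() -> Optional[str]:
        attempts["count"] += 1
        return "task-1" if attempts["count"] >= 3 else None

    print(wait_until_indexed(fake_task_search, poll_interval_s=0.1))
```

The `sleep` and `clock` parameters are injectable so that the helper can be exercised in tests without real waiting.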

3) Scaling
The BAW statefulset can be scaled horizontally and vertically.
Horizontal scaling: Add more pod replicas to the BAW statefulset. In our performance tests, we saw near-linear scaling as pods were added, with a scaling factor of almost 2x. BAW on container also supports the autoscaler technology provided by OpenShift, which enables automatic horizontal scaling.
Vertical scaling: Increase the resource limits per pod. In our performance tests, we saw near-linear scaling as virtual CPUs were added, with a scaling factor of almost 2x.
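
To check how close your own environment comes to linear scaling, you can compare throughput before and after adding pods or virtual CPUs. The following is a minimal sketch under the assumption that the "almost 2x" factor above refers to doubling the resources. The function names are hypothetical, and the throughput numbers in the usage example are invented for illustration. They are not measurements from the tests mentioned above.

```python
def scaling_factor(base_throughput: float, scaled_throughput: float) -> float:
    """Ratio of throughput after scaling to throughput before scaling."""
    if base_throughput <= 0:
        raise ValueError("base_throughput must be positive")
    return scaled_throughput / base_throughput


def scaling_efficiency(base_units: int, scaled_units: int,
                       base_throughput: float, scaled_throughput: float) -> float:
    """Achieved scaling factor divided by the ideal (linear) one.

    Units are pods for horizontal scaling or virtual CPUs for vertical scaling.
    A value of 1.0 means perfectly linear scaling.
    """
    if base_units <= 0 or scaled_units <= 0:
        raise ValueError("unit counts must be positive")
    ideal_factor = scaled_units / base_units
    return scaling_factor(base_throughput, scaled_throughput) / ideal_factor


if __name__ == "__main__":
    # Invented example: going from 2 to 4 pods raises throughput from 1000 to 1900.
    factor = scaling_factor(1000.0, 1900.0)
    efficiency = scaling_efficiency(2, 4, 1000.0, 1900.0)
    print(f"scaling factor: {factor:.2f}, efficiency: {efficiency:.0%}")
```

An efficiency that drops well below 1.0 as you add pods usually points to a shared bottleneck outside the pods themselves, such as the database.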

4) Database Performance
All tuning recommendations for BAW databases are still valid. Refer to the following articles for more details:

References and BAW on container performance articles:
- Spot Garbage Collection Overhead in BAW on container (BAW Performance and Scalability Series) https://community.ibm.com/community/user/automation/blogs/florian-leybold1/2020/12/17/spot-garbage-collection-overhead-in-baw-on-contain
- Using a PostgreSQL database in the context of business automation workflows (Part1: Introduction non-container version) https://community.ibm.com/community/user/automation/blogs/stephan-volz/2020/12/21/postgresql-with-workflows