
Explore Multi Arch Compute in OpenShift cluster with IBM Power systems

By Mel Bakhshi posted Tue November 28, 2023 08:48 AM

  

In the ever-evolving landscape of computing, the quest for optimal performance and adaptability remains constant. This study examines the performance implications of deploying applications on a Multi Arch Compute OpenShift Container Platform (OCP) cluster, comparing it with a cluster built exclusively on the IBM Power architecture. Our findings show that, with or without Multi Arch Compute, there is no significant impact on performance.

Introduction

Many mission-critical business applications and their data are hosted on the IBM Power platform, where co-locating the application with its data on IBM Power systems can provide application performance advantages. However, other considerations may lead enterprises to run their applications on different hardware architectures such as ARM or x86. The ability to host applications on a cluster made up of a mix of hardware architectures therefore gives more freedom to choose the underlying hardware without requiring multiple clusters. Multi-architecture Kubernetes cluster support is how hybrid clouds can run different components of an application on different hardware platforms. An OCP cluster with Multi Arch Compute is a cluster that supports compute machines of different hardware architectures. With the latest OCP 4.14 release, the control plane (master nodes) of an OCP cluster can be deployed on the x86 platform, while the data plane (worker nodes) can be hosted on Power systems, x86 systems, or both.
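This article does not include deployment manifests, but the sketch below illustrates how a workload can be pinned to one architecture in a Multi Arch Compute cluster through the standard kubernetes.io/arch node label (ppc64le on Power workers, amd64 on x86 workers). It is a minimal sketch only, written with the fabric8 Kubernetes client for Java; the client library, pod name, and image are illustrative assumptions and are not part of this study.

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.PodBuilder;
import io.fabric8.kubernetes.client.utils.Serialization;

public class ArchPinnedPodSketch {
    public static void main(String[] args) {
        // Every node advertises its CPU architecture via the standard
        // kubernetes.io/arch label; a nodeSelector on that label keeps the
        // pod on Power (ppc64le) worker nodes in a mixed-architecture cluster.
        Pod pod = new PodBuilder()
                .withNewMetadata()
                    .withName("arch-pinned-app")                      // placeholder name
                .endMetadata()
                .withNewSpec()
                    .addToNodeSelector("kubernetes.io/arch", "ppc64le")
                    .addNewContainer()
                        .withName("app")
                        .withImage("registry.example.com/app:latest") // placeholder image
                    .endContainer()
                .endSpec()
                .build();

        // Print the equivalent YAML manifest, which could be applied with oc apply -f.
        System.out.println(Serialization.asYaml(pod));
    }
}
```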

Cluster specifications

For a valid performance comparison, it was crucial to ensure minimal differences between the Multi Arch Compute cluster and the cluster built entirely on the IBM Power architecture. In particular, the underlying hardware infrastructure and the testing methodology needed to be the same. The two OCP clusters used for the comparison therefore have similar resource allocations and configurations, and despite the hardware architecture differences, the systems were kept as equivalent as possible. The main difference between the two clusters is that the Multi Arch Compute cluster runs its control plane on the x86 platform and includes x86 worker nodes alongside the Power worker nodes.

PowerVM cluster infrastructure

The remainder of this article refers to the OCP cluster whose control plane and data plane are both based on the Power architecture as the PowerVM cluster. The PowerVM cluster's control plane (master) nodes consisted of three PowerVM LPARs on a Power9 system, while the three data plane (worker) nodes were LPARs on a Power10 system. The following diagram depicts the layout of the Power systems used for this cluster.

PowerVM Cluster infrastructure

The network and storage configuration of the PowerVM cluster was based on a 25G SR-IOV network and local Non-Volatile Memory Express (NVMe) storage. The OCP cluster traffic is carried over a private network, and the local NVMe storage is used to reduce the latency associated with storage I/O. For compute, the default SMT8 level is used to maximize database transaction throughput. The following table shows the resource allocation for the PowerVM-based OCP cluster nodes.

OCP resource allocation table1

Multi Arch Compute cluster infrastructure

The Multi Arch Compute-based OCP cluster infrastructure consisted of three VM guests serving as the control plane (master) nodes, hosted on a single Intel(R) Xeon(R) Platinum 8168 system; three data plane (worker) nodes from a Power10 system; and three data plane (worker) nodes from a single Intel(R) Xeon(R) Platinum 8468V system. The following diagram shows the layout of the systems used for the cluster.

MAC architecture

The network and storage configuration of the Multi Arch Compute OCP cluster matches that of the PowerVM cluster, using the private 25G SR-IOV-based network and local NVMe storage to reduce the latency associated with storage I/O. The following table outlines the resource allocation for the Multi Arch Compute OCP cluster nodes.

MAC OCP resource allocation table2

Workload

Apache Fineract is an open-source core banking system for scalable and secure operations of financial institutions. This workload was chosen because it is an open-source project and a real application that can be deployed on an OCP cluster. By default, the Fineract workload uses the Azul JDK and the MariaDB database. Given the wide enterprise adoption of Java and PostgreSQL, JDK 17 and EDB PostgreSQL 14 were used to run the workload instead.

Fineract architecture

The open-source Fineract GitHub repository provides the JAR and WAR files required to build the necessary container images. The workload has two main components: an application server and a database. Spring Boot with its built-in Tomcat server is used for the application server, where the application code performs Java-based read and write operations, connecting to the database through the JDBC API.
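The data-access code itself is not shown in this article. The following minimal JDBC sketch uses a placeholder service host name, database, credentials, table, and query (they are not the actual Fineract schema or configuration) to illustrate the kind of read operation the application pods issue against the EDB PostgreSQL database.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BalanceReadSketch {
    public static void main(String[] args) throws Exception {
        // The application pods reach the database through a cluster-internal
        // service; the host name, database, user, password, table, and column
        // below are illustrative placeholders only.
        String url = "jdbc:postgresql://edb-postgres-svc:5432/fineract";
        try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT balance FROM savings_account WHERE id = ?")) {
            ps.setLong(1, 42L);                       // account id being queried
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    BigDecimal balance = rs.getBigDecimal("balance");
                    System.out.println("Balance: " + balance);
                }
            }
        }
    }
}
```

In the actual workload, Spring Boot would typically manage the data source and connection pooling rather than opening connections directly as shown here.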

To simulate the client-server model, the application code and the database run in different VMs: the application's read operations run on one worker node, the write operations run on a second worker node, and the database runs on a third worker node.

PowerVM OCP configuration

To take full advantage of the compute cores on the Power10 system, the three LPARs (worker nodes) were placed on its first socket, with all 20 of the socket's physical cores allocated to these three worker nodes. With the default SMT8 level, the two worker nodes used for the application each had eight physical cores, while the worker node used for the EDB database had four physical cores. The following table shows the resource allocation of the worker nodes.

OCP Worker nodes resource allocation table3

With eight physical cores and SMT8 allocated to each application worker node, there were 64 virtual CPUs per worker node, which allowed five pods per worker node. With four physical cores and SMT8, the worker node hosting the database instance had 32 virtual CPUs, which were used by a single pod. The following table shows the allocation of resources per pod.

OCP Resources per pod table4

The following diagram shows the OCP layout as described previously.

PowerVM cluster worker nodes

Multi Arch Compute OCP configuration

The three Power-based worker nodes within the Multi Arch Compute OCP cluster were hosted on a separate but identical Power10 system to the one used for the PowerVM cluster. The hardware and resource allocations were also identical to those described earlier.

The three Intel-based worker nodes within the Multi Arch Compute OCP cluster were hosted on the Intel(R) Xeon(R) Platinum 8468V system, which has 48 cores per socket. The two worker nodes used for the application code were each allocated 20 physical cores, and the worker node used for the database had eight physical cores. The following table shows the resource allocation of the worker nodes.

OCP worker nodes resource allocation table5

For the Multi Arch Compute OCP cluster, the same number of pods was used as in the PowerVM cluster described earlier; however, given the core strength of the Power platform, the pod resource allocations differed between the Power and x86 worker nodes. On the x86 platform, each physical core with hyper-threading enabled represents 2 vCPUs, which translates into 2000 millicores of Kubernetes cluster resources. On the Power platform with SMT8, each physical core represents 8 vCPUs, or 8000 millicores. Given these differences, each pod on a Power worker node gets 12800 millicores, or 1.6 physical cores, while each pod on an x86 worker node gets 8000 millicores, or 4 physical cores. The following table shows the allocation of resources per pod.

OCP Resources per pod table6
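As a quick check of the sizing arithmetic above, the snippet below (illustration only; the class and helper method are not part of the study) converts physical cores and hardware threads per core into Kubernetes millicores.

```java
public class PodSizingMath {
    // One hardware thread (vCPU) corresponds to 1000 millicores of Kubernetes CPU.
    static long millicores(double physicalCores, int threadsPerCore) {
        return Math.round(physicalCores * threadsPerCore * 1000);
    }

    public static void main(String[] args) {
        // Power10 worker with SMT8: 1.6 physical cores per pod -> 12800m
        System.out.println("Power pod: " + millicores(1.6, 8) + "m");
        // x86 worker with hyper-threading (2 threads per core): 4 physical cores per pod -> 8000m
        System.out.println("x86 pod:   " + millicores(4.0, 2) + "m");
    }
}
```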

The following diagram shows the OCP layout as described previously.

Multi Arch Compute cluster diagram

Test scenario and results

Test scenarios for the Fineract open banking workload involve various financial transactions and interactions. The tests simulate fund transfers, payments, and withdrawals to assess the system's performance under high load and concurrent user access, providing a thorough evaluation of the resilience and scalability of the application and the EDB PostgreSQL database within the OCP cluster environment. For the stress testing, a JMeter client located outside the OCP cluster used the private 25G SR-IOV-based network to dispatch client requests to the Fineract application. The Fineract application is exposed through services dedicated to its read and write operations, and the read and write pods use JDBC to connect to the database through an internal database service. This network setup ensures efficient communication and data flow within the OCP cluster architecture.
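The load in this study was generated with Apache JMeter, not custom code. Purely to illustrate the request pattern (many concurrent clients calling the read service from outside the cluster), the following toy Java sketch fires concurrent requests at a hypothetical route; the URL, path, and client count are assumptions, not the actual test configuration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LoadPatternSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical route exposed by the service for the read operations.
        URI readUri = URI.create("http://fineract-read.apps.example.com/api/v1/accounts/1/balance");
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(readUri)
                .header("Accept", "application/json")
                .GET()
                .build();

        int concurrentUsers = 50;                        // simulated concurrent clients
        ExecutorService pool = Executors.newFixedThreadPool(concurrentUsers);
        for (int i = 0; i < concurrentUsers; i++) {
            pool.submit(() -> {
                try {
                    HttpResponse<String> response =
                            client.send(request, HttpResponse.BodyHandlers.ofString());
                    System.out.println("status=" + response.statusCode());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```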

Test scenario definition

The main objective of the performance comparison was to measure the number of transactions per second on each of the two clusters. On both clusters, Power worker nodes were used to run the workload. The types of queries, transaction volumes, and data loads that the EDB PostgreSQL database was subjected to were identical. Apache JMeter was used to simulate heavy loads and stress conditions on the application and database. The JMeter script performed realistic user interactions, ensuring the stress test was representative of actual usage patterns. The same JMeter script and server were used to run the workload across the two clusters, as shown below. Note that the CPU allocation for the three worker nodes (LPARs) came from the first socket of the Dual Chip Module (DCM) Power10 system, where the CPUs assigned to two of the LPARs were contained within a single chip, and the third LPAR's CPUs were split across the two chips.

Multi Arch Compute cluster diagram 2

Test results

A full test scenario run consisted of four distinct operations:

  • Write operation to make a deposit to a savings account (SavingDeposit).
  • Write operation to withdraw from a savings account (SavingWithdraw).
  • Write operation to post interest for a savings account (PostInterest).
  • Read operation to inquire the account balance (BalanceInquiry).

A single run with the four operations described above took 10 minutes. At the end of each run, the application was redeployed and the database was restored with no transaction data. The run was performed six times for each cluster. From the following test results, it is evident that there is no significant performance difference between the Multi Arch Compute and PowerVM clusters.

Test result1
Test result2
Test result3

Summary

Multi Arch Compute in an OpenShift cluster enables the use of diverse hardware architectures within a single cluster. This approach provides flexibility in choosing the hardware best suited for specific workloads, optimizes resource utilization by matching workloads with the most appropriate architecture, and can lead to cost savings by selecting hardware that is more cost-effective for certain tasks. Ultimately, the decision to adopt a Multi Arch Compute approach in an OpenShift cluster depends on the specific needs of the applications, the available hardware landscape, and other relevant variables.
