Performance Check List of OCP for CP4BA 24.0.0

By Johanne Sebaux posted Mon July 15, 2024 05:56 AM

  

Target audience: Performance Tester with Administrator role

Estimated duration: 90 minutes

Article also available as PDF in the library: https://community.ibm.com/community/user/automation/viewdocument/performance-check-list-of-ocp-for-c-2?CommunityKey=c0005a22-520b-4181-bfad-feffd8bdc022&tab=librarydocuments 

Moving to production requires attention to some core performance aspects, so that you avoid troubleshooting afterward.

Here is a list of checkpoints we gathered for planning a deployment of Cloud Pak for Business Automation (CP4BA) 24.0.0, with a focus on:

  - Operational Decision Manager (ODM)

  - Automation Decision Service (ADS)

For more details on these components and capabilities, please refer to the CP4BA documentation.

This article takes the performance tester's point of view and splits the checkpoints into several categories:

  - Hardware

  - Networking

  - Runtimes

  - Performance tests

Hardware


Checkpoint

  • Verify that the underlying hardware is not obsolete.

Procedure

  1. In the OpenShift console, open a Terminal on a Pod or a Worker Node
  2. Run the command: cat /proc/cpuinfo
  3. Also verify that your cluster is homogeneous at the hardware level
  4. The bogomips value can serve as a rough reference for comparing your environments

Kubectl command line example

kubectl exec icp4adeploy-odm-decisionserverruntime-69b8d46c77-jqfv4 -n dba2400 -- cat /proc/cpuinfo
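
To compare nodes quickly, the cpuinfo output can be condensed to the CPU model and bogomips value. The following is a minimal sketch, assuming a Linux x86 /proc/cpuinfo layout with "model name" and "bogomips" fields; the function name summarize_cpuinfo is hypothetical and not part of any CP4BA tooling.

```shell
# Condense /proc/cpuinfo text (read from stdin) into one line per distinct
# CPU model / bogomips pair, prefixed by the number of logical CPUs.
# summarize_cpuinfo is a hypothetical helper name (assumption).
summarize_cpuinfo() {
  awk -F': *' '
    /^model name/ { model = $2 }
    /^bogomips/   { count[model " | bogomips " $2]++ }
    END { for (k in count) print count[k] " x " k }
  ' | sort
}

# Possible usage, with the pod and namespace from the example above:
#   kubectl exec <pod> -n dba2400 -- cat /proc/cpuinfo | summarize_cpuinfo
```

Running it against a pod on each worker node and comparing the resulting lines gives a quick view of whether the cluster is homogeneous.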

Networking


Network performance is key, especially with a microservice architecture like that of CP4BA.

In this category, we target Router configuration, Route annotations and Load balancing settings.

Router


Checkpoint

  • Verify that the router is correctly sized. To do this, check the number of router pods and their CPU consumption.

Procedure

  1. In the OpenShift console, open: Home > Search > IngressController > all projects > default > replicas (default was 2)
  2. Update the value to 5 replicas if needed 

Kubectl command line example


Get the number of replicas for the default Ingress Controller:

kubectl get IngressController default -n openshift-ingress-operator -o=jsonpath='Replicas: {.status.availableReplicas}{"\n"}'

Increase the number of replicas for the default Ingress Controller:

kubectl patch IngressController default -n openshift-ingress-operator --type=json -p '[{ "op": "replace", "path": "/spec/replicas", "value": 5 }]'
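
The checkpoint also asks for the CPU consumption of the router pods. The following is a minimal sketch, assuming cluster metrics are available so that kubectl top works, and that the router pods run in the openshift-ingress namespace; the helper name flag_busy_pods and the 500 millicore threshold are assumptions chosen for illustration.

```shell
# Print pods whose CPU usage is at or above a threshold (in millicores).
# Input: lines in the "NAME CPU MEMORY" shape printed by
# `kubectl top pods --no-headers`, read from stdin.
# flag_busy_pods is a hypothetical helper; the threshold is an assumption.
flag_busy_pods() {
  threshold="$1"
  awk -v max="$threshold" '
    {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu) }   # "250m" -> 250 millicores
      else            { cpu = cpu * 1000 }     # "2" cores -> 2000 millicores
      if (cpu + 0 >= max + 0) print $1 " " cpu "m"
    }
  '
}

# Possible usage:
#   kubectl top pods -n openshift-ingress --no-headers | flag_busy_pods 500
```

If router pods show up in the output while the bench is running, that suggests increasing the number of replicas.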

HAProxy


Checkpoint

  • Verify the HAProxy configuration to ensure that the load is balanced in round-robin mode and that nbthread is set to an appropriate value
  • Here we check HAProxy as the load balancer, but this checkpoint applies to whatever load balancing solution you use, following its own procedure.

Procedure

  1. Find the HAProxy node address. The IT person in charge of the cluster setup should be able to help.
  2. Use command line: ssh root@<haproxy address>
  3. Look into the haproxy.cfg
    1. Use command line: vi /etc/haproxy/haproxy.cfg
    2. Check the ingress-https backend, and switch from "balance source" to "balance roundrobin"
    3. Check the nbthread value (default is 1; here we advise increasing it, for example to 4 or 5)
  4. Restart HAProxy
    1. Use command line: systemctl daemon-reload
    2. Use command line: systemctl restart haproxy
    3. (optional) Use command line: systemctl status haproxy
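
The two settings checked above can also be read without opening an editor. The following is a minimal sketch, assuming a conventional haproxy.cfg with one directive per line; the helper name show_haproxy_balance is hypothetical.

```shell
# Read an haproxy.cfg from stdin and print the nbthread value (if set) and
# the balance algorithm of each backend or listen section.
# show_haproxy_balance is a hypothetical helper name (assumption).
show_haproxy_balance() {
  awk '
    $1 == "nbthread" { print "nbthread: " $2 }
    $1 == "global" || $1 == "defaults" || $1 == "frontend" { section = "" }
    $1 == "backend" || $1 == "listen" { section = $2 }
    $1 == "balance" && section != "" { print section ": " $2 }
  '
}

# Possible usage:
#   ssh root@<haproxy address> cat /etc/haproxy/haproxy.cfg | show_haproxy_balance
```

An ingress-https line reporting "source" instead of "roundrobin", or a missing nbthread line, points to the settings to change in step 3.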

Route Annotation for ADS and ODM runtimes


Checkpoint

  • Verify that the load is balanced across all runtimes.
  • This annotation is key when the application that calls the decision services runs from a limited set of source addresses.

Procedure

  1. In the OpenShift console, open the Routes and select the cpd route (the landing page used to access ADS and ODM)
  2. Verify that the annotation haproxy.router.openshift.io/balance: roundrobin is set on the cpd route

Kubectl command line example


Get Route annotation from the Zen route:

kubectl get Route cpd -n dba2400 -o=jsonpath='Balance: {.metadata.annotations.haproxy\.router\.openshift\.io/balance}{"\n"}'

Update the annotation if needed:

kubectl annotate route cpd -n dba2400 --overwrite haproxy.router.openshift.io/balance='roundrobin'
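
To review the balance annotation of every route in the namespace at once, the same jsonpath can be applied over all routes. The following is a minimal sketch; the helper name routes_without_roundrobin is hypothetical, and only the routes that front the decision runtimes matter for this checkpoint.

```shell
# Read "route,balance" lines from stdin and print the routes whose balance
# annotation is missing or different from roundrobin.
# routes_without_roundrobin is a hypothetical helper name (assumption).
routes_without_roundrobin() {
  awk -F',' '$2 != "roundrobin" {
    print $1 " (balance: " ($2 == "" ? "unset" : $2) ")"
  }'
}

# Possible usage, producing one "route,balance" line per route:
#   kubectl get routes -n dba2400 -o=jsonpath='{range .items[*]}{.metadata.name}{","}{.metadata.annotations.haproxy\.router\.openshift\.io/balance}{"\n"}{end}' | routes_without_roundrobin
```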

Runtimes

To get the best performance from the ODM and ADS runtimes, you must consider CPU and memory settings.

We advise checking CPU/memory requests AND limits and aligning their values.

In your CASE package, go to the relevant folder in cert-kubernetes/descriptors/patterns to find all of the templates:

For ODM, use the fully customizable decisions template, ibm_cp4a_cr_production_FC_decisions.yaml, and copy the relevant lines into your CR file.

For ADS, use the fully customizable decisions template, ibm_cp4a_cr_production_FC_decisions_ads.yaml, and copy the relevant lines into your CR file.

For more information about downloading your CASE package, see Preparing a client to connect to the cluster.

Checkpoint

  • Verify that CPU/memory requests AND limits for ODM and ADS runtimes are equal

Procedure

  1. These values are set inside the CR (custom resource)

Kubectl command line example


Display the current Decision Server Runtime configuration:

kubectl get ICP4AClusters dba2400bai -n dba2400 -o=jsonpath='Decision Server Runtime Config: {.spec.odm_configuration.decisionServerRuntime}{"\n"}'


Check that the Decision Server Console is running (this component is always installed in an ODM deployment):

echo "Decision Server Console Status: $(kubectl get pod -n dba2400 | grep decisionserverconsole | awk '{print $3}')"


Display the current Decision Center configuration:

kubectl get ICP4AClusters dba2400bai -n dba2400 -o=jsonpath='Decision Center Console Config: {.spec.odm_configuration.decisionCenter}{"\n"}'


Display the current Decision Runner configuration:

kubectl get ICP4AClusters dba2400bai -n dba2400 -o=jsonpath='Decision Runner Config: {.spec.odm_configuration.decisionRunner}{"\n"}'


The same patterns apply to ADS.
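
The values in the CR can be cross-checked against what the running pods actually carry. The following is a minimal sketch, assuming the runtime is the first container of each pod; the helper name report_unaligned_resources is hypothetical.

```shell
# Read "pod,cpu_request,cpu_limit,mem_request,mem_limit" lines from stdin and
# print the pods whose requests and limits differ or are unset.
# The comparison is textual: "1" and "1000m" are reported as different.
# report_unaligned_resources is a hypothetical helper name (assumption).
report_unaligned_resources() {
  awk -F',' '
    $2 == "" || $4 == "" || ($2 "") != ($3 "") || ($4 "") != ($5 "") {
      print $1 ": cpu " $2 "/" $3 ", memory " $4 "/" $5
    }
  '
}

# Possible usage (assumes the first container of each pod is the runtime):
#   kubectl get pods -n dba2400 -o=jsonpath='{range .items[*]}{.metadata.name}{","}{.spec.containers[0].resources.requests.cpu}{","}{.spec.containers[0].resources.limits.cpu}{","}{.spec.containers[0].resources.requests.memory}{","}{.spec.containers[0].resources.limits.memory}{"\n"}{end}' \
#     | grep decisionserverruntime | report_unaligned_resources
```

An empty output means every listed pod has identical requests and limits for both CPU and memory.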

Performance tests


You will want to verify that your performance results are sufficient, and for this you might already have invested in a performance tool such as JMeter.

JMeter is good at "injecting" requests to load the runtimes.

In this case, you should monitor the behavior of the JMeter process and the network latency on its side, both of which can distort performance results.

JMeter


Checkpoint

  • Verify JMeter process usage (CPU/RAM)
  • CPU usage should not reach 80%
  • Network usage should not reach 75%

Procedure

  1. From your injector machine, run "top" or "htop" to watch CPU/RAM usage while JMeter is running, to confirm that the bottleneck is not on the JMeter side: the injector should not reach 100% of its CPU limit.
  2. If you want to test your application without the network overhead (session creation and authentication), prefer using:
    1. The Basic Authentication method
    2. The "Same user on each iteration" option checked on the Thread Group, in combination with an HTTP Cookie Manager
    3. The "Use KeepAlive" option checked on the HTTP Request

Latency


Checkpoint

  • Verify the latency between the machine where JMeter is executed and the OpenShift cluster under test

Procedure

  1. Ping the cluster from the bench machine
  2. Check that the average round-trip time is under 50 ms.
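
The two steps above can be scripted. The following is a minimal sketch, assuming a ping build that prints a "min/avg/max/..." summary line (as common Linux and macOS versions do); the helper names avg_rtt_ms and rtt_under are hypothetical.

```shell
# Extract the average round-trip time (ms) from ping output read on stdin.
# Relies on the "min/avg/max/..." summary line; the average is the value
# found in the fifth "/"-separated field of that line.
avg_rtt_ms() {
  awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Succeed (exit 0) when the average read from stdin is under the limit ($1).
# Fails when no summary line is found or the average is at or above the limit.
rtt_under() {
  limit="$1"
  avg=$(avg_rtt_ms)
  [ -n "$avg" ] && awk -v a="$avg" -v l="$limit" 'BEGIN { exit !(a + 0 < l + 0) }'
}

# Possible usage from the bench machine:
#   ping -c 10 <cluster host> | rtt_under 50 && echo "latency OK"
```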

Take Away


Useful troubleshooting documentation is available:

  - ODM: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/24.0.0?topic=manager-troubleshooting

  - ADS: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/24.0.0?topic=services-troubleshooting

#AutomationDecisionServices(ADS)

#OperationalDecisionManager(ODM)

#CloudPakforBusinessAutomation

#RedHatOpenShift

#topology
