Best practice sharing - How Streams Flows can accommodate many concurrent users
Author: ngoracke@us.ibm.com , bjhwjia@cn.ibm.com
Background
A customer is preparing for an important educational activity in which 55 students will use the Click Stream Sample concurrently. To support this activity, the customer plans to build a Streams instance that can accommodate 55 concurrent users, and wants to know what CPU/memory configuration is required to ensure performance and stability.
Assumptions
There is sufficient resource capacity and available license entitlement.
CPD 3.5 and the Click Stream Sample are in use.
Solution sharing
Step 1 - Scaling the Lite (Control Plane) & Streams Flows assemblies to the Medium T-Shirt size
Scaling to the Medium T-Shirt size increases the number of replicas in use, and we recommend doing so:
cpd-cli scale -a lite --config medium -n <namespace>
cpd-cli scale -a streams-flows --config medium -n <namespace>
This Knowledge Center topic is helpful: https://www.ibm.com/support/knowledgecenter/en/SSQNUZ_3.5.0/cpd/admin/scaling-svcs.html
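To confirm that the scaling took effect, the deployments and their replica counts can be listed with standard OpenShift commands. A minimal sketch (the grep pattern is an assumption; exact deployment names vary by release):

oc get deploy -n <namespace>
# Optionally filter to the Streams Flows components (name pattern is an assumption):
oc get deploy -n <namespace> | grep -i streams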
Step 2 - Resource configuration & tuning
If each student will be submitting their own application from Streams Flows simultaneously, consider increasing the "Build pool size maximum" value to accommodate more concurrent builds. Note that this increases the total CPU/memory footprint of Streams in the cluster. If builds can be staggered slightly, or if each student will not be building and submitting their own version of the application, this is not necessary. The default is 5, meaning 5 concurrent builds; additional builds wait in a queue and can potentially time out and need to be retried. This value can be adjusted on the instance edit page. A rough estimate of the queueing effect is sketched below.
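With the default pool of 5 and 55 students building at once, builds proceed in ceil(55/5) = 11 waves. The per-build time below is an assumed placeholder; measure it in your own environment:

# Sketch: estimate worst-case build queue wait (bash)
STUDENTS=55; POOL=5; BUILD_MINUTES=3          # BUILD_MINUTES is an assumption
WAVES=$(( (STUDENTS + POOL - 1) / POOL ))     # ceil(55/5) = 11
echo "waves=$WAVES, worst-case wait ~ $(( (WAVES - 1) * BUILD_MINUTES )) minutes"

Increasing the pool size shortens the waves at the cost of a larger concurrent build footprint.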
If each student will be submitting their own application, the default "Request" and "Limit" sizes are 0.1 Request / 2 Limit CPU and 1 Request / 2 Limit GB memory. Each submitted application requires that much free CPU and memory in the cluster, so for 55 concurrent applications the total is 5.5 Request / 110 Limit CPU and 55 Request / 110 Limit GB memory.
The cluster must have enough free capacity to accommodate the "Request" size. The Limit increases the amount by which each worker is overcommitted, which may reduce reliability and performance. They will also need to increase the "Maximum CPU (cores) for all jobs" from the default of 20 to more than 110 to allow that many concurrent submissions. This value can be adjusted on the instance edit page. A quick capacity check is sketched below.
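Before the class, it is worth checking how much allocatable CPU and memory the worker nodes actually have free. A minimal sketch using standard OpenShift commands (oc adm top requires the cluster metrics server to be available):

# Current utilization per node
oc adm top nodes
# Allocatable capacity and already-allocated requests/limits for one node
oc describe node <node-name> | grep -A 8 "Allocated resources"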

At steady state, the Click Stream application uses about 0.3 CPU and 200Mi memory, so if the cluster is not large enough to use the defaults, the CPU and memory Limits for "Application" nodes can be reduced to 0.5 CPU and 300Mi memory. This decreases the amount by which CPU is overcommitted and reduces the amount of free memory required. They will still need to increase the "Maximum CPU (cores) for all jobs" from the default of 20 to more than (limit value x 55) to allow that many concurrent submissions. These values can be adjusted on the instance edit page.
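The steady-state numbers above can be confirmed against live pods once a few jobs are running. A minimal sketch (metrics server required):

# Observed CPU/memory usage of the running job pods
oc adm top pods -n <namespace>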

For the Streams instance, the default "Limits" should be sufficient, but it is recommended to increase the "Requests" for the REST, Management, Security, and Build pods. Make the "Request" closer to the Limit if that kind of capacity is available in the OpenShift cluster. If not, the largest strain will be CPU on the Management pod, so that should be prioritized first. These values can be adjusted on the instance edit page.
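To see the current Requests and Limits on the instance pods before adjusting them, something like the following works (the pod name is a placeholder; pick the actual REST/Management/Security/Build pods from oc get pods):

oc get pod <pod-name> -n <namespace> \
  -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources}{"\n"}{end}'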

Conclusion
This solution has been verified in the customer's environment and worked smoothly.
#CloudPakforDataGroup