When deploying containerized IBM API Connect Gateways in a cloud such as AWS, Azure or GCP, you want the Kubernetes scheduler to spread these containers across multiple Availability Zones for resiliency. The IBM API Connect Operator does not ensure this by default, but it can be instructed to do so.
In this blog post I will explain how to configure IBM API Connect Custom Resources to ensure that IBM API Connect Gateway Pods are spread across all possible Availability Zones.
Availability Zones, Node Affinities and Anti-affinities
Availability Zones (AZs) are logically or physically segregated locations where compute resources are placed. They are engineered to be isolated from failures in other AZs and to provide low-latency connectivity between them; they often correspond to separate data centers. Hosting applications across multiple AZs provides resiliency against failures local to a single AZ. High Availability (HA) mechanisms in Kubernetes rely on the concept of quorum, so it is desirable to have three AZs to host highly resilient applications and to ensure that Kubernetes or Red Hat OpenShift clusters deployed in a cloud span all available AZs.
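In Kubernetes, the mapping between worker nodes and AZs is typically exposed through the well-known topology.kubernetes.io/zone node label, which is what zone-aware scheduling rules key on. A minimal sketch of what such labels look like on a cloud-provisioned worker node (node name, region and zone values are illustrative):

apiVersion: v1
kind: Node
metadata:
  name: worker-1                              # illustrative node name
  labels:
    # well-known labels set by the cloud provider / cluster installer
    topology.kubernetes.io/region: eu-west-1
    topology.kubernetes.io/zone: eu-west-1a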
In Kubernetes and Red Hat OpenShift, node affinity and anti-affinity rules let you influence on which nodes your Pods are scheduled. Affinities and anti-affinities can be either soft or hard.
A hard affinity or anti-affinity rule (requiredDuringSchedulingIgnoredDuringExecution) prioritizes HA placement rules over capacity and uptime. In real-life deployments a soft rule (preferredDuringSchedulingIgnoredDuringExecution) is usually more desirable, because it prioritizes capacity and uptime over the HA rules. To better capture the difference between hard and soft rules, consider the following situation, with three nodes each running one replica of the same application:
In the above scenario, when Node 1 fails and the application Pods use a hard rule, the Kubernetes scheduler will NOT reschedule the failed Pods onto Node 2 or Node 3, as that would break the HA rules; a hard rule effectively prevents running more than one replica of the same application on a single node.
However, with a soft rule the scheduler will reschedule the failed Pod onto another node, provided there are sufficient resources to run it there. The downside of soft rules is that placement is not guaranteed: Kubernetes takes other aspects, such as available CPU and memory, into consideration when scheduling Pods.
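Expressed as Pod anti-affinity in a workload spec, the two variants look roughly like this (a minimal sketch, assuming a hypothetical application labeled app: my-gateway; in practice you would normally pick one of the two blocks, not both):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-gateway                  # hypothetical workload, for illustration only
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-gateway
  template:
    metadata:
      labels:
        app: my-gateway
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never co-locate two replicas on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: my-gateway
            topologyKey: kubernetes.io/hostname
          # Soft rule: prefer spreading replicas, but allow co-location if needed
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: my-gateway
              topologyKey: kubernetes.io/hostname
      containers:
      - name: gateway
        image: example/gateway:latest   # placeholder image

Note that changing topologyKey from kubernetes.io/hostname to topology.kubernetes.io/zone would spread replicas across Availability Zones rather than across nodes, which is the idea this post builds on.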
To learn more about node affinities and anti-affinities, please refer to the Kubernetes documentation.
The problem
When deploying IBM API Connect Gateways, the IBM API Connect Operator by default applies soft anti-affinity at the node level, which would most likely result in the following scenario when the cluster has as many nodes as AZs:
However, Kubernetes and Red Hat OpenShift clusters are often configured with more nodes than AZs and host many pods other than those for IBM API Connect. In this scenario, the default settings can result in the following:
Analytics subsystem:
- ingestion
- storage
- director
- mtls-gw
Management subsystem:
- apim
- taskmanager
- ldap
- lur
- ui
- client-downloads-server
- analytics-proxy
- portal-proxy
Portal subsystem: