Enterprises operating event-driven, message-centric platforms—especially on OpenShift/Kubernetes—continually struggle to efficiently scale Kafka consumers. Traditional resource-based autoscaling (such as CPU/memory triggers) fails to adapt to dynamic workloads, resulting in:
- Under‑provisioning: Consumers fall behind during traffic surges, increasing lag and impacting downstream processing or user experiences.
- Over‑provisioning: Spending unnecessary resources (and cost) during periods of low or no Kafka traffic.
Consider an e‑commerce microservices architecture deployed on Red Hat OpenShift (or Kubernetes), where:
- Multiple services ingest orders, logs, or notifications via Kafka.
- Traffic patterns are unpredictable—daily peaks during promotions, campaigns, or external triggers.
- If Kafka consumer lag increases, downstream services experience processing delays.
The underlying root issues include:
- Mismatch of scaling triggers: Traditional Horizontal Pod Autoscaler (HPA) responds to resource utilization rather than Kafka-specific events.
- Consumer lag accumulation: Lag builds up when consumers can’t keep pace with incoming messages, leading to bottlenecks.
- Static resource allocation: Static replica counts or threshold-based scaling can’t adapt rapidly to fluctuating load.
- Idle or redundant replicas: Without awareness of Kafka partitions and lag, extra consumers either sit idle or are insufficient during spikes.
KEDA: The Solution to Kafka-Based Scaling Challenges
KEDA (Kubernetes Event-Driven Autoscaler) is an open-source component for Kubernetes that allows applications to scale dynamically based on the number of events in various external systems like message queues or databases. It works alongside the standard Kubernetes Horizontal Pod Autoscaler (HPA), extending its capabilities to enable scale-to-zero functionality and more responsive, cost-efficient resource management by reacting to real-world demands beyond just CPU and memory usage.
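Concretely, KEDA is configured through a ScaledObject resource that binds a workload to one or more event-source triggers. The sketch below shows the general shape with placeholder names; the deployment name, broker address, and threshold are all illustrative:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-scaledobject
spec:
  scaleTargetRef:
    name: my-deployment        # workload (e.g. a Deployment) to scale
  minReplicaCount: 0           # enables scale-to-zero
  maxReplicaCount: 10
  triggers:
    - type: kafka              # one of many supported event-source scalers
      metadata:
        bootstrapServers: my-kafka:9092
        consumerGroup: my-group
        topic: my-topic
        lagThreshold: "10"     # target lag per replica
```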
Key Features and Benefits
- Event-driven triggers: scaling decisions are based on real event metrics, such as Kafka consumer lag, rather than CPU or memory alone.
- Scale-to-zero: idle workloads can be scaled down to zero replicas, so no resources are consumed when there is no traffic.
- Works with the HPA: KEDA extends the standard Horizontal Pod Autoscaler rather than replacing it.
- Broad scaler catalog: built-in scalers for Kafka and many other event sources, such as message queues and databases.
What is Kafka consumer lag?
In Apache Kafka, consumer lag is the difference between the offset of the last message produced to a partition and the offset of the last message the consumer group has committed; in other words, it measures how many messages the consumers are behind (kafka-consumer-groups.sh reports it as LAG = LOG-END-OFFSET - CURRENT-OFFSET).
Some amount of lag is inevitable, because it always takes time for data to move from producers to consumers. In a well-designed, well-managed Kafka cluster, however, lag should stay minimal, with messages typically consumed within milliseconds of being produced.
Prerequisites
- OpenShift cluster: ensure you have a running OpenShift cluster set up and accessible.
- Kafka cluster: ensure the Red Hat Streams for Apache Kafka operator is installed and a Kafka cluster instance is created.
- KEDA installation: KEDA needs to be installed on your OpenShift cluster before you can use it.
Install KEDA on s390x with the following steps.
Download the KEDA CRDs/deployment YAML (keda-2.17.2.yaml),
Edit the YAML to use the image tag "main" instead of "2.17.2", so that KEDA images with s390x support are pulled:
image: ghcr.io/kedacore/keda-admission-webhooks:main
image: ghcr.io/kedacore/keda-metrics-apiserver:main
image: ghcr.io/kedacore/keda:main
Apply the edited manifest,
oc apply -f keda-2.17.2.yaml
Test autoscaling with KEDA and Kafka:
Create a consumer application deployment,
NOTE: This is a sample deployment used for this walkthrough; you can reuse it as-is to reproduce the validation.
image: registry.redhat.io/amq-streams/kafka-40-rhel9:3.0.0-15
bin/kafka-console-consumer.sh --bootstrap-server my-cluster-kafka-bootstrap.test.svc:9092 --topic orders --group order-processing-group --from-beginning
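The image and consumer command above slot into a minimal Deployment such as the following sketch. The deployment name order-processor, its labels, and the single replica are assumptions for this walkthrough; only the image and the kafka-console-consumer.sh command come from the sample:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-processor           # assumed name for this walkthrough
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: order-processor
  template:
    metadata:
      labels:
        app: order-processor
    spec:
      containers:
        - name: consumer
          image: registry.redhat.io/amq-streams/kafka-40-rhel9:3.0.0-15
          command:
            - /bin/sh
            - -c
            - >-
              bin/kafka-console-consumer.sh
              --bootstrap-server my-cluster-kafka-bootstrap.test.svc:9092
              --topic orders
              --group order-processing-group
              --from-beginning
```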
Create a ScaledObject,
apiVersion: keda.sh/v1alpha1
name: order-processor-scaledobject
namespace: test # make sure this matches your deployment's namespace
bootstrapServers: my-cluster-kafka-bootstrap.test.svc:9092
consumerGroup: order-processing-group
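Putting the fields above into a complete ScaledObject might look like the sketch below. The scaleTargetRef name (the consumer Deployment), the orders topic, the replica bounds, and the lagThreshold of 500 (matching the lag check later in this walkthrough) are assumptions to adapt to your environment:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaledobject
  namespace: test                     # must match the deployment's namespace
spec:
  scaleTargetRef:
    name: order-processor             # assumed consumer Deployment name
  minReplicaCount: 0                  # scale to zero when there is no lag
  maxReplicaCount: 3                  # at most one consumer per partition
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: my-cluster-kafka-bootstrap.test.svc:9092
        consumerGroup: order-processing-group
        topic: orders
        lagThreshold: "500"           # target lag per replica before scaling out
```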
Now produce a large number of messages to the topic and verify that autoscaling behaves correctly.
Create a Kafka Topic
./kafka-topics.sh --create --topic orders --bootstrap-server my-cluster-kafka-bootstrap.test.svc:9092 --replication-factor 3 --partitions 3
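Since the Streams for Apache Kafka operator is installed, the topic can alternatively be declared as a KafkaTopic custom resource. The strimzi.io/cluster label must name your Kafka cluster (my-cluster is assumed here, matching the bootstrap address used throughout):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: orders
  namespace: test
  labels:
    strimzi.io/cluster: my-cluster    # must match the Kafka cluster name
spec:
  partitions: 3
  replicas: 3
```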
Produce a large number of messages to the topic
./kafka-producer-perf-test.sh --topic orders --num-records 1000000 --throughput -1 --producer-props bootstrap.servers=my-cluster-kafka-bootstrap.test.svc:9092 batch.size=1000 acks=1 linger.ms=100000 buffer.memory=4294967296 compression.type=none request.timeout.ms=300000 --record-size 1000
This continuously produces a large number of messages (1,000,000 records of 1 KB each) to the topic.
Check the lag,
oc run kafka-lag-check -ti --image=registry.redhat.io/amq-streams/kafka-40-rhel9:3.0.0-15 --rm --restart=Never -- bash -c "bin/kafka-consumer-groups.sh --bootstrap-server my-cluster-kafka-bootstrap.test.svc:9092 --describe --group order-processing-group"
Here the lag exceeds 500, so KEDA should scale the consumer deployment up (and later back down).
Once the producer has finished and the consumers catch up, check whether the deployment scales back down,
Conclusion:
By combining KEDA with Red Hat OpenShift and IBM Z / LinuxONE, enterprises can unlock truly event-driven, intelligent scaling for Kafka-based applications. Instead of relying on static, resource-based triggers, workloads scale automatically with real-time Kafka traffic, keeping consumer lag in check, reducing cost, and improving responsiveness. This integration delivers performance, efficiency, and reliability for mission-critical, event-driven architectures.