To simplify operational monitoring and alerting on Red Hat OpenShift, the IBM MQ Certified Container that is delivered in Cloud Pak for Integration emits a range of queue manager (server) scope metrics using a Prometheus interface that is consumed by the OpenShift monitoring tools.
Alongside metrics for the queue manager, another monitoring best practice is to track metrics for individual queues and topics, such as queue depth, that give an indication of application health. However, the MQ Certified Container does not currently offer an option to publish this finer-grained data. It has always been possible to configure monitoring for IBM MQ queue and topic metrics in OpenShift with some manual effort, but recent changes have made it easier than ever, as you’ll see in this tutorial!
In this blog I will show how to deploy an MQ Prometheus monitor pod that connects to the queue manager in order to publish the additional metrics that are not exposed directly. Using a demonstration cluster running in OpenShift on IBM Cloud I will then show how to use the native IBM Cloud Monitoring service to integrate the new data points with your operational monitoring and alerting processes such as PagerDuty.
Although the example here uses IBM Cloud Monitoring, the same techniques can be used in any environment using the Prometheus interface to inject the metrics into your own preferred monitoring tool.
Let's get started!
Step 1: Build the MQ metrics package to create a Prometheus monitor pod for the queue manager
The IBM MQ Prometheus monitor is part of the mq-metric-samples open-source repo hosted on github.com, which enables us to quickly and easily build a Prometheus monitoring agent that connects to an IBM MQ queue manager and exposes a wide variety of statistics about that queue manager.
To get started, let’s download and compile the Prometheus monitor into a Docker container locally on our laptop.
# Create a directory to host the GitHub repo if you don’t have one already
# Clone the metrics repo onto your local machine
git clone git@github.com:ibm-messaging/mq-metric-samples.git
Now we can use the pre-supplied script to create a local Docker container that exposes a Prometheus endpoint. The endpoint presents our MQ metrics in the standard format defined by Prometheus, so the data can be easily consumed by a wide range of monitoring offerings.
Building container mq-metric-samples-gobuild:5.2.0
Compiled programs should now be in /Users/myuser/tmp/mq-metric-samples/bin
=> exporting to image
=> => exporting layers
=> => writing image sha256:1333ef...69d7c
=> => naming to docker.io/library/mq-metric-prometheus:5.2.0
We can see the built Docker container on the local machine as follows:
REPOSITORY TAG IMAGE ID CREATED SIZE
mq-metric-prometheus 5.2.0 13332223dbe2 2 minutes ago 202MB
Optionally, if we have a queue manager that is accessible from the local laptop then we can launch the metrics agent Docker container to demonstrate that the container exposes the Prometheus endpoint as we expect:
# Modify the values in the environment variables to match your configuration
# Keep the CONFIGURATIONFILE variable set to the empty string
docker run --rm -p 9157:9157 \
-e IBMMQ_CONNECTION_QUEUEMANAGER="QM1" \
-e IBMMQ_CONNECTION_CONNNAME="myhost(1414)" \
-e IBMMQ_CONNECTION_CHANNEL="SYSTEM.DEF.SVRCONN" \
-e IBMMQ_CONNECTION_USER="myusername" \
-e IBMMQ_CONNECTION_PASSWORD="mypassword" \
-e IBMMQ_GLOBAL_CONFIGURATIONFILE="" \
mq-metric-prometheus:5.2.0
IBM MQ metrics exporter for Prometheus monitoring
Build : 20210410-173628
Commit Level : 3dd2c0d
Build Platform: Darwin/
INFO Trying to connect as client using ConnName: myhost(1414), Channel: SYSTEM.DEF.SVRCONN
INFO Connected to queue manager QM1
INFO IBMMQ Describe started
INFO Platform is UNIX
INFO Listening on http address :9157
You can then connect to the local Prometheus endpoint using your browser at http://localhost:9157/metrics
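If you would rather check the endpoint from a script than from the browser, the text it returns is straightforward to parse. The following is a minimal Python sketch, assuming the monitor emits a gauge named ibmmq_queue_depth carrying a "queue" label (the names used later in this tutorial). The function name queue_depths and the sample text, including its label values, are my own illustrations and were not captured from a real queue manager.

```python
import re

# Matches a sample line such as: name{label="value",...} 12
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# Matches one label pair such as: queue="MARKETING"
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def queue_depths(metrics_text, metric="ibmmq_queue_depth"):
    """Return {queue name: depth} for one metric in Prometheus text format."""
    depths = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        if "queue" in labels:
            depths[labels["queue"]] = float(match.group("value"))
    return depths


# Illustrative sample only (hypothetical values)
SAMPLE = '''\
# HELP ibmmq_queue_depth Queue Depth
# TYPE ibmmq_queue_depth gauge
ibmmq_queue_depth{platform="UNIX",qmgr="QM1",queue="MARKETING"} 12
ibmmq_queue_depth{platform="UNIX",qmgr="QM1",queue="ORDERS"} 0
ibmmq_qmgr_connection_count{platform="UNIX",qmgr="QM1"} 23
'''

if __name__ == "__main__":
    print(queue_depths(SAMPLE))
```

Against a running monitor, the text could be fetched with urllib.request.urlopen("http://localhost:9157/metrics").read().decode() and passed to the same function.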
Step 2: Deploy the Prometheus monitor to an OpenShift cluster and see it presenting queue depth data via the Prometheus endpoint
Now we will deploy the Prometheus monitor to a real OpenShift cluster – in this example running in IBM Cloud.
To start with, we have to push the Docker image from our local machine up to a container registry instance that can be accessed by the cluster, for which we will use the IBM Cloud Container Registry:
ibmcloud cr login
ibmcloud cr region-set uk-south
ibmcloud cr namespace-add my-icr-namespace
docker tag mq-metric-prometheus:5.2.0 uk.icr.io/my-icr-namespace/mq-metric-prometheus:1.0
docker push uk.icr.io/my-icr-namespace/mq-metric-prometheus:1.0
ibmcloud cr image-list --restrict my-icr-namespace
Repository Tag Digest Namespace Created Size
uk.icr.io/my-icr-namespace/mq-metric-prometheus 1.0 3e1111098ab2 my-icr-namespace 1 hour ago 76 MB
Now we log in to the OpenShift cluster using the “oc” CLI tool, switch to the OpenShift project (Kubernetes namespace) where our queue manager is running, and if we haven’t done so already we add the configuration to allow that namespace to pull images from the Container Registry:
# Apply your specific login properties here to match your cluster
oc login --token=sha256~<yourtoken> --server=https://yourserver:yourport
# Switch to the namespace that contains your IBM MQ queue manager pod
oc project cp4i
# If needed, copy the ICR secret to this namespace so that it can pull the container image
# (insert your own namespace in place of “cp4i” if needed)
oc get secret all-icr-io -n default -o yaml | sed 's/default/cp4i/g' | oc create -n cp4i -f -
We will pass the configuration settings to the metrics monitor pod using a ConfigMap and Secret, so create those now, modifying the Connection properties in the example below with the values necessary to address the queue manager, and applying any other customization you wish to carry out. In particular, the CONNNAME attribute must be set to the name of the “Service” object that is created for your queue manager, which allows it to be accessed from inside the cluster.
The “objects” and “global” properties specified in the ConfigMap example below give the following useful behavior for our specific scenario:
- The QUEUES attribute instructs the monitoring agent to look for any queues that don’t start with SYSTEM or AMQ (those being generally for internal use by the queue manager). You may wish to customize this further to meet your needs
- The SUBSCRIPTIONS attribute excludes subscriptions owned by the queue manager itself
- The TOPICS attribute disables metrics for all topics as we are going to look only at queues in this example
- The settings for USEPUBLICATIONS and USEOBJECTSTATUS reduce the set of metrics that will be emitted by this monitor so that it avoids overlapping too much with the metrics that are already emitted automatically by the queue manager container
- The CONFIGURATIONFILE attribute is set to empty string and instructs the container not to look for a configuration file, since we are applying our configuration using environment variables from the ConfigMap
- LOGLEVEL is set to INFO – in some cases you might modify this to DEBUG if you wish to do detailed investigation of the statistics
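To make the intent of the QUEUES filter concrete, here is a small Python sketch of "include everything except SYSTEM and AMQ queues". It assumes a comma-separated pattern list in which a leading "!" marks an exclusion; the pattern string, the queue names and the select_queues function are all hypothetical, and the sketch models the behavior described above rather than reproducing the monitor's own matching code.

```python
from fnmatch import fnmatchcase


def select_queues(queue_names, patterns):
    """Keep names that match an include pattern and no exclude pattern."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    selected = []
    for name in queue_names:
        included = any(fnmatchcase(name, p) for p in includes)
        excluded = any(fnmatchcase(name, p) for p in excludes)
        if included and not excluded:
            selected.append(name)
    return selected


# Hypothetical pattern list and queue names, for illustration only
PATTERNS = "*,!SYSTEM.*,!AMQ.*".split(",")
QUEUES = ["MARKETING", "ORDERS", "SYSTEM.DEFAULT.LOCAL.QUEUE", "AMQ.60A1B2C3"]

if __name__ == "__main__":
    print(select_queues(QUEUES, PATTERNS))
```

Trying a filter against a list of your own queue names in this way is a quick check that application queues are not excluded by accident before you deploy the pod.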
oc create configmap metrics-configuration \
# Also create a Secret that defines the username and password to be used to access
# the queue manager. Leave these values empty if no credentials are required
oc create secret generic metrics-credentials \
It is important that the settings you define above allow the metrics pod to connect successfully to the queue manager. For example, if you are using the default settings for an MQ v9.2 queue manager in Cloud Pak for Integration then applications are not able to connect using the SYSTEM.DEF.SVRCONN channel. You can relax that default restriction for development purposes using the following command against your queue manager pod. In other cases you may need to check your queue manager security configuration to establish how to grant access to your application.
oc exec quickstart-cp4i-ibm-mq-0 -- /bin/bash -c "echo 'SET CHLAUTH('SYSTEM.*') TYPE(ADDRESSMAP) ADDRESS(*) ACTION(REMOVE)' | runmqsc"
Now is also a good time to create any queues that you want to use in your queue manager. By default the metrics pod queries the list of queues at startup time, and then only once per hour after that (which can be modified using the IBMMQ_GLOBAL_REDISCOVERINTERVAL property if desired). If you create additional queues after deploying the metrics pod, they will not show up in the metrics until you either restart the metrics pod or wait for up to an hour!
We’re now ready to deploy the metrics pod using the sample OpenShift objects provided in the GitHub repo:
# Create a new ServiceAccount that will ensure the metrics pod is
# deployed using the most secure Restricted SCC
oc apply -f sa-pod-deployer.yaml
# Update the spec.containers.image attribute in metrics-pod.yaml to match
# your container registry and image name
# Deploy the metrics pod using the service account
oc apply -f ./metrics-pod.yaml --as=my-service-account
# Create a Service object that exposes the metrics pod so that it can
# be discovered by monitoring tools that are looking for Prometheus endpoints
# Note that the spec.selector.app matches the metadata.labels.app property
# defined in metrics-pod.yaml
oc apply -f ./metrics-service.yaml
If everything has gone to plan, we can now look at the logs of the metrics pod and see that it has started up successfully, and is being polled by the monitoring infrastructure roughly once a minute:
oc logs mq-metric-prometheus
IBM MQ metrics exporter for Prometheus monitoring
Build : 20210410-173628
Commit Level : 3dd2c0d
Build Platform: Darwin/
time="2021-04-13T20:12:52Z" level=info msg="Trying to connect as client using ConnName: quickstart-cp4i-ibm-mq(1414), Channel: SYSTEM.DEF.SVRCONN"
time="2021-04-13T20:12:52Z" level=info msg="Connected to queue manager QM1"
time="2021-04-13T20:12:52Z" level=info msg="IBMMQ Describe started"
time="2021-04-13T20:12:52Z" level=info msg="Platform is UNIX"
time="2021-04-13T20:12:52Z" level=info msg="Listening on http address :9157"
time="2021-04-13T20:12:55Z" level=info msg="IBMMQ Collect started 14000001720300"
time="2021-04-13T20:12:55Z" level=info msg="Collection time = 0 secs"
time="2021-04-13T20:13:55Z" level=info msg="IBMMQ Collect started 14000003035700"
time="2021-04-13T20:13:55Z" level=info msg="Collection time = 0 secs"
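To confirm the polling cadence without reading timestamps by eye, the gap between successive collections can be computed from the log. This is a minimal Python sketch that assumes the log line layout shown above (time, level and msg fields); the function name collect_intervals is my own.

```python
import re
from datetime import datetime

# Matches lines such as: time="..." level=info msg="..."
LINE_RE = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+msg="(?P<msg>.*)"'
)


def collect_intervals(log_text):
    """Return the gaps in seconds between successive 'IBMMQ Collect started' lines."""
    starts = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("msg").startswith("IBMMQ Collect started"):
            starts.append(
                datetime.strptime(match.group("time"), "%Y-%m-%dT%H:%M:%SZ")
            )
    # Pair each collection start with the next one and measure the gap
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(starts, starts[1:])
    ]


# The last four log lines from the output above
LOG = '''\
time="2021-04-13T20:12:55Z" level=info msg="IBMMQ Collect started 14000001720300"
time="2021-04-13T20:12:55Z" level=info msg="Collection time = 0 secs"
time="2021-04-13T20:13:55Z" level=info msg="IBMMQ Collect started 14000003035700"
time="2021-04-13T20:13:55Z" level=info msg="Collection time = 0 secs"
'''

if __name__ == "__main__":
    print(collect_intervals(LOG))
```

Gaps that are much longer than a minute, or no "Collect started" lines at all, would suggest that the monitoring infrastructure is not discovering the endpoint.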
Optionally, if you want to see the data being emitted by the metrics pod, you can make your own call to the Prometheus endpoint by exec’ing into your queue manager pod and using curl to call the endpoint, for example:
oc exec quickstart-cp4i-ibm-mq-0 -- /bin/bash -c "curl mq-metric-prometheus-service:9157/metrics"
# HELP ibmmq_qmgr_channel_initiator_status Channel Initiator Status
# TYPE ibmmq_qmgr_channel_initiator_status gauge
# HELP ibmmq_qmgr_command_server_status Command Server Status
# TYPE ibmmq_qmgr_command_server_status gauge
# HELP ibmmq_qmgr_connection_count Connection Count
# TYPE ibmmq_qmgr_connection_count gauge
If your metrics pod does not work as shown in the previous two snippets you will need to debug the cause of the failure, which is typically either due to problems connecting to the queue manager (such as incorrect hostname or port), or authorization errors when connecting or opening relevant queues (due to the MQ security settings).
You can generate additional information on failures by setting IBMMQ_GLOBAL_LOGLEVEL=DEBUG in the ConfigMap and then restarting the metrics pod for the change to take effect. The pod will then print all the configuration variables that are read in at startup, along with MQ error codes such as “MQRC_NOT_AUTHORIZED” when failures occur.
Step 3: Configure the IBM Cloud Monitoring service to collect data from the OpenShift cluster
If you haven’t done so already, create an instance of the IBM Cloud Monitoring service, which stores the data from our OpenShift cluster. The “Provisioning an instance” page describes how to do this from the catalog UI or via the IBM Cloud CLI.
Once you have provisioned an instance you will be able to see it from the Monitoring tab of the IBM Cloud Observability dashboard. You can then create a monitoring configuration that connects your cluster to that instance:
# Replace the cluster and instance parameters with the names for your objects
ibmcloud ob monitoring config create --cluster myclustername --instance "IBM Cloud Monitoring-mattr-test"
With the monitoring configuration in place the metrics that are being emitted by the metrics pod will now start flowing automatically into your IBM Cloud Monitoring instance!
Step 4: Visualize the queue depth data in IBM Cloud Monitoring
To see your metrics in action you can now open your Monitoring instance by clicking on the “Open dashboard” link on the right-hand side of the Monitoring tab in the IBM Cloud Observability page.
Click on the “cpu.used.percent” field that is selected by default and type “queue” into the filter dialog, and you will be able to select the “ibmmq_queue_depth” attribute from the drop-down list.
Then continue by configuring the graph settings as shown here:
- Time: Maximum
- Group: Maximum
- Segment by: “queue” (type in this word)
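As I understand these settings, the graph takes the maximum depth over each time window, then the maximum across the sources reporting it, and draws one line per value of the "queue" label. The Python sketch below illustrates that aggregation under those assumptions; the tuple layout, the source name "pod-a" and the depth values are hypothetical, and this is not the service's own implementation.

```python
def max_depth_by_queue(samples):
    """samples: iterable of (timestamp, source, queue, depth) tuples."""
    # Time aggregation: maximum over the window for each (source, queue) series
    per_series = {}
    for _timestamp, source, queue, depth in samples:
        key = (source, queue)
        per_series[key] = max(per_series.get(key, depth), depth)

    # Group aggregation: maximum across sources, segmented by queue
    per_queue = {}
    for (_source, queue), depth in per_series.items():
        per_queue[queue] = max(per_queue.get(queue, depth), depth)
    return per_queue


# Hypothetical samples from one window, for illustration only
SAMPLES = [
    (0, "pod-a", "MARKETING", 12),
    (60, "pod-a", "MARKETING", 57),
    (120, "pod-a", "MARKETING", 30),
    (0, "pod-a", "ORDERS", 0),
    (60, "pod-a", "ORDERS", 3),
]

if __name__ == "__main__":
    print(max_depth_by_queue(SAMPLES))
```

Using the maximum rather than the average means a short spike in queue depth stays visible on the graph instead of being smoothed away.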
Note that a number of factors affect how quickly the data will show up in the Monitoring dashboard:
- Frequency with which the metrics pod queries the queue manager for new data (which is configurable using the IBMMQ_GLOBAL_POLLINTERVAL property)
- Frequency with which the Prometheus infrastructure in the cluster polls the metrics endpoint (typically 1 minute)
- Latent time to transmit the data to the IBM Cloud Monitoring infrastructure
Typically the data arrives in the Monitoring service within a minute or two, but you may need to investigate tuning the relevant parameters if you wish to increase the speed with which data is delivered.
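As a rough way to reason about tuning, the three factors above can simply be added to give a worst-case delay: a change that happens just after one poll waits for the next poll, then for the next scrape, then for transmission. The Python sketch below does that arithmetic; the 30, 60 and 10 second figures are illustrative assumptions, not measured or default values.

```python
def worst_case_delay_seconds(poll_interval, scrape_interval, transmit_latency):
    """Upper bound on the time for a depth change to reach the dashboard.

    Assumes the three stages run independently, so in the worst case a
    change waits a full poll interval, then a full scrape interval, and
    is then transmitted.
    """
    return poll_interval + scrape_interval + transmit_latency


if __name__ == "__main__":
    # Illustrative values only: 30s monitor poll, 60s scrape, 10s transmit
    print(worst_case_delay_seconds(30, 60, 10))
```

With these example figures the bound is 100 seconds, which is consistent with the "minute or two" typically seen. Reducing the largest term gives the biggest improvement.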
Step 5: Configure an alert to automatically notify your Operations team via PagerDuty if queue depth exceeds the expected level
As a busy Ops/SRE practitioner I don’t want to wait around watching a UI all day and night to find out if my application is misbehaving. Instead, we can use the built-in functions of IBM Cloud Monitoring to create a push alert if our monitoring detects that a queue is filling up beyond what would typically be expected.
In fact, it’s very easy to set up a wide range of alert types using IBM Cloud Monitoring by switching over to the Alerts tab. There, in a matter of seconds, we can set up an alert to issue a PagerDuty notification if the depth of the MARKETING queue goes over 50 messages for a period of 2 minutes.
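The condition "over 50 messages for a period of 2 minutes" is worth spelling out, because a single spike should not page anyone. The Python sketch below shows one way to read that condition, assuming depth samples arrive as (timestamp in seconds, depth) pairs in time order. The function name alert_fires and the sample values are hypothetical, and this illustrates the semantics only, not how IBM Cloud Monitoring evaluates alerts internally.

```python
def alert_fires(samples, threshold=50, duration=120):
    """Return True if depth stayed above threshold for at least `duration` seconds.

    samples: list of (timestamp_seconds, depth) pairs sorted by time.
    """
    breach_start = None
    for timestamp, depth in samples:
        if depth > threshold:
            # Remember when the current run of high readings began
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start >= duration:
                return True
        else:
            # Any reading at or below the threshold resets the run
            breach_start = None
    return False


# Hypothetical one-minute samples, for illustration only
SUSTAINED = [(0, 10), (60, 55), (120, 60), (180, 70)]
SPIKY = [(0, 60), (60, 40), (120, 70), (180, 80)]

if __name__ == "__main__":
    print(alert_fires(SUSTAINED), alert_fires(SPIKY))
```

The duration is what separates a queue that is genuinely backing up from one that briefly fills while a consumer restarts, so it is worth matching to the normal behavior of your application.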
In this tutorial you have seen how, in a matter of minutes, you can use the sample IBM MQ Prometheus metrics package to:
- build and deploy a new container into an OpenShift cluster running in IBM Cloud that monitors the depth of queues on an IBM MQ queue manager, and
- report that information into the IBM Cloud Monitoring infrastructure, from where you can use powerful alerting mechanisms to notify your Ops team if the system is experiencing problems.
The exact same techniques can also be used with OpenShift clusters running in any type of environment – simply use the MQ Prometheus metrics package to emit your metrics into your own monitoring infrastructure via the Prometheus interface.
Happy monitoring of IBM MQ in Cloud Pak for Integration!
STSM and Lead Architect, IBM Cloud Pak for Integration