In this blog, we will describe how to enable OpenShift Container Platform (OCP) monitoring for
IBM Cloud Pak for Integration (CP4I) 2021.1. In the example, we are using a
Red Hat OpenShift Kubernetes Service (ROKS) cluster. However, these steps are applicable to other OpenShift services.
Overview
OCP 4.4 introduced Application Monitoring as a Technology Preview feature, which became generally available (GA) in OCP 4.6. The
IBM Common Services (CS) monitoring service is changing to take advantage of the OCP monitoring feature. Once OCP monitoring is enabled, developers can run PromQL queries on application metrics in the OpenShift Web Console by switching to the
Developer perspective and navigating to the 'Monitoring' section.
Since CS 3.6, the monitoring service has two modes,
CS monitoring and
OCP monitoring:
- CS monitoring installs a full Prometheus stack. This is the legacy Common Services implementation.
- OCP monitoring uses the Prometheus stack built into OCP and adds a customised Grafana. This is the strategic direction going forward.
Model | Operators | Supports OCP 4.5 and older | Supports OCP 4.6 and later
CS Monitoring | Exporter, PrometheusExt, Grafana | Yes | Yes
OCP Monitoring | Grafana | No | Yes
Prerequisites
- You need to have the OpenShift CLI (the oc command) installed on your local machine. See Getting started with the OpenShift CLI.
- You are logged into your OpenShift cluster as a user with cluster administration privileges, i.e. the cluster-admin role.
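The two prerequisites above can be checked from the command line. The following is a minimal sketch, not an official procedure: the function name check_prereqs is made up for this post, and the "can do anything in all namespaces" test is only an approximation of holding the cluster-admin role.

```shell
# Sketch: verify the prerequisites before starting.
# Prints a short message and returns non-zero on the first failed check.
check_prereqs() {
  # The oc CLI must be available.
  command -v oc >/dev/null 2>&1 || { echo "oc CLI not found"; return 1; }
  # 'oc whoami' fails when there is no active login session.
  oc whoami >/dev/null 2>&1 || { echo "not logged in"; return 1; }
  # Approximation of cluster-admin: any verb on any resource, in all namespaces.
  if [ "$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]; then
    echo "ok"
  else
    echo "insufficient privileges"
    return 1
  fi
}
# Usage: check_prereqs
```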
We will also assume you have installed
Cloud Pak for Integration (CP4I).
Finally, we will use an IBM MQ instance to test the monitoring. You should:
- install the MQ operator as described here
- create an MQ instance as described either here or here
Once the MQ instance has been created, you should be able to see it on the CP4I home page:
Click on the menu button in the top-left corner and then browse to
Integration runtimes:
You should then see a resource table showing your MQ instance:
Checking for an existing monitoring dashboard
Click on the menu icon on the right side of the MQ row in your resources table and then select the "Monitoring" option from the menu.
Note that there is often a short delay before the Grafana dashboard tab appears. You may also need to enable popups in your browser for your cluster.
You should then see the Grafana dashboard, for example:
Checking which monitoring stack you are using
You can check which monitoring stack is being used by checking the pod names in the CS
ibm-common-services namespace:
oc get po -n ibm-common-services | grep monitoring
If
CS monitoring is being used, you will see something similar to:
alertmanager-ibm-monitoring-alertmanager-0 3/3 Running 0 108m
ibm-monitoring-collectd-689d976674-6ccwj 2/2 Running 0 108m
ibm-monitoring-exporters-operator-7b75487bfb-4c6n8 1/1 Running 0 110m
ibm-monitoring-grafana-6889f4b5c9-zvg9c 3/3 Running 5 109m
ibm-monitoring-grafana-operator-c75496696-wtf7t 1/1 Running 0 110m
ibm-monitoring-kube-state-6db6b74cd8-fmp7r 2/2 Running 0 108m
ibm-monitoring-mcm-ctl-685b97ff-np9rb 1/1 Running 0 108m
ibm-monitoring-nodeexporter-5m4sn 2/2 Running 0 108m
ibm-monitoring-nodeexporter-6khpz 2/2 Running 0 108m
ibm-monitoring-nodeexporter-6m5ws 2/2 Running 0 108m
ibm-monitoring-nodeexporter-gkcgm 2/2 Running 0 108m
ibm-monitoring-nodeexporter-hbc7w 2/2 Running 0 108m
ibm-monitoring-nodeexporter-rz56g 2/2 Running 0 108m
ibm-monitoring-nodeexporter-wrmhm 2/2 Running 0 108m
ibm-monitoring-prometheus-operator-667f78db48-ll7pt 1/1 Running 0 109m
ibm-monitoring-prometheus-operator-ext-5dffccc4f5-cf6lz 1/1 Running 0 110m
prometheus-ibm-monitoring-prometheus-0 4/4 Running 5 108m
If
OCP monitoring is already being used, you will see something similar to:
ibm-monitoring-grafana-5b9bbdcd-495dg 4/4 Running 15 3d21h
ibm-monitoring-grafana-operator-76bc8bbdc8-5vsns 1/1 Running 0 3d22h
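If you want to script this check, the comparison of the two listings can be sketched as a small filter. This is a heuristic based only on the pod names shown in this post: it treats the presence of the CS Prometheus pod as the sign of CS monitoring. The function name detect_monitoring_mode is an assumption for illustration.

```shell
# Classify the monitoring mode from an 'oc get po' listing read on stdin.
# Heuristic: the CS stack runs its own Prometheus pod; the OCP mode only
# runs the Grafana pods in ibm-common-services.
detect_monitoring_mode() {
  pods=$(cat)
  if printf '%s\n' "$pods" | grep -q '^prometheus-ibm-monitoring-prometheus-'; then
    echo "CS monitoring"
  elif printf '%s\n' "$pods" | grep -q '^ibm-monitoring-grafana-'; then
    echo "OCP monitoring"
  else
    echo "no monitoring pods found"
  fi
}
# Usage against a live cluster:
#   oc get po -n ibm-common-services | detect_monitoring_mode
```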
Enabling OCP monitoring
Assuming you are not already using OCP monitoring, you can enable it by following these instructions.
If you already have CS monitoring enabled (as is likely), you should edit the existing
OperandConfig and
OperandRequest objects. If not, you will need to create the relevant config and request objects. You can check whether you have either or both of these already with:
oc get operandconfig -n ibm-common-services
and:
oc get operandrequest -n ibm-common-services
This second command is likely to return several results - you are looking for one called
common-service.
Edit (or create) an
OperandConfig CR with the following configuration to use the OCP metrics as a data source:
apiVersion: operator.ibm.com/v1alpha1
kind: OperandConfig
metadata:
  name: common-service
  namespace: ibm-common-services
spec:
  services:
  - name: ibm-monitoring-grafana-operator
    spec:
      grafana:
        datasourceConfig:     ### Enable using the OCP metrics as a data source
          type: "openshift"   ### Remove these two lines to switch back to CS Monitoring
      operandRequest: {}
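After editing or applying the OperandConfig, it can be useful to confirm that the data source type was actually stored. The sketch below reads the value back with a JSONPath filter on the service name. The function name ocp_datasource_enabled is hypothetical; the object name, namespace and field path are the ones used in the configuration above.

```shell
# Succeeds only if the Grafana operand in the common-service OperandConfig
# is configured to use the OCP metrics as its data source.
ocp_datasource_enabled() {
  ds_type=$(oc get operandconfig common-service -n ibm-common-services \
    -o jsonpath='{.spec.services[?(@.name=="ibm-monitoring-grafana-operator")].spec.grafana.datasourceConfig.type}' \
    2>/dev/null)
  [ "$ds_type" = "openshift" ]
}
# Usage:
#   ocp_datasource_enabled && echo "OCP data source is configured"
```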
Edit (or create) the
OperandRequest for just the Grafana operand:
apiVersion: operator.ibm.com/v1alpha1
kind: OperandRequest
metadata:
  name: common-service
  namespace: ibm-common-services
spec:
  requests:
  - operands:
    - name: ibm-monitoring-grafana-operator
    registry: common-service
You will also need to ensure OpenShift is configured for
user-defined project monitoring by applying the following
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
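A cluster may already have a cluster-monitoring-config ConfigMap containing other settings, in which case applying the minimal manifest above would replace its config.yaml content. A cautious approach is to check the current value first. The following is a sketch under that assumption; the function name user_workload_enabled is made up for this post.

```shell
# Succeeds if user-defined project monitoring is already switched on in the
# existing cluster-monitoring-config ConfigMap (if there is one).
user_workload_enabled() {
  oc get configmap cluster-monitoring-config -n openshift-monitoring \
    -o jsonpath='{.data.config\.yaml}' 2>/dev/null \
    | grep -Eq '^[[:space:]]*enableUserWorkload:[[:space:]]*true'
}
# Usage:
#   user_workload_enabled || echo "enableUserWorkload is not set yet"
# Once enabled, the user workload monitoring pods should appear in:
#   oc get pod -n openshift-user-workload-monitoring
```

If the ConfigMap already exists with other keys, editing it in place (for example with oc edit) and adding the enableUserWorkload line is likely safer than applying a fresh manifest.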
Once enabled, a developer can run
PromQL queries on application metrics in the OCP console by switching to the
Developer perspective and navigating to the 'Monitoring' section.
You can check the status of the monitoring pods by issuing the following command:
oc get po -n ibm-common-services | grep monitoring
You should see something like the following when all pods are up (i.e. all show "Running") and all containers are available:
ibm-monitoring-grafana-5b9bbdcd-495dg 4/4 Running 15 3d21h
ibm-monitoring-grafana-operator-76bc8bbdc8-5vsns 1/1 Running 0 3d22h
Note that if you also have the CS monitoring stack enabled, you will see a number of other pods.
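The "all Running, all containers available" condition can also be checked mechanically. Below is a small sketch that reads the pod listing on stdin and compares the two halves of the READY column. The function name all_pods_ready is an assumption, and the filter expects the headerless output produced by the grep shown above.

```shell
# Read an 'oc get po' listing (no header line) on stdin. Succeeds only if
# there is at least one pod, every pod is Running, and its READY column
# shows all containers up (for example 4/4).
all_pods_ready() {
  awk '
    { split($2, ready, "/") }
    $3 != "Running" || ready[1] != ready[2] { bad = 1 }
    END { exit (NR == 0 || bad) ? 1 : 0 }
  '
}
# Usage:
#   oc get po -n ibm-common-services | grep monitoring | all_pods_ready \
#     && echo "monitoring pods are ready"
```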
Using monitoring
For the full IBM Common Services monitoring user guide, please see
IBM Cloud Pak foundational services Monitoring service.
Stopping and uninstalling the IBM Common Services monitoring stack
Important: do
NOT disable CS monitoring until OCP monitoring has been enabled as described above.
If you are moving from an existing CS monitoring stack to OCP monitoring, you may want to remove the existing CS monitoring stack.
This involves:
- removing the monitoring exporters
- removing the alert manager
- removing the Prometheus monitoring core and extensions
- removing unneeded operand requests
- removing unneeded CS monitoring operators
Removing the monitoring exporters
Check the exporter instance:
oc get exporter -n ibm-common-services
This should give something similar to:
NAME AGE
ibm-monitoring 3d22h
Delete the exporter instance with:
oc delete exporter ibm-monitoring -n ibm-common-services
Removing the monitoring alert manager
Check the alert manager with:
oc get alertmanager -n ibm-common-services
This should give something similar to:
NAME VERSION REPLICAS AGE
ibm-monitoring-alertmanager 1 3d22h
Delete the alert manager with:
oc delete alertmanager ibm-monitoring-alertmanager -n ibm-common-services
Removing the Prometheus monitoring extensions
Check the Prometheus monitoring extensions with:
oc get prometheusext -n ibm-common-services
This should give something similar to:
NAME AGE
ibm-monitoring 3d22h
Delete it with:
oc delete prometheusext ibm-monitoring -n ibm-common-services
Removing core CS Prometheus monitoring
Check the core CS Prometheus instance with:
oc get prometheus -n ibm-common-services
This should give something similar to:
NAME VERSION REPLICAS AGE
ibm-monitoring-prometheus 1 3d22h
Delete it with:
oc delete prometheus ibm-monitoring-prometheus -n ibm-common-services
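The four deletions so far (exporter, alert manager, Prometheus extension and core Prometheus) can be combined into one function if you need to repeat them on several clusters. This is only a sketch: it assumes the instance names are exactly those shown in this post, which may differ on your cluster, and the function name remove_cs_monitoring_resources is made up. The --ignore-not-found flag makes the function safe to re-run after a partial cleanup.

```shell
# Delete the CS monitoring custom resources in the same order as the manual
# steps. Stops at the first command that fails.
remove_cs_monitoring_resources() {
  ns=ibm-common-services
  oc delete exporter ibm-monitoring -n "$ns" --ignore-not-found &&
  oc delete alertmanager ibm-monitoring-alertmanager -n "$ns" --ignore-not-found &&
  oc delete prometheusext ibm-monitoring -n "$ns" --ignore-not-found &&
  oc delete prometheus ibm-monitoring-prometheus -n "$ns" --ignore-not-found
}
# Usage (only after OCP monitoring has been enabled):
#   remove_cs_monitoring_resources
```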
Removing the unneeded operand requests
Check for the existing monitoring-related operand requests:
oc get operandrequest -n ibm-common-services | grep monitoring
This should give something similar to:
monitoring-exporters-operator-request 5d3h Running 2021-04-08T11:31:10Z
monitoring-grafana-operator-request 5d3h Running 2021-04-08T11:31:10Z
monitoring-prometheus-ext-operator-request 5d3h Running 2021-04-08T11:31:10Z
Delete the monitoring exporters and Prometheus extension operand requests but NOT the monitoring Grafana operand request:
oc delete operandrequest monitoring-exporters-operator-request -n ibm-common-services
oc delete operandrequest monitoring-prometheus-ext-operator-request -n ibm-common-services
Removing unneeded CS monitoring operators
Check which CS monitoring ClusterServiceVersions are installed:
oc get csv -n ibm-common-services | grep monitoring
This should give something similar to:
ibm-monitoring-exporters-operator.v1.10.1 IBM Monitoring Exporters 1.10.1 ibm-monitoring-exporters-operator.v1.10.0 Succeeded
ibm-monitoring-grafana-operator.v1.11.2 IBM Monitoring Grafana Operator 1.11.2 ibm-monitoring-grafana-operator.v1.11.1 Succeeded
ibm-monitoring-prometheus-operator-ext.v1.10.1 IBM Monitoring Prometheus Extension 1.10.1 ibm-monitoring-prometheus-operator-ext.v1.10.0 Succeeded
Delete the monitoring exporters operator and the monitoring Prometheus extension operator but NOT the monitoring Grafana operator:
oc delete csv ibm-monitoring-exporters-operator.v1.10.1 -n ibm-common-services
oc delete csv ibm-monitoring-prometheus-operator-ext.v1.10.1 -n ibm-common-services
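The version suffixes in the CSV names (v1.10.1 here) will vary between installations, so hard-coding them does not travel well. One option, sketched below, is to select the removable CSVs by name prefix from the output of oc get csv -o name. The function name removable_monitoring_csvs is an assumption; the name prefixes are the ones shown in the listing above, and the Grafana operator is deliberately not matched.

```shell
# Filter 'oc get csv -o name' output (stdin) down to the CS monitoring
# operators that this post removes, leaving the Grafana operator in place.
removable_monitoring_csvs() {
  grep -E '(^|/)ibm-monitoring-(exporters-operator|prometheus-operator-ext)\.v'
}
# Usage:
#   oc get csv -n ibm-common-services -o name | removable_monitoring_csvs |
#     while read -r csv; do
#       oc delete "$csv" -n ibm-common-services
#     done
```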
Now check which monitoring pods are still active:
oc get po -n ibm-common-services | grep monitoring
This should give something similar to:
ibm-monitoring-grafana-5bf659c45c-rmr7k 4/4 Running 0 2d14h
ibm-monitoring-grafana-operator-c75496696-9cq4l 1/1 Running 0 2d14h
Acknowledgements
Thanks to
James Hewitt for providing much of the overview and core configuration information.