Cloud Pak for Data


IBM Cloud Pak for Data: Enable Prometheus Alert Notifications For Internal Certificate Monitoring

By BHARATH DEVARAJU posted yesterday

  

Cloud Pak for Data uses a certificate manager to manage the lifecycle of its internal certificates. These internal certificates are configured to renew automatically; for example, internal-tls-certificate is renewed once every 60 days. Whenever a certificate is renewed, the pods that mount the corresponding secret are restarted automatically so that the applications pick up the new certificate. These restarts can affect the availability of your applications, causing downtime until the pods are running again.

This article demonstrates a method for monitoring the certificate renewal process by enabling alert notifications when a certificate is due to expire. This gives users more control over planning for downtime and deciding whether to bring the renewal forward through manual intervention.

Prerequisites

  • The cert-manager Operator for Red Hat OpenShift is installed on the cluster.
  • As a cluster administrator, enable user workload monitoring in your OpenShift monitoring configuration by running the following command:

cat <<EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
EOF

  • Verify that the monitoring components for user workloads are running:

oc get pods -n openshift-user-workload-monitoring

Enable a ServiceMonitor for the certificate manager

The cert-manager Operator for Red Hat OpenShift operands expose metrics by default on port 9402 at the /metrics service endpoint. You can configure metrics collection for the cert-manager operands by creating a ServiceMonitor custom resource (CR), which enables the Prometheus Operator to collect custom metrics.

Run the following command as a cluster administrator to create the ServiceMonitor:

cat <<EOF | oc apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: cert-manager
    app.kubernetes.io/instance: cert-manager
    app.kubernetes.io/name: cert-manager
  name: cert-manager
  namespace: cert-manager
spec:
  endpoints:
    - honorLabels: false
      interval: 60s
      path: /metrics
      scrapeTimeout: 30s
      targetPort: 9402
  selector:
    matchExpressions:
      - key: app.kubernetes.io/name
        operator: In
        values:
          - cainjector
          - cert-manager
          - webhook
      - key: app.kubernetes.io/instance
        operator: In
        values:
          - cert-manager
      - key: app.kubernetes.io/component
        operator: In
        values:
          - cainjector
          - controller
          - webhook
EOF

After the ServiceMonitor CR is created, the user workload Prometheus instance begins metrics collection from the cert-manager Operator for Red Hat OpenShift operands.
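
Before moving on, it can help to confirm that the ServiceMonitor exists and that the cert-manager services carry the labels its selector matches on. The helper below is a minimal sketch built from standard oc get calls; the function name check_cert_manager_monitor is made up for this example, and it assumes the cert-manager namespace used in the manifest above.

```shell
# Hypothetical helper (not part of oc or cert-manager): confirm that the
# ServiceMonitor from the previous step exists, then list the services in
# the cert-manager namespace together with their labels.
check_cert_manager_monitor() {
  # Returns a non-zero exit code if the ServiceMonitor was not created.
  oc get servicemonitor cert-manager -n cert-manager || return 1
  # Print service labels so they can be compared with the selector.
  oc get services -n cert-manager --show-labels
}
```

Run check_cert_manager_monitor while logged in as a cluster administrator. If no listed service carries the app.kubernetes.io/name, app.kubernetes.io/instance and app.kubernetes.io/component labels from the selector, Prometheus will have nothing to scrape.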

Finally, create a PrometheusRule that raises an alert whenever a certificate is due to expire. For example, the following rule notifies users about every certificate that will expire within the next 7 days. The duration can be customized by changing the expression as needed.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cert-manager-alerts
  namespace: cert-manager
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
  - name: cert-manager.rules
    rules:
    - alert: CertificateExpiringSoon
      expr: |
        certmanager_certificate_expiration_timestamp_seconds - time() < 86400 * 7 # 7 days
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Certificate '{{ $labels.name }}' is expiring soon."
        description: "The certificate '{{ $labels.name }}' will expire in less than 7 days."
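
The rule above is shown without an apply command. One way to create it, assuming the manifest has been saved to a file, is oc apply followed by a check that the PrometheusRule exists. This is only a sketch: the function name apply_cert_alert_rule and the default file name cert-manager-alerts.yaml are assumptions made for illustration.

```shell
# Hypothetical helper: apply the PrometheusRule manifest and confirm it exists.
# $1 is the path to the saved manifest; the default file name is an assumption.
apply_cert_alert_rule() {
  manifest="${1:-cert-manager-alerts.yaml}"
  # Create or update the rule from the saved manifest.
  oc apply -f "$manifest" || return 1
  # Confirm that the rule object is present in the cert-manager namespace.
  oc get prometheusrule cert-manager-alerts -n cert-manager
}
```

Call it as apply_cert_alert_rule cert-manager-alerts.yaml while logged in as a cluster administrator.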

A notification is sent out for every certificate that matches the Prometheus rule.
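
To reason about which certificates match, the rule's condition can be reproduced with plain shell arithmetic. The sketch below mirrors the comparison "expiration timestamp - current time < 86400 * days" from the rule; the function names and the sample timestamps are made up for illustration and are not part of cert-manager or Prometheus.

```shell
# Convert a lead time in days to seconds, as in the "86400 * 7" term of the rule.
days_to_seconds() {
  echo $(( $1 * 86400 ))
}

# Succeed (exit code 0) when a certificate would match the rule.
# $1 = expiry as a Unix timestamp, $2 = current Unix timestamp, $3 = lead time in days.
is_expiring_soon() {
  remaining=$(( $1 - $2 ))
  [ "$remaining" -lt "$(days_to_seconds "$3")" ]
}

# Sample values (illustrative only): "now" is fixed so the result is reproducible.
now=1700000000
expiry=$(( now + 5 * 86400 ))   # a certificate that expires in 5 days

if is_expiring_soon "$expiry" "$now" 7; then
  echo "matches the 7-day rule"
else
  echo "does not match the 7-day rule"
fi
```

With a lead time of 3 instead of 7, the same certificate falls outside the window, which is the effect of changing 86400 * 7 to 86400 * 3 in the rule.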

Conclusion

By monitoring certificate lifetimes, users can be alerted to impending certificate renewals and take the necessary precautions to minimise application disruption.
