Monitor AIOPS using Instana

By Gurpreet Kaur posted Fri November 15, 2024 08:37 PM

  

Monitoring CP4AIOps using Instana

Co-authors: Ben Stern, Pratik Patel

Enterprises rely on Cloud Pak for AIOps (CP4AIOps) around the clock as their IT and network operations management solution, so it is important to monitor CP4AIOps and its underlying infrastructure to prevent outages and performance issues that could affect users. IBM Instana provides best-in-class monitoring for microservices-based applications such as CP4AIOps. It automatically discovers an application's components and the underlying OpenShift platform, and provides a graphical representation of the application's topology.
 
This blog details how you can configure Instana monitoring for CP4AIOps.

Monitoring Setup:

1. Set up Kubernetes Monitoring

Install the Instana Kubernetes agent using the operator method.

 

Check that the instana-agent pods are running on all worker nodes of the cluster.

[root@api.aiops24.cp.fyre.ibm.com ~]# oc get pods -n instana-agent
NAME                        READY   STATUS    RESTARTS        AGE
instana-agent-2krj4         1/1     Running   0               8d
instana-agent-4sbb7         1/1     Running   0               8d
instana-agent-crrzw         1/1     Running   1 (7d20h ago)   8d
instana-agent-dx89d         1/1     Running   0               8d
instana-agent-h49vz         1/1     Running   0               8d
instana-agent-jz6kb         1/1     Running   0               8d
instana-agent-mxbq9         1/1     Running   0               8d
instana-agent-qms9z         1/1     Running   0               8d
instana-agent-tqnz7         1/1     Running   0               8d
instana-agent-vs5vw         1/1     Running   0               8d
instana-agent-xldtk         1/1     Running   0               8d
k8sensor-8657c6b5b9-7wckj   1/1     Running   0               8d
k8sensor-8657c6b5b9-r94j5   1/1     Running   0               8d
k8sensor-8657c6b5b9-ssq5f   1/1     Running   0               8d
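
On a larger cluster it can be easier to filter this listing than to read it by eye. The following is a minimal sketch, not part of the Instana tooling: `not_ready_pods` is a hypothetical helper name, and it assumes the default `oc get pods` column layout shown above (NAME, READY, STATUS).

```shell
# Print the name of every pod whose READY count is incomplete or whose
# STATUS is not "Running". Reads `oc get pods` output on stdin and skips
# the header line.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Usage against a live cluster (assumes you are already logged in with oc):
#   oc get pods -n instana-agent | not_ready_pods
```

If the command prints nothing, every agent and k8sensor pod in the listing is ready and running.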

Check the discovered Kubernetes cluster in the Instana UI under Platforms -> Kubernetes.

Click the cluster name to launch the Kubernetes monitoring dashboard.

Instana provides built-in health rules for Kubernetes platform monitoring.

Reference: https://www.ibm.com/docs/en/instana-observability/current?topic=references-built-in-events-reference#kubernetes

2. Instrumenting CP4AIOps

2.1 Update the instana-agent configmap for Kafka and Postgres in CP4AIOps

[root@api.aiops24.cp.fyre.ibm.com ~]# oc edit configmap instana-agent

apiVersion: v1
data:
  cluster_name: aiops24
  configuration-disable-kubernetes-sensor.yaml: |
    com.instana.plugin.kubernetes:
      enabled: false
  configuration.yaml: |
    com.instana.plugin.kafka:
      #jmxUsername: ''
      #jmxPassword: ''
      #jmxPort: '' # default jmx port is 1099
      topicsRegex: '.*'
      brokerPropertiesFilePath: '/opt/kafka/config/server.properties'
      collectLagData: 'true' # true or false. The default value is true
      #sslTrustStore: '/path/to/truststore.jks'
      #sslTrustStorePassword: 'kafkaTsPassword'
      #sslKeyStore: '/path/to/sslKeyStoreFile.jks'
      #sslKeyStorePassword: 'kafkaKsPassword'
    com.instana.plugin.postgresql:
      user: 'aiops_topology'
      password: 'password'
      database: 'aiops_topology' # by default PostgreSQL will use 'user' as database to connect to.
    # Manual a-priori configuration. Configuration will be only used when the sensor
    # is actually installed by the agent.
    # The commented out example values represent example configuration and are not
    # necessarily defaults. Defaults are usually 'absent' or mentioned separately.
    # Changes are hot reloaded unless otherwise mentioned.
    # It is possible to create files called 'configuration-abc.yaml' which are
    # merged with this file in file system order. So 'configuration-cde.yaml' comes
    # after 'configuration-abc.yaml'. Only nested structures are merged, values are
    # overwritten by subsequent configurations.
    # Secrets
    # To filter sensitive data from collection by the agent, all sensors respect
    # the following secrets configuration. If a key collected by a sensor matches
    # an entry from the list, the value is redacted.
    #com.instana.secrets:
    #  matcher: 'contains-ignore-case' # 'contains-ignore-case', 'contains', 'regex'
    #  list:
    #    - 'key'
    #    - 'password'
    #    - 'secret'
    # Host
    #com.instana.plugin.host:
    #  tags:
    #    - 'dev'
    #    - 'app1'
    # Hardware & Zone
    #com.instana.plugin.generic.hardware:
    #  enabled: true # disabled by default
    #  availability-zone: 'zone'
kind: ConfigMap 
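
After saving the configmap, it may be worth confirming that both plugin sections are really active and not left commented out. The sketch below is an illustration only: `missing_plugins` is a hypothetical helper, and the usage line assumes the configmap lives in the `instana-agent` namespace.

```shell
# Read an agent configuration.yaml on stdin and print each expected plugin
# key that does not appear as an active (uncommented) top-level entry.
missing_plugins() {
  config=$(cat)
  for plugin in com.instana.plugin.kafka com.instana.plugin.postgresql; do
    printf '%s\n' "$config" | grep -q "^$plugin:" || echo "$plugin"
  done
}

# Usage against a live cluster (namespace is an assumption):
#   oc get configmap instana-agent -n instana-agent \
#     -o jsonpath='{.data.configuration\.yaml}' | missing_plugins
```

An empty result means both the Kafka and PostgreSQL sections are present.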

PostgreSQL Instrumentation in CP4AIOps:

Reference: https://www.ibm.com/docs/en/instana-observability/current?topic=technologies-monitoring-postgresql

For the "com.instana.plugin.postgresql" section in the configmap above, you need to create a new user in the Postgres database. To log in to Postgres, you need the username and password from the secret <installation-name>-edb-postgres-superuser.

[root@api.aiops24.cp.fyre.ibm.com ~]# oc get secret/aiops-edb-postgres-superuser --template='{{.data.username | base64decode}}'
postgres

[root@api.aiops24.cp.fyre.ibm.com ~]# oc get secret/aiops-edb-postgres-superuser --template='{{.data.password | base64decode}}'
B6AK9Jqp****************************************************************

Now log in to any Postgres pod (running in the CP4AIOps namespace) using the above username and password, create the new user 'aiops_topology', and grant this user the access it needs for metrics collection.


[root@api.aiops24.cp.fyre.ibm.com ~]# oc rsh aiops-edb-postgres-1

bash-4.4$ psql --host localhost --username postgres

Password for user postgres: <password from above secret>

psql (13.14)

SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)

Type "help" for help.

postgres=# create user aiops_topology with password 'password';

CREATE ROLE

postgres=# grant SELECT ON pg_stat_database to aiops_topology;

postgres=# GRANT ALL PRIVILEGES ON DATABASE aimanager TO aiops_topology;
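
Before moving on, you may want to confirm that the new user can actually read the statistics view. This is a minimal sketch to run from the pod's shell after leaving psql; `check_monitoring_user` is a hypothetical helper name, and psql will prompt for the user's password unless `PGPASSWORD` is set.

```shell
# Connect as the given user to the given database and count the rows that
# user can read from pg_stat_database. A numeric result means the grant works.
#   $1 = monitoring user, $2 = database name
check_monitoring_user() {
  psql --host localhost --username "$1" --dbname "$2" \
    --tuples-only --no-align \
    --command "SELECT count(*) FROM pg_stat_database;"
}

# Usage (pass the same user and database you set in the
# com.instana.plugin.postgresql section of the agent configmap):
#   check_monitoring_user aiops_topology aimanager
```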

Enable statistics collection in the Postgres configuration by adding the track parameters to the existing YAML of the cluster resource.

[root@api.aiops24.cp.fyre.ibm.com ~]# oc edit cluster common-service-db

  postgresql:
    enableAlterSystem: true
    parameters:
      track_activities: "on"
      track_counts: "on"
      track_io_timing: "on"
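
Once the operator has applied the change, the running values can be checked from inside a Postgres pod. The following is a sketch under stated assumptions: `show_track_settings` is a hypothetical helper, it connects as the `postgres` superuser used earlier, and psql prompts for the password on each call unless `PGPASSWORD` is set.

```shell
# Print the current value of each statistics parameter as name=value,
# using the standard SQL SHOW command.
show_track_settings() {
  for param in track_activities track_counts track_io_timing; do
    printf '%s=' "$param"
    psql --host localhost --username postgres --tuples-only --no-align \
      --command "SHOW $param;"
  done
}

# Usage (inside the Postgres pod, e.g. after `oc rsh aiops-edb-postgres-1`):
#   show_track_settings
```

Each of the three lines should end in `on` when the parameters have taken effect.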

After all the configuration changes, restart the instana-agent pods in the OCP cluster.
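
One way to do the restart is a rolling restart of the agent DaemonSet. This is a sketch, not the only option: it assumes the agents are managed by a DaemonSet named `instana-agent` in the `instana-agent` namespace (inferred from the pod names above, so verify with `oc get daemonset -n instana-agent` first), and `restart_instana_agents` is a hypothetical helper name.

```shell
# Trigger a rolling restart of the agent DaemonSet and wait for it to finish.
#   $1 = namespace (optional, defaults to instana-agent)
restart_instana_agents() {
  ns="${1:-instana-agent}"
  oc rollout restart daemonset/instana-agent -n "$ns"
  oc rollout status daemonset/instana-agent -n "$ns"
}

# Usage (assumes you are logged in with oc):
#   restart_instana_agents
```

Deleting the pods so that the DaemonSet recreates them achieves the same result.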

3. Instana Dashboards

In the Instana UI, define an application perspective from the namespace in which CP4AIOps is deployed.

You can then observe the CP4AIOps application in the Instana UI with the metrics that are discovered out of the box.

 

The Services tab shows latency, call rates, and other metrics for the different services in the CP4AIOps application, all available out of the box.

3.1 Kafka Monitoring Dashboards

Kafka Cluster Dashboard

Kafka Cluster Consumer Group Lag

References: https://www.ibm.com/docs/en/instana-observability/current?topic=technologies-monitoring-kafka

Kafka Built-in Events: https://www.ibm.com/docs/en/instana-observability/current?topic=references-built-in-events-reference#kafka

3.2 Postgres Monitoring Dashboards

PostgreSQL Built-in Events: https://www.ibm.com/docs/en/instana-observability/current?topic=references-built-in-events-reference#postgresql-db

3.3 Cassandra Monitoring Dashboards

Basic Cassandra monitoring is available out of the box (no custom config required).

Open Infrastructure and search for Cassandra, select the node from the cluster that runs Cassandra, and then open the Cassandra Cluster Dashboard.

References: https://www.ibm.com/docs/en/instana-observability/current?topic=technologies-monitoring-cassandra

Cassandra Built-in Events:  https://www.ibm.com/docs/en/instana-observability/current?topic=references-built-in-events-reference#cassandra-cluster

4. Create Custom Alerts for CP4AIOps KPIs

You can create custom events in addition to the built-in events provided by Instana. For example, the sample alert below is raised when there is Kafka lag on any of the monitored consumer groups.

Using the Instana connector in CP4AIOps, all these alarms can be ingested into CP4AIOps for correlation with platform and network alarms.

I hope this article is helpful. For more information about configuring Instana and its monitoring capabilities, see: https://www.ibm.com/docs/en/instana-observability/current?topic=configuring-monitoring-supported-technologies


Comments

Thu January 23, 2025 02:55 PM

This is excellent!

We are starting to see how to do the monitoring of AIOps using built-in Prometheus (in OCP) which is perfect. But this is even better.

Maybe the Instana team should be looking at this and prepare an out-of-the-box dashboard and alarm patterns specific for AIOps.

Super article Gurpreet thank you!

Sun November 17, 2024 12:21 PM

Great article, Gurpreet!