How to monitor your AIOps system with AIOps! Part 2, Topology

By Neil Boyette posted Tue April 16, 2024 12:26 PM

  

In the first blog in this series, Patrick O’Neill explained how Cloud Pak for AIOps can ingest and analyze events from many sources, and how to use this capability to improve self-monitoring of the Cloud Pak for AIOps cluster itself.

In this blog, we build on that by creating a topology visualization of your Cloud Pak for AIOps instance. Cloud Pak for AIOps can ingest resource data from many different sources and combine it into a single holistic view of your estate. This topology not only shows how different parts of the estate relate to one another, it also enables more advanced analytics such as topology-based alert correlation, probable cause determination, inventory management and more.

Since Cloud Pak for AIOps runs on Red Hat OpenShift, we only need a single observer to get started: the Kubernetes observer. Note that the Kubernetes observer works with many Kubernetes-based platforms, such as Red Hat OpenShift and Google Kubernetes Engine.

1. Prepare

There are a lot of options that you can take advantage of, but here we will focus on the basics first.

At a minimum you will need four pieces of information:

1. The Kubernetes IP address

2. The Kubernetes API port

3. The Kubernetes token

4. The project (namespace) in which Cloud Pak for AIOps is installed.

Luckily these are all easily available. As described in the Kubernetes observer jobs documentation, the best way is to create a separate cluster role and service account. Here though, we’ll also show how to use an administrator account to get the same information for simplicity.

Note that using an administrator’s token in this manner is a quick way to bring in an initial view of Cloud Pak for AIOps, but this token will expire. To set up longer-term monitoring, use the separate cluster role and service account as noted above.

2a. Create and Use Service Account

1.    Create a service account. It is created in the project (namespace) of your current oc context, which is 'default' in this example:

oc create serviceaccount asm-k8s-account

To verify that the service account exists:

oc get serviceaccount

2.    Bind the appropriate ClusterRole or role to the asm-k8s-account service account. For an example of how to create the proper roles, see the documentation.

The commands below bind the ClusterRole or role to the service account that was created in the 'default' namespace.

Example command for ClusterRole binding:

oc create clusterrolebinding asm-k8s --clusterrole=asm:kubernetes-observer --serviceaccount=default:asm-k8s-account

Example command for role binding:

oc create rolebinding asm-k8s --role=asm:kubernetes-observer --serviceaccount=default:asm-k8s-account

Output:

rolebinding.rbac.authorization.k8s.io/asm-k8s created

3.    Obtain the Kubernetes service account token.

Look for the service account token secret in the Tokens entry of the command output:

oc describe serviceaccount asm-k8s-account

Example output:

Name: asm-k8s-account

Namespace: default

Labels: <none>

Annotations: <none>

Image pull secrets: asm-k8s-account-dockercfg-wwzx2

Mountable secrets: asm-k8s-account-dockercfg-wwzx2

Tokens: asm-k8s-account-token-tzl52

Events: <none>     

Describe the asm-k8s-account-token-<suffix> secret (the suffix is tzl52 in this example) to obtain the token's value:

oc describe secret asm-k8s-account-token-tzl52
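Steps 2 and 3 can also be checked and scripted from the command line. The following is a minimal sketch, assuming the asm-k8s-account service account in the 'default' namespace from the example above. The function names (sa_subject, sa_can, decode_b64, sa_token) are hypothetical helpers made up for this illustration, not part of oc. The first pair asks the cluster whether the binding took effect, and the second pair reads the token from the secret, where it is stored base64-encoded.

```shell
# Sketch only: helper functions for steps 2 and 3, assuming the service
# account asm-k8s-account in the 'default' namespace from the example.

# Build the user name Kubernetes assigns to a service account.
sa_subject() {
  local namespace="$1" name="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$name"
}

# Ask the API server whether the account may run a verb on a resource type.
# Prints "yes" or "no"; requires a session that is allowed to impersonate.
sa_can() {
  local verb="$1" resource="$2"
  oc auth can-i "$verb" "$resource" --as="$(sa_subject default asm-k8s-account)"
}

# Decode base64 data read from standard input.
decode_b64() {
  base64 --decode
}

# Print the token held in a service account token secret.
sa_token() {
  local secret="$1" namespace="${2:-default}"
  oc get secret "$secret" -n "$namespace" -o jsonpath='{.data.token}' | decode_b64
}
```

On a live cluster, sa_can list pods should answer yes once the binding exists, and TOKEN="$(sa_token asm-k8s-account-token-tzl52)" places the token in a shell variable without copying it by hand.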

You now have the proper token. Use the following steps to find the Kubernetes host, port and project, substituting the token retrieved here for the one from the administrator account option.

2b. Use Administrator Account

With the administrator account, sign in to the Red Hat OpenShift administration console and, under your profile, select Copy login command.

This will likely ask you to authenticate again, and then it will display your token, along with the API server and port.

You now have three of the four pieces of data. The last is the project in which Cloud Pak for AIOps is installed, which you can see on the Projects page in the Red Hat OpenShift console. Note that it defaults to cp4waiops.
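The copied login command typically has the form oc login --token=... --server=https://host:port. As a rough sketch, the token, host and port can be pulled out of that string with a few shell helpers. The function names below and the sample values used with them are hypothetical, and the URL helpers assume the server URL always includes an explicit port.

```shell
# Sketch only: split a copied "oc login" command into the values the
# observer form asks for. All function names here are made up for illustration.

# Print the value of a --flag=value argument found in a command string.
flag_value() {
  local flag="$1" command_line="$2" word
  for word in $command_line; do
    case "$word" in
      --"$flag"=*)
        # Drop everything up to and including the first "=".
        printf '%s\n' "${word#*=}"
        return 0
        ;;
    esac
  done
  return 1
}

# Print the host part of a URL such as https://host:port.
server_host() {
  local rest="${1#*://}"
  printf '%s\n' "${rest%%:*}"
}

# Print the port part of a URL such as https://host:port.
# Assumes an explicit port is present in the URL.
server_port() {
  local rest="${1#*://}"
  printf '%s\n' "${rest##*:}"
}
```

For a session that is already logged in, oc whoami --show-server and oc whoami -t print the server URL and the current token, so the same helpers can be fed from those commands instead of the copied string.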

3. Configure the Observer

In the Integrations catalog (under Integrations > Add integration) in the Cloud Pak for AIOps console, select Kubernetes, select Load as the job type, and fill in the required fields.

Under additional parameters, enter the project in which Cloud Pak for AIOps is installed so that you observe only it, and not other applications installed on the cluster.

Finally, under Job schedule, indicate how frequently you’d like to refresh the information. Since this is a relatively small application that won’t use many cluster resources to refresh, we can set it to refresh every 5 minutes.

Remember that you’ll want to ensure that the token doesn’t expire, so use the separate cluster role and service account as noted above.

More information on all the option fields is in the documentation.
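Before saving the job, it can be useful to confirm that the host, port, token and project work together. The sketch below calls the standard Kubernetes API path for listing pods in a namespace and prints only the HTTP status code. The function names are hypothetical, and the -k flag skips TLS certificate verification, which is an assumption suitable only for a quick test.

```shell
# Sketch only: quick connectivity test for the values entered in the
# observer job. Function names are made up for illustration.

# Build the Kubernetes API URL that lists the pods in one namespace.
pods_url() {
  local host="$1" port="$2" namespace="$3"
  printf 'https://%s:%s/api/v1/namespaces/%s/pods\n' "$host" "$port" "$namespace"
}

# Call the API with the bearer token and print the HTTP status code.
# -k disables certificate verification; use it only for a quick check.
check_observer_access() {
  local host="$1" port="$2" namespace="$3" token="$4"
  curl -sk -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $token" \
    "$(pods_url "$host" "$port" "$namespace")"
}
```

A status of 200 suggests the token can read pods in the project, while 401 or 403 usually points to an expired token or a missing role binding.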

4. View Topology

Once the observer runs for the first time, it creates a resource group using the name provided in the observer configuration.

Notice that the observer brings in not just the resources but also any alerts on them. Normally you should see mostly informational alerts, but any errors or warnings will also show here.

 

You now have a view of all the resources that make up the Cloud Pak for AIOps installation. This view will also reflect changes as the cluster’s configuration is updated, or as Cloud Pak for AIOps itself is updated.

5. Create Topology Resource Groups

While it's great to have an overview of the entire set of resources for Cloud Pak for AIOps, it’s easier to look at specific portions of it. For example, it's useful to have a view for each node that shows which workload is running on it and whether there are any issues.

To ease creating groups for each of the nodes, we will use a dynamic resource group template. This is explained in more detail in the documentation.

Start by selecting the Resource group templates icon on the Resource Management page. Then select New Template and then Dynamic Template. This brings you to the template builder page.

We will start by searching for a worker to use as an example. In the Search box in the top right, enter the text worker and then select one of the worker nodes that was found. In the example, I selected worker1.katui.cp.fyre.ibm.com.

Next, select the worker’s resource so that a preview is rendered.

In the group we want to see both the node and its workload, so right-click the node, select Get Neighbors, and click All. This shows the pods running on the node as well as information such as its IP address.

We will finish the template by giving it a name and description, selecting the resource group type (I used compute) and adding a tag to ease filtering later. I used self-monitoring as an example.

Save the template, and Cloud Pak for AIOps will create it and automatically create resource groups for all the nodes, each showing its workload.
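To cross-check the generated groups against the cluster, the pods on each node can be counted from the command line. This is a rough sketch, assuming the default cp4waiops project. The function names are hypothetical, and the counts are only a sanity check, since each group built from All neighbors also contains resources other than pods.

```shell
# Sketch only: count pods per node to compare with the per-node resource
# groups. Function names are made up for illustration.

# Print "<node> <pod>" for every pod in the project.
pods_by_node() {
  local namespace="${1:-cp4waiops}"
  oc get pods -n "$namespace" --no-headers \
    -o custom-columns=NODE:.spec.nodeName,POD:.metadata.name
}

# Read "<node> <pod>" lines from standard input and print a count per node.
count_pods_per_node() {
  awk '{ count[$1]++ } END { for (node in count) print node, count[node] }' | sort
}
```

Running pods_by_node cp4waiops | count_pods_per_node prints one line per node with its pod count, which can be compared with the workload shown in the matching resource group.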
