What and why?
Heterogeneous IT and network environments featuring diverse vendors, technologies and equipment offer cost advantages, flexibility and improved resilience. However, this diversity introduces challenges for management software, including siloed data and differing models, protocols, APIs and data lifecycles.
In this AIOps Bitesize, we explore how IBM Cloud Pak for AIOps Topology Manager simplifies these complexities. By using its merge processing capability, you can create comprehensive end-to-end models of your applications/services and their underlying IT and network infrastructure across silos - integrating data from any source, including your proprietary ones. This flexibility goes beyond traditional management tools and lets you tailor the product to your unique needs and data.
While ready-to-use configuration is provided, understanding the principles and guidelines for merge processing is important should you want to further adapt the product to your needs.
For extra context, I recommend reading [this article] about Topology Manager's data processing and [this article] about measuring and visualising how similar topology resources and their relationships are.
Let's Go
Check out the product documentation at these links for this AIOps Bitesize:-
What should I merge together?
It's tempting to want to merge everything that's common between data sources, but that isn't necessary to satisfy many use-cases. Merge processing is typically targeted at 'high value' resources that serve as high quality intersect points between different data sets. This works well because different integrations typically have complementary strengths and weaknesses. For instance, the question of how IT and application infrastructure depends on the network can be answered by finding appropriate intersect points between IT and network data - acting as a bridge between them. Merge processing also helps normalise resources when different terminology is used.
The following figure portrays the subjective quality of various resource types for merging, although deviations from this can be useful depending on your data and goals. We consider the following characteristics:-
- Applicability - is a type of resource commonly seen across data sources?
- Configuration - how much configuration is typically needed to successfully merge, e.g. to handle ambiguity.
- Quantity - an indication of the general quantity of this resource type, which can have implications for processing and results.
- Stability - how likely the resource type is to change or be deleted.
- Uniqueness - how unique do these resource types tend to be in the data sources?

Ideally you'd merge using something that's broadly applicable, requires minimal configuration, is highly unique and stable, and has a quantity such that processing is minimised while maximising the intersect points found between the topologies of different sources. Some thoughts on each resource type in the chart are as follows:-
- Container Image - Highly unique, which minimises config, pretty stable and in good quantities, but low applicability limits them to IT/application management scenarios. Good for linking build pipeline data with deployed instance data.
- Container Instance - Highly unique, which minimises config, but they're unstable, occur in high quantities and have low applicability, limiting them to IT/application management scenarios. Low stability can result in faster-moving sources removing their perspective on a resource merged with slower-moving sources - Topology Manager can handle this, but be aware of it.
- Device - Typically pretty unique and stable, good quantities and highly applicable - the "GoTo" resource type for many scenarios. May need more config to use effectively, e.g. to mitigate naming differences, and may need to merge dissimilar resource types, e.g. host and IP address.
- FQDN - I'm including both fully qualified and unqualified names here. An obvious choice that can work well if they're well maintained, as they're intuitive, have good uniqueness and quantities, are widely applicable and easy to configure. You may need to normalise between fully qualified and unqualified forms. Often found as properties of devices.
- IP Address - An obvious choice - highly applicable and generally stable. They can occur in high quantities, although this is mitigated by IPs often being properties of devices. Watch for RFC1918 IP addresses, which can lower uniqueness, although this is environment dependent. If you've got them, consider namespacing as needed.
- Kubernetes Service - A good choice if you want a stable mediator between app composition data held in a CMDB and the more ephemeral Kubernetes bits underneath. Good quantities, although low applicability limits them to app/IT use-cases, and watch for naming ambiguity with other Kubernetes resource types. Similar comments apply to Kubernetes deployments.
- MAC address - An obvious choice - highly applicable, generally stable and pretty similar to IP addresses. Often found on ports; they can be highly unique, but watch for the non-unique ones.
- Port/Network Interface - They can be good as they may carry desirable IP and MAC address properties and are pretty stable, but they can need careful treatment for IP unnumbered and virtual interfaces, and they raise the bar on the sources you merge with. Some sources may not see the ports but do see IP or MAC addresses, resulting in dissimilar merge scenarios. Watch for ambiguity - don't rely on the likes of ambiguous device-centric ifIndices, and qualify them accordingly.
When shouldn't I use merge processing?
As a general rule, merge only what is necessary to satisfy your use-cases, and have a clear sequence of use-cases in mind to deliver.
With the introduction of File Enrichment Rules, the enrichment recipe via merge processing is less relevant now, although note that it's possible to use File Enrichment and Merge Rules at the same time on the same data. The following table and diagram describe the differences between the two approaches.
| Approach | Efficiency | Flexibility |
| --- | --- | --- |
| File Enrichment Processing | File Enrichment processing is done during Observer job processing and so doesn't require downstream processing by the merge service. It also doesn't require the use of composites to expose the combined data. | File Enrichment processing is highly flexible due to the use of rules for scope and lookup controls and the ability to enrich resources with many properties held in a file's JSON record. The user can build the enrichment files using mechanisms of their choice. It is less flexible than Merge Processing if a mixture of enrichment and other recipes is needed. |
| Merge Processing | Merge processing is less efficient than File Enrichment processing for enrichment use-cases because resources from two or more Observer jobs must be merged via common mergeToken values. This requires the use of metadata and processing of data by the merge service, which is driven by Kafka messages from the topology service. It also requires additional records and relationships to be stored in the databases. | Merge Processing is more flexible than File Enrichment processing because it not only has flexible scope and token controls, but also enables the use of multiple merge recipes simultaneously. |

Merge Recipes
Topology Manager non-destructively merges resources that have common mergeToken property values while honouring their independent lifecycles. This lets us seamlessly transition from, for example, network data to IT and application data, and track changes to the topology over time. Merge processing can be aimed at one, or any mixture, of the following goals. Which you end up using is a function of the data being merged.

- Enrichment is useful if you want to enrich data from one source with data from others, e.g. enriching VMware virtual machines with the names of the services that use them, drawn from your CMDB - something not natively known to VMware. Adding location data is also a common enrichment goal.
- Join is useful if you want to extend the topology from one source into the topology of another. E.g. adding better quality network dependency information to server data from an APM tool.
- Glue is useful if you have domain knowledge that knows two or more resources are actually the same thing but their data does not align in any way. E.g. a network discovery tool has found a Kubernetes worker node by its IP address whereas another source only knows its hostname.
How does merge processing work?
For merges to occur, resources must have a mergeTokens property containing a set of string values. Any resources sharing the same lowercase mergeToken value are merged together into a composite. Composites allow Topology Manager to non-destructively union resources, their properties, state and relationships into what appears to be a single record, while honouring the lifecycle of each contributor to the composite. Merge processing automatically handles any changes to the composites should their membership change. The following diagram summarises merge processing.
TOP TIP: Merging of resources is limited to a maximum of 20 different instances being combined.

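To make this concrete, here's a minimal sketch using two File Observer records. In practice these would typically come from different sources, but because they share a mergeToken value they'd be combined into a composite - note that they don't need to share entityTypes or naming. The names and token values here are purely illustrative:-
V:{"_operation": "InsertReplace", "uniqueId": "router-a.example.com", "entityTypes": ["host"], "name": "router-a.example.com", "mergeTokens": ["10.0.0.1"]}
V:{"_operation": "InsertReplace", "uniqueId": "NETDISCO:10.0.0.1", "entityTypes": ["device"], "name": "10.0.0.1", "mergeTokens": ["10.0.0.1"]}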
What about merge tokens?
Merge tokens are the set of lower-case values a resource has for its optional mergeTokens property; resources sharing a value become candidates for merging. mergeTokens values are either provided by the Observer (e.g. in files) or, better, set via appropriately scoped Merge Rules. See [this article] for more info about this topic. Selection of mergeTokens values can be categorised as follows (a sketch of a rule via the API follows this list):-
- Property Promotion is where you simply specify the property name(s) to promote to mergeTokens; for example, specifying accessIPAddress will result in its value being added to the set of mergeTokens.
- Literal is where you want to inject your own specific value into a mergeToken, often as part of tightly scoped rules or for namespacing. For example, specifying ${name}MyValue when using an include filter such as ^.*$ against name. This results in a mergeToken value of MyValue because the expression against name does not use a regex capture group.
- Regex capture group is where you want to capture part of a property value, such as specifying ^([a-zA-Z]+).*$ against a name property.
- Concatenation is where you combine multiple properties or expression results into a single mergeToken value, often with a delimiter. For example, specifying ${accessProtocol}:${accessIPAddress}, ${hostname}.ibm.com to fully qualify an FQDN, or combining a regex capture group with a property or literal.
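If you're creating Merge Rules via the merge service API rather than the UI, a rule combining property promotion and concatenation might look something like the sketch below. I'm assuming the /1.0/merge/rules endpoint with a payload shape along the lines of the product documentation's examples - the rule name, scoping and token expressions are illustrative, so check the Swagger docs for the exact schema:-
# Illustrative only: promotes accessIPAddress plus a protocol-qualified variant
curl -X 'POST' \
'https://aiops-topology-merge-cp4aiops.challenge210/1.0/merge/rules' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255' \
-d '{
  "name": "exampleTokenRule",
  "ruleType": "mergeRule",
  "ruleStatus": "enabled",
  "entityTypes": ["host"],
  "observers": ["*"],
  "providers": ["*"],
  "tokens": ["accessIPAddress", "${accessProtocol}:${accessIPAddress}"]
}'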
How are composites modelled?
If two or more resources share the same mergeToken value, a composite is created that is then related to the resources considered to be the same thing via a compositeOf edge, as shown by the following figure.
TOP TIP: Merge processing doesn't require the resources to be merged to be of the same type or use the same properties to drive mergeTokens.
TOP TIP: In the event of a property collision, the composite will return the value held by the most recently updated resource in the composite.

Recipe Examples
Here are some examples of the recipes described above, although note that which one you use is really a function of your source data, and so it's fine to mix and match.
Enrichment

Use-case: As an AIOps administrator, my users want to see the names of services that use some of their devices to provide them with better context, search and filtering capabilities. Note that some devices can support many services.
We'll introduce a File Observer file that enriches data from another source with the service names. In this case, we'll include tags in the file we load and merge. As previously mentioned, File Enrichment processing is a better choice for pure enrichment, but let's run with merge processing to show how it works. The file we build will mirror the naming and identity used in the Kubernetes data we have.
We'll enrich workers 1 and 2 in the table below. This assumes the servers already have mergeTokens that include their host_name property values, which would need an appropriate merge rule - see the Glue recipe below.

Perform the following steps:-
- Create a merge rule that promotes the uniqueId property value to merge tokens. Here we're targeting any server loaded from the File Observer, which is where our Kubernetes data came from.

- Create and load a File Observer enrichment file that contains the following data.
TOP TIP: I've qualified the tags here with "Svc" to denote what they mean.
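As a hypothetical example, the file could contain records like the following - the uniqueId values mirror the Kubernetes host_name identities so the promoted merge tokens line up, and the "Svc" tag values are placeholders:-
V:{"_operation": "InsertReplace", "uniqueId": "asm-demo-worker-1.fyre.ibm.com", "entityTypes": ["server"], "name": "asm-demo-worker-1.fyre.ibm.com", "tags": ["Svc:ExampleService1", "Svc:ExampleService2"]}
V:{"_operation": "InsertReplace", "uniqueId": "asm-demo-worker-2.fyre.ibm.com", "entityTypes": ["server"], "name": "asm-demo-worker-2.fyre.ibm.com", "tags": ["Svc:ExampleService1"]}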
- Verify that the resources have been enriched with the tag data.

Join
Use-case: As an AIOps administrator, my users want to combine service user-journey data in our CMDB with Kubernetes data to represent the sequence in which our customers interact with Kubernetes services. This helps my users understand which step in their customer's user journey will be affected by problems with the Kubernetes services.
We'll use a File Observer file that contains the user journey data and attach it to the Kubernetes data. The file we build will mirror the naming and identity used in the Kubernetes service data we have.
Perform the following steps:-
- Create a merge rule to promote the value of service uniqueId properties to merge tokens.
- Create and load a File Observer file representing the customer journey of the service. This file represents the sequence of services a user interacts with when they place an order. We'll join our custom topology to the Kubernetes services via their UUIDs in this case, although one could also use a combination of namespace and service name to give independence from UUID changes. NOTE: I'm using a custom 'next' edgeType here to represent the sequence order. Here's my file:-
V:{"_operation": "InsertReplace", "uniqueId": "Service User", "entityTypes": ["person"], "name": "Service User"}
V:{"_operation": "InsertReplace", "uniqueId": "79e16436-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "front-end", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"}, {"_toUniqueId": "79624a90-de96-11e8-b739-00000a33043d", "_edgeType": "next"} ] }
V:{"_operation": "InsertReplace", "uniqueId": "79624a90-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "catalogue", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"}, {"_toUniqueId": "7ad36516-de96-11e8-b739-00000a33043d", "_edgeType": "next"} ] }
V:{"_operation": "InsertReplace", "uniqueId": "7ad36516-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "orders", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"}, {"_toUniqueId": "78f0ccfb-de96-11e8-b739-00000a33043d", "_edgeType": "next"} ] }
V:{"_operation": "InsertReplace", "uniqueId": "78f0ccfb-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "carts", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"}, {"_toUniqueId": "7b50da28-de96-11e8-b739-00000a33043d", "_edgeType": "next"} ] }
V:{"_operation": "InsertReplace", "uniqueId": "7b50da28-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "payment", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"}, {"_toUniqueId": "7cbfda53-de96-11e8-b739-00000a33043d", "_edgeType": "next"} ] }
V:{"_operation": "InsertReplace", "uniqueId": "7cbfda53-de96-11e8-b739-00000a33043d", "entityTypes": ["service"], "name": "shipping", "_references": [ {"_fromUniqueId": "Service User", "_edgeType": "uses"} ] }
- Verify the user journey is attached to the Kubernetes services. Here we can see the sequence of services from using the front-end, through placing an order and finally the order being shipped.
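If you'd rather verify via the API than the UI, you can also walk the custom 'next' edges using the topology-service route we'll cover in API Fun below - for example (the resource _id here is a placeholder; use the _id of your front-end service):-
curl -X 'GET' \
'https://aiops-topology-topology-cp4aiops.challenge210/1.0/topology/resources/<front-end-_id>/references/out/next?_return=nodes' \
-H 'accept: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255'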

Glue

Use-case: As an AIOps administrator, my users want to merge dissimilar device data they know represents the same device. In this case, a device in the network management data is the same as a Kubernetes worker node but their data does not naturally align.
We'll use a File Observer file that acts as 'glue' between the network and Kubernetes data. The following screenshots show the network and Kubernetes data we're working with.
Network

{
  "_createdAt": "2024-12-15T09:36:09.386Z",
  "_id": "v2yxg-aeT-y5fuufUgD0ZA",
  "accessIPAddress": "172.31.6.6",
  "accessProtocol": "IPv4",
  "accessScopeTokens": [
    "file.observer:itnm2.txt"
  ],
  "className": "InferredCE",
  "createTime": 1734255369386,
  "description": "Inferred CE",
  "entityTypes": [
    "host"
  ],
  "matchTokens": [
    "entityid.23492282"
  ],
  "name": "172.31.6.6 5678",
  "sysDescription": "Inferred CE",
  "sysObjectId": "CE",
  "tags": [
    "CE",
    "Inferred CE",
    "InferredCE"
  ],
  "uniqueId": "ITNMDOMAIN:172.31.6.6_inferred_CE_for_PE_ny-pe13-crme38.na.test.lab_VRF_5678",
  "vertexType": "resource"
}
Kubernetes

{
  "accessScopeTokens": [
    "file.observer:kubernetes.txt"
  ],
  "clusterName": "mybiz",
  "entityTypes": [
    "server"
  ],
  "host_name": "asm-demo-worker-2.fyre.ibm.com",
  "host_scope_version": "1.9.1",
  "kernel_version": "4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018",
  "matchTokens": [
    "asm-demo-worker-2.fyre.ibm.com",
    "asm-demo-worker-2"
  ],
  "name": "asm-demo-worker-2.fyre.ibm.com",
  "os": "linux",
  "tags": [
    "clusterName:mybiz"
  ],
  "uniqueId": "asm-demo-worker-2.fyre.ibm.com",
  "vertexType": "resource"
}
There are a number of ways we could meet this goal. If you want to target very specific resources, you could do it without the glue file by creating a pair of merge rules - one to promote (say) the network host's accessIPAddress to a mergeToken, and another very targeted rule for the Kubernetes worker node that injects the network access IP address as a literal mergeToken.
In this case, we'll use a glue file on the basis that we may want to merge many resources together and so want a repeatable and easy-to-configure solution. Here, merge processing will be performing 3-way merges.
Perform the following steps:-
- Create a merge rule that promotes the accessIPAddress and host_name property values to merge tokens (a sketch of such a rule via the API follows below). Here we're targeting any host or server loaded from the File Observer. This will ensure that at least one of those property values is promoted to a merge token - and both, if both are present.
TOP TIP: Scope the rules tightly to ensure they act only on the data we want them to act on.

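As a sketch of what that rule might look like via the merge service API (same assumptions about the rules endpoint and schema as the earlier sketch; the name and scoping values are illustrative):-
# Illustrative only: promote either/both of accessIPAddress and host_name
curl -X 'POST' \
'https://aiops-topology-merge-cp4aiops.challenge210/1.0/merge/rules' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255' \
-d '{
  "name": "glueHostServerRule",
  "ruleType": "mergeRule",
  "ruleStatus": "enabled",
  "entityTypes": ["host", "server"],
  "observers": ["file"],
  "providers": ["*"],
  "tokens": ["accessIPAddress", "host_name"]
}'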
- Run the Network and Kubernetes Observer jobs again and verify they've got mergeTokens as expected.
- Load the 'glue' File Observer job to tie the dissimilar records together. In this case, we'll ensure that the server record in this file has both the IP address and hostname of the server in question, corresponding to the network and Kubernetes data respectively, resulting in a 3-way merge.
V:{"_operation": "InsertReplace", "uniqueId": "asm-demo-worker-2.fyre.ibm.com", "entityTypes": ["server"], "name": "asm-demo-worker-2.fyre.ibm.com", "accessIPAddress": "172.31.6.6", "host_name": "asm-demo-worker-2.fyre.ibm.com" }

- Run the File Observer job and verify that the three records are combined, as shown by the following screenshot and sample record. Note that the composite reflects the different entityTypes contributed from the network and Kubernetes data, that it shows the _compositeId and _compositeOfIds properties, and that the Data origin tab shows the contributing data sources.
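For reference, an abbreviated, illustrative version of the merged record looks something like this - the property values come from the source records above, while the _id values are placeholders:-
{
  "name": "asm-demo-worker-2.fyre.ibm.com",
  "entityTypes": ["host", "server"],
  "accessIPAddress": "172.31.6.6",
  "accessProtocol": "IPv4",
  "host_name": "asm-demo-worker-2.fyre.ibm.com",
  "mergeTokens": ["172.31.6.6", "asm-demo-worker-2.fyre.ibm.com"],
  "_compositeId": "<composite-_id>",
  "_compositeOfIds": ["<network-_id>", "<kubernetes-_id>", "<glue-_id>"]
}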



API Fun
Let's have a quick look at the APIs. This assumes you've set global.enableAllRoutes:true in the helmValuesASM section of your AIOps ASM operator instance to expose the API routes externally. The URL you need to access the Swagger docs and live APIs via your browser will be something like https://aiops-topology-merge-cp4aiops.challenge210/1.0/merge/swagger#.
TOP TIP: To get your API route, use the following OCP command: oc get route | grep topology-merge | awk '{print $2}'
TOP TIP: To get your API credentials, use the following OCP commands:
oc get secret aiops-topology-asm-credentials -o jsonpath='{.data.username}' | base64 -d
oc get secret aiops-topology-asm-credentials -o jsonpath='{.data.password}' | base64 -d
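Note that these endpoints also need HTTP basic authentication with those credentials. I've left it out of the examples below for brevity, but in practice you'd add it to each curl command along these lines:-
# Assumed pattern - supply the decoded credentials with each request
curl -u '<username>:<password>' ...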

Get a list of composites
curl -X 'GET' \
'https://aiops-topology-merge-cp4aiops.challenge210/1.0/merge/composites?_include_count=false' \
-H 'accept: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255'
{
  "_executionTime": 18,
  "_offset": 0,
  "_limit": 50,
  "_items": [
    {
      "_id": "dJQsQOcrTa-8uap4vEKpFQ"
    },
    {
      "_id": "vqG1-B5lSYm187VZBL987Q"
    },
    ...
  ]
}
Getting the members of a specific composite
curl -X 'GET' \
'https://aiops-topology-merge-cp4aiops.challenge210/1.0/merge/composites/dJQsQOcrTa-8uap4vEKpFQ/of?_field=name&_field=uniqueId&_field=entityTypes&_reevaluate=false&_return=nodes' \
-H 'accept: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255'
{
  "_executionTime": 36,
  "_items": [
    {
      "uniqueId": "78f0ccfb-de96-11e8-b739-00000a33043d",
      "name": "carts",
      "entityTypes": [
        "service"
      ],
      "_id": "aZIwejxERzS1z6npnsPkaA"
    },
    {
      "uniqueId": "78f0ccfb-de96-11e8-b739-00000a33043d",
      "name": "carts",
      "entityTypes": [
        "service"
      ],
      "_id": "BrKYaA7ZSjeqPbMwwhvRbg"
    }
  ]
}
Asking if a resource is part of a composite (using the topology-service API)
curl -X 'GET' \
'https://aiops-topology-topology-cp4aiops.challenge210/1.0/topology/resources/aZIwejxERzS1z6npnsPkaA/references/in/compositeOf?_return=nodes&_include_count=false&_include_status=false&_include_status_severity=false&_include_metadata=false' \
-H 'accept: application/json' \
-H 'X-TenantID: cfd95b7e-3bc7-4006-a4a8-a73a79c71255'
{
  "_executionTime": 33,
  "_offset": 0,
  "_limit": 50,
  "_items": [
    {
      "_id": "dJQsQOcrTa-8uap4vEKpFQ",
      "changeTime": 1734300629970,
      "vertexType": "composite",
      "_modifiedAt": "2024-12-15T22:10:29.970Z",
      "createTime": 1734300629970,
      "_createdAt": "2024-12-15T22:10:29.970Z"
    }
  ]
}
Final Thoughts
In this AIOps Bitesize, we used Topology Manager's merge processing to create a topology that spans data sources and management domains. You saw how to use merge recipes to meet common goals and how to use the APIs to explore your merged resources and their composites.
What would your scenario be?
#Highlights-home