AIOps

Topology Path Correlations in AIOps 4.13

By Stephane Millar posted 29 days ago

  

Leveraging Topology Paths for Alert Correlation in IBM AIOps 4.13

Modern enterprise IT and network environments generate an overwhelming volume of alerts. Worse, that volume keeps growing while the teams managing it do not.

Topology path correlations help to reduce the noise by extending the set of existing mechanisms available within AIOps for alert clustering and incident reduction. These path correlations automatically provide cross-domain alert correlation along paths through the topology.

The new path correlations do not replace the existing topology group-based correlation. In some cases they provide a suitable alternative, for example to minimise over-correlation, while in others the two can work in conjunction. A key difference is that path correlations avoid the initial setup cost of group-based correlations, which require groups to be judiciously constructed up front.

Examples of supported path correlations include:

  • Network switch problems with connected compute resource alerts
  • Port-level alerts with local or remote card failures
  • Database errors with alerts on dependent services

Building a Topology

The topology graph can represent anything from static infrastructure, through virtual environments, up to hosted customer-facing applications. Pre-defined integrations can pull in resource representations from APIs across that range, while SDK observers make it possible to model custom data from proprietary sources as inter-related resources and groups.

Whatever the sources, a cross-domain model can be built by merging overlapping resources.
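
To make the idea of merging overlapping resources concrete, here is a minimal Python sketch. It is an illustration only, not the way AIOps performs the merge: the record fields (`source`, `merge_key`, `neighbours`) and the function `merge_by_key` are hypothetical, and it assumes two records describe the same resource when their merge keys match after trimming and lower-casing.

```python
def merge_by_key(records):
    """Combine records that share the same normalised merge key.

    Hypothetical sketch: each record is a dict with 'source',
    'merge_key' and 'neighbours' fields (names are assumptions).
    """
    merged = {}
    for rec in records:
        key = rec["merge_key"].strip().lower()
        entry = merged.setdefault(key, {"sources": [], "neighbours": set()})
        entry["sources"].append(rec["source"])
        # Normalise neighbour names the same way so links line up.
        entry["neighbours"].update(n.strip().lower() for n in rec["neighbours"])
    return merged


# Two domains that both know about the same host under different spellings.
network_view = [
    {"source": "network", "merge_key": "Host-A", "neighbours": ["switch-1"]},
    {"source": "network", "merge_key": "switch-1", "neighbours": ["host-a"]},
]
compute_view = [
    {"source": "compute", "merge_key": "host-a", "neighbours": ["vm-7"]},
    {"source": "compute", "merge_key": "vm-7", "neighbours": ["host-a"]},
]

model = merge_by_key(network_view + compute_view)
print(sorted(model["host-a"]["neighbours"]))  # ['switch-1', 'vm-7']
```

After the merge, the single `host-a` entry links the network domain (`switch-1`) to the compute domain (`vm-7`), which is what allows a path to cross between them.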

Exploiting the Topology

Path correlations traverse the topology graph of the cross-domain model along dependency chains, within containment hierarchies and to adjacent communicating resources. They correlate alerts occurring on related resources within a configurable time window, without the administrative effort otherwise required to create topology groups for correlation. Filtering can be applied to alerts or to resource types.
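
The traversal described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the product's algorithm: the relation names (`contains`, `connectedTo`, `dependsOn`), the `max_hops` limit, the alert fields and the function `path_correlate` are all hypothetical, and edges are followed in both directions.

```python
from collections import deque


def path_correlate(edges, alerts, window_seconds, max_hops, relations):
    """Group alerts on resources that are close in the graph and in time.

    edges: (resource, relation, resource) triples, treated as undirected.
    relations: the relation names the traversal is allowed to follow.
    """
    adjacency = {}
    for a, relation, b in edges:
        if relation in relations:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    def reachable(seed):
        # Breadth-first search limited to max_hops from the seed resource.
        hops = {seed: 0}
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            if hops[node] == max_hops:
                continue
            for nxt in adjacency.get(node, ()):
                if nxt not in hops:
                    hops[nxt] = hops[node] + 1
                    queue.append(nxt)
        return set(hops)

    groups = set()
    for seed in alerts:
        near = reachable(seed["resource"])
        members = frozenset(
            other["id"]
            for other in alerts
            if other["resource"] in near
            and abs(other["time"] - seed["time"]) <= window_seconds
        )
        if len(members) > 1:  # a lone alert is not a correlation
            groups.add(members)
    return groups


edges = [
    ("switch-1", "contains", "card-1"),
    ("card-1", "contains", "port-1"),
    ("port-1", "connectedTo", "host-a"),
    ("app-1", "dependsOn", "db-1"),
]
alerts = [
    {"id": "a1", "resource": "card-1", "time": 100},
    {"id": "a2", "resource": "port-1", "time": 130},
    {"id": "a3", "resource": "host-a", "time": 160},
    {"id": "a4", "resource": "db-1", "time": 5000},
    {"id": "a5", "resource": "host-a", "time": 2000},
]
groups = path_correlate(
    edges, alerts, window_seconds=300, max_hops=2,
    relations={"contains", "connectedTo", "dependsOn"},
)
print(groups)
```

Here `a1`, `a2` and `a3` end up in one group because their resources lie within two hops of each other and the alerts fall inside the 300-second window. `a5` is on a related resource but too late, and `a4` is on an unconnected part of the graph.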

The alert groups formed by these correlations can automatically be combined on overlap (both with each other and with different flavours of alert grouping) to further reduce noise and accelerate resolution. IT issues can be associated with underlying network problems to allow different teams to collaborate more effectively.
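
Combining groups on overlap amounts to merging any groups that share an alert, transitively. The following Python sketch shows one way to do that; the function `combine_on_overlap` and the example group names are hypothetical and are not taken from the product.

```python
def combine_on_overlap(groups):
    """Merge alert groups that share at least one alert, transitively."""
    merged = []  # invariant: the sets in 'merged' are pairwise disjoint
    for group in groups:
        current = set(group)
        remaining = []
        for existing in merged:
            if existing & current:
                current |= existing  # absorb the overlapping group
            else:
                remaining.append(existing)
        remaining.append(current)
        merged = remaining
    return merged


# Groups from different (hypothetical) grouping mechanisms.
path_groups = [{"a1", "a2"}, {"a5", "a6"}]
temporal_groups = [{"a2", "a3"}]
scope_groups = [{"a3", "a4"}]

combined = combine_on_overlap(path_groups + temporal_groups + scope_groups)
print(sorted(sorted(g) for g in combined))
# [['a1', 'a2', 'a3', 'a4'], ['a5', 'a6']]
```

Four input groups collapse into two: the chain of shared alerts (`a2`, then `a3`) links three of them, while `{"a5", "a6"}` has no overlap and stays separate.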

Topology groups continue to provide valuable context, such as summarising the set of alerts affecting a particular area or visualising the contents of a building. But topology correlation no longer depends on these groups.

Explaining the Correlation Results

Alerts provide a launch point to a view of the topology paths used to form their alert groups, explaining why each grouping was formed. Conversely, the topology provides a tool to see which paths would be found when seeded on a selected resource, to help in planning.
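
As a rough illustration of what such an explanation contains, the Python sketch below recovers the shortest chain of resources linking two alerted resources. It is an assumption-laden sketch rather than the product's behaviour: the function `explain_path`, the adjacency layout and the resource names are hypothetical.

```python
from collections import deque


def explain_path(adjacency, start, goal):
    """Return the shortest resource path from start to goal, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in adjacency.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


adjacency = {
    "card-1": ["port-1"],
    "port-1": ["card-1", "port-9"],
    "port-9": ["port-1", "card-7"],
    "card-7": ["port-9"],
}
print(explain_path(adjacency, "card-1", "port-9"))
# ['card-1', 'port-1', 'port-9']
```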

[Image: example of the remote aggregation policy grouping a card failure with remote link down alerts]

Summary

Topology path correlations reduce resolution time by using the cross-domain topology model to group related alerts, complementing the existing mechanisms.

Time-to-value is reduced, as the knowledge stored within the topology can be automatically leveraged without the administrative burden of defining targeted topology groups.

This creates more focussed, higher-quality incidents that can be prioritised and addressed more effectively.
