SevOne

How to detect behavioural changes on your network – Real Scenario

By Raul Gonzalez posted Thu December 14, 2023 08:27 AM

  

Let’s start with a big truth: networks keep evolving, and that evolution adds complexity for operations and engineering teams, especially when they need to troubleshoot a problem on the network. New technologies such as network virtualization (with the famous VNFs and CNFs) and the new cloud providers are moving away from the legacy way of operating the network, which increases complexity even further.

On the other hand, the arrival of AI/ML technologies in the network observability world has improved the way we see the performance of the network and how we troubleshoot it. Because we now know what normal behaviour looks like, we can tell whether something is a problem much faster than before. And, naturally, people are getting used to these technologies and want to apply them wherever they can or need to.

Anomaly detection example

But there is still a frontier that we haven’t reached yet

With AI/ML we have managed to understand the normal behaviour of the network and to detect anomalies on all those metrics that we are collecting. Things like “there are fewer VPN connections than expected for this time of the day”, or “there is a suspicious increase of hits on a particular firewall rule”.

But the frontier we haven’t reached yet is understanding the actual communications (i.e. IP flows) going through the network. And this is what customers have started asking about.

A few weeks ago I was on a call with a company that raised this very question. They were using a custom ML algorithm to detect anomalies on the network, and they asked how they could do the same with the IP flows (conversations) going through it. They described the following scenario:

· Their AI/ML system detects an anomaly in the amount of traffic generated on an interface.

· When they review the traffic going through that interface using netflow technologies, they don’t see any flow generating noticeably more traffic than the others.

The problem they were facing is that they don’t know the normal behaviour of the IP flows going through that interface, because they have no AI/ML analysing the flow traffic.

Example of metric to flow correlation

And why is this happening?

Netflow offers so many different combinations of variables that we cannot learn the normal behaviour of every IP flow. Bear in mind that IPFIX has more than 500 potential fields (https://www.iana.org/assignments/ipfix/ipfix.xhtml), which makes for millions and millions of potential combinations…
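To get a feel for the scale, here is a small back-of-the-envelope sketch. It only counts how many distinct sets of fields could be chosen as a flow key, taking the 500-field figure quoted above as an assumption (the IANA registry is the authoritative count). The number of distinct values each field can take multiplies the problem further, so this is a lower bound on the idea, not a precise model. Three fields out of 500 already give more than 20 million possible keys.

```python
from math import comb

# Figure quoted in the text; the IANA registry holds the authoritative count.
IPFIX_FIELDS = 500


def field_subsets(total_fields: int, key_size: int) -> int:
    """Number of distinct ways to pick `key_size` fields out of `total_fields`."""
    return comb(total_fields, key_size)


if __name__ == "__main__":
    for key_size in (1, 2, 3, 4):
        count = field_subsets(IPFIX_FIELDS, key_size)
        print(f"{key_size} field(s): {count:,} possible flow keys")
```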

Example of netflow fields available

But, if we were able to limit the fields that we want to learn, and limit the combinations available to just a few, we could configure our monitoring platform to learn the normal behaviour of those IP flows using those fields.

Let me show you an example: when we are using netflow, we are probably more interested in knowing which IPs and applications are generating traffic than in knowing the min and max TTLs or the IPv6 extension headers.

This means that we could limit the AI/ML algorithm to learn only the normal behaviour of the combination Source IP, Destination IP, Application. This would help us massively in the use case mentioned before, because when we detect the abnormal increase of the interface traffic, now we could drill down to that interface and figure out which one of the IP flows (source IP, dest IP and app) is behaving abnormally and, therefore, pinpoint the source of the problem.
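The drill-down described here can be sketched in a few lines of Python. This is a minimal illustration, not the SevOne implementation: the `FlowRecord` type, the `bytes_per_flow` and `rank_by_deviation` helpers and the sample values are all hypothetical, and the "expected" volumes stand in for whatever baseline the AI/ML engine has learnt. The point it shows is that the flow behaving abnormally is not necessarily the top talker on the interface.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical, simplified flow record."""
    src_ip: str
    dst_ip: str
    app: str
    min_ttl: int  # exported by the device, deliberately ignored below
    octets: int


FlowKey = tuple[str, str, str]


def bytes_per_flow(records: list[FlowRecord]) -> dict[FlowKey, int]:
    """Collapse records onto the (source IP, destination IP, application) key."""
    totals: dict[FlowKey, int] = defaultdict(int)
    for r in records:
        totals[(r.src_ip, r.dst_ip, r.app)] += r.octets
    return dict(totals)


def rank_by_deviation(current: dict[FlowKey, int],
                      expected: dict[FlowKey, int]) -> list[tuple[FlowKey, float]]:
    """Rank flows by current/expected ratio, most abnormal first.

    Flows without a learnt baseline are skipped in this sketch.
    """
    ratios = [(key, value / expected[key])
              for key, value in current.items() if expected.get(key)]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


# Made-up sample data: the HTTPS flow carries the most traffic,
# but the DNS flow is the one far above its usual volume.
records = [
    FlowRecord("10.0.0.1", "10.0.1.1", "https", 64, 1050),
    FlowRecord("10.0.0.2", "10.0.1.9", "dns", 64, 300),
    FlowRecord("10.0.0.2", "10.0.1.9", "dns", 63, 300),
]
expected = {
    ("10.0.0.1", "10.0.1.1", "https"): 1000,
    ("10.0.0.2", "10.0.1.9", "dns"): 100,
}

if __name__ == "__main__":
    for key, ratio in rank_by_deviation(bytes_per_flow(records), expected):
        print(key, f"{ratio:.2f}x expected")
```

Ranking by the ratio against each flow's own baseline, rather than by raw volume, is what makes the abnormal flow stand out even when it is small.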

Inventory of IP flows

How can we do this?

Using the Rapid Network Automation (RNA) module inside SevOne Automated Network Observability (SANO), we can create a workflow that gets the data from the SevOne flow module and ingests it back as metrics. Then the usual magic of AI/ML happens, allowing us to compare the current behaviour of each IP flow with the normal behaviour learnt through AI/ML.
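Conceptually, the flow-to-metric step turns individual flow records into one time series per monitored flow. The sketch below shows that idea in plain Python only; it does not use the RNA workflow or any SevOne API. The input tuple layout, the metric naming scheme and the 300-second interval are assumptions made for illustration.

```python
from collections import defaultdict


def flows_to_metric_points(flows, interval_s=300):
    """Bucket flow records into fixed intervals, one series per flow key.

    `flows` is an iterable of (epoch_seconds, src_ip, dst_ip, app, octets)
    tuples (a hypothetical layout). Returns a dict mapping a metric name
    to a time-ordered list of (bucket_start, total_octets) points.
    """
    series = defaultdict(lambda: defaultdict(int))
    for ts, src_ip, dst_ip, app, octets in flows:
        bucket_start = ts - ts % interval_s
        metric_name = f"{src_ip}>{dst_ip}:{app}"  # hypothetical naming scheme
        series[metric_name][bucket_start] += octets
    return {name: sorted(points.items()) for name, points in series.items()}


if __name__ == "__main__":
    sample = [
        (1000, "10.0.0.2", "10.0.1.9", "dns", 10),
        (1100, "10.0.0.2", "10.0.1.9", "dns", 5),
        (1300, "10.0.0.2", "10.0.1.9", "dns", 7),
    ]
    for name, points in flows_to_metric_points(sample).items():
        print(name, points)
```

Once the data has this shape, it looks like any other polled metric, which is why the existing baselining can be applied to it.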

IP Flow normal behaviour

IP Flow pattern analysis

On top of this, as we are collecting the IP flow data in metric format, we can also create alert policies to generate notifications when any of those monitored IP flows is behaving abnormally.
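As a rough illustration of what such a policy evaluates, the sketch below flags a value that falls outside a band around the historical mean. This is a deliberately simple stand-in (mean plus or minus a number of standard deviations), not the baseline algorithm SevOne actually uses; the function name, the `sigmas` parameter and the sample history are hypothetical.

```python
from statistics import mean, stdev


def is_abnormal(history, current, sigmas=3.0):
    """Return True when `current` lies more than `sigmas` standard
    deviations away from the mean of `history`.

    With fewer than two historical points there is no baseline yet,
    so nothing is flagged.
    """
    if len(history) < 2:
        return False
    mu = mean(history)
    sd = stdev(history)
    if sd == 0:
        # Perfectly flat history: any change at all is a deviation.
        return current != mu
    return abs(current - mu) > sigmas * sd


if __name__ == "__main__":
    # Made-up octet counts for one monitored IP flow over past intervals.
    history = [100, 110, 90, 105, 95]
    for value in (104, 600):
        print(value, "abnormal" if is_abnormal(history, value) else "normal")
```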

IP Flow anomaly detection alert

Conclusion

AI/ML based technologies are available for metric based data; however, they are not commonly available for flow based data, even though their value when applied to flow data has been proven. SevOne’s ability to turn flow data into metrics therefore allows us to learn the normal behaviour of the IP flows as well, and helps us minimize MTTR (and other MTTx) by detecting anomalies on the IP flows that point to the potential root cause of a problem.

RNA Workflow available here: https://community.ibm.com/community/user/aiops/viewdocument/test-41?CommunityKey=fe9d91df-352c-4846-9060-189fd98d00ca&tab=librarydocuments

 
