SevOne

Increase visibility of Hyper-V environments and discover infrastructure bottlenecks with SevOne – A real customer scenario

By Raul Gonzalez posted 2 days ago

  

“The problem is always the network”

This is the typical statement I keep hearing from application teams whenever an application has a performance or availability problem: the problem is always the network.

And please forgive me for not taking part in this argument, mainly because I’m a network guy myself and I know there is a lot of truth in this statement 😊, although the database is also a common suspect…

Getting serious: to answer the question “is it the network?” we need Network Performance Monitoring (NPM) tools that monitor the network (switches, routers, firewalls, load balancers, WiFi, SDN, SD-WAN…) as well as Application Performance Monitoring (APM) solutions that monitor the performance of the applications. But then the question is:

Which tool do we use to monitor the infrastructure?

Which tool do we use to monitor equipment such as VMware or Hyper-V servers?

Most APM solutions (and NPM ones too) focus only on the server or virtual machine layer, providing metrics such as CPU (load, co-stop…), memory, disk and traffic. But as we mentioned before, if it is always the network, how can we dig a little deeper and get network-related metrics from these environments?

Example WMI Data

Let’s take a step back – Windows monitoring

Let’s focus on Windows infrastructure first, but the same could apply to Linux servers as well as other hypervisors and hyperscalers.

Windows servers can be monitored using different polling methods, the most common being WMI. SNMP used to be supported, but its lack of granularity and loss of support made WMI the only realistic option.

If we focus on Hyper-V infrastructure, WMI gives us a lot of relevant information such as virtual machine details (status, CPU, memory, disk, interface data…), but what about network data? NIC data from the VMs is not enough: to fully understand what is going on inside the Hyper-V infrastructure we need performance data from the Hyper-V virtual switches, and that is something not available through WMI (or SNMP).

Example Hyper-V Data Collected Through PowerShell Scripts

Real Use Case

Let’s talk about a real use case. While we were working with a famous Spanish bank on their current challenges, they told us that their main issue was the performance of some of their applications, which slowed down during peak times of the day, and they couldn’t figure out why. Most of their infrastructure was running on Hyper-V and the bank had several observability tools for it, but they couldn’t find out where the issue was.

Example Hyper-V Virtual Switch Data

Solution – IBM SevOne

After understanding the underlying issue with the customer, it was quite clear to us that this problem could be solved by installing IBM SevOne in their environment to gain full network observability, including the Hyper-V network layer.

Once SevOne was installed, we were able to monitor all their network equipment as well as their infrastructure layer (cloud and Hyper-V) in a single tool, providing not only visibility of all the data but also advanced analytics using ML algorithms. We collected the most common KPIs from Hyper-V environments using WMI as the polling method, and then went deeper and started collecting virtual switch information by running PowerShell commands: metrics such as bytes sent/received, errors and discards, and configuration data such as ACL count, whether MAC spoofing is enabled, and so on. This granular data allowed the bank to uncover a bottleneck at the virtual switch layer that was impacting the performance of the applications.

Data Collected From Virtual Switch
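
To make the idea of spotting a virtual switch bottleneck a bit more concrete, here is a minimal Python sketch of how two consecutive polls of a virtual switch could be turned into throughput, utilization and discard rates. This is not how SevOne does it internally. The `SwitchSample` structure, its field names and the assumption that the collected values are cumulative counters are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass


@dataclass
class SwitchSample:
    """One poll of a virtual switch (hypothetical structure and field names)."""
    timestamp: float      # seconds since epoch
    bytes_sent: int       # assumed to be a cumulative counter
    bytes_received: int   # assumed to be a cumulative counter
    discards: int         # assumed to be a cumulative counter


def counter_delta(old: int, new: int, width_bits: int = 64) -> int:
    """Difference between two cumulative readings, tolerating a single wrap."""
    if new >= old:
        return new - old
    return new + (1 << width_bits) - old


def switch_rates(prev: SwitchSample, curr: SwitchSample, link_bps: float) -> dict:
    """Convert two polls into per-second rates and link utilization."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")

    sent_bps = counter_delta(prev.bytes_sent, curr.bytes_sent) * 8 / elapsed
    received_bps = counter_delta(prev.bytes_received, curr.bytes_received) * 8 / elapsed
    discards_per_sec = counter_delta(prev.discards, curr.discards) / elapsed

    return {
        "sent_bps": sent_bps,
        "received_bps": received_bps,
        # Utilization of the busier direction against the assumed link speed.
        "utilization_pct": max(sent_bps, received_bps) / link_bps * 100,
        "discards_per_sec": discards_per_sec,
    }
```

A sustained high `utilization_pct` together with a non-zero `discards_per_sec` during peak hours is the kind of pattern that would point to congestion at the virtual switch layer.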

Cherry on top - sFlow

With this newly collected data, the bank understood there was congestion on the virtual switches. However, there was still an unsolved problem: who was generating this traffic? What kind of traffic was hogging the network? Again, the solution was SevOne.

Leveraging sFlow (a flow technology compatible with Hyper-V), we analysed the traffic within the Hyper-V cluster and learned how it behaved, getting information such as where the traffic was being generated, which other servers (internal or external) the VMs were talking to, and which applications were generating most of the traffic.

Example Flow Data

This was vital information that allowed the teams in the bank to optimize the design of their Hyper-V clusters and eliminate the performance issues of their applications completely.
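
For readers wondering what "who is generating this traffic" looks like in practice, below is a small Python sketch of a top-talkers calculation over sFlow-style samples. Because sFlow samples roughly one packet in every N, each sampled packet is scaled by its sampling rate to estimate the real volume. The `FlowSample` record and its field names are hypothetical, and this is only an illustration of the general technique, not SevOne's implementation.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class FlowSample(NamedTuple):
    """One decoded flow sample (hypothetical record layout)."""
    src: str             # source IP of the sampled packet
    dst: str             # destination IP of the sampled packet
    app: str             # application label, e.g. derived from the port
    sampled_bytes: int   # size of the sampled packet
    sampling_rate: int   # 1-in-N sampling rate reported by the agent


def top_talkers(samples: Iterable[FlowSample], field: str = "src", limit: int = 3):
    """Rank values of `field` (src, dst or app) by estimated bytes."""
    totals = defaultdict(int)
    for sample in samples:
        # Scale each sampled packet by the sampling rate to estimate total bytes.
        totals[getattr(sample, field)] += sample.sampled_bytes * sample.sampling_rate
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Calling `top_talkers(samples, field="app")` instead of the default groups the same estimate by application, which answers the "what kind of traffic" question rather than the "who" question.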

SevOne is the answer

Having all the data from your network and infrastructure in a single tool is key to understanding and proactively detecting performance issues in your applications. A single APM solution that monitors all your infrastructure, including Hyper-V, VMware or even cloud providers, is not enough to understand the network performance of that infrastructure. An NPM solution that can collect ALL the performance data, regardless of the protocol or technology used (SNMP, WMI, PowerShell, API, CSV, Flow…), is therefore paramount to reduce MTTR and avoid performance issues in your applications.


#TechnicalBlog