I recently wrote a post on how you can help IBM Storage Support help you by making full use of the monitoring features available on your storage systems and switches. You should also have at least the free version of IBM Storage Insights installed. If you have Storage Insights Pro or Storage Insights for Spectrum Control, there are some additional steps you should take that will help both you and the IBM Support team resolve your problems as quickly as possible.
IBM Storage Insights Pro and Storage Insights for Spectrum Control come with some powerful features for grouping and organizing storage resources. These features are found under the Groups menu. You can organize your storage resources into Applications, Departments and General Groups.
There is a hierarchy to the organization of resources. Departments can contain sub-departments, Applications or General Groups. Applications can contain hosts or other applications. General Groups can contain volumes or storage systems.
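The containment rules above can be written down as a small lookup table. This is only a conceptual sketch: the names `ALLOWED_CHILDREN` and `can_contain` are made up for illustration and are not part of any Storage Insights API.

```python
# Conceptual model of the grouping hierarchy described above.
# All names here are illustrative assumptions, not a Storage Insights API.
ALLOWED_CHILDREN = {
    "department": {"department", "application", "general_group"},
    "application": {"application", "host"},
    "general_group": {"volume", "storage_system"},
}


def can_contain(parent_kind: str, child_kind: str) -> bool:
    """Return True if a group of parent_kind may directly hold child_kind."""
    return child_kind in ALLOWED_CHILDREN.get(parent_kind, set())


print(can_contain("department", "application"))  # True
print(can_contain("application", "department"))  # False
```

Reading the table top-down shows why Departments sit at the top of the hierarchy: nothing else can contain them.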
Applications and Departments let you model the applications that are critical to your business and assign them to the same departments they belong to in your actual business. You can define an application (such as a database) and then add the hosts that run that database to the application. When you do this, Storage Insights automatically pulls the storage systems and volumes associated with those hosts into the application that you created.
General Groups let you group volumes and storage systems together. One use case is that after grouping volumes in a General Group, you can define alerts for the members of that group. The Alert Policies feature provides a similar function for storage systems and other types of resources, but you cannot currently add volumes to an Alert Policy directly. I recommend that you continue to use Alert Policies to manage alerts on storage systems and use General Groups for alerts on volumes. This matters because different types of physical storage (Flash Core vs nearline) have different performance expectations, and a response time that is valid for Flash Core is not achievable on nearline drives. A volume backed by nearline drives will generate constant alerts if you configure an alert that expects flash storage. Volumes with different I/O patterns will also have different response time expectations. See this post for an example of one such I/O pattern. General Groups enable you to group volumes, storage systems, hosts or other resources however you wish.
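To make the point about per-group alerting concrete, here is a minimal sketch of checking each volume against the threshold for its own group. The group names, volume names and threshold values are all hypothetical; this is plain Python for illustration and does not call Storage Insights.

```python
from dataclasses import dataclass

# Hypothetical response-time thresholds in milliseconds, one per group.
# Choose values that match your own storage and workloads.
GROUP_THRESHOLD_MS = {"flash_volumes": 2.0, "nearline_volumes": 20.0}


@dataclass
class VolumeSample:
    name: str
    group: str
    response_time_ms: float


def volumes_in_alert(samples):
    """Return names of volumes that exceed their own group's threshold."""
    return [
        s.name
        for s in samples
        if s.response_time_ms > GROUP_THRESHOLD_MS[s.group]
    ]


samples = [
    VolumeSample("db_log_01", "flash_volumes", 1.2),
    VolumeSample("db_log_02", "flash_volumes", 4.5),
    VolumeSample("archive_01", "nearline_volumes", 12.0),
]
print(volumes_in_alert(samples))  # ['db_log_02']
```

With a single flash-level threshold applied to every volume, archive_01 would alert constantly at 12 ms even though that is a reasonable response time for nearline drives.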
How Using the Grouping Features Helps IBM Storage Support
Organizing your storage resources into at least Applications helps IBM Storage Support more quickly identify potentially affected resources when you have a problem. A typical performance problem statement from a client is "our XXX database performance is very slow". Before IBM Support can begin to work on a problem, we need to know which hosts are affected, which storage systems are providing the storage to those hosts, and which volumes those hosts are using. If a customer has organized the resources using the Storage Insights features, identifying the affected components is much easier than trying to do it without Storage Insights. Here are some real-world examples that illustrate how using the grouping features is beneficial:
One customer had a number of volumes being replicated via Global Mirror Change Volumes (GMCV) on a pair of SVC clusters. The issue was that a subset of 20 or so volumes out of a few hundred were nearly always behind on their recovery point. Within that subset, some volumes would catch up and then fall behind again, then a different set would catch up, and so on. So while the group of 20 was nearly constant, the volumes that were actually behind changed frequently. We had the customer create a General Group of the 20 volumes, so that on any given day the customer could tell us which volumes were behind. It was much easier to look at those 20 and identify particular volumes than to repeatedly filter them out of the few thousand that the customer had. Over time we were able to determine that the volumes that were falling behind had an I/O pattern of spikes of very high write I/O activity followed by a longer period of very low activity. The volumes would fall behind during the intense writes, then not catch up because the low I/O activity meant they were at a lower priority for replication. Having both the Storage Insights performance data and the ability to put these volumes in a group made it much easier to diagnose the issue.
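One rough way to spot the spike-then-quiet write pattern described above is to compare a volume's peak write rate to its average. The sketch below assumes you have exported write I/O samples for each volume; the function name and the sample values are invented for illustration, and this is not the method that was used in this case.

```python
from statistics import mean


def peak_to_mean(write_iops):
    """Ratio of the highest sample to the average.

    A value near 1 means steady writes; a large value means short bursts
    separated by quiet periods.
    """
    avg = mean(write_iops)
    return max(write_iops) / avg if avg > 0 else 0.0


# Invented sample data: write I/O per second at fixed intervals.
steady = [500, 520, 480, 510, 490, 500]
bursty = [9000, 50, 40, 60, 50, 40]

print(round(peak_to_mean(steady), 2))
print(round(peak_to_mean(bursty), 2))
```

Running this over only the members of a small group, rather than every volume on the system, keeps the comparison manageable.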
Another customer had an application that would intermittently have performance problems, and users were complaining about the slowness of the application. The customer had several virtual hosts running the application, spread across 10 VMware servers in a cluster. The virtual hosts could be running on any of the VMware servers at any given time, and any of several dozen volumes could also be affected. We had the customer create an application and add the VMware hosts to the application. This automatically pulled in the storage systems, volumes and backing storage for the application. We were able to determine much more quickly that the root cause was overdriven backing storage, without having to repeatedly filter on hosts or volumes. The problem would have been resolved eventually, but the pattern was easier to see when we were able to start with a much smaller set of volumes and hosts.
You can see how using the Groups features of Storage Insights Pro can benefit both you and the IBM Support teams. If you want to find out more about the features, visit the Storage Insights YouTube Channel and check out the videos there that cover Departments, Applications and Groups.