There are plenty of quotes about tools in general. You have probably heard that "a person is only as good as their tools" or that "a tool is only as good as the team using it." I would like to demystify the concepts of Applications, Departments, and General Groups in IBM Storage Insights / IBM Spectrum Control. The official documentation is helpful, but nothing beats real-world practical examples.
If you are passionate about storage like me, you are probably subscribed to the following IBM storage communication distribution list: https://w3.ibm.com/w3publisher/gsle-storage/storage-distribution-list. If you are not, you are missing out on important communications. One morning, as I was browsing through my emails, I saw this Storage Alert:
Link for the image : https://www.ibm.com/support/pages/node/6368715?myns=s028&mynp=OCST8GLJ&mynp=OCST5GLJ&mync=E&cm_sp=s028-_-OCST8GLJ-OCST5GLJ-_-E
This caught my attention because the capability to monitor DS8000 Safeguarded Copy was recently added to the product (TLDR: these volumes are protected, point-in-time backups of critical data, with minimum impact and effective resource utilization). Naturally, my mind went into problem-solving mode: how am I going to catch this potential backup issue?
First and foremost, I had to identify which of my DS8000 storage devices on the floor have safeguarded copies. I quickly remembered that I already had a NOC dashboard with the devices I was looking for, and by enabling the "Safeguarded Capacity (GiB)" column I could see from this higher-level view which devices have safeguarded capacity.
At this point, I know which devices could potentially be affected, and I want to be alerted (using an IBM Storage Insights alert) when such a copy operation occurs, because the Storage Alert mentioned earlier states: "While performing a SafeGuarded Copy backup, a warmstart may result in inconsistent backup or undetected data loss." This part is really troubling, so what's next?
I'm going to define a General Group with this DS8000 device (or with multiple devices that have safeguarded capacity). This is accomplished by right-clicking the device and selecting "Add to General Group."
I'm going to define an alert for the "Safeguarded Capacity (GiB)" metric inside the general group. The alert will look something like this:
I selected the value of 18 GiB so that I get a high-severity email alert when this threshold is violated. I chose this "high water mark" value because any new safeguarded copy operation is aggregated at the device level, so the capacity value will most likely be exceeded, triggering the alert. The same logic can be applied at the thin-provisioned volume level, but there the high water mark is set on the Used Capacity of the safeguarded volume. Unfortunately, at this point I could not find any related performance metric to which the same logic could be applied. Detection through performance data monitoring would arguably have been more sensitive than detection through events or capacity changes, but the outcome would eventually have been the same.
I chose to do it at the device level because I wanted a coarse granularity trap: this way, any new volume that is created with safeguarded copies is caught automatically. However, defining the alert at a finer granularity (for example, Volume alerts) is a more effective "trap" for existing volumes that were already configured on the device. Ideally, both should be defined to make sure that nothing slips through the cracks.
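The high water mark logic can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how IBM Storage Insights evaluates alerts: the `Device` class, the `violates_high_water_mark` function, the device name, and the strict "greater than" comparison are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for a monitored DS8000 device."""
    name: str
    safeguarded_capacity_gib: float  # aggregated over all volumes on the device


def violates_high_water_mark(device: Device, threshold_gib: float = 18.0) -> bool:
    """Return True when the aggregated safeguarded capacity exceeds the threshold.

    The strict '>' comparison is an assumption; the real alert condition is
    whatever operator was selected in the alert definition.
    """
    return device.safeguarded_capacity_gib > threshold_gib


# Baseline: the device sits exactly at the chosen high water mark, so no alert.
baseline = Device("DS8000-A", 18.0)

# A new safeguarded copy adds capacity at the device level, so the alert fires.
after_new_copy = Device("DS8000-A", baseline.safeguarded_capacity_gib + 2.5)
```

With these made-up numbers, `violates_high_water_mark(baseline)` is false and `violates_high_water_mark(after_new_copy)` is true, which is exactly the behavior the 18 GiB threshold is meant to produce.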
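To see why both granularities complement each other, here is a minimal Python sketch. The volume names, the thresholds, and the two helper functions are hypothetical, and the second scenario (one volume growing while another shrinks) is my own illustrative assumption rather than something observed on a device.

```python
from __future__ import annotations


def device_alert(volumes: dict[str, float], device_threshold_gib: float) -> bool:
    """Coarse trap: compare the aggregated capacity of all volumes to one threshold."""
    return sum(volumes.values()) > device_threshold_gib


def volume_alerts(volumes: dict[str, float],
                  volume_thresholds_gib: dict[str, float]) -> list[str]:
    """Fine trap: only volumes that already have a threshold defined can trigger."""
    return sorted(
        name
        for name, used in volumes.items()
        if name in volume_thresholds_gib and used > volume_thresholds_gib[name]
    )


# Thresholds were defined while only vol_a (10 GiB) and vol_b (8 GiB) existed.
per_volume = {"vol_a": 10.0, "vol_b": 8.0}
device_threshold = 18.0

# Scenario 1: a brand-new safeguarded volume appears.
# Only the device-level trap sees it, because vol_c has no volume alert yet.
new_volume = {"vol_a": 10.0, "vol_b": 8.0, "vol_c": 4.0}

# Scenario 2: vol_a grows while vol_b shrinks by the same amount.
# The device total is unchanged, so only the volume-level trap fires.
shifted = {"vol_a": 12.0, "vol_b": 6.0}
```

In the first scenario `device_alert(new_volume, device_threshold)` is true while `volume_alerts(new_volume, per_volume)` is empty; in the second the device alert stays quiet and the volume alert reports `vol_a`.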
You might be wondering why I chose a General Group and not an Application. The answer is fairly simple: general groups are mechanisms to quickly group and view information about storage resources. The groups can be defined at the top level (device) or at the entity/subentity level (pools, volumes, ports). As you can see in the above screen capture, the general group only has the:
- Subgroups (for creating hierarchies - up to 5 levels deep)
An Application is more complex. An Application is a program or project that consumes storage resources within an organization. It can be used to model the storage usage and consumption in your environment, present overall health status, and so on.
This screen capture is an example of an Application. It has a few more options in the navigation:
- Related Resources
I did not need to use an Application; the General Group was good enough to accomplish my goal very quickly. I would like to emphasize that Applications can only be created from leaf sub-entities (volumes, shares, and so on) and cannot be defined using top-level entities (block/file/object storage devices). An Application also has more powerful creation capabilities through filters. While General Groups have to be created manually by selecting members (explicitly adding members is also possible for Applications), Application filters can add members automatically.
But then a naming convention must be followed rigorously. Without a naming convention, you cannot simply create a filter for all the safeguarded volumes, so you would end up picking them explicitly.
A good way to think about Applications is through a bottom-up building approach. You can either:
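To illustrate why the naming convention matters, here is a small Python sketch. It uses Python's shell-style wildcard matching as a stand-in for an Application filter; it does not reproduce the product's actual filter syntax, and the `SGC_` prefix and the volume names are invented for the example.

```python
from fnmatch import fnmatchcase


def filter_members(volume_names: list, pattern: str) -> list:
    """Return the volume names that match a shell-style wildcard pattern."""
    return [name for name in volume_names if fnmatchcase(name, pattern)]


# With a convention (hypothetical 'SGC_' prefix), one pattern finds every
# safeguarded volume, including ones created later under the same convention.
with_convention = ["SGC_payroll_01", "SGC_orders_01", "payroll_01", "orders_01"]
matched = filter_members(with_convention, "SGC_*")

# Without a convention, no single pattern separates the safeguarded volumes
# from the rest, so members would have to be picked explicitly.
without_convention = ["payroll_bkp", "orders_copy", "payroll_01", "orders_01"]
unmatched = filter_members(without_convention, "SGC_*")
```

Here `matched` contains the two `SGC_` volumes and `unmatched` is empty, which is the situation where you fall back to adding members by hand.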
- manually pick leaf sub-entities explicitly (volumes, shares, filesets, vaults)
- create them through a filter
Either way, the related members of the Application are then discovered automatically, and all the health/capacity/performance information is readily available for a broader impact analysis. But for my need of just detecting a Safeguarded Copy, that was more than I needed.
A subtlety worth mentioning for Applications is that they don't need what I earlier called the "coarse granularity trap" of General Groups, because that is handled automatically through the related members. Ultimately, the differences come down to building paradigms and capabilities.
Now, a few words about Departments. Departments are simply divisions within a business line. A Department might use five applications and also be part of two other departments (which makes it a subdepartment). In the real world, the devices managed by IBM Storage Insights / IBM Spectrum Control are shared resources, but their usage needs to be modeled on business entities, which are typically hierarchical.
In closing, it is essential to have good tools, but it is just as essential to use them in the right way!
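The hierarchical modeling can be pictured with a small Python sketch. It is a simplification under my own assumptions: the class names, the department and application names, and the capacity figures are all hypothetical, and each subdepartment has a single parent here, which is narrower than what the product allows.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Application:
    """Hypothetical application with the capacity it consumes."""
    name: str
    used_capacity_gib: float


@dataclass
class Department:
    """Hypothetical department that owns applications and subdepartments."""
    name: str
    applications: list[Application] = field(default_factory=list)
    subdepartments: list[Department] = field(default_factory=list)

    def total_used_gib(self) -> float:
        """Roll up capacity from this department and everything below it."""
        own = sum(app.used_capacity_gib for app in self.applications)
        nested = sum(dept.total_used_gib() for dept in self.subdepartments)
        return own + nested


# A subdepartment with one application, nested under a parent department.
audit = Department("Audit", applications=[Application("audit-archive", 30.0)])
finance = Department(
    "Finance",
    applications=[Application("payroll", 120.0)],
    subdepartments=[audit],
)
```

With these made-up figures, `finance.total_used_gib()` rolls up to 150.0 GiB, while `audit.total_used_gib()` reports only its own 30.0 GiB, even though both may sit on the same shared storage devices.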