
Resiliency By Design:
Enhance your FlashSystems Resiliency with IBM Copy Services Manager
As the architect and a developer on IBM Copy Services Manager (CSM), I'm often asked,
What are the advantages to using CSM when managing FlashSystems?
This is a great question. IBM FlashSystems is a great IBM product, and it has made numerous changes recently to simplify its design when it comes to replication and cyber resiliency without compromising its scalability and flexibility. New features, like an internal scheduler for Snapshots or Policy Based Replication for remote copy replication, have made it far easier for customers to set up and manage their data resiliency.
One of my standard answers to this question has always been that CSM helps provide a single pane of glass view for managing your FlashSystems. You can connect CSM to all local and remote FlashSystems, and CSM can help you manage the replication across sites. But FlashSystem Grid is now making it easier for customers to not only manage environments with multiple FlashSystems, but also manage the workloads and adjust the environment accordingly. So I'm not going to focus this blog on that.
So what am I going to focus this blog on? Simple. Automation.
NOTE: If you'd like to avoid my long-winded explanation of CSM...feel free to jump right to the answer to the question
"When should you use CSM over just the FlashSystems interface or other solutions like Copy Data Manager?"
Why is Automation Key When Designing a Resiliency Solution?
So before I dig into what CSM can do in terms of automation, I think I should cover why Automation is important and...well...what exactly I mean by automation.
Let's start with the latter. What do I mean by Automation? In today's world, we are in constant flux. Especially with the introduction of AI, new technology is constantly popping up, not only across storage system solutions like the FlashSystems features I mentioned above, but in servers, applications, networks...the entire framework of an enterprise business. We are no longer in a world where we can rely on a "one size fits all"...or in this context a "one solution fits all" strategy. Customers adapt to new technologies at different rates, or have different interpretations of exactly what their particular needs are when it comes to creating a resilient environment. In other words, two customers might be using the same underlying product (such as DB2, MongoDB, etc.), but they may be using it in entirely different ways. There are for sure commonalities in how they use it...but differences nonetheless.
So when I talk about Automation, I'm not only talking about taking manual steps out of day-to-day processes, I'm talking about the ability to build on an Automation framework that can quickly adapt to individual needs. And as those needs change, the ability to quickly modify the solution. Because let's face it, whether it's due to internal or external pressures (like the government), definitions of what it means to be resilient are constantly changing.
Having an Automation framework that you build on, adapt with, as well as easily learn (as a newer workforce emerges) is...well, as I put it in the sub-title...key.
Does CSM provide an Automation framework that you can build on, adapt with and easily learn?
You bet it does! It would probably have been a big buildup to a letdown if I had answered that question with a no, right? Haha.
When CSM was first created, we realized that while a lot of customers did things similarly...they still ended up creating unique solutions. So while CSM has a number of "canned solutions" that we call sessions, we built the product so that these sessions could be used to create unique solutions.
We also realized that attempting to manage these solutions at an application level might "lock us in" to certain aspects that wouldn't be adaptable for customers. To say it a little more bluntly...we saw that trying to manage the solutions at the application level might result in numerous requests from customers for support that we might find difficult to deliver on in a timely manner. And if we can't deliver it in a timely manner...that breaks the customer's ability to adapt to the ever-changing set of requirements I discussed previously.
So instead...we focused on making CSM robust and adaptable enough to be easy to integrate into any solution.
There are two main components to this: CSM Scheduled Tasks and the CSM APIs.
What is a CSM Scheduled Task?
CSM scheduled tasks were originally designed to be a simple scheduler that could kick off Safeguarded Copy backups for DS8000 storage systems at a given interval. But since their initial creation they've grown to be SO much more.
Scheduled tasks can be made up of a set of actions which can help you automate commands, health checks, etc., across one or more sessions. You can think of it as a way to build a unique set of automation steps within CSM, without having to do any scripting. The actions you can set up include the following:
- Run a given command against a session
- Wait for the session to reach a certain state
- Wait for a role pair to reach a given percent complete in the copy
- Validate the consistency of a role pair
- Validate the RPO for GM on a given session
- Run an external script to coordinate CSM with other external tasks
The above actions can be put together in any fashion so that you can simply schedule the task...or click a button to run through those steps instead of having to do it manually.
For example, you could use this solution to create an application consistent Snapshot of your data in a FlashSystems environment by creating the following Scheduled Task:
- Run External Script to quiesce the application
- Issue 'Create Safeguarded Snapshot' against a given session
- Run External Script to resume the application
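To make the ordering of those three steps concrete, here is a minimal Python sketch of the same quiesce, snapshot, resume sequence. This is not how CSM implements Scheduled Tasks, and none of these names come from CSM: `run_app_consistent_snapshot` and the three callables are hypothetical stand-ins for your own external scripts and for the 'Create Safeguarded Snapshot' command. Resuming the application even when the snapshot step fails is my own design choice for the sketch, not documented CSM behavior.

```python
from typing import Callable, List


def run_app_consistent_snapshot(
    quiesce: Callable[[], None],
    create_snapshot: Callable[[], None],
    resume: Callable[[], None],
) -> List[str]:
    """Run quiesce -> snapshot -> resume and return a log of completed steps.

    All three callables are hypothetical placeholders for site-specific logic.
    """
    log: List[str] = []
    quiesce()  # e.g. an external script that pauses application writes
    log.append("quiesced")
    try:
        create_snapshot()  # stands in for issuing the snapshot command
        log.append("snapshot created")
    finally:
        # Assumption for this sketch: always resume, even if the snapshot fails,
        # so the application is never left quiesced.
        resume()
        log.append("resumed")
    return log


if __name__ == "__main__":
    # Stub callables that do nothing, just to show the order of the steps.
    print(run_app_consistent_snapshot(lambda: None, lambda: None, lambda: None))
```

The point of the sketch is simply that the application is only paused for the short window around the snapshot, which is what makes the copy application consistent.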

Or you can use it along with the Refresh Thin Clone capability on the FlashSystem, to automate not only the creation of the Snapshot...but the VALIDATION as well!
- Run External Script to quiesce the application
- Issue 'Create Safeguarded Snapshot' against a given session
- Run External Script to resume the application
- Issue 'Refresh Thin Clone' command to apply the latest snapshot to an existing thin clone. Using this option lets you update the host that does the validation, without having to attach that host to a new set of volumes! Quick. Easy. Simple.
- Run External Script to invoke a set of actions on the host system attached to the volumes in the Thin Clone that will validate your unique set of data
A scheduled task can not only be scheduled on an hourly or daily/weekly basis...it can also be configured so that it can be invoked remotely via our APIs.

And finally, Scheduled Tasks in CSM even have the ability to run another task on a successful execution or a different task on a failed execution. Meaning you can build if-then-else type logic within the scheduled tasks without having to write a single line of code!!!
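If it helps to picture that if-then-else behavior, here is a small Python model of it. To be clear, in CSM you configure this in the Scheduled Task itself with no code at all; the `Task` class and `run_chain` function below are hypothetical names I made up purely to illustrate the branching, and they are not part of any CSM interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """Hypothetical model of a task with success and failure follow-ups."""
    name: str
    action: Callable[[], bool]  # returns True on success, False on failure
    on_success: Optional["Task"] = None
    on_failure: Optional["Task"] = None


def run_chain(task: Optional[Task]) -> List[str]:
    """Run a task, then follow its success or failure branch until none is left.

    Note: this sketch assumes the chain has no cycles.
    """
    executed: List[str] = []
    while task is not None:
        ok = task.action()
        executed.append(f"{task.name}:{'ok' if ok else 'failed'}")
        task = task.on_success if ok else task.on_failure
    return executed


if __name__ == "__main__":
    alert = Task("alert", lambda: True)
    validate = Task("validate", lambda: True)
    snapshot = Task("snapshot", lambda: True, on_success=validate, on_failure=alert)
    print(run_chain(snapshot))
```

In this toy chain a successful snapshot leads to the validation task, while a failed one leads to the alert task, which is the same shape of logic you can build by linking Scheduled Tasks together.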
What are the CSM APIs?
CSM does have a traditional CLI, and automation can be created using that CLI. But anyone who has built automation on a CLI knows there are some drawbacks to doing so. The CLI typically has to be installed on the server running the automation, which means it also has to be maintained. When the CSM server is updated, the CLI should be updated as well, and if it isn't, that could lead to compatibility issues.
My focus for CSM Automation instead is on what I'll refer to as a 3 tier API framework.
The CSM automation framework all starts with a robust REST interface. CSM's REST interface gives the customer the ability to issue any command to a CSM session. See the CSM REST documentation for details.
CSM also provides a Python library called pyCSM. This can be installed in a Python environment with pip and uses the CSM REST API to allow customers to write Python scripts against a CSM server. See the pyCSM documentation for details.
And finally, CSM provides a CSM Ansible collection. This Ansible collection is built on the pyCSM library, which uses the CSM REST API. See the CSM Ansible collection for details.
This 3 tier framework gives customers the flexibility to develop automation against whichever of the layers they feel fits best in their environment. If a customer uses Ansible...great! But if they'd prefer Python or REST...it's there as well. It's ALL about flexibility!!!
[Figure: visual representation of the 3 tier framework, with the CSM Ansible collection built on pyCSM, which is built on the CSM REST API]
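To give a feel for the bottom tier, here is a rough Python sketch that only builds (and does not send) an HTTP request to run a command against a session. Treat every specific in it as an assumption to check against the CSM REST documentation: the base URL, the `/sessions/<name>` path, the `cmd` form field, and the `X-Auth-Token` header are my illustration of what such a call could look like, and `build_session_command_request` is a hypothetical helper, not a pyCSM function.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_session_command_request(
    base_url: str, session_name: str, command: str, token: str
) -> Request:
    """Build (but do not send) a POST request that runs a command on a session.

    The URL layout, form field and header name are assumptions for this sketch.
    """
    # Percent-encode the session name so spaces and slashes are safe in the path.
    url = f"{base_url.rstrip('/')}/sessions/{quote(session_name, safe='')}"
    body = urlencode({"cmd": command}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Auth-Token": token,  # assumed token header name
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    req = build_session_command_request(
        "https://csm.example.com:9559/CSM/web",  # hypothetical server address
        "my session",
        "Create Snapshot",
        "example-token",
    )
    print(req.get_method(), req.full_url)
    print(req.data)
```

If you'd rather not deal with URLs and headers yourself, that is exactly the job the pyCSM and Ansible tiers do for you on top of the same REST interface.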

That's all great....but what can CSM really do with FlashSystems? Can we do what I need it to do?
Man...you're just FULL of great questions today. Having an automation framework is one thing...but if it doesn't support everything I need on the FlashSystems...what good is it, right?
The CSM team has worked tirelessly to stay on top of the latest and greatest that FlashSystems has to offer. We may not have all the support right out of the gate for a release...but we're soon on its heels. I think the best way to address this is to list out a number of the features that CSM provides with FlashSystems. While CSM does support the legacy FlashCopy (FC), Metro Mirror (MM) and Global Mirror (GM) solutions...this blog will stay focused on the newer technologies.
This is just a list at the time this blog was written...and I'm probably missing some. :)
Snapshot Support
- Snapshots
- Safeguarded Snapshots
- Advanced Scheduling for Snapshots
- Schedule different intervals with different retentions on the same volume group
- Schedule both safeguarded and non-safeguarded Snapshots on the same volume group
- Ability to call external scripts in order to quiesce virtually ANY application to create application consistent Snapshots
- Prepare Snapshot and Prepare Safeguarded Snapshot support
- Restore Snapshot to Production
- Create Thin clones from Snapshots
- Refresh Thin clones from Snapshots
- Create Full clones from Snapshot
- Schedule and Automate the creation of a Snapshot with the Refresh of a Thin Clone - I'll come back to this one in a little bit!!!
Asynchronous Policy Based Replication (PBR) Support
- Async PBR (both partition and non-partition based) -> Automatic Discovery based on Volume Group Policy
- Async PBR with HA replication -> Automatic Discovery based on Volume Group Policy
- Ability to automate site switches with Async PBR
- Automatically create local and remote Snapshot sessions for all Storage Systems in an Async PBR VG
- CSM allows you to create/schedule "local" Snapshots against the VG
- CSM knows which site is the "production" site and can apply the "local" policy automatically to that site
- CSM knows which storage system is the "active" production storage system when in an HA configuration
- CSM allows you to create/schedule "remote" Snapshots against the VG
- CSM knows which site is the "remote" site and can apply the "remote" policy automatically to that site
- Support for Async PBR "Create Checkpoint" features allowing better automation for site failovers or Snapshot management on an Async PBR enabled VG
- Support Recovery Test Mode for Async PBR sessions
So when should you use CSM over just the FlashSystems interface or other solutions like Copy Data Manager?
Congrats to those that got here by reading through my long diatribe on CSM. And if you jumped straight here for the summarized list of reasons...let's get right into it!
The above might give you a better understanding of what CSM can do...but when looking at options for a solution...here are some key things to consider.
Multiple Scheduling Options
While FlashSystems has a built-in scheduler now, the CSM scheduler can do a lot more. Not only can it schedule at hourly or daily times, but it can be set up to do things like create Snapshots at different intervals, with different retentions, Safeguarded or not...all against the same Volume Group! Do you need flexibility in how you schedule the creation of Snapshots or other actions?
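As a quick worked example of what "different intervals and retentions on the same Volume Group" means in practice, here is a small Python sketch. The schedule values, the volume group name and the `SnapshotSchedule` class are all made up for illustration and are not a CSM configuration format. It simply estimates how many Snapshots each schedule would be holding once it reaches steady state (retention divided by interval).

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class SnapshotSchedule:
    """Hypothetical description of one snapshot schedule on a volume group."""
    volume_group: str
    interval: timedelta   # how often a Snapshot is taken
    retention: timedelta  # how long each Snapshot is kept
    safeguarded: bool


def steady_state_copies(schedule: SnapshotSchedule) -> int:
    """Approximate number of Snapshots held once the schedule has been running."""
    return schedule.retention // schedule.interval


# Two illustrative schedules against the same (made-up) volume group.
schedules = [
    SnapshotSchedule("payroll_vg", timedelta(hours=1), timedelta(hours=24), False),
    SnapshotSchedule("payroll_vg", timedelta(days=1), timedelta(days=7), True),
]

if __name__ == "__main__":
    for s in schedules:
        kind = "safeguarded" if s.safeguarded else "non-safeguarded"
        print(f"{s.volume_group}: {steady_state_copies(s)} {kind} Snapshots")
    print("total:", sum(steady_state_copies(s) for s in schedules))
```

With these example numbers the hourly schedule holds about 24 Snapshots and the daily Safeguarded schedule holds about 7, so it's worth doing this kind of arithmetic up front when you size capacity for a mix of schedules.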
Easy to Setup Scheduled Tasks
CSM Scheduled Tasks are extremely easy to set up and, with the run task on success/failure options, can be used to create solutions without having to know the intricacies of a CLI, scripting language, etc. This makes them extremely easy not only to maintain...but to teach. So when those new college hires come in to help manage your storage, they can pick it up quickly!
Quick adoption of new FlashSystems features
The CSM team is constantly working to stay on top of the latest and greatest that FlashSystems has to offer. As they keep innovating for you...we keep making it easy for you to manage those innovations! And we're quick to react to customer requests through the IDEA portal.
Extreme Flexibility and Adaptability
Copy Data Manager can help you manage your Snapshots at an application level, but does it support the application you're using? Does it have the features you need? Not only does CSM stay on top of the latest FlashSystems features, but the architecture I described above makes it ideal for helping you create unique solutions...including application level management!!! CSM works closely with IBM Technology Expert Labs and can help work with you no matter WHAT application you use. Scheduled Tasks can invoke external scripts for virtually any application or environment! When your business decides to pick up a newly hyped application...you can quickly adapt it into your existing solution.
Distributed and Mainframe support
While this post is directed at FlashSystems customers...are you a FlashSystems customer that also has a mainframe environment? Having a central strategy across your FlashSystems and DS8000 storage may be a key selling point. So not only do you get a single pane of glass across your FlashSystems, but you can have it across your DS8000s as well. And building a Cyber Vault architecture that can be implemented for your unique requirements across your entire business may be critical!
As you can see...CSM can be a powerful tool in creating a Resilient, Flexible and Easy to maintain solution for your business needs.
Hope you found this blog helpful.
Please reach out if you have any additional questions on IBM Copy Services Manager and how it can help you build out resilient solutions.
Randy Blea - CSM Architect <blead@us.ibm.com>
And for more details on how IBM Technology Expert Labs services can help you use CSM to build a Resilient Solution, reach out using the following link:
https://www.ibm.com/account/reg/signup?formid=MAIL-consult
For example, you can reach out and ask about "Services to implement application consistent SGCs in CSM".