Support Create Checkpoint feature on FlashSystems Asynchronous Policy Based Replication session
When replicating data with FlashSystems Asynchronous Policy Based Replication (PBR), the storage system can now create a checkpoint. A customer can use that checkpoint to ensure that all data written to the sources before it was created has been replicated to the remote storage system.
Starting with CSM 6.3.15, this feature is available on CSM Async PBR sessions. Customers can issue the Create Checkpoint command against the session to tell the FlashSystem to create a checkpoint for that volume group, and the CSM session will then automatically monitor the storage system for when that checkpoint completes.
After the Create Checkpoint command is issued, the CSM session goes into a "Waiting for Checkpoint" state. Under the covers, a thread kicks off that queries the hardware to determine when the checkpoint has completed. When it completes, the session goes right back to the Prepared state. This new command is available on both two-site Async PBR and Async PBR with High Availability.
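To make that state flow concrete, here is a minimal Python sketch of the same wait pattern seen from a client's point of view. `FakeSession` and `wait_for_state` are hypothetical names invented for this illustration. They are not part of CSM or its REST interface, and the fake simply pretends the checkpoint completes after a few polls. Only the state names "Waiting for Checkpoint" and "Prepared" come from the description above.

```python
import time
from typing import Callable


class FakeSession:
    """Stand-in for a CSM Async PBR session (hypothetical, NOT a real CSM API)."""

    def __init__(self, polls_while_waiting: int = 3) -> None:
        self.state = "Prepared"
        self._polls_while_waiting = polls_while_waiting
        self._remaining = 0

    def create_checkpoint(self) -> None:
        # Issuing the command moves the session out of Prepared.
        self.state = "Waiting for Checkpoint"
        self._remaining = self._polls_while_waiting

    def get_state(self) -> str:
        # Pretend the checkpoint completes after a fixed number of polls.
        if self.state == "Waiting for Checkpoint":
            if self._remaining == 0:
                self.state = "Prepared"
            else:
                self._remaining -= 1
        return self.state


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "Prepared",
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns target; give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

In a real environment `get_state` would be whatever call you use to read the session state. The timeout is there so that automation does not hang forever if the checkpoint never completes.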



There are a LOT of different use cases for this type of functionality. For example, Chris Bulmer pointed out one such use case in his blog "Making the most of async policy-based replication".
The solution that Chris mentioned was the ability to do a planned failover with no data loss. Using the Create Checkpoint feature, customers can be assured that, once the application is quiesced, ALL of its data has been successfully written to the remote storage system before failing over.
Let's show you how you can do this with a CSM Scheduled Task.
You can create a single scheduled task that will walk through all of these steps for you. The actions in the task would be something like the following:
- Run External Script to quiesce the application on the current host system
- Issue Create Checkpoint to the CSM Async PBR session managing the Volume Group Replication
- Create a Wait For State action that waits for the session to return to a Prepared state. (The Create Checkpoint step puts the session into the Waiting for Checkpoint state, so when it's back in Prepared we know the checkpoint is done.)
- Issue Failover to the Async PBR session
- Run External Script to resume the application, this time issued to the server at the remote site
- Issue Confirm Production at Site 2 to the Async PBR session. This confirms to CSM that production has switched sites.
- Issue Start H2->H1 command to the Async PBR session. This restarts replication in the opposite direction.
In only a few short minutes you've created a Scheduled Task in CSM that can be invoked either manually or via external automation using the CSM REST interface, and that will fail your applications over to the new site.
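The ordering of the steps above is what makes the failover safe, so here is a small Python sketch of the same sequence written as an ordered list of actions. Everything in it is a placeholder: `issue`, `run_script`, and `wait_prepared` are hypothetical callables you would supply yourself, not CSM functions, and the runner stopping at the first failed step is a design choice of this sketch rather than a statement about how CSM Scheduled Tasks behave.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_task(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first one that returns False.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


def planned_failover_steps(
    issue: Callable[[str], bool],            # hypothetical: send a session command
    run_script: Callable[[str, str], bool],  # hypothetical: run a script on a site
    wait_prepared: Callable[[], bool],       # hypothetical: wait for Prepared state
) -> List[Step]:
    """Build the planned-failover sequence described in the list above."""
    return [
        ("Quiesce application at site 1", lambda: run_script("site1", "quiesce")),
        ("Create Checkpoint", lambda: issue("Create Checkpoint")),
        ("Wait for Prepared", wait_prepared),
        ("Failover", lambda: issue("Failover")),
        ("Resume application at site 2", lambda: run_script("site2", "resume")),
        ("Confirm Production at Site 2", lambda: issue("Confirm Production at Site 2")),
        ("Start H2->H1", lambda: issue("Start H2->H1")),
    ]
```

Because Failover comes after the wait step, a checkpoint that never completes means the sketch never fails over, which is exactly the guarantee the planned failover relies on.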

But what other use cases can we think of? Another might be to create a Safeguarded Snapshot at both the local and remote sites while making sure they contain exactly the same data. This solution builds on the ability, added in the last CSM release, to manage local and remote Snapshots directly from the Async PBR session. The following task will work no matter the direction of the replication!
To create this Scheduled Task we would do the following:
- Run External Script to quiesce the application
- Issue Create Safeguarded Snapshot to the Async PBR session to create a local Snapshot
- Issue Create Checkpoint to the session to create the checkpoint
- Create a Wait For State action that waits for the session to return to a Prepared state.
- Issue Create Recovery Site Safeguarded Snapshot to create a snapshot on the remote storage system
- Run External Script to resume the application
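Why do the two snapshots end up with the same data? The toy Python model below walks through the reasoning. It is only an illustration under simplifying assumptions: `ToyReplication` is a made-up class that treats a volume as an ordered list of writes, and it says nothing about how FlashSystem implements replication or checkpoints internally.

```python
from typing import List, Tuple


class ToyReplication:
    """Toy model of asynchronous replication (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.local: List[str] = []   # writes applied at the production site
        self.remote: List[str] = []  # writes that have reached the recovery site

    def write(self, block: str) -> None:
        self.local.append(block)

    def replicate_some(self, n: int = 1) -> None:
        """Copy up to n pending writes, in order, to the remote site."""
        start = len(self.remote)
        self.remote.extend(self.local[start:start + n])

    def create_checkpoint(self) -> int:
        """Record how many writes exist locally at this moment."""
        return len(self.local)

    def checkpoint_complete(self, checkpoint: int) -> bool:
        """True once every write made before the checkpoint is on the remote."""
        return len(self.remote) >= checkpoint


def snapshot_both_sites(rep: ToyReplication) -> Tuple[tuple, tuple]:
    """Follow the task order: local snapshot, checkpoint, wait, remote snapshot.

    The application is assumed to be quiesced, so no write() happens in here.
    """
    local_snapshot = tuple(rep.local)
    checkpoint = rep.create_checkpoint()
    while not rep.checkpoint_complete(checkpoint):
        rep.replicate_some()
    remote_snapshot = tuple(rep.remote)
    return local_snapshot, remote_snapshot
```

With the application quiesced, nothing new is written between the local snapshot and the checkpoint, so once the checkpoint completes the remote copy holds the same writes the local snapshot captured.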

The support for Create Checkpoint could in theory even be used to replicate the data to a "third" or "fourth" site. For example, you could create a CSM Scheduled task that does the following:
- Create a Snapshot of local data
- Issue a Refresh Thin Clone command to apply that data to a Thin Clone, which would then hold the data from that latest Snapshot. That Thin Clone Volume Group would be set up with its own Async PBR replication policy to a third or fourth site.
- Issue Create Checkpoint to that Thin Clone Volume Group
- Create a Wait For State action that waits for the session to return to a Prepared state.
The above could then be run at a given interval as a CSM Scheduled Task, giving you Async replication of the data to that third or fourth site with an RPO based on how often you run the task.
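As a rough back-of-the-envelope estimate of that RPO (my own reasoning, not a formula from the CSM or FlashSystem documentation): if the task starts every `interval` and one run takes `run_duration` from the Snapshot to the completed checkpoint, then just before a run finishes, the newest data at the extra site is one interval plus one run duration old. The function name and the example numbers below are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_rpo(interval: timedelta, run_duration: timedelta) -> timedelta:
    """Estimate the oldest the data at the extra site can be.

    Assumes runs do not overlap and that each run's Snapshot is taken
    at the start of the run (both are assumptions of this sketch).
    """
    if run_duration > interval:
        raise ValueError("runs would overlap; this estimate assumes they do not")
    return interval + run_duration


# Example: task scheduled every 15 minutes, each run taking about 4 minutes.
example = worst_case_rpo(timedelta(minutes=15), timedelta(minutes=4))
```

With those example numbers the estimate comes out to 19 minutes, so shortening the schedule interval is the main lever for tightening the RPO.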
Hopefully, you can see that there are a lot of possibilities here, and that using a CSM Scheduled Task is not only quite easy but can also help automate these scenarios, whether you schedule the task through CSM or invoke it through Python or REST.