Upgrading an IBM Spectrum Scale sync replication / stretch cluster setup in PureApp
Use case: A GPFS stretch cluster / GPFS synchronous mirroring setup consists of 2 sites, with a 3rd site as tiebreaker, and uses the shared storage deployment model (block-based storage at the backend). In a typical IBM PureApp-based installation (which uses server-based cluster configuration and not CCR), such a setup may need to be upgraded because of EOL or other issues. The upgrade can cover:
i) All the physical NSD servers on both sites, OR
ii) The entire backend block-based storage on both sites, OR
iii) Both i) and ii)
Note: The PureApp approach "treats" Spectrum Scale systems as racks, so an upgrade requires "replacing" racks.
Ideally, in the Scale world, one would add the new servers and NSDs to the cluster and then either use mmrpldisk or follow "add new disks --> suspend old ones --> restripe". Either way the migration completes in one round, instead of the several rounds needed with the PureApp-based setup explained below.
So, for a non-PureApp installation the process is much easier; see https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.3/com.ibm.spectrum.scale.v5r03.doc/bl1adm_rpldisk.htm
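For comparison, the "add new disks --> suspend old ones --> restripe" path can be sketched as an ordered command plan. This is a minimal dry-run sketch in Python, not a tested procedure: the helper name plan_disk_migration, the file system name fs1, the stanza file path and the disk names are all hypothetical, and it assumes the new NSDs have already been created and described in the stanza file. It only builds the command lines so they can be reviewed; nothing is executed.

```python
from typing import List


def plan_disk_migration(device: str, stanza_file: str, old_disks: List[str]) -> List[str]:
    """Return the command lines, in order, for an add/suspend/restripe migration.

    Hypothetical helper: it only formats the commands for review and runs nothing.
    """
    if not old_disks:
        raise ValueError("at least one old disk name is required")
    # GPFS commands take multiple disk names as one semicolon-separated string
    disk_list = ";".join(old_disks)
    return [
        # 1. add the new NSDs (described in the stanza file) to the file system
        f"mmadddisk {device} -F {stanza_file}",
        # 2. suspend the old disks so no new blocks are allocated on them
        f'mmchdisk {device} suspend -d "{disk_list}"',
        # 3. migrate data off the suspended disks and restore replication
        f"mmrestripefs {device} -r",
    ]


if __name__ == "__main__":
    # Assumed example names; substitute the real device, stanza file and NSD names
    for line in plan_disk_migration("fs1", "/tmp/new_nsds.stanza", ["nsd_old1", "nsd_old2"]):
        print(line)
```

Once the restripe has finished and the old disks hold no data, they would normally be removed from the file system as a separate, final step. Check the options against the documentation for the installed Spectrum Scale release before running anything.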
Below are guideline steps for upgrading such a setup.
The steps below assume a Spectrum Scale sync replication setup (sometimes also called a stretch cluster). The cluster has two sites, one called Primary and the other Secondary. The problem definition is: how to migrate the old/existing deployment (called OldPrimary and OldMirror) to a new deployment (NewPrimary and NewMirror) that has an upgraded stack of Spectrum Scale nodes, block storage and operating systems. The upgrade steps are planned so that the OldMirror is upgraded to become the NewPrimary and the OldPrimary is upgraded to become the NewMirror.
Representation is as follows:
OldPrimary -> P1
OldMirror -> P2
NewPrimary -> P3
NewMirror -> P4
The steps reflect the fact that the OldMirror is replaced by the NewPrimary and not by the NewMirror. So the sequence is:
P2 replaced with P3
P1 replaced with P4
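To make the ordering concrete, the two replacements can be modelled as a small state walk. This is only an illustrative Python sketch using the P1 to P4 labels defined above; the names REPLACEMENTS and cluster_states are hypothetical and the model says nothing about the Spectrum Scale commands involved. It shows which pair of sites makes up the cluster after each replacement.

```python
from typing import List, Tuple

# (old site, new site) pairs, in the order the replacements are performed
REPLACEMENTS: List[Tuple[str, str]] = [("P2", "P3"), ("P1", "P4")]


def cluster_states(initial: List[str]) -> List[List[str]]:
    """Return the site pair before any change and after each replacement."""
    current = list(initial)
    states = [list(current)]
    for old, new in REPLACEMENTS:
        # swap the old site for the new one, leaving the other site untouched
        current = [new if site == old else site for site in current]
        states.append(list(current))
    return states


if __name__ == "__main__":
    for step, sites in enumerate(cluster_states(["P1", "P2"])):
        print(f"after {step} replacement(s): {sites}")
```

In this model the cluster has two sites at every stage: first P1 and P2, then P1 and P3, and finally P4 and P3.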
1. Ensure file system operation with only one replica
a. Note the current value of strict replication: