Authors: Madhu Thorat, Wayne Sawdon
Overview
IBM’s recent release of its Container Storage Interface (CSI) driver for Spectrum Scale introduces the ability to take a snapshot of your Kubernetes persistent volumes (PVs). A snapshot provides a point-in-time copy of a volume, which can be used to create a consistent, safeguarded backup of the data. When used with IBM’s Spectrum Protect Plus (SPP), each backup is incremental relative to the prior snapshot; that is, only the data that has changed since the last backup is copied to the new backup. This allows your data to be safeguarded more rapidly and stored more efficiently. Furthermore, once a backup has been created, the recent release of the CSI driver allows that backup to be restored to its original volume for disaster recovery, or restored to a new volume for DevOps, testing, or offline analytics.
The volume snapshot feature was introduced with IBM Spectrum Scale CSI driver v2.2.0 and is included in the most recent release, v2.4.0. It requires IBM Spectrum Scale v5.1.1.0 or higher. Currently, volume snapshots are restricted to persistent volume claims (PVCs) based on Spectrum Scale’s independent filesets.
Let’s walk through the snapshot and restore functions to see how to use this feature when running stateful applications on Kubernetes or Red Hat OpenShift. We will also see how to use this feature with Spectrum Protect Plus to provide safeguarded backups for your data.
Deploying the CSI driver
To install the recently released v2.4.0 of the IBM Spectrum Scale CSI driver, please follow the instructions at https://www.ibm.com/docs/en/spectrum-scale-csi?topic=240-planning and https://www.ibm.com/docs/en/spectrum-scale-csi?topic=240-installation
Let’s Get Started!
Let us first create a sample StorageClass and PersistentVolumeClaim, which the snapshot examples that follow refer to.
Sample StorageClass and PersistentVolumeClaim (PVC)
Because snapshots can currently be created only from independent fileset-based PVCs, the example below shows a storage class for independent filesets. It assumes that PVCs are to be created under the file system gpfs0, and that "8809226123290901444" is the cluster ID of the primary cluster.
# cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibm-spectrum-scale-csi-fileset-sc
provisioner: spectrumscale.csi.ibm.com
parameters:
  volBackendFs: gpfs0
  clusterId: "8809226123290901444"
reclaimPolicy: Delete
# cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-csi-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ibm-spectrum-scale-csi-fileset-sc
Note: All YAML files presented in this blog can be applied with kubectl apply -f <filename> on Kubernetes. Replace ‘kubectl’ with ‘oc’ for Red Hat OpenShift.
Creating a Volume Snapshot
Sample VolumeSnapshotClass
A VolumeSnapshotClass is similar to a StorageClass: it specifies the driver-specific attributes for the snapshots to be created. The VolumeSnapshotClass name is referenced later when creating a volume snapshot.
# cat volumesnapshotclass.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ibm-spectrum-scale-csi-snapshot-class
driver: spectrumscale.csi.ibm.com
deletionPolicy: Delete
Sample VolumeSnapshot
A VolumeSnapshot is a point-in-time copy of the content of a volume on the storage system.
# cat volumesnapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ibm-spectrum-scale-vol-snapshot
spec:
  volumeSnapshotClassName: ibm-spectrum-scale-csi-snapshot-class
  source:
    persistentVolumeClaimName: ibm-spectrum-scale-csi-pvc
Note: The source persistent volume claim (PVC) must be in the same namespace in which the volume snapshot will be created.
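If snapshots are needed for several PVCs, the manifest above can be generated rather than written by hand. The following is a minimal Python sketch that assumes only the field names shown in the sample; the helper name build_volume_snapshot is hypothetical and not part of any IBM or Kubernetes tooling. It prints JSON, which kubectl apply -f accepts as well as YAML.

```python
import json


def build_volume_snapshot(name, pvc_name, snapshot_class, namespace=None):
    """Return a VolumeSnapshot manifest as a dict (hypothetical helper)."""
    metadata = {"name": name}
    if namespace is not None:
        # The snapshot is created in the same namespace as the source PVC.
        metadata["namespace"] = namespace
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": metadata,
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    manifest = build_volume_snapshot(
        "ibm-spectrum-scale-vol-snapshot",
        "ibm-spectrum-scale-csi-pvc",
        "ibm-spectrum-scale-csi-snapshot-class",
    )
    print(json.dumps(manifest, indent=2))
```

Redirect the output to a file and apply it with kubectl apply -f <filename>, as with the YAML samples.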
Once the manifest has been applied successfully, the snapshot should be created and reported as ready to use:
# kubectl get volumesnapshot
NAME                              READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                           SNAPSHOTCONTENT                                     CREATIONTIME   AGE
ibm-spectrum-scale-vol-snapshot   true         ibm-spectrum-scale-csi-pvc                           10Gi          ibm-spectrum-scale-csi-snapshot-class   snapcontent-8c768910-28d1-6a56-8e12-8495149095094   1d20h          1d20h
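For scripted checks, the same information is available in machine-readable form from kubectl get volumesnapshot <name> -o json. Below is a minimal Python sketch that reads the readyToUse and restoreSize fields from the status of one VolumeSnapshot object; the function name snapshot_status and the trimmed sample input are illustrative assumptions.

```python
import json


def snapshot_status(snapshot_json):
    """Return (ready, restore_size) for one VolumeSnapshot given as JSON text."""
    status = json.loads(snapshot_json).get("status") or {}
    # readyToUse can be absent while the snapshot is still being created.
    return bool(status.get("readyToUse", False)), status.get("restoreSize")


if __name__ == "__main__":
    # Trimmed, illustrative object shaped like the listing above.
    sample = json.dumps({
        "kind": "VolumeSnapshot",
        "metadata": {"name": "ibm-spectrum-scale-vol-snapshot"},
        "status": {"readyToUse": True, "restoreSize": "10Gi"},
    })
    ready, size = snapshot_status(sample)
    print(f"ready={ready} restoreSize={size}")
```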
Creating a volume from a source snapshot
Once the volume below has been created successfully, the resulting PVC “ibm-spectrum-scale-pvc-from-snap” contains the data from the source snapshot “ibm-spectrum-scale-vol-snapshot”.
# cat pvcfromsnapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-pvc-from-snap
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ibm-spectrum-scale-storageclass
  dataSource:
    name: ibm-spectrum-scale-vol-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
Note1: The source snapshot should be in the same namespace as the volume that is created.
Note2: Capacity of the volume must be greater than or equal to the source snapshot's restore size.
Note3: The new volume can be of any type (for example, a lightweight volume or a dependent fileset volume).
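The capacity rule in Note2 can be checked before the PVC is submitted. The sketch below is a minimal Python illustration that compares a requested size with a snapshot's restore size; it handles only plain integers with the common suffixes listed in the table, which is a deliberate simplification of the full Kubernetes quantity syntax, and the function names are hypothetical.

```python
import re

# Subset of Kubernetes quantity suffixes (assumption: integers only).
_UNITS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def to_bytes(quantity):
    """Convert a quantity such as '10Gi' to a number of bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti|k|M|G|T)?", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix or ""]


def can_restore(requested, restore_size):
    """True if the requested capacity is at least the snapshot's restore size."""
    return to_bytes(requested) >= to_bytes(restore_size)


if __name__ == "__main__":
    print(can_restore("10Gi", "10Gi"))
    print(can_restore("5Gi", "10Gi"))
```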
Usage of Volume Snapshot feature by IBM Spectrum Protect Plus (SPP)
IBM Spectrum Protect Plus (SPP) provides Container Backup Support to protect volume data that was allocated by a storage plug-in that supports the Container Storage Interface (CSI) provided for Kubernetes or Red Hat OpenShift environments. You can perform snapshot backup operations to create locally stored backup copies in the cluster, or on storage external to the cluster.
For file system-based storage, file-based incremental copy backup and restore operations are performed. During incremental backups, only new and changed data is copied. For IBM Spectrum Scale, IBM Spectrum Protect Plus uses the snapshot capabilities of the IBM Spectrum Scale CSI driver for backup and restore operations. Container backup support was extended to include IBM Spectrum Scale CSI driver 2.2.0 from IBM Spectrum Protect Plus V10.1.8 onwards.
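To make the idea of an incremental backup concrete, here is a toy Python sketch. It is only an illustration of "copy what is new or changed", not how IBM Spectrum Protect Plus is implemented; the function name and the path-to-digest maps are assumptions made for the example, and deleted files are not handled.

```python
def incremental_changes(previous, current):
    """Return the paths that are new or changed since the previous backup.

    Both arguments map a file path to a content digest (illustrative model).
    """
    return sorted(
        path for path, digest in current.items()
        if previous.get(path) != digest
    )


if __name__ == "__main__":
    last_backup = {"db/a.ibd": "d1", "db/b.ibd": "d2"}
    now = {"db/a.ibd": "d1", "db/b.ibd": "d9", "db/c.ibd": "d3"}
    # Only the changed and the new file need to be copied.
    print(incremental_changes(last_backup, now))
```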
Backup and Restore
Container Backup Support provides multiple types of backup and restore functions for your cluster resources and persistent data. To protect data, you define container service level agreement (SLA) policies that determine how often snapshot backup and copy backup operations are run, and how long snapshots and copy backups are retained. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to take the snapshots from which a backup is created. Similarly, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to create the PVC into which data is restored.
To initiate backup and restore requests, you can use the IBM Spectrum Protect Plus (SPP) user interface or Kubernetes or OpenShift command line. To get more information about various backup and restore types check the IBM Spectrum Protect Plus v10.1.8 documentation at https://www.ibm.com/docs/en/spp/10.1.8?topic=overview-backup-restore-types
Example of command line backup: Scheduling backups of persistent volume claims
The YAML file below describes a BaaS request to back up a persistent volume. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to take the snapshots from which a backup is created.
# cat backup-pvc.yaml
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq
metadata:
  name: mysql-project-pvc
  namespace: mysql-ns
spec:
  requesttype: Backup
  sla: ["weekdays", "weekends"]
  volumesnapshotclass: storagecluster-snapclass
The parameters in the above backup request are described below:
name: Name of the request.
Important: Here the name must be identical to the PVC name that is to be backed up.
namespace: The namespace in which the PVC exists.
requesttype: The type of request that is submitted (here it is backup).
sla: [sla_policy]: Specifies the SLA policy that determines the schedule, retention, and snapshot prefix for backup operations. More than one SLA policy can be specified by using a comma-separated list within the brackets. Policy names are case-sensitive in YAML files, so ensure that you use the correct case when you specify an SLA policy name.
volumesnapshotclass: Specifies the snapshot class for the volume. If no snapshot class is specified, the default snapshot class is used if the sidecar container csi-snapshotter in the default snapshot class matches the provisioner of the volume. Otherwise, the backup request is invalid.
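The parameter rules above can be captured in a small generator. The following is a minimal Python sketch that builds the backup request from the fields shown in the sample; the helper name build_backup_request is hypothetical, and the check that at least one SLA policy is given is an assumption of this sketch rather than a documented rule. It prints JSON, which kubectl apply -f accepts as well as YAML.

```python
import json


def build_backup_request(pvc_name, namespace, sla_policies, snapshot_class=None):
    """Return a BaaSReq backup manifest as a dict (hypothetical helper)."""
    if not sla_policies:
        # Assumption of this sketch: a backup request names at least one SLA.
        raise ValueError("at least one SLA policy name is required")
    # SLA policy names are case-sensitive, so they are passed through unchanged.
    spec = {"requesttype": "Backup", "sla": list(sla_policies)}
    if snapshot_class is not None:
        spec["volumesnapshotclass"] = snapshot_class
    return {
        "apiVersion": "baas.io/v1alpha1",
        "kind": "BaaSReq",
        # The request name must be identical to the PVC that is backed up.
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    request = build_backup_request(
        "mysql-project-pvc", "mysql-ns",
        ["weekdays", "weekends"], "storagecluster-snapclass",
    )
    print(json.dumps(request, indent=2))
```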
Example of command line restore: Restoring persistent data
Prior to restoring persistent data, the available restore points must be determined by running the following command:
kubectl describe baasreq <pvc_name> -n <namespace>
The example below shows sample output of the command. The available restore points are identified by the timestamp of the snapshot or copy backup.
# oc describe baasreq mysql-project-pvc -n mysql-ns
Name:         mysql-project-pvc
Namespace:    mysql-ns
...
  Origreqtype:          backup
  Requesttype:          backup
  Size:                 1073741824
  Sla:
    ESCC-OCP
  Spppvcname:           OCP: mysql-ns:mysql-project-pvc
  Volumesnapshotclass:  storagecluster-snapclass
Status:
  Laststatusupdate:  2021-09-12 12:43:22
  Snapshotname:      escc-ocp-1027-2116-1797668d557
  Timestamp:         2021-09-12 12:14:25
  Type:              FAST
  Snapshotname:      escc-ocp-1027-2116-17973d5a5bf
  Timestamp:         2021-09-12 06:14:26
  Type:              FAST
Note: Here the type FAST indicates that the restore point is a snapshot that was taken during a snapshot backup operation.
The YAML file below shows a restore request for data to be restored from a snapshot backup. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to create the PVC into which data is restored.
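The restore points in the sample output can also be extracted with a short script. The Python sketch below is a minimal illustration that assumes each restore point appears as a Snapshotname, Timestamp, Type triple in that order, as in the sample; the function names are hypothetical, and real output may contain additional fields.

```python
from datetime import datetime


def parse_restore_points(describe_output):
    """Collect restore points from 'describe baasreq' text output."""
    points, current = [], {}
    for line in describe_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "Snapshotname":
            current = {"snapshot": value}
        elif key == "Timestamp" and current:
            current["timestamp"] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        elif key == "Type" and "timestamp" in current:
            current["type"] = value
            points.append(current)
            current = {}
    return points


def latest_restore_point(points):
    """Return the restore point with the most recent timestamp."""
    return max(points, key=lambda point: point["timestamp"])


if __name__ == "__main__":
    sample = """Status:
Laststatusupdate: 2021-09-12 12:43:22
Snapshotname: escc-ocp-1027-2116-1797668d557
Timestamp: 2021-09-12 12:14:25
Type: FAST
Snapshotname: escc-ocp-1027-2116-17973d5a5bf
Timestamp: 2021-09-12 06:14:26
Type: FAST
"""
    newest = latest_restore_point(parse_restore_points(sample))
    print(newest["snapshot"], newest["timestamp"], newest["type"])
```

The timestamp of the chosen restore point is the value to place in the restorepoint field of a restore request.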
# cat restore-pvc.yaml
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq
metadata:
  name: restore-mysql-project-pvc
  namespace: mysql-ns
spec:
  requesttype: restore
  pvcname: mysql-project-pvc
  targetvolume: mysql-project-pvc-new
  storageclass: storagecluster-storageclass
  restorepoint: 2021-09-12 06:14:26
  restoretype: copy
The parameters in the above restore request are described below:
name: The name of the request for the restore job. The name must be unique and must not match the name of an existing PVC. A restore request must be created for each subsequent restore of the same PVC. That is, to restore a PVC again, create a request and specify a different request name in the YAML file.
namespace: The namespace for the request.
Note: The CLI allows restores to the original namespace only.
requesttype: The type of request that is submitted (here it is restore).
pvcname: The original name of the PVC to be restored.
targetvolume: The name of the (new) restore target PVC.
storageclass: The storageclass that is to be used for the new PVC.
Restriction: Although storageclass of the restore target can be different from the storageclass of the original PVC, the new storageclass must be of the same Provisioner type.
restorepoint: Specifies the timestamp of the source snapshot or copy backup that is to be restored. The timestamp is in Coordinated Universal Time (UTC). If no timestamp is specified, the most recent snapshot or copy backup is restored.
restoretype: The type of restore to perform, either fast or copy.
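As with the backup request, the restore request can be generated and checked before it is applied. The following is a minimal Python sketch based only on the parameters described above; the helper name build_restore_request is hypothetical, and the name check compares against the source PVC only, since the sketch has no access to the cluster to look up other existing PVCs.

```python
import json
from datetime import datetime


def build_restore_request(request_name, namespace, pvc_name, target_volume,
                          storage_class, restore_type, restore_point=None):
    """Return a BaaSReq restore manifest as a dict (hypothetical helper)."""
    if request_name == pvc_name:
        # Partial check: the request name must not match an existing PVC.
        raise ValueError("request name must differ from the PVC name")
    if restore_type not in ("fast", "copy"):
        raise ValueError("restoretype must be 'fast' or 'copy'")
    spec = {
        "requesttype": "restore",
        "pvcname": pvc_name,
        "targetvolume": target_volume,
        "storageclass": storage_class,
        "restoretype": restore_type,
    }
    if restore_point is not None:
        # Raises ValueError if the UTC timestamp is not in the expected form.
        datetime.strptime(restore_point, "%Y-%m-%d %H:%M:%S")
        spec["restorepoint"] = restore_point
    # Without restorepoint, the most recent snapshot or copy backup is restored.
    return {
        "apiVersion": "baas.io/v1alpha1",
        "kind": "BaaSReq",
        "metadata": {"name": request_name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    request = build_restore_request(
        "restore-mysql-project-pvc", "mysql-ns", "mysql-project-pvc",
        "mysql-project-pvc-new", "storagecluster-storageclass",
        "copy", "2021-09-12 06:14:26",
    )
    print(json.dumps(request, indent=2))
```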
For more information on IBM Spectrum Protect Plus (SPP) Container Backup Support, please check the IBM documentation at https://www.ibm.com/docs/en/spp
References:
1. IBM Spectrum Scale Container Storage Interface Driver
https://www.ibm.com/docs/en/spectrum-scale-csi
2. IBM Spectrum Scale CSI Driver Source Code Repository
https://github.com/IBM/ibm-spectrum-scale-csi
3. IBM Spectrum Protect Plus (SPP) Container Backup Support
https://www.ibm.com/docs/en/spp