File and Object Storage

IBM Spectrum Scale CSI Volume Snapshots and its usage by IBM Spectrum Protect Plus

By Madhu Punjabi posted Mon December 20, 2021 09:11 AM

  

Authors: Madhu Thorat, Wayne Sawdon


Overview

IBM’s recent release of its Container Storage Interface (CSI) driver for Spectrum Scale introduces the ability to take snapshots of your Kubernetes persistent volumes (PVs). A snapshot provides a point-in-time copy of a volume, which can be used to create a consistent, safeguarded backup of the data. When used with IBM Spectrum Protect Plus (SPP), each backup is incremental relative to the prior snapshot; that is, only the data that has changed since the last backup is copied to the new backup. This allows your data to be safeguarded more rapidly and stored more efficiently. Furthermore, once a backup has been created, the CSI driver allows that backup to be restored to its original volume for disaster recovery, or to a new volume for DevOps, testing, or offline analytics.

The volume snapshot feature was introduced with IBM Spectrum Scale CSI driver v2.2.0 and is included in the most recent release v2.4.0 of CSI driver for IBM Spectrum Scale. It requires IBM Spectrum Scale v5.1.1.0 or higher. Currently, volume snapshots are restricted to persistent volume claims (PVCs) based on Spectrum Scale’s independent filesets.

Let’s walk through the snapshot and restore functions to see how to use this feature when running stateful applications on Kubernetes or Red Hat OpenShift. We will also see how to use this feature with Spectrum Protect Plus to provide safeguarded backups for your data.

Deploying the CSI driver

To install v2.4.0 of the IBM Spectrum Scale CSI driver, follow the instructions at https://www.ibm.com/docs/en/spectrum-scale-csi?topic=240-planning and https://www.ibm.com/docs/en/spectrum-scale-csi?topic=240-installation

Let’s Get Started!

Let us first create a sample StorageClass and PersistentVolumeClaim, which will be referenced in the snapshot creation examples that follow.

Sample StorageClass and PersistentVolumeClaim (PVC)

Because snapshots can currently be created only from independent fileset-based PVCs, the example below shows a storage class for independent filesets. It assumes that the PVCs are to be created under the file system gpfs0. "8809226123290901444" is the cluster ID of the primary cluster.

# cat storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibm-spectrum-scale-csi-fileset-sc
provisioner: spectrumscale.csi.ibm.com
parameters:
  volBackendFs: gpfs0
  clusterId: "8809226123290901444"
reclaimPolicy: Delete

# cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-csi-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ibm-spectrum-scale-csi-fileset-sc

Note: All YAML files presented in this blog can be applied with kubectl apply -f <filename> on Kubernetes. Replace ‘kubectl’ with ‘oc’ for Red Hat OpenShift.
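To have some data in the volume before taking a snapshot, a pod can mount the PVC defined above and write a file into it. The following is a minimal sketch, not part of the driver documentation: the pod name, the busybox image, the mount path /data and the file name are assumptions chosen for illustration, and only the claimName refers to the PVC from this blog.

```yaml
# Hypothetical helper pod: writes one file into the sample PVC.
apiVersion: v1
kind: Pod
metadata:
  name: snapshot-demo-writer          # assumed name, pick your own
spec:
  containers:
  - name: writer
    image: busybox                    # assumed image; any image with a shell works
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data                # assumed mount path
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: ibm-spectrum-scale-csi-pvc   # the PVC created above
```

The pod must be created in the same namespace as the PVC. Once it is running, the volume contains /data/hello.txt, which any later snapshot of the PVC would be expected to capture.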

Creating a Volume Snapshot

Sample VolumeSnapshotClass

A VolumeSnapshotClass is similar to a StorageClass: it specifies the driver-specific attributes for the snapshots to be created. The VolumeSnapshotClass name is referenced later when creating a volume snapshot.

# cat volumesnapshotclass.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ibm-spectrum-scale-csi-snapshot-class
driver: spectrumscale.csi.ibm.com
deletionPolicy: Delete


Sample VolumeSnapshot

A VolumeSnapshot is a point-in-time copy of a volume's content on the storage system.

# cat volumesnapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ibm-spectrum-scale-vol-snapshot 
spec:
  volumeSnapshotClassName: ibm-spectrum-scale-csi-snapshot-class
  source:
    persistentVolumeClaimName: ibm-spectrum-scale-csi-pvc

Note: The source persistent volume claim (PVC) must be in the same namespace in which the volume snapshot will be created.

If everything went well, the snapshot is created and its READYTOUSE column shows true.

# kubectl get volumesnapshot
NAME                              READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                           SNAPSHOTCONTENT                                     CREATIONTIME   AGE
ibm-spectrum-scale-vol-snapshot   true         ibm-spectrum-scale-csi-pvc                           10Gi          ibm-spectrum-scale-csi-snapshot-class   snapcontent-8c768910-28d1-6a56-8e12-8495149095094   1d20h          1d20h


Creating a volume from a source snapshot

After the volume below has been created successfully, the resulting PVC “ibm-spectrum-scale-pvc-from-snap” should contain the data from the source snapshot “ibm-spectrum-scale-vol-snapshot”.

# cat pvcfromsnapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-pvc-from-snap
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  # The StorageClass defined earlier; any other StorageClass of this driver can be used instead
  storageClassName: ibm-spectrum-scale-csi-fileset-sc
  dataSource:
    name: ibm-spectrum-scale-vol-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Note 1: The source snapshot must be in the same namespace as the volume that is created.

Note 2: The capacity of the volume must be greater than or equal to the source snapshot's restore size.

Note 3: The new volume can be of any kind (for example, a lightweight volume or a dependent fileset-based volume).
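One way to check the restored data is to mount the new PVC in a pod and list its contents. The following is a minimal sketch under stated assumptions: the pod name, the busybox image and the mount path /restored are hypothetical, and only the claimName comes from the example above.

```yaml
# Hypothetical helper pod: lists the contents of the PVC restored from the snapshot.
apiVersion: v1
kind: Pod
metadata:
  name: snapshot-demo-reader          # assumed name, pick your own
spec:
  containers:
  - name: reader
    image: busybox                    # assumed image; any image with a shell works
    command: ["sh", "-c", "ls -l /restored && sleep 3600"]
    volumeMounts:
    - name: restored
      mountPath: /restored            # assumed mount path
  volumes:
  - name: restored
    persistentVolumeClaim:
      claimName: ibm-spectrum-scale-pvc-from-snap   # the PVC created from the snapshot
```

After the pod starts, kubectl logs snapshot-demo-reader (or the oc equivalent) shows the directory listing, which should match the content of the source volume at the time the snapshot was taken.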

Usage of Volume Snapshot feature by IBM Spectrum Protect Plus (SPP)

IBM Spectrum Protect Plus (SPP) provides Container Backup Support to protect volume data that was allocated by a storage plug-in that supports the Container Storage Interface (CSI) provided for Kubernetes or Red Hat OpenShift environments. You can perform snapshot backup operations to create locally stored backup copies in the cluster, or on storage external to the cluster.

For file system-based storage, file-based incremental copy backup and restore operations are performed. During incremental backups, only new and changed data is copied. For IBM Spectrum Scale, IBM Spectrum Protect Plus uses the snapshot capabilities of the IBM Spectrum Scale CSI driver for backup and restore operations. Container Backup Support was extended to include IBM Spectrum Scale CSI driver 2.2.0 starting with IBM Spectrum Protect Plus V10.1.8.

  

Backup and Restore

Container Backup Support provides multiple types of backup and restore functions for your cluster resources and persistent data. To protect data, you define container service level agreement (SLA) policies that determine how often snapshot backup and copy backup operations run and how long snapshots and copy backups are retained. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to take the snapshots from which a backup is created. Similarly, it uses the CSI driver to create the PVC into which data is restored.

To initiate backup and restore requests, you can use the IBM Spectrum Protect Plus (SPP) user interface or Kubernetes or OpenShift command line. To get more information about various backup and restore types check the IBM Spectrum Protect Plus v10.1.8 documentation at https://www.ibm.com/docs/en/spp/10.1.8?topic=overview-backup-restore-types

  

Example of command line backup: Scheduling backups of persistent volume claims

The following YAML file describes a BaaS request to back up a persistent volume claim. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to take the snapshots for the backup.

# cat backup-pvc.yaml
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq                 
metadata:
  name: mysql-project-pvc
  namespace: mysql-ns
spec:
  requesttype: backup
  sla: ["weekdays", "weekends"]
  volumesnapshotclass: storagecluster-snapclass

The parameters in the above backup request are described below:

name: Name of the request.
Important: Here the name must be identical to the PVC name that is to be backed up.
namespace: The namespace in which the PVC exists.
requesttype: The type of request that is submitted (here it is backup).
sla: [sla_policy]: Specifies the SLA policy that determines the schedule, retention, and snapshot-prefix for backup operations. More than one SLA policy can be specified by using a comma-separated list within the brackets. Ensure that you use the correct case when you specify the SLA policy name. Note, the policy names are case-sensitive in YAML files.
volumesnapshotclass: Specifies the snapshot class for the volume. If no snapshot class is specified, the default snapshot class is used if the sidecar container csi-snapshotter in the default snapshot class matches the provisioner of the volume. Otherwise, the backup request is invalid.
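The volumesnapshotclass description above refers to a default snapshot class. In Kubernetes, a VolumeSnapshotClass is conventionally marked as the default with the snapshot.storage.kubernetes.io/is-default-class annotation. The sketch below marks the snapshot class from this blog as default and then submits a backup request that omits volumesnapshotclass. Whether SPP resolves its default through this annotation is an assumption here, not something this article confirms, and the single SLA name is only illustrative.

```yaml
# Mark the snapshot class from this blog as the cluster default
# (standard Kubernetes annotation; its use by SPP is an assumption).
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ibm-spectrum-scale-csi-snapshot-class
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: spectrumscale.csi.ibm.com
deletionPolicy: Delete
---
# Backup request without an explicit volumesnapshotclass,
# relying on the default snapshot class described above.
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq
metadata:
  name: mysql-project-pvc        # must be identical to the PVC name
  namespace: mysql-ns
spec:
  requesttype: backup
  sla: ["weekdays"]              # illustrative; use an SLA policy that exists in SPP
```

Specifying volumesnapshotclass explicitly, as in the example request, avoids depending on which class is the default in a given cluster.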

  

Example of command line restore: Restoring persistent data

Before restoring persistent data, determine the available restore points by running the following command (replace kubectl with oc on Red Hat OpenShift):

kubectl describe baasreq <pvc_name> -n <namespace>

The following example shows sample output of the command. The available restore points are identified by the timestamp of the snapshot or copy backup.

# oc describe baasreq mysql-project-pvc -n mysql-ns
Name: mysql-project-pvc
Namespace: mysql-ns
...
   Origreqtype: backup
   Requesttype: backup
   Size: 1073741824
   Sla:
      ESCC-OCP
   Spppvcname: OCP: mysql-ns:mysql-project-pvc
   Volumesnapshotclass: storagecluster-snapclass
Status:
   Laststatusupdate: 2021-09-12 12:43:22
   Snapshotname: escc-ocp-1027-2116-1797668d557
   Timestamp: 2021-09-12 12:14:25
   Type: FAST
   Snapshotname: escc-ocp-1027-2116-17973d5a5bf
   Timestamp: 2021-09-12 06:14:26
   Type: FAST

Note: Here, the type FAST indicates that the restore point is a snapshot that was taken during a snapshot backup operation.

The following YAML file shows a restore request for data to be restored from a snapshot backup. On a storage cluster running IBM Spectrum Scale, IBM Spectrum Protect Plus uses the IBM Spectrum Scale CSI driver to create the PVC into which the data is restored.

# cat restore-pvc.yaml
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq
metadata:
   name: restore-mysql-project-pvc
   namespace: mysql-ns
spec:
   requesttype: restore
   pvcname: mysql-project-pvc
   targetvolume: mysql-project-pvc-new
   storageclass: storagecluster-storageclass
   restorepoint: 2021-09-12 06:14:26
   restoretype: copy

The parameters in the above restore request are described below:

name: The name of the request for the restore job. The name must be unique and must not match the name of an existing PVC. A restore request must be created for each subsequent restore of the same PVC. That is, to restore a PVC again, create a request and specify a different request name in the YAML file.
namespace: The namespace for the request.
Note: The CLI allows restores to the original namespace only.
requesttype: The type of request that is submitted (here it is restore).
pvcname: The original name of the PVC to be restored.
targetvolume: The name of the (new) restore target PVC.
storageclass: The storageclass that is to be used for the new PVC.
Restriction: Although storageclass of the restore target can be different from the storageclass of the original PVC, the new storageclass must be of the same Provisioner type.
restorepoint: Specifies the timestamp of the source snapshot or copy backup that is to be restored. The timestamp is in Coordinated Universal Time (UTC) format. If no timestamp is specified, the most recent snapshot or copy backup is restored.
restoretype: The type of restore operation, either fast or copy.
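As the restorepoint description states, omitting the timestamp restores the most recent snapshot or copy backup. A minimal sketch of such a request follows. It reuses the fields from the restore example; the request name and target volume name are hypothetical, and the remaining values are simply copied from that example.

```yaml
# Restore request without restorepoint: per the description above,
# the most recent snapshot or copy backup is restored.
apiVersion: "baas.io/v1alpha1"
kind: BaaSReq
metadata:
  name: restore-latest-mysql-project-pvc    # hypothetical; must be unique and not match an existing PVC
  namespace: mysql-ns
spec:
  requesttype: restore
  pvcname: mysql-project-pvc
  targetvolume: mysql-project-pvc-latest    # hypothetical name for the new target PVC
  storageclass: storagecluster-storageclass
  restoretype: copy                         # same value as the example above; fast is the other accepted value
```

Because a request name cannot be reused for a later restore of the same PVC, each subsequent restore needs a new value for metadata.name.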

For more information on IBM Spectrum Protect Plus (SPP) Container Backup Support, please check the IBM documentation at https://www.ibm.com/docs/en/spp
