IBM Storage Fusion Backup as Code

By JC Lopez posted Wed June 07, 2023 09:25 PM

Why Backup as Code?

From the developer's seat

With container orchestration platforms, the end user is expected to be able to create an application and the tooling that goes with it so that, upon deployment, not only does the application come online, but it is also natively integrated with the environment where it runs and can leverage all of the services provided by the platform at will. Such services should include backup and restore for stateful applications and should be shipped together with the application (e.g. via the application GitHub repository).

From the infrastructure team's seat

As mentioned above, container orchestration platforms are natively designed to be self-contained and self-service; it would therefore be ludicrous to expect the infrastructure team to grant access to a resource every time it is needed. In these environments, the application is expected to be granted access to the requested resources automatically, based on rules and fences created by the infrastructure team. Such rules and fences to safeguard the platform would include topics like:

  • Storage quotas (maximum number of bytes or files an application can use)
  • Compute resources (maximum amount of memory or CPU an application can consume)
  • Security (user ID and group assigned to applications together with specific Security Context Constraints)

However, the infrastructure team does not operate a single Red Hat OpenShift cluster, and it would be inconceivable to picture the deployment of all rules, fences, and services being performed one cluster at a time via a graphical user interface. This is where backup as code comes into play.

Not only can developers leverage the backup and restore services via custom resources and integrate these functionalities with the application code they are creating, but the infrastructure team can also configure those services in the same way, reaching the level of industrialization required in large environments.

Let's do this

Infrastructure configuration

The infrastructure team will be responsible for defining:

  • Where the application data and metadata being backed up is physically stored
  • How often the application will be backed up
  • How long the application backup will be kept

For this example we will use an external Amazon S3 bucket as an endpoint for what is known as the IBM Storage Fusion Backup Storage Location. To configure the BackupStorageLocation custom resource you will need the following information:

  • The type of S3 endpoint (IBM Storage Fusion offers multiple types, e.g. aws, cos, s3, ...)
  • The name of the S3 bucket where the backups will be stored
  • The credentials required to connect to the S3 endpoint
  • Some optional parameters depending on the type of S3 endpoint (e.g. the Region name for AWS)

Step 1 - Store your S3 credentials

cat <<EOF | oc create -f -
---
apiVersion: v1
kind: Secret
data:
  access-key-id: {insert_your_aws_access_key_id}
  secret-access-key: {insert_your_aws_secret_access_key}
metadata:
  name: bsl-myaws-secret
  namespace: ibm-spectrum-fusion-ns
type: Opaque
EOF
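
Note that values under data: in a Kubernetes Secret must be base64-encoded. A minimal sketch of producing the encoded values beforehand (the shell variable names are illustrative; -w0 disables line wrapping on GNU base64):

access_key_b64=$(echo -n "your_aws_access_key_id" | base64 -w0)
secret_key_b64=$(echo -n "your_aws_secret_access_key" | base64 -w0)

Alternatively, you can replace data: with stringData: in the Secret above and let the API server handle the encoding for you.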

Step 2 - Create the Backup Storage Location

cat <<EOF | oc create -f -
---
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: BackupStorageLocation
metadata:
  name: bsl-myaws-endpoint
  namespace: ibm-spectrum-fusion-ns
spec:
  type: aws
  credentialName:  bsl-myaws-secret
  provider: isf-backup-restore
  params:
    region: {region_name}
    bucket: {bucket_name}
    endpoint: https://{endpoint_fqdn}
EOF

Step 3 - Wait for the Backup Storage Location to come online

echo "Wating to wait for the BSL to come online"
while true
do
   bslstatus=$(oc get backupstoragelocations.data-protection.isf.ibm.com bsl-myaws-endpoint -n ibm-spectrum-fusion-ns -o jsonpath --template="{.status.phase}")
   if [ "x${bslstatus}" == "xConnected" ]
   then
      echo "Backup Storage Location Connected"
      break
   else
      echo "Backup Storage Location not Connected yet"
      sleep 1
   fi
done
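
With a recent enough oc client, a single oc wait call can replace the loop above; a sketch, assuming the client supports jsonpath wait conditions (available in kubectl 1.23 and later):

oc wait backupstoragelocations.data-protection.isf.ibm.com/bsl-myaws-endpoint \
   -n ibm-spectrum-fusion-ns \
   --for=jsonpath='{.status.phase}'=Connected \
   --timeout=300s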

Step 4 - Create external backup policy (using external S3 endpoint)

cat <<EOF | oc create -f -
---
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: BackupPolicy
metadata:
  name: cli-external
  namespace: ibm-spectrum-fusion-ns
spec:
  backupStorageLocation: bsl-myaws-endpoint
  provider: isf-backup-restore
  retention:
    number: 5
    unit: weeks
  schedule:
    cron: '00 6 * * 0'
    timezone: America/Los_Angeles
EOF

You can alter the schedule using standard cron syntax. The policy above schedules a backup every Sunday at 6:00 AM (America/Los_Angeles time). Such a backup policy can be used to restore the application to a different namespace or a different cluster, as the backup is kept outside of the original Red Hat OpenShift cluster.
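
For reference, the five cron fields are minute, hour, day of month, month, and day of week (0 = Sunday). A few illustrative variations:

00 6 * * 0     # every Sunday at 6:00 AM (the schedule above)
00 22 * * 1-5  # every weekday at 10:00 PM
30 2 1 * *     # 2:30 AM on the first day of every month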

Step 5 - Create internal backup policy (using namespace-scoped snapshots)

cat <<EOF | oc create -f -
---
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: BackupPolicy
metadata:
  name: cli-internal
  namespace: ibm-spectrum-fusion-ns
spec:
  backupStorageLocation: isf-dp-inplace-snapshot
  provider: isf-backup-restore
  retention:
    number: 7
    unit: days
  schedule:
    cron: '00 6 * * *'
    timezone: America/Los_Angeles
EOF

You can alter the schedule using standard cron syntax. The policy above schedules a backup every day at 6:00 AM.
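
Before moving on, you can list both policies to confirm they were created; a quick check, assuming the resource plural follows the usual Kubernetes convention:

oc get backuppolicies.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns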

Application configuration

The only thing that must be performed is to assign the correct backup policy to the application namespace.

Step 1 - Assign the external backup policy

cat <<EOF | oc create -f -
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: PolicyAssignment
metadata:
  name: file-uploader-rwx-cli-external
  namespace: ibm-spectrum-fusion-ns
spec:
  application: file-uploader-rwx
  backupPolicy: cli-external
  runNow: true
EOF

N.B.: The runNow: true parameter will cause a backup to start immediately upon the assignment of the backup policy to the application namespace.

Step 2 - Assign the internal backup policy

cat <<EOF | oc create -f -
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: PolicyAssignment
metadata:
  name: file-uploader-rwx-cli-internal
  namespace: ibm-spectrum-fusion-ns
spec:
  application: file-uploader-rwx
  backupPolicy: cli-internal
  runNow: false
EOF
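
The assignments themselves can be verified in the same way; again, the plural resource name is assumed to follow the usual Kubernetes convention:

oc get policyassignments.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns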

Step 3 - Check the status of your backup

$ oc get backup.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns -o 'custom-columns=NAME:.metadata.name,APP:.spec.application,POLICY:.spec.backupPolicy,STATUS:.status.phase'
NAME                                          APP                 POLICY         STATUS
file-uploader-rwx-cli-external-202306071953   file-uploader-rwx   cli-external   Completed

Application restore

Testing the restore of an application then becomes an easy set of tasks.

Step 1 - Delete the entire application namespace

$ oc delete project file-uploader-rwx
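
Namespace deletion is asynchronous, so the project may linger in a Terminating state for a while. Before initiating the restore, you can wait for it to disappear; a sketch:

oc wait --for=delete namespace/file-uploader-rwx --timeout=120s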

Step 2 - Initiate the restore

cat <<EOF | oc create -f -
---
apiVersion: data-protection.isf.ibm.com/v1alpha1
kind: Restore
metadata:
  name: file-uploader-rwx-restore-external
  namespace: ibm-spectrum-fusion-ns
spec:
  backup: $(oc get backup.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns | grep "cli-external" | awk '{ print $1 }')
EOF
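
If the policy has already produced several backups, the grep above returns multiple names, which would yield an invalid Restore spec. A sketch of selecting only the most recent backup instead, relying on the timestamp suffix sorting lexically:

oc get backup.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns --no-headers \
   | grep "cli-external" | awk '{ print $1 }' | sort | tail -1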

Step 3 - Check the restore status

$ oc get restore.data-protection.isf.ibm.com -n ibm-spectrum-fusion-ns -o 'custom-columns=NAME:.metadata.name,BACKUP:.spec.backup,STATUS:.status.phase'
NAME                                 BACKUP                                        STATUS
file-uploader-rwx-restore-external   file-uploader-rwx-cli-external-202306071953   Completed

Graphical User Interface Check

From the Red Hat OpenShift Console, jump to the IBM Storage Fusion UI using the top-bar selector.

Step 1 - Check the Backup Storage Location was created

Through the Fusion UI, verify the configured Backup Storage Locations

Step 2 - Check the Backup Policies were created

Through the Fusion UI, verify the configured Backup Policies

Step 3 - Check the backup and restore ran fine

Through the Fusion UI, verify the backup and restore jobs that ran and their status

Conclusion

IBM Storage Fusion allows you to enable application developers, providing them with all the tools required for stateful applications to easily meet the constraints of large organizations while leveraging all the data services offered by the deployed infrastructure.

IBM Storage Fusion liberates the infrastructure team from the daily provisioning tasks they face in traditional environments, freeing them to focus on bringing value around SLA monitoring and enforcement, best-practice design, platform deployment automation, and performance monitoring and troubleshooting.

Notes

For those interested, I used a sample application I created that relies on a customized container image. It is available from the following GitHub repository. The application is customized to be deployable on IBM Storage Fusion Software version 2.5.2. For documentation see here.

Going through the steps of this blog post requires:

  • IBM Storage Fusion Operator to be deployed
  • IBM Storage Fusion Data Foundation to be deployed
  • IBM Storage Fusion Backup and Restore to be deployed but not configured.

The application can eventually be modified to use a different storage class that provides:

  • CSI snapshot capabilities
  • RWX capabilities


Comments

Fri June 09, 2023 04:44 PM

The IBM Storage Fusion GUI also uses these same custom resources when you create things like backup policies and backup location via the GUI workflow.  Everything that the GUI does ultimately translates to CRs.  That means that if you've just started out learning how to use Storage Fusion, you can use the GUI to generate your YAML for you.  In fact, when you view a resource such as a backup policy or backup location in the GUI, you'll notice that there is a "Launch YAML" icon by the resource name.  Clicking the icon will launch a new tab that shows the YAML version of the resource.