
How to use IBM Storage Fusion to Backup and Restore Red Hat Quay Managed on an OCP cluster

By Ke Zhao Li posted Wed January 03, 2024 12:52 PM

  

By Ke Zhao Li(likezhao@cn.ibm.com), Dan Dan Wang(ddwang@cn.ibm.com)

Note: This document is intended for Proof of Concept (POC) purposes only. Contact the Red Hat Quay and Fusion teams before you put it into production.

Architecture overview

  Here is the architecture overview of the Red Hat Quay backup and restore solution built on IBM Storage Fusion capabilities.

1. Red Hat Quay is an enterprise-quality container registry. Use Red Hat Quay to build and store container images, then make them available to deploy across your enterprise. The Red Hat Quay Operator provides a simple method to deploy and manage Red Hat Quay on an OpenShift cluster.

2. IBM Storage Fusion Backup & Restore service is a recommended backup and restore solution for enterprise container applications. This blog is a reference practice for backing up the Red Hat Quay application. Fusion uses local snapshots for quick recovery, or transfers backups to external object storage for safekeeping. Refer to the Fusion doc for details.

Deploy IBM Storage Fusion

  1. Follow the Fusion doc to deploy Fusion.
  2. Follow the Data Foundation doc to enable the Data Foundation service. Fusion Data Foundation (FDF) provides the PVs for the PostgreSQL database and the OBC for object storage.
  3. Follow the Backup Restore doc to enable the backup services: the “Backup Restore” service (BR Hub) in the primary cluster and the “Backup Restore Agent” service (BR Spoke) in the secondary cluster.

Configure IBM Fusion Data Foundation

Configure Fusion Backup & Restore service

  1. Connect the BR Spoke via “Connect cluster”. Here is the screenshot of the connection details in the BR Hub cluster.

  2. Here is the Backup & Restore Agent service in the secondary site (BR Spoke).

Deploy Red Hat Quay operator in primary cluster

      1. Use the OCP console or follow the Quay doc to deploy the Quay Operator.

      2. The default installation mode is “All namespaces on the cluster”. With this option, all resources are deployed in the “openshift-operators” namespace.

      3. Create a Quay Registry instance. You can use any registry name; this blog uses dev-registry. Leave the other options at their defaults. To use Fusion Data Foundation storage, make sure the ocs-storagecluster-ceph-rbd storage class is set as the default storage class.

      4. Initialize Red Hat Quay: create users and repositories, and upload images.
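
Step 3 depends on ocs-storagecluster-ceph-rbd being the default storage class. As a minimal sketch, assuming `oc` is already logged in to the primary cluster with sufficient rights, the standard Kubernetes default-class annotation can be set from the command line. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: mark a storage class as the cluster default.
# Assumption: `oc` is logged in with rights to patch storage classes.

set_default_storage_class() {
    sc_name="$1"
    # storageclass.kubernetes.io/is-default-class is the standard
    # Kubernetes annotation that marks the default storage class.
    oc patch storageclass "$sc_name" \
        -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}

# Usage:
#   set_default_storage_class ocs-storagecluster-ceph-rbd
#   oc get storageclass    # the default class is shown with "(default)"
```

If another storage class is already marked as default, set its annotation to "false" in the same way so that only one default remains.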

Chapter 2: Using Storage Fusion to Backup Quay database

Create Backup Location

Create Backup Policy

Create quay application CR

  Note: This step is optional. It is required only when Quay is deployed in the “openshift-operators” namespace.

  1. Create the application CR with the YAML below.

apiVersion: application.isf.ibm.com/v1alpha1
kind: Application
metadata:
  annotations:
    dp.isf.ibm.com/provider-name: isf-backup-restore
  name: quay
  namespace: ibm-spectrum-fusion-ns
spec:
  includedNamespaces:
  - openshift-operators

      2. View the quay application in the Fusion UI.
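
The application CR can also be applied and checked from the command line. This is a sketch under assumptions: the manifest file name is hypothetical, and the plural resource name `applications.application.isf.ibm.com` is inferred from the `apiVersion` and `kind` in the YAML above rather than taken from the Fusion documentation.

```shell
#!/bin/sh
# Sketch: apply the Application CR and confirm it exists.
# Assumption: the resource plural is inferred from the CR's apiVersion/kind.

apply_quay_application() {
    manifest="$1"   # hypothetical file holding the YAML shown above
    oc apply -f "$manifest" &&
        oc get applications.application.isf.ibm.com quay -n ibm-spectrum-fusion-ns
}

# Usage:
#   apply_quay_application quay-application.yaml
```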

Customize secret label and create CR

    1. Label the dev-registry-config-bundle secret.

    a) If the cluster has a single Quay registry, the command below labels this secret automatically.

oc label secrets -n openshift-operators $(oc get quayregistries.quay.redhat.com -o jsonpath="{.items[0].spec.configBundleSecret}{'\n'}"  -n openshift-operators) fusion-backup=true

     b) If multiple Quay registries are deployed, make sure the correct secret is labeled.

oc get quayregistries.quay.redhat.com -n openshift-operators dev-registry -o yaml

spec:
  …
  configBundleSecret: dev-registry-config-bundle-fl876    <- - here

oc label secrets -n openshift-operators dev-registry-config-bundle-fl876 fusion-backup=true

       2. Create the Quay recipe CR. Make sure the quay-secrets group has the correct labelSelector. Refer to the Orchestrate a backup or restore doc for more details on recipes.

apiVersion: spp-data-protection.isf.ibm.com/v1alpha1
kind: Recipe
metadata:
  name: quay-demo-recipe
  namespace: openshift-operators
spec:
  appType: quay-demo
  groups:
    - name: quay-volumes
      type: volume
      includedNamespaces:
        - openshift-operators
      labelSelector: quay-component=postgres
    - name: quay-secrets
      type: resource
      includedNamespaces:
        - openshift-operators
      includeClusterResources: true
      includedResourceTypes:
        - secrets
      labelSelector: fusion-backup=true
    - name: quay-resources
      type: resource
      includedNamespaces:
        - openshift-operators
      includeClusterResources: true
      includedResourceTypes:
        - subscriptions.operators.coreos.com
      excludedResourceTypes:
        - objectbuckets.objectbucket.io
        - objectbucketclaims.objectbucket.io
        - objectbucketclaim.objectbucket.io
        - routes.route.openshift.io
    - name: quay-cr
      type: resource
      includedNamespaces:
        - openshift-operators
      includeClusterResources: true
      includedResourceTypes:
        - quayregistries.quay.redhat.com
  workflows:
  - name: backup
    sequence:
    - group: quay-volumes
    - group: quay-secrets
    - group: quay-resources
    - group: quay-cr
  - name: restore
    sequence:
    - group: quay-volumes
    - group: quay-secrets
    - group: quay-resources
    - group: quay-cr
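
Before running a backup, it can help to preview what the recipe's selectors would pick up. The sketch below lists the objects that carry the labels used by the quay-volumes and quay-secrets groups, plus the QuayRegistry CRs captured by quay-cr. It assumes the volume group's labelSelector is evaluated against PVC labels; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: preview the objects the recipe groups are expected to select.
# Assumption: the volume group's labelSelector matches PVC labels.

preview_recipe_selection() {
    ns="${1:-openshift-operators}"
    echo "PVCs for quay-volumes (quay-component=postgres):"
    oc get pvc -n "$ns" -l quay-component=postgres
    echo "Secrets for quay-secrets (fusion-backup=true):"
    oc get secrets -n "$ns" -l fusion-backup=true
    echo "QuayRegistry CRs for quay-cr:"
    oc get quayregistries.quay.redhat.com -n "$ns"
}

# Usage:
#   preview_recipe_selection openshift-operators
```

An empty secrets list here usually means the labeling step was skipped or applied to the wrong secret.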

Assign the backup policy

      1. Assign the backup policy to the quay application. Do NOT select “Run backup now”.

       2. Update the PolicyAssignment CR in ibm-spectrum-fusion-ns. Add the recipe content below under spec.recipe.

spec:
  application: quay
  backupPolicy: daily-minio
  recipe:
    apiVersion: spp-data-protection.isf.ibm.com/v1alpha1
    name: quay-demo-recipe
    namespace: openshift-operators
  runNow: false
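
The same change can be made with a merge patch instead of editing the CR by hand. This is a sketch under assumptions: the assignment name is a placeholder that must be looked up first, and `policyassignment` is assumed to resolve as a resource name for the PolicyAssignment kind on the cluster.

```shell
#!/bin/sh
# Sketch: add spec.recipe to an existing PolicyAssignment with a merge patch.
# Assumption: `policyassignment` resolves to the PolicyAssignment CRD.

attach_recipe_to_assignment() {
    assignment="$1"   # placeholder; find it with:
                      #   oc get policyassignment -n ibm-spectrum-fusion-ns
    oc patch policyassignment "$assignment" -n ibm-spectrum-fusion-ns --type merge \
        -p '{"spec":{"recipe":{"apiVersion":"spp-data-protection.isf.ibm.com/v1alpha1","name":"quay-demo-recipe","namespace":"openshift-operators"}}}'
}

# Usage:
#   attach_recipe_to_assignment <assignment-name>
```

A merge patch leaves the other spec fields (application, backupPolicy, runNow) untouched.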

Backup the Quay application manually

     1. Click “Action” -> “Backup Now”, then wait for the backup to complete.

Chapter 3: Backup and Restore Quay object storage

  Fusion relies on the storage vendor or third-party tools to back up and restore the Quay object storage. ODF (FDF) storage can use NooBaa to mirror data across hybrid and multicloud buckets. Third-party tools (aws s3 sync, rclone, mc, etc.) can be used with any type of object storage. In this blog, I demonstrate the steps with the aws s3 sync tool.

Backup Quay object storage

      1. Log in to the primary site with oc login.

      2. Run the commands below to get the S3 credentials.

export NS=openshift-operators 
export AWS_ACCESS_KEY_ID=$(oc get secret -l app=noobaa -n $NS -o jsonpath='{.items[0].data.AWS_ACCESS_KEY_ID}' |base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc get secret -l app=noobaa -n $NS -o jsonpath='{.items[0].data.AWS_SECRET_ACCESS_KEY}' |base64 -d)
export BUCKET_NAME=$(oc get cm -l app=noobaa -n $NS -o jsonpath='{.items[0].data.BUCKET_NAME}')
export S3_URL=$(oc get route s3 -n openshift-storage  -o jsonpath='{.spec.host}')
echo $AWS_ACCESS_KEY_ID
echo $AWS_SECRET_ACCESS_KEY
echo $BUCKET_NAME
echo $S3_URL

      3. Create a directory and copy all blobs to it by entering the following command:

mkdir blobs
aws s3 sync --no-verify-ssl --endpoint https://$S3_URL s3://$BUCKET_NAME ./blobs
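
A quick sanity check around the sync can catch empty credentials or an incomplete copy. The sketch below assumes the variables exported in step 2 and compares the number of local files with the number of objects in the bucket. All function names are hypothetical, and `--endpoint-url` is the documented long form of the endpoint option.

```shell
#!/bin/sh
# Sketch: pre-check the variables and compare local vs. bucket object counts.

# Fail early if any named variable is unset or empty.
require_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            return 1
        fi
    done
}

# Count regular files under a local directory.
count_local_blobs() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Count objects in the bucket (one output line per object).
count_bucket_objects() {
    aws s3 ls --recursive --no-verify-ssl \
        --endpoint-url "https://$S3_URL" "s3://$BUCKET_NAME" | wc -l | tr -d ' '
}

# Usage:
#   require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY BUCKET_NAME S3_URL || exit 1
#   [ "$(count_local_blobs ./blobs)" = "$(count_bucket_objects)" ] && echo "counts match"
```

Matching counts do not prove the contents are identical, but a mismatch is a clear sign that the sync should be run again.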

Restore Quay object storage in secondary OCP cluster

      1. Log in to the secondary site with oc login.

      2. Create the openshift-operators namespace if it does not exist.

oc adm new-project openshift-operators

      3. Create the OBC in the secondary cluster. The OBC name must be the same as on the primary site. Use the OCP console or the YAML below to create the OBC.

apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: dev-registry-quay-datastore
  namespace: openshift-operators
spec:
  additionalConfig:
    bucketclass: noobaa-default-bucket-class
  bucketName: ""
  generateBucketName: quay-datastore
  objectBucketName: obc-openshift-operators-dev-registry-quay-datastore
  storageClassName: openshift-storage.noobaa.io

      4. Run the commands below to get the S3 credentials.

export NS=openshift-operators 
export AWS_ACCESS_KEY_ID=$(oc get secret -l app=noobaa -n $NS -o jsonpath='{.items[0].data.AWS_ACCESS_KEY_ID}' |base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc get secret -l app=noobaa -n $NS -o jsonpath='{.items[0].data.AWS_SECRET_ACCESS_KEY}' |base64 -d)
export BUCKET_NAME=$(oc get cm -l app=noobaa -n $NS -o jsonpath='{.items[0].data.BUCKET_NAME}')
export S3_URL=$(oc get route s3 -n openshift-storage  -o jsonpath='{.spec.host}')
echo $AWS_ACCESS_KEY_ID
echo $AWS_SECRET_ACCESS_KEY
echo $BUCKET_NAME
echo $S3_URL

      5. Upload all blobs to the bucket by running the following command:

aws s3 sync --no-verify-ssl --endpoint https://$S3_URL ./blobs s3://$BUCKET_NAME
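
The credential lookup in step 4 only works once the OBC from step 3 is bound, because the matching secret and config map are created at that point. A minimal polling sketch, assuming the OBC reports its state in `status.phase`; the function name and the `OBC_POLL_SECONDS` variable are hypothetical.

```shell
#!/bin/sh
# Sketch: wait until an ObjectBucketClaim reports phase "Bound".
# Assumption: the OBC exposes its state in .status.phase.

wait_for_obc_bound() {
    obc="$1"
    ns="$2"
    tries="${3:-30}"
    while [ "$tries" -gt 0 ]; do
        phase=$(oc get obc "$obc" -n "$ns" -o jsonpath='{.status.phase}' 2>/dev/null)
        if [ "$phase" = "Bound" ]; then
            echo "OBC $obc is Bound"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${OBC_POLL_SECONDS:-10}"
    done
    echo "OBC $obc did not reach Bound" >&2
    return 1
}

# Usage (between steps 3 and 4):
#   wait_for_obc_bound dev-registry-quay-datastore openshift-operators
```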

Chapter 4: Using Storage Fusion to restore Quay application

      1. In the Fusion UI, click “Restore” to launch the restore wizard for the quay application.

      2. Select the restore point.

      3. Wait for the restore job to complete.

Chapter 5: Verify Red Hat Quay

Verify all pods are running.

      1. Wait until all the pods are running, then log in to the Quay registry.

      2. If the registry-quay-database pod is in CrashLoopBackOff, see the FAQ.

[f12@mc32 quay]$ oc get pod -n openshift-operators
NAME                                          READY   STATUS      RESTARTS        AGE
dev-registry-clair-app-b74cd6d9f-54fc9        1/1     Running     6 (8m46s ago)   11m
dev-registry-clair-app-b74cd6d9f-7n78b        1/1     Running     6 (8m6s ago)    11m
dev-registry-clair-app-b74cd6d9f-d5242        1/1     Running     0               5m44s
dev-registry-clair-app-b74cd6d9f-gthcg        1/1     Running     0               5m44s
dev-registry-clair-app-b74cd6d9f-s829h        1/1     Running     0               5m55s
dev-registry-clair-postgres-f86dbc974-nz66k   1/1     Running     0               8m45s
dev-registry-quay-app-6df4d68c6b-4mjb2        1/1     Running     0               3m39s
dev-registry-quay-app-6df4d68c6b-mvhvf        1/1     Running     0               3m53s
dev-registry-quay-app-upgrade-j7rnv           0/1     Completed   3               11m
dev-registry-quay-database-6bcf44498c-grpcn   1/1     Running     6 (8m35s ago)   11m
dev-registry-quay-mirror-7bb9bc8d45-9n9d7     1/1     Running     0               11m
dev-registry-quay-mirror-7bb9bc8d45-grjks     1/1     Running     0               11m
dev-registry-quay-redis-6899c477f8-nqzjw      1/1     Running     0               11m
quay-operator.v3.10.1-797c7c76b5-xfjfw        1/1     Running     0               31m
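
Rather than scanning the listing by eye, the pods that still need attention can be filtered out. The sketch below treats a pod as healthy when it is Completed, or Running with all containers ready; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print pods that are neither Completed nor fully-ready Running.
# Columns of `oc get pod --no-headers`: NAME READY STATUS RESTARTS AGE.

list_unhealthy_pods() {
    ns="${1:-openshift-operators}"
    oc get pod -n "$ns" --no-headers |
        awk '{ split($2, r, "/") }
             $3 == "Completed" { next }
             $3 != "Running" || r[1] != r[2] { print $1, $2, $3 }'
}

# Usage:
#   list_unhealthy_pods openshift-operators
```

Empty output means every pod is either Completed or Running and ready.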

Verify Quay images and users/repos are restored.

        1. Log in to Red Hat Quay with the previous user and password. Verify that the Quay images, users, and repositories are restored.
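
A command-line check can complement the UI login. The sketch below rests on two assumptions that are not from this blog: that the QuayRegistry CR publishes its URL in `status.registryEndpoint`, and that Quay serves a health endpoint at `/health/instance`. Function names are hypothetical.

```shell
#!/bin/sh
# Sketch: look up the registry URL and probe its health endpoint.
# Assumptions: .status.registryEndpoint and /health/instance are available.

quay_endpoint() {
    oc get quayregistries.quay.redhat.com "$1" -n "$2" \
        -o jsonpath='{.status.registryEndpoint}'
}

check_quay_health() {
    endpoint=$(quay_endpoint "$1" "$2") || return 1
    [ -n "$endpoint" ] || return 1
    # -k skips TLS verification for self-signed routes; drop it with trusted certs.
    curl -ks -o /dev/null -w '%{http_code}\n' "$endpoint/health/instance"
}

# Usage:
#   check_quay_health dev-registry openshift-operators
```

An HTTP status of 200 suggests the registry is serving requests; pulling a previously pushed image is still the strongest confirmation that the blobs were restored.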

FAQ

The registry-quay-database pod is in CrashLoopBackOff

      1. The dev-registry-quay-database pod is in CrashLoopBackOff with the error message “chmod: changing permissions of '/var/lib/pgsql/data/userdata': Operation not permitted”.

      2. This is a known issue that is mentioned in the Red Hat doc.

[f12@mc32 quay]$ oc get pod -n openshift-operators
NAME                                           READY   STATUS             RESTARTS       AGE
…
dev-registry-quay-database-765d96bd45-wgph7    0/1     CrashLoopBackOff   5 (71s ago)    4m22s
dev-registry-quay-mirror-5ff66bb4b7-4rldf      0/1     Init:0/1           1 (114s ago)   4m10s
dev-registry-quay-mirror-5ff66bb4b7-bjkpl      0/1     Init:0/1           1 (2m9s ago)   4m22s
…
[f12@mc32 quay]$ oc logs -n openshift-operators dev-registry-quay-database-765d96bd45-wgph7
chmod: changing permissions of '/var/lib/pgsql/data/userdata': Operation not permitted

  Resolution:

      1. Use the command below to find the PV name of the failed Quay database.

oc get pvc -n openshift-operators -l "quay-operator/quayregistry=dev-registry" | grep quay | awk -F " " '{print $3}'

      Example output

[f12@mc32 quay]$ oc get pvc -n openshift-operators -l "quay-operator/quayregistry=dev-registry" | grep quay | awk -F " " '{print $3}'
pvc-42f48c49-9818-4f83-aa8f-9c541d973fdf

      2. Use the commands below to find the name of the node to which this PV is attached.

POD_NAME=$(oc get pod -n openshift-operators -o wide | grep registry-quay-database | awk '{print $1}')
NODE_NAME=$(oc get pod -n openshift-operators "$POD_NAME" -o jsonpath='{.spec.nodeName}')
echo "$NODE_NAME"

      3. Use the oc debug command to log in to the worker node.

[f12@mc32 quay]$ oc debug node/$NODE_NAME
Warning: would violate PodSecurity "restricted:v1.24": host namespaces (hostNetwork=true, hostPID=true, hostIPC=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/f12-7vng5-worker-1-mrfm9-debug-xg8fz ...
To use host binaries, run `chroot /host`
Pod IP: 10.1.4.105
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# 

      4. Find the userdata folder.

sh-5.1# df -h | grep pvc-42f48c49-9818-4f83-aa8f-9c541d973fdf
/dev/rbd0        49G   56M   49G   1% /var/lib/kubelet/pods/9c376b23-d3a5-41f8-89a9-ceacce061e14/volumes/kubernetes.io~csi/pvc-42f48c49-9818-4f83-aa8f-9c541d973fdf/mount
sh-5.1# cd /var/lib/kubelet/pods/9c376b23-d3a5-41f8-89a9-ceacce061e14/volumes/kubernetes.io~csi/pvc-42f48c49-9818-4f83-aa8f-9c541d973fdf/mount

sh-5.1# ls -ltra
total 24
drwxrws---.  2 root       1000720000 16384 Dec 26 10:51 lost+found
drwxrws---. 20 1000410000 1000720000  4096 Dec 28 00:00 userdata
drwxrwsr-x.  4 root       1000720000  4096 Dec 29 11:42 .
drwxr-x---.  3 root       root          40 Dec 29 12:02 ..
sh-5.1#

      5. Use chown to update the owner of the userdata folder, then exit debug mode.

sh-5.1# chown 1000720000:1000720000 userdata
sh-5.1# ls -ltra
total 24
drwxrws---.  2 root       1000720000 16384 Dec 26 10:51 lost+found
drwxrws---. 20 1000720000 1000720000  4096 Dec 28 00:00 userdata
drwxrwsr-x.  4 root       1000720000  4096 Dec 29 11:42 .
drwxr-x---.  3 root       root          40 Dec 29 12:02 ..
sh-5.1# exit
exit
sh-4.4# exit
exit

Removing debug pod ...
[f12@mc32 quay]$

      6. Restart the pod. It then moves to the “Running” state.

dev-registry-quay-database-6bcf44498c-grpcn   1/1     Running                 6 (4m31s ago)   7m41s
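
One way to restart the pod is to delete it and let its controller recreate it. The sketch below reuses the name match from step 2 of the resolution. The function name is hypothetical, and it assumes only one Quay database pod exists in the namespace.

```shell
#!/bin/sh
# Sketch: restart the Quay database pod by deleting it.
# Assumption: a single pod matches "registry-quay-database" in the namespace.

restart_quay_database_pod() {
    ns="${1:-openshift-operators}"
    pod=$(oc get pod -n "$ns" -o name | grep registry-quay-database | head -n 1)
    if [ -z "$pod" ]; then
        echo "no quay database pod found in $ns" >&2
        return 1
    fi
    # $pod has the form pod/<name>, which `oc delete` accepts directly.
    oc delete "$pod" -n "$ns"
}

# Usage:
#   restart_quay_database_pod openshift-operators
#   oc get pod -n openshift-operators -w    # watch the replacement start
```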

