Sample of Backup and Restore Business Automation Workflow Environment for CP4BA 22.0.2 on Red Hat OpenShift Data Foundation

By DIAN GUO ZOU posted Wed February 08, 2023 02:07 AM

  

1. Introduction

To protect the environment from data loss and corruption, it is important to back up the environment, so that it can be rebuilt from the backup data if the primary environment fails to provide services when a disaster happens.

This document describes sample steps to back up a Business Automation Workflow environment in IBM Cloud Pak for Business Automation 22.0.2 and restore it to a different OpenShift environment. For more information, see the IBM documentation.

2. About Environment

Red Hat OpenShift Data Foundation (ODF) is a persistent storage and cluster data management solution for OpenShift. It can be deployed either entirely within OpenShift Container Platform (internal approach) or so that it makes available the services of a cluster running outside of OpenShift Container Platform (external approach). See Red Hat OpenShift Data Foundation 4.10 - Storage cluster deployment approaches for details.

OADP (OpenShift APIs for Data Protection) is an operator that provides backup and restore APIs in the OpenShift cluster, such as Backup, Restore, Schedule, BackupStorageLocation, and VolumeSnapshotLocation. The OADP operator installs Velero, together with the OpenShift plugins for Velero, for backup and restore operations.

This blog uses OADP for the backup and restore.

2.1. ODF in External mode

In ODF external deployment mode, the Ceph cluster (this blog uses a Ceph cluster as the external storage service) is located outside of OpenShift. ODF stores the PV data on the external Ceph cluster. When a snapshot is created for a PVC, the snapshot is also generated in the external Ceph cluster.

When restoring to a different OpenShift environment (the secondary OpenShift), the secondary OpenShift needs access to the volume snapshots and the backup data. Velero can help here: the Velero instance on the secondary OpenShift must point to the same locations as the primary OpenShift, i.e., configure BackupStorageLocations (BSL) and VolumeSnapshotLocations (VSL) to point to the same locations used by the primary OpenShift. In the OADP configuration, use the same DataProtectionApplication (DPA) CR on the primary and secondary OpenShift. The diagram below demonstrates this topology.
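For illustration, a minimal DPA sketch that both clusters could share might look as follows. The bucket name velero, the credential secret cloud-credentials, and the endpoint http://ceph:80/ are assumptions for this sketch, not values taken from a real environment; OADP derives the generated BSL/VSL names (such as velero-sample-1) from the DPA name.

```yaml
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: velero-sample
  namespace: openshift-adp
spec:
  configuration:
    velero:
      defaultPlugins:
        - openshift
        - aws
        - csi
  backupLocations:
    - velero:
        provider: aws
        default: true
        credential:
          name: cloud-credentials   # assumed secret holding the S3 keys
          key: cloud
        objectStorage:
          bucket: velero            # assumed bucket on the shared object store
          prefix: backups
        config:
          region: default
          s3ForcePathStyle: "true"
          s3Url: http://ceph:80/    # assumed shared Ceph RGW endpoint
  snapshotLocations:
    - velero:
        provider: aws
        config:
          region: default
```

Applying the identical DPA on both clusters is what lets the secondary Velero instance see the backups and snapshots produced by the primary.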

Please refer to the following documents to configure ODF, OADP, and the DPA.

The document Deploy OpenShift Data Foundation using Red Hat Ceph storage describes how to install the OpenShift Data Foundation operator and then create an OpenShift Data Foundation cluster for an external Ceph storage system.

The document Installing and configuring the OpenShift API for Data Protection with OpenShift Data Foundation describes how to install the OpenShift API for Data Protection (OADP) with OpenShift Data Foundation by installing the OADP Operator and configuring a backup location and a snapshot location with DataProtectionApplication (DPA) CR.

This blog discusses backup and restore for ODF with the external approach.

2.2. ODF in Internal mode

In ODF internal deployment mode, the storage service is located inside OpenShift. ODF stores the PV data on this internal storage service. When a snapshot is created for a PVC, the snapshot is also generated in the internal storage service.

When restoring to a different OpenShift environment (the secondary OpenShift), the secondary OpenShift needs the backup data and a copy of the volume snapshots. For some storage services, a snapshot made by the primary OpenShift cannot be accessed by the secondary OpenShift; the following diagrams demonstrate this situation with possible solutions.

The first solution uses the data mover capability provided by OADP, which can back up Container Storage Interface (CSI) volume snapshots to a remote object store. Please note that Data Mover for CSI snapshots is currently a Technology Preview feature in OpenShift 4.10. The document Using Data Mover for CSI snapshots describes the data mover in detail. The diagram below illustrates this solution.

Another possible solution is to use Velero with the Restic option; the document Backing up applications with Restic describes how to use it. The diagram below illustrates this topology. Please note that a Restic backup takes longer than a CSI volume snapshot backup.

If the storage service can provide access to the snapshots from a different OpenShift cluster, the solution becomes simpler. Please check the storage service documentation for details.

The topologies described here illustrate the general idea; there may be other solutions for such situations.

2.3. Others

The deployment may differ between environments, so the backup and restore method may also differ. The sections below use the following configuration and method for backup and restore:

  • Use the offline backup method, i.e., stop the related pods before the backup, so that no new data is generated during the backup and the backup data is in a consistent state.
  • The deployment uses ODF as the storage class on OpenShift 4.10.
  • The deployment uses dynamic provisioning in this sample.
  • The deployment is the BAW enterprise pattern for CP4BA 22.0.2 with the Business Automation Workflow capability and Content Platform Engine.
  • The deployment uses a single-instance PostgreSQL database server, and JDBC over SSL is not configured.
  • In this blog, the BTS data is backed up to S3-compatible storage.

3. General idea to backup and restore

3.1. Do not back up & restore the whole namespace

Do not use Velero to back up and restore the whole namespace; this is currently not workable for BAW.

3.2. Data to back up & restore

For BAW (Business Automation Workflow), please consider backing up and restoring the following data:

  • Namespace definition
  • Necessary PVC definition
  • Necessary Contents on PV
  • Necessary Secrets
  • Databases
  • CR files

The following are not backed up and restored; they need to be installed manually on the restore site:

  • OpenShift platform
  • IAF and IBM common services

3.3. UID and namespace

When a BAW pod writes and reads files in OpenShift, the file permissions depend on the UID, and the UID is associated with the BAW namespace. In general, different namespaces may have different UIDs, and the same namespace on different OpenShift clusters may have different UIDs too.

For example, the UID is 1000670000 for namespace “baw” on OpenShift A, but 1000640000 for the same namespace “baw” on OpenShift B. A file backed up with UID 1000670000 for namespace “baw” on OpenShift A cannot be read or written by namespace “baw” after it is restored to OpenShift B, because of the different UID.

There are two possible methods to handle such problem:

  • After the files are restored on OpenShift B, chown them from the original UID 1000670000 to the required UID 1000640000. If the number of files is huge, the chown processing takes some time to complete.
  • Copy the namespace definition from OpenShift A, then recreate the namespace with the same definition on OpenShift B before deploying anything. This costs no extra time compared to the first option, but note that different namespaces could possibly end up with the same UID. This blog uses this method.
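For the first option, the UID to chown to can be derived from the namespace's openshift.io/sa.scc.uid-range annotation. Below is a minimal sketch of that approach; the namespace bawent and the mount path /opt/restored-data are assumptions for illustration only.

```shell
# first_uid extracts the starting UID from an OpenShift SCC uid-range
# annotation value such as "1000640000/10000" (start UID / range size).
first_uid() {
  echo "${1%%/*}"
}

# On a live cluster, the annotation could be read with:
#   range=$(oc get ns bawent -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}')
range="1000640000/10000"
uid=$(first_uid "$range")

# The chown itself must run where the PV is mounted, e.g. inside an oc debug pod:
echo "chown -R ${uid}:0 /opt/restored-data"
```

The second option (recreating the namespace from the saved definition) avoids this chown pass entirely, which is why this blog uses it.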

3.4. Backup and restore BTS

This blog backs up the BTS data to S3-compatible storage via the EDB capability; this can be AWS S3 storage, S3 storage provided by a Ceph cluster, or other compatible storage. For more information about BTS backup and restore, please check the BTS documentation. Below are sample steps:

  1. To access an S3 storage account, create a secret that contains the access key ID and secret access key, for example:
    kubectl create secret generic s3-credentials \
      --from-literal=ACCESS_KEY_ID=<access key> \
      --from-literal=ACCESS_SECRET_KEY=<access secret key>
  2. Add the BTS backup section in the source BAW CR. For example:
    spec:
      ...
      bts_configuration:
        template:
          backup:
            barmanObjectStore:
              endpointURL: http://ceph:80/
              destinationPath: s3://ocp/
              s3Credentials:
                accessKeyId:
                  key: ACCESS_KEY_ID
                  name: s3-credentials
                secretAccessKey:
                  key: ACCESS_SECRET_KEY
                  name: s3-credentials
    

After the BAW CR is deployed successfully, the BTS EDB PostgreSQL cluster includes this backup information. The cluster name can be retrieved with the command oc get cluster, and you can then check its details.

  3. Trigger the backup action. After applying the backup yaml file, the backup action is triggered. For example:
    apiVersion: postgresql.k8s.enterprisedb.io/v1
    kind: Backup
    metadata:
      name: bts-bawent-backup
    spec:
      cluster:
        name: ibm-bts-cnpg-bawent-cp4ba-bts
    
    Note: ibm-bts-cnpg-bawent-cp4ba-bts is the BTS EDB PostgreSQL cluster name; it can be retrieved with the command 'oc get cluster'.
  4. Before restoring, add the recovery section in the BAW restore CR. For example:
    spec:
      ...
      bts_configuration:
        template:
          recovery:
            barmanObjectStore:
              endpointURL: http://ceph:80/
              destinationPath: s3://ocp/
              s3Credentials:
                accessKeyId:
                  key: ACCESS_KEY_ID
                  name: s3-credentials
                secretAccessKey:
                  key: ACCESS_SECRET_KEY
                  name: s3-credentials
    
  5. After the CR is deployed successfully on the target OpenShift, the BTS data is restored automatically.
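As a quick sanity check before relying on the backup, the status of the EDB Backup resource can be inspected. This is a sketch; the resource name matches the sample Backup above, and the wrapper function is hypothetical.

```shell
# bts_backup_phase reads the phase of an EDB Backup resource;
# "completed" indicates the backup data reached the object store.
bts_backup_phase() {
  oc get backup.postgresql.k8s.enterprisedb.io "$1" \
    -o jsonpath='{.status.phase}'
}

# Example (resource name from the sample above):
#   bts_backup_phase bts-bawent-backup
```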

4. Backup on the primary environment

Assuming ODF, OADP, and the DPA have been installed and configured successfully, and the BAW enterprise pattern has been set up and configured correctly on the primary OpenShift, follow the steps below to back up the primary environment.

4.1. Prepare

Please prepare the following:

  1. Back up the Cloud Pak custom resource (CR) file used to deploy the BAW environment.
  2. Back up the secret definitions associated with the BAW CR file, for example, the database username and password.
  3. Back up the namespace definition.
    oc get namespace bawent -o yaml \
          | yq eval 'del(.metadata.labels, .metadata.creationTimestamp, .metadata.ownerReferences, .metadata.resourceVersion, .metadata.uid, .status, .spec.finalizers)' > ns.yaml
  4. Label the PVs/PVCs/Secrets to be backed up.
    oc label secret icp4adeploy-cpe-oidc-secret  -n bawent   to-be-backup=true
    oc label secret admin-user-details  -n bawent   to-be-backup=true
    oc label secret ibm-bts-cnpg-bawent-cp4ba-bts-app  -n bawent   to-be-backup=true
    
    oc label pvc/cpe-filestore   -n bawent   to-be-backup=true
    oc label pv `oc get pvc cpe-filestore -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
    
    oc label pvc/icn-cfgstore   -n bawent   to-be-backup=true
    oc label pv `oc get pvc icn-cfgstore -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
    
    oc label pvc/datadir-zen-metastoredb-0   -n bawent   to-be-backup=true
    oc label pv `oc get pvc datadir-zen-metastoredb-0 -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
    oc label pvc/datadir-zen-metastoredb-1   -n bawent   to-be-backup=true
    oc label pv `oc get pvc datadir-zen-metastoredb-1 -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
    oc label pvc/datadir-zen-metastoredb-2   -n bawent   to-be-backup=true
    oc label pv `oc get pvc datadir-zen-metastoredb-2 -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
    
    oc label pvc/icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0   -n bawent   to-be-backup=true
    oc label pv `oc get pvc icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0 -n bawent   -o=custom-columns=pv:.spec.volumeName  --no-headers=true`   to-be-backup=true
The commands above mark the following items for backup:
Secrets:
    • icp4adeploy-cpe-oidc-secret
    • admin-user-details
    • ibm-bts-cnpg-bawent-cp4ba-bts-app
PVC definitions and their corresponding PVs:
    • cpe-filestore
    • icn-cfgstore
    • datadir-zen-metastoredb-0
    • datadir-zen-metastoredb-1
    • datadir-zen-metastoredb-2
    • icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0
    • icp4adeploy-bawins1-baw-file-storage-pvc (Note: if customizations were uploaded to the /opt/ibm/bawfile path of the BAW server, this PVC/PV needs to be backed up; otherwise, skip it.)
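The repeated pvc/pv label pairs above can be condensed into a small helper. This is a sketch; the helper name is hypothetical, and the namespace and PVC names are the ones used in this blog's sample environment.

```shell
# label_pvc_and_pv labels a PVC and its bound PV in one step, so that
# the label selector in the Velero Backup picks both up.
label_pvc_and_pv() {
  ns="$1"; pvc="$2"
  oc label "pvc/${pvc}" -n "${ns}" to-be-backup=true
  pv=$(oc get pvc "${pvc}" -n "${ns}" -o jsonpath='{.spec.volumeName}')
  oc label "pv/${pv}" to-be-backup=true
}

# Example (namespace "bawent", PVC names from this sample):
#   for p in cpe-filestore icn-cfgstore \
#            datadir-zen-metastoredb-0 datadir-zen-metastoredb-1 datadir-zen-metastoredb-2 \
#            icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0; do
#     label_pvc_and_pv bawent "$p"
#   done
```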

4.2. Backup (for ODF External Deployment mode):

  1. Stop the environment before the backup (this is an offline backup). See the commands below for an example; please note that “icp4adeploy” in the commands is the ICP4ACluster name defined in the CR file.
    oc scale deploy ibm-cp4a-operator --replicas=0
    for i in `oc get deploy -o name | grep icp4adeploy`; do oc scale $i --replicas=0; done
    for i in `oc get sts -o name | grep icp4adeploy`; do oc scale $i --replicas=0; done
    oc scale sts zen-metastoredb --replicas=0
  2. Apply the backup yaml to back up the PVs/PVCs/secrets.
    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      labels:
        velero.io/storage-location: velero-sample-1
      name: bawent-backup-select
      namespace: openshift-adp
    spec:
      csiSnapshotTimeout: 10m0s
      defaultVolumesToRestic: false
      includedNamespaces:
        - bawent
      labelSelector:
        matchLabels:
          to-be-backup: 'true'
      storageLocation: velero-sample-1
      ttl: 720h0m0s
      volumeSnapshotLocations:
        - velero-sample-1
    
    Note: velero-sample-1 is the name of BackupStorageLocation and VolumeSnapshotLocation, which are associated with the DataProtectionApplication definition named velero-sample.
    Then check the status for this backup:
    oc -n openshift-adp rsh  deployment.apps/velero ./velero backup describe bawent-backup-select
    After the backup completes successfully, the snapshots for the volumes are generated.
  3. Back up BTS. Apply the backup yaml file mentioned above to back up the BTS data to the S3-compatible storage.
    apiVersion: postgresql.k8s.enterprisedb.io/v1
    kind: Backup
    metadata:
      name: bts-bawent-backup
    spec:
      cluster:
        name: ibm-bts-cnpg-bawent-cp4ba-bts
    
    Note: ibm-bts-cnpg-bawent-cp4ba-bts is the BTS EDB PostgreSQL cluster name; it can be retrieved with the command 'oc get cluster'.
    
  4. Back up the databases; this can be done with database commands.
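The status check in step 2 can also be wrapped in a small wait loop before moving on to the later steps. This is a sketch; it reads the Backup resource's status.phase, which Velero sets to Completed, Failed, or PartiallyFailed.

```shell
# wait_for_backup polls a Velero Backup in openshift-adp until it reaches
# a terminal phase, then reports the result.
wait_for_backup() {
  backup="$1"
  while true; do
    phase=$(oc -n openshift-adp get backup "${backup}" -o jsonpath='{.status.phase}')
    case "${phase}" in
      Completed)              echo "backup ${backup} completed"; return 0 ;;
      Failed|PartiallyFailed) echo "backup ${backup} ended as ${phase}" >&2; return 1 ;;
      *)                      sleep 10 ;;
    esac
  done
}

# Example (backup name from the sample above):
#   wait_for_backup bawent-backup-select
```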

4.3. Backup (for ODF Internal Deployment mode):

If it is the ODF internal deployment mode and you plan to use the Restic option with Velero, the procedure is similar to the above; please note to set defaultVolumesToRestic: true in the backup yaml.
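Two settings are involved when switching to Restic (a sketch; field names as in OADP 1.x): Restic must be enabled in the DPA, and the Backup CR must default volumes to Restic.

```yaml
# In the DataProtectionApplication CR: enable the Restic daemonset.
spec:
  configuration:
    restic:
      enable: true
---
# In the Backup CR: back up volume data with Restic instead of CSI snapshots.
spec:
  defaultVolumesToRestic: true
```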

5. Restore on the different environment

It is assumed that ODF, OADP, and the DPA have been installed and configured successfully.

5.1. Prepare

  1. Apply the backed-up namespace definition ns.yaml to create the namespace, so that the UIDs are the same.
  2. Install IAF/common services.
  3. Prepare the CR definition for the secondary environment.
  • The hostname might differ between the source environment and the secondary environment. If so, change the hostnames in the CR file to match the secondary environment.
  • If initialization was done in the original deployment, set “shared_configuration.sc_content_initialization: false” and “shared_configuration.sc_content_verification: false” to avoid redoing initialization and verification, which would break the existing data in the database.
  • If BAI is enabled, please remove the initialize_configuration section in the CR.
  • Add the BTS recovery section if necessary.
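For example, the relevant fragment of the restore CR could look like the following sketch (option names as used under shared_configuration in CP4BA 22.0.2):

```yaml
spec:
  shared_configuration:
    # Content stores were already initialized and verified on the source,
    # so skip both on the restore site to avoid breaking the restored data.
    sc_content_initialization: false
    sc_content_verification: false
```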

5.2. Restore (for ODF External Deployment mode)

Please follow the steps below to restore the environment on the secondary OpenShift cluster.

  1. Restore all the databases from the backup images.
  2. Restore secrets associated with BAW CR. To restore BTS, recreate the secret s3-credentials.
  3. Apply the restore yaml file to restore the secrets/PVs/PVCs.
    apiVersion: velero.io/v1
    kind: Restore
    metadata:
      name: bawent-restore-select
      namespace: openshift-adp
    spec:
      backupName: bawent-backup-select
      restorePVs: true 
    
    Note: bawent-backup-select is the name of the previous backup.
    Then check the status of the restore to see whether it has completed:
    oc -n openshift-adp rsh  deployment.apps/velero ./velero restore describe bawent-restore-select

    If the restore completed successfully, the backed-up PVs/PVCs/Secrets are now restored.

  4. Apply the CR to deploy the BAW environment.

5.3. Restore (for ODF Internal Deployment mode)

If it is the ODF internal deployment mode and you plan to use the Restic option with Velero, the procedure is similar to the above.

6. Reference


[1] IBM Cloud Pak for Business Automation 22.0.2 - Backing up your environments

[2] IBM Cloud Pak for Business Automation 22.0.2 - Restoring your environments

[3] Business Teams Service - Backing up and restoring

[4] Red Hat OpenShift Data Foundation 4.10 - Planning your deployment

[5] Deploy OpenShift Data Foundation using Red Hat Ceph storage

[6] OpenShift Container Platform 4.10 - Backing up applications

[7] OpenShift Container Platform 4.10 - Restoring applications

 

 
