1. Introduction
To protect an environment from data loss and corruption, it is important to back it up, so that it can be rebuilt from the backup data if the primary environment fails to provide services when a disaster happens.
This document describes sample steps to back up a Business Automation Workflow environment in IBM Cloud Pak for Business Automation 22.0.2 and restore it to a different OpenShift environment. For more information, refer to the IBM Documentation.
2. About Environment
Red Hat OpenShift Data Foundation (ODF) is a persistent storage and cluster data management solution for OpenShift. It can be deployed either entirely within OpenShift Container Platform (internal approach) or so that it makes available the services of a cluster running outside of OpenShift Container Platform (external approach). See Red Hat OpenShift Data Foundation 4.10 - Storage cluster deployment approaches for details.
OADP (OpenShift APIs for Data Protection) is an operator that provides backup and restore APIs in the OpenShift cluster, such as Backup, Restore, Schedule, BackupStorageLocation, and VolumeSnapshotLocation. The OADP operator installs Velero, along with the OpenShift plugins for Velero, for backup and restore operations.
This blog uses OADP for the backup and restore.
2.1. ODF in External mode
In the ODF external deployment mode, the Ceph cluster (this blog uses a Ceph cluster as the external storage service) is located outside of OpenShift. ODF stores the PV data on the external Ceph cluster. When a snapshot is created for a PVC, the snapshot is generated in the external Ceph cluster too.
When restoring to a different OpenShift environment (the secondary OpenShift), the secondary OpenShift needs to access the volume snapshots and the backup data. Velero can help here: the Velero instance on the secondary OpenShift must point to the same locations as the primary OpenShift, i.e., configure the BackupStorageLocations (BSL) and VolumeSnapshotLocations (VSL) to point to the same locations used by the primary OpenShift. In the OADP configuration, use the same DataProtectionApplication (DPA) CR to configure the primary OpenShift and the secondary OpenShift. The diagram below demonstrates such a topology.
Please refer to the following documents to configure ODF, OADP, and the DPA; a sample DPA sketch follows the document list.
The document Deploy OpenShift Data Foundation using Red Hat Ceph storage describes how to install the OpenShift Data Foundation operator and then create an OpenShift Data Foundation cluster for an external Ceph storage system.
The document Installing and configuring the OpenShift API for Data Protection with OpenShift Data Foundation describes how to install the OpenShift API for Data Protection (OADP) with OpenShift Data Foundation by installing the OADP Operator and configuring a backup location and a snapshot location with DataProtectionApplication (DPA) CR.
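For illustration, below is a minimal DataProtectionApplication sketch based on those documents, to be applied identically on the primary and secondary OpenShift. The bucket name, S3 endpoint, and region are assumptions; adjust them to your environment. OADP derives the BackupStorageLocation and VolumeSnapshotLocation names from the DPA name, which is where the velero-sample-1 name used later in this blog comes from.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: velero-sample
  namespace: openshift-adp
spec:
  configuration:
    velero:
      defaultPlugins:
        - openshift
        - aws
        - csi
  backupLocations:
    - velero:
        provider: aws
        default: true
        credential:
          key: cloud
          name: cloud-credentials
        objectStorage:
          bucket: velero-backup      # assumption: a bucket reachable from both clusters
          prefix: velero
        config:
          region: local              # assumption
          s3ForcePathStyle: 'true'
          s3Url: http://ceph:80/     # assumption: the same S3 endpoint as the BTS example below
  snapshotLocations:
    - velero:
        provider: aws
        config:
          region: local              # assumption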
This blog discusses the backup & restore for ODF with the external approach.
2.2. ODF in Internal mode
In the ODF internal deployment mode, the storage service is located inside OpenShift. ODF stores the PV data on this internal storage service. When a snapshot is created for a PVC, the snapshot is generated in the internal storage service too.
When considering how to restore to a different OpenShift environment (the secondary OpenShift), the secondary OpenShift needs to get the backup data and a copy of the volume snapshots. For some storage services, a snapshot made by the primary OpenShift cannot be accessed by the secondary OpenShift; the following diagram demonstrates such a situation together with possible solutions.
The first solution: OADP provides the Data Mover capability, which can back up container storage interface (CSI) volume snapshots to a remote object store. Please note that Data Mover for CSI snapshots is currently a Technology Preview feature in OpenShift 4.10. The document Using Data Mover for CSI snapshots describes Data Mover in detail. The diagram below shows this solution.
Another possible solution is to use Velero with the Restic option; the document Backing up applications with Restic describes how to use it. The diagram below shows this topology. Please note that a Restic backup takes longer than a CSI volume snapshot backup.
If the storage service provides the capability to access the snapshot from a different OpenShift, the solution gets simpler. Please check the storage service documentation for details.
The topologies described here are general ideas; there may be other solutions for such situations.
2.3. Others
The deployment may differ between users' environments, so the backup and restore method for one environment may differ somewhat from others. The sections below use the following configuration and method for backup and restore:
- Use the offline backup method, i.e., stop the related pods before backup, so that no new data is generated during the backup and the backup data is in a consistent state.
- The deployment uses ODF as the storage class on OpenShift 4.10.
- The deployment uses dynamic provisioning in this sample.
- The deployment is the BAW enterprise pattern for CP4BA 22.0.2 with the Business Automation Workflow capability and Content Platform Engine.
- The deployment uses a single-instance PostgreSQL as the database server, and JDBC over SSL is not configured.
- In this blog, BTS data is backed up to S3-compatible storage.
3. General idea of backup and restore
3.1. Do not back up & restore the whole namespace
Do not use Velero to back up and restore the whole namespace; this is currently not workable for BAW.
3.2. Data to back up & restore
For BAW (Business Automation Workflow), please consider backing up and restoring the following data:
- Namespace definition
- Necessary PVC definitions
- Necessary contents on PVs
- Necessary secrets
- Databases
- CR files
The following are not backed up and restored; they need to be installed manually on the restore site:
- OpenShift platform
- IAF and IBM common services
3.3. UID and namespace
When a BAW pod writes and reads files in OpenShift, the file permissions depend on the UID, and the UID is associated with the BAW namespace. In general, different namespaces may have different UIDs, and the same namespace on different OpenShift clusters may have different UIDs too.
For example, the UID is 1000670000 for namespace “baw” on OpenShift A, but the UID is 1000640000 for the same namespace “baw” on OpenShift B. A file backed up with UID 1000670000 for namespace “baw” on OpenShift A cannot be read or written by namespace “baw” after restoring to OpenShift B, because of the different UID.
There are two possible methods to handle this problem:
- After the files are restored on OpenShift B, chown the files from the original UID 1000670000 to the required UID 1000640000. If the number of files is huge, the chown processing needs some time to complete.
- Copy the namespace definition from OpenShift A, then recreate the namespace with the same definition on OpenShift B before deploying everything. This is not time consuming compared to the first option, but some namespaces could end up with the same UID. This blog uses this method (see the sketch below).
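For reference, the UID range that OpenShift assigned to a namespace is recorded in annotations on the namespace object, so recreating the namespace with these annotations preserved is what keeps the UID the same. A minimal sketch (the UID values are illustrative):
apiVersion: v1
kind: Namespace
metadata:
  name: bawent
  annotations:
    # These annotations determine the UID range assigned to pods in the namespace
    openshift.io/sa.scc.uid-range: 1000670000/10000
    openshift.io/sa.scc.supplemental-groups: 1000670000/10000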
3.4. Backup and restore BTS
This blog backs up the BTS data to S3-compatible storage via the EDB capability; the target could be AWS S3, S3 storage provided by a Ceph cluster, or other compatible storage. For more information about BTS backup and restore, please check the BTS documentation. Below are sample steps:
- To access an S3 storage account, create a secret that contains the access key ID and access secret key, for example:
kubectl create secret generic s3-credentials \
  --from-literal=ACCESS_KEY_ID=<access key> \
  --from-literal=ACCESS_SECRET_KEY=<access secret key>
- Add the BTS backup section in the source BAW CR. For example:
spec:
  ...
  bts_configuration:
    template:
      backup:
        barmanObjectStore:
          endpointURL: http://ceph:80/
          destinationPath: s3://ocp/
          s3Credentials:
            accessKeyId:
              key: ACCESS_KEY_ID
              name: s3-credentials
            secretAccessKey:
              key: ACCESS_SECRET_KEY
              name: s3-credentials
After the BAW CR is deployed successfully, the BTS EDB PostgreSQL cluster includes this backup information. The cluster name can be retrieved with the command 'oc get cluster', and you can then check the detailed information for it.
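For example, with the cluster name used in this blog:
oc get cluster -n bawent
oc get cluster ibm-bts-cnpg-bawent-cp4ba-bts -n bawent -o yaml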
- Trigger the backup action. After the backup yaml file below is applied, the backup action is triggered. For example:
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Backup
metadata:
  name: bts-bawent-backup
spec:
  cluster:
    name: ibm-bts-cnpg-bawent-cp4ba-bts
Note: ibm-bts-cnpg-bawent-cp4ba-bts is the BTS EDB PostgreSQL cluster name; it can be retrieved with the command 'oc get cluster'.
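To verify that the backup finished, you can check the status of the EDB Backup resource (the fully qualified resource name avoids a clash with Velero's Backup kind); the phase should report completed:
oc get backups.postgresql.k8s.enterprisedb.io bts-bawent-backup -n bawent -o jsonpath='{.status.phase}'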
- Before restoring, add the recovery section in the BAW restore CR. For example:
spec:
  ...
  bts_configuration:
    template:
      recovery:
        barmanObjectStore:
          endpointURL: http://ceph:80/
          destinationPath: s3://ocp/
          s3Credentials:
            accessKeyId:
              key: ACCESS_KEY_ID
              name: s3-credentials
            secretAccessKey:
              key: ACCESS_SECRET_KEY
              name: s3-credentials
- After the CR is deployed successfully on the target OpenShift, the BTS data is restored automatically.
4. Backup on the primary environment
Assuming ODF, OADP, and the DPA have been installed and configured successfully, and the BAW enterprise pattern has been set up and configured correctly on the primary OpenShift, please reference the steps below to back up the primary environment.
4.1. Prepare
Please prepare the following:
- Back up the Cloud Pak custom resource (CR) file used to deploy the BAW environment.
- Back up the secret definitions associated with the BAW CR file, for example, the database usernames and passwords.
- Back up the namespace definition.
oc get namespace bawent -o yaml \
| yq eval 'del(.metadata.labels, .metadata.creationTimestamp, .metadata.ownerReferences, .metadata.resourceVersion, .metadata.uid, .status, .spec.finalizers)' > ns.yaml
- Label the PVs/PVCs/Secrets to be backed up.
oc label secret icp4adeploy-cpe-oidc-secret -n bawent to-be-backup=true
oc label secret admin-user-details -n bawent to-be-backup=true
oc label secret ibm-bts-cnpg-bawent-cp4ba-bts-app -n bawent to-be-backup=true
oc label pvc/cpe-filestore -n bawent to-be-backup=true
oc label pv `oc get pvc cpe-filestore -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
oc label pvc/icn-cfgstore -n bawent to-be-backup=true
oc label pv `oc get pvc icn-cfgstore -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
oc label pvc/datadir-zen-metastoredb-0 -n bawent to-be-backup=true
oc label pv `oc get pvc datadir-zen-metastoredb-0 -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
oc label pvc/datadir-zen-metastoredb-1 -n bawent to-be-backup=true
oc label pv `oc get pvc datadir-zen-metastoredb-1 -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
oc label pvc/datadir-zen-metastoredb-2 -n bawent to-be-backup=true
oc label pv `oc get pvc datadir-zen-metastoredb-2 -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
oc label pvc/icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0 -n bawent to-be-backup=true
oc label pv `oc get pvc icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0 -n bawent -o=custom-columns=pv:.spec.volumeName --no-headers=true` to-be-backup=true
This marks the following items to be backed up (a small labeling helper sketch follows the list):
Secrets:
- icp4adeploy-cpe-oidc-secret
- admin-user-details
- ibm-bts-cnpg-bawent-cp4ba-bts-app
PVC definitions and their corresponding PVs:
- cpe-filestore
- icn-cfgstore
- datadir-zen-metastoredb-0
- datadir-zen-metastoredb-1
- datadir-zen-metastoredb-2
- icp4adeploy-bawins1-baw-jms-data-vc-icp4adeploy-bawins1-baw-jms-0
- icp4adeploy-bawins1-baw-file-storage-pvc (Note: if some customizations were uploaded to the /opt/ibm/bawfile path for the BAW server, this PVC/PV needs to be backed up; otherwise, skip it.)
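Since the labeling commands above all follow the same pattern, a small shell helper can reduce repetition; below is a sketch (the function name is hypothetical):
# Hypothetical helper: label a PVC and its bound PV in one step
label_pvc_and_pv() {
  local pvc=$1 ns=$2
  oc label pvc "$pvc" -n "$ns" to-be-backup=true
  oc label pv "$(oc get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}')" to-be-backup=true
}
label_pvc_and_pv cpe-filestore bawent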
4.2. Backup (for ODF External Deployment mode)
- Stop the environment before the backup (this is an offline backup). See the commands below for an example; please note that “icp4adeploy” in the commands is the ICP4ACluster name defined in the CR file.
oc scale deploy ibm-cp4a-operator --replicas=0
for i in $(oc get deploy -o name | grep icp4adeploy); do oc scale $i --replicas=0; done
for i in $(oc get sts -o name | grep icp4adeploy); do oc scale $i --replicas=0; done
oc scale sts zen-metastoredb --replicas=0
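Before triggering the backup, you can confirm that the related pods have terminated, for example:
oc get pods -n bawent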
- Apply the backup yaml to back up the PVs/PVCs/secrets.
apiVersion: velero.io/v1
kind: Backup
metadata:
  labels:
    velero.io/storage-location: velero-sample-1
  name: bawent-backup-select
  namespace: openshift-adp
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToRestic: false
  includedNamespaces:
    - bawent
  labelSelector:
    matchLabels:
      to-be-backup: 'true'
  storageLocation: velero-sample-1
  ttl: 720h0m0s
  volumeSnapshotLocations:
    - velero-sample-1
Note: velero-sample-1 is the name of the BackupStorageLocation and VolumeSnapshotLocation, which are associated with the DataProtectionApplication definition named velero-sample.
Then check the status of this backup:
oc -n openshift-adp rsh deployment.apps/velero ./velero backup describe bawent-backup-select
After the backup completes successfully, the snapshots for the volumes are generated.
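For example, the CSI VolumeSnapshot objects created for the labeled PVCs can be listed in the application namespace (assuming CSI snapshots are used):
oc get volumesnapshot -n bawent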
- Back up BTS. Apply the backup yaml file mentioned above; BTS backs up its data into the S3-compatible storage.
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Backup
metadata:
  name: bts-bawent-backup
spec:
  cluster:
    name: ibm-bts-cnpg-bawent-cp4ba-bts
Note: ibm-bts-cnpg-bawent-cp4ba-bts is the BTS EDB PostgreSQL cluster name; it can be retrieved with the command 'oc get cluster'.
- Back up the databases; this can be done with database commands.
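For the single-instance PostgreSQL used in this blog, one option is a dump per database; below is a sketch (the host, user, and database name are assumptions, and the actual database list depends on your CR):
# Sketch only: dump one database in custom format; repeat for each database used by the deployment
pg_dump -h <postgres-host> -U <postgres-user> -Fc -f BAWDB.dump BAWDB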
4.3. Backup (for ODF Internal Deployment mode)
For the ODF internal deployment mode, if you plan to use the Restic option with Velero, the procedure is similar to the above; please note to set defaultVolumesToRestic: true in the backup yaml.
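For example, compared with the backup yaml in section 4.2, only this field changes:
spec:
  ...
  defaultVolumesToRestic: true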
5. Restore on a different environment
Assume that ODF, OADP, and the DPA have been installed and configured successfully.
5.1. Prepare
- Apply the backed-up namespace definition ns.yaml to create the namespace, so that the UIDs are the same.
- Install IAF/common services.
- Prepare the CR definition for the secondary environment.
- The hostname might differ between the source environment and the secondary environment. If so, change the hostnames in the CR file to match the secondary environment.
- If initialization was done in the original deployment, set “shared_configuration.sc_content_initialization: false” and “shared_configuration.sc_content_verification: false” to avoid rerunning initialization and verification, which would break the existing data in the databases (see the snippet after this list).
- If BAI is enabled, please remove the initialize_configuration section in the CR.
- Add the BTS recovery section if necessary.
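For example, the initialization settings mentioned above appear in the CR as follows:
spec:
  shared_configuration:
    sc_content_initialization: false
    sc_content_verification: false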
5.2. Restore (for ODF External Deployment mode)
Please reference the steps below to restore the environment on the secondary OpenShift server.
- Restore all the databases from the backup images.
- Restore the secrets associated with the BAW CR. To restore BTS, recreate the secret s3-credentials.
- Apply the restore yaml file to restore the secrets/PVs/PVCs.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: bawent-restore-select
  namespace: openshift-adp
spec:
  backupName: bawent-backup-select
  restorePVs: true
Note: bawent-backup-select is the name of the previous backup.
Then check the status of the restore to see whether it has completed:
oc -n openshift-adp rsh deployment.apps/velero ./velero restore describe bawent-restore-select
If the restore completed successfully, the backed-up PVs/PVCs/Secrets are now restored.
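Optionally, confirm that the labeled resources now exist on the secondary OpenShift, for example:
oc get pvc,secret -n bawent -l to-be-backup=true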
- Apply the CR to deploy the BAW environment.
5.3. Restore (for ODF Internal Deployment mode)
For the ODF internal deployment mode, if you plan to use the Restic option with Velero, the procedure is similar to the above.
6. Reference
[1] IBM Cloud Pak for Business Automation 22.0.2 - Backing up your environments
[2] IBM Cloud Pak for Business Automation 22.0.2 - Restoring your environments
[3] Business Teams Service - Backing up and restoring
[4] Red Hat OpenShift Data Foundation 4.10 - Planning your deployment
[5] Deploy OpenShift Data Foundation using Red Hat Ceph storage
[6] OpenShift Container Platform 4.10 - Backing up applications
[7] OpenShift Container Platform 4.10 - Restoring applications