
Fusion Recipe Tips - How to undo an operation in case of a failure in the Recipe workflow

By Sandeep Prajapati posted Fri April 05, 2024 11:35 AM

  

Introduction

In a previous blog, we looked at a special case of the Fusion Recipe exec hook: running K8s or oc commands during data protection workflows. In this article, we will look at another special feature of the exec hook, which lets us undo certain operations when a backup or restore workflow fails.

Background

Like any other application or resource in a container environment, backup and restore operations can fail, and a failure can disrupt the normal functioning of user applications. Sometimes these failures are severe and lead to application downtime.

Let’s assume you have a MongoDB database that is locked during the backup snapshot operation to ensure an application-consistent snapshot, and the snapshot operation fails for some reason. In this scenario, the Fusion backup job is reported as failed, and the database is left locked, blocking write operations.

Fusion Recipe has a built-in capability to undo a previous operation when a failure occurs, provided that the inverseOp field is specified for that operation.
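
The undo-on-failure idea can be illustrated with a small Python sketch. This is a conceptual model only, not Fusion's implementation: the run_workflow function, the in-memory state dictionary, and the choice to run undo operations in reverse order of completion are all assumptions made for illustration.

```python
# Conceptual model only: this is NOT how Fusion is implemented.
# Each step is (name, action, inverse); inverse may be None.

def run_workflow(steps):
    """Run steps in order; on failure, run the inverse of each completed step."""
    completed = []
    log = []
    for name, action, inverse in steps:
        try:
            action()
            log.append(f"ok: {name}")
            completed.append((name, inverse))
        except Exception as exc:
            log.append(f"failed: {name} ({exc})")
            # Assumption: undo in reverse order of completion.
            for done_name, done_inverse in reversed(completed):
                if done_inverse is not None:
                    done_inverse()
                    log.append(f"undo: {done_name}")
            return False, log
    return True, log


# Hypothetical stand-in for the database lock state.
state = {"locked": False}

def fsync_lock():
    state["locked"] = True

def fsync_unlock():
    state["locked"] = False

def failing_snapshot():
    raise RuntimeError("snapshot failed")

steps = [
    ("fsyncLock", fsync_lock, fsync_unlock),   # lock, with unlock as its inverse
    ("snapshot", failing_snapshot, None),      # simulated snapshot failure
    ("fsyncUnlock", fsync_unlock, None),       # never reached in this run
]

succeeded, log = run_workflow(steps)
print(succeeded, state["locked"])
for line in log:
    print(line)
```

In this run the workflow is reported as failed, but the simulated database ends up unlocked because the inverse of the lock step was executed after the failure.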

How to undo an operation in case of a failure in the Fusion Recipe workflow

For the MongoDB scenario described above, the initial Fusion Recipe to back up the application is:

...
  hooks:
  - name: mongodb-pod-exec
    labelSelector: app=mongodb
    timeout: 300
    namespace: ${GROUP.mongodb-resources.namespace}
    onError: fail
    ops:
      - command: >
          ["/bin/bash", "-c", "mongosh -u `printenv MONGO_INITDB_ROOT_USERNAME` -p `printenv MONGO_INITDB_ROOT_PASSWORD` --eval \"db.fsyncLock()\""]
        container: mongodb
        timeout: 300
        name: fsyncLock
        onError: fail
      - command: >
          ["/bin/bash", "-c", "mongosh -u `printenv MONGO_INITDB_ROOT_USERNAME` -p `printenv MONGO_INITDB_ROOT_PASSWORD` --eval \"db.fsyncUnlock()\""]
        container: mongodb
        timeout: 300
        name: fsyncUnlock
        onError: fail
    selectResource: pod
    type: exec
  workflows:
  - name: backup
    sequence:
    ...
    - hook: mongodb-pod-exec/fsyncLock
    - group: mongodb-volumes
    - hook: mongodb-pod-exec/fsyncUnlock

In the snippet above, the database is locked before the snapshot of the volumes and unlocked afterward so that the application can use it again. But what happens if the snapshot operation fails? The Recipe execution is aborted before the fsyncUnlock hook runs, so the database stays locked and the application suffers downtime. To address this, we can specify an undo operation in the inverseOp field of the Fusion Recipe.

To ensure that the database does not stay locked when a failure occurs, we set inverseOp on the fsyncLock operation to the name of the operation that reverses it, fsyncUnlock:

...
  hooks:
  - name: mongodb-pod-exec
    labelSelector: app=mongodb
    timeout: 300
    namespace: ${GROUP.mongodb-resources.namespace}
    onError: fail
    ops:
      - command: >
          ["/bin/bash", "-c", "mongosh -u `printenv MONGO_INITDB_ROOT_USERNAME` -p `printenv MONGO_INITDB_ROOT_PASSWORD` --eval \"db.fsyncLock()\""]
        container: mongodb
        timeout: 300
        name: fsyncLock
        onError: fail
        inverseOp: fsyncUnlock
      - command: >
          ["/bin/bash", "-c", "mongosh -u `printenv MONGO_INITDB_ROOT_USERNAME` -p `printenv MONGO_INITDB_ROOT_PASSWORD` --eval \"db.fsyncUnlock()\""]
        container: mongodb
        timeout: 300
        name: fsyncUnlock
        onError: fail
    selectResource: pod
    type: exec
  workflows:
  - name: backup
    sequence:
    ...
    - hook: mongodb-pod-exec/fsyncLock
    - group: mongodb-volumes
    - hook: mongodb-pod-exec/fsyncUnlock 
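
Because inverseOp refers to another operation by name, a typo in that name is easy to miss. The following Python sketch shows one way to sanity-check a hook before applying a Recipe. It is a hypothetical helper, not part of Fusion: it assumes that inverseOp must name an op defined in the same hook, as in the snippet above, and it represents the hook as a plain dictionary with the commands omitted for brevity.

```python
# Hypothetical sanity check, not part of Fusion. The hook is a plain dict
# mirroring the YAML above (command, container, timeout fields omitted).

def unresolved_inverse_ops(hook):
    """Return (op name, inverseOp) pairs whose inverseOp names no op in the hook."""
    ops = hook.get("ops", [])
    names = {op["name"] for op in ops}
    return [
        (op["name"], op["inverseOp"])
        for op in ops
        if "inverseOp" in op and op["inverseOp"] not in names
    ]


hook = {
    "name": "mongodb-pod-exec",
    "ops": [
        {"name": "fsyncLock", "inverseOp": "fsyncUnlock"},
        {"name": "fsyncUnlock"},
    ],
}

# Same hook, but with a deliberate typo in the inverseOp value.
typo_hook = {
    "name": "mongodb-pod-exec",
    "ops": [
        {"name": "fsyncLock", "inverseOp": "fsyncUnlok"},
        {"name": "fsyncUnlock"},
    ],
}

problems = unresolved_inverse_ops(hook)
typo_problems = unresolved_inverse_ops(typo_hook)
print(problems)       # empty list: every inverseOp resolves
print(typo_problems)  # lists the op whose inverseOp does not resolve
```

An empty result means every inverseOp points at an op defined in the same hook; any pair in the result identifies an op whose undo operation could not be resolved by name.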

Conclusion

In this article, we have seen how to use the Fusion Recipe inverseOp capability to keep an application usable when a data protection workflow fails. In the next blog, we will explore other aspects of Fusion Recipes.
