IBM Fusion

Fusion Recipe: Parallel workflow execution

By Sandeep Prajapati posted yesterday

  

Why does parallel execution matter in application backup?

Application backups can involve one or many steps, depending on the complexity of the application. When an application requires several sequential steps during backup, the whole process often becomes time-consuming. What if some of these steps could run simultaneously? It may not be feasible to run every step in parallel, but logically related steps can often be grouped and executed together. IBM Cloud Paks are a good example of complex applications where preparing data for backup snapshots tends to take a long time. To make the backup process more efficient, these bottlenecks need to be addressed.

IBM Fusion has recently taken a big step in this direction by introducing parallel execution in Backup recipe hook sequences. This enhancement is designed to streamline backup workflows and reduce overall execution time. In this blog post, we’ll explore how this new capability works and how it can be effectively used within the Fusion Recipe Framework.

How does it work in Fusion?

In a Fusion Backup and Restore recipe, a workflow is a related set of sequences (or steps) that defines a specific execution path. The Fusion Recipe framework allows users to define multiple workflows.

In addition, each workflow can set an execution priority, which allows dynamic sequencing. Workflows that share the same priority often perform similar operations at a given step, making them ideal candidates for parallel execution.

Fusion’s parallel execution capability leverages this knowledge to run same-priority workflows simultaneously, thereby improving efficiency and reducing overall execution time. However, this implementation comes with a few important conditions that must be met to ensure safe and consistent execution.

Here are the key conditions that govern how workflows are executed:

     (1) All Hooks, Same Priority: Parallel Execution

Workflows with the same priority will be executed in parallel only if all sequences within them are hooks.

     (2) Group Sequence Present: Sequential Execution

If any workflow with the same priority contains at least one group sequence, all such workflows will be executed sequentially.

     (3) Different Priorities: Sequential Execution

Workflows with different priorities are always executed sequentially, based on their priority order.

     (4) Mixed Priorities: Hybrid Execution

Consider a shared workflow defined in three recipes: two with the same priority and one with a different priority. In this case, the two workflows with identical priorities will run in parallel, while the third (with a different priority) will be executed sequentially - either before or after, depending on its relative priority.

This can be visualized as follows:

Fig. 1 - Parallel execution in Fusion Recipe
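The four conditions above can be sketched as a small planning function. This is only an illustration of the rules as described in this post, not Fusion's implementation: the Workflow class, the plan_execution function, and the higher_first switch are hypothetical names, and whether a higher or a lower priority number runs first is an assumption, because the post does not state it.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Workflow:
    recipe: str           # recipe that defines the workflow, e.g. "Parent"
    priority: int         # execution priority of the workflow
    sequence_types: list  # one entry per sequence: "hook" or "group"


def plan_execution(workflows, higher_first=True):
    """Return an ordered list of (mode, [recipe names]) batches.

    higher_first is an assumption made for this sketch; flip it if lower
    priority numbers are supposed to run first.
    """
    ordered = sorted(workflows, key=lambda w: w.priority, reverse=higher_first)
    plan = []
    for _, same_priority in groupby(ordered, key=lambda w: w.priority):
        batch = list(same_priority)
        # Conditions (1) and (2): parallel only if every sequence is a hook.
        all_hooks = all(t == "hook" for w in batch for t in w.sequence_types)
        mode = "parallel" if len(batch) > 1 and all_hooks else "sequential"
        plan.append((mode, [w.recipe for w in batch]))
    # Conditions (3) and (4): batches themselves always run one after another.
    return plan


if __name__ == "__main__":
    demo = [
        Workflow("Parent", 100, ["hook", "hook"]),
        Workflow("Child1", 100, ["hook"]),
        Workflow("Child2", 50, ["hook", "group"]),
    ]
    for mode, recipes in plan_execution(demo):
        print(mode, recipes)
```

In the demo, Parent and Child1 share a priority and contain only hooks, so they form one parallel batch, while Child2 has a different priority and is planned as its own sequential batch.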

Key Notes on Parallel Execution in Fusion Recipes

      1. Configuration Control

The maximum number of parallel workers is controlled by the hook-parallel-workers flag in the ConfigMap "guardian-configmap" in the ibm-backup-restore namespace.

Range: 0 to 35
Default: 1 (sequential execution, kept for backward compatibility)
Special Case: A value of 0 dynamically maps to the number of usable CPUs on the system.
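
As a rough illustration of how the range, default, and special case fit together, the following sketch resolves a raw flag value into an effective worker count. The function names are hypothetical, rejecting out-of-range values with an error is an assumption (the post does not say how Fusion treats them), and "usable CPUs" is approximated here with the CPU affinity of the current process.

```python
import os

MIN_WORKERS, MAX_WORKERS, DEFAULT_WORKERS = 0, 35, 1


def usable_cpus():
    """CPUs available to this process (approximation used by this sketch)."""
    try:
        return len(os.sched_getaffinity(0))  # Linux: honors CPU affinity
    except AttributeError:
        return os.cpu_count() or 1           # platforms without affinity API


def effective_workers(raw=None):
    """Map a raw hook-parallel-workers value to a worker count."""
    if raw is None or str(raw).strip() == "":
        return DEFAULT_WORKERS               # flag not set: sequential
    value = int(raw)
    if not MIN_WORKERS <= value <= MAX_WORKERS:
        # Assumption: this sketch rejects values outside the 0 to 35 range.
        raise ValueError(f"hook-parallel-workers must be 0..35, got {value}")
    return usable_cpus() if value == 0 else value


if __name__ == "__main__":
    for raw in (None, "1", "8", "0"):
        print(repr(raw), "->", effective_workers(raw))
```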

      2. Applicable Scenario

Parallel execution is only applicable in Parent/Child recipe setups. It does not apply to single recipe scenarios, as they lack the structural complexity required for parallelism.

      3. Task Suitability

This feature is primarily designed for I/O bound tasks, where parallel execution can significantly reduce wait time and improve workflow execution.

      4. Failure scenario

Any failure in a parallel execution block aborts the overall job. Before aborting, Fusion waits for the sequences that are currently running in the other parallel blocks to complete.

      5. Undo Operation

Undo operations are not parallelized: they still run sequentially, in the reverse order of the workflow sequences that were executed.
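
The failure and undo notes can be illustrated with Python's standard thread pool. This is a sketch under stated assumptions, not Fusion's code: run_parallel_block and undo_in_reverse are hypothetical names, hooks are plain callables, queued hooks that have not started are cancelled on failure, and submission order stands in for "the order executed" when undoing.

```python
from concurrent.futures import FIRST_EXCEPTION, ThreadPoolExecutor, wait


def run_parallel_block(hooks, max_workers):
    """Run one parallel block of (name, callable) hooks.

    Returns (ok, executed): ok is False if any hook raised; executed lists
    the hooks that finished successfully, in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [(name, pool.submit(fn)) for name, fn in hooks]
        done, pending = wait([f for _, f in futures], return_when=FIRST_EXCEPTION)
        if any(f.exception() is not None for f in done):
            for f in pending:
                f.cancel()  # has no effect on hooks that are already running
        # Leaving the "with" block waits for hooks that are still running.
    ok, executed = True, []
    for name, f in futures:
        if f.cancelled():
            continue        # never started, nothing to undo
        if f.exception() is None:
            executed.append(name)
        else:
            ok = False
    return ok, executed


def undo_in_reverse(executed, undo_actions):
    """Run undo callables one at a time, most recently executed first."""
    order = list(reversed(executed))
    for name in order:
        undo_actions[name]()
    return order
```

A caller would abort the job when ok is False and, if an undo is required, pass the executed list to undo_in_reverse so that the undo path stays strictly sequential.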

Parallel execution demonstration

Let’s assume an application has three backup recipes: Parent, Child1, and Child2. Each defines the same workflow "pre-backup" with an identical priority of 100, as illustrated in Figure 2.

Fig. 2 - Parallel execution in Fusion Recipe

When these workflows are executed sequentially, the total time taken is the sum of the individual workflow durations:

Parent (72s) + Child1 (50s) + Child2 (65s) = 187 seconds.

However, if these workflows are executed in parallel, the total time reduces to the maximum of the individual durations:

max(72, 50, 65) = 72 seconds.

That’s nearly 2.6X faster!
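
The arithmetic above can be checked with a few lines of Python. The durations come from the example; the variable names are only illustrative, and the parallel figure assumes that all three workflows really do run at the same time.

```python
# Durations in seconds, taken from the example above.
durations = {"Parent": 72, "Child1": 50, "Child2": 65}

sequential = sum(durations.values())  # workflows run one after another
parallel = max(durations.values())    # all run together; the slowest one decides
speedup = sequential / parallel

print(f"sequential: {sequential}s")   # sequential: 187s
print(f"parallel:   {parallel}s")     # parallel:   72s
print(f"speedup:    {speedup:.2f}x")  # speedup:    2.60x
```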

Another example:

The backup workflow for IBM Cloud Pak for Data (CP4D), a complex enterprise application, consists of four checkpoints, each taking approximately 30 minutes to complete. When executed sequentially, the total backup time extends to around 2 hours. By running these checkpoints in parallel, the entire backup can be completed in about 30 minutes, roughly a 4X improvement in backup time.
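
The 4X figure assumes that all four checkpoints run at once, which requires at least four parallel workers. The sketch below estimates the total time for a smaller worker count by handing each task to whichever worker frees up first. The makespan function and this greedy assignment are assumptions made for illustration; the post does not describe how Fusion distributes work across workers.

```python
import heapq


def makespan(durations, workers):
    """Total elapsed time when tasks are handed, in order, to the worker
    that becomes free first (greedy assignment assumed by this sketch)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    free_at = [0] * workers            # time at which each worker is free
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


if __name__ == "__main__":
    checkpoints = [30, 30, 30, 30]      # minutes, from the CP4D example
    for workers in (1, 2, 4):
        print(workers, "worker(s):", makespan(checkpoints, workers), "minutes")
```

With one worker the estimate matches the sequential 2 hours, with two workers it is 60 minutes, and with four workers it reaches the 30 minutes quoted above.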

This clearly demonstrates that even complex applications can be backed up efficiently, provided the right strategies are in place.

Conclusion

Parallel execution in Fusion v2.11 marks a significant advancement in optimizing application backup workflows. By using workflow priorities and sequence types to decide what can run together, Fusion enables faster, more efficient execution - especially for I/O bound tasks. With clear configuration and well-defined conditions, recipe users can now design smarter backup strategies that scale with application complexity.

Understanding when and how parallel execution applies, particularly in Parent/Child recipe scenarios, helps recipe developers and users make the most of Fusion’s capabilities. As applications grow and performance demands increase, adopting parallelism becomes more than a technical enhancement: it is a practical way to reduce wait times and improve backup efficiency.

Acknowledgement: @Jim Smith

References

What's new: https://www.ibm.com/docs/en/fusion-software/2.11.0?topic=whats-new
Fusion Recipe: https://www.ibm.com/docs/en/fusion-software/2.11.0?topic=workflows-creating-recipe
