Hi everyone,
while building a project with watsonx Orchestrate, I noticed relatively high latency when executing workflows.
The delay is not limited to steps involving LLM calls (for example, Generative Prompt nodes), where it would be expected. I also see noticeable latency between nodes that should be lightweight, such as:
- calls between workflows
- Python tools with very small deterministic logic
- simple data-processing steps
Because of this, a workflow with around 10 nodes and a few LLM calls can easily take ~10 seconds to complete, even when the individual operations themselves are trivial.
My initial idea was to design the system in a modular way, splitting the logic into multiple small workflows and tools. In practice, however, this seems to add significant latency overhead between steps.
So I wanted to ask the community:
- Is this expected behavior with Orchestrate workflows?
- Have others observed similar latency when chaining multiple nodes?
- Are there recommended best practices to reduce execution time (for example limiting workflow depth, merging steps, avoiding nested workflow calls, etc.)?
- Have you found any workarounds that help keep workflows modular without incurring large latency costs?
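To make the "merging steps" option concrete, here is a minimal sketch of what I have in mind: the modularity stays at the Python level as small functions, while the workflow would only see a single tool node. All function and field names below (normalize, validate, enrich, process_record, customer_id, tier) are hypothetical and only for illustration; registering the entry point as an Orchestrate tool is left out, and I have not measured whether this actually reduces latency.

```python
from typing import Any


def normalize(record: dict[str, Any]) -> dict[str, Any]:
    """Step 1 (hypothetical): trim whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def validate(record: dict[str, Any]) -> dict[str, Any]:
    """Step 2 (hypothetical): reject records without a customer id."""
    if not record.get("customer_id"):
        raise ValueError("customer_id is required")
    return record


def enrich(record: dict[str, Any]) -> dict[str, Any]:
    """Step 3 (hypothetical): add a derived field."""
    return {**record, "is_priority": record.get("tier") == "gold"}


def process_record(record: dict[str, Any]) -> dict[str, Any]:
    """Single entry point: runs all steps in-process, in order.

    Assumption: this one function would be exposed as one tool node,
    replacing three separate nodes in the workflow.
    """
    for step in (normalize, validate, enrich):
        record = step(record)
    return record
```

Each step can still be unit-tested on its own, so the code stays modular even though the workflow graph gets flatter.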
Any experience or suggestions would be very helpful.
Thanks.
The engineering team are currently implementing a number of performance improvements to both the flow runtime and the agent runtime to reduce latency.