IBM MQ for z/OS 9.4.3 introduces built-in support for taking part in OpenTelemetry traces as messages are sent to and received from queues. This complements, and works with, the existing function available on IBM MQ for AIX, Linux, Windows and the MQ Appliance.
An OpenTelemetry trace allows individual requests to be tracked as they move through the various components that make up a distributed application. For example, a trace could track an application request that starts in the cloud as an HTTP request to a web server, becomes a message placed on a Linux queue manager running in OpenShift, is routed to a z/OS queue manager, and is then processed by a z/OS application, before a response is sent back by the same route. A trace provides detailed information about where the request went, how long it spent in each component, and what processing that component performed. If anything goes wrong with a request, an OpenTelemetry trace can show where it went wrong, making it much easier to debug distributed applications.
When MQ for z/OS is configured to take part in an OpenTelemetry trace, it emits a span each time a message is sent to, or received from, a queue. The span contains MQ-specific information about the message and the application that sent or received it. Spans are emitted to a component of the z/OS operating system, which can forward the span data to any observability tooling that supports the OTLP standard, for example IBM Instana or Jaeger. The observability tooling then generates and displays the trace from all the spans that comprise it.
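To make the idea of tooling building a trace out of spans more concrete, here is a minimal sketch in Python. It is not how Instana, Jaeger, or MQ are implemented; the `Span` class, its fields, and the sample values are simplified assumptions made up for illustration. It only shows the general principle: spans that share a trace ID are grouped together and ordered by start time, and their durations can then be attributed to the component that emitted them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, heavily simplified span record (not the OTLP schema)."""
    trace_id: str              # shared by every span in the same trace
    span_id: str               # unique within the trace
    parent_id: Optional[str]   # None for the first (root) span
    name: str                  # illustrative names only
    component: str             # who emitted the span
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def group_by_trace(spans):
    """Group spans by trace ID and order each group by start time."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    for members in traces.values():
        members.sort(key=lambda s: s.start_ms)
    return dict(traces)


def time_per_component(trace):
    """Sum span durations per component (ignores overlapping spans)."""
    totals = defaultdict(int)
    for span in trace:
        totals[span.component] += span.duration_ms
    return dict(totals)


# Spans can arrive in any order and from several traces at once.
spans = [
    Span("t1", "c", "b", "GET MQ21.REQUESTQ", "MQ21", 40, 45),
    Span("t1", "a", None, "DistClient send", "DistClient", 0, 10),
    Span("t1", "b", "a", "PUT MQ21.REQUESTQ", "FYRE.MQ1", 10, 15),
    Span("t2", "x", None, "DistClient send", "DistClient", 5, 9),
]

traces = group_by_trace(spans)
for span in traces["t1"]:
    print(f"{span.start_ms:>3} ms  {span.component:<10}  {span.name}")
print(time_per_component(traces["t1"]))
```

Real observability tooling does far more than this, for example using the parent span ID to draw the nesting of spans, but the grouping step is the essence of turning independent spans into a single trace.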
MQ for z/OS is the first piece of z/OS middleware to provide support for OpenTelemetry trace. In future, other z/OS middleware will also be able to take part in traces and propagate traces both to and from MQ. Beta support for this is already available via the IBM CICS Transaction Server for z/OS open beta program.
Full details on what OpenTelemetry is, and how it works, are available in the MQ documentation.
Below is an example trace from a distributed JMS application, connected to a Linux queue manager, sending requests to a z/OS queue manager for processing by a second JMS application, which then generates a response. The distributed JMS application uses OpenTelemetry auto-instrumentation, which emits the spans shown in brown (DistClient) when it sends and receives the message; no code needs to be added to the application to achieve this. The trace clearly shows the message being routed from the distributed queue manager, FYRE.MQ1, to the z/OS queue manager, MQ21, and back. The timeline shows how long the whole trace took, as well as the duration of each span in the trace.

If a network issue prevented FYRE.MQ1 from sending the message to MQ21, this would be apparent in the trace: the span named "PUT MQ21.REQUESTQ" would appear, but the GET MQ21 span would not, indicating that the channel from FYRE.MQ1 to MQ21 wasn't running. This information could then be passed to an MQ administrator to investigate, speeding up problem determination by making it clear where the first failure occurred.
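The same reasoning can be sketched in a few lines of Python. This is only an illustration of the idea, not a feature of MQ or of any observability product: the function name and the list of expected span names are assumptions. Only "PUT MQ21.REQUESTQ" is taken from the example above; the other names are placeholders.

```python
def first_missing_span(expected_names, observed_names):
    """Return the first expected span name that is absent from the trace,
    or None if every expected span was observed."""
    observed = set(observed_names)
    for name in expected_names:
        if name not in observed:
            return name
    return None


# Hypothetical order of spans for a healthy request (placeholder names).
EXPECTED = [
    "DistClient send",
    "PUT MQ21.REQUESTQ",
    "GET MQ21.REQUESTQ",
]

# Spans actually seen when the channel to MQ21 is not running.
observed = ["DistClient send", "PUT MQ21.REQUESTQ"]

missing = first_missing_span(EXPECTED, observed)
if missing is None:
    print("All expected spans are present")
else:
    print(f"Request stalled before: {missing}")
```

The first span that fails to appear marks the hop where the request stopped, which is the information an MQ administrator needs to start investigating.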
This next image shows some of the detailed data that is collected as part of each span. Each span includes information about the message and the application that sent or received it, which should make narrowing down an issue simpler.

If you have any questions about using OpenTelemetry with MQ, or want a demo, reach out to me at lemingma@uk.ibm.com.