Introduction
You may or may not be familiar with OpenTelemetry (OTel), but it has become very popular lately with the adoption of hybrid cloud architectures, in which applications are split across several layers, making it cumbersome to monitor and observe what is happening.
The objective of OTel is to give site reliability engineers (SREs) ways to observe the end-to-end life of an application and quickly identify the layer where anomalies occur, so that the correct subject matter expert can be assigned to investigate and address issues before they become problems.
OpenTelemetry is a collection of APIs, SDKs, and tools, which can be used to instrument, generate, collect, and export telemetry data. Telemetry data consists of:
- Metrics for runtime measurements of a system.
- Logs that provide recordings of events.
- Traces to track the flow of an application through a system, enabling you to observe how applications are behaving.
Some organizations are already exploiting OpenTelemetry at some level, but until now, no telemetry data was gathered when the application reached a z/OS system.
Along with the GA of z/OS 3.2 comes z/OS Data Gatherer support for the z/OS OpenTelemetry Emitter. The support is enabled by the PTFs for APAR OA66345 (UJ98068 for z/OS 3.1 and UJ98067 for z/OS 3.2). Db2, IMS, CICS, and MQ are also introducing OpenTelemetry support, for trace telemetry data only in this initial delivery. The new OTel trace support is based on the W3C (World Wide Web Consortium) Trace Context specification.
Each component contributes to the trace by emitting a span. A span contains metadata information and attributes about the work performed by the component. Spans build up parent-child relationships, which allow their position in the trace to be calculated.
A trace is represented by a globally unique number, also known as a trace ID. Each span in the trace refers to the same trace ID.
The first span in a trace is known as the root span. In this implementation, the root span can only originate where the application starts on the distributed platform. Except for the root span, every span that follows in the lifetime of the application has a span immediately preceding it, which is known as the parent of the span. Every span in a trace has a unique number, also known as a span ID. Span IDs are used to build up the parent-child relationships in a trace, so they must be unique within a given trace.
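To make the parent-child idea concrete, here is a minimal sketch in Python of how a set of spans that share one trace ID can be arranged into a tree. The Span class, its field names, and the span names in the usage note are assumptions made up for illustration; they are not the layout of any real span record.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span: just enough to show parent-child links."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    name: str


def render_trace(spans: List[Span]) -> List[str]:
    """Return the span names indented by their depth in the trace."""
    if len({s.trace_id for s in spans}) != 1:
        raise ValueError("all spans must refer to the same trace ID")
    if len({s.span_id for s in spans}) != len(spans):
        raise ValueError("span IDs must be unique within a trace")

    root: Optional[Span] = None
    children: Dict[str, List[Span]] = {}
    for span in spans:
        if span.parent_span_id is None:
            if root is not None:
                raise ValueError("a trace has exactly one root span")
            root = span
        else:
            children.setdefault(span.parent_span_id, []).append(span)
    if root is None:
        raise ValueError("no root span found")

    lines: List[str] = []

    def walk(span: Span, depth: int) -> None:
        # Each level of indentation is one parent-child hop from the root.
        lines.append("  " * depth + span.name)
        for child in children.get(span.span_id, []):
            walk(child, depth + 1)

    walk(root, 0)
    return lines
```

For example, three spans named "distributed app" (root), "CICS transaction" (child of the root), and "Db2 unit of work" (child of the CICS span) would render as three lines with increasing indentation, regardless of the order in which the spans arrived.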
The process of transferring tracing-related metadata between services and processes to link together events within the same distributed request or transaction is known as context propagation.
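In the W3C Trace Context specification, this metadata travels in a traceparent header made of four hyphen-separated, lowercase hexadecimal fields: version, trace ID (32 characters), parent span ID (16 characters), and trace flags (2 characters). The following Python sketch parses such a header and builds the value a component would pass on after creating its own span. The function and class names are hypothetical, only the version 00 layout is handled, and this is not how any z/OS component implements propagation.

```python
import re
import secrets
from typing import NamedTuple

# Version 00 layout of the W3C traceparent header.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceContext(NamedTuple):
    version: str
    trace_id: str
    parent_id: str    # span ID of the caller
    trace_flags: str


def parse_traceparent(header: str) -> TraceContext:
    """Validate and split a traceparent header value."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        raise ValueError("malformed traceparent header")
    ctx = TraceContext(*match.groups())
    # The specification forbids version ff and all-zero IDs.
    if ctx.version == "ff":
        raise ValueError("invalid traceparent version")
    if ctx.trace_id == "0" * 32 or ctx.parent_id == "0" * 16:
        raise ValueError("trace ID and parent ID must not be all zeros")
    return ctx


def child_traceparent(ctx: TraceContext) -> str:
    """Build the header to propagate downstream: same trace ID, new span ID."""
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    while new_span_id == "0" * 16:      # all zeros is not a valid span ID
        new_span_id = secrets.token_hex(8)
    return f"{ctx.version}-{ctx.trace_id}-{new_span_id}-{ctx.trace_flags}"
```

Parsing "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" (the example value used in the W3C specification) yields the trace ID 4bf92f3577b34da6a3ce929d0e0e4736 and the parent span ID 00f067aa0ba902b7. The header produced by child_traceparent keeps the trace ID and replaces only the span ID, which is what lets every span in the request point back to the same trace.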
Overview of OTel support in Db2 for z/OS
Support for OTel distributed tracing is introduced in Db2 13, at any function level starting with V13R1M100, through the following APARs:
- PH67971 – Db2 for z/OS OpenTelemetry Distributed Tracing Support
- PH68073 – Db2 for z/OS IMS Attach OpenTelemetry Distributed Tracing Support
These APARs allow the Db2 system to accept inbound W3C trace context propagation for workloads coming from the following sources:
- Db2 for z/OS native RESTful services
- JDBC Type-4/Db2 JCC type-4 Client driver 1
- CICS-Db2 Attachment Facility
- CICS Liberty-JDBC type-2 (RRS AF) 2
- IMS-Db2 Attachment (ESAF)
- IMS-Db2 Java Adapter-JDBC type-2 (RRS AF) 2
- z/OS Connect-Db2 REST service (HTTP header)
1 JDBC Type-4 support at GA with Db2 Connect 12.1.3 or with a special build for Db2 Connect 12.1.2.
2 Attachment controlled by CICS or IMS and not JDBC type-2.
Optionally, for any valid inbound trace context, Db2 will generate and "emit" an OTel trace span record for each unique Db2 unit of work on a thread. You can use the new -START OTEL command to control when Db2 emits span records.
Db2 supports only inbound workloads in this release, which requires other middleware components to be enabled to propagate OTel distributed tracing context. For more information, see the links at the end of this blog.
Db2 will use a new, extended SMF record type 1161 subtype 1 format to write the emitted OTel trace span records. These SMF records can be read and processed by the z/OS OpenTelemetry Emitter mentioned in the introduction section.
The Db2 OTel trace span record includes a section that is required for any span record, followed by Db2-specific attributes. The Db2-specific attributes are:
- Db2 version and function level
- Data sharing group name - Data sharing environments only
- Data sharing member name - Data sharing environments only
- Db2 subsystem name
- Db2 location name
- Db2 Logical Unit of Work ID (LUWID)
- Connection type string
- Connection ID string
- Correlation ID
- Db2 plan name associated to the Logical Unit of Work
- "In Db2" elapsed time in microseconds - When IFCID 3 is on
- "In Db2" GP CPU time in microseconds - When IFCID 3 is on
- "In Db2" zIIP CPU time in microseconds - When IFCID 3 is on
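As a rough illustration of how a consumer might model these attributes after the span record has been decoded, here is a small Python sketch. Every field name and sample value is an assumption invented for this example; the actual attribute names and the SMF record layout are defined by Db2 and are not reproduced here. The optional fields reflect the conditions listed above: data sharing names exist only in data sharing environments, and the timing values exist only when IFCID 3 is on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Db2SpanAttributes:
    """Hypothetical container for the Db2-specific span attributes."""
    db2_version: str          # Db2 version and function level
    subsystem_name: str
    location_name: str
    luwid: str                # Logical Unit of Work ID
    connection_type: str
    connection_id: str
    correlation_id: str
    plan_name: str
    # Present only in data sharing environments.
    data_sharing_group: Optional[str] = None
    data_sharing_member: Optional[str] = None
    # Present only when IFCID 3 is on; all values are in microseconds.
    elapsed_us: Optional[int] = None
    gp_cpu_us: Optional[int] = None
    ziip_cpu_us: Optional[int] = None

    def total_cpu_us(self) -> Optional[int]:
        """GP plus zIIP CPU time, or None when timing data was not collected."""
        if self.gp_cpu_us is None or self.ziip_cpu_us is None:
            return None
        return self.gp_cpu_us + self.ziip_cpu_us
```

Keeping the conditional attributes as optional values, rather than defaulting them to zero, lets downstream tooling tell "no time was spent" apart from "the data was not collected".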
APAR PH67971 introduces the following new commands, along with new messages and a new IFCID, to support OTel distributed tracing:
-START OTEL EMIT (YES|NO)
-STOP OTEL
-DISPLAY OTEL
For more information about OTel support on System Z, refer to: