In this article, I introduce the watsonx.data OpenTelemetry feature and explain how to configure it.
OpenTelemetry is a highly customizable observability framework that is designed to enhance monitoring and debugging capabilities. It facilitates the generation, collection, and management of telemetry data and its delivery to observability dashboards. The telemetry data covers two types:
- Traces: Represent the lifecycle of a single operation or request as it propagates through a system, capturing spans to detail its execution across services.
- Metrics: Provide numerical measurements that reflect the performance, health, or behavior of a system, such as request counts, error rates, or resource utilization over time.
In watsonx.data 2.2.0, OpenTelemetry is supported for the Presto (Java) engine and the Milvus service.

Here, I introduce OpenTelemetry in watsonx.data with Instana as the backend, focusing on the following points.
- Technical overview
- How to configure in watsonx.data
- Configuration Verification
- Resources required by the ibm-lh-otel container
Technical Overview
In watsonx.data releases after v2.1.2, the OpenTelemetry Collector is used to connect to Instana’s otlp-acceptor component. (The Instana agent is not used.)
When the OpenTelemetry function is enabled in watsonx.data, ibm-lh-otel is added as an Init Container to the Presto and Milvus pods, and it handles the OpenTelemetry function of watsonx.data.
The following diagram illustrates how ibm-lh-otel is added to every Presto (Java) coordinator and worker pod and transfers the data to Instana.
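To check this placement on a running cluster, a small helper like the one below can help. It is only a sketch: the function name has_otel_container is my own, and it assumes you are already logged in with oc and working in the watsonx.data project. It relies on the standard jsonpath output option of oc get.

```shell
# has_otel_container POD_NAME   (hypothetical helper, not part of watsonx.data)
# Returns 0 if ibm-lh-otel is listed among the pod's init containers,
# 1 if it is not, and 2 if the oc command itself fails.
has_otel_container() {
  local pod="$1"
  local names
  names="$(oc get pod "$pod" -o jsonpath='{.spec.initContainers[*].name}')" || return 2
  # The jsonpath result is a space-separated list; match the whole name.
  case " $names " in
    *" ibm-lh-otel "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (pod name taken from the example output later in this article):
#   has_otel_container ibm-lh-lakehouse-presto899-coordinator-blue-0 && echo "ibm-lh-otel present"
```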

How to configure OpenTelemetry in watsonx.data
The following steps configure OpenTelemetry in watsonx.data 2.1.3 and 2.2.0 with Instana as the backend.
1. Log in to the watsonx.data console.
2. From the navigation menu, select Configurations, and click the OpenTelemetry tile.

3. On the OpenTelemetry page, click "Diagnostic +" and fill in the following fields.
(3-1) "Available Telemetry Tools": In this scenario, select Instana.
(3-2) "Telemetry Endpoint": Enter the endpoint URL of the selected tool, in the format http://<host>:<port>/<path> or https://<host>:<port>/<path>. Use port 4317 for OTLP over gRPC and 4318 for OTLP over HTTP.
(3-3) "Host Name": This value is assigned to x-instana-host and should be a meaningful identifier that helps link telemetry data to its source. (I use a string that represents my watsonx.data cluster.)
(3-4) "Password": This value is assigned to x-instana-key, which is the Instana agent key used to authenticate with the Instana backend. Instructions for retrieving the agent key can be found in the Instana documentation.
(3-5) "TLS enabled": Skip this field, because TLS verification is not functional in watsonx.data 2.1.3 and 2.2.0.
(3-6) "Associated diagnostics": Check the diagnostic types you want.

(3-7) Click "Add". The OpenTelemetry panel shows "Enabled".
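Before entering the Telemetry Endpoint, it may be worth checking that the URL follows the format described in (3-2). The following is a minimal sketch of such a check. The function name check_otlp_endpoint and the host instana.example.com are hypothetical, and rejecting every port other than 4317 and 4318 is my own simplification, so adjust it if your environment uses another port.

```shell
# check_otlp_endpoint URL   (hypothetical helper)
# Prints "grpc" for port 4317 and "http" for port 4318.
# Returns 1 if the scheme or port is missing or unexpected.
check_otlp_endpoint() {
  local url="$1" rest hostport port
  case "$url" in
    http://*|https://*) ;;
    *) echo "endpoint must start with http:// or https://" >&2; return 1 ;;
  esac
  rest="${url#*://}"        # strip the scheme
  hostport="${rest%%/*}"    # keep <host>:<port>, drop the path
  case "$hostport" in
    *:*) port="${hostport##*:}" ;;
    *) echo "endpoint has no port" >&2; return 1 ;;
  esac
  case "$port" in
    4317) echo "grpc" ;;
    4318) echo "http" ;;
    # Assumption: other ports are treated as a mistake in this sketch.
    *) echo "unexpected port: $port" >&2; return 1 ;;
  esac
}

# Usage:
#   check_otlp_endpoint https://instana.example.com:4317/
```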
Configuration Verification
After a while, the Presto engine and Milvus service restart. Verify that they restart successfully.
The following output of the oc get pod command shows the Presto and Milvus pod status after the pods restart successfully.
Some pods show 2/2 in the READY column, which means the pod has two containers and both are ready. In other words, the ibm-lh-otel container was added to the pod and started successfully.
In this example, the pods to which the ibm-lh-otel container was added are:
- Milvus: datacoord, datanode, indexnode, proxy, querycoord, querynode, rootcoord
- Presto (Java): coordinator, worker
- Presto (C++): coordinator
In watsonx.data 2.1.3 and 2.2.0, OpenTelemetry does not support Presto (C++), but the ibm-lh-otel container is still created in its coordinator pod.
$ oc get pod | grep -E "prest|milvus|NAME"
NAME READY STATUS RESTARTS AGE
ibm-lh-lakehouse-milvus838-datacoord-7b7d44bd6f-9twkh 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-datanode-7fcd765d79-67jl7 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-etcd-0 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-etcd-1 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-etcd-2 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-indexnode-55f47ff48d-l96zf 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-indexnode-55f47ff48d-rr5vm 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-kafka-0 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-kafka-1 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-kafka-2 1/1 Running 0 12m
ibm-lh-lakehouse-milvus838-proxy-6f4ccbbcc4-dtqkm 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querycoord-7fb5cf48d9-2kdpr 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-5hnm7 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-9d467 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-cjthv 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-k9qb7 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-lsjwh 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-r2nch 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-r7pjd 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-rmr5r 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-w5gxc 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-querynode-785d78cbbb-zqj62 2/2 Running 0 12m
ibm-lh-lakehouse-milvus838-rootcoord-7688bd8c58-k4zrn 2/2 Running 0 12m
ibm-lh-lakehouse-prestissimo38-coordinator-blue-0 2/2 Running 0 13m
ibm-lh-lakehouse-prestissimo38-prestissimo-worker-0 1/1 Running 0 13m
ibm-lh-lakehouse-presto899-coordinator-blue-0 2/2 Running 0 14m
ibm-lh-lakehouse-presto899-presto-worker-0 2/2 Running 0 14m
ibm-lh-lakehouse-presto899-presto-worker-1 2/2 Running 0 14m
ibm-lh-lakehouse-presto899-presto-worker-2 2/2 Running 0 14m
$
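With many pods, picking out the 2/2 entries by eye is tedious. As a rough sketch, the filter below reads oc get pod output and prints only the Presto and Milvus pods whose READY column is 2/2. The function name list_two_container_pods is my own. Note that 2/2 only tells you that two containers are ready; in this environment the second one is ibm-lh-otel, but that is an assumption the filter cannot verify.

```shell
# list_two_container_pods   (hypothetical helper)
# Reads `oc get pod` output on stdin and prints the names of
# Presto/Prestissimo/Milvus pods whose READY column is exactly 2/2.
list_two_container_pods() {
  awk '$1 ~ /prest|milvus/ && $2 == "2/2" { print $1 }'
}

# Usage:
#   oc get pod | list_two_container_pods          # list the pods
#   oc get pod | list_two_container_pods | wc -l  # count them
```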
If the connection to the telemetry endpoint fails, the pod status is displayed as Init:CrashLoopBackOff. In this case, you need to reconfigure OpenTelemetry.
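To spot this failure quickly, a filter along the following lines could be used. It is a sketch with a function name of my own choosing (list_init_crashloops), and it only matches the exact status string mentioned above.

```shell
# list_init_crashloops   (hypothetical helper)
# Reads `oc get pod` output on stdin and prints the names of pods
# whose STATUS column is exactly Init:CrashLoopBackOff.
list_init_crashloops() {
  awk '$3 == "Init:CrashLoopBackOff" { print $1 }'
}

# Usage:
#   oc get pod | list_init_crashloops
# The logs of the failing container can then be inspected with:
#   oc logs <pod-name> -c ibm-lh-otel
```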
Resources required by the ibm-lh-otel container
The OpenTelemetry feature requires additional resources.
The minimum resources required by each ibm-lh-otel container are 250m CPU (1/4 CPU) and 256 MB of memory, as the Requests section of the following output shows.
$ oc describe pod ibm-lh-lakehouse-presto899-coordinator-blue-0
(skip)
Init Containers:
ibm-lh-otel:
(skip)
Limits:
cpu: 1
ephemeral-storage: 550Mi
memory: 1024M
Requests:
cpu: 250m
ephemeral-storage: 50Mi
memory: 256M
(skip)
In the earlier example output of oc get pod, ibm-lh-otel was added to 22 pods.
In this case, you need at least 5.5 CPU (22 × 250m) and roughly 5.6 GB of memory (22 × 256 MB = 5,632 MB) in addition. And this is a very small environment. The required resources increase with the number of engines and pods.
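The estimate is simply the number of pods multiplied by the per-container requests. As a small sketch (the function name otel_overhead is my own, and the defaults come from the Requests values shown above), the arithmetic can be scripted like this:

```shell
# otel_overhead POD_COUNT [CPU_MILLICORES] [MEMORY_MB]   (hypothetical helper)
# Prints the total CPU and memory requested by the ibm-lh-otel containers.
# Defaults: 250 millicores and 256 MB per container.
otel_overhead() {
  local pods="$1" cpu_m="${2:-250}" mem_mb="${3:-256}"
  echo "cpu_millicores=$(( $pods * $cpu_m )) memory_mb=$(( $pods * $mem_mb ))"
}

# Usage:
#   otel_overhead 22
#   prints: cpu_millicores=5500 memory_mb=5632
```

For 22 pods this gives 5,500 millicores (5.5 CPU) and 5,632 MB (about 5.5 GiB) of memory.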
Conclusion
In this article, I introduced OpenTelemetry in watsonx.data, covering the following points.
- Technical overview
- How to configure in watsonx.data
- Configuration Verification
- Resources required by the ibm-lh-otel container
Environment
The examples in this article were run mainly in the following environment.
- OCP version: 4.16
- CP4D 5.1.3, watsonx.data 2.1.3
- Presto engines
  - Presto (Java) v0.286 / Size: Starter
  - Presto (C++) v0.286 / Size: Starter
- Milvus service: Version v2.5.0 / Size: Small