In a previous blog we learnt how to pull the IBM Storage Insights (SI) metrics onto Prometheus. In this blog, we'll learn how to onboard SI metrics onto an observability platform using OpenTelemetry (OTel) Collector. This will bring us the benefits of observability, i.e., actionable insights, to the systems monitored by SI.
OTel is an open-source observability framework that provides a vendor-neutral way to instrument and collect telemetry data. It provides a reference architecture that can be adopted and extended by others. The OTel specification is large and made up of multiple components, so the entry barrier for adopting OTel can be quite high, especially if you're new to observability or to the specific OTel concepts.
In this blog, our goal is to get started with OTel. The systems onboarded to SI produce a vast amount of telemetry data that can be monitored on SI GUI. We'll make use of the REST APIs provided by SI to pull those telemetry data and bring them onto the OTel processing pipeline.
OTel Collector
The OTel Collector, or just Collector, is a component in the OTel ecosystem designed to facilitate the efficient collection, processing, and exporting of telemetry data. Before delving deeper into how to manage the Collector, let us see how it is deployed and where it fits in the OTel ecosystem.
The telemetry data collection may reside anywhere between an application and the platform where the data will eventually reside:
- Data collection may happen very close to the application. This mode of telemetry data collection is called agent.
- Data collection may happen in a clustered fashion. This mode is referred to as a gateway or aggregator.
The Collector is a binary written in Go that supports running either as an agent or as a gateway. An OTel-instrumented application may push telemetry data directly to an OTel backend, so do we even need the Collector? Its value is that it separates logical responsibilities: the instrumentation can focus on generating telemetry, while the Collector focuses on what to do with the generated data and how to get it to its destination.
Components
The Collector consists of three main components: receivers, processors, and exporters. These collector components are used to construct telemetry pipelines.
- Receivers: These get data into the Collector. Receivers can be either push or pull based and support one or more signal types: logs, traces, and metrics.
- Processors: These perform actions on the received data, for example, filtering data points or modifying their attributes.
- Exporters: These are used to get data out of the Collector. Like receivers, these can be either push or pull based.
Collector configuration is primarily handled via a YAML file passed at runtime as a CLI argument. A configuration file must define at least one receiver and one exporter and wire them into a pipeline, as shown below.
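A minimal configuration might look like the following sketch; it wires an OTLP receiver through the standard batch processor to the debug exporter (the component names here are stock Collector components, and the endpoint is illustrative):

```yaml
receivers:
  otlp:                      # accept OTLP data pushed by instrumented apps
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  batch:                     # buffer and batch data before exporting
exporters:
  debug:                     # print received telemetry to the console
service:
  pipelines:
    metrics:                 # a pipeline ties the three together
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```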
Now that we understand how OTel specifies generating, collecting, processing, and exporting telemetry data, let us turn to the problem of onboarding SI metrics onto the OTel platform.
SI metrics are available by calling SI REST APIs. To pull metrics from SI, we will need to develop a pull-based receiver that we can deploy within a Collector. The Collector will be the doorway that makes SI metrics available to the OTel world.
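To give a feel for what such a pull involves, here is a stdlib-only sketch of the parsing half: decoding an SI REST response into per-metric values. The response shape (a list of flat maps with a naturalKey, a timestamp, and one key per metric) is an assumption based on the debug output shown later in this post; the actual HTTP request, endpoint path, and authentication headers are SI-specific and omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseSIMetrics decodes an SI metrics response body into one map of
// metric name -> value per data point, keeping only the metrics we asked
// for. The payload shape is an assumption modeled on the receiver's
// debug output further down in this post.
func parseSIMetrics(body []byte, wanted []string) ([]map[string]float64, error) {
	var raw []map[string]any
	if err := json.Unmarshal(body, &raw); err != nil {
		return nil, err
	}
	out := make([]map[string]float64, 0, len(raw))
	for _, row := range raw {
		vals := map[string]float64{}
		for _, name := range wanted {
			// JSON numbers decode to float64 inside an interface value
			if v, ok := row[name].(float64); ok {
				vals[name] = v
			}
		}
		out = append(out, vals)
	}
	return out, nil
}

func main() {
	// Sample mirrors the "Data received" line in the collector log below.
	sample := []byte(`[{"naturalKey":"1:38:00000202E521FEDE","timeStamp":1731031969000,"disk_total_data_rate":293.65,"disk_total_response_time":0.2}]`)
	points, err := parseSIMetrics(sample, []string{"disk_total_data_rate", "disk_total_response_time"})
	if err != nil {
		panic(err)
	}
	fmt.Println(points[0]["disk_total_data_rate"]) // 293.65
}
```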
Our architecture of the OTel metric processing pipeline is shown below.
The pipeline consists of these components:
We'll build a metrics receiver for SI metrics named simetrics. The simetrics receiver's role is to receive SI metrics and convert them from their original format into the OTel metrics model, so the information can be properly processed through the Collector's pipelines.
In order to implement a metrics receiver, we will need the following:
- A Config implementation to enable the metrics receiver to gather and validate its configuration within the Collector's config.yaml.
- A receiver.Factory implementation so the Collector can properly instantiate the metrics receiver component.
- A MetricsReceiver implementation that is responsible for collecting the telemetry, converting it to the internal metrics representation, and handing the information to the next consumer in the pipeline.
The code is available on my GitHub. To run the collector, make sure that the Collector's config.yaml has been updated properly, with the simetrics receiver configured as one of the receivers and used in the pipeline(s).
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  simetrics:
    interval: 1m
    endpoint: https://insights.ibm.com
    ibmid:
    apikey:
    tenantid:
    systemid:
    metrics:
      - disk_total_data_rate
      - disk_total_response_time
processors:
  batch:
exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 127.0.0.1:8888
service:
  pipelines:
    metrics:
      receivers: [otlp, simetrics]
      processors: [batch]
      exporters: [debug, prometheus]
Start the collector:
go run ./otelcol-dev --config config.yaml
The output should look like this:
% go run ./otelcol-dev --config config.yaml
2024-11-08T16:12:45.081+0530 info service@v0.113.0/service.go:166 Setting up own telemetry...
2024-11-08T16:12:45.081+0530 info telemetry/metrics.go:70 Serving metrics {"address": "localhost:8888", "metrics level": "Normal"}
2024-11-08T16:12:45.082+0530 info builders/builders.go:26 Development component. May change in the future. {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2024-11-08T16:12:45.082+0530 info service@v0.113.0/service.go:238 Starting otelcol-dev... {"Version": "", "NumCPU": 10}
2024-11-08T16:12:45.082+0530 info extensions/extensions.go:39 Starting extensions...
2024-11-08T16:12:45.082+0530 warn internal@v0.113.0/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks. {"kind": "receiver", "name": "otlp", "data_type": "metrics", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2024-11-08T16:12:45.083+0530 info otlpreceiver@v0.113.0/otlp.go:112 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4317"}
2024-11-08T16:12:45.083+0530 info service@v0.113.0/service.go:261 Everything is ready. Begin running and processing data.
2024-11-08T16:13:45.083+0530 info simetrics/metrics-receiver.go:39 I should start processing metrics now! {"kind": "receiver", "name": "simetrics", "data_type": "metrics"}
Data received: [map[disk_total_data_rate:293.65 disk_total_response_time:0.2 naturalKey:1:38:00000202E521FEDE timeStamp:1.731031969e+12]]
2024-11-08T16:13:49.109+0530 info Metrics {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 2, "data points": 2}
2024-11-08T16:13:49.109+0530 info ResourceMetrics #0
Resource SchemaURL:
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
-> Name: si_disk_total_data_rate
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-11-08 10:43:49.024062 +0000 UTC
Value: 293.650000
Metric #1
Descriptor:
-> Name: si_disk_total_response_time
-> Description:
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-11-08 10:43:49.024063 +0000 UTC
Value: 0.200000
{"kind": "exporter", "data_type": "metrics", "name": "debug"}
The Debug exporter is responsible for logging the received SI metrics to the console. The Prometheus exporter will make the SI metrics available on Prometheus.
Configure the scrape target in prometheus.yaml to be the address of the OTel Collector's Prometheus exporter.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:8888"]
Navigate to the Prometheus console in the browser. You can see the SI metrics being exported by the OTel pipeline to the Prometheus backend.
We have now successfully implemented an OTel metrics receiver that pulls metrics from IBM Storage Insights via its REST API and, after processing them through the OTel pipeline, exports them to a Prometheus backend.
References - https://opentelemetry.io/docs/collector/building/receiver/