Global Storage Forum


OpenTelemetry Collector for IBM Storage Insights Metrics

By Randhir Singh posted Fri November 08, 2024 06:14 AM

  

In a previous blog, we learned how to pull IBM Storage Insights (SI) metrics into Prometheus. In this blog, we'll learn how to onboard SI metrics onto an observability platform using the OpenTelemetry (OTel) Collector. This will bring the benefits of observability, i.e., actionable insights, to the systems monitored by SI.

OTel is an open-source observability framework that provides a vendor-neutral solution to instrument and collect telemetry data. It provides a reference architecture that can be adopted and extended by others. The OTel specification is large and made up of multiple components, so the entry barrier for adopting OTel can be quite high, especially if you're new to observability or to the specific OTel concepts.

In this blog, our goal is to get started with OTel. The systems onboarded to SI produce a vast amount of telemetry data that can be monitored on the SI GUI. We'll make use of the REST APIs provided by SI to pull that telemetry data and feed it into the OTel processing pipeline.

OTel Collector

The OTel Collector, or just Collector, is a component in the OTel ecosystem designed to facilitate the efficient collection, processing, and exporting of telemetry data. Before delving deeper into how to manage the Collector, let us see how it is deployed and its place in the OTel ecosystem.

Telemetry data collection may happen anywhere between an application and the platform where the data will eventually be stored:

  • Data collection may happen very close to the application. This mode of telemetry data collection is called the agent mode.
  • Data collection may happen in a clustered fashion. This mode is referred to as the gateway (or aggregator) mode.

The Collector is a binary written in Go that supports running as either an agent or a gateway. An OTel-instrumented application may push telemetry data directly to an OTel backend, so do we even need the Collector? The value of the Collector is that it separates logical responsibility: the instrumentation can focus on generating telemetry, while the Collector can focus on what to do with the generated data and how to get it to its destination.

Components

The Collector consists of three main components: receivers, processors, and exporters. These collector components are used to construct telemetry pipelines.

  • Receivers: These are used to get data into the Collector. Receivers can be either push or pull based and support one or more signal types: logs, traces, and metrics.
  • Processors: These are used to perform actions on the received data, for example, filtering or transforming it.
  • Exporters: These are used to get data out of the Collector. Like receivers, these can be either push or pull based.

A high-level architecture diagram of the Collector is shown below, with examples of each type of component. Telemetry data originates from an application suitably instrumented with the OTel SDK and is typically pushed to a Collector, where it is processed and finally exported to a suitable backend for storage.
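To make the pipeline idea concrete, here is a minimal stdlib-only Go sketch of how stages hand data to one another. This is not the Collector SDK: the `Metric`, `Consumer`, `debugExporter`, and `filterProcessor` names are invented for illustration, and real components implement the SDK's interfaces instead.

```go
package main

import "fmt"

// Metric is a simplified stand-in for the Collector's internal metric model.
type Metric struct {
	Name  string
	Value float64
}

// Consumer mirrors the role of the Collector's consumer interface:
// each pipeline stage hands its output to the next stage.
type Consumer interface {
	ConsumeMetrics(metrics []Metric) error
}

// debugExporter plays the exporter role: it prints whatever it receives.
type debugExporter struct{}

func (debugExporter) ConsumeMetrics(metrics []Metric) error {
	for _, m := range metrics {
		fmt.Printf("%s=%v\n", m.Name, m.Value)
	}
	return nil
}

// filterZero drops data points with a zero value.
func filterZero(metrics []Metric) []Metric {
	var kept []Metric
	for _, m := range metrics {
		if m.Value != 0 {
			kept = append(kept, m)
		}
	}
	return kept
}

// filterProcessor plays the processor role: it transforms the data and
// forwards the result to the next consumer in the pipeline.
type filterProcessor struct {
	next Consumer
}

func (p filterProcessor) ConsumeMetrics(metrics []Metric) error {
	return p.next.ConsumeMetrics(filterZero(metrics))
}

func main() {
	// Receiver output flows processor -> exporter, as in a real pipeline.
	pipeline := filterProcessor{next: debugExporter{}}
	pipeline.ConsumeMetrics([]Metric{
		{Name: "disk_total_data_rate", Value: 293.65},
		{Name: "idle_metric", Value: 0},
	})
	// prints: disk_total_data_rate=293.65
}
```

The key design point this illustrates is that each stage only knows about the next consumer, which is why receivers, processors, and exporters can be mixed and matched freely in the YAML configuration.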

Configuration

Collector configuration is primarily handled via a YAML file, which is passed at runtime via a CLI argument. A configuration file must include at least one receiver, processor, and exporter, wired together in a pipeline, as shown below.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  
processors:
  batch:
  
exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 127.0.0.1:8888

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, prometheus]

Collector for SI Metrics

Now that we understand how OTel specifies generating, collecting, processing and exporting telemetry data, let us turn to the problem of onboarding SI metrics onto the OTel platform.

SI metrics are available by calling the SI REST APIs. To pull metrics from SI, we will need to develop a pull-based receiver that we can deploy within a Collector. The Collector will be the doorway that makes SI metrics available to the OTel world.

Our architecture of the OTel metric processing pipeline is shown below. The pipeline consists of these components:

  1. A custom receiver to pull metrics from SI
  2. A dummy (pass-through) processor
  3. Two exporters:
    1. A debug exporter to dump received metrics on the console
    2. A Prometheus exporter to export metrics to a Prometheus backend

Building the Receiver

We'll build a metrics receiver for SI metrics named simetrics. The simetrics receiver is responsible for receiving SI metrics and converting them from their original format into the OTel metrics model, so the information can be properly processed through the Collector’s pipelines.
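As a rough sketch of what such a pull-based receiver does on every poll, the stdlib-only Go fragment below fetches and decodes one metrics payload. The URL path, the `x-api-token` header, and the `data` wrapper in the response are assumptions made for illustration, not the documented SI API; the sample row mirrors the shape seen in the debug output later in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// decodeMetrics parses a metrics response body into flat key/value rows.
// The `data` wrapper is an assumed response shape, not the real SI schema.
func decodeMetrics(body []byte) ([]map[string]any, error) {
	var payload struct {
		Data []map[string]any `json:"data"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return nil, err
	}
	return payload.Data, nil
}

// pullMetrics performs one poll of an SI REST endpoint. The auth header
// name here is a placeholder; consult the SI REST API docs for the real one.
func pullMetrics(client *http.Client, url, token string) ([]map[string]any, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-api-token", token)
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeMetrics(body)
}

func main() {
	// Decode a sample payload shaped like the rows the debug exporter prints.
	sample := []byte(`{"data":[{"naturalKey":"1:38:00000202E521FEDE","disk_total_data_rate":293.65}]}`)
	rows, err := decodeMetrics(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(rows[0]["disk_total_data_rate"]) // prints: 293.65
}
```

A real receiver would run `pullMetrics` on every interval tick and then map each row onto OTel gauge data points before handing them to the next consumer.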

In order to implement a metrics receiver we will need the following:

  • A Config implementation to enable the metrics receiver to gather and validate its configurations within the Collector’s config.yaml.

  • A receiver.Factory implementation so the Collector can properly instantiate the metrics receiver component.

  • A MetricsReceiver implementation that is responsible to collect the telemetry, convert it to the internal metrics representation, and hand the information to the next consumer in the pipeline.

The code is available on my GitHub. To run the collector, make sure that the Collector’s config.yaml has been updated so that simetrics is configured as one of the receivers and is used in the pipeline(s):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  simetrics:
    interval: 1m
    endpoint: https://insights.ibm.com
    ibmid: 
    apikey: 
    tenantid: 
    systemid:
      - 
    metrics:
      - disk_total_data_rate
      - disk_total_response_time

processors:
  batch:
  
exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: 127.0.0.1:8888

service:
  pipelines:
    metrics:
      receivers: [otlp, simetrics]
      processors: [batch]
      exporters: [debug, prometheus]

Start the collector:

go run ./otelcol-dev --config config.yaml

The output should look like this:

% go run ./otelcol-dev --config config.yaml
2024-11-08T16:12:45.081+0530	info	service@v0.113.0/service.go:166	Setting up own telemetry...
2024-11-08T16:12:45.081+0530	info	telemetry/metrics.go:70	Serving metrics	{"address": "localhost:8888", "metrics level": "Normal"}
2024-11-08T16:12:45.082+0530	info	builders/builders.go:26	Development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "debug"}
2024-11-08T16:12:45.082+0530	info	service@v0.113.0/service.go:238	Starting otelcol-dev...	{"Version": "", "NumCPU": 10}
2024-11-08T16:12:45.082+0530	info	extensions/extensions.go:39	Starting extensions...
2024-11-08T16:12:45.082+0530	warn	internal@v0.113.0/warning.go:40	Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks.	{"kind": "receiver", "name": "otlp", "data_type": "metrics", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2024-11-08T16:12:45.083+0530	info	otlpreceiver@v0.113.0/otlp.go:112	Starting GRPC server	{"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4317"}
2024-11-08T16:12:45.083+0530	info	service@v0.113.0/service.go:261	Everything is ready. Begin running and processing data.
2024-11-08T16:13:45.083+0530	info	simetrics/metrics-receiver.go:39	I should start processing metrics now!	{"kind": "receiver", "name": "simetrics", "data_type": "metrics"}
Data received: [map[disk_total_data_rate:293.65 disk_total_response_time:0.2 naturalKey:1:38:00000202E521FEDE timeStamp:1.731031969e+12]]
2024-11-08T16:13:49.109+0530	info	Metrics	{"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 2, "data points": 2}
2024-11-08T16:13:49.109+0530	info	ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: si_disk_total_data_rate
     -> Description: 
     -> Unit: 
     -> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-11-08 10:43:49.024062 +0000 UTC
Value: 293.650000
Metric #1
Descriptor:
     -> Name: si_disk_total_response_time
     -> Description: 
     -> Unit: 
     -> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-11-08 10:43:49.024063 +0000 UTC
Value: 0.200000
	{"kind": "exporter", "data_type": "metrics", "name": "debug"}

The Debug exporter logs the received SI metrics to the console, while the Prometheus exporter makes them available to Prometheus for scraping.

Configure the scrape target in prometheus.yaml to be the address of the OTel Collector's Prometheus exporter.

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:8888"]

Navigate to the Prometheus console in the browser. You can see the SI metrics being exported by the OTel pipeline to the Prometheus backend.

We have now successfully implemented an OTel metrics receiver that pulls metrics from IBM Storage Insights via its REST API, processes them through the OTel pipeline, and exports them to a Prometheus backend.

References:

  • https://opentelemetry.io/docs/collector/building/receiver/

