Developing Semantic Conventions for Generic Database Metrics

By Xiao Juan Niu posted Thu December 28, 2023 12:32 AM

  

Co-authors: Rui Liu, Zhen Gao, Li Jian Wang

The OpenTelemetry-based generic database sensor is included in Instana release 263. The database semantic convention plays an important role: it serves as the contract between the telemetry data collector and the generic database sensor. This blog introduces OpenTelemetry Semantic Conventions, explains how to develop semantic conventions that comply with the OpenTelemetry standard, outlines the Semantic Conventions for generic database metrics, and addresses the stability of Semantic Conventions.

OpenTelemetry Semantic Conventions

OpenTelemetry is an open source, vendor-agnostic Observability framework that consists of a specification, the OpenTelemetry Protocol (OTLP), APIs, SDKs, tools, and integrations. It is designed to create and manage telemetry data such as traces, metrics, and logs. For more information, see OpenTelemetry. Because OpenTelemetry is tool-agnostic and vendor-neutral, developers of application software can instrument their code by using a unified set of APIs and conventions and send telemetry data to any Observability backend that supports the OpenTelemetry standard, such as the open source tools Jaeger and Prometheus and commercial offerings such as Instana.

The OpenTelemetry Semantic Conventions define a contract between telemetry sources and consumers. Their benefit is a single set of common naming schemes that can be adopted across codebases, libraries, tools, and platforms. A single set of common naming schemes ensures that telemetry data can be easily interpreted, correlated, exchanged, and processed by different tools and services across the system.

The OpenTelemetry Semantic Conventions specify Resource attributes (attribute key, meanings, and valid values), Traces (span name, kind, attribute name, meanings, and valid values), Metrics (name, kind, unit, and data point attributes), Log Records attributes, and Event attributes across various technology areas. For more information, see OpenTelemetry Semantic Conventions.

    Developing Semantic Conventions

    IBM Instana developed Semantic Conventions for relational database Resource attributes and Metric instruments. These conventions enable you to develop your own data collectors and send telemetry data to the Instana Agent for relational database technologies that Instana does not currently support. The Instana team is actively advocating for these Semantic Conventions within the OpenTelemetry community.

    To develop compliant Semantic Conventions, you can do the following:

    1. Identify the area of interest: Determine the area of interest and check whether OpenTelemetry has already defined Semantic Conventions for it. Explore existing areas in the OpenTelemetry Semantic Conventions and ongoing projects on the OpenTelemetry Project Boards. Additionally, you can create an account at https://slack.cncf.io and seek guidance in the #otel-semantic-conventions-wg channel, or attend Semantic Conventions Working Group meetings.

    2. Enhancement or new development: If OpenTelemetry has not defined Semantic Conventions for your area of interest, or the existing Semantic Conventions do not meet your requirements, consider requesting enhancements or developing a new convention and proposing it to the community. Engage with domain experts to identify and define the telemetry attributes and instruments. The domain experts play a crucial role in proposing the Semantic Convention to the OpenTelemetry community.

    3. Adherence to OpenTelemetry guidelines: The Semantic Conventions must adhere to OpenTelemetry guidelines, including Attribute Naming, Attribute Requirement Level, Metrics Data Model, General Metrics Guidelines, and Metric Requirement Level. The Attribute Naming rules apply to Resource, Metric, Trace, and Log attribute names, so you must ensure that attribute names align with the guidelines. General Metrics Guidelines define extra naming rules specific to Metrics, and General Metrics Semantic Conventions define general guidelines that are related to Instrument Naming, Instrument Types, and Instrument Unit. For more Metrics-related information, see Metrics Data Model.

    4. Reference-Specific Conventions: Depending on your needs, see General Attributes, Resource Semantic Conventions, Metrics Semantic Conventions, Trace Semantic Conventions, General Logs Attributes, and Semantic Conventions for Event Attributes.

    5. Follow the Semantic Conventions Process: When you propose a Semantic Convention to OpenTelemetry, follow the Semantic Conventions Process to submit a Project Tracking Issue for approval, especially for new areas or significant changes to existing Semantic Conventions. The process includes the Working Group Preparation, Semantic Conventions Specification, and Implementation stages. Gather a diverse group of individuals to sign up on your Project Tracking Issue; the project can commence after the OpenTelemetry Technical Committee approves it.
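
The attribute naming guidance in step 3 lends itself to a mechanical check. The sketch below is a simplified illustration in Python that assumes only three of the naming rules: lowercase names, dot-delimited namespaces, and snake_case within each component. The function name `looks_like_valid_attribute_name` and the sample keys are hypothetical, and the regular expression does not cover the full OpenTelemetry guideline.

```python
import re

# Simplified pattern: one or more dot-delimited components, each written in
# lowercase snake_case and starting with a letter. This is an assumed subset
# of the OpenTelemetry attribute naming guidance, not the complete rule set.
_COMPONENT = r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*"
_NAME_PATTERN = re.compile(rf"{_COMPONENT}(?:\.{_COMPONENT})*")


def looks_like_valid_attribute_name(name: str) -> bool:
    """Return True if `name` passes the simplified naming check."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate keys, used only to exercise the check.
    candidates = [
        "db.system",
        "db.tablespace.used_size",
        "DB.System",      # rejected: uppercase letters
        "db..name",       # rejected: empty namespace component
        "db.lock-count",  # rejected: hyphen instead of underscore
    ]
    for candidate in candidates:
        print(candidate, looks_like_valid_attribute_name(candidate))
```

A check like this could run in a unit test for a data collector so that a misnamed attribute is caught before it reaches a consumer.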

    Generic Database Semantic Conventions for Metrics

    OpenTelemetry has defined Semantic Conventions for Traces and Metrics in the database domain. For more information, see Semantic Conventions for Database Calls and Systems. However, Resource attributes are not defined for the database area, and the existing metric instruments cover connection pool-related metrics for database clients. No metric instruments are defined for database server operations.

    Defining metric instruments for database server operations is challenging because database technologies are heterogeneous. Finding a common set of attributes and metrics that suits various database types is difficult. Instana chose to develop Semantic Conventions for relational databases based on market requirements and the similarities among different relational databases.

    Instana’s approach involved investigating several relational databases with the assistance of domain experts. Based on that research, the team extracted common attributes and metrics, defined Resource attributes, and established a set of Metric instruments for relational databases. Instana plans to enhance the Semantic Conventions with more Metric instruments and potentially extend them to cover Traces, Logs, and Events in the future.

    A Resource is an immutable representation of the entity that produces telemetry, expressed as attributes. The relational database server is one type of Resource; its Resource attributes include the database type, version, and connection information.
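
As a rough illustration of this idea, the sketch below models a database server Resource as a read-only mapping of attributes in plain Python. The attribute keys (`db.system`, `db.version`, `server.address`, `server.port`), the values, and the `make_resource` helper are illustrative assumptions, not the keys defined by the Instana conventions; consult the published convention for the actual names.

```python
from types import MappingProxyType
from typing import Mapping, Union

AttributeValue = Union[str, int, float, bool]


def make_resource(attributes: Mapping[str, AttributeValue]) -> Mapping[str, AttributeValue]:
    """Return a read-only view over a private copy of the attributes."""
    return MappingProxyType(dict(attributes))


# Hypothetical keys and values, for illustration only.
db_server = make_resource({
    "db.system": "postgresql",
    "db.version": "15.4",
    "server.address": "db01.example.com",
    "server.port": 5432,
})

if __name__ == "__main__":
    print(dict(db_server))
    try:
        # Writing through the read-only view is rejected.
        db_server["db.version"] = "16.0"  # type: ignore[index]
    except TypeError as exc:
        print("resource is immutable:", exc)
```

The read-only view mirrors the requirement that a Resource does not change after it is created; every metric data point emitted by the collector would then be associated with the same set of attributes.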

    The database-related metrics are further categorized into the following areas:

    - **Availability**: Database status.

    - **Throughput**: Number of sessions, transactions, SQL queries, and IO.

    - **Performance**: Cache hits, locks, and long-running SQL queries.

    - **Resource Usage**: Disk usage and table spaces.

    - **Maintenance**: Backup and restore.

    For more information, see Database Semantic Conventions.
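
To make the categories concrete, the sketch below records a few metric instruments as plain Python data, each with the name, instrument kind, and unit that a Metrics convention specifies. Every metric name, kind, and unit here is a hypothetical placeholder chosen to mirror the categories above; none is taken from the published Database Semantic Conventions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class MetricInstrument:
    category: str  # one of the areas listed above
    name: str      # instrument name
    kind: str      # instrument kind, for example counter or gauge
    unit: str      # unit string


# Hypothetical instruments, one or two per category, for illustration only.
INSTRUMENTS = [
    MetricInstrument("Availability", "db.status", "gauge", "{status}"),
    MetricInstrument("Throughput", "db.session.count", "updowncounter", "{session}"),
    MetricInstrument("Throughput", "db.transaction.count", "counter", "{transaction}"),
    MetricInstrument("Performance", "db.cache.hit", "gauge", "1"),
    MetricInstrument("Resource Usage", "db.disk.usage", "updowncounter", "By"),
    MetricInstrument("Maintenance", "db.backup.duration", "gauge", "s"),
]


def group_by_category(instruments: Iterable[MetricInstrument]) -> Dict[str, List[str]]:
    """Group instrument names by category, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for instrument in instruments:
        grouped[instrument.category].append(instrument.name)
    return dict(grouped)


if __name__ == "__main__":
    for category, names in group_by_category(INSTRUMENTS).items():
        print(f"{category}: {', '.join(names)}")
```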

    Stability in Semantic Conventions

    Semantic Conventions must adapt and evolve over time. Mistakes are possible during the initial definition of conventions and need to be corrected later. Conventions might need to be altered, by renaming or regrouping attributes and metrics, as understanding of the telemetry source improves. Telemetry sources might also want to change the schema of the telemetry they emit over time.

    Changes to the telemetry that OpenTelemetry instrumentation produces must avoid breaking telemetry consumers. To facilitate the evolution of telemetry and semantic conventions, OpenTelemetry relies on the concept of Telemetry schemas. For more information, see Semantic Conventions Stability and Telemetry schemas.

    Telemetry schemas are versioned, which allows the schema to evolve over time, and telemetry sources might emit data that conforms to newer versions of the schema. Telemetry schemas explicitly define the transformations that are necessary to convert telemetry data between different versions of the schema. As a telemetry consumer, you must pay attention to the schema of the received telemetry. If necessary, you can transform the telemetry data from the received schema version to the target schema version.
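
The version-to-version conversion described above can be sketched as follows. This is a minimal illustration in Python that assumes the only kind of transformation is attribute renaming and that data is upgraded through the versions in order. The `SCHEMA_CHANGES` table, the version numbers, the attribute keys, and the `upgrade_attributes` helper are all hypothetical and do not reflect a schema published by Instana or OpenTelemetry.

```python
from typing import Any, Dict, List, Mapping, Tuple

# Ordered list of (schema version, attribute renames introduced in that version).
# Hypothetical content, for illustration only.
SCHEMA_CHANGES: List[Tuple[str, Dict[str, str]]] = [
    ("1.0.0", {}),
    ("1.1.0", {"db.type": "db.system"}),
    ("1.2.0", {"db.host": "server.address"}),
]


def upgrade_attributes(
    attributes: Mapping[str, Any], received: str, target: str
) -> Dict[str, Any]:
    """Rename attribute keys from the `received` version to the `target` version."""
    versions = [version for version, _ in SCHEMA_CHANGES]
    start, end = versions.index(received), versions.index(target)
    if start > end:
        raise ValueError("this sketch only upgrades to a newer schema version")
    result = dict(attributes)
    # Apply the renames of every version after `received`, up to and including `target`.
    for _, renames in SCHEMA_CHANGES[start + 1 : end + 1]:
        result = {renames.get(key, key): value for key, value in result.items()}
    return result


if __name__ == "__main__":
    received_attributes = {"db.type": "mysql", "db.host": "db01.example.com"}
    print(upgrade_attributes(received_attributes, "1.0.0", "1.2.0"))
```

Applying the renames one version at a time keeps the conversion correct when a key is renamed more than once across versions, which is why the changes are stored as an ordered list rather than a single merged map.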

    The IBM Instana team might define and publish a telemetry schema later to allow the generic database semantic conventions that IBM Instana proposed to evolve.

