Made by: Kong Inc.
Supported Gateway topologies: hybrid, db-less, traditional
Supported Konnect deployments: hybrid, cloud-gateways, serverless
Compatible protocols: grpc, grpcs, http, https, tcp, tls, tls_passthrough, udp, ws, wss
Minimum version: Kong Gateway 3.0

The OpenTelemetry plugin provides metrics, traces, and logs in the OpenTelemetry format and can be used with any OpenTelemetry-compatible backend.

The OpenTelemetry plugin allows you to collect data for the following signals: metrics (v3.13+), traces, and logs (v3.8+).

Use cases

Common use cases for the OpenTelemetry plugin:

  • Enable the OTEL plugin for metrics: Configure the OpenTelemetry plugin to send metrics.
  • Enable the OTEL plugin for API transactional logs: Configure the OpenTelemetry plugin to send API transactional logs.
  • Enable the OTEL plugin for runtime logs: Configure the OpenTelemetry plugin to send logs about the data plane’s internal execution.
  • Enable the OTEL plugin for traces: Configure the OpenTelemetry plugin to send traces.
  • Enable the OTEL plugin for all signals: Configure the OpenTelemetry plugin to send metrics, traces, runtime and error logs, and API transactional logs.
  • Extract, clear, and inject tracing data: Configure the OpenTelemetry plugin to extract tracing context, clear specific headers, and inject tracing context using a specific format.
  • Ignore incoming headers: Configure the OpenTelemetry plugin to inject tracing context in multiple formats.
  • Multiple injection: Configure the OpenTelemetry plugin to extract tracing context in one format and inject tracing context in multiple formats.
  • Preserve incoming format: Configure the OpenTelemetry plugin to extract and preserve the tracing context in the same header type.

Collecting telemetry data

To set up an OpenTelemetry backend, you need support for OTLP over HTTP with Protobuf encoding. You can:

  • Send data directly to an OpenTelemetry-compatible backend that natively supports OTLP over HTTP with Protobuf encoding, like Jaeger (v1.35.0+).

    This is the simplest setup, since it doesn’t require any additional components between the data plane and the backend.

  • Use the OpenTelemetry Collector, which acts as an intermediary between the data plane and one or more backends.

    The OpenTelemetry Collector can receive all OpenTelemetry signals supported by the OpenTelemetry plugin, including traces, metrics, and logs, and then process, transform, or route that data before exporting it to a compatible backend.

    This option is useful when you need capabilities such as signal fan-out, filtering, enrichment, batching, or exporting to multiple backends. The OpenTelemetry Collector supports a wide range of exporters, available at open-telemetry/opentelemetry-collector-contrib.

Resource attributes

The OpenTelemetry plugin attaches additional resource attributes to all telemetry data it sends to an OTLP endpoint. Resource attributes describe the entity that produced the telemetry and are shared across all signals.

The OpenTelemetry plugin automatically sets the following resource attributes:

Attribute Attribute description
service.name Name of the service exposing the signal. Optional; the default value is kong.
service.version Gateway version of the node exposing the signal.
service.instance.id ID of the node exposing the signal.

You can add or override resource attributes by configuring the config.resource_attributes parameter. Custom resource attributes are merged with the default attributes and are included with all exported telemetry data. Some metric backends, such as Prometheus, apply resource attributes to every metric. Be mindful of the impact on cardinality.
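
The merge behavior described above can be pictured as a dictionary update in which custom values win over defaults. The following Python sketch is illustrative only: the attribute values (the version string, the instance ID, and the deployment.environment key) are made-up examples, and the plugin's actual merge logic is not shown in this document.

```python
# Illustrative sketch only; this is not the plugin's implementation.
# Default resource attributes listed above (the values are made-up examples).
DEFAULT_ATTRIBUTES = {
    "service.name": "kong",
    "service.version": "3.13.0",         # hypothetical Gateway version
    "service.instance.id": "node-1234",  # hypothetical node ID
}


def merge_resource_attributes(defaults, custom):
    """Return the defaults overlaid with custom attributes (custom wins on conflict)."""
    merged = dict(defaults)  # copy so the defaults are left untouched
    merged.update(custom)
    return merged


# Hypothetical value for config.resource_attributes.
custom = {"service.name": "edge-gateway", "deployment.environment": "staging"}
merged = merge_resource_attributes(DEFAULT_ATTRIBUTES, custom)
```

In this sketch, service.name is overridden, deployment.environment is added, and the other defaults are kept. Every key in the merged result is attached to every exported signal, which is where the cardinality concern comes from.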

Metrics v3.13+

In Kong Gateway, metrics are natively supported by the OpenTelemetry plugin. You can send metrics using the parameters under config.metrics.

Available metrics

The following metrics are exposed:

http.server.request.count

Total number of incoming HTTP requests.

  • Instrument unit: {request}
  • Instrument type: Sum
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.auth.consumer.name Name of the authenticated Consumer.
    kong.response.source Origin of the current response. Possible values:
    • upstream if the response originated by successfully contacting the upstream service
    • kong otherwise
    kong.workspace.name Name of the Workspace.
    http.request.method Method used in the HTTP request.
    kong.response.status_code HTTP status code of the response.

kong.latency.total

Complete end-to-end duration of a request in seconds.

  • Instrument unit: s
  • Instrument type: Histogram
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.workspace.name Name of the Workspace.

kong.latency.internal

Kong’s internal processing time in seconds, from when the Gateway receives the request from the client to when it sends the request to the upstream service.

  • Instrument unit: s
  • Instrument type: Histogram
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.workspace.name Name of the Workspace.

kong.latency.upstream

Upstream processing time in seconds, from when the Gateway sends the request to the upstream, to when the data is returned to Kong.

  • Instrument unit: s
  • Instrument type: Histogram
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.workspace.name Name of the Workspace.

http.server.request.size

Size of each incoming HTTP request in bytes.

  • Instrument unit: By
  • Instrument type: Histogram
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.auth.consumer.name Name of the authenticated Consumer.
    kong.workspace.name Name of the Workspace.

http.server.response.size

Total size of the HTTP response sent back to the client in bytes.

  • Instrument unit: By
  • Instrument type: Histogram
  • Attributes:

    Attribute Attribute description
    kong.service.name Name of the Gateway Service.
    kong.route.name Name of the Route.
    kong.auth.consumer.name Name of the authenticated Consumer.
    kong.workspace.name Name of the Workspace.

kong.shared_dict.usage

Current memory usage of a shared dict in bytes.

  • Instrument unit: By
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.shared_dict.name Name of the shared dict.
    kong.subsystem Nginx subsystem that produced the metric. Possible values:
    • http
    • stream

kong.shared_dict.size

Total memory size of a shared dict in bytes.

  • Instrument unit: By
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.shared_dict.name Name of the shared dict.
    kong.subsystem Nginx subsystem that produced the metric. Possible values:
    • http
    • stream

kong.memory.workers.lua_vm

Memory used by the worker’s Lua VM in bytes.

  • Instrument unit: By
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.pid Worker process ID.
    kong.subsystem Nginx subsystem that produced the metric. Possible values:
    • http
    • stream

kong.nginx.connection.count

Number of client connections in Nginx.

  • Instrument unit: {connection}
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.subsystem Nginx subsystem that produced the metric. Possible values:
    • http
    • stream
    kong.connection.state State of the client connection. Possible values:
    • accepted
    • handled
    • total
    • active
    • reading
    • writing
    • waiting

kong.nginx.timer.count

Number of internal scheduled timers Nginx is running in the background.

  • Instrument unit: {timer}
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.timer.state State of the timer. Possible values:
    • pending
    • running

kong.db.connection.status

Shows whether Kong has an active database connection. A value of 1 means connected. A value of 0 means not connected.

  • Instrument unit: 1
  • Instrument type: Gauge
  • No attributes

kong.cp.connection.status

Shows whether the data plane has an active connection to the control plane. A value of 1 means connected. A value of 0 means not connected.

  • Instrument unit: 1
  • Instrument type: Gauge
  • No attributes

kong.upstream.target.status

Upstream target’s health. The actual status is reported in the kong.upstream.state attribute, and the metric value is set to 1 when a state is populated.

  • Instrument unit: 1
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.upstream.name Name of the Upstream.
    kong.target.address Address of the Target.
    server.address Address of the server.
    kong.upstream.state Health of the Upstream Target. Possible values:
    • healthy
    • unhealthy
    • dns_error
    kong.subsystem Nginx subsystem that produced the metric. Possible values:
    • http
    • stream

kong.dp.cluster_cert.expiry

Timestamp when the data plane’s cluster certificate will expire.

  • Instrument unit: s
  • Instrument type: Gauge
  • No attributes

kong.db.entity.count

Number of entities stored in Kong’s database.

  • Instrument unit: {entity}
  • Instrument type: Gauge
  • No attributes

kong.db.entity.error.count

Number of errors seen during database entity count collection.

  • Instrument unit: {error}
  • Instrument type: Sum
  • No attributes

kong.ee.license.signature

Last 8 bytes of the Enterprise license signature as a number.

  • Instrument type: Gauge
  • No attributes

kong.ee.license.expiration

Unix epoch time when the license expires, shifted by 24 hours to avoid timezone differences.

  • Instrument unit: s
  • Instrument type: Gauge
  • No attributes

kong.ee.license.features

Indicates whether the data plane can read or write entities under the current license. Each capability (ee_entity_read and ee_entity_write) is reported as its own metric, where 1 means allowed and 0 means not allowed.

  • Instrument unit: 1
  • Instrument type: Gauge
  • Attributes:

    Attribute Attribute description
    kong.ee.license.feature Enterprise feature. Possible values:
    • ee_entity_read
    • ee_entity_write

kong.ee.license.error.count

Number of errors that occurred while collecting license information.

  • Instrument unit: {error}
  • Instrument type: Sum
  • No attributes

Metrics with Kong Gateway 3.12 or earlier

If you’re using Kong Gateway 3.12 or earlier, metrics are enabled using the contrib version of the OpenTelemetry Collector.

The spanmetrics connector aggregates traces and provides metrics to any third-party observability platform.

To include span metrics for application traces, configure the connectors and service sections of the OpenTelemetry Collector configuration file:

connectors:
  spanmetrics:
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
      - name: http.route
    exclude_dimensions:
      - status.code
    metrics_flush_interval: 15s
    histogram:
      disable: false

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      processors: []
      exporters: [otlphttp]
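
To make the role of dimensions concrete, the following Python sketch groups spans by the three dimensions configured above and counts calls per group, falling back to GET when http.method is absent. It is a simplified model of the idea, not the connector's implementation; the span dictionaries and the function name are hypothetical.

```python
from collections import Counter

# Dimensions from the configuration above: (attribute name, default value or None).
DIMENSIONS = [
    ("http.method", "GET"),
    ("http.status_code", None),
    ("http.route", None),
]


def count_calls(spans, dimensions=DIMENSIONS):
    """Count spans per unique combination of dimension values."""
    counts = Counter()
    for span in spans:
        attrs = span.get("attributes", {})
        # Use the configured default when the attribute is missing.
        key = tuple(attrs.get(name, default) for name, default in dimensions)
        counts[key] += 1
    return counts


# Hypothetical spans, shaped as plain dictionaries for illustration.
spans = [
    {"attributes": {"http.method": "GET", "http.status_code": 200, "http.route": "/orders"}},
    {"attributes": {"http.status_code": 200, "http.route": "/orders"}},  # method missing
    {"attributes": {"http.method": "POST", "http.status_code": 500, "http.route": "/orders"}},
]
calls = count_calls(spans)
```

Each distinct combination of dimension values becomes its own metric series, so adding dimensions increases the number of series the backend has to store.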

Tracing

Built-in tracing instrumentations

Kong Gateway has a series of built-in tracing instrumentations, which are configured with the tracing_instrumentations parameter. When tracing_instrumentations is enabled, Kong Gateway creates a top-level span for each request by default.

The top-level span has the following attributes:

  • http.method: HTTP method
  • http.url: HTTP URL
  • http.host: HTTP host
  • http.scheme: HTTP scheme (http or https)
  • http.flavor: HTTP version
  • net.peer.ip: Client IP address

For more information, see the Tracing reference.

Note: When the OpenTelemetry plugin is used together with the Proxy Cache Advanced plugin, cache-HIT responses are not traced. This is expected behavior. When a request results in a cache-HIT, the response is served before the request lifecycle reaches the phase where the OpenTelemetry plugin executes. As a result, no spans are generated for cache-HIT requests. Cache-MISS requests continue through the full request lifecycle and are traced normally.

Gen AI tracing attributes v3.13+

When processing generative AI traffic through Kong AI Gateway, additional span attributes are emitted following the OpenTelemetry Gen AI semantic conventions. These attributes capture model parameters, token usage, and tool-call metadata.

For the complete attribute reference, see Gen AI OpenTelemetry attributes.

Propagation

The OpenTelemetry plugin supports propagation of multiple tracing header formats, including w3c, b3, b3-single, jaeger, ot, aws, datadog, and gcp.

You can customize which headers the plugin uses to extract and inject tracing context, and which headers it clears after the tracing context has been extracted.

 
flowchart LR
   id1(Original Request) -->|"headers (original)"| Extract
   subgraph ide1 [Headers Propagation]
   Extract -->|"headers (original)"| Clear
   Clear -->|"headers (filtered)"| Inject
   end
   Extract -.->|extracted ctx| id2((tracing logic))
   id2((tracing logic)) -.->|updated ctx| Inject
   Inject -->|"headers (updated ctx)"| id3(Updated request)
  

See the plugin’s configuration reference for a complete overview of the available options and values.
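
The extract, clear, and inject stages in the flowchart can be sketched in Python for the W3C traceparent header, whose value has the form version-traceid-parentid-flags. This is a minimal model under stated assumptions, not the plugin's code: the function names are hypothetical, header names are assumed to be lowercase, and the sampled flag chosen for a new trace is arbitrary.

```python
import secrets


def extract_w3c(headers):
    """Extract the trace context from a W3C traceparent header, or return None."""
    value = headers.get("traceparent")
    if value is None:
        return None
    parts = value.split("-")
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return None  # malformed header: treat as no incoming context
    return {"trace_id": parts[1], "parent_id": parts[2], "flags": parts[3]}


def clear_headers(headers, names):
    """Return a copy of the headers without the named entries."""
    drop = {name.lower() for name in names}
    return {k: v for k, v in headers.items() if k.lower() not in drop}


def inject_w3c(headers, ctx, span_id):
    """Return a copy of the headers carrying the updated trace context."""
    out = dict(headers)
    out["traceparent"] = f"00-{ctx['trace_id']}-{span_id}-{ctx['flags']}"
    return out


def propagate(headers, clear_names):
    """Run extract, then clear, then inject, as in the flowchart."""
    ctx = extract_w3c(headers)
    if ctx is None:
        # No usable incoming context: start a new trace (flag value is an assumption).
        ctx = {"trace_id": secrets.token_hex(16), "parent_id": None, "flags": "01"}
    span_id = secrets.token_hex(8)  # stands in for the span created by the tracing logic
    filtered = clear_headers(headers, clear_names)
    return inject_w3c(filtered, ctx, span_id)


incoming = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "x-internal-debug": "1",  # hypothetical header to clear
    "accept": "*/*",
}
outgoing = propagate(incoming, ["x-internal-debug"])
```

The outgoing request keeps the incoming trace ID, carries a new parent span ID, and no longer contains the cleared header. The incoming dictionary is left unmodified.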

Note: If any of the config.propagation.* configuration options (extract, clear, or inject) are configured, the config.propagation configuration takes precedence over the deprecated header_type parameter. If none of the config.propagation.* configuration options are set, the header_type parameter is still used to determine the propagation behavior.

In Kong Gateway 3.6 or earlier, the plugin detects the propagation format from the headers and uses that format to propagate the span context. If no appropriate format is found, the plugin falls back to the default format, which is w3c.

OTLP exporter

The OpenTelemetry plugin implements the OTLP/HTTP exporter, which sends binary-encoded Protobuf payloads over HTTP/1.1.

config.connect_timeout, config.read_timeout, and config.send_timeout set the timeouts for the HTTP request.

config.batch_span_count and config.batch_flush_delay set the maximum number of spans per batch and the delay between two consecutive batches.

Create a custom span

The OpenTelemetry plugin is built on top of the Kong Gateway tracing PDK. You can customize the spans and add your own spans through the universal tracing PDK.

  1. Create a file named custom-span.lua with the following content:

    -- Modify the root span
    local root_span = kong.tracing.get_root_span()
    root_span:set_attribute("custom.attribute", "custom value")
    
    -- Modify the active span
    local active_span = kong.tracing.active_span()
    active_span:set_attribute("custom.attribute", "custom value")
    
    -- Create a custom span
    local span = kong.tracing.start_span("custom-span")
    
    -- Append attributes
    span:set_attribute("custom.attribute", "custom value")
    
    -- Close the span
    span:finish()
    
  2. Apply the Lua code with the Post-function plugin using a cURL file upload:

    curl -i -X POST http://localhost:8001/plugins \
      -F "name=post-function" \
      -F "config.access[1]=@custom-span.lua"
    

Logging v3.8+

This plugin supports OpenTelemetry Logging, which can be configured as described in the configuration reference to export logs in OpenTelemetry format to an OTLP-compatible backend.

Log scopes

Two different kinds of logs are exported:

  • API transactional logs (v3.13+), also known as access logs, represent metadata about client requests. They are produced during the request lifecycle and typically don’t have a severity.
  • Runtime and error logs aren’t directly associated with a request. They’re produced by the data plane and provide data about its internal execution. For example, they could be logs generated asynchronously (in a timer) or during a worker’s startup.

Log level

Logs are recorded based on the log level that is configured for Kong Gateway. If a log is emitted with a level that is lower than the configured log level, it is not recorded or exported.

Note: Not all logs are guaranteed to be recorded. Logs that aren’t recorded include those produced by the Nginx master process and low-level errors produced by Nginx. Operators are expected to still capture the Nginx error.log file (which always includes all such logs) in addition to using this feature, to avoid losing any details that might be useful for deeper troubleshooting.
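
The level comparison described in this section can be sketched in Python as follows. The sketch assumes the standard Nginx severity ordering, from debug (least severe) to emerg (most severe); the function name is hypothetical.

```python
# Nginx log levels, ordered from least to most severe (assumed ordering).
LEVELS = ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]


def is_recorded(entry_level, configured_level):
    """Return True when an entry is at or above the configured log level."""
    return LEVELS.index(entry_level) >= LEVELS.index(configured_level)


# With a configured level of notice, error entries pass and info entries do not.
error_passes = is_recorded("error", "notice")
info_passes = is_recorded("info", "notice")
```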

Runtime and error log entry

Each log entry adheres to the OpenTelemetry Logs Data Model. The available information depends on the log scope and on whether tracing is enabled for this plugin.

Every log entry includes the following fields:

  • Timestamp: Time when the event occurred.
  • ObservedTimestamp: Time when the event was observed.
  • SeverityText: The severity text (log level).
  • SeverityNumber: Numerical value of the severity.
  • Body: The error log line.
  • Resource: Configurable resource attributes.
  • InstrumentationScope: Metadata that describes Kong Gateway’s data emitter.
  • Attributes: Additional information about the event.
    • introspection.source: Full path of the file that emitted the log.
    • introspection.current.line: Line number that emitted the log.

In addition to the above, request-scoped logs include:

  • Attributes: Additional information about the event.
    • request.id: Kong Gateway’s request ID.

In addition to the above, when tracing is enabled, request-scoped logs include:

  • TraceID: Request trace ID.
  • SpanID: Request span ID.
  • TraceFlags: W3C trace flag.

Logging for custom plugins

The kong.telemetry.log PDK module lets you configure OTLP logging for a custom plugin. The module records a structured log entry, which is reported via the OpenTelemetry plugin.

Queuing

The OpenTelemetry plugin uses internal queues to decouple the production of log entries from their transmission to the upstream log server.

With queuing, request information is put in a configurable queue before being sent in batches to the upstream server. This has the following benefits:

  • Reduces any possible concurrency on the upstream server
  • Helps deal with temporary outages of the upstream server due to network or administrative changes
  • Can reduce resource usage both in Kong Gateway and on the upstream server by collecting multiple entries from the queue in one request

Note: Because queues are structural elements for components in Kong Gateway, they only live in the main memory of each worker process and are not shared between workers. Therefore, queued content isn’t preserved under abnormal operational situations, like power loss or unexpected worker process shutdown due to memory shortage or program errors.

You can configure several parameters for queuing:

Queue capacity limits:

  • config.queue.max_entries
  • config.queue.max_bytes
  • config.queue.max_batch_size

Configure sizes for various aspects of the queue: maximum number of entries, batch size, and queue size in bytes.

When a queue reaches the maximum number of entries and another entry is enqueued, the oldest entry in the queue is deleted to make space for the new entry. The queue code writes warning log entries when it reaches a capacity threshold of 80% and when it starts to delete entries from the queue. It also writes log entries when the situation normalizes.

Timer usage:

  • config.queue.concurrency_limit

Only one timer is used to start queue processing in the background. You can add more if needed. Once the queue is empty, the timer handler terminates and a new timer is created as soon as a new entry is pushed onto the queue.

Retry logic:

  • config.queue.initial_retry_delay
  • config.queue.max_coalescing_delay
  • config.queue.max_retry_delay
  • config.queue.max_retry_time

If a queue fails to process, the queue library can automatically retry processing it if the failure is temporary (for example, if there are network problems or upstream unavailability).

Before retrying, the library waits for the amount of time specified by the initial_retry_delay parameter. This wait time is doubled every time the retry fails, until it reaches the maximum wait time specified by the max_retry_delay parameter. Retries stop once the total time specified by the max_retry_time parameter has elapsed.
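
The doubling behavior described above can be sketched in Python as follows. The function and its parameter names (initial_delay, max_delay) are hypothetical stand-ins for the configured initial delay and maximum wait time, and the numbers are example values, not the plugin's defaults.

```python
def retry_delays(initial_delay, max_delay, attempts):
    """Return the wait before each retry: doubled after every failure, capped at max_delay."""
    delays = []
    delay = initial_delay
    for _ in range(attempts):
        delays.append(min(delay, max_delay))
        delay *= 2
    return delays


# Example values in seconds: start at 1, never wait longer than 60.
schedule = retry_delays(initial_delay=1, max_delay=60, attempts=8)
```

With these example values, the waits grow as 1, 2, 4, 8, 16, 32 and then stay at the 60-second cap.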

When a Kong Gateway shutdown is initiated, the queue is flushed. This allows Kong Gateway to shut down even if it was waiting for new entries to be batched, ensuring upstream servers can be contacted.

Queues are not shared between workers and queuing parameters are scoped to one worker. For whole-system capacity planning, the number of workers needs to be considered when setting queue parameters.
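
For capacity planning, the per-worker scoping and the drop-oldest behavior can be modeled with a bounded queue per worker. The Python sketch below is a simplified illustration, not the queue library itself; the class name, the worker count, and the entry limit are hypothetical example values.

```python
from collections import deque


class WorkerQueue:
    """Bounded per-worker queue that drops the oldest entry when full."""

    def __init__(self, max_entries):
        # A deque with maxlen discards items from the left when it is full.
        self.entries = deque(maxlen=max_entries)
        self.dropped = 0

    def enqueue(self, entry):
        if len(self.entries) == self.entries.maxlen:
            self.dropped += 1  # the oldest entry is about to be discarded
        self.entries.append(entry)


def total_capacity(workers, max_entries):
    """Entries the whole node can hold: each worker has its own queue."""
    return workers * max_entries


queue = WorkerQueue(max_entries=3)
for entry in ["a", "b", "c", "d"]:
    queue.enqueue(entry)

node_capacity = total_capacity(workers=4, max_entries=10000)
```

After the fourth enqueue, the queue holds b, c, and d, and one entry has been dropped. With four workers and 10000 entries each, the node as a whole can buffer up to 40000 entries, all held in memory.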

Trace IDs in serialized logs v3.5+

When the OpenTelemetry plugin is configured along with a plugin that uses the Log Serializer, the trace ID of each request is added to the key trace_id in the serialized log output.

The value of this field is an object that can contain different formats of the current request’s trace ID. In case there are multiple tracing headers in the same request, the trace_id field includes one trace ID format for each different header format, as in the following example:

"trace_id": {
  "w3c": "4bf92f3577b34da6a3ce929d0e0e4736",
  "datadog": "11803532876627986230"
},
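
In the example above, the datadog value is the decimal form of the lower 64 bits of the w3c value. The Python sketch below reproduces that relationship; it is an observation about the example values, not a description of how the plugin computes them, and the function name is hypothetical.

```python
def lower_64_bits_as_decimal(trace_id_hex):
    """Return the lower 64 bits of a 128-bit hex trace ID as a decimal string."""
    # The last 16 hex characters are the least significant 64 bits.
    return str(int(trace_id_hex[-16:], 16))


datadog_id = lower_64_bits_as_decimal("4bf92f3577b34da6a3ce929d0e0e4736")
```

This can help when correlating a log line that carries one format with a trace stored under the other.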

Troubleshooting

The OpenTelemetry spans are printed to the console when the log level is set to debug in the Kong Gateway configuration file.

The following is an example of the debug logs output:

2022/06/02 15:28:42 [debug] 650#0: *111 [lua] instrumentation.lua:302: runloop_log_after(): [tracing] collected 6 spans:
Span #1 name=GET /wrk duration=1502.994944ms attributes={"http.url":"/wrk","http.method":"GET","http.flavor":1.1,"http.host":"127.0.0.1","http.scheme":"http","net.peer.ip":"172.18.0.1"}
Span #2 name=rewrite phase: opentelemetry duration=0.391936ms
Span #3 name=router duration=0.013824ms
Span #4 name=access phase: cors duration=1500.824576ms
Span #5 name=cors: heavy works duration=1500.709632ms attributes={"username":"kongers"}
Span #6 name=balancer try #1 duration=0.99328ms attributes={"net.peer.ip":"104.21.11.162","net.peer.port":80}

Known issues

  • Only supports the HTTP protocols (http/https) of Kong Gateway.
  • May impact the performance of Kong Gateway. We recommend setting the sampling rate (tracing_sampling_rate) via the Kong Gateway configuration file when using the OpenTelemetry plugin for tracing.
  • Doesn’t support custom_fields_by_lua.
  • Doesn’t support AI Gateway and MCP metrics and access logs. You can use Prometheus for metrics, and HTTP Log or File Log for access logs.