The OpenTelemetry plugin provides metrics, traces, and logs in the OpenTelemetry format and can be used with any OpenTelemetry-compatible backend.
The OpenTelemetry plugin allows you to collect data for the following signals:

- Metrics
- Traces
- Logs
Common use cases for the OpenTelemetry plugin:

| Use case | Description |
|---|---|
| Enable the OTEL plugin for metrics | Configure the OpenTelemetry plugin to send metrics. |
| Enable the OTEL plugin for API transactional logs | Configure the OpenTelemetry plugin to send API transactional logs. |
| Enable the OTEL plugin for runtime logs | Configure the OpenTelemetry plugin to send logs about the data plane's internal execution. |
| Enable the OTEL plugin for traces | Configure the OpenTelemetry plugin to send traces. |
| Enable the OTEL plugin for all signals | Configure the OpenTelemetry plugin to send metrics, traces, data plane/error logs, and API transactional logs. |
| Extract, clear, and inject tracing data | Configure the OpenTelemetry plugin to extract tracing context, clear specific headers, and inject tracing context using a specific format. |
| Ignore incoming headers | Configure the OpenTelemetry plugin to inject tracing context in multiple formats. |
| Multiple injection | Configure the OpenTelemetry plugin to extract tracing context in one format and inject tracing context in multiple formats. |
| Preserve incoming format | Configure the OpenTelemetry plugin to extract and preserve the tracing context in the same header type. |
To set up an OpenTelemetry backend, you need support for OTLP over HTTP with Protobuf encoding. You can:

- Send data directly to an OpenTelemetry-compatible backend that natively supports OTLP over HTTP with Protobuf encoding, like Jaeger (v1.35.0+). This is the simplest setup, since it doesn't require any additional components between the data plane and the backend. A minimal configuration sketch follows this list.
- Use the OpenTelemetry Collector, which acts as an intermediary between the data plane and one or more backends. The OpenTelemetry Collector can receive all signals supported by the OpenTelemetry plugin, including traces, metrics, and logs, and then process, transform, or route that data before exporting it to a compatible backend. This option is useful when you need capabilities such as signal fan-out, filtering, enrichment, batching, or exporting to multiple backends. The OpenTelemetry Collector supports a wide range of exporters, available at open-telemetry/opentelemetry-collector-contrib.
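For example, sending traces directly to an OTLP/HTTP-compatible backend only requires pointing the plugin at that backend. The following is a minimal declarative configuration sketch; the traces_endpoint field name and the Jaeger URL are illustrative assumptions, so check the plugin configuration reference for the exact parameters in your Gateway version:

```yaml
# Snippet of a declarative configuration (kong.yaml) file.
plugins:
  - name: opentelemetry
    config:
      # Assumed field name and URL: an OTLP/HTTP endpoint served by Jaeger v1.35.0+.
      traces_endpoint: http://jaeger:4318/v1/traces
```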
The OpenTelemetry plugin attaches additional resource attributes to all telemetry data it sends to an OTLP endpoint. Resource attributes describe the entity that produced the telemetry and are shared across all signals.
The OpenTelemetry plugin automatically sets the following resource attributes:
| Attribute | Attribute description |
|---|---|
| service.name | Name of the service exposing the signal. This is optional; the default value is kong. |
| service.version | Gateway version of the node exposing the signal. |
| service.instance.id | ID of the node exposing the signal. |
You can add or override resource attributes by configuring the config.resource_attributes parameter. Custom resource attributes are merged with the default attributes and are included with all exported telemetry data. Some metric backends, such as Prometheus, apply resource attributes to every metric. Be mindful of the impact on cardinality.
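For example, a declarative sketch that overrides service.name and adds a custom attribute might look like the following; the attribute values and the traces_endpoint field are illustrative assumptions, not defaults:

```yaml
plugins:
  - name: opentelemetry
    config:
      traces_endpoint: http://otel-collector:4318/v1/traces   # illustrative endpoint
      resource_attributes:
        service.name: kong-gateway-edge        # overrides the default value "kong"
        deployment.environment: production     # custom attribute, merged with the defaults
```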
In Kong Gateway, metrics are natively supported by the OpenTelemetry plugin. You can send metrics using the parameters under config.metrics.
The following metrics are exposed:
Total number of incoming HTTP requests.

- Unit: {request}
- Type: Sum
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.auth.consumer.name | Name of the authenticated Consumer. |
| kong.response.source | Origin of the current response. |
| kong.workspace.name | Name of the Workspace. |
| http.request.method | Method used in the HTTP request. |
| kong.response.status_code | HTTP status code of the response. |

Complete end-to-end duration of a request in seconds.

- Unit: s
- Type: Histogram
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.workspace.name | Name of the Workspace. |

Kong's internal processing time in seconds, from when the Gateway receives the request from the client to when it sends the request to the upstream service.

- Unit: s
- Type: Histogram
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.workspace.name | Name of the Workspace. |

Upstream processing time in seconds, from when the Gateway sends the request to the upstream, to when the data is returned to Kong.

- Unit: s
- Type: Histogram
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.workspace.name | Name of the Workspace. |

Size of each incoming HTTP request in bytes.

- Unit: By
- Type: Histogram
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.auth.consumer.name | Name of the authenticated Consumer. |
| kong.workspace.name | Name of the Workspace. |

Total size of the HTTP response sent back to the client in bytes.

- Unit: By
- Type: Histogram
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.service.name | Name of the Gateway Service. |
| kong.route.name | Name of the Route. |
| kong.auth.consumer.name | Name of the authenticated Consumer. |
| kong.workspace.name | Name of the Workspace. |

Current memory usage of a shared dict in bytes.

- Unit: By
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.shared_dict.name | Name of the shared dict. |
| kong.subsystem | Nginx subsystem that produced the metric. |

Total memory size of a shared dict in bytes.

- Unit: By
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.shared_dict.name | Name of the shared dict. |
| kong.subsystem | Nginx subsystem that produced the metric. |

Memory used by the worker's Lua VM in bytes.

- Unit: By
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.pid | Worker process ID. |
| kong.subsystem | Nginx subsystem that produced the metric. |

Number of client connections in Nginx.

- Unit: {connection}
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.subsystem | Nginx subsystem that produced the metric. |
| kong.connection.state | State of the client connection. |

Number of internal scheduled timers Nginx is running in the background.

- Unit: {timer}
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.timer.state | State of the timer. |

Shows whether Kong has an active database connection. A value of 1 means connected; a value of 0 means not connected.

- Unit: 1
- Type: Gauge

Shows whether the data plane has an active connection to the control plane. A value of 1 means connected; a value of 0 means not connected.

- Unit: 1
- Type: Gauge

Upstream target's health. The actual status is in the kong.upstream.state attribute, with the metric value set to 1 when a state is populated.

- Unit: 1
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.upstream.name | Name of the Upstream. |
| kong.target.address | Address of the Target. |
| server.address | Address of the server. |
| kong.upstream.state | Health of the Upstream Target. |
| kong.subsystem | Nginx subsystem that produced the metric. |

Timestamp when the data plane's cluster certificate will expire.

- Unit: s
- Type: Gauge

Number of entities stored in Kong's database.

- Unit: {entity}
- Type: Gauge

Number of errors seen during database entity count collection.

- Unit: {error}
- Type: Sum

Last 8 bytes of the Enterprise license signature, represented as a number.

- Type: Gauge

Unix epoch time when the license expires, shifted by 24 hours to avoid timezone differences.

- Unit: s
- Type: Gauge

Indicates whether the data plane can read or write entities under the current license.
Each capability (ee_entity_read and ee_entity_write) is reported as its own metric, where 1 means allowed and 0 means not allowed.

- Unit: 1
- Type: Gauge
- Attributes:

| Attribute | Attribute description |
|---|---|
| kong.ee.license.feature | Enterprise feature. Possible values: ee_entity_read, ee_entity_write. |

Number of errors that occurred while collecting license information.

- Unit: {error}
- Type: Sum
If you’re using Kong Gateway 3.12 or earlier, metrics are enabled using the contrib version of the OpenTelemetry Collector.
The spanmetrics connector allows you to aggregate traces and provide metrics to any third-party observability platform.
To include span metrics for application traces, configure the connectors and service pipelines sections of the OpenTelemetry Collector configuration file:
connectors:
spanmetrics:
dimensions:
- name: http.method
default: GET
- name: http.status_code
- name: http.route
exclude_dimensions:
- status.code
metrics_flush_interval: 15s
histogram:
disable: false
service:
pipelines:
traces:
receivers: [otlp]
processors: []
exporters: [spanmetrics]
metrics:
receivers: [spanmetrics]
processors: []
exporters: [otlphttp]
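The pipelines above reference an otlp receiver and an otlphttp exporter, which must be defined elsewhere in the same Collector configuration file. A minimal sketch of those sections could look like the following; the backend address is illustrative:

```yaml
receivers:
  otlp:
    protocols:
      http: {}   # receives OTLP/HTTP data from the OpenTelemetry plugin
exporters:
  otlphttp:
    endpoint: http://observability-backend:4318   # illustrative OTLP/HTTP backend
```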
Kong Gateway has a series of built-in tracing instrumentations, which are configured through the tracing_instrumentations configuration property.
Kong Gateway creates a top-level span for each request by default when tracing_instrumentations is enabled.
The top-level span has the following attributes:
- http.method: HTTP method
- http.url: HTTP URL
- http.host: HTTP host
- http.scheme: HTTP scheme (http or https)
- http.flavor: HTTP version
- net.peer.ip: Client IP address

For more information, see the Tracing reference.
Note: When the OpenTelemetry plugin is used together with the Proxy Cache Advanced plugin, cache-HIT responses are not traced. This is expected behavior. When a request results in a cache-HIT, the response is served before the request lifecycle reaches the phase where the OpenTelemetry plugin executes. As a result, no spans are generated for cache-HIT requests. Cache-MISS requests continue through the full request lifecycle and are traced normally.
When processing generative AI traffic through Kong AI Gateway, additional span attributes are emitted following the OpenTelemetry Gen AI semantic conventions. These attributes capture model parameters, token usage, and tool-call metadata.
For the complete attribute reference, see Gen AI OpenTelemetry attributes.
The OpenTelemetry plugin supports propagation of the following header formats:
w3c: W3C trace context
b3 and b3-single: Zipkin headers
jaeger: Jaeger headers
ot: OpenTracing headers
datadog: Datadog headers
aws: v3.4+ AWS X-Ray header
gcp: v3.5+ GCP X-Cloud-Trace-Context header
This plugin offers extensive options for configuring tracing header propagation, providing a high degree of flexibility. You can customize which headers are used to extract and inject tracing context. Additionally, you can configure headers to be cleared after the tracing context extraction process, enabling a high level of customization.
flowchart LR
  id1(Original Request) --> Extract
  id1(Original Request) -->|"headers (original)"| Extract
  id1(Original Request) --> Extract
  subgraph ide1 [Headers Propagation]
    Extract --> Clear
    Extract -->|"headers (original)"| Clear
    Extract --> Clear
    Clear -->|"headers (filtered)"| Inject
  end
  Extract -.->|extracted ctx| id2((tracing logic))
  id2((tracing logic)) -.->|updated ctx| Inject
  Inject -->|"headers (updated ctx)"| id3(Updated request)
See the plugin’s configuration reference for a complete overview of the available options and values.
Note: If any of the config.propagation.* configuration options (extract, clear, or inject) are configured, the config.propagation configuration takes precedence over the deprecated header_type parameter. If none of the config.propagation.* configuration options are set, the header_type parameter is still used to determine the propagation behavior.
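For example, the following sketch extracts tracing context from W3C or B3 headers, clears one incoming header after extraction, and injects context both in the format found during extraction (preserve) and as W3C headers. The header name under clear is only an example:

```yaml
plugins:
  - name: opentelemetry
    config:
      propagation:
        extract:
          - w3c
          - b3
        clear:
          - x-b3-sampled   # example header to remove after the tracing context is extracted
        inject:
          - preserve       # re-inject using the incoming format
          - w3c
```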
In Kong Gateway 3.6 or earlier, the plugin detects the propagation format from the incoming headers and uses the appropriate format to propagate the span context.
If no appropriate format is found, the plugin falls back to the default format, which is w3c.
The OpenTelemetry plugin implements the OTLP/HTTP exporter, which sends Protobuf payloads in binary format over HTTP/1.1.
config.connect_timeout, config.read_timeout, and config.send_timeout set the timeouts for the export HTTP request.
config.batch_span_count and config.batch_flush_delay set the maximum number of spans per batch and the delay between two consecutive batches.
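A hedged example of tuning these exporter parameters is shown below; the values are arbitrary and the units (milliseconds for the timeouts, seconds for the flush delay) are assumptions to verify against the configuration reference:

```yaml
plugins:
  - name: opentelemetry
    config:
      connect_timeout: 1000   # time allowed to establish the connection (assumed milliseconds)
      send_timeout: 5000      # time allowed to send the request (assumed milliseconds)
      read_timeout: 5000      # time allowed to read the response (assumed milliseconds)
      batch_span_count: 200   # maximum number of spans per batch
      batch_flush_delay: 3    # delay between two consecutive batches (assumed seconds)
```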
The OpenTelemetry plugin is built on top of the Kong Gateway tracing PDK. You can customize the spans and add your own spans through the universal tracing PDK.
Create a file named custom-span.lua with the following content:
-- Modify the root span
local root_span = kong.tracing.get_root_span()
root_span:set_attribute("custom.attribute", "custom value")
-- Modify the active span
local active_span = kong.tracing.active_span()
active_span:set_attribute("custom.attribute", "custom value")
-- Create a custom span
local span = kong.tracing.start_span("custom-span")
-- Append attributes
span:set_attribute("custom.attribute", "custom value")
-- Close the span
span:finish()
Apply the Lua code with the Post-function plugin using a cURL file upload:
curl -i -X POST http://localhost:8001/plugins \
-F "name=post-function" \
-F "config.access[1]=@custom-span.lua"
This plugin supports OpenTelemetry Logging, which can be configured as described in the configuration reference to export logs in OpenTelemetry format to an OTLP-compatible backend.
Two different kinds of logs are exported:

- Runtime logs: logs about the data plane's internal execution.
- Request-scoped logs: API transactional logs emitted while proxying requests.
Logs are recorded based on the log level that is configured for Kong Gateway. If a log is emitted with a level that is lower than the configured log level, it is not recorded or exported.
Note: Not all logs are guaranteed to be recorded. Logs that aren't recorded include those produced by the Nginx master process and low-level errors produced by Nginx. Operators are expected to still capture the Nginx error.log file (which always includes all such logs) in addition to using this feature, to avoid losing any details that might be useful for deeper troubleshooting.
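A minimal sketch for exporting logs could look like the following; the logs_endpoint field name and the collector URL are assumptions, so consult the configuration reference for the exact parameter names in your Gateway version:

```yaml
plugins:
  - name: opentelemetry
    config:
      # Assumed field name and URL for an OTLP/HTTP logs endpoint.
      logs_endpoint: http://otel-collector:4318/v1/logs
```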
Each log entry adheres to the OpenTelemetry Logs Data Model. The available information depends on the log scope and on whether tracing is enabled for this plugin.
Every log entry includes the following fields:
- Timestamp: Time when the event occurred.
- ObservedTimestamp: Time when the event was observed.
- SeverityText: The severity text (log level).
- SeverityNumber: Numerical value of the severity.
- Body: The error log line.
- Resource: Configurable resource attributes.
- InstrumentationScope: Metadata that describes Kong Gateway's data emitter.
- Attributes: Additional information about the event:
  - introspection.source: Full path of the file that emitted the log.
  - introspection.current.line: Line number that emitted the log.

In addition to the above, request-scoped logs include:

- Attributes: Additional information about the event:
  - request.id: Kong Gateway's request ID.

In addition to the above, when tracing is enabled, request-scoped logs include:

- TraceID: Request trace ID.
- SpanID: Request span ID.
- TraceFlags: W3C trace flag.

The custom plugin PDK kong.telemetry.log module lets you configure OTLP logging for a custom plugin.
The module records a structured log entry, which is reported via the OpenTelemetry plugin.
The OpenTelemetry plugin uses internal queues to decouple the production of log entries from their transmission to the upstream log server.
With queuing, request information is put in a configurable queue before being sent in batches to the upstream server. This decouples request processing from log delivery, allows entries to be sent in efficient batches, and lets delivery be retried when the backend is temporarily unavailable.
Note: Because queues are structural elements for components in Kong Gateway, they only live in the main memory of each worker process and are not shared between workers. Therefore, queued content isn’t preserved under abnormal operational situations, like power loss or unexpected worker process shutdown due to memory shortage or program errors.
You can configure several parameters for queuing:
| Parameters | Description |
|---|---|
| Queue capacity limits: config.queue.max_entries, config.queue.max_bytes, config.queue.max_batch_size | Configure sizes for various aspects of the queue: maximum number of entries, batch size, and queue size in bytes. When a queue reaches the maximum number of entries queued and another entry is enqueued, the oldest entry in the queue is deleted to make space for the new entry. The queue code provides warning log entries when it reaches a capacity threshold of 80% and when it starts to delete entries from the queue. It also writes log entries when the situation normalizes. |
| Timer usage: config.queue.concurrency_limit | Only one timer is used to start queue processing in the background. You can add more if needed. Once the queue is empty, the timer handler terminates and a new timer is created as soon as a new entry is pushed onto the queue. |
| Retry logic: config.queue.initial_retry_delay, config.queue.max_coalescing_delay, config.queue.max_retry_delay, config.queue.max_retry_time | If a queue fails to process, the queue library can automatically retry processing it if the failure is temporary (for example, if there are network problems or upstream unavailability). Before retrying, the library waits for the amount of time specified by the initial_retry_delay parameter. This wait time is doubled every time the retry fails, until it reaches the maximum wait time specified by the max_retry_time parameter. |
When a Kong Gateway shutdown is initiated, the queue is flushed. This allows Kong Gateway to shut down even if it was waiting for new entries to be batched, ensuring upstream servers can be contacted.
Queues are not shared between workers and queuing parameters are scoped to one worker. For whole-system capacity planning, the number of workers needs to be considered when setting queue parameters.
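For example, a sketch of queue tuning that uses the parameters described above; the values are illustrative only, not recommendations:

```yaml
plugins:
  - name: opentelemetry
    config:
      queue:
        max_entries: 10000          # maximum number of entries held in the queue
        max_batch_size: 200         # maximum number of entries sent in one batch
        concurrency_limit: 1        # number of timers processing the queue in the background
        max_coalescing_delay: 1     # seconds to wait for more entries before sending a batch
        initial_retry_delay: 0.01   # wait before the first retry; doubled on each failure
        max_retry_delay: 60         # cap on the delay between consecutive retries
        max_retry_time: 60          # maximum time spent retrying before giving up
```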
When the OpenTelemetry plugin is configured along with a plugin that uses the
Log Serializer,
the trace ID of each request is added to the key trace_id in the serialized log output.
The value of this field is an object that can contain different formats
of the current request’s trace ID. In case there are multiple tracing headers in the
same request, the trace_id field includes one trace ID format
for each different header format, as in the following example:
"trace_id": {
"w3c": "4bf92f3577b34da6a3ce929d0e0e4736",
"datadog": "11803532876627986230"
},
The OpenTelemetry spans are printed to the console when the log level is set to debug in the Kong Gateway configuration file.
The following is an example of the debug logs output:
2022/06/02 15:28:42 [debug] 650#0: *111 [lua] instrumentation.lua:302: runloop_log_after(): [tracing] collected 6 spans:
Span #1 name=GET /wrk duration=1502.994944ms attributes={"http.url":"/wrk","http.method":"GET","http.flavor":1.1,"http.host":"127.0.0.1","http.scheme":"http","net.peer.ip":"172.18.0.1"}
Span #2 name=rewrite phase: opentelemetry duration=0.391936ms
Span #3 name=router duration=0.013824ms
Span #4 name=access phase: cors duration=1500.824576ms
Span #5 name=cors: heavy works duration=1500.709632ms attributes={"username":"kongers"}
Span #6 name=balancer try #1 duration=0.99328ms attributes={"net.peer.ip":"104.21.11.162","net.peer.port":80}
You can configure the sampling rate via the tracing_sampling_rate property in the Kong Gateway configuration file when using the OpenTelemetry plugin for tracing. Plugins that use the Log Serializer can have additional fields added to their serialized output with the custom_fields_by_lua configuration option.