Propagate distributed tracing spans and report low-level spans to an OTLP-compatible server.
The OpenTelemetry plugin is fully compatible with the OpenTelemetry specification and can be used with any OpenTelemetry-compatible backend.
There are two ways to set up an OpenTelemetry backend:
Using an OpenTelemetry-compatible backend directly, like Jaeger (v1.35.0+).
All the vendors supported by OpenTelemetry are listed in OpenTelemetry’s Vendor support.
Using the OpenTelemetry Collector, which is middleware that can be used to proxy OpenTelemetry spans to a compatible backend.
You can view all the available OpenTelemetry Collector exporters at open-telemetry/opentelemetry-collector-contrib.
Metrics are enabled using the contrib version of the OpenTelemetry Collector.
The spanmetrics connector allows you to aggregate traces and provide metrics to any third-party observability platform.
To include span metrics for application traces, configure the connectors and service sections of the OpenTelemetry Collector configuration file:
connectors:
  spanmetrics:
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
      - name: http.route
    exclude_dimensions:
      - status.code
    metrics_flush_interval: 15s
    histogram:
      disable: false
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      processors: []
      exporters: [otlphttp]
Kong Gateway has a series of built-in tracing instrumentations, which are configured by the tracing_instrumentations configuration.
Kong Gateway creates a top-level span for each request by default when tracing_instrumentations is enabled.
The top-level span has the following attributes:
http.method: HTTP method
http.url: HTTP URL
http.host: HTTP host
http.scheme: HTTP scheme (http or https)
http.flavor: HTTP version
net.peer.ip: Client IP address
The OpenTelemetry plugin supports propagation of the following header formats:
w3c: W3C trace context
b3 and b3-single: Zipkin headers
jaeger: Jaeger headers
ot: OpenTracing headers
datadog: Datadog headers
aws: AWS X-Ray header (v3.4+)
gcp: GCP X-Cloud-Trace-Context header (v3.5+)
This plugin offers flexible options for configuring tracing header propagation. You can choose which headers are used to extract and inject tracing context, and which headers are cleared after the tracing context has been extracted.
flowchart LR
  id1(Original Request) -->|"headers (original)"| Extract
  subgraph ide1 [Headers Propagation]
    Extract -->|"headers (original)"| Clear
    Clear -->|"headers (filtered)"| Inject
  end
  Extract -.->|extracted ctx| id2((tracing logic))
  id2((tracing logic)) -.->|updated ctx| Inject
  Inject -->|"headers (updated ctx)"| id3(Updated request)
See the plugin’s configuration reference for a complete overview of the available options and values.
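The extract, clear, and inject stages can be illustrated with a minimal Python sketch. This is not the plugin's implementation: the function names, the lower-cased header dictionary, and the decision to handle only the W3C traceparent header are assumptions made for illustration.

```python
import re

# W3C trace context "traceparent" layout: version-traceid-parentid-flags.
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def extract_w3c(headers):
    """Return (trace_id, parent_span_id, flags) from traceparent, or None."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    _version, trace_id, span_id, flags = match.groups()
    return trace_id, span_id, flags


def propagate(headers, clear, new_span_id):
    """Hypothetical pipeline: extract the context, clear headers, inject.

    `headers` is assumed to use lower-cased names; `clear` is a set of
    header names to drop; `new_span_id` stands in for the gateway's span.
    """
    ctx = extract_w3c(headers)  # Extract: read the incoming context
    # Clear: drop the configured headers from the outgoing request.
    filtered = {k: v for k, v in headers.items() if k not in clear}
    if ctx is not None:
        trace_id, _parent, flags = ctx
        # Inject: keep the trace ID, make the gateway span the new parent.
        filtered["traceparent"] = f"00-{trace_id}-{new_span_id}-{flags}"
    return filtered


request_headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "x-b3-traceid": "4bf92f3577b34da6a3ce929d0e0e4736",
    "host": "example.com",
}
updated = propagate(request_headers, {"x-b3-traceid"}, "b7ad6b7169203331")
print(updated)
```

In this sketch the trace ID is carried through unchanged, while the parent span ID is replaced, which mirrors the "updated ctx" arrow in the diagram.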
Note: If any of the config.propagation.* configuration options (extract, clear, or inject) are configured, the config.propagation configuration takes precedence over the deprecated header_type parameter. If none of the config.propagation.* configuration options are set, the header_type parameter is still used to determine the propagation behavior.
In Kong Gateway 3.6 or earlier, the plugin detects the propagation format from the headers and uses that format to propagate the span context.
If no appropriate format is found, the plugin falls back to the default format, which is w3c.
The OpenTelemetry plugin implements the OTLP/HTTP exporter, which sends Protobuf payloads encoded in binary format over HTTP/1.1.
config.connect_timeout, config.read_timeout, and config.send_timeout set the timeouts for the HTTP request.
config.batch_span_count and config.batch_flush_delay set the maximum number of spans per batch and the delay between two consecutive batches.
The OpenTelemetry plugin is built on top of the Kong Gateway tracing PDK. You can customize the spans and add your own spans through the universal tracing PDK.
Create a file named custom-span.lua with the following content:
-- Modify the root span
local root_span = kong.tracing.active_span()
root_span:set_attribute("custom.attribute", "custom value")
-- Create a custom span
local span = kong.tracing.start_span("custom-span")
-- Append attributes
span:set_attribute("custom.attribute", "custom value")
-- Close the span
span:finish()
Apply the Lua code with the Post-function plugin using a cURL file upload:
curl -i -X POST http://localhost:8001/plugins \
-F "name=post-function" \
-F "config.access[1]=@custom-span.lua"
This plugin supports OpenTelemetry Logging, which can be configured as described in the configuration reference to export logs in OpenTelemetry format to an OTLP-compatible backend.
Two different kinds of logs are exported:
Logs are recorded based on the log level that is configured for Kong Gateway. If a log is emitted with a level that is lower than the configured log level, it is not recorded or exported.
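The level comparison can be sketched as follows. This assumes the standard Nginx error log levels, ordered from least to most severe; the is_recorded helper is hypothetical and only models the rule stated above.

```python
# Nginx error log levels, from least to most severe (assumed ordering).
LEVELS = ["debug", "info", "notice", "warn", "error", "crit", "alert", "emerg"]


def is_recorded(entry_level, configured_level):
    """Return True when an entry is at or above the configured log level."""
    return LEVELS.index(entry_level) >= LEVELS.index(configured_level)


# With the gateway log level set to "notice", "info" entries are skipped
# while "error" entries are recorded and exported.
print(is_recorded("info", "notice"))
print(is_recorded("error", "notice"))
```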
Note: Not all logs are guaranteed to be recorded. Logs that aren’t recorded include those produced by the Nginx master process and low-level errors produced by Nginx. Operators are expected to still capture the Nginx error.log file (which always includes all such logs) in addition to using this feature, to avoid losing any details that might be useful for deeper troubleshooting.
Each log entry adheres to the OpenTelemetry Logs Data Model. The available information depends on the log scope and on whether tracing is enabled for this plugin.
Every log entry includes the following fields:
Timestamp: Time when the event occurred.
ObservedTimestamp: Time when the event was observed.
SeverityText: The severity text (log level).
SeverityNumber: Numerical value of the severity.
Body: The error log line.
Resource: Configurable resource attributes.
InstrumentationScope: Metadata that describes Kong Gateway’s data emitter.
Attributes: Additional information about the event.
  introspection.source: Full path of the file that emitted the log.
  introspection.current.line: Line number that emitted the log.
In addition to the above, request-scoped logs include:
Attributes: Additional information about the event.
  request.id: Kong Gateway’s request ID.
In addition to the above, when tracing is enabled, request-scoped logs include:
TraceID: Request trace ID.
SpanID: Request span ID.
TraceFlags: W3C trace flag.
The custom plugin PDK kong.telemetry.log module lets you configure OTLP logging for a custom plugin.
The module records a structured log entry, which is reported via the OpenTelemetry plugin.
The OpenTelemetry plugin uses internal queues to decouple the production of log entries from their transmission to the upstream log server.
With queuing, request information is put in a configurable queue before being sent in batches to the upstream server. This has the following benefits:
Note: Because queues are structural elements for components in Kong Gateway, they only live in the main memory of each worker process and are not shared between workers. Therefore, queued content isn’t preserved under abnormal operational situations, like power loss or unexpected worker process shutdown due to memory shortage or program errors.
You can configure several parameters for queuing:
Parameters | Description |
---|---|
Queue capacity limits: config.queue.max_entries, config.queue.max_bytes, config.queue.max_batch_size | Configure sizes for various aspects of the queue: maximum number of entries, batch size, and queue size in bytes. When a queue reaches the maximum number of entries queued and another entry is enqueued, the oldest entry in the queue is deleted to make space for the new entry. The queue code provides warning log entries when it reaches a capacity threshold of 80% and when it starts to delete entries from the queue. It also writes log entries when the situation normalizes. |
Timer usage: config.queue.concurrency_limit | Only one timer is used to start queue processing in the background. You can add more if needed. Once the queue is empty, the timer handler terminates and a new timer is created as soon as a new entry is pushed onto the queue. |
Retry logic: config.queue.initial_retry_delay, config.queue.max_coalescing_delay, config.queue.max_retry_delay, config.queue.max_retry_time | If a queue fails to process, the queue library can automatically retry processing it if the failure is temporary (for example, if there are network problems or upstream unavailability). Before retrying, the library waits for the amount of time specified by the initial_retry_delay parameter. This wait time is doubled every time the retry fails, until it reaches the maximum wait time specified by the max_retry_delay parameter. Retries stop once the total time spent retrying exceeds max_retry_time. |
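A rough sketch of such a backoff schedule is shown here. It assumes that each wait is capped by max_retry_delay and that retries stop once the total time waited would exceed max_retry_time. The retry_delays function is hypothetical and only models the timing, not the queue library itself.

```python
def retry_delays(initial_retry_delay, max_retry_delay, max_retry_time):
    """Yield successive wait times (seconds) for a failing batch.

    Assumed semantics: the delay doubles after each failed retry, never
    exceeds max_retry_delay, and retrying stops when the accumulated wait
    would go past max_retry_time.
    """
    delay = initial_retry_delay
    waited = 0
    while waited + delay <= max_retry_time:
        yield delay
        waited += delay
        delay = min(delay * 2, max_retry_delay)


# Illustrative values only; they are not the plugin's defaults.
schedule = list(retry_delays(initial_retry_delay=1, max_retry_delay=8, max_retry_time=30))
print(schedule)
```

With these illustrative values, the waits grow from 1 second up to the 8 second cap and stop before the 30 second budget is exceeded.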
When a Kong Gateway shutdown is initiated, the queue is flushed. This allows Kong Gateway to shut down even if it was waiting for new entries to be batched, ensuring upstream servers can be contacted.
Queues are not shared between workers and queuing parameters are scoped to one worker. For whole-system capacity planning, the number of workers needs to be considered when setting queue parameters.
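The drop-oldest behavior and the per-worker scoping can be modeled with a small Python sketch. The BoundedQueue class and worst_case_entries helper are hypothetical names used for illustration; the 80% threshold is taken from the description above.

```python
from collections import deque


class BoundedQueue:
    """Toy per-worker queue that drops the oldest entry when it is full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = deque()
        self.dropped = 0

    def enqueue(self, entry):
        if len(self.entries) >= self.max_entries:
            self.entries.popleft()  # discard the oldest entry to make room
            self.dropped += 1
        self.entries.append(entry)

    def near_capacity(self, threshold=0.8):
        """True once the queue holds at least 80% of max_entries."""
        return len(self.entries) >= threshold * self.max_entries


def worst_case_entries(workers, max_entries):
    """Each worker owns its own queue, so system-wide capacity multiplies."""
    return workers * max_entries


queue = BoundedQueue(max_entries=3)
for span_number in range(5):
    queue.enqueue(span_number)

print(list(queue.entries), queue.dropped)
print(worst_case_entries(workers=4, max_entries=10000))
```

The second call illustrates why the worker count matters for capacity planning: a per-worker limit of 10000 entries on four workers allows up to 40000 queued entries across the node.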
When the OpenTelemetry plugin is configured along with a plugin that uses the Log Serializer, the trace ID of each request is added to the key trace_id in the serialized log output.
The value of this field is an object that can contain different formats of the current request’s trace ID. If there are multiple tracing headers in the same request, the trace_id field includes one trace ID format for each different header format, as in the following example:
"trace_id": {
  "w3c": "4bf92f3577b34da6a3ce929d0e0e4736",
  "datadog": "11803532876627986230"
},
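A log consumer might pick one trace ID format out of this object. The following Python sketch assumes the serialized log is available as JSON; the trace_id_for helper and its preference order are hypothetical.

```python
import json

# The fragment from the example above, wrapped in braces to form valid JSON.
serialized = (
    '{"trace_id": {"w3c": "4bf92f3577b34da6a3ce929d0e0e4736", '
    '"datadog": "11803532876627986230"}}'
)


def trace_id_for(entry, preferred=("w3c", "datadog")):
    """Return (format, trace_id) for the first preferred format, else None."""
    ids = entry.get("trace_id", {})
    for fmt in preferred:
        if fmt in ids:
            return fmt, ids[fmt]
    return None


entry = json.loads(serialized)
print(trace_id_for(entry))

# Observation about this sample only, not a documented guarantee: the
# datadog value equals the lower 64 bits (last 16 hex digits) of the w3c
# trace ID, written in decimal.
low64 = int(entry["trace_id"]["w3c"][-16:], 16)
print(low64 == int(entry["trace_id"]["datadog"]))
```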
The OpenTelemetry spans are printed to the console when the log level is set to debug in the Kong Gateway configuration file.
The following is an example of the debug logs output:
2022/06/02 15:28:42 [debug] 650#0: *111 [lua] instrumentation.lua:302: runloop_log_after(): [tracing] collected 6 spans:
Span #1 name=GET /wrk duration=1502.994944ms attributes={"http.url":"/wrk","http.method":"GET","http.flavor":1.1,"http.host":"127.0.0.1","http.scheme":"http","net.peer.ip":"172.18.0.1"}
Span #2 name=rewrite phase: opentelemetry duration=0.391936ms
Span #3 name=router duration=0.013824ms
Span #4 name=access phase: cors duration=1500.824576ms
Span #5 name=cors: heavy works duration=1500.709632ms attributes={"username":"kongers"}
Span #6 name=balancer try #1 duration=0.99328ms attributes={"net.peer.ip":"104.21.11.162","net.peer.port":80}
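When reading this output, it can help to pull out span names and durations programmatically. The sketch below assumes the line layout shown in the example above; the regular expression and the parse_spans helper are illustrative and may need adjusting for other Kong Gateway versions.

```python
import re

# Matches lines such as "Span #3 name=router duration=0.013824ms".
SPAN_RE = re.compile(r"Span #(\d+) name=(.+?) duration=([\d.]+)ms")


def parse_spans(log_text):
    """Return (index, name, duration_ms) tuples found in debug output."""
    return [
        (int(index), name, float(duration))
        for index, name, duration in SPAN_RE.findall(log_text)
    ]


sample = "\n".join([
    "Span #1 name=GET /wrk duration=1502.994944ms",
    "Span #3 name=router duration=0.013824ms",
    "Span #4 name=access phase: cors duration=1500.824576ms",
    "Span #6 name=balancer try #1 duration=0.99328ms",
])

spans = parse_spans(sample)
slowest_child = max(spans[1:], key=lambda span: span[2])
print(slowest_child)
```

Skipping the first (top-level) span before taking the maximum points at the phase that dominates the request, which in the sample is the cors access phase.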
Set the sampling rate (tracing_sampling_rate) via the Kong Gateway configuration file when using the OpenTelemetry plugin.