Datadog


Metrics

The Datadog plugin currently logs the following metrics about a Service or Route to the Datadog server:

Metric	Description	Namespace
`request_count`	Tracks the number of requests	`kong.request.count`
`request_size`	Tracks the request body size in bytes	`kong.request.size`
`response_size`	Tracks the response body size in bytes	`kong.response.size`
`latency`	Tracks the interval between the time the request started and the time the response was received from the upstream server	`kong.latency`
`upstream_latency`	Tracks the time it took for the final service to process the request	`kong.upstream_latency`
`kong_latency`	Tracks the internal Kong Gateway latency, that is, the time it took to run all plugins	`kong.kong_latency`

Metrics are sent with the tags `name` and `status`, which carry the API name and the HTTP status code, respectively. If you specify `consumer_identifier` for a metric, a `consumer` tag is also added.
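
To make the tagging concrete, the following sketch formats a metric in the DogStatsD datagram layout (`name:value|type|@sample_rate|#tags`). The helper `format_datagram` and the sample values are hypothetical; they only illustrate how the `name` and `status` tags might appear on the wire, and the plugin's actual output may differ in detail.

```python
def format_datagram(metric, value, stat_type, tags, sample_rate=None):
    """Build a DogStatsD-style datagram string (illustrative helper, not plugin code)."""
    # DogStatsD type codes for a few common stat types.
    type_codes = {"counter": "c", "gauge": "g", "timer": "ms", "histogram": "h"}
    parts = [f"{metric}:{value}", type_codes[stat_type]]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        # Tags are comma-separated key:value pairs introduced by '#'.
        parts.append("#" + ",".join(f"{key}:{val}" for key, val in tags.items()))
    return "|".join(parts)


# Sample values are assumptions chosen for illustration.
tags = {"name": "sample-service", "status": 200}
print(format_datagram("kong.request.count", 1, "counter", tags, sample_rate=1))
# kong.request.count:1|c|@1|#name:sample-service,status:200
```

Because the service name and status travel as tags rather than as part of the metric name, one metric name can be filtered or grouped by either tag in Datadog.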

All metrics are logged by default. You can customize which metrics are logged with the `config.metrics` parameter. Metrics with `stat_type` set to `counter` or `gauge` must also define `sample_rate`.
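
The `sample_rate` rule can be checked before a configuration is applied. The sketch below is a hypothetical pre-flight check, not part of the plugin: `missing_sample_rate` is an invented helper, the entries mimic `config.metrics` items as plain dictionaries, and the `timer` stat type is assumed here only as an example of a type the rule does not cover.

```python
# Stat types that require sample_rate, per the rule stated above.
REQUIRES_SAMPLE_RATE = {"counter", "gauge"}


def missing_sample_rate(metrics):
    """Return the names of metrics that need sample_rate but do not define it."""
    return [
        metric["name"]
        for metric in metrics
        if metric.get("stat_type") in REQUIRES_SAMPLE_RATE
        and metric.get("sample_rate") is None
    ]


# Example entries shaped like config.metrics items (values are illustrative).
metrics = [
    {"name": "request_count", "stat_type": "counter", "sample_rate": 1},
    {"name": "latency", "stat_type": "timer"},
    {"name": "request_size", "stat_type": "gauge"},  # invalid: no sample_rate
]
print(missing_sample_rate(metrics))  # ['request_size']
```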

Migrating Datadog queries

Plugin updates replaced the API-specific, status-specific, and consumer-specific metric names with generic metric names; the API name, status, and consumer are now carried as tags. You must update the Datadog queries in your dashboards and alerts to reflect this change.

For example, the following query:

avg:kong.sample_service.latency.avg{*}

would need to change to:

avg:kong.latency.avg{name:sample-service}
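
If many dashboards need updating, the rewrite can be scripted. The sketch below is one possible approach under stated assumptions: `migrate_query` is a hypothetical helper, it only handles queries shaped exactly like the example above (aggregator, `kong.<api>.<metric>.<suffix>`, and a `{*}` scope), and the service name must be supplied by the caller because the old metric name does not preserve it exactly.

```python
import re

# Matches queries of the form "<agg>:kong.<api>.<metric>.<suffix>{*}".
OLD_QUERY = re.compile(
    r"^(?P<agg>\w+):kong\.(?P<api>\w+)\.(?P<metric>\w+)\.(?P<suffix>\w+)\{\*\}$"
)


def migrate_query(old_query, service_name):
    """Rewrite an API-specific query to the generic metric name plus a name tag."""
    match = OLD_QUERY.match(old_query)
    if match is None:
        raise ValueError(f"unrecognized query format: {old_query!r}")
    return (
        f"{match['agg']}:kong.{match['metric']}.{match['suffix']}"
        f"{{name:{service_name}}}"
    )


print(migrate_query("avg:kong.sample_service.latency.avg{*}", "sample-service"))
# avg:kong.latency.avg{name:sample-service}
```

Queries that already filter by tags or that use a different shape raise `ValueError` and need to be migrated by hand.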

Setting host and port per Kong Gateway node

In a multi-data center setup, you might want to set the Datadog agent host and port separately for each Kong Gateway node. You can do this by setting the host and port with environment variables.

Field	Description	Data type
`KONG_DATADOG_AGENT_HOST`	The IP address or hostname to send data to	string
`KONG_DATADOG_AGENT_PORT`	The port to send data to on the upstream server	integer

Note: The `host` and `port` fields in the plugin configuration take precedence over the environment variables. In Kubernetes, a known limitation prevents you from setting `host` to `null` so that the environment variable is used. You can work around this by using a Vault reference, for example: `{vault://env/kong-datadog-agent-host}`. For more information, see Configure with Kubernetes.
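
The precedence rule can be summarized in a few lines. This is a simplified model rather than the plugin's implementation: `resolve_agent_address` is a hypothetical helper, the plugin configuration is represented as a plain dictionary, and plugin defaults are deliberately left out.

```python
def resolve_agent_address(plugin_config, environ):
    """Pick host and port: plugin configuration first, then environment variables."""
    host = plugin_config.get("host")
    if host is None:
        host = environ.get("KONG_DATADOG_AGENT_HOST")

    port = plugin_config.get("port")
    if port is None:
        raw_port = environ.get("KONG_DATADOG_AGENT_PORT")
        # Environment variables are strings, so convert the port to an integer.
        port = int(raw_port) if raw_port is not None else None

    return host, port


# Illustrative values only.
env = {"KONG_DATADOG_AGENT_HOST": "10.0.0.5", "KONG_DATADOG_AGENT_PORT": "8125"}
print(resolve_agent_address({"host": None, "port": None}, env))      # ('10.0.0.5', 8125)
print(resolve_agent_address({"host": "dd-agent", "port": 9125}, env))  # ('dd-agent', 9125)
```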

Kong Gateway process errors

This logging plugin logs HTTP request and response data, and also supports stream data (TCP, TLS, and UDP).

The Kong Gateway process error file is the Nginx error file. You can find it at the following path:

$PREFIX/logs/error.log

Configure the prefix in kong.conf.

Configure with Kubernetes

In most Kubernetes setups, datadog-agent runs as a DaemonSet. This means that a datadog-agent instance runs on each node in the Kubernetes cluster, and Kong Gateway must forward metrics to the datadog-agent running on the same node as Kong Gateway.

This can be accomplished by providing the IP address of the Kubernetes worker node to Kong Gateway, then configuring the plugin to use that IP address using environment variables.

Modify the env section in values.yaml:

env:
  datadog_agent_host:
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

Update the Helm deployment:

helm upgrade -f values.yaml $RELEASE_NAME kong/kong --version $VERSION --namespace $NAMESPACE

Modify the plugin’s configuration:

```yaml
apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host: "{vault://env/kong-datadog-agent-host}"
  port: 8125
plugin: datadog
```

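Alternatively, if your deployment defines env as a list of name and value entries (the standard Kubernetes container format), set the variable by its full name in the env section: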

env:
  - name: KONG_DATADOG_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

Modify the plugin’s configuration:

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host: "{vault://env/kong-datadog-agent-host}"
  port: 8125
plugin: datadog

Queuing

The Datadog plugin uses internal queues to decouple the production of log entries from their transmission to the upstream log server.

With queuing, request information is put in a configurable queue before being sent in batches to the upstream server. This has the following benefits:

Reduces the number of concurrent requests to the upstream server
Helps deal with temporary outages of the upstream server due to network or administrative changes
Can reduce resource usage both in Kong Gateway and on the upstream server by collecting multiple entries from the queue in one request

Note: Because queues are structural elements for components in Kong Gateway, they only live in the main memory of each worker process and are not shared between workers. Therefore, queued content isn’t preserved under abnormal operational situations, like power loss or unexpected worker process shutdown due to memory shortage or program errors.

You can configure several parameters for queuing:

Parameters	Description
Queue capacity limits: `config.queue.max_entries` `config.queue.max_bytes` `config.queue.max_batch_size`	Configure sizes for various aspects of the queue: maximum number of entries, batch size, and queue size in bytes. When a queue reaches the maximum number of entries queued and another entry is enqueued, the oldest entry in the queue is deleted to make space for the new entry. The queue code provides warning log entries when it reaches a capacity threshold of 80% and when it starts to delete entries from the queue. It also writes log entries when the situation normalizes.
Timer usage: `config.queue.concurrency_limit`	Only one timer is used to start queue processing in the background. You can add more if needed. Once the queue is empty, the timer handler terminates and a new timer is created as soon as a new entry is pushed onto the queue.
Retry logic: `config.queue.initial_retry_delay` `config.queue.max_coalescing_delay` `config.queue.max_retry_delay` `config.queue.max_retry_time`	If a queue fails to process, the queue library can automatically retry processing it if the failure is temporary (for example, network problems or upstream unavailability). Before retrying, the library waits for the amount of time specified by the `initial_retry_delay` parameter. This wait time doubles after each failed retry until it reaches the maximum wait time specified by the `max_retry_delay` parameter. The library stops retrying once the time specified by the `max_retry_time` parameter has elapsed.
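
The retry behavior amounts to capped exponential backoff. The sketch below is a simplified model, not the queue library's code: it assumes that `max_retry_delay` caps each individual wait, that `max_retry_time` bounds the total time spent waiting, and it ignores the time taken by the send attempts themselves. The function name `retry_delays` and the sample values are hypothetical.

```python
def retry_delays(initial_retry_delay, max_retry_delay, max_retry_time):
    """Return the wait before each retry under capped exponential backoff."""
    delays = []
    delay = initial_retry_delay
    elapsed = 0
    # Keep retrying while the next wait still fits in the overall retry budget.
    while elapsed + delay <= max_retry_time:
        delays.append(delay)
        elapsed += delay
        # Double the wait after each failure, but never exceed the cap.
        delay = min(delay * 2, max_retry_delay)
    return delays


# Illustrative values, in seconds.
print(retry_delays(initial_retry_delay=1, max_retry_delay=8, max_retry_time=30))
# [1, 2, 4, 8, 8]
```

With these values, the waits total 23 seconds; a sixth retry would exceed the 30-second budget, so the batch would be given up at that point in this model.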

When a Kong Gateway shutdown is initiated, the queue is flushed. This allows Kong Gateway to shut down even if it was waiting for new entries to be batched, ensuring upstream servers can be contacted.

Queues are not shared between workers and queuing parameters are scoped to one worker. For whole-system capacity planning, the number of workers needs to be considered when setting queue parameters.
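
For capacity planning, it can help to model one worker's queue and then multiply by the number of workers. The sketch below is an illustration under assumptions: `WorkerQueue` and `total_capacity` are hypothetical names, the queue is modeled with a bounded `deque`, and only the drop-oldest behavior for `max_entries` is represented (not `max_bytes` or batching).

```python
from collections import deque


class WorkerQueue:
    """Toy model of one worker's queue that drops the oldest entry when full."""

    def __init__(self, max_entries):
        self.entries = deque(maxlen=max_entries)
        self.dropped = 0

    def enqueue(self, entry):
        if len(self.entries) == self.entries.maxlen:
            # A full bounded deque discards its oldest item on append.
            self.dropped += 1
        self.entries.append(entry)


def total_capacity(workers, max_entries):
    """Upper bound on queued entries across all workers on a node."""
    return workers * max_entries


queue = WorkerQueue(max_entries=3)
for entry in range(5):
    queue.enqueue(entry)
print(list(queue.entries), queue.dropped)  # [2, 3, 4] 2

# Illustrative sizing: 4 workers, each with room for 10000 entries.
print(total_capacity(workers=4, max_entries=10000))  # 40000
```

The same multiplication applies to memory: a per-queue byte limit bounds each worker separately, so the node-wide worst case is that limit times the number of workers.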