With API usage reporting, you can:
- Identify which services are slow or have high error rates
- Monitor request volume and throughput over time
- Analyze payload sizes for clients and upstream services
The following table shows which API usage metrics you can view:
| Metric | Category | Description |
|--------|----------|-------------|
| Request Count | Count | Total number of API calls within the selected time frame. This includes requests that were rejected due to rate limiting, failed authentication, and so on. |
| Requests per Minute | Rate | Number of API calls per minute within the selected time frame. |
| Response Latency | Latency | The time, in milliseconds, it takes to process an API request from start to finish. Users can choose the average (avg) or specific percentiles (p99, p95, and p50). For example, a 99th percentile response latency of 10 milliseconds means that 99 out of 100 requests were completed in under 10 ms from the time the request was received to when the response was sent. |
| Upstream Latency | Latency | The amount of time, in milliseconds, that Kong Gateway was waiting for the first byte of the upstream service response. Users can select between different percentiles (p99, p95, and p50). For example, a 99th percentile latency of 10 milliseconds means that 99 out of 100 requests took less than 10 ms from the moment the request was sent to the upstream service to when the first byte of the response was received. |
| Kong Latency | Latency | The time, in milliseconds, spent within Kong Gateway processing a request, excluding upstream response time. Users can choose from different percentiles (p99, p95, and p50). For example, a 99th percentile Kong latency of 10 milliseconds means that 99 out of 100 requests took less than 10 ms to be processed in Kong Gateway before reaching the upstream service. |
| Request Size | Size | The size of the request payload received from the client, in bytes. Users can select between the total sum or different percentiles (p99, p95, and p50). For example, a 99th percentile request size of 100 bytes means that 99 out of 100 requests had a payload of 100 bytes or smaller. |
| Response Size | Size | The size of the response payload returned to the client, in bytes. Users can select between the total sum or different percentiles (p99, p95, and p50). For example, a 99th percentile response size of 100 bytes means that 99 out of 100 responses returned to the original caller had a payload of 100 bytes or smaller. |
| Error Rate | Percentage | The percentage of failed API requests. This includes requests that return HTTP 4xx and 5xx status codes. |
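To make the percentile and error-rate definitions concrete, the following sketch shows one way these values could be derived from raw request records. It is illustrative only, not Kong Gateway's implementation: the sample data, record layout, and the nearest-rank percentile method are all assumptions.

```python
# Illustrative sketch only: how percentile latencies and an error rate
# could be derived from raw request records. The sample data and field
# layout are assumptions, not Kong Gateway's internal schema.
import math

# Each record: (response latency in ms, HTTP status code)
requests = [
    (8, 200), (12, 200), (9, 404), (7, 200), (150, 503),
    (11, 200), (10, 200), (6, 200), (13, 429), (9, 200),
]

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    `pct` percent of observations are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies = [latency for latency, _ in requests]
print("p50 latency:", percentile(latencies, 50), "ms")  # 9 ms
print("p99 latency:", percentile(latencies, 99), "ms")  # 150 ms

# Error Rate: share of requests returning HTTP 4xx or 5xx status codes.
errors = sum(1 for _, status in requests if status >= 400)
print(f"Error rate: {errors / len(requests):.0%}")  # 30%
```

Note how a single slow outlier (150 ms) dominates the p99 value while leaving the p50 untouched, which is why the percentile selectors matter when diagnosing slow services.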
Observability allows you to monitor and optimize your LLM usage by providing detailed insights into metrics such as token consumption, costs, and latency.
With LLM usage reporting, you can:
- Track token consumption: Monitor the number of tokens processed by the different LLM models you have configured.
- Understand costs: Gain visibility into the costs associated with your LLM providers.
- Measure latency: Analyze the latency involved in processing LLM requests.
The following table shows which LLM usage metrics you can view:
| Attribute | Unit | Description |
|-----------|------|-------------|
| Completion Tokens | Count | Completion tokens are any tokens that the model generates in response to an input. |
| Prompt Tokens | Count | Prompt tokens are the number of tokens in the prompt that are input into the model. |
| Total Tokens | Count | Sum of all tokens used in a single request to the model. It includes both the tokens in the input (prompt) and the tokens generated by the model (completion). |
| Time per Tokens | Number | Average time in milliseconds to generate a token. Calculated as LLM latency divided by the number of tokens. |
| Costs | Cost | Represents the resulting costs for a request. Final costs = (total number of prompt tokens × input cost per token) + (total number of completion tokens × output cost per token) + (total number of prompt tokens × embedding cost per token). |
| Response Model | String | Represents which AI model the AI provider actually used to process the prompt. |
| Request Model | String | Represents which AI model was requested to process the prompt. |
| Provider Name | String | Represents which AI provider was used to process the prompt. |
| Plugin ID | String | Represents the UUID of the plugin. |
| LLM Latency | Latency | Total time taken to receive a full response after a request is sent from Kong (LLM latency plus connection time). |
| Embeddings Latency | Latency | Time taken to generate the vector for the prompt string. |
| Fetch Latency | Latency | Total time taken to return a cached response. |
| Cache Status | String | Shows whether the response was served from the cache or came directly from the upstream. Possible values: Hit or Miss. |
| Embeddings Model | String | AI providers may have multiple embedding models. This represents the model used for the embeddings. |
| Embeddings Provider | String | Provider used for generating embeddings. |
| Embeddings Token | Count | Tokens input into the model for embeddings. |
| Embeddings Cost | Cost | Cost of generating the embeddings. |
| Cost Savings | Cost | Cost savings from serving responses from the cache. |
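As a worked example of the cost formula above, the following sketch applies it to hypothetical token counts and per-token prices (all placeholder values, not real provider rates). It also computes Time per Tokens, assuming the divisor is the completion (generated) token count.

```python
# Worked example of the cost formula from the table above. All per-token
# prices and token counts below are made-up placeholders; substitute your
# provider's actual pricing.
prompt_tokens = 1_200
completion_tokens = 300

input_cost_per_token = 0.000001       # hypothetical $ per prompt token
output_cost_per_token = 0.000002      # hypothetical $ per completion token
embedding_cost_per_token = 0.0000001  # hypothetical $ per embedded token

# Final costs = (prompt tokens x input cost per token)
#             + (completion tokens x output cost per token)
#             + (prompt tokens x embedding cost per token)
final_cost = (
    prompt_tokens * input_cost_per_token
    + completion_tokens * output_cost_per_token
    + prompt_tokens * embedding_cost_per_token
)
print(f"Final cost: ${final_cost:.6f}")  # $0.001920

# Time per Tokens: LLM latency divided by the number of tokens.
# Assumption: the divisor is the completion (generated) token count.
llm_latency_ms = 2_400
print(f"Time per token: {llm_latency_ms / completion_tokens:.1f} ms")  # 8.0 ms
```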
With platform usage reporting, you can:
- Track the number of control planes and data plane nodes in your organization
- Monitor Gateway Services, Routes, and plugins per control plane
- View Consumer counts across your realms and control planes
The following table shows which platform usage metrics you can view:
| Metric | Category | Description |
|--------|----------|-------------|
| Control plane count | Count | Number of control planes in your organization. |
| Node count | Count | Number of data plane nodes in your organization. You can also filter this metric by data plane node version. |
| Service count | Count | Number of Gateway Services in your control plane. |
| Route count | Count | Number of Routes in your control plane or associated with a specific Gateway Service. |
| Plugin count | Count | Number of plugins in your control plane. These can also be filtered by plugin scope and name. |
| Consumer count | Count | Number of Consumers in your realm or control plane. |
Agentic usage tracks analytics data for agent-to-agent (A2A) traffic, such as agent tool use and agent MCP calls, that flows through the AI A2A Proxy plugin.
You must configure the AI A2A Proxy plugin before analytics appear in Konnect Explorer.
With agentic usage reporting, you can:
- See how many times a tool was called
- View the most called tools
- See which tools are returning errors
- View the latency for tools
The following table shows the agentic usage-specific metrics you can view:
| Metric | Category | Description |
|--------|----------|-------------|
| A2A Latency | Latency | The amount of time, in milliseconds, that Kong Gateway was waiting for the first byte of the agent’s response. Users can select the average (avg). |
| MCP Response Size | Size | The size of the response payload returned to Kong Gateway from the MCP server, in bytes. Users can select the total sum. |
| A2A Response Size | Size | The size of the response payload returned to Kong Gateway from an agent, in bytes. Users can select the total sum. |