The AI Proxy and AI Proxy Advanced plugins act as the main gateway for forwarding requests to AI providers. Logs from these plugins capture detailed information about the request and response payloads, token usage, model details, latency, and cost metrics, providing a comprehensive view of each AI interaction. Log entries include the following details:

Note: Logs and metrics for cost and token usage via the OpenAI Files API are not currently supported.
| Property | Description |
|----------|-------------|
| `ai.proxy.payload.request` | The request payload sent to the upstream AI provider. |
| `ai.proxy.payload.response` | The response payload received from the upstream AI provider. |
| `ai.proxy.usage.prompt_tokens` | The number of tokens used for the prompt. Used for text-based requests (chat, completions, embeddings). |
| `ai.proxy.usage.prompt_tokens_details` | v3.11+ A breakdown of prompt tokens (`cached_tokens`, `audio_tokens`). |
| `ai.proxy.usage.completion_tokens` | The number of tokens used for the completion. Used for text-based responses (chat, completions). |
| `ai.proxy.usage.completion_tokens_details` | v3.11+ A breakdown of completion tokens (`rejected_prediction_tokens`, `reasoning_tokens`, `accepted_prediction_tokens`, `audio_tokens`). |
| `ai.proxy.usage.total_tokens` | The total number of tokens used (input + output). Includes prompt/completion tokens for text, and input/output tokens for non-text modalities. |
| `ai.proxy.usage.input_tokens` | v3.11+ The total number of input tokens (text + image + audio). Used for non-text requests (for example, image or audio generation). |
| `ai.proxy.usage.input_tokens_details` | v3.11+ A breakdown of input tokens by modality (`text_tokens`, `image_tokens`, `audio_tokens_count`). |
| `ai.proxy.usage.output_tokens` | v3.11+ The total number of output tokens (text + audio). Used for non-text responses (for example, image or audio generation). |
| `ai.proxy.usage.output_tokens_details` | v3.11+ A breakdown of output tokens by modality (`text_tokens`, `audio_tokens`). |
| `ai.proxy.usage.cost` | The total cost of the request. |
| `ai.proxy.usage.time_per_token` | v3.8+ The average time, in milliseconds, to generate an output token. |
| `ai.proxy.usage.time_to_first_token` | v3.12+ The time, in milliseconds, to receive the first output token. |
| `ai.proxy.meta.request_model` | The model used for the AI request. |
| `ai.proxy.meta.response_model` | The model used to generate the AI response. |
| `ai.proxy.meta.provider_name` | The name of the AI service provider. |
| `ai.proxy.meta.plugin_id` | The unique identifier of the plugin instance. |
| `ai.proxy.meta.llm_latency` | v3.8+ The time, in milliseconds, the LLM provider took to generate the full response. |
| `ai.proxy.meta.request_mode` | v3.12+ The request mode. Can be `oneshot`, `stream`, or `realtime`. |
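As an illustration, a log consumer can cross-check token accounting or derive a per-token cost from these fields. The entry below is a hypothetical example, not real plugin output; only the field names come from the table above.

```python
# Hypothetical ai.proxy log entry, using the field names documented above.
entry = {
    "ai": {"proxy": {
        "usage": {
            "prompt_tokens": 120,
            "completion_tokens": 80,
            "total_tokens": 200,
            "cost": 0.004,
        },
        "meta": {"provider_name": "openai", "request_model": "gpt-4o"},
    }}
}

usage = entry["ai"]["proxy"]["usage"]

# For text requests, total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]

# Derive an average cost per token from the logged totals.
cost_per_token = usage["cost"] / usage["total_tokens"]
print(f"{cost_per_token:.6f}")  # 0.000020
```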
If you’re using the AI AWS Guardrails plugin, AI Gateway logs include fields under the `ai.proxy.aws-guardrails` object. These fields capture processing latency, the guardrail configuration applied, block reasons, and masking behavior.

| Property | Description |
|----------|-------------|
| `ai.proxy.aws-guardrails.aws_region` | The AWS region where the guardrail was applied. |
| `ai.proxy.aws-guardrails.guardrails_id` | The unique identifier of the guardrail configuration applied. |
| `ai.proxy.aws-guardrails.guardrails_version` | The version of the guardrail applied. Can be a numeric version or `DRAFT`. |
| `ai.proxy.aws-guardrails.mode` | v3.14+ The content guarding mode configured for the plugin. Possible values: `INPUT`, `OUTPUT`, `BOTH`. |
| `ai.proxy.aws-guardrails.input_processing_latency` | The time, in milliseconds, spent processing the request through the guardrail. |
| `ai.proxy.aws-guardrails.output_processing_latency` | The time, in milliseconds, spent processing the response through the guardrail. |
| `ai.proxy.aws-guardrails.input_block_reason` | The reason the request was blocked. Empty if the request was allowed. |
| `ai.proxy.aws-guardrails.output_block_reason` | The reason the response was blocked. Empty if the response was allowed. |
| `ai.proxy.aws-guardrails.input_masked` | `true` if the request content was masked rather than blocked. Only present when `config.allow_masking` is `true`. |
| `ai.proxy.aws-guardrails.output_masked` | `true` if the response content was masked rather than blocked. Only present when `config.allow_masking` is `true`. |
| `ai.proxy.aws-guardrails.input_block_source` | v3.14+ The name of the plugin that blocked the request. Empty if the request was allowed. |
| `ai.proxy.aws-guardrails.output_block_source` | v3.14+ The name of the plugin that blocked the response. Empty if the response was allowed. |
| `ai.proxy.aws-guardrails.input_block_consumer_id` | v3.14+ The ID of the consumer whose request was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.aws-guardrails.output_block_consumer_id` | v3.14+ The ID of the consumer whose response was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.aws-guardrails.guards_triggered_count` | v3.14+ A counter that increments each time a block is triggered on either the input or output within a single request. |
| `ai.proxy.aws-guardrails.input_faulty_prompt` | v3.14+ The raw request prompt that was blocked. Only present when `config.log_blocked_content` is `true`. |
| `ai.proxy.aws-guardrails.output_faulty_response` | v3.14+ The raw response that was blocked. Only present when `config.log_blocked_content` is `true`. |
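For example, an alerting pipeline might scan these fields to tell blocked traffic apart from masked traffic. The sketch below uses a hypothetical log object; the block-reason value `"HATE"` is illustrative and not taken from AWS documentation.

```python
# Sketch: inspect a hypothetical ai.proxy.aws-guardrails log object and
# summarize whether the request or response was blocked or masked.
guardrails = {
    "aws_region": "us-east-1",
    "guardrails_id": "gr-abc123",          # hypothetical identifier
    "guardrails_version": "DRAFT",
    "input_block_reason": "HATE",          # empty string when the request was allowed
    "output_block_reason": "",
    "input_masked": True,                  # only present when config.allow_masking is true
}

def summarize(g):
    """Return a list of (direction, event) pairs worth alerting on."""
    events = []
    if g.get("input_block_reason"):
        events.append(("request", g["input_block_reason"]))
    if g.get("output_block_reason"):
        events.append(("response", g["output_block_reason"]))
    if g.get("input_masked"):
        events.append(("request", "masked"))
    return events

print(summarize(guardrails))  # [('request', 'HATE'), ('request', 'masked')]
```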
If you’re using the AI GCP Model Armor plugin, AI Gateway logs include fields under the `ai.proxy.gcp-model-armor` object. These fields capture the template applied, processing latency, and the reasons for blocking when content is flagged.

| Property | Description |
|----------|-------------|
| `ai.proxy.gcp-model-armor.template_id` | The GCP Model Armor template identifier applied to the request. |
| `ai.proxy.gcp-model-armor.input_processing_latency` | The time, in milliseconds, spent processing the request through Model Armor. |
| `ai.proxy.gcp-model-armor.output_processing_latency` | The time, in milliseconds, spent processing the response through Model Armor. |
| `ai.proxy.gcp-model-armor.input_block_reason` | The check type or types that caused the request to be blocked, comma-separated (for example, `sexually_explicit, dangerous`). Empty if the request was allowed. |
| `ai.proxy.gcp-model-armor.output_block_reason` | The check type or types that caused the response to be blocked, comma-separated. Empty if the response was allowed. |
| `ai.proxy.gcp-model-armor.mode` | v3.14+ The content guarding mode configured for the plugin. Possible values: `INPUT`, `OUTPUT`, `BOTH`. |
| `ai.proxy.gcp-model-armor.input_block_source` | v3.14+ The name of the plugin that blocked the request. Empty if the request was allowed. |
| `ai.proxy.gcp-model-armor.output_block_source` | v3.14+ The name of the plugin that blocked the response. Empty if the response was allowed. |
| `ai.proxy.gcp-model-armor.input_block_consumer_id` | v3.14+ The ID of the consumer whose request was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.gcp-model-armor.output_block_consumer_id` | v3.14+ The ID of the consumer whose response was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.gcp-model-armor.guards_triggered_count` | v3.14+ A counter that increments each time a block is triggered on either the input or output within a single request. |
| `ai.proxy.gcp-model-armor.input_faulty_prompt` | v3.14+ The raw request prompt that was blocked. Only present when `config.log_blocked_content` is `true`. |
| `ai.proxy.gcp-model-armor.output_faulty_response` | v3.14+ The raw response that was blocked. Only present when `config.log_blocked_content` is `true`. |
If you’re using the AI Azure Content Safety plugin, AI Gateway writes to two separate log paths.

The first path records per-category severity data from the Azure Content Safety API. Each entry represents a category that breached its configured rejection threshold. Multiple entries can appear per request, depending on which categories were configured and what was detected. For information on categories and severity levels, see Harm categories in Azure AI Content Safety.

| Property | Description |
|----------|-------------|
| `ai.audit.azure_content_safety.<CATEGORY>` | The numeric rejection severity threshold for the category that was breached (for example, `Hate`, `Violence`). Defined by `config.categories[*].rejection_level`. Multiple entries can appear per request. |
The second path records plugin metadata and block reasons under the `ai.proxy.azure-content-safety` object:

| Property | Description |
|----------|-------------|
| `ai.proxy.azure-content-safety.azure_tenant_id` | The Azure tenant ID used for authentication. |
| `ai.proxy.azure-content-safety.azure_client_id` | The Azure client ID used for authentication. |
| `ai.proxy.azure-content-safety.azure_api_version` | The Azure Content Safety API version used for the request. |
| `ai.proxy.azure-content-safety.azure_content_safety_url` | The Azure Content Safety endpoint URL. |
| `ai.proxy.azure-content-safety.input_processing_latency` | The time, in milliseconds, spent processing the request through Azure Content Safety. |
| `ai.proxy.azure-content-safety.output_processing_latency` | The time, in milliseconds, spent processing the response through Azure Content Safety. |
| `ai.proxy.azure-content-safety.input_block_reason` | The reason the request was blocked. Empty if the request was allowed. |
| `ai.proxy.azure-content-safety.output_block_reason` | The reason the response was blocked. Empty if the response was allowed. |
| `ai.proxy.azure-content-safety.mode` | v3.14+ The content guarding mode configured for the plugin. Possible values: `INPUT`, `OUTPUT`, `BOTH`. |
| `ai.proxy.azure-content-safety.input_block_source` | v3.14+ The name of the plugin that blocked the request. Empty if the request was allowed. |
| `ai.proxy.azure-content-safety.output_block_source` | v3.14+ The name of the plugin that blocked the response. Empty if the response was allowed. |
| `ai.proxy.azure-content-safety.input_block_consumer_id` | v3.14+ The ID of the consumer whose request was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.azure-content-safety.output_block_consumer_id` | v3.14+ The ID of the consumer whose response was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.azure-content-safety.guards_triggered_count` | v3.14+ A counter that increments each time a block is triggered on either the input or output within a single request. |
| `ai.proxy.azure-content-safety.input_faulty_prompt` | v3.14+ The raw request prompt that was blocked. Only present when `config.log_blocked_content` is `true`. |
| `ai.proxy.azure-content-safety.output_faulty_response` | v3.14+ The raw response that was blocked. Only present when `config.log_blocked_content` is `true`. |
If you’re using the AI Lakera Guard plugin, AI Gateway logs include additional fields under the `ai.proxy.lakera-guard` object. These fields capture processing latency, Lakera-assigned request UUIDs, block reasons, and violation details when requests or responses are blocked.

| Property | Description |
|----------|-------------|
| `ai.proxy.lakera-guard.lakera_service_url` | The Lakera API endpoint used for inspection (for example, `https://api.lakera.ai/v2/guard`). |
| `ai.proxy.lakera-guard.lakera_project_id` | The Lakera project identifier used for the inspection. Defaults to `default` if no project ID is configured. |
| `ai.proxy.lakera-guard.mode` | v3.14+ The content guarding mode configured for the plugin. Possible values: `INPUT`, `OUTPUT`, `BOTH`. |
| `ai.proxy.lakera-guard.input_processing_latency` | The time, in milliseconds, that Lakera took to process the request. |
| `ai.proxy.lakera-guard.output_processing_latency` | The time, in milliseconds, that Lakera took to process the response. |
| `ai.proxy.lakera-guard.input_request_uuid` | The unique identifier assigned by Lakera to the inspected request. |
| `ai.proxy.lakera-guard.output_request_uuid` | The unique identifier assigned by Lakera to the inspected response. |
| `ai.proxy.lakera-guard.input_block_reason` | The detector type that caused Lakera to block the request. Empty if the request was allowed. |
| `ai.proxy.lakera-guard.output_block_reason` | The detector type that caused Lakera to block the response. Empty if the response was allowed. |
| `ai.proxy.lakera-guard.input_block_detail` | An array of violation objects, present when Lakera blocks a request. Each object includes `policy_id`, `detector_id`, `project_id`, `message_id`, `detected` (boolean), and `detector_type` (for example, `moderated_content/hate`). |
| `ai.proxy.lakera-guard.output_block_detail` | An array of violation objects, present when Lakera blocks a response. The structure matches `input_block_detail`. |
| `ai.proxy.lakera-guard.input_block_source` | v3.14+ The name of the plugin that blocked the request. Empty if the request was allowed. |
| `ai.proxy.lakera-guard.output_block_source` | v3.14+ The name of the plugin that blocked the response. Empty if the response was allowed. |
| `ai.proxy.lakera-guard.input_block_consumer_id` | v3.14+ The ID of the consumer whose request was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.lakera-guard.output_block_consumer_id` | v3.14+ The ID of the consumer whose response was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.proxy.lakera-guard.guards_triggered_count` | v3.14+ A counter that increments each time a block is triggered on either the input or output within a single request. |
| `ai.proxy.lakera-guard.input_faulty_prompt` | v3.14+ The raw request prompt that was blocked. Only present when `config.log_blocked_content` is `true`. |
| `ai.proxy.lakera-guard.output_faulty_response` | v3.14+ The raw response that was blocked. Only present when `config.log_blocked_content` is `true`. |
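To show how the violation array might be consumed, the sketch below extracts the detector types that actually fired from a hypothetical `input_block_detail` value. Only the object keys come from the field description above; the values are made up.

```python
# Hypothetical ai.proxy.lakera-guard.input_block_detail array
# (object keys per the field description above; values are illustrative).
input_block_detail = [
    {"policy_id": "pol-1", "detector_id": "det-1", "project_id": "default",
     "message_id": 0, "detected": True, "detector_type": "moderated_content/hate"},
    {"policy_id": "pol-1", "detector_id": "det-2", "project_id": "default",
     "message_id": 0, "detected": False, "detector_type": "prompt_attack"},
]

# Keep only the detectors that actually fired.
triggered = [v["detector_type"] for v in input_block_detail if v["detected"]]
print(triggered)  # ['moderated_content/hate']
```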
If you’re using the AI Custom Guardrail plugin, AI Gateway logs include additional fields under the `ai.proxy.custom-guardrail` object. These fields record guardrail processing latency, block reasons, and the source and consumer identity associated with any triggered guards.

| Property | Description |
|----------|-------------|
| `ai.proxy.custom-guardrail.mode` | The inspection mode configured for the guardrail. For example, `BOTH` means both input and output are inspected. |
| `ai.proxy.custom-guardrail.input_processing_latency` | The time, in milliseconds, taken to process the input through the guardrail. |
| `ai.proxy.custom-guardrail.output_processing_latency` | The time, in milliseconds, taken to process the output through the guardrail. |
| `ai.proxy.custom-guardrail.input_block_reason` | The reason the input was blocked. Empty if the input was not blocked. |
| `ai.proxy.custom-guardrail.output_block_reason` | The reason the output was blocked. Empty if the output was not blocked. |
| `ai.proxy.custom-guardrail.input_block_source` | The source that triggered the input block (for example, `ai-custom-guardrail`). Empty if the input was not blocked. |
| `ai.proxy.custom-guardrail.output_block_source` | The source that triggered the output block. Empty if the output was not blocked. |
| `ai.proxy.custom-guardrail.input_block_consumer_id` | The consumer ID associated with the blocked input request. Set to `unknown` if the consumer can’t be identified. |
| `ai.proxy.custom-guardrail.output_block_consumer_id` | The consumer ID associated with the blocked output response. Empty if the output was not blocked. |
| `ai.proxy.custom-guardrail.guards_triggered_count` | The number of individual guard rules that were triggered during the request. |
The plugin also allows you to define custom metrics based on Lua expressions.
If you’re using the AI PII Sanitizer plugin, AI Gateway logs include additional fields that provide insight into the detection and redaction of personally identifiable information (PII). These fields track the number of entities identified and sanitized, the time taken to process the payload, and detailed metadata about each sanitized item, including the original value, redacted value, and detected entity type.

| Property | Description |
|----------|-------------|
| `ai.sanitizer.pii_identified` | The number of PII entities detected in the input payload. |
| `ai.sanitizer.pii_sanitized` | The number of PII entities that were anonymized or redacted. |
| `ai.sanitizer.duration` | The time, in milliseconds, taken by the `ai-pii-service` container to process the payload. |
| `ai.sanitizer.sanitized_items` | A list of sanitized PII entities, each including the original text, redacted text, and entity type. |
| `ai.sanitizer.input_block_source` | v3.14+ The name of the plugin that blocked the request. Empty if the request was allowed. |
| `ai.sanitizer.output_block_source` | v3.14+ The name of the plugin that blocked the response. Empty if the response was allowed. |
| `ai.sanitizer.input_block_consumer_id` | v3.14+ The ID of the consumer whose request was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.sanitizer.output_block_consumer_id` | v3.14+ The ID of the consumer whose response was blocked, or `unknown` if no consumer identity was resolved. |
| `ai.sanitizer.guards_triggered_count` | v3.14+ A counter that increments each time a block is triggered on either the input or output within a single request. |
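A downstream consumer might group sanitized items by entity type, for example to report which kinds of PII appear most often. The sketch below is hypothetical: the per-item keys (`text`, `redacted`, `type`) and the entity-type labels are assumptions for illustration, not the documented item schema.

```python
# Hypothetical ai.sanitizer log fragment. The per-item keys ("text",
# "redacted", "type") are assumed for illustration only.
sanitizer = {
    "pii_identified": 2,
    "pii_sanitized": 2,
    "duration": 14,  # milliseconds spent in the ai-pii-service container
    "sanitized_items": [
        {"text": "john@example.com", "redacted": "[EMAIL]", "type": "EMAIL"},
        {"text": "555-0100", "redacted": "[PHONE]", "type": "PHONE_NUMBER"},
    ],
}

# Count sanitized items per detected entity type.
by_type = {}
for item in sanitizer["sanitized_items"]:
    by_type[item["type"]] = by_type.get(item["type"], 0) + 1

# Sanity check: the counter should match the item list length.
assert sanitizer["pii_sanitized"] == len(sanitizer["sanitized_items"])
print(by_type)  # {'EMAIL': 1, 'PHONE_NUMBER': 1}
```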
When the AI Prompt Compressor plugin is enabled, additional logs record token counts before and after compression, compression ratios, and metadata about the compression method and model used.

| Property | Description |
|----------|-------------|
| `ai.compressor.original_token_count` | The original number of tokens before compression. |
| `ai.compressor.compress_token_count` | The number of tokens after compression. |
| `ai.compressor.save_token_count` | The number of tokens saved by compression (original minus compressed). |
| `ai.compressor.compress_value` | The compression ratio applied. |
| `ai.compressor.compress_type` | The type or method of compression used. |
| `ai.compressor.compressor_model` | The model used to perform the compression. |
| `ai.compressor.msg_id` | The identifier of the message that was compressed. |
| `ai.compressor.information` | A summary message describing the result of the compression. |
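The token counters are related by simple arithmetic, which a log consumer can use as a sanity check. The entry below is hypothetical, and the interpretation of the ratio as compressed-over-original tokens is an assumption; the table above does not pin down the exact semantics of `compress_value`.

```python
# Hypothetical ai.compressor log fragment.
compressor = {
    "original_token_count": 1000,
    "compress_token_count": 600,
    "save_token_count": 400,
}

# save_token_count should equal original minus compressed.
assert compressor["save_token_count"] == (
    compressor["original_token_count"] - compressor["compress_token_count"]
)

# One plausible reading of the compression ratio: compressed / original.
# (Assumption for illustration; check compress_value in your own logs.)
ratio = compressor["compress_token_count"] / compressor["original_token_count"]
print(ratio)  # 0.6
```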
If you’re using the AI RAG Injector plugin, AI Gateway logs include additional fields that provide detailed information about the retrieval-augmented generation process. These fields track the vector database used, whether relevant context was injected into the prompt, the latency of data fetching, and embedding metadata such as the tokens consumed and the embedding provider and model.

| Property | Description |
|----------|-------------|
| `ai.proxy.rag-inject.vector_db` | The vector database used (for example, `pgvector`). |
| `ai.proxy.rag-inject.injected` | A boolean indicating whether RAG injection occurred. |
| `ai.proxy.rag-inject.fetch_latency` | The fetch latency, in milliseconds. |
| `ai.proxy.rag-inject.chunk_ids` | The list of chunk IDs retrieved. |
| `ai.proxy.rag-inject.embeddings_latency` | The time taken to generate embeddings, in milliseconds. |
| `ai.proxy.rag-inject.embeddings_tokens` | The number of tokens used for embeddings. |
| `ai.proxy.rag-inject.embeddings_provider` | The provider used to generate embeddings. |
| `ai.proxy.rag-inject.embeddings_model` | The model used to generate embeddings. |
If you’re using the AI Semantic Cache plugin, AI Gateway logs include additional fields under the `cache` object for each plugin entry. These fields provide insight into cache behavior, such as whether a response was served from cache, how long it took to fetch, and which embedding provider and model were used, if applicable.

| Property | Description |
|----------|-------------|
| `ai.proxy.cache.cache_status` | v3.8+ The cache status. Can be `Hit`, `Miss`, `Bypass`, or `Refresh`. |
| `ai.proxy.cache.fetch_latency` | The time, in milliseconds, it took to return a cached response. |
| `ai.proxy.cache.embeddings_provider` | The provider used to generate the embeddings. |
| `ai.proxy.cache.embeddings_model` | The model used to generate the embeddings. |
| `ai.proxy.cache.embeddings_latency` | The time taken to generate the embeddings. |

Note: When returning a cached response, `time_per_token` and `llm_latency` are omitted.

A cached response can come from either the semantic cache or the exact cache. If it comes from the semantic cache, the log includes additional details such as the embeddings provider, embeddings model, and embeddings latency.
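Because the embedding fields appear only for semantic-cache responses, a log consumer can use their presence to tell the two cache types apart. The entry shapes below are hypothetical sketches built from the fields above.

```python
# Sketch: classify a hypothetical ai.proxy.cache entry. Embedding fields
# are present only when the response came from the semantic cache.
def cache_kind(cache):
    if cache.get("cache_status") != "Hit":
        return None  # nothing was served from cache
    return "semantic" if "embeddings_model" in cache else "exact"

semantic_hit = {
    "cache_status": "Hit",
    "fetch_latency": 3,
    "embeddings_provider": "openai",
    "embeddings_model": "text-embedding-3-small",  # illustrative model name
    "embeddings_latency": 42,
}
exact_hit = {"cache_status": "Hit", "fetch_latency": 1}

print(cache_kind(semantic_hit), cache_kind(exact_hit))  # semantic exact
```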
If you’re using the AI LLM as Judge plugin, AI Gateway logs include additional fields under the `ai.proxy.ai-llm-as-judge` object. These fields provide insight into evaluation behavior, such as which models were scored, evaluation latency, and the numeric accuracy score assigned by the judge.

| Property | Description |
|----------|-------------|
| `ai.proxy.ai-llm-as-judge.meta.llm_latency` | The time, in milliseconds, that the judge model took to return a score. |
| `ai.proxy.ai-llm-as-judge.meta.request_model` | The candidate model being evaluated by the judge. |
| `ai.proxy.ai-llm-as-judge.meta.response_model` | The model used as the judge (for example, `gpt-4o`). |
| `ai.proxy.ai-llm-as-judge.meta.provider_name` | The provider of the judge model (for example, `openai`). |
| `ai.proxy.ai-llm-as-judge.meta.request_mode` | The mode used for evaluation (for example, `oneshot`). |
| `ai.proxy.ai-llm-as-judge.usage.llm_accuracy` | The numeric accuracy score (1–100) returned by the judge model. |
If you’re using the AI MCP plugin, AI Gateway logs include additional fields under the `ai.mcp` object. These fields provide insight into Model Context Protocol (MCP) traffic, including session IDs, JSON-RPC request/response payloads, latency, tool usage, and v3.13+ access control audit entries.

Note: Unlike the other AI plugins, the AI MCP plugin is not invoked as part of an AI request. Instead, it is registered and executed as a regular plugin, allowing it to capture MCP traffic independently of the AI request flow. Do not configure the AI MCP plugin together with other `ai-*` plugins on the same service or route.

The MCP log structure groups traffic by MCP session ID, with each session containing zero or more recorded JSON-RPC requests:

| Property | Description |
|----------|-------------|
| `ai.mcp.mcp_session_id` | The ID of the MCP session. A session can contain multiple requests. |
| `ai.mcp.rpc` | An array of recorded JSON-RPC requests. Only JSON-RPC traffic is logged. |
| `ai.mcp.rpc[].id` | The ID of the JSON-RPC request. Not all JSON-RPC requests have an ID. |
| `ai.mcp.rpc[].latency` | The latency of the JSON-RPC request, in milliseconds. |
| `ai.mcp.rpc[].payload.request` | The request payload of the JSON-RPC request, serialized as a JSON string. |
| `ai.mcp.rpc[].payload.response` | The response payload of the JSON-RPC request, serialized as a JSON string. |
| `ai.mcp.rpc[].method` | The JSON-RPC method name. |
| `ai.mcp.rpc[].tool_name` | If the method is a tool call, the name of the tool being invoked. |
| `ai.mcp.rpc[].error` | The error message if an error occurred during the request. |
| `ai.mcp.rpc[].response_body_size` | The size of the JSON-RPC response body, in bytes. |
| `ai.mcp.audit` | v3.13+ An array of access control audit entries. Each entry records whether access was allowed or denied for a specific MCP primitive or globally. |
| `ai.mcp.audit[].primitive_name` | v3.13+ The name of the MCP primitive (for example, `list_users`). |
| `ai.mcp.audit[].primitive` | v3.13+ The type of MCP primitive (for example, `tool`, `resource`, or `prompt`). |
| `ai.mcp.audit[].action` | v3.13+ The access control decision: `allow` or `deny`. |
| `ai.mcp.audit[].consumer.name` | v3.13+ The name of the consumer making the request. |
| `ai.mcp.audit[].consumer.id` | v3.13+ The UUID of the consumer. |
| `ai.mcp.audit[].consumer.identifier` | v3.13+ The type of consumer identifier (for example, `consumer_group`). |
| `ai.mcp.audit[].scope` | v3.13+ The scope of the access control check. |
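Since the log groups JSON-RPC requests by session, a consumer can aggregate per-session metrics such as total tool-call latency. The sketch below uses a hypothetical `ai.mcp` entry; the method and tool names are illustrative.

```python
# Hypothetical ai.mcp log entry with three recorded JSON-RPC requests.
mcp = {
    "mcp_session_id": "sess-1",
    "rpc": [
        {"id": 1, "method": "initialize", "latency": 5},
        {"id": 2, "method": "tools/call", "tool_name": "list_users", "latency": 120},
        {"id": 3, "method": "tools/call", "tool_name": "get_user", "latency": 80},
    ],
}

# Total latency of tool invocations within the session: tool_name is only
# present on entries that are tool calls.
tool_latency = sum(r["latency"] for r in mcp["rpc"] if r.get("tool_name"))
print(tool_latency)  # 200
```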
If you’re using the AI A2A Proxy plugin, AI Gateway logs include additional fields under the `ai.a2a` object when `config.logging.log_statistics` is enabled. These fields provide observability into Agent-to-Agent (A2A) protocol traffic, including operation names, task lifecycle state, latency, streaming metrics, and optional request/response payloads.

The plugin writes the following fields to the `ai.a2a.rpc[]` array:

| Field | Type | Description |
|-------|------|-------------|
| `ai.a2a.rpc[].method` | string | The A2A operation name. |
| `ai.a2a.rpc[].binding` | string | The protocol binding: `jsonrpc` or `rest`. |
| `ai.a2a.rpc[].latency` | number | The end-to-end proxy latency, in milliseconds. |
| `ai.a2a.rpc[].id` | string | The request ID (JSON-RPC) or task ID (REST). |
| `ai.a2a.rpc[].task_id` | string | The task ID extracted from the response. |
| `ai.a2a.rpc[].task_state` | string | The normalized task state (see task states). |
| `ai.a2a.rpc[].context_id` | string | The A2A context ID extracted from the response. |
| `ai.a2a.rpc[].error` | string | The error type string when the upstream returned an error. |
| `ai.a2a.rpc[].response_body_size` | number | The response body size, in bytes. |
| `ai.a2a.rpc[].streaming` | boolean | `true` for SSE streaming responses. |
| `ai.a2a.rpc[].ttfb_latency` | number | The time to first byte, in milliseconds (streaming only). |
| `ai.a2a.rpc[].sse_events_count` | number | The number of SSE `data:` events received (streaming only). |
| `ai.a2a.rpc[].payload.request` | string | The request body. Only present when `log_payloads` is enabled. |
| `ai.a2a.rpc[].payload.response` | string | The response body. Only present when `log_payloads` is enabled. |
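For example, the streaming-only fields can be used to report time-to-first-byte across SSE responses. The sketch below uses a hypothetical `ai.a2a.rpc[]` array; the operation names are illustrative, not a statement of the plugin's supported operations.

```python
# Hypothetical ai.a2a.rpc[] array with one streaming and one non-streaming call.
a2a_rpc = [
    {"method": "message/stream", "binding": "jsonrpc", "latency": 900,
     "streaming": True, "ttfb_latency": 150, "sse_events_count": 12},
    {"method": "message/send", "binding": "jsonrpc", "latency": 300,
     "streaming": False},
]

# ttfb_latency and sse_events_count only exist on streaming entries,
# so filter on the streaming flag before aggregating.
streams = [r for r in a2a_rpc if r.get("streaming")]
avg_ttfb = sum(r["ttfb_latency"] for r in streams) / len(streams)
print(len(streams), avg_ttfb)  # 1 150.0
```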