Minimum version: Kong Gateway 3.6

You can proxy requests to Azure AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.

Upstream paths

AI Gateway automatically routes requests to the appropriate Azure API endpoints. The following table shows the upstream paths used for each capability.

| Capability | Upstream path or API |
| --- | --- |
| Chat completions | /openai/deployments/{deployment_name}/chat/completions |
| Completions | /openai/deployments/{deployment_name}/completions |
| Embeddings | /openai/deployments/{deployment_name}/embeddings |
| Function calling | /openai/deployments/{deployment_name}/chat/completions |
| Files | /openai/files |
| Batches | /openai/batches |
| Assistants | /openai/assistants |
| Responses | /openai/v1/responses |
| Speech | /openai/audio/speech |
| Transcriptions | /openai/audio/transcriptions |
| Translations | /openai/audio/translations |
| Image generations | /openai/images/generations |
| Image edits | /openai/images/edits |
| Video generations | /openai/v1/video/generations/jobs |
| Realtime | /openai/realtime |
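
The table above can be read as a lookup from capability to path template. The following is a minimal sketch of that lookup in Python; the names `UPSTREAM_PATHS` and `upstream_path` are hypothetical helpers for illustration and are not part of AI Gateway, which performs this routing itself.

```python
# Path templates copied from the table above.
# {deployment_name} is only present for deployment-scoped capabilities.
UPSTREAM_PATHS = {
    "chat completions": "/openai/deployments/{deployment_name}/chat/completions",
    "completions": "/openai/deployments/{deployment_name}/completions",
    "embeddings": "/openai/deployments/{deployment_name}/embeddings",
    "function calling": "/openai/deployments/{deployment_name}/chat/completions",
    "files": "/openai/files",
    "batches": "/openai/batches",
    "assistants": "/openai/assistants",
    "responses": "/openai/v1/responses",
    "speech": "/openai/audio/speech",
    "transcriptions": "/openai/audio/transcriptions",
    "translations": "/openai/audio/translations",
    "image generations": "/openai/images/generations",
    "image edits": "/openai/images/edits",
    "video generations": "/openai/v1/video/generations/jobs",
    "realtime": "/openai/realtime",
}


def upstream_path(capability: str, deployment_name: str = "") -> str:
    """Return the upstream path for a capability (hypothetical helper)."""
    template = UPSTREAM_PATHS[capability.lower()]
    if "{deployment_name}" in template and not deployment_name:
        raise ValueError(f"{capability!r} requires a deployment name")
    # str.format ignores the keyword when the template has no placeholder.
    return template.format(deployment_name=deployment_name)
```

For example, `upstream_path("Chat completions", "my-gpt4o")` yields the deployment-scoped chat path, while `upstream_path("Files")` needs no deployment name. The deployment name `my-gpt4o` is a placeholder.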

Supported capabilities

The following tables show the AI capabilities supported by the Azure provider when used with the AI Proxy or AI Proxy Advanced plugin.

Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.

Text generation

Support for Azure basic text generation capabilities including chat, completions, and embeddings:

| Capability | Route type | Streaming | Model example | Min version |
| --- | --- | --- | --- | --- |
| Chat completions | llm/v1/chat | | gpt-4o | 3.6 |
| Completions | llm/v1/completions | | gpt-4o-mini | 3.6 |
| Embeddings [1] | llm/v1/embeddings | | text-embedding-3-small | 3.11 |

[1] Use text-embedding-3-small or text-embedding-3-large for dynamic dimensions.

Advanced text generation

Support for Azure function calling to allow Azure models to use external tools and APIs:

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Function calling | llm/v1/chat | gpt-4o | 3.6 |

Processing

Support for Azure file operations, batch operations, assistants, and response handling:

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Files | llm/v1/files | n/a | 3.11 |
| Batches | llm/v1/batches | n/a | 3.11 |
| Assistants [2] | llm/v1/assistants | n/a | 3.11 |
| Responses [3] | llm/v1/responses | n/a | 3.11 |

[2] The Assistants API requires the header OpenAI-Beta: assistants=v2.

[3] The Responses API requires config.azure_api_version to be set to "preview".

Audio

Support for Azure text-to-speech, transcription, and translation capabilities:

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Speech | audio/v1/audio/speech | n/a | 3.11 |
| Transcriptions | audio/v1/audio/transcriptions | n/a | 3.11 |
| Translations | audio/v1/audio/translations | n/a | 3.11 |

For requests with large payloads, consider increasing config.max_request_body_size to three times the raw binary size.
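
As a worked example of the sizing guideline above, the sketch below multiplies the raw payload size by three. The helper name is hypothetical, and it assumes `config.max_request_body_size` is expressed in bytes; check the plugin reference for the unit your version expects.

```python
def suggested_max_request_body_size(raw_bytes: int, factor: int = 3) -> int:
    """Apply the 'three times the raw binary size' guideline (hypothetical helper).

    Assumes the plugin option is expressed in bytes.
    """
    if raw_bytes < 0:
        raise ValueError("raw_bytes must be non-negative")
    return raw_bytes * factor


# A 5 MiB audio file suggests a limit of 15 MiB.
five_mib = 5 * 1024 * 1024
print(suggested_max_request_body_size(five_mib))  # 15728640
```

The same calculation applies to the image and video routes, which carry the same guideline.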

Supported audio formats, voices, and parameters vary by model. Refer to your provider’s documentation for available options.

Image

Support for Azure image generation and editing capabilities:

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Generations | image/v1/images/generations | n/a | 3.11 |
| Edits | image/v1/images/edits | n/a | 3.11 |

For requests with large payloads, consider increasing config.max_request_body_size to three times the raw binary size.

Supported image sizes and formats vary by model. Refer to your provider’s documentation for allowed dimensions and requirements.

Video

Support for Azure video generation capabilities:

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Generations | video/v1/videos/generations | sora-2 | 3.13 |

For requests with large payloads (video generation), consider increasing config.max_request_body_size to three times the raw binary size.

Realtime

Support for Azure’s bidirectional streaming for realtime applications:

Realtime processing requires the AI Proxy Advanced plugin and uses the WebSocket protocol.

To use the realtime route, you must configure the ws and/or wss protocols on both the Service and the Route that the plugin is associated with.

| Capability | Route type | Model example | Min version |
| --- | --- | --- | --- |
| Realtime [4] | realtime/v1/realtime | n/a | 3.11 |

[4] For requests to the Azure OpenAI realtime API, include the header OpenAI-Beta: realtime=v1.
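
Two of the routes documented on this page need an extra `OpenAI-Beta` request header: Assistants (footnote 2) and Realtime (footnote 4). A client could keep track of this with a small lookup such as the sketch below. The names `BETA_HEADERS` and `required_headers` are hypothetical and only illustrate the footnotes; they are not part of any Kong or Azure SDK.

```python
# Header values taken from footnotes 2 and 4; keys are the route types
# from the capability tables.
BETA_HEADERS = {
    "llm/v1/assistants": {"OpenAI-Beta": "assistants=v2"},
    "realtime/v1/realtime": {"OpenAI-Beta": "realtime=v1"},
}


def required_headers(route_type: str) -> dict:
    """Return extra headers a client should send for a route type.

    Route types with no documented requirement yield an empty dict.
    """
    # Copy so callers can add their own headers without mutating the table.
    return dict(BETA_HEADERS.get(route_type, {}))
```

A client would merge the result into its own headers before calling the gateway, for example `headers.update(required_headers("llm/v1/assistants"))`.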

Azure base URL

The base URL is https://{azure_instance}.openai.azure.com:443/openai/deployments/{deployment_name}/{route_type_path}, where {route_type_path} is determined by the capability.

AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Azure-compatible endpoint, in which case set the upstream_url plugin option.
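
To make the URL pattern concrete, here is a minimal sketch that fills in the template and falls back to an explicit override when one is given. The function name and the instance and deployment values are hypothetical; AI Gateway builds this URL internally, so the code only illustrates the rule described above.

```python
from typing import Optional


def azure_base_url(
    azure_instance: str,
    deployment_name: str,
    route_type_path: str,
    upstream_url: Optional[str] = None,
) -> str:
    """Illustrate the default Azure URL and the upstream_url override."""
    if upstream_url:
        # Self-hosted or Azure-compatible endpoint: the override is used as-is.
        return upstream_url
    return (
        f"https://{azure_instance}.openai.azure.com:443"
        f"/openai/deployments/{deployment_name}/{route_type_path}"
    )


print(azure_base_url("my-instance", "my-gpt4o", "chat/completions"))
```

With the placeholder values above, the default branch produces `https://my-instance.openai.azure.com:443/openai/deployments/my-gpt4o/chat/completions`.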

Configure Azure with AI Proxy

To use Azure with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.

Here’s a minimal configuration for chat completions:
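
As a hedged sketch, such a configuration might look like the following, written as a Python dict that serializes to a JSON plugin definition. The route type, provider, and model name come from the tables on this page. The plugin name `ai-proxy`, the `auth` fields, and the `azure_instance` and `azure_deployment_id` options are assumptions based on the AI Proxy plugin schema, and the instance, deployment, and key values are placeholders; verify all of them against the plugin reference for your Kong Gateway version.

```python
import json

# Assumed field names; placeholder values. Verify against the plugin reference.
plugin = {
    "name": "ai-proxy",
    "config": {
        "route_type": "llm/v1/chat",  # chat completions route type
        "auth": {
            # Azure OpenAI expects the key in the api-key header.
            "header_name": "api-key",
            "header_value": "<AZURE_API_KEY>",
        },
        "model": {
            "provider": "azure",
            "name": "gpt-4o",
            "options": {
                "azure_instance": "my-instance",      # placeholder
                "azure_deployment_id": "my-gpt4o",    # placeholder
            },
        },
    },
}

# JSON body that could be submitted when creating the plugin.
print(json.dumps(plugin, indent=2))
```

Keeping the definition as data makes it easy to switch `route_type` to another value from the capability tables without touching the rest of the structure.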

For more configuration options and examples, see:

FAQs

Yes, if Kong Gateway is running on Azure, AI Proxy can detect the designated Managed Identity or User-Assigned Identity of that Azure Compute resource, and use it accordingly. In your AI Proxy configuration, set the following parameters:
