AI Gateway providers
The core of AI Gateway is the ability to route AI requests to various providers exposed via a provider-agnostic API. This normalized API layer affords developers and organizations multiple benefits:
- Client applications are shielded from AI provider API specifics, promoting code reusability
- Centralized AI provider credential management
- Centralized governance and observability over AI data and usage
- Request routing can be dynamic, allowing AI usage to be optimized based on various metrics
- AI services can be used by other Kong Gateway plugins to augment non-AI API traffic
Note that some providers may not be available depending on your Kong Gateway version, and some providers don’t support all route types. See the specific provider documentation for more details.
Frequently Asked Questions
Can I authenticate to Azure AI with Azure Identity?
Yes, if Kong Gateway is running on Azure, AI Proxy can detect the designated Managed Identity or User-Assigned Identity of that Azure Compute resource, and use it accordingly. In your AI Proxy configuration, set the following parameters:
- Set `config.auth.azure_use_managed_identity` to `true` to use an Azure-Assigned Managed Identity.
- Set `config.auth.azure_use_managed_identity` to `true` and `config.auth.azure_client_id` to use a User-Assigned Identity.
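As a minimal sketch of the two modes, here are the corresponding `config.auth` fragments expressed as Python dicts (the client ID shown is a placeholder, and the surrounding plugin configuration is omitted):

```python
# System-assigned (Azure-Assigned) Managed Identity:
system_assigned_auth = {
    "azure_use_managed_identity": True,
}

# User-Assigned Identity: also supply that identity's client ID.
# The client ID below is a hypothetical placeholder.
user_assigned_auth = {
    "azure_use_managed_identity": True,
    "azure_client_id": "11111111-2222-3333-4444-555555555555",
}
```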
How can I set model generation parameters when calling Gemini?
You have several options, depending on the SDK and configuration:
- Use the Gemini SDK:
  - Set `llm_format` to `gemini`.
  - Use the Gemini provider.
  - Configure parameters like `temperature`, `top_p`, and `top_k` on the client side:

    ```python
    model = genai.GenerativeModel(
        'gemini-1.5-flash',
        generation_config=genai.types.GenerationConfig(
            temperature=0.7,
            top_p=0.9,
            top_k=40,
            max_output_tokens=1024
        )
    )
    ```

- Use the OpenAI SDK with the Gemini provider:
  - Set `llm_format` to `openai`.
  - Configure parameters in one of three ways:
    - In the plugin only.
    - In the client only.
    - In both; client-side values override the plugin configuration.
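For the plugin-side option, the same generation parameters can be set in the AI Proxy model configuration. A minimal sketch, assuming the plugin's `model.options` fields (verify the exact field names against your Kong Gateway version):

```python
# Illustrative AI Proxy plugin config fragment as a Python dict.
# Field names are assumptions based on AI Proxy's model options.
plugin_config = {
    "route_type": "llm/v1/chat",
    "model": {
        "provider": "gemini",
        "name": "gemini-1.5-flash",
        "options": {
            "temperature": 0.7,
            "top_p": 0.9,
            "top_k": 40,
        },
    },
}
```

If the client also sends `temperature` or `top_p`, those client-side values take precedence over the plugin values above.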
How do I use Gemini’s googleSearch tool for real-time web searches?
Configure AI Proxy Advanced with the Gemini provider and declare the googleSearch tool in your requests. See Use Gemini’s googleSearch tool with AI Proxy Advanced.
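As a hedged sketch, a chat request declaring the `googleSearch` tool in Gemini's native format might look like the payload below; the tool name comes from the Gemini API, while the prompt and exact request shape (which depends on your `llm_format`) are illustrative:

```python
# Hypothetical Gemini-format chat request enabling real-time web grounding.
payload = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize today's top tech headlines."}]}
    ],
    # Declaring the googleSearch tool lets the model ground answers in web results.
    "tools": [{"googleSearch": {}}],
}
```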
How do I control aspect ratio and resolution for Gemini image generation?
Pass imageConfig parameters via generationConfig in your image generation requests. See Use Gemini’s imageConfig with AI Proxy.
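For illustration, an image generation request nesting `imageConfig` under `generationConfig` could look like the following; the `aspectRatio` field name follows the Gemini API, and the prompt and value are placeholders:

```python
# Hypothetical image generation request with imageConfig under generationConfig.
payload = {
    "contents": [
        {"role": "user", "parts": [{"text": "A watercolor painting of a lighthouse"}]}
    ],
    "generationConfig": {
        "imageConfig": {
            "aspectRatio": "16:9",  # illustrative aspect ratio value
        }
    },
}
```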
How do I get reasoning traces from Gemini models?
Pass thinkingConfig parameters via extra_body in your requests to enable detailed reasoning traces. See Use Gemini’s thinkingConfig with AI Proxy Advanced.
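A minimal sketch of such a request in OpenAI format, with `thinkingConfig` passed through `extra_body` (the model name, prompt, and budget value are illustrative placeholders):

```python
# Hypothetical OpenAI-format request forwarding Gemini thinkingConfig via extra_body.
payload = {
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    "extra_body": {
        "thinkingConfig": {
            "thinkingBudget": 1024,     # illustrative token budget for reasoning
            "includeThoughts": True,    # return the reasoning trace in the response
        }
    },
}
```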
How do I specify model IDs for Amazon Bedrock cross-region inference profiles?
For cross-region inference, prefix the model ID with a geographic identifier:
{geography-prefix}.{provider}.{model-name}...
For example: us.anthropic.claude-sonnet-4-5-20250929-v1:0
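The pattern above can be expressed as a small helper (a hypothetical convenience function, not part of any SDK) that assembles the profile ID from its parts:

```python
# Compose a cross-region inference profile ID following the
# {geography-prefix}.{provider}.{model-name} pattern.
def cross_region_model_id(prefix: str, provider: str, model_name: str) -> str:
    return f"{prefix}.{provider}.{model_name}"

model_id = cross_region_model_id("us", "anthropic", "claude-sonnet-4-5-20250929-v1:0")
# model_id == "us.anthropic.claude-sonnet-4-5-20250929-v1:0"
```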
| Prefix | Geography |
|---|---|
| `us.` | United States |
| `eu.` | European Union |
| `apac.` | Asia-Pacific |
| `global.` | All commercial regions |
For a full list of supported cross-region inference profiles, see Supported Regions and models for inference profiles in the AWS documentation.
How do I set the FPS parameter for video generation for Amazon Bedrock?
Use the extra_body feature when sending requests in OpenAI format:
curl http://localhost:8000 \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "amazon.nova-reel-v1:0",
"prompt": "A large red square that is rotating",
"extra_body": {
"fps": 24
}
}'
How do I use Amazon Bedrock’s Rerank API to improve RAG retrieval quality?
Configure AI Proxy with the bedrock provider and set up AWS authentication using IAM credentials or assumed roles. See Use AWS Bedrock rerank API with AI Proxy.
How do I include guardrail configuration with Amazon Bedrock requests?
Add a guardrailConfig object to your request body:
{
"messages": [
{
"role": "system",
"content": "You are a scientist."
},
{
"role": "user",
"content": "What is the Boltzmann equation?"
}
],
"guardrailConfig": {
"guardrailIdentifier": "$GUARDRAIL-IDENTIFIER",
"guardrailVersion": "1",
"trace": "enabled"
}
}
This feature requires Kong Gateway 3.9 or later. For more details, see Guardrails and content safety and the AWS Bedrock guardrails documentation.
How do I use Cohere’s document-grounded chat for RAG pipelines?
Configure AI Proxy with the Cohere provider and send queries with candidate documents. The model filters for relevance and returns answers with citations. See Use Cohere rerank API for document-grounded chat.
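As an illustrative sketch, such a request pairs the user's query with candidate documents the model may cite; the field names follow Cohere's chat API, and the documents themselves are placeholders:

```python
# Hypothetical document-grounded chat request in Cohere's format.
payload = {
    "message": "How long is the return window?",
    "documents": [
        {"title": "Returns policy", "snippet": "Items may be returned within 30 days."},
        {"title": "Shipping FAQ", "snippet": "Orders ship within 2 business days."},
    ],
}
```

The model filters these documents for relevance and attaches citations to the passages it used.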