You can proxy requests to Gemini AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.
Gemini provider
Upstream paths
AI Gateway automatically routes requests to the appropriate Gemini API endpoints. The following table shows the upstream paths used for each capability.
| Capability | Upstream path or API |
|---|---|
| Chat completions | Uses the `generateContent` API |
| Embeddings | Uses the `batchEmbedContents` API |
| Function calling | Uses the `generateContent` API with function declarations |
| Files | Uses the `uploadFile` and `files` APIs |
| Batches | Uses the `batches` API |
| Image generations | Uses the `generateContent` API |
| Image edits | Uses the `generateContent` API |
| Video generations | Uses the `predictLongRunning` API |
| Realtime | Uses the `BidiGenerateContent` API |
Supported capabilities
The following tables show the AI capabilities supported by the Gemini provider when used with the AI Proxy or AI Proxy Advanced plugin.
Set the plugin’s `route_type` based on the capability you want to use. See the tables below for supported route types.
Text generation
Support for Gemini basic text generation capabilities, including chat completions and embeddings:
| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | `llm/v1/chat` | ✅ | `gemini-2.5-flash` | 3.8 |
| Embeddings | `llm/v1/embeddings` | ❌ | `text-embedding-004` | 3.11 |
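Because `llm/v1/chat` accepts OpenAI-compatible requests by default, you can point an OpenAI client at the gateway. A minimal sketch, assuming a gateway on localhost with a Route at `/gemini-chat` that has the plugin configured (both names are illustrative):

```python
# Chat completion through AI Gateway using the OpenAI SDK.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/gemini-chat",  # hypothetical gateway route
    api_key="unused",  # upstream auth is injected by the plugin
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Summarize AI Gateway in one sentence."}],
)
print(response.choices[0].message.content)
```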
Advanced text generation
Support for Gemini function calling, which allows Gemini models to use external tools and APIs:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Function calling | `llm/v1/chat` | `gemini-2.5-flash` | 3.8 |
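Since function calling rides on the `llm/v1/chat` route type, an OpenAI-compatible `tools` array is passed through to Gemini as function declarations. A sketch under the same assumptions as above; the route path and `get_weather` tool are hypothetical:

```python
# Function calling through AI Gateway with an OpenAI-compatible tools array.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/gemini-chat", api_key="unused")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
# If the model decides to call the tool, the call arrives here.
print(response.choices[0].message.tool_calls)
```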
Processing
Support for Gemini file and batch operations:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Files¹ | `llm/v1/files` | n/a | n/a |
| Batches² | `llm/v1/batches` | n/a | n/a |
¹ Files processing for Gemini is supported only in the native format, via the SDK.
² Batches processing for Gemini is supported only in the native format, via the SDK.
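Because these routes only accept the native format, the client must be the Gemini SDK pointed at the gateway. A minimal sketch, assuming `config.llm_format` is set to `gemini` and a gateway route exposes the native files API; the localhost endpoint is an assumption:

```python
# Native-format file upload through AI Gateway with the google-generativeai SDK.
import google.generativeai as genai

genai.configure(
    api_key="unused",  # the plugin injects the real Gemini key upstream
    transport="rest",
    client_options={"api_endpoint": "http://localhost:8000"},  # hypothetical gateway address
)

uploaded = genai.upload_file("report.pdf")  # calls the native uploadFile API
print(uploaded.name, uploaded.state)
```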
Image
Support for Gemini image generation and editing capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | `image/v1/images/generations` | `gemini-2.5-flash-preview-image-generation` | 3.11 |
| Edits | `image/v1/images/edits` | `gemini-2.5-flash-preview-image-generation` | 3.11 |
For requests with large payloads, consider increasing `config.max_request_body_size` to three times the raw binary size. Supported image sizes and formats vary by model; refer to your provider’s documentation for allowed dimensions and requirements.
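As an illustration, an OpenAI-compatible image generation call through the gateway might look like the following sketch (the `/gemini-images` route is an assumption, and whether the gateway returns `b64_json` or a URL may vary):

```python
# Image generation through AI Gateway using the OpenAI SDK's images API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/gemini-images", api_key="unused")

result = client.images.generate(
    model="gemini-2.5-flash-preview-image-generation",
    prompt="A watercolor painting of a lighthouse at dawn",
)
print(result.data[0])  # image payload, typically base64-encoded
```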
Video
Support for Gemini video generation capabilities:
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | `video/v1/videos/generations` | `veo-3.1-generate-001` | 3.13 |
For requests with large payloads (video generation), consider increasing `config.max_request_body_size` to three times the raw binary size.
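There is no OpenAI SDK helper for this route type, so a plain HTTP POST is the simplest client. A hedged sketch; the `/gemini-videos` route and the exact body shape are assumptions:

```python
# Video generation request through AI Gateway via a plain HTTP POST.
import requests

resp = requests.post(
    "http://localhost:8000/gemini-videos",  # hypothetical gateway route
    json={
        "model": "veo-3.1-generate-001",
        "prompt": "A drone shot over a snowy mountain range",
    },
    timeout=60,
)
# predictLongRunning is asynchronous upstream, so expect an operation handle
# rather than the finished video in the immediate response.
print(resp.status_code, resp.json())
```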
Realtime
Support for Gemini’s bidirectional streaming for realtime applications:
Realtime processing requires the AI Proxy Advanced plugin and uses the WebSocket protocol.
To use the realtime route, you must enable the `ws` and/or `wss` protocols on both the Service and the Route that the plugin is attached to.
| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Realtime³ | `realtime/v1/realtime` | `gemini-2.5-flash-preview-native-audio` | 3.13 |
³ Realtime processing for Gemini is supported only in the native format, via the SDK.
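A minimal connectivity sketch with the `websockets` library; the route path and `ws` scheme are assumptions, and the message shape follows Gemini's native `BidiGenerateContent` session setup:

```python
# Opening a realtime BidiGenerateContent session through AI Gateway.
import asyncio
import json
import websockets

async def main():
    uri = "ws://localhost:8000/gemini-realtime"  # hypothetical gateway route
    async with websockets.connect(uri) as ws:
        # The first frame of a BidiGenerateContent session is the setup message.
        await ws.send(json.dumps({
            "setup": {"model": "models/gemini-2.5-flash-preview-native-audio"}
        }))
        print(await ws.recv())  # expect a setupComplete acknowledgement

asyncio.run(main())
```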
Gemini base URL
The base URL is `https://generativelanguage.googleapis.com/{route_type_path}`, where `{route_type_path}` is determined by the capability.
AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Gemini-compatible endpoint, in which case set the `upstream_url` plugin option.
Supported native LLM formats for Gemini
By default, the AI Proxy plugin uses OpenAI-compatible request formats. Set `config.llm_format` to a native format to use Gemini-specific APIs and features.
The following native Gemini APIs are supported:
| LLM format | Supported APIs |
|---|---|
| `gemini` | `generateContent`, `batchEmbedContents`, `uploadFile`/`files`, `batches`, `predictLongRunning`, `BidiGenerateContent` |
Provider-specific limitations for native formats
- Gemini only supports `auth.allow_override = false`
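As an illustration of the native format, the Gemini SDK can talk to the gateway directly once `config.llm_format` is `gemini`. A sketch; the localhost endpoint is an assumption:

```python
# Native-format generateContent call through AI Gateway.
import google.generativeai as genai

genai.configure(
    api_key="unused",  # the plugin supplies the real key upstream
    transport="rest",
    client_options={"api_endpoint": "http://localhost:8000"},  # hypothetical gateway address
)

model = genai.GenerativeModel("gemini-2.5-flash")
print(model.generate_content("Hello through the native Gemini format").text)
```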
Configure Gemini with AI Proxy
To use Gemini with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.
Here’s a minimal configuration for chat completions:
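A sketch in declarative (decK) format; the query-parameter auth placement and the placeholder key are assumptions to adapt to your deployment:

```yaml
# Minimal AI Proxy configuration for Gemini chat completions.
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        param_name: key            # Gemini accepts the API key as a query parameter
        param_location: query
        param_value: "<GEMINI_API_KEY>"  # replace with your key or a vault reference
      model:
        provider: gemini
        name: gemini-2.5-flash
```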
For more configuration options and examples, see:
Tutorials
- Set up AI Proxy Advanced with Gemini in Kong Gateway
- Set up AI Proxy with Gemini in Kong Gateway
- Route Claude CLI traffic through AI Gateway and Gemini
- Use Gemini's googleSearch tool with AI Proxy Advanced in AI Gateway
- Use Gemini's imageConfig with AI Proxy in AI Gateway
- Use Gemini's thinkingConfig with AI Proxy Advanced in AI Gateway
- Use Google Generative AI SDK for Gemini AI service chats with AI Gateway
FAQs
How can I set model generation parameters when calling Gemini?
You have several options, depending on the SDK and configuration:
- Use the Gemini SDK:
  - Set `llm_format` to `gemini`.
  - Use the Gemini provider.
  - Configure parameters like `temperature`, `top_p`, and `top_k` on the client side:

    ```python
    model = genai.GenerativeModel(
        'gemini-1.5-flash',
        generation_config=genai.types.GenerationConfig(
            temperature=0.7,
            top_p=0.9,
            top_k=40,
            max_output_tokens=1024
        )
    )
    ```
- Use the OpenAI SDK with the Gemini provider:
  - Set `llm_format` to `openai`.
  - You can configure parameters in one of three ways:
    - Configure them in the plugin only.
    - Configure them in the client only (see the sketch after this list).
    - Configure them in both; client-side values override the plugin config.
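For the client-only option, a sketch with the OpenAI SDK; the `/gemini-chat` route is an assumption:

```python
# Client-side generation parameters with the OpenAI SDK (llm_format: openai).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/gemini-chat", api_key="unused")

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a haiku about gateways."}],
    temperature=0.7,   # these values override any defaults set in the plugin
    top_p=0.9,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```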
How do I use Gemini’s `googleSearch` tool for real-time web searches?
Configure AI Proxy Advanced with the Gemini provider and declare the `googleSearch` tool in your requests. See Use Gemini’s googleSearch tool with AI Proxy Advanced.
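A hedged sketch of such a request in the native format; the route path and passthrough behavior are assumptions, so treat the linked tutorial as canonical:

```python
# Declaring the googleSearch tool in a native-format generateContent request.
import requests

resp = requests.post(
    "http://localhost:8000/gemini-chat",  # hypothetical gateway route
    json={
        "contents": [{"role": "user", "parts": [{"text": "Who won the most recent F1 race?"}]}],
        "tools": [{"googleSearch": {}}],  # enables grounding with Google Search
    },
    timeout=30,
)
print(resp.json())
```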
How do I control aspect ratio and resolution for Gemini image generation?
Pass `imageConfig` parameters via `generationConfig` in your image generation requests. See Use Gemini’s imageConfig with AI Proxy.
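A hedged sketch of where `imageConfig` sits in a native-format request body; the route path is an assumption and the supported fields depend on the model:

```python
# Controlling aspect ratio via generationConfig.imageConfig.
import requests

resp = requests.post(
    "http://localhost:8000/gemini-images",  # hypothetical gateway route
    json={
        "contents": [{"parts": [{"text": "A poster of a red bicycle"}]}],
        "generationConfig": {
            "imageConfig": {"aspectRatio": "16:9"},
        },
    },
    timeout=60,
)
print(resp.status_code)
```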
How do I get reasoning traces from Gemini models?
Pass `thinkingConfig` parameters via `extra_body` in your requests to enable detailed reasoning traces. See Use Gemini’s thinkingConfig with AI Proxy Advanced.
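A hedged sketch using the OpenAI SDK's `extra_body` passthrough; the exact nesting the gateway expects is an assumption, so see the linked tutorial for the canonical shape:

```python
# Requesting reasoning traces by forwarding thinkingConfig via extra_body.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/gemini-chat", api_key="unused")

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Prove that 17 is prime."}],
    extra_body={"thinkingConfig": {"thinkingBudget": 1024, "includeThoughts": True}},
)
print(response.choices[0].message.content)
```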