Minimum Version: Kong Gateway 3.8

Tags: #ai

You can proxy requests to Gemini AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.

Upstream paths

AI Gateway automatically routes requests to the appropriate Gemini API endpoints. The following table shows the upstream paths used for each capability.

| Capability        | Upstream path or API                           |
|-------------------|------------------------------------------------|
| Chat completions  | generateContent API                            |
| Embeddings        | batchEmbedContents API                         |
| Function calling  | generateContent API with function declarations |
| Files             | uploadFile and files APIs                      |
| Batches           | batches API                                    |
| Image generations | generateContent API                            |
| Image edits       | generateContent API                            |
| Video generations | predictLongRunning API                         |
| Realtime          | BidiGenerateContent API                        |
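
For example, a client can send an OpenAI-style chat request to a route configured with route_type llm/v1/chat, and AI Gateway translates it to the generateContent API upstream. A minimal sketch, assuming a Kong proxy at localhost:8000 and a route matching /chat (both placeholders):

    import requests

    # An OpenAI-style chat request sent to the Kong proxy; AI Gateway
    # rewrites and forwards it to Gemini's generateContent API.
    resp = requests.post(
        "http://localhost:8000/chat",  # placeholder route path
        json={
            "model": "gemini-2.5-flash",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
    )
    print(resp.json()["choices"][0]["message"]["content"])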

Supported capabilities

The following tables show the AI capabilities supported by the Gemini provider when used with the AI Proxy or AI Proxy Advanced plugin.

Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.

Text generation

Support for Gemini basic text generation capabilities including chat, completions, and embeddings:

| Capability       | Route type        | Streaming | Model example      | Min version |
|------------------|-------------------|-----------|--------------------|-------------|
| Chat completions | llm/v1/chat       | ✅        | gemini-2.5-flash   | 3.8         |
| Embeddings       | llm/v1/embeddings | ❌        | text-embedding-004 | 3.11        |
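
As a sketch of the embeddings path, the OpenAI SDK can be pointed at the gateway, assuming a route that matches the SDK's /embeddings path (the base URL is a placeholder, and Kong injects the upstream credentials):

    from openai import OpenAI

    # The api_key is a dummy value; AI Gateway attaches the real
    # Gemini credentials configured in the plugin.
    client = OpenAI(base_url="http://localhost:8000", api_key="ignored")

    emb = client.embeddings.create(
        model="text-embedding-004",
        input="Kong AI Gateway",
    )
    print(len(emb.data[0].embedding))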

Advanced text generation

Support for Gemini function calling to allow Gemini models to use external tools and APIs:

| Capability       | Route type  | Model example    | Min version |
|------------------|-------------|------------------|-------------|
| Function calling | llm/v1/chat | gemini-2.5-flash | 3.8         |

Processing

Support for Gemini file and batch operations:

| Capability | Route type     | Model example | Min version |
|------------|----------------|---------------|-------------|
| Files ¹    | llm/v1/files   | n/a           | n/a         |
| Batches ²  | llm/v1/batches | n/a           | n/a         |

¹ Files processing for Gemini is supported only in the native format, via the SDK.

² Batches processing for Gemini is supported only in the native format, via the SDK.

Image

Support for Gemini image generation and editing capabilities:

| Capability  | Route type                  | Model example                             | Min version |
|-------------|-----------------------------|-------------------------------------------|-------------|
| Generations | image/v1/images/generations | gemini-2.5-flash-preview-image-generation | 3.11        |
| Edits       | image/v1/images/edits       | gemini-2.5-flash-preview-image-generation | 3.11        |

For requests with large payloads, consider increasing config.max_request_body_size to three times the raw binary size, which accounts for base64 encoding overhead plus headroom for JSON framing.
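
For example, the limit can be raised through the Kong Admin API. A minimal sketch, assuming an Admin API at localhost:8001 and an existing plugin instance (the ID is a placeholder):

    import requests

    # Allow image payloads up to ~8 MB of raw binary: 3x covers the
    # base64 expansion plus JSON framing.
    requests.patch(
        "http://localhost:8001/plugins/<plugin-id>",
        json={"config": {"max_request_body_size": 3 * 8 * 1024 * 1024}},
    )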

Supported image sizes and formats vary by model. Refer to your provider’s documentation for allowed dimensions and requirements.

Video

Support for Gemini video generation capabilities:

| Capability  | Route type                  | Model example        | Min version |
|-------------|-----------------------------|----------------------|-------------|
| Generations | video/v1/videos/generations | veo-3.1-generate-001 | 3.13        |

For requests with large payloads (video generation), consider increasing config.max_request_body_size to three times the raw binary size.

Realtime

Support for Gemini’s bidirectional streaming for realtime applications:

Realtime processing requires the AI Proxy Advanced plugin and uses the WebSocket protocol.

To use the realtime route, you must configure the ws and/or wss protocols on both the Service and the Route that the plugin is attached to (see the sketch after the table below).

| Capability | Route type           | Model example                         | Min version |
|------------|----------------------|---------------------------------------|-------------|
| Realtime ³ | realtime/v1/realtime | gemini-2.5-flash-preview-native-audio | 3.13        |

³ Realtime processing for Gemini is supported only in the native format, via the SDK.
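
A sketch of the protocol configuration mentioned above, using the Kong Admin API (the entity names and address are placeholders; a Service takes a single protocol, while a Route takes a list):

    import requests

    ADMIN = "http://localhost:8001"  # placeholder Admin API address

    # The Service speaks WebSocket to the upstream...
    requests.patch(f"{ADMIN}/services/my-realtime-service",
                   json={"protocol": "wss"})
    # ...and the Route accepts WebSocket connections from clients.
    requests.patch(f"{ADMIN}/routes/my-realtime-route",
                   json={"protocols": ["ws", "wss"]})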

Gemini base URL

The base URL is https://generativelanguage.googleapis.com/{route_type_path}, where {route_type_path} is determined by the capability.

AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Gemini-compatible endpoint, in which case set the upstream_url plugin option.
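
For example, to target a self-hosted, Gemini-compatible endpoint, a sketch of the override via the Kong Admin API (the service name, Admin API address, and upstream URL are placeholders):

    import requests

    requests.post(
        "http://localhost:8001/services/my-gemini-service/plugins",
        json={
            "name": "ai-proxy",
            "config": {
                "route_type": "llm/v1/chat",
                "model": {
                    "provider": "gemini",
                    "name": "gemini-2.5-flash",
                    # Override only for self-hosted or compatible endpoints.
                    "options": {"upstream_url": "https://gemini.internal.example.com"},
                },
            },
        },
    )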

Supported native LLM formats for Gemini

By default, the AI Proxy plugin uses OpenAI-compatible request formats. Set config.llm_format to a native format to use Gemini-specific APIs and features.

The following native Gemini APIs are supported:

LLM format: gemini

Supported APIs:

  • /v1beta/models/{model_name}:generateContent
  • /v1beta/models/{model_name}:streamGenerateContent
  • /v1beta/models/{model_name}:embedContent
  • /v1beta/models/{model_name}:batchEmbedContents
  • /v1beta/batches
  • /upload/v1beta/files
  • /v1beta/files
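
With config.llm_format set to gemini, clients send native Gemini payloads through the gateway. A minimal sketch, assuming a route that matches the native path (the proxy address and API key are placeholders):

    import requests

    resp = requests.post(
        "http://localhost:8000/v1beta/models/gemini-2.5-flash:generateContent",
        headers={"x-goog-api-key": "<GEMINI_API_KEY>"},
        json={"contents": [{"parts": [{"text": "Hello!"}]}]},
    )
    print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])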

Provider-specific limitations for native formats

  • Gemini only supports auth.allow_override = false

Configure Gemini with AI Proxy

To use Gemini with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.

Here’s a minimal configuration for chat completions:
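
The following sketch registers AI Proxy through the Kong Admin API; the service name, Admin API address, and API key are placeholders:

    import requests

    requests.post(
        "http://localhost:8001/services/my-gemini-service/plugins",
        json={
            "name": "ai-proxy",
            "config": {
                "route_type": "llm/v1/chat",
                "auth": {
                    "param_name": "key",
                    "param_value": "<GEMINI_API_KEY>",
                    "param_location": "query",
                },
                "model": {"provider": "gemini", "name": "gemini-2.5-flash"},
            },
        },
    )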

For more configuration options and examples, see the AI Proxy and AI Proxy Advanced plugin documentation.

FAQs

How can I configure model parameters such as temperature for Gemini?

You have several options, depending on the SDK and configuration:

  • Use the Gemini SDK:

    1. Set llm_format to gemini.
    2. Use the Gemini provider.
    3. Configure parameters like temperature, top_p, and top_k on the client side:

       import google.generativeai as genai

       # Point the SDK at the Kong proxy rather than Google's endpoint
       # (the address below is a placeholder for your gateway).
       genai.configure(api_key="<GEMINI_API_KEY>", transport="rest",
                       client_options={"api_endpoint": "http://localhost:8000"})

       model = genai.GenerativeModel(
           'gemini-1.5-flash',
           generation_config=genai.types.GenerationConfig(
               temperature=0.7,
               top_p=0.9,
               top_k=40,
               max_output_tokens=1024
           )
       )
  • Use the OpenAI SDK with the Gemini provider:

    1. Set llm_format to openai.
    2. You can configure parameters in one of three ways:
      • Configure them in the plugin only.
      • Configure them in the client only.
      • Configure them in both; the client-side values override the plugin config (see the sketch after this list).
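
A sketch of the OpenAI SDK path (the base URL and dummy API key are placeholders; Kong injects the real upstream credentials):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000", api_key="ignored")

    resp = client.chat.completions.create(
        model="gemini-2.5-flash",
        temperature=0.7,  # set here, this overrides the plugin config
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)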

How do I use Gemini’s googleSearch tool?

Configure AI Proxy Advanced with the Gemini provider and declare the googleSearch tool in your requests. See Use Gemini’s googleSearch tool with AI Proxy Advanced.
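
A minimal sketch of such a request in the native Gemini format (the proxy address and key are placeholders; the google_search tool name follows the Gemini v1beta REST API):

    import requests

    resp = requests.post(
        "http://localhost:8000/v1beta/models/gemini-2.5-flash:generateContent",
        headers={"x-goog-api-key": "<GEMINI_API_KEY>"},
        json={
            "contents": [{"parts": [{"text": "Summarize today's AI news."}]}],
            "tools": [{"google_search": {}}],  # enable grounded search
        },
    )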

How do I control image generation parameters such as aspect ratio?

Pass imageConfig parameters via generationConfig in your image generation requests. See Use Gemini’s imageConfig with AI Proxy.
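
A minimal sketch in the native Gemini format (the proxy address and key are placeholders, and the aspectRatio value is illustrative):

    import requests

    resp = requests.post(
        "http://localhost:8000/v1beta/models/"
        "gemini-2.5-flash-preview-image-generation:generateContent",
        headers={"x-goog-api-key": "<GEMINI_API_KEY>"},
        json={
            "contents": [{"parts": [{"text": "A watercolor fox"}]}],
            "generationConfig": {
                "responseModalities": ["TEXT", "IMAGE"],
                "imageConfig": {"aspectRatio": "16:9"},
            },
        },
    )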

How do I enable Gemini’s thinking mode?

Pass thinkingConfig parameters via extra_body in your requests to enable detailed reasoning traces. See Use Gemini’s thinkingConfig with AI Proxy Advanced.
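
A minimal sketch using the OpenAI SDK’s extra_body passthrough (the base URL, key, and thinkingBudget value are placeholders):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000", api_key="ignored")

    resp = client.chat.completions.create(
        model="gemini-2.5-flash",
        messages=[{"role": "user", "content": "Plan a 3-day trip to Kyoto."}],
        # Forwarded to Gemini's generationConfig.thinkingConfig upstream.
        extra_body={"generationConfig": {"thinkingConfig": {"thinkingBudget": 1024}}},
    )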
