Load balancing: Semantic (v3.8+)

Configure semantic load balancing with the AI Proxy Advanced plugin. To set up semantic routing, you must configure the following parameters:

  • config.embeddings to define the embedding model used to compare incoming prompts with each target's description.
  • config.vectordb to define the vector database parameters. Only Redis is supported, so you need a Redis instance running in your environment.
  • config.targets[].description to define the description that incoming prompts are matched against to select a target.

When configured this way, the plugin routes each incoming request to the most relevant OpenAI model based on the content of the request:

  • If the request is related to code completions, it will be routed to the gpt-35-turbo model.
  • If the request is about IT support, it will be routed to the gpt-4o model.
  • All other requests that don’t match the above categories will be handled by the gpt-4o-mini model, which serves as a catch-all for general queries.
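As an illustration, the routing above corresponds to a targets list along these lines. This is a sketch, not the exact schema: the description wording is invented here, and field layout should be checked against the plugin's parameter reference.

```yaml
targets:
  - model:
      provider: openai
      name: gpt-35-turbo
    # Prompts semantically close to this description route here
    description: "Code completions and programming assistance"
  - model:
      provider: openai
      name: gpt-4o
    description: "IT support, troubleshooting, and helpdesk questions"
  - model:
      provider: openai
      name: gpt-4o-mini
    # Catch-all target for prompts that match no other description
    description: "General-purpose questions on any other topic"
```

At request time, the plugin embeds the prompt with the configured embedding model, compares it against the stored embeddings of these descriptions in Redis, and forwards the request to the closest match.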

Prerequisites

  • An OpenAI account

  • A Redis instance running

Environment variables

  • OPENAI_API_KEY: The API key used to authenticate with OpenAI.

Set up the plugin
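A minimal declarative-style configuration might look like the following sketch. The embedding model name, Redis host and port, and the exact nesting of fields such as balancer.algorithm, vectordb.strategy, distance_metric, threshold, and dimensions are assumptions for illustration; verify them against the AI Proxy Advanced parameter reference for your Kong Gateway version.

```yaml
plugins:
  - name: ai-proxy-advanced
    config:
      balancer:
        # Assumed field: selects semantic routing between targets
        algorithm: semantic
      embeddings:
        auth:
          header_name: Authorization
          # Substitute your OPENAI_API_KEY value here
          header_value: Bearer <OPENAI_API_KEY>
        model:
          provider: openai
          # Assumed embedding model; any OpenAI embedding model should work
          name: text-embedding-3-small
      vectordb:
        # Redis is the only supported vector database
        strategy: redis
        distance_metric: cosine   # assumed value
        threshold: 0.7            # assumed similarity threshold
        dimensions: 1536          # must match the embedding model's output size
        redis:
          host: localhost
          port: 6379
      targets:
        - model:
            provider: openai
            name: gpt-4o-mini
          description: "General-purpose questions on any other topic"
```

Repeat the targets entry for each model you want to route to, giving each one a description that characterizes the prompts it should handle.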
