Load balancing: Lowest-usage (v3.8+)
Configure the plugin to use two OpenAI models and route requests based on the number of tokens in the prompt.
The lowest-usage algorithm distributes requests to the model with the lowest usage volume. By default, usage is calculated based on the total number of tokens in the prompt and in the response. However, you can customize this using the `config.balancer.tokens_count_strategy` parameter, which accepts the following values:
- `prompt-tokens`: Only count the tokens in the prompt.
- `completion-tokens`: Only count the tokens in the response.
- `total-tokens`: Count the tokens in both the prompt and the response.
- `cost` (v3.10+): Count the cost of the tokens. To use this strategy, you must set the `cost` parameter in each model configuration, and `log_statistics` must be enabled.
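The selection step can be sketched as follows (illustrative Python only, not the plugin's implementation; the `stats` structure and the single per-token `cost` field are simplifying assumptions, since the plugin tracks usage internally):

```python
# Illustrative sketch of lowest-usage target selection; not plugin code.
def usage(stats, strategy, cost_per_token=None):
    """Usage volume of one target under a given tokens_count_strategy."""
    if strategy == "prompt-tokens":
        return stats["prompt_tokens"]
    if strategy == "completion-tokens":
        return stats["completion_tokens"]
    if strategy == "total-tokens":
        return stats["prompt_tokens"] + stats["completion_tokens"]
    if strategy == "cost":
        # Simplified: one flat per-token cost per model (an assumption;
        # the plugin's real cost accounting may be more granular).
        return (stats["prompt_tokens"] + stats["completion_tokens"]) * cost_per_token
    raise ValueError(f"unknown strategy: {strategy}")

def pick_target(targets, strategy):
    """Route the request to the target with the lowest usage volume."""
    return min(targets, key=lambda t: usage(t["stats"], strategy, t.get("cost")))

targets = [
    {"name": "gpt-4", "cost": 30.0,
     "stats": {"prompt_tokens": 900, "completion_tokens": 400}},
    {"name": "gpt-4o-mini", "cost": 0.15,
     "stats": {"prompt_tokens": 1200, "completion_tokens": 700}},
]

print(pick_target(targets, "prompt-tokens")["name"])  # gpt-4 (900 < 1200)
print(pick_target(targets, "cost")["name"])           # gpt-4o-mini (285.0 < 39000.0)
```

Note how the chosen target can differ by strategy: `gpt-4` has seen fewer raw tokens, but `gpt-4o-mini` has accrued far less cost.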
Prerequisites
- An OpenAI account
Environment variables
- `OPENAI_API_KEY`: The API key to use to connect to OpenAI.
Add this section to your declarative configuration file:
_format_version: "3.0"
plugins:
- name: ai-proxy-advanced
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
Make the following request:
curl -i -X POST http://localhost:8001/plugins/ \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
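Conceptually, the balancer keeps a running usage counter per target and sends each request to the target whose counter is currently lowest, so traffic self-balances by volume over time. A minimal simulation of that feedback loop with the `prompt-tokens` strategy (illustrative only, not plugin code):

```python
# Simulate lowest-usage balancing: each request goes to the target with
# the smallest running prompt-token total, which then grows accordingly.
counters = {"gpt-4": 0, "gpt-4o-mini": 0}

def route(prompt_tokens):
    target = min(counters, key=counters.get)  # lowest usage wins
    counters[target] += prompt_tokens
    return target

picks = [route(100) for _ in range(10)]
print(picks)     # alternates between the two targets
print(counters)  # both end at 500 tokens
```

With uniform request sizes the two targets end up evenly loaded; uneven prompt or response sizes would skew subsequent routing back toward the less-used target.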
Make the following request:
curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/plugins/ \
--header "accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $KONNECT_TOKEN" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `region`: Geographic region where your Kong Konnect is hosted and operates.
- `controlPlaneId`: The `id` of the control plane.
- `KONNECT_TOKEN`: Your Personal Access Token (PAT) associated with your Konnect account.
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
echo "
apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
name: ai-proxy-advanced
namespace: kong
annotations:
kubernetes.io/ingress.class: kong
labels:
global: 'true'
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
plugin: ai-proxy-advanced
" | kubectl apply -f -
Prerequisite: Configure your Personal Access Token
terraform {
required_providers {
konnect = {
source = "kong/konnect"
}
}
}
provider "konnect" {
personal_access_token = "$KONNECT_TOKEN"
server_url = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_proxy_advanced" "my_ai_proxy_advanced" {
enabled = true
config = {
balancer = {
algorithm = "lowest-usage"
tokens_count_strategy = "prompt-tokens"
}
targets = [
{
model = {
name = "gpt-4"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
},
{
model = {
name = "gpt-4o-mini"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
} ]
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
}
This example requires the following variables to be added to your manifest. You can specify values at runtime by setting `TF_VAR_name=value`.
variable "openai_api_key" {
type = string
}
Add this section to your declarative configuration file:
_format_version: "3.0"
plugins:
- name: ai-proxy-advanced
service: serviceName|Id
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
Make sure to replace the following placeholders with your own values:
- `serviceName|Id`: The `id` or `name` of the service the plugin configuration will target.
Make the following request:
curl -i -X POST http://localhost:8001/services/{serviceName|Id}/plugins/ \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `serviceName|Id`: The `id` or `name` of the service the plugin configuration will target.
Make the following request:
curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins/ \
--header "accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $KONNECT_TOKEN" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `region`: Geographic region where your Kong Konnect is hosted and operates.
- `controlPlaneId`: The `id` of the control plane.
- `KONNECT_TOKEN`: Your Personal Access Token (PAT) associated with your Konnect account.
- `serviceId`: The `id` of the service the plugin configuration will target.
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-proxy-advanced
namespace: kong
annotations:
kubernetes.io/ingress.class: kong
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
plugin: ai-proxy-advanced
" | kubectl apply -f -
Next, apply the `KongPlugin` resource by annotating the `service` resource:
kubectl annotate -n kong service SERVICE_NAME konghq.com/plugins=ai-proxy-advanced
Prerequisite: Configure your Personal Access Token
terraform {
required_providers {
konnect = {
source = "kong/konnect"
}
}
}
provider "konnect" {
personal_access_token = "$KONNECT_TOKEN"
server_url = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_proxy_advanced" "my_ai_proxy_advanced" {
enabled = true
config = {
balancer = {
algorithm = "lowest-usage"
tokens_count_strategy = "prompt-tokens"
}
targets = [
{
model = {
name = "gpt-4"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
},
{
model = {
name = "gpt-4o-mini"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
} ]
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
service = {
id = konnect_gateway_service.my_service.id
}
}
This example requires the following variables to be added to your manifest. You can specify values at runtime by setting `TF_VAR_name=value`.
variable "openai_api_key" {
type = string
}
Add this section to your declarative configuration file:
_format_version: "3.0"
plugins:
- name: ai-proxy-advanced
route: routeName|Id
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
Make sure to replace the following placeholders with your own values:
- `routeName|Id`: The `id` or `name` of the route the plugin configuration will target.
Make the following request:
curl -i -X POST http://localhost:8001/routes/{routeName|Id}/plugins/ \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `routeName|Id`: The `id` or `name` of the route the plugin configuration will target.
Make the following request:
curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins/ \
--header "accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $KONNECT_TOKEN" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `region`: Geographic region where your Kong Konnect is hosted and operates.
- `controlPlaneId`: The `id` of the control plane.
- `KONNECT_TOKEN`: Your Personal Access Token (PAT) associated with your Konnect account.
- `routeId`: The `id` of the route the plugin configuration will target.
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-proxy-advanced
namespace: kong
annotations:
kubernetes.io/ingress.class: kong
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
plugin: ai-proxy-advanced
" | kubectl apply -f -
Next, apply the `KongPlugin` resource by annotating the `httproute` or `ingress` resource:
kubectl annotate -n kong httproute HTTPROUTE_NAME konghq.com/plugins=ai-proxy-advanced
kubectl annotate -n kong ingress INGRESS_NAME konghq.com/plugins=ai-proxy-advanced
Prerequisite: Configure your Personal Access Token
terraform {
required_providers {
konnect = {
source = "kong/konnect"
}
}
}
provider "konnect" {
personal_access_token = "$KONNECT_TOKEN"
server_url = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_proxy_advanced" "my_ai_proxy_advanced" {
enabled = true
config = {
balancer = {
algorithm = "lowest-usage"
tokens_count_strategy = "prompt-tokens"
}
targets = [
{
model = {
name = "gpt-4"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
},
{
model = {
name = "gpt-4o-mini"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
} ]
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
route = {
id = konnect_gateway_route.my_route.id
}
}
This example requires the following variables to be added to your manifest. You can specify values at runtime by setting `TF_VAR_name=value`.
variable "openai_api_key" {
type = string
}
Add this section to your declarative configuration file:
_format_version: "3.0"
plugins:
- name: ai-proxy-advanced
consumer: consumerName|Id
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
Make sure to replace the following placeholders with your own values:
- `consumerName|Id`: The `id` or `name` of the consumer the plugin configuration will target.
Make the following request:
curl -i -X POST http://localhost:8001/consumers/{consumerName|Id}/plugins/ \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `consumerName|Id`: The `id` or `name` of the consumer the plugin configuration will target.
Make the following request:
curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumers/{consumerId}/plugins/ \
--header "accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $KONNECT_TOKEN" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `region`: Geographic region where your Kong Konnect is hosted and operates.
- `controlPlaneId`: The `id` of the control plane.
- `KONNECT_TOKEN`: Your Personal Access Token (PAT) associated with your Konnect account.
- `consumerId`: The `id` of the consumer the plugin configuration will target.
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-proxy-advanced
namespace: kong
annotations:
kubernetes.io/ingress.class: kong
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
plugin: ai-proxy-advanced
" | kubectl apply -f -
Next, apply the `KongPlugin` resource by annotating the `KongConsumer` resource:
kubectl annotate -n kong kongconsumer CONSUMER_NAME konghq.com/plugins=ai-proxy-advanced
Prerequisite: Configure your Personal Access Token
terraform {
required_providers {
konnect = {
source = "kong/konnect"
}
}
}
provider "konnect" {
personal_access_token = "$KONNECT_TOKEN"
server_url = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_proxy_advanced" "my_ai_proxy_advanced" {
enabled = true
config = {
balancer = {
algorithm = "lowest-usage"
tokens_count_strategy = "prompt-tokens"
}
targets = [
{
model = {
name = "gpt-4"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
},
{
model = {
name = "gpt-4o-mini"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
} ]
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
consumer = {
id = konnect_gateway_consumer.my_consumer.id
}
}
This example requires the following variables to be added to your manifest. You can specify values at runtime by setting `TF_VAR_name=value`.
variable "openai_api_key" {
type = string
}
Add this section to your declarative configuration file:
_format_version: "3.0"
plugins:
- name: ai-proxy-advanced
consumer_group: consumerGroupName|Id
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
Make sure to replace the following placeholders with your own values:
- `consumerGroupName|Id`: The `id` or `name` of the consumer group the plugin configuration will target.
Make the following request:
curl -i -X POST http://localhost:8001/consumer_groups/{consumerGroupName|Id}/plugins/ \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `consumerGroupName|Id`: The `id` or `name` of the consumer group the plugin configuration will target.
Make the following request:
curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumer_groups/{consumerGroupId}/plugins/ \
--header "accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $KONNECT_TOKEN" \
--data '
{
"name": "ai-proxy-advanced",
"config": {
"balancer": {
"algorithm": "lowest-usage",
"tokens_count_strategy": "prompt-tokens"
},
"targets": [
{
"model": {
"name": "gpt-4",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
},
{
"model": {
"name": "gpt-4o-mini",
"provider": "openai",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"route_type": "llm/v1/chat",
"auth": {
"header_name": "Authorization",
"header_value": "Bearer '$OPENAI_API_KEY'"
}
}
]
}
}
'
Make sure to replace the following placeholders with your own values:
- `region`: Geographic region where your Kong Konnect is hosted and operates.
- `controlPlaneId`: The `id` of the control plane.
- `KONNECT_TOKEN`: Your Personal Access Token (PAT) associated with your Konnect account.
- `consumerGroupId`: The `id` of the consumer group the plugin configuration will target.
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-proxy-advanced
namespace: kong
annotations:
kubernetes.io/ingress.class: kong
config:
balancer:
algorithm: lowest-usage
tokens_count_strategy: prompt-tokens
targets:
- model:
name: gpt-4
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
- model:
name: gpt-4o-mini
provider: openai
options:
max_tokens: 512
temperature: 1.0
route_type: llm/v1/chat
auth:
header_name: Authorization
header_value: Bearer $OPENAI_API_KEY
plugin: ai-proxy-advanced
" | kubectl apply -f -
Next, apply the `KongPlugin` resource by annotating the `KongConsumerGroup` resource:
kubectl annotate -n kong kongconsumergroup CONSUMERGROUP_NAME konghq.com/plugins=ai-proxy-advanced
Prerequisite: Configure your Personal Access Token
terraform {
required_providers {
konnect = {
source = "kong/konnect"
}
}
}
provider "konnect" {
personal_access_token = "$KONNECT_TOKEN"
server_url = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_proxy_advanced" "my_ai_proxy_advanced" {
enabled = true
config = {
balancer = {
algorithm = "lowest-usage"
tokens_count_strategy = "prompt-tokens"
}
targets = [
{
model = {
name = "gpt-4"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
},
{
model = {
name = "gpt-4o-mini"
provider = "openai"
options = {
max_tokens = 512
temperature = 1.0
}
}
route_type = "llm/v1/chat"
auth = {
header_name = "Authorization"
        header_value = "Bearer ${var.openai_api_key}"
}
} ]
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
consumer_group = {
id = konnect_gateway_consumer_group.my_consumer_group.id
}
}
This example requires the following variables to be added to your manifest. You can specify values at runtime by setting `TF_VAR_name=value`.
variable "openai_api_key" {
type = string
}