AI Rate Limiting Advanced: Request prompt function - Plugin

Request prompt function (v3.7+)

Protect your LLM services with rate limiting. The AI Rate Limiting Advanced plugin analyzes query costs and token responses to provide an enterprise-grade rate limiting strategy.

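The following example uses request prompt rate limiting, which lets you rate limit requests based on a custom token. In this example, the function reads the count from the x-prompt-count request header and returns 0 when the header is missing or not numeric. See the how-to guide for a step-by-step walkthrough.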

Prerequisites

AI Proxy plugin or AI Proxy Advanced plugin configured with an LLM service
A Redis instance.
Port 6379, or your custom Redis port (16379 in this example), is open and reachable from Kong Gateway.
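Before configuring the plugin, it can help to confirm that the Redis port accepts connections. The following is a minimal sketch, not part of the plugin: the `check_port` function and the `REDIS_HOST` and `REDIS_PORT` variables are hypothetical names introduced here, and the check relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command being available.

```shell
# Report whether a TCP port accepts connections from this machine.
# Assumes bash (for /dev/tcp) and the coreutils `timeout` command.
check_port() {
  host=$1
  port=$2
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# REDIS_HOST and REDIS_PORT are hypothetical variables; the defaults are placeholders.
check_port "${REDIS_HOST:-localhost}" "${REDIS_PORT:-6379}"
```

This only tests reachability from the machine where the script runs. To validate the prerequisite itself, run it from the host or container where Kong Gateway runs.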

Set up the plugin

To apply the plugin globally, add this section to your kong.yaml configuration file:

kong.yaml

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    config:
      strategy: redis
      redis:
        host: host.docker.internal
        port: 16379
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0
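The configuration above lists two limits and two window sizes. As a rough illustration, assuming the two arrays are matched by position (so each limit applies to the window size at the same index), the pairs can be printed like this. The `describe_limits` function is a hypothetical helper written for this example only.

```shell
# Pair each limit with the window size at the same position.
# Assumption: limit[i] applies to window_size[i].
describe_limits() {
  limits=$1    # space-separated limits, for example "100 1000"
  windows=$2   # space-separated window sizes in seconds, for example "60 3600"
  set -- $windows
  for limit in $limits; do
    if [ $# -eq 0 ]; then
      echo "mismatch: more limits than window sizes" >&2
      return 1
    fi
    echo "$limit per $1 seconds"
    shift
  done
}

describe_limits "100 1000" "60 3600"
```

Under that assumption, the example allows a count of 100 per minute and 1000 per hour, where the count added by each request is whatever the request prompt count function returns.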

Make the following request to the Kong Admin API:

curl -i -X POST http://localhost:8001/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '
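In the request body, the Lua function is passed as a single JSON string, with double quotes escaped and line breaks written as `\n`. If you keep the function in a separate file, a small helper can produce that string. The sketch below uses only `sed` and `tr`; `lua_to_json_string` is a hypothetical name, and the helper handles only backslashes, double quotes, and newlines, not tabs or other control characters.

```shell
# Convert Lua source on stdin into a JSON string literal on stdout.
# Escapes backslashes and double quotes, then replaces each line break with \n.
lua_to_json_string() {
  printf '"'
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e 's/$/\\n/' | tr -d '\n'
  printf '"'
}

lua_to_json_string <<'EOF'
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
```

The printed value can be pasted as the value of `request_prompt_count_function` in the JSON payload.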

Make the following request to the Konnect API:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.
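To show how the `region` and `controlPlaneId` placeholders fit into the request URL, here is a small sketch that assembles the endpoint used in the request above. The `konnect_plugins_url` function is a hypothetical helper, and the control plane ID shown is a dummy value.

```shell
# Build the Konnect endpoint for creating a plugin on a control plane.
# Arguments: region (for example "us"), control plane id.
konnect_plugins_url() {
  region=$1
  control_plane_id=$2
  echo "https://${region}.api.konghq.com/v2/control-planes/${control_plane_id}/core-entities/plugins/"
}

# Dummy values for illustration; replace them with your own.
konnect_plugins_url "us" "00000000-0000-0000-0000-000000000000"
```

The result can be passed to `curl -X POST` together with the `Authorization: Bearer $KONNECT_TOKEN` header.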

echo "
apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
  labels:
    global: 'true'
config:
  strategy: redis
  redis:
    host: host.docker.internal
    port: 16379
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      host = "host.docker.internal"
      port = 16379
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }
    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
}

To apply the plugin to a specific service, add this section to your kong.yaml configuration file:

kong.yaml

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    service: serviceName|Id
    config:
      strategy: redis
      redis:
        host: host.docker.internal
        port: 16379
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

serviceName|Id: The id or name of the service the plugin configuration will target.

Make the following request to the Kong Admin API:

curl -i -X POST http://localhost:8001/services/{serviceName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

serviceName|Id: The id or name of the service the plugin configuration will target.

Make the following request to the Konnect API:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
serviceId: The id of the service the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    host: host.docker.internal
    port: 16379
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Next, apply the KongPlugin resource by annotating the service resource:

kubectl annotate -n kong service SERVICE_NAME konghq.com/plugins=ai-rate-limiting-advanced

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      host = "host.docker.internal"
      port = 16379
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }
    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  service = {
    id = konnect_gateway_service.my_service.id
  }
}

To apply the plugin to a specific route, add this section to your kong.yaml configuration file:

kong.yaml

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    route: routeName|Id
    config:
      strategy: redis
      redis:
        host: host.docker.internal
        port: 16379
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

routeName|Id: The id or name of the route the plugin configuration will target.

Make the following request to the Kong Admin API:

curl -i -X POST http://localhost:8001/routes/{routeName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

routeName|Id: The id or name of the route the plugin configuration will target.

Make the following request to the Konnect API:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
routeId: The id of the route the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    host: host.docker.internal
    port: 16379
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Next, apply the KongPlugin resource by annotating the httproute or ingress resource:

kubectl annotate -n kong httproute HTTPROUTE_NAME konghq.com/plugins=ai-rate-limiting-advanced

kubectl annotate -n kong ingress INGRESS_NAME konghq.com/plugins=ai-rate-limiting-advanced

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      host = "host.docker.internal"
      port = 16379
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }
    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  route = {
    id = konnect_gateway_route.my_route.id
  }
}

To apply the plugin to a specific consumer, add this section to your kong.yaml configuration file:

kong.yaml

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    consumer: consumerName|Id
    config:
      strategy: redis
      redis:
        host: host.docker.internal
        port: 16379
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

consumerName|Id: The id or name of the consumer the plugin configuration will target.

Make the following request to the Kong Admin API:

curl -i -X POST http://localhost:8001/consumers/{consumerName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

consumerName|Id: The id or name of the consumer the plugin configuration will target.

Make the following request to the Konnect API:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumers/{consumerId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
consumerId: The id of the consumer the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    host: host.docker.internal
    port: 16379
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Next, apply the KongPlugin resource by annotating the KongConsumer resource:

kubectl annotate -n kong kongconsumer CONSUMER_NAME konghq.com/plugins=ai-rate-limiting-advanced

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      host = "host.docker.internal"
      port = 16379
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }
    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer = {
    id = konnect_gateway_consumer.my_consumer.id
  }
}

To apply the plugin to a specific consumer group, add this section to your kong.yaml configuration file:

kong.yaml

_format_version: "3.0"
plugins:
  - name: ai-rate-limiting-advanced
    consumer_group: consumerGroupName|Id
    config:
      strategy: redis
      redis:
        host: host.docker.internal
        port: 16379
      sync_rate: 0
      llm_providers:
      - name: cohere
        limit:
        - 100
        - 1000
        window_size:
        - 60
        - 3600
      request_prompt_count_function: |
        local header_count = tonumber(kong.request.get_header("x-prompt-count"))
        if header_count then
          return header_count
        end
        return 0

Make sure to replace the following placeholders with your own values:

consumerGroupName|Id: The id or name of the consumer group the plugin configuration will target.

Make the following request to the Kong Admin API:

curl -i -X POST http://localhost:8001/consumer_groups/{consumerGroupName|Id}/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

consumerGroupName|Id: The id or name of the consumer group the plugin configuration will target.

Make the following request to the Konnect API:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/consumer_groups/{consumerGroupId}/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-rate-limiting-advanced",
      "config": {
        "strategy": "redis",
        "redis": {
          "host": "host.docker.internal",
          "port": 16379
        },
        "sync_rate": 0,
        "llm_providers": [
          {
            "name": "cohere",
            "limit": [
              100,
              1000
            ],
            "window_size": [
              60,
              3600
            ]
          }
        ],
        "request_prompt_count_function": "local header_count = tonumber(kong.request.get_header(\"x-prompt-count\"))\nif header_count then\n  return header_count\nend\nreturn 0\n"
      },
      "tags": []
    }
    '

Make sure to replace the following placeholders with your own values:

region: Geographic region where your Kong Konnect is hosted and operates.
KONNECT_TOKEN: Your Personal Access Token (PAT) associated with your Konnect account.
controlPlaneId: The id of the control plane.
consumerGroupId: The id of the consumer group the plugin configuration will target.

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-rate-limiting-advanced
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
    konghq.com/tags: ''
config:
  strategy: redis
  redis:
    host: host.docker.internal
    port: 16379
  sync_rate: 0
  llm_providers:
  - name: cohere
    limit:
    - 100
    - 1000
    window_size:
    - 60
    - 3600
  request_prompt_count_function: |
    local header_count = tonumber(kong.request.get_header('x-prompt-count'))
    if header_count then
      return header_count
    end
    return 0
plugin: ai-rate-limiting-advanced
" | kubectl apply -f -

Next, apply the KongPlugin resource by annotating the KongConsumerGroup resource:

kubectl annotate -n kong kongconsumergroup CONSUMERGROUP_NAME konghq.com/plugins=ai-rate-limiting-advanced

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "$KONNECT_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_rate_limiting_advanced" "my_ai_rate_limiting_advanced" {
  enabled = true

  config = {
    strategy = "redis"

    redis = {
      host = "host.docker.internal"
      port = 16379
    }
    sync_rate = 0
    llm_providers = [
      {
        name = "cohere"
        limit = [100, 1000]
        window_size = [60, 3600]
      }
    ]
    request_prompt_count_function = <<EOF
local header_count = tonumber(kong.request.get_header("x-prompt-count"))
if header_count then
  return header_count
end
return 0
EOF
  }
  tags = []

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  consumer_group = {
    id = konnect_gateway_consumer_group.my_consumer_group.id
  }
}
