Cerebras provider

Uses: Kong Gateway AI Gateway Admin API deck KIC Konnect API Terraform

Upstream paths

AI Gateway automatically routes requests to the appropriate Cerebras API endpoints. The following table shows the upstream paths used for each capability.

Capability	Upstream path or API
Chat completions	`/v1/chat/completions`

Supported capabilities

The following tables show the AI capabilities supported by Cerebras provider when used with the AI Proxy or the AI Proxy Advanced plugin.

Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.

Text generation

Support for Cerebras basic text generation capabilities including chat, completions, and embeddings:

Capability	Route type	Streaming	Model example	Min version
Chat completions	`llm/v1/chat`	Supported	llama-3.3-70b	3.13

Cerebras base URL

The base URL is https://api.cerebras.ai/{route_type_path}, where {route_type_path} is determined by the capability.

AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Cerebras-compatible endpoint, in which case set the upstream_url plugin option.

Configure Cerebras with AI Proxy

To use Cerebras with AI Gateway, configure the AI Proxy or AI Proxy Advanced.

Here’s a minimal configuration for chat completions:

kong.yaml

Copied!

_format_version: "3.0"
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: Authorization
        header_value: Bearer ${{ env "DECK_CEREBRAS_API_KEY" }}
      model:
        provider: cerebras
        name: gpt-oss-120b
        options:
          max_tokens: 512
          temperature: 1.0

curl -i -X POST http://localhost:8001/plugins/ \
    --header "Accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
      "name": "ai-proxy",
      "config": {
        "route_type": "llm/v1/chat",
        "auth": {
          "header_name": "Authorization",
          "header_value": "Bearer '$CEREBRAS_API_KEY'"
        },
        "model": {
          "provider": "cerebras",
          "name": "gpt-oss-120b",
          "options": {
            "max_tokens": 512,
            "temperature": 1.0
          }
        }
      }
    }
    '

Copied!

Make the following request:

curl -X POST https://{region}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/plugins/ \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $KONNECT_TOKEN" \
    --data '
    {
      "name": "ai-proxy",
      "config": {
        "route_type": "llm/v1/chat",
        "auth": {
          "header_name": "Authorization",
          "header_value": "Bearer '$CEREBRAS_API_KEY'"
        },
        "model": {
          "provider": "cerebras",
          "name": "gpt-oss-120b",
          "options": {
            "max_tokens": 512,
            "temperature": 1.0
          }
        }
      }
    }
    '

Copied!

echo "
apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name:
  namespace: kong
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: 'true'
config:
  route_type: llm/v1/chat
  auth:
    header_name: Authorization
    header_value: Bearer $CEREBRAS_API_KEY
  model:
    provider: cerebras
    name: gpt-oss-120b
    options:
      max_tokens: 512
      temperature: 1.0
plugin: ai-proxy
" | kubectl apply -f -

Copied!

resource "konnect_gateway_plugin_ai_proxy" "my_ai_proxy" {
  enabled = true

  config = {
    route_type = "llm/v1/chat"

    auth = {
      header_name = "Authorization"
      header_value = "Bearer var.cerebras_api_key"
    }

    model = {
      provider = "cerebras"
      name = "gpt-oss-120b"

      options = {
        max_tokens = 512
        temperature = 1.0
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
}

Copied!

This example requires the following variables to be added to your manifest. You can specify values at runtime by setting TF_VAR_name=value.

variable "cerebras_api_key" {
  type = string
}

Copied!

For more configuration options and examples, see:

AI Proxy examples

AI Proxy Advanced examples

Cerebras provider

Upstream paths

Supported capabilities

Text generation

Cerebras base URL

Configure Cerebras with AI Proxy

Tutorials

Help us make these docs great!

Still need help?