Proxy AI CLI tools through AI Gateway
AI Gateway can proxy requests from AI command-line tools to LLM providers. This gives you centralized control over AI traffic: log all requests, track costs across teams, enforce rate limits, or apply security policies and guardrails.
Supported AI CLI tools:
- Claude Code: Anthropic, OpenAI, Azure OpenAI, Google Gemini, Google Vertex, AWS Bedrock, and Alibaba Cloud (Dashscope)
- Codex CLI: OpenAI
- Qwen Code CLI: OpenAI
- Gemini CLI: Google Gemini
Current limitations:
- Load balancing and failover currently work only when all providers share the same model identifier.
- Streaming is not supported for non-Claude models on the following providers: Azure OpenAI, Google Gemini, and AWS Bedrock. Token usage may be reported as 0, but functionality is otherwise unaffected.
Claude Code
Claude Code is Anthropic’s command-line tool that delegates coding tasks to Claude AI. Route Claude Code requests through AI Gateway to monitor usage, control costs, and enforce rate limits across your development team.
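Claude Code honors the `ANTHROPIC_BASE_URL` environment variable, so routing it through the gateway typically comes down to pointing that variable at your gateway route. The host, port, and route path below are placeholders for your own deployment:

```shell
# Point Claude Code at the gateway instead of the Anthropic API directly.
# Replace the host, port, and route path with your own gateway deployment.
export ANTHROPIC_BASE_URL="http://my-gateway.example.com:8000/anthropic"

# If the gateway injects the provider credential, Claude Code still expects
# a key to be set locally; a placeholder value works in that setup.
export ANTHROPIC_API_KEY="placeholder"

claude
```

From there, every Claude Code request flows through the gateway route, where your logging, rate-limiting, and guardrail policies apply.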
Codex CLI
Codex CLI is OpenAI’s command-line tool for code generation and assistance. Proxy Codex CLI requests through AI Gateway to gain visibility into API usage, implement rate limiting, and centralize credential management.
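Codex CLI reads its provider configuration from `~/.codex/config.toml`, and a custom `model_providers` entry can point it at the gateway. The provider name, host, and route path below are placeholders, and the exact schema may vary between Codex CLI versions:

```shell
# Write a Codex CLI config that routes requests through the gateway.
# Provider name, host, port, and route path below are placeholders.
mkdir -p ~/.codex
cat > ~/.codex/config.toml <<'EOF'
model_provider = "ai-gateway"

[model_providers.ai-gateway]
name = "AI Gateway"
base_url = "http://my-gateway.example.com:8000/openai/v1"
env_key = "OPENAI_API_KEY"
EOF

# If the gateway injects the real provider key, a placeholder is enough locally.
export OPENAI_API_KEY="placeholder"
codex
```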
Qwen Code CLI
Qwen Code CLI is an AI-powered coding assistant that uses OpenAI-compatible endpoints. Proxy Qwen Code CLI requests through AI Gateway to gain visibility into API usage, implement rate limiting, and centralize credential management.
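Because Qwen Code CLI speaks to OpenAI-compatible endpoints, it is commonly configured through the standard OpenAI environment variables. The host, route path, and model name below are placeholders for your own deployment:

```shell
# Point Qwen Code CLI's OpenAI-compatible client at the gateway route.
# Host, port, route path, and model name are placeholders.
export OPENAI_BASE_URL="http://my-gateway.example.com:8000/openai/v1"
export OPENAI_API_KEY="placeholder"   # the gateway can inject the real credential
export OPENAI_MODEL="qwen3-coder-plus"

qwen
```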
Gemini CLI
Gemini CLI is Google’s command-line tool for interacting with Gemini models. Route Gemini CLI requests through AI Gateway to monitor usage, control costs, and enforce rate limits across your development team.
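Gemini CLI authenticates with `GEMINI_API_KEY`, and recent versions allow overriding the API endpoint through an environment variable. Endpoint-override support depends on your Gemini CLI version, and the host, port, and route path below are placeholders:

```shell
# Route Gemini CLI traffic through the gateway instead of the Google API.
# GOOGLE_GEMINI_BASE_URL support depends on the Gemini CLI version;
# host, port, and route path are placeholders for your deployment.
export GOOGLE_GEMINI_BASE_URL="http://my-gateway.example.com:8000/gemini"
export GEMINI_API_KEY="placeholder"   # the gateway can inject the real credential

gemini
```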