Minimum version: Kong Gateway 3.9
Tags: #ai

You can proxy requests to Hugging Face AI models through AI Gateway using the AI Proxy and AI Proxy Advanced plugins. This reference documents all supported AI capabilities, configuration requirements, and provider-specific details needed for proper integration.

Upstream paths

AI Gateway automatically routes requests to the appropriate Hugging Face API endpoints. The following table shows the upstream paths used for each capability.

| Capability | Upstream path or API |
|---|---|
| Chat completions | /v1/chat/completions |
| Embeddings | /hf-inference/models/{model_name}/pipeline/feature-extraction |
| Video generations | /v1/videos |
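As an illustration of how the `{model_name}` placeholder in the embeddings path is filled in, here is a minimal Python sketch. The `UPSTREAM_PATHS` dict and the `upstream_path` helper are hypothetical names introduced for this example, and the embedding model name is only a sample value. AI Gateway performs this routing itself, so you never need to build these paths by hand.

```python
# Upstream path templates, transcribed from the table above.
# The dict keys ("chat", "embeddings", "video") are labels chosen for this sketch.
UPSTREAM_PATHS = {
    "chat": "/v1/chat/completions",
    "embeddings": "/hf-inference/models/{model_name}/pipeline/feature-extraction",
    "video": "/v1/videos",
}


def upstream_path(capability: str, model_name: str = "") -> str:
    """Return the upstream path for a capability, filling in the model name if needed."""
    template = UPSTREAM_PATHS[capability]
    # Templates without a placeholder are returned unchanged by str.format.
    return template.format(model_name=model_name)


# Sample model name, used for illustration only.
print(upstream_path("embeddings", "sentence-transformers/all-MiniLM-L6-v2"))
print(upstream_path("chat"))
```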

Supported capabilities

The following tables show the AI capabilities supported by the Hugging Face provider when used with the AI Proxy or AI Proxy Advanced plugin.

Set the plugin’s route_type based on the capability you want to use. See the tables below for supported route types.

Text generation

Support for Hugging Face basic text generation capabilities, including chat completions and embeddings:

| Capability | Route type | Streaming | Model example | Min version |
|---|---|---|---|---|
| Chat completions | llm/v1/chat | Supported | Use the model name for the specific LLM provider | 3.9 |
| Embeddings | llm/v1/embeddings | Not supported | Use the embedding model name | 3.11 |
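The text generation table can be read as a small lookup: given a capability, the Kong Gateway version, and whether streaming is wanted, it yields a `route_type` or tells you the combination is unsupported. The sketch below encodes that reading in Python. The `CAPABILITIES` dict and `select_route_type` function are hypothetical helpers for illustration, not part of any Kong API.

```python
# Rows transcribed from the text generation table above.
# Versions are stored as (major, minor) tuples so they compare correctly.
CAPABILITIES = {
    "chat": {"route_type": "llm/v1/chat", "streaming": True, "min_version": (3, 9)},
    "embeddings": {"route_type": "llm/v1/embeddings", "streaming": False, "min_version": (3, 11)},
}


def select_route_type(capability: str, gateway_version: tuple, stream: bool = False) -> str:
    """Return the route_type for a capability, or raise if the table rules it out."""
    entry = CAPABILITIES[capability]
    if gateway_version < entry["min_version"]:
        major, minor = entry["min_version"]
        raise ValueError(f"{capability} requires Kong Gateway {major}.{minor} or later")
    if stream and not entry["streaming"]:
        raise ValueError(f"streaming is not supported for {capability}")
    return entry["route_type"]


print(select_route_type("chat", (3, 9), stream=True))
print(select_route_type("embeddings", (3, 11)))
```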

Video

Support for Hugging Face video generation capabilities:

| Capability | Route type | Model example | Min version |
|---|---|---|---|
| Generations | video/v1/videos/generations | Use the video generation model name | 3.13 |

For requests with large payloads (such as video generation), consider increasing config.max_request_body_size to three times the raw binary size of the payload.
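The sizing rule above is simple arithmetic. A minimal Python sketch follows, assuming `config.max_request_body_size` is expressed in bytes; the helper name `recommended_max_request_body_size` is hypothetical.

```python
def recommended_max_request_body_size(raw_binary_bytes: int) -> int:
    """Apply the three-times rule of thumb to a raw payload size given in bytes."""
    return 3 * raw_binary_bytes


# Example: a 10 MiB raw payload.
raw_size = 10 * 1024 * 1024
print(recommended_max_request_body_size(raw_size))  # 31457280
```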

Hugging Face base URL

The base URL is https://api-inference.huggingface.co/{route_type_path}, where {route_type_path} is determined by the capability.

AI Gateway uses this URL automatically. You only need to configure a URL if you’re using a self-hosted or Hugging Face-compatible endpoint, in which case set the upstream_url plugin option.
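A rough sketch of how the override might look in the plugin's model settings, written as a Python dict. It assumes `upstream_url` sits under `config.model.options`, as in the AI Proxy schema. The `model_config` helper, the placeholder model name, and the example hostname are hypothetical.

```python
from typing import Optional


def model_config(name: str, upstream_url: Optional[str] = None) -> dict:
    """Build the `model` section of an AI Proxy config for the Hugging Face provider."""
    model = {"provider": "huggingface", "name": name}
    if upstream_url is not None:
        # Only set the override for self-hosted or compatible endpoints;
        # otherwise AI Gateway uses the default Hugging Face base URL.
        model["options"] = {"upstream_url": upstream_url}
    return model


# Default endpoint: no override needed.
print(model_config("<model-name>"))
# Self-hosted endpoint (hypothetical URL).
print(model_config("<model-name>", "https://hf.internal.example.com/v1/chat/completions"))
```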

Supported native LLM formats for Hugging Face

By default, the AI Proxy plugin uses OpenAI-compatible request formats. Set config.llm_format to a native format to use Hugging Face-specific APIs and features.

The following native Hugging Face APIs are supported:

| LLM format | Supported APIs |
|---|---|
| huggingface | /generate, /generate_stream |
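To make the native format concrete, here is a hedged Python sketch pairing the `llm_format` setting with a native-style request body. The body shape (`inputs` plus `parameters`) follows the `/generate` API of Hugging Face Text Generation Inference and is an assumption about what a client would send. The `native_generate_setup` helper is hypothetical.

```python
def native_generate_setup(prompt: str, max_new_tokens: int = 50):
    """Return (plugin config fragment, native request body) for the huggingface format."""
    # Fragment of the plugin's `config` block that switches to the native format.
    plugin_config_fragment = {"llm_format": "huggingface"}
    # Assumed native body shape, modelled on Text Generation Inference's /generate.
    request_body = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return plugin_config_fragment, request_body


config_fragment, body = native_generate_setup("What is an API gateway?", max_new_tokens=20)
print(config_fragment)
print(body)
```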

Configure Hugging Face with AI Proxy

To use Hugging Face with AI Gateway, configure the AI Proxy or AI Proxy Advanced plugin.

Here’s a minimal configuration for chat completions:
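A minimal sketch of such a configuration, built as a Python dict and printed as JSON in the shape accepted by the Kong Admin API when creating a plugin. The field layout (`route_type`, `auth`, `model.provider`, `model.name`) follows the AI Proxy plugin schema. The `huggingface_chat_plugin` helper and the token and model placeholders are assumptions to be replaced with your own values.

```python
import json


def huggingface_chat_plugin(token: str, model_name: str) -> dict:
    """Build an AI Proxy plugin entity for Hugging Face chat completions."""
    return {
        "name": "ai-proxy",
        "config": {
            "route_type": "llm/v1/chat",
            "auth": {
                # Hugging Face expects a bearer token in the Authorization header.
                "header_name": "Authorization",
                "header_value": f"Bearer {token}",
            },
            "model": {
                "provider": "huggingface",
                "name": model_name,
            },
        },
    }


# Placeholders only; substitute a real token and model name.
plugin = huggingface_chat_plugin("<HUGGINGFACE_TOKEN>", "<model-name>")
print(json.dumps(plugin, indent=2))
```

The printed JSON could be sent as the body of a plugin creation request to the Admin API, or translated field for field into a declarative configuration file.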

For more configuration options and examples, see the AI Proxy and AI Proxy Advanced plugin documentation.
