```mermaid
sequenceDiagram
    autonumber
    participant client as Client
    participant kong as Kong Gateway
    participant ai as AI LLM service
    participant backend as Backend service
    activate client
    activate kong
    client->>kong: Sends a request
    deactivate client
    activate ai
    kong->>ai: Sends client's request for transformation
    ai->>kong: Returns transformed request
    deactivate ai
    activate backend
    kong->>backend: Sends transformed request to backend
    backend->>kong: Returns response to Kong Gateway
    deactivate backend
    activate ai
    kong->>ai: Sends response to AI service
    ai->>kong: Returns transformed response
    deactivate ai
    activate client
    kong->>client: Returns transformed response to client
    deactivate kong
    deactivate client
```
Figure 1: The diagram shows the journey of a consumer's request through Kong Gateway to the backend service, transformed along the way by an AI LLM service via Kong's AI Request Transformer and AI Response Transformer plugins.
- The Kong Gateway admin sets up an `llm` configuration block.
- The Kong Gateway admin sets up a `prompt`. The prompt becomes the `system` message in the LLM chat request, and provides transformation instructions to the LLM for the returning upstream response body.
- The client makes an HTTP(S) call.
- After proxying the client's request to the backend, Kong Gateway sets the entire response body as the `user` message in the LLM chat request, then sends it to the configured LLM service.
- The LLM service returns a response (the `assistant` message), which is subsequently set as the upstream response body.
- The plugin returns early (`kong.response.exit`) and can handle gzip or chunked requests, similar to the Forward Proxy plugin.
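As a rough illustration of the two setup steps above, a declarative configuration might look like the following sketch. The plugin name, model, and the exact field names under `llm` (route type, auth, model) are illustrative assumptions; consult the plugin's configuration reference for the authoritative schema:

```yaml
plugins:
  - name: ai-response-transformer
    config:
      # The prompt becomes the `system` message sent to the LLM.
      prompt: "Rewrite the response body as valid, pretty-printed JSON."
      llm:
        route_type: llm/v1/chat
        auth:
          header_name: Authorization
          header_value: Bearer <API_KEY>   # placeholder, not a real key
        model:
          provider: openai
          name: gpt-4o
```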
You can additionally instruct the LLM to respond in the following format, which lets you adjust the response headers, response status code, and response body:
```json
{
  "headers":
  {
    "new-header": "new-value"
  }
}
```
If the `parse_llm_response_json_instructions` parameter is set to `true`, Kong Gateway parses these instructions and sets the specified response headers, response status code, and replacement response body.
This lets you change specific headers such as `Content-Type`, or throw errors from the LLM.
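To make the parsing step concrete, here is a minimal Python sketch of how such instructions could be applied to a response. This is not Kong's implementation: the `apply_llm_instructions` helper, the response dictionary shape, and the `status_code`/`body` instruction keys are illustrative assumptions.

```python
import json

def apply_llm_instructions(llm_output: str, response: dict) -> dict:
    """Parse JSON instructions returned by the LLM and apply them to a
    simplified response object (hypothetical helper, for illustration only)."""
    instructions = json.loads(llm_output)
    # Merge any new or overridden headers into the existing ones.
    if "headers" in instructions:
        response["headers"].update(instructions["headers"])
    # Replace the status code, e.g. to surface an error decided by the LLM.
    if "status_code" in instructions:
        response["status"] = instructions["status_code"]
    # Replace the body wholesale with the LLM-provided one.
    if "body" in instructions:
        response["body"] = instructions["body"]
    return response

# Example: the LLM instructs the gateway to change Content-Type and fail fast.
resp = {"headers": {"Content-Type": "application/json"}, "status": 200, "body": "{}"}
llm_output = '{"headers": {"Content-Type": "text/plain"}, "status_code": 404, "body": "not found"}'
resp = apply_llm_instructions(llm_output, resp)
# resp["headers"]["Content-Type"] is now "text/plain", status 404, body "not found"
```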