How to: Ensure chatbots adhere to compliance policies with the AI RAG Injector plugin

Prerequisites

Kong Gateway running

This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.

Export your license to an environment variable:

 export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'

Run the quickstart script:

curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA 

Once Kong Gateway is ready, you will see the following message:

 Kong Gateway Ready

decK

decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial you will first need to install decK.

Required entities

For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:

Run the following command:

echo '
_format_version: "3.0"
services:
  - name: example-service
    url: http://httpbin.konghq.com/anything
routes:
  - name: example-route
    paths:
    - "/anything"
    service:
      name: example-service
' | deck gateway apply -

To learn more about entities, you can read our entities documentation.

OpenAI

This tutorial uses OpenAI:

Create an OpenAI account.
Get an API key.
Create a decK variable with the API key:

export DECK_OPENAI_API_KEY="YOUR OPENAI API KEY"

Redis stack

To complete this tutorial, make sure you have the following:

A Redis Stack running and accessible from the environment where Kong is deployed.
Port 6379, or your custom Redis port, open and reachable from Kong.
Redis host set as an environment variable so the plugin can connect:
```
export DECK_REDIS_HOST='YOUR-REDIS-HOST'
```

If you’re testing locally with Docker, use host.docker.internal as the host value.

Langchain splitters

To complete this tutorial, you’ll need Python (version 3.7 or later) and pip installed on your machine. You can verify it by running:

python3 --version
python3 -m pip --version

Once that’s set up, install the required packages by running the following command in your terminal:

python3 -m pip install langchain langchain_text_splitters requests

Configure the AI Proxy Advanced plugin

First, you’ll need to configure the AI Proxy Advanced plugin to proxy prompt requests to your model provider, and handle authentication:

echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy-advanced
    config:
      targets:
      - route_type: llm/v1/chat
        auth:
          header_name: Authorization
          header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
        model:
          provider: openai
          name: gpt-4o
          options:
            max_tokens: 512
            temperature: 1.0
' | deck gateway apply -

Configure the AI RAG Injector plugin

Next, configure the AI RAG Injector plugin to inject precise, context-specific instructions and relevant knowledge from a company’s private compliance data into the AI prompt. This configuration ensures the AI answers employee questions accurately using only approved information through retrieval-augmented generation (RAG).

echo '
_format_version: "3.0"
plugins:
  - name: ai-rag-injector
    id: b924e3e8-7893-4706-aacb-e75793a1d2e9
    config:
      inject_template: |
        You are an AI assistant designed to answer employee questions using only the approved compliance content provided between the <RAG></RAG> tags.
        Do not use external or general knowledge, and do not answer if the information is not available in the RAG content.
        <RAG><CONTEXT></RAG>
        User'\''s question: <PROMPT>
        Respond only with information found in the <RAG> section. If the answer is not clearly present, reply with:
        "I'\''m sorry, I cannot answer that based on the available compliance information."
      embeddings:
        auth:
          header_name: Authorization
          header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
        model:
          provider: openai
          name: text-embedding-3-large
      vectordb:
        strategy: redis
        redis:
          host: "${{ env "DECK_REDIS_HOST" }}"
          port: 6379
        distance_metric: cosine
        dimensions: 3072
' | deck gateway apply -

If your Redis instance runs in a separate Docker container from Kong, use host.docker.internal for vectordb.redis.host.

If you’re using a model other than text-embedding-3-large, be sure to update the vectordb.dimensions value to match the model’s embedding size.
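As a quick sanity check before applying the configuration, the relationship between the embedding model and vectordb.dimensions can be sketched in Python. The lookup table and the check_dimensions helper are hypothetical names for illustration and are not part of Kong or decK. The sizes listed are the default output sizes of these OpenAI models; confirm the size for any other model in its provider documentation.

```python
# Default embedding sizes for some OpenAI models (illustrative lookup table).
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}


def check_dimensions(model_name: str, configured: int) -> bool:
    """Return True when the configured vectordb.dimensions matches the model."""
    expected = EMBEDDING_DIMENSIONS.get(model_name)
    if expected is None:
        raise KeyError("Unknown model %r; look up its embedding size first" % model_name)
    return expected == configured


# The configuration in this tutorial pairs text-embedding-3-large with 3072.
print(check_dimensions("text-embedding-3-large", 3072))
# Switching the model without updating dimensions would be a mismatch.
print(check_dimensions("text-embedding-3-small", 3072))
```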

Split input data before ingestion

Before sending data to the AI Gateway, split your input into manageable chunks using a text splitting tool like langchain_text_splitters. This helps optimize downstream processing and improves semantic retrieval performance.

If your documents are structured data rather than plain text, refer to the langchain_text_splitters documentation for a more suitable splitter.
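To build intuition for what chunk size and chunk overlap mean, here is a minimal sketch of a fixed-size sliding-window splitter in plain Python. It is a simplification for illustration only: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, which this sketch does not do. The split_with_overlap function is a hypothetical helper, not part of LangChain.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into chunks of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Each chunk repeats the last character of the previous one.
print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.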

The following Python script demonstrates how to split text using RecursiveCharacterTextSplitter and ingest the resulting chunks into the AI Gateway. This script uses the AI RAG Injector plugin ID we set in the previous step, so be sure to replace it if your plugin has a different ID.

Save the script as inject_policy.py:

from langchain_text_splitters import RecursiveCharacterTextSplitter
import requests

TEXT = ["""
Acme Corp. Travel Policy
1. Purpose
This policy outlines the guidelines for employees traveling on company business to ensure efficient, cost-effective, and accountable use of company funds.
2. Scope
This policy applies to all employees traveling on company business, including domestic and international travel.
3. Travel Approval

    All travel must be pre-approved by the employee's supervisor and, if applicable, by higher management, based on business need and cost-effectiveness.
    Travel requests should be submitted at least [Number] weeks/days in advance, including destination, purpose, dates, and estimated costs.
    Travel requests should be submitted using the designated travel request form.

4. Transportation

    Air Travel:

    Employees should book the most cost-effective airfare, considering time and cost.

Business class or first-class travel is only permitted with prior approval and for exceptional circumstances.
Employees should choose direct flights whenever possible.

Ground Transportation:

    For travel to and from airports or within the destination, employees should use cost-effective options such as shuttles, public transportation, or car services.

Personal vehicle use is permitted for business travel, with reimbursement at the standard IRS mileage rate.
Parking and tolls: are reimbursable when necessary.

Train Travel:

    Train travel is considered an appropriate mode of transportation for certain destinations and will be reimbursed if the cost is less than other means of transportation.

5. Lodging

    Employees should choose lodging that is cost-effective and meets the needs of the business trip.
    Hotel selection: should be based on location, proximity to meeting venues, and cost.
    Employees should book accommodations in advance to secure the best rates.
    Travelers should share hotel rooms with other employees when feasible and appropriate.

6. Meals

    Meals are reimbursable during business travel, but expenses should be kept reasonable and appropriate.
    Employees should present receipts for all meal expenses.
    Alcoholic beverages: are not reimbursable.
    When attending business functions with meals provided, expenses for meals purchased elsewhere are not reimbursed unless specifically authorized in advance.

7. Other Expenses

    Entertainment expenses: are generally not reimbursable, except for business-related entertainment that is necessary for client relations.
    Telephone expenses: are reimbursable when necessary for business travel, but should be kept to a minimum.
    Internet access: is reimbursable when necessary for business travel.

8. Reimbursement

    Employees should submit all travel expenses for reimbursement within 27 days of the trip.
    Employees should submit receipts for all travel expenses.
    Reimbursement will be made in accordance with company policy.

9. Compliance

    All employees are expected to comply with this travel policy.
    Violation of this policy may result in disciplinary action.

10. Policy Updates

    This policy may be updated from time to time as needed.
    Employees will be notified of any changes to this policy.
"""]

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = text_splitter.create_documents(TEXT)

print("Injecting %d chunks..." % len(docs))

for doc in docs:
    response = requests.post(
        "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/ingest_chunk", # Replace the ID in this URL with your AI RAG Injector plugin ID
        data={'content': doc.page_content}
    )
    print(response.json())

You can replace print(response.json()) with print(response.text) to view the raw HTTP response body as a plain string instead of a parsed JSON object. This is useful for debugging cases where:

The response isn’t valid JSON (e.g., plain text error message or HTML).

You want to inspect the exact response content without triggering a JSON parse error.

Use response.text when troubleshooting unexpected server responses or plugin misconfigurations.
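One way to get both behaviors without editing the script each time is a small fallback helper, sketched below. The describe_body name is hypothetical. It takes the raw body string (for example response.text) and returns parsed JSON when possible, or the untouched text otherwise.

```python
import json


def describe_body(body: str):
    """Return parsed JSON when the body is valid JSON, otherwise the raw text."""
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return body


# A JSON body is parsed into a dict.
print(describe_body('{"metadata": {"embeddings_tokens_count": 3}}'))
# A non-JSON body, such as an HTML error page, is returned unchanged.
print(describe_body("<html>502 Bad Gateway</html>"))
```

In the ingestion loop, this could be used as print(describe_body(response.text)).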

Run the inject_policy.py script in your terminal:

python3 ./inject_policy.py

This will output the number of chunks created and display the response from the injector endpoint for each chunk:

Injecting 4 chunks...
{"metadata":{"ingest_duration":1476,"embeddings_tokens_count":157,"chunk_id":"a1b2c3d4-e5f6-7890-ab12-34567890abcd"}}
{"metadata":{"ingest_duration":1323,"embeddings_tokens_count":140,"chunk_id":"b2c3d4e5-f678-9012-bc34-567890abcdef"}}
{"metadata":{"ingest_duration":1286,"embeddings_tokens_count":141,"chunk_id":"c3d4e5f6-7890-1234-cd56-7890abcdef12"}}
{"metadata":{"ingest_duration":2892,"embeddings_tokens_count":168,"chunk_id":"d4e5f678-9012-3456-de78-90abcdef1234"}}

Ingest content to the vector database

Now, you can feed the split chunks into AI Gateway using the Kong Admin API.

The following example shows how to ingest content to the vector database for building the knowledge base. The AI RAG Injector plugin uses the OpenAI text-embedding-3-large model to generate embeddings for the content and stores them in Redis.

 curl "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/ingest_chunk" \
     -H "Accept: application/json" \
     -H "Content-Type: application/json" \
     --json '{
       "content": "<chunk>"
     }'

This will return something like the following:

{"metadata":{"embeddings_tokens_count":3,"chunk_id": "3fa85f64-5717-4562-b3fc-2c963fabcdef","ingest_duration":550}}
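The same request can be sketched in Python using only the standard library. This assumes the Admin API address and plugin ID used throughout this tutorial; build_ingest_request is a hypothetical helper name. The sketch only constructs the request so you can inspect it, and sending it is left as a commented-out line.

```python
import json
import urllib.request

ADMIN_URL = "http://localhost:8001"  # Admin API address used in this tutorial
PLUGIN_ID = "b924e3e8-7893-4706-aacb-e75793a1d2e9"  # replace with your plugin ID


def build_ingest_request(content: str, plugin_id: str = PLUGIN_ID,
                         admin_url: str = ADMIN_URL) -> urllib.request.Request:
    """Build a POST request that ingests one chunk of content."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        "%s/ai-rag-injector/%s/ingest_chunk" % (admin_url, plugin_id),
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


req = build_ingest_request("Alcoholic beverages are not reimbursable.")
print(req.get_method(), req.full_url)
# To actually send it against a running gateway:
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```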

Test RAG configuration

Now you can send various questions to the AI to verify that RAG is working correctly.

In-scope questions

Use the following in-scope questions to verify that the AI responds accurately based on the approved compliance content and doesn’t rely on external knowledge.

Use simple user questions that map directly to travel policy clauses:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Are alcoholic beverages reimbursable?"
         }
       ]
     }'

You can also ask this question:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What documentation is required for travel reimbursement?"
         }
       ]
     }'

Use slightly more complex prompts involving multi-step policy logic or multiple clauses:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Can I get reimbursed for internet charges during a business trip?"
         }
       ]
     }'

Also, you can ask a more complex query about booking a hotel:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Do I need to book my hotel in advance for business travel?"
         }
       ]
     }'

Use prompts that test boundaries of the compliance language:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Am I allowed to share a hotel room with another employee?"
         }
       ]
     }'

Or ask about public transportation:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What’s the policy on using public transportation during travel?"
         }
       ]
     }'
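If you would rather generate these request bodies than copy each curl command, the payload shape can be sketched as follows. The chat_payload helper and the question list are illustrative names only. Each printed line is a JSON body that could be passed to curl with --json against the route used above.

```python
import json
from typing import Dict, List

# A few of the in-scope questions from this section.
IN_SCOPE_QUESTIONS = [
    "Are alcoholic beverages reimbursable?",
    "What documentation is required for travel reimbursement?",
    "Do I need to book my hotel in advance for business travel?",
]


def chat_payload(question: str) -> Dict[str, List[Dict[str, str]]]:
    """Wrap a single user question in the chat request body used in this tutorial."""
    return {"messages": [{"role": "user", "content": question}]}


for question in IN_SCOPE_QUESTIONS:
    print(json.dumps(chat_payload(question)))
```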

Out-of-scope questions

Use the following out-of-scope questions to confirm that the AI correctly refuses to answer queries that fall outside the ingested compliance content. The AI should return the following response to these requests:

"message": {
  "role": "assistant",
  "content": "I'm sorry, I cannot answer that based on the available compliance information."
}
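To check this automatically, you could test each response for the refusal sentence. The sketch below assumes the response follows the OpenAI chat completions shape, with the message under choices[0]; that shape is an assumption, so adjust the lookup if your provider returns something different. The is_refusal helper is a hypothetical name.

```python
REFUSAL = "I'm sorry, I cannot answer that based on the available compliance information."


def is_refusal(response: dict) -> bool:
    """Return True when the first choice contains the configured refusal sentence."""
    choices = response.get("choices") or []
    if not choices:
        return False
    content = choices[0].get("message", {}).get("content") or ""
    return REFUSAL in content


# Hand-written sample responses (assumed OpenAI-style shape), not real gateway output.
refused = {"choices": [{"message": {"role": "assistant", "content": REFUSAL}}]}
answered = {"choices": [{"message": {"role": "assistant", "content": "Alcoholic beverages are not reimbursable."}}]}
print(is_refusal(refused), is_refusal(answered))
```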

These questions ask about Acme Corp. in general, not about the travel policy:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What does Acme Corp. do?"
         }
       ]
     }'

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Where is Acme Corp. headquartered?"
         }
       ]
     }'

These questions require general or external knowledge that is not included in the ingested content:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "Who is the CEO of OpenAI?"
         }
       ]
     }'

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "How does Redis handle vector storage?"
         }
       ]
     }'

These prompts reference company policies that aren’t part of the travel policy content:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "How much vacation time do I get per year?"
         }
       ]
     }'

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What’s the parental leave policy at Acme Corp.?"
         }
       ]
     }'

These prompts are vague, outside compliance scope, or might encourage hallucination if guardrails aren’t working:

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What is the best destination for international travel?"
         }
       ]
     }'

 curl "http://localhost:8000/anything" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $DECK_OPENAI_API_KEY" \
     --json '{
       "messages": [
         {
           "role": "user",
           "content": "What should I pack for an international trip?"
         }
       ]
     }'

Debug the retrieval of the knowledge base

To evaluate which documents are retrieved for a specific prompt, use the following command:

 curl "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/lookup_chunks" \
     -H "Accept: application/json" \
     -H "Content-Type: application/json" \
     --json '{
       "prompt": "Am I allowed to share a hotel room with another employee?",
       "exclude_contents": false
     }'

The response shows which chunks of the compliance policy the AI uses to answer the user’s question.

To omit the chunk content and only return the chunk ID, set exclude_contents to true.
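The two variants of the lookup body differ only in the exclude_contents flag, which can be sketched as a small builder. The lookup_payload helper and its ids_only parameter are hypothetical names. The output is the JSON body to send to the lookup_chunks endpoint shown above.

```python
import json


def lookup_payload(prompt: str, ids_only: bool = False) -> str:
    """Build the lookup_chunks request body; ids_only maps to exclude_contents."""
    return json.dumps({"prompt": prompt, "exclude_contents": ids_only})


question = "Am I allowed to share a hotel room with another employee?"
# Full lookup: the matching chunks are returned with their content.
print(lookup_payload(question))
# IDs only: the chunk content is omitted from the response.
print(lookup_payload(question, ids_only=True))
```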

Update content for ingesting

If you are running Kong Gateway in traditional mode, you can update content for ingesting by sending a request to the /ai-rag-injector/{pluginId}/ingest_chunk endpoint.

However, this won’t work in hybrid mode or Konnect because the control plane can’t access the plugin’s backend storage.

To update content for ingesting in hybrid mode or Konnect, you can use a script:

Retrieve the ID of the AI RAG Injector plugin that you want to update.

Copy and paste the following script to a local file, for example ingest_update.lua:

local embeddings = require("kong.llm.embeddings")
local uuid = require("kong.tools.utils").uuid
local vectordb = require("kong.llm.vectordb")

local function get_plugin_by_id(id)
  local row, err = kong.db.plugins:select(
    {id = id},
    { workspace = ngx.null, show_ws_id = true, expand_partials = true }
  )

  if err then
      return nil, err
  end

  return row
end

local function ingest_chunk(conf, content)
  local err
  local metadata = {
      ingest_duration = ngx.now(),
  }
  -- vectordb driver init
  local vectordb_driver
  do
      vectordb_driver, err = vectordb.new(conf.vectordb.strategy, conf.vectordb_namespace, conf.vectordb)
      if err then
          return nil, "Failed to load the '" .. conf.vectordb.strategy .. "' vector database driver: " .. err
      end
  end

  -- embeddings init
  local embeddings_driver, err = embeddings.new(conf.embeddings, conf.vectordb.dimensions)
  if err then
      return nil, "Failed to instantiate embeddings driver: " .. err
  end

  local embeddings_vector, embeddings_tokens_count, err = embeddings_driver:generate(content)
  if err then
      return nil, "Failed to generate embeddings: " .. err
  end

  metadata.embeddings_tokens_count = embeddings_tokens_count
  if #embeddings_vector ~= conf.vectordb.dimensions then
    return nil, "Embedding dimensions do not match the configured vector database. Embeddings were " ..
      #embeddings_vector .. " dimensions, but the vector database is configured for " ..
      conf.vectordb.dimensions .. " dimensions.", "Embedding dimensions do not match the configured vector database"
  end

  metadata.chunk_id = uuid()
  -- ingest chunk
  local _, err = vectordb_driver:insert(embeddings_vector, content, metadata.chunk_id)
  if err then
      return nil, "Failed to insert chunk: " .. err
  end

  return true
end

assert(#args == 3, "2 arguments expected")
local plugin_id, content = args[2], args[3]

local plugin, err = get_plugin_by_id(plugin_id)
if err then
  ngx.log(ngx.ERR, "Failed to get plugin: " .. err)
  return
end

if not plugin then
  ngx.log(ngx.ERR, "Plugin not found")
  return
end

local _, err = ingest_chunk(plugin.config, content)
if err then
  ngx.log(ngx.ERR, "Failed to ingest: " .. err)
  return
end

ngx.log(ngx.INFO, "Update completed")

Run the script from your Kong instance. This uses your AI RAG Injector plugin ID and the content you want to update. Here’s an example:
```
kong runner ingest_update.lua b924e3e8-7893-4706-aacb-e75793a1d2e9 ./inject_policy.py
```

Cleanup

Clean up Konnect environment

If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.

Destroy the Kong Gateway container

curl -Ls https://get.konghq.com/quickstart | bash -s -- -d
