Ensure chatbots adhere to compliance policies with the AI RAG Injector plugin
Use the AI RAG Injector plugin to integrate your company’s compliance policy documents as retrieval-augmented knowledge. Configure the plugin to inject context from these documents into chatbot prompts, ensuring the chatbot can dynamically generate relevant, accurate responses to compliance-related questions during conversations.
Prerequisites
Kong Gateway running
This tutorial requires Kong Gateway Enterprise. If you don’t have Kong Gateway set up yet, you can use the quickstart script with an enterprise license to get an instance of Kong Gateway running almost instantly.
- Export your license to an environment variable:
export KONG_LICENSE_DATA='LICENSE-CONTENTS-GO-HERE'
- Run the quickstart script:
curl -Ls https://get.konghq.com/quickstart | bash -s -- -e KONG_LICENSE_DATA
Once Kong Gateway is ready, you will see the following message:
Kong Gateway Ready
decK
decK is a CLI tool for managing Kong Gateway declaratively with state files. To complete this tutorial you will first need to install decK.
Required entities
For this tutorial, you’ll need Kong Gateway entities, like Gateway Services and Routes, pre-configured. These entities are essential for Kong Gateway to function but installing them isn’t the focus of this guide. Follow these steps to pre-configure them:
- Run the following command:
echo '
_format_version: "3.0"
services:
  - name: example-service
    url: http://httpbin.konghq.com/anything
routes:
  - name: example-route
    paths:
      - "/anything"
    service:
      name: example-service
' | deck gateway apply -
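You can verify that the Route is reachable before continuing (assuming the quickstart defaults, with the proxy listening on localhost:8000):
curl -i http://localhost:8000/anything
Until the AI plugins are configured, this simply proxies to httpbin and should return a 200 response.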
To learn more about entities, you can read our entities documentation.
OpenAI
This tutorial uses OpenAI:
- Create an OpenAI account.
- Get an API key.
- Create a decK variable with the API key:
export DECK_OPENAI_API_KEY='YOUR OPENAI API KEY'
Redis stack
To complete this tutorial, you must have a Redis stack configured in your environment. Set your Redis host as an environment variable:
export DECK_REDIS_HOST='YOUR-REDIS-HOST'
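If you don’t have a Redis Stack instance yet, one quick way to start one locally for testing is with Docker. This is a sketch that assumes Docker is installed and that Kong Gateway runs in the quickstart container, so it reaches Redis via host.docker.internal:
docker run -d --name redis-stack -p 6379:6379 redis/redis-stack-server:latest
export DECK_REDIS_HOST='host.docker.internal'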
PgVector (optional)
Alternatively, the plugin can use PgVector instead of Redis as the vector database. This tutorial uses Redis, so a PgVector setup isn’t required.
Langchain splitters
To complete this tutorial, you’ll need Python (version 3.7 or later) and pip installed on your machine. You can verify both by running:
python3 --version
python3 -m pip --version
Once that’s set up, install the required packages by running the following command in your terminal:
python3 -m pip install langchain langchain_text_splitters requests
Configure the AI Proxy Advanced plugin
First, you’ll need to configure the AI Proxy Advanced plugin to proxy prompt requests to your model provider, and handle authentication:
echo '
_format_version: "3.0"
plugins:
  - name: ai-proxy-advanced
    config:
      targets:
        - route_type: llm/v1/chat
          auth:
            header_name: Authorization
            header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
          model:
            provider: openai
            name: gpt-4o
            options:
              max_tokens: 512
              temperature: 1.0
' | deck gateway apply -
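Before adding RAG, you can sanity-check the proxy with a chat request through the Route created earlier. This is a hypothetical request that assumes the default proxy port 8000 and the OpenAI-style chat format the plugin accepts:
curl -X POST "http://localhost:8000/anything" \
  -H "Content-Type: application/json" \
  --json '{
    "messages": [
      {"role": "user", "content": "Say hello in one sentence."}
    ]
  }'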
Configure the AI RAG Injector plugin
Next, configure the AI RAG Injector plugin to inject precise, context-specific instructions and relevant knowledge from a company’s private compliance data into the AI prompt. This configuration ensures the AI answers employee questions accurately using only approved information through retrieval-augmented generation (RAG).
echo '
_format_version: "3.0"
plugins:
  - name: ai-rag-injector
    id: b924e3e8-7893-4706-aacb-e75793a1d2e9
    config:
      inject_template: |
        You are an AI assistant designed to answer employee questions using only the approved compliance content provided between the <RAG></RAG> tags.
        Do not use external or general knowledge, and do not answer if the information is not available in the RAG content.
        <RAG><CONTEXT></RAG>
        User'\''s question: <PROMPT>
        Respond only with information found in the <RAG> section. If the answer is not clearly present, reply with:
        "I'\''m sorry, I cannot answer that based on the available compliance information."
      embeddings:
        auth:
          header_name: Authorization
          header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}
        model:
          provider: openai
          name: text-embedding-3-large
      vectordb:
        strategy: redis
        redis:
          host: "${{ env "DECK_REDIS_HOST" }}"
          port: 6379
        distance_metric: cosine
        dimensions: 3072
' | deck gateway apply -
If your Redis instance runs in a separate Docker container from Kong, use host.docker.internal for vectordb.redis.host.
If you’re using a model other than text-embedding-3-large, be sure to update the vectordb.dimensions value to match the model’s embedding size.
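For example, if you were to use OpenAI’s text-embedding-3-small, which produces 1536-dimensional embeddings, the relevant parts of the plugin configuration would change as follows (a sketch, not part of this tutorial’s setup):
      embeddings:
        model:
          provider: openai
          name: text-embedding-3-small
      vectordb:
        dimensions: 1536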
Split input data before ingestion
Before sending data to the AI Gateway, split your input into manageable chunks using a text splitting tool like langchain_text_splitters. This helps optimize downstream processing and improves semantic retrieval performance.
Refer to the LangChain text splitters documentation if your documents are structured data rather than plain text.
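For instance, if your policies live in Markdown files, a header-aware splitter keeps each chunk within a single policy section. Here is a minimal sketch using MarkdownHeaderTextSplitter from langchain_text_splitters; the sample document is hypothetical:
from langchain_text_splitters import MarkdownHeaderTextSplitter

# Split on first- and second-level headings so each chunk maps to one section.
splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "title"), ("##", "section")]
)
docs = splitter.split_text("# Travel Policy\n\n## Meals\n\nMeals are reimbursable during business travel.")
for doc in docs:
    print(doc.metadata, doc.page_content)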
The following Python script demonstrates how to split text using RecursiveCharacterTextSplitter and ingest the resulting chunks into the AI Gateway. The script uses the AI RAG Injector plugin ID set in the previous step, so be sure to replace it if your plugin has a different ID.
Save the script as inject_policy.py:
from langchain_text_splitters import RecursiveCharacterTextSplitter
import requests
TEXT = ["""
Acme Corp. Travel Policy
1. Purpose
This policy outlines the guidelines for employees traveling on company business to ensure efficient, cost-effective, and accountable use of company funds.
2. Scope
This policy applies to all employees traveling on company business, including domestic and international travel.
3. Travel Approval
All travel must be pre-approved by the employee's supervisor and, if applicable, by higher management, based on business need and cost-effectiveness.
Travel requests should be submitted at least [Number] weeks/days in advance, including destination, purpose, dates, and estimated costs.
Travel requests should be submitted using the designated travel request form.
4. Transportation
Air Travel:
Employees should book the most cost-effective airfare, considering time and cost.
Business class or first-class travel is only permitted with prior approval and for exceptional circumstances.
Employees should choose direct flights whenever possible.
Ground Transportation:
For travel to and from airports or within the destination, employees should use cost-effective options such as shuttles, public transportation, or car services.
Personal vehicle use is permitted for business travel, with reimbursement at the standard IRS mileage rate.
Parking and tolls are reimbursable when necessary.
Train Travel:
Train travel is considered an appropriate mode of transportation for certain destinations and will be reimbursed if the cost is less than other means of transportation.
5. Lodging
Employees should choose lodging that is cost-effective and meets the needs of the business trip.
Hotel selection should be based on location, proximity to meeting venues, and cost.
Employees should book accommodations in advance to secure the best rates.
Travelers should share hotel rooms with other employees when feasible and appropriate.
6. Meals
Meals are reimbursable during business travel, but expenses should be kept reasonable and appropriate.
Employees should present receipts for all meal expenses.
Alcoholic beverages are not reimbursable.
When attending business functions with meals provided, expenses for meals purchased elsewhere are not reimbursed unless specifically authorized in advance.
7. Other Expenses
Entertainment expenses are generally not reimbursable, except for business-related entertainment that is necessary for client relations.
Telephone expenses are reimbursable when necessary for business travel, but should be kept to a minimum.
Internet access is reimbursable when necessary for business travel.
8. Reimbursement
Employees should submit all travel expenses for reimbursement within 27 days of the trip.
Employees should submit receipts for all travel expenses.
Reimbursement will be made in accordance with company policy.
9. Compliance
All employees are expected to comply with this travel policy.
Violation of this policy may result in disciplinary action.
10. Policy Updates
This policy may be updated from time to time as needed.
Employees will be notified of any changes to this policy.
"""]
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = text_splitter.create_documents(TEXT)
print("Injecting %d chunks..." % len(docs))
for doc in docs:
    response = requests.post(
        "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/ingest_chunk",  # Replace the placeholder with your AI RAG Injector plugin ID
        data={'content': doc.page_content}
    )
    print(response.json())
You can replace print(response.json()) with print(response.text) to view the raw HTTP response body as a plain string instead of a parsed JSON object. This is useful for debugging cases where:
- The response isn’t valid JSON (e.g., a plain-text error message or HTML).
- You want to inspect the exact response content without triggering a JSON parse error.
Use response.text when troubleshooting unexpected server responses or plugin misconfigurations.
Run the inject_policy.py script in your terminal:
python3 ./inject_policy.py
This will output the number of chunks created and display the response from the injector endpoint for each chunk:
Injecting 4 chunks...
{"metadata":{"ingest_duration":1476,"embeddings_tokens_count":157,"chunk_id":"a1b2c3d4-e5f6-7890-ab12-34567890abcd"}}
{"metadata":{"ingest_duration":1323,"embeddings_tokens_count":140,"chunk_id":"b2c3d4e5-f678-9012-bc34-567890abcdef"}}
{"metadata":{"ingest_duration":1286,"embeddings_tokens_count":141,"chunk_id":"c3d4e5f6-7890-1234-cd56-7890abcdef12"}}
{"metadata":{"ingest_duration":2892,"embeddings_tokens_count":168,"chunk_id":"d4e5f678-9012-3456-de78-90abcdef1234"}}
Ingest content to the vector database
Now, you can feed the split chunks into the AI Gateway using the Kong Admin API.
The following example shows how to ingest content into the vector database to build the knowledge base. The AI RAG Injector plugin uses the OpenAI text-embedding-3-large model to generate embeddings for the content and stores them in Redis.
curl "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/ingest_chunk" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
--json '{
"content": "<chunk>"
}'
This will return something like the following:
{"metadata":{"embeddings_tokens_count":3,"chunk_id": "3fa85f64-5717-4562-b3fc-2c963fabcdef","ingest_duration":550}}
Test RAG configuration
Now you can send various questions to the AI to verify that RAG is working correctly.
In-scope questions
Send in-scope questions to verify that the AI responds accurately based on the approved compliance content and doesn’t rely on external knowledge, as in the example below.
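The following hypothetical request (assuming the example-route from the prerequisites and the default proxy port 8000) asks a question that the ingested travel policy covers:
curl -X POST "http://localhost:8000/anything" \
  -H "Content-Type: application/json" \
  --json '{
    "messages": [
      {"role": "user", "content": "Am I allowed to share a hotel room with another employee?"}
    ]
  }'
The response should draw only on the ingested policy text, which states that travelers should share hotel rooms when feasible and appropriate.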
Out-of-scope questions
Send out-of-scope questions to confirm that the AI correctly refuses to answer queries that fall outside the ingested compliance content. The AI should return the following response to these requests:
"message": {
  "role": "assistant",
  "content": "I'm sorry, I cannot answer that based on the available compliance information."
}
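For instance, a question unrelated to the travel policy (again a hypothetical request against the same Route) should trigger the refusal message:
curl -X POST "http://localhost:8000/anything" \
  -H "Content-Type: application/json" \
  --json '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'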
Debug the retrieval of the knowledge base
To evaluate which documents are retrieved for a specific prompt, use the following command:
curl "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/lookup_chunks" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
--json '{
"prompt": "Am I allowed to share a hotel room with another employee?",
"exclude_contents": false
}'
This returns the content from the compliance policy that the AI uses to answer the user’s question.
To omit the chunk contents and return only the chunk IDs, set exclude_contents to true.
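For example, the same lookup returning only chunk IDs:
curl "http://localhost:8001/ai-rag-injector/b924e3e8-7893-4706-aacb-e75793a1d2e9/lookup_chunks" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  --json '{
    "prompt": "Am I allowed to share a hotel room with another employee?",
    "exclude_contents": true
  }'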
Update content for ingesting
If you are running Kong Gateway in traditional mode, you can update content for ingesting by sending a request to the /ai-rag-injector/{pluginId}/ingest_chunk endpoint.
However, this won’t work in hybrid mode or Konnect because the control plane can’t access the plugin’s backend storage.
To update content for ingesting in hybrid mode or Konnect, you can use a script:
- Retrieve the ID of the AI RAG Injector plugin that you want to update.
- Copy and paste the following script into a local file, for example ingest_update.lua:

local embeddings = require("kong.llm.embeddings")
local uuid = require("kong.tools.utils").uuid
local vectordb = require("kong.llm.vectordb")

local function get_plugin_by_id(id)
  local row, err = kong.db.plugins:select(
    { id = id },
    { workspace = ngx.null, show_ws_id = true, expand_partials = true }
  )
  if err then
    return nil, err
  end
  return row
end

local function ingest_chunk(conf, content)
  local err
  local metadata = {
    ingest_duration = ngx.now(),
  }

  -- vectordb driver init
  local vectordb_driver
  do
    vectordb_driver, err = vectordb.new(conf.vectordb.strategy, conf.vectordb_namespace, conf.vectordb)
    if err then
      return nil, "Failed to load the '" .. conf.vectordb.strategy .. "' vector database driver: " .. err
    end
  end

  -- embeddings init
  local embeddings_driver, err = embeddings.new(conf.embeddings, conf.vectordb.dimensions)
  if err then
    return nil, "Failed to instantiate embeddings driver: " .. err
  end

  local embeddings_vector, embeddings_tokens_count, err = embeddings_driver:generate(content)
  if err then
    return nil, "Failed to generate embeddings: " .. err
  end
  metadata.embeddings_tokens_count = embeddings_tokens_count

  if #embeddings_vector ~= conf.vectordb.dimensions then
    return nil, "Embedding dimensions do not match the configured vector database. Embeddings were " ..
      #embeddings_vector .. " dimensions, but the vector database is configured for " ..
      conf.vectordb.dimensions .. " dimensions."
  end

  metadata.chunk_id = uuid()

  -- ingest chunk
  local _, err = vectordb_driver:insert(embeddings_vector, content, metadata.chunk_id)
  if err then
    return nil, "Failed to insert chunk: " .. err
  end

  return true
end

assert(#args == 3, "2 arguments expected")
local plugin_id, content = args[2], args[3]

local plugin, err = get_plugin_by_id(plugin_id)
if err then
  ngx.log(ngx.ERR, "Failed to get plugin: " .. err)
  return
end
if not plugin then
  ngx.log(ngx.ERR, "Plugin not found")
  return
end

local _, err = ingest_chunk(plugin.config, content)
if err then
  ngx.log(ngx.ERR, "Failed to ingest: " .. err)
  return
end

ngx.log(ngx.INFO, "Update completed")
- Run the script from your Kong instance, passing your AI RAG Injector plugin ID and the content you want to ingest. Here’s an example:

kong runner ingest_update.lua b924e3e8-7893-4706-aacb-e75793a1d2e9 '<updated policy content>'
Cleanup
Clean up Konnect environment
If you created a new control plane and want to conserve your free trial credits or avoid unnecessary charges, delete the new control plane used in this tutorial.
Destroy the Kong Gateway container
curl -Ls https://get.konghq.com/quickstart | bash -s -- -d