Kong Gateway Operator can deploy Data Planes that will horizontally autoscale based on user defined criteria.
This page shows how to autoscale Data Planes based on their average CPU utilization.
Kong Gateway Operator uses the Kubernetes HorizontalPodAutoscaler to perform horizontal autoscaling of Data Planes.
Add the Kong Helm charts:
helm repo add kong https://charts.konghq.com
helm repo update
Create a kong namespace:
kubectl create namespace kong --dry-run=client -o yaml | kubectl apply -f -
Install Kong Gateway Operator using Helm:
helm upgrade --install kgo kong/gateway-operator -n kong-system --create-namespace \
--set image.tag=1.5 \
--set kubernetes-configuration-crds.enabled=true \
--set env.ENABLE_CONTROLLER_KONNECT=true
To use a HorizontalPodAutoscaler in your cluster, you need a metrics server installed. For more information, see the metrics server documentation in the official Kubernetes docs.
To install a metrics server for testing purposes, run the following commands:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl patch deployment metrics-server \
-n kube-system \
--type='json' \
-p='[{
"op": "add",
"path": "/spec/template/spec/containers/0/args/-",
"value": "--kubelet-insecure-tls"
}]'
To enable horizontal autoscaling, specify the spec.deployment.scaling section in your DataPlane resource to indicate which metrics should be used for decision making.
In the example below, autoscaling is triggered based on CPU utilization. The DataPlane resource can have between 2 and 10 replicas, and a new replica is launched whenever average CPU utilization rises above 50%.
The scaleUp configuration states that either 100% of the existing replicas or 5 new pods (whichever is higher) may be added every 10 seconds. If you have 3 replicas, up to 5 pods may be created; if you have 50 replicas, up to 50 more may be launched.
The scaleDown configuration states that 100% of pods may be removed, subject to the minReplicas value of 2.
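The interaction between the two scaleUp policies can be sketched with a little shell arithmetic. This is illustrative only; the real computation is performed by the HPA controller in the cluster:

```shell
# Scale-up limit per 10-second period with selectPolicy: Max.
# Percent policy allows 100% of current replicas; Pods policy allows 5 pods.
scale_up_limit() {
  local replicas=$1
  local percent_limit=$(( replicas * 100 / 100 ))
  local pods_limit=5
  if (( percent_limit > pods_limit )); then
    echo "$percent_limit"
  else
    echo "$pods_limit"
  fi
}

scale_up_limit 3    # prints 5  (the Pods policy wins)
scale_up_limit 50   # prints 50 (the Percent policy wins)
```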
echo '
apiVersion: gateway-operator.konghq.com/v1beta1
kind: DataPlane
metadata:
  name: horizontal-autoscaling
  namespace: kong
spec:
  deployment:
    scaling:
      horizontal:
        minReplicas: 2
        maxReplicas: 10
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 50
        behavior:
          scaleUp:
            stabilizationWindowSeconds: 1
            policies:
              - type: Percent
                value: 100
                periodSeconds: 10
              - type: Pods
                value: 5
                periodSeconds: 10
            selectPolicy: Max
          scaleDown:
            stabilizationWindowSeconds: 1
            policies:
              - type: Percent
                value: 100
                periodSeconds: 10
    podTemplateSpec:
      spec:
        containers:
          - name: proxy
            image: kong/kong-gateway:3.10
            resources:
              requests:
                memory: "64Mi"
                cpu: "250m"
              limits:
                memory: "1024Mi"
                cpu: "1000m"
            # Add any Konnect-related configuration here: environment variables, volumes, and so on.
' | kubectl apply -f -
See the CRD reference for all scaling options.
A DataPlane is created when the manifest above is applied. This creates two Pods running Kong Gateway, as well as a HorizontalPodAutoscaler that manages the replica count of those Pods to keep average CPU utilization around 50%.
kubectl get hpa -n kong
The output shows the HorizontalPodAutoscaler resource:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontal-autoscaling Deployment/dataplane-horizontal-autoscaling-4q72p 2%/50% 2 10 2 30s
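For reference, the resource that the operator creates on your behalf is a standard autoscaling/v2 HorizontalPodAutoscaler. A rough sketch of what it contains, based on the configuration above (the Deployment name suffix is generated, so yours will differ):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  namespace: kong
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dataplane-horizontal-autoscaling-4q72p  # generated name; yours will differ
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```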
You can test whether autoscaling works by using a load testing tool such as k6 to generate traffic.
Fetch the DataPlane address and store it in the PROXY_IP variable:
export PROXY_IP=$(kubectl get -n kong dataplanes.gateway-operator.konghq.com -o jsonpath='{.status.addresses[0].value}' horizontal-autoscaling)
Install k6, then create a configuration file containing the following script:
echo '
import http from "k6/http";
import { check } from "k6";

export const options = {
  insecureSkipTLSVerify: true,
  stages: [
    { duration: "120s", target: 5 },
  ],
};

// Simulated user behavior
export default function () {
  let res = http.get(`https://${__ENV.PROXY_IP}`);
  check(res, { "status was 404": (r) => r.status == 404 });
}
' > k6.js
Start the load test:
k6 run k6.js
Observe the scaling events in the cluster while the test is running:
kubectl get events -n kong --field-selector involvedObject.name=horizontal-autoscaling --field-selector involvedObject.kind=HorizontalPodAutoscaler --field-selector='reason=SuccessfulRescale' -w
The output shows the scaling events:
LAST SEEN TYPE REASON OBJECT MESSAGE
3m55s Normal SuccessfulRescale horizontalpodautoscaler/horizontal-autoscaling New size: 6; reason: cpu resource utilization (percentage of request) above target
3m25s Normal SuccessfulRescale horizontalpodautoscaler/horizontal-autoscaling New size: 7; reason: cpu resource utilization (percentage of request) above target
2m55s Normal SuccessfulRescale horizontalpodautoscaler/horizontal-autoscaling New size: 10; reason: cpu resource utilization (percentage of request) above target
85s Normal SuccessfulRescale horizontalpodautoscaler/horizontal-autoscaling New size: 2; reason: All metrics below target
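The replica counts in these events follow the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetValue). A quick sketch with illustrative numbers (the controller applies the scaleUp/scaleDown policies on top of this):

```shell
# desired = ceil(current * utilization / target), via integer ceiling division
desired_replicas() {
  local current=$1 utilization=$2 target=$3
  echo $(( (current * utilization + target - 1) / target ))
}

desired_replicas 2 160 50   # prints 7: ceil(2 * 160 / 50) = ceil(6.4)
desired_replicas 2 50 50    # prints 2: metric at target, no change
```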
The DataPlane's status field is also updated with the number of ready/target replicas:
kubectl get -n kong dataplanes.gateway-operator.konghq.com horizontal-autoscaling -o jsonpath-as-json='{.status}'
[
{
...
"readyReplicas": 2,
"replicas": 2,
...
}
]