Lifecycle management has been significantly improved with the `SidecarContainers` feature introduced in Kubernetes v1.29. To enable the use of this feature with Kong Mesh, enable the `experimental.sidecarContainers` option (or the `KUMA_EXPERIMENTAL_SIDECAR_CONTAINERS` environment variable).
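If you install the control plane via Helm, the option maps to a chart value; the key path below is an assumption derived from the option name, so check your chart's values reference before relying on it:

```yaml
# values.yaml (sketch; the exact key path in your chart may differ)
experimental:
  sidecarContainers: true
```

Alternatively, set the `KUMA_EXPERIMENTAL_SIDECAR_CONTAINERS` environment variable to `true` on the control plane.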
This feature supports adding the injected Kuma container to `initContainers` with `restartPolicy: Always`, which marks it as a sidecar container. Refer to the Kubernetes docs to learn more about how they work. In effect, the following lifecycle subsections are irrelevant when using this feature. When enabled, the ordering of the sidecar startup and shutdown is enforced by Kubernetes.
To use the mesh in an init container, ensure that it comes after `kuma-sidecar`. Kong Mesh injects its sidecar at the front of the list but can't guarantee that its webhook runs last.
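Roughly, the injected Pod spec then looks like the sketch below. All names other than `kuma-sidecar` are hypothetical, and the exact injected fields may differ:

```yaml
# Sketch of a Pod spec after injection with SidecarContainers enabled
spec:
  initContainers:
    - name: kuma-sidecar    # injected first; restartPolicy: Always marks it as a sidecar
      restartPolicy: Always
    - name: wait-for-db     # hypothetical init container that uses the mesh;
                            # it must come after kuma-sidecar
  containers:
    - name: app-container   # hypothetical application container
```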
Draining incoming connections gracefully is handled via a `preStop` hook on the `kuma-sidecar` container. However, remember that once `terminationGracePeriodSeconds` has elapsed, ordering, and thus correct behavior of the sidecar, is no longer guaranteed. The grace period should be set long enough that your workload finishes shutdown before it elapses.
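For example, the grace period can be raised on the Pod template; the value 90 below is an arbitrary illustration, and you should size it to your workload's shutdown time plus the sidecar's drain time:

```yaml
# Sketch: extend the termination grace period on the workload
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 90  # must exceed app shutdown + sidecar drain time
```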
As always, your application should itself initiate graceful shutdown when it receives SIGTERM. In particular, remember that Kubernetes works largely asynchronously, so if your application exits "too quickly", requests may still be routed to the `Pod` and fail. There's an article at learnk8s.io with an overview of the problem and potential solutions. It ultimately comes down to "just wait a little bit" when your application receives SIGTERM. This can either be done in the application itself or by adding a `preStop` command to your container:
```yaml
kind: Deployment
# ...
spec:
  template:
    spec:
      containers:
        - name: app-container
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sleep", "15"]
```
On Kubernetes, the `Dataplane` resource is automatically created by kuma-cp. A new `Dataplane` resource is created for each `Pod` with the sidecar-injection label.
To join the mesh in a graceful way, we need to first make sure the application is ready to serve traffic before it can be considered a valid traffic destination.
Due to the way that Kong Mesh implements transparent proxying and sidecars in Kubernetes, network calls from init containers while running a mesh can be a challenge. A common pitfall is assuming that init containers can be ordered so that the mesh init container runs after the others. However, when init containers are injected into a Pod via webhooks, such as the Vault init container, there is no guarantee of their order. Ordering init containers also doesn't help when the Kong Mesh CNI is used, because traffic redirection to the sidecar occurs before any init container runs. To solve this issue, run the init container with a specific user ID and exclude specific ports from interception. Remember to also exclude the DNS interception port. Here is an example of annotations that enable HTTPS traffic for a container running as user ID `1234`:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  template:
    metadata:
      annotations:
        traffic.kuma.io/exclude-outbound-tcp-ports-for-uids: "443:1234"
        traffic.kuma.io/exclude-outbound-udp-ports-for-uids: "53:1234"
    spec:
      initContainers:
        - name: my-init-container
          # ...
          securityContext:
            runAsUser: 1234
```
In this scenario, using the mesh from the init container is simply impossible, because `kuma-dp` is responsible for encrypting the traffic and only runs after all init containers have exited.
By default, containers start in arbitrary order, so an app container can start even though the sidecar container might not be ready to receive traffic.
Making initial requests, such as connecting to a database, can fail for a brief period after the pod starts.
To mitigate this problem, try setting:

- `runtime.kubernetes.injector.sidecarContainer.waitForDataplaneReady` to `true`, or
- the `kuma.io/wait-for-dataplane-ready` annotation to `true`

so that the app container waits for the dataplane container to be ready to serve traffic.
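For example, the annotation variant can be applied on the Pod template; the Deployment name below is hypothetical:

```yaml
# Sketch: wait for the dataplane before starting app containers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    metadata:
      annotations:
        kuma.io/wait-for-dataplane-ready: "true"
```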
The `waitForDataplaneReady` setting relies on the fact that defining a `postStart` hook causes Kubernetes to run containers sequentially, based on their order in the `containers` list. This behavior isn't documented and could change in the future. It also depends on the `kuma-sidecar` container being injected as the first container in the pod, which isn't guaranteed, since other mutating webhooks can rearrange the containers.
To leave the mesh gracefully, we need to remove the traffic destination from all the clients before shutting it down.
When the Kong Mesh sidecar receives a SIGTERM signal it:
- Starts draining Envoy listeners.
- Waits the entire drain time.
- Terminates.
While draining, Envoy can still accept connections, however:
- It is marked unhealthy on the Envoy Admin `/ready` endpoint.
- It sends `connection: close` for HTTP/1.1 requests and the `GOAWAY` frame for HTTP/2.
This forces clients to close their connection and reconnect to the new instance.
You can read the Kubernetes docs to learn how Kubernetes handles the `Pod` lifecycle. Here is a summary of the parts relevant to Kong Mesh.

Whenever a user or system deletes a `Pod`, Kubernetes does the following:
- It marks the `Pod` as terminated.
- For every container, concurrently, it:
  - Executes any `preStop` hook, if defined.
  - Sends a SIGTERM signal.
  - Waits until the container terminates, up to the graceful termination period (by default 30s).
  - Sends a SIGKILL to the container.
- It removes the `Pod` object from the system.
When the `Pod` is marked as terminated, the Kong Mesh control plane marks the `Dataplane` object unhealthy, which triggers a configuration update to all the clients to remove it as a destination.
This can take a couple of seconds depending on the size of the mesh, resources available to the CP, XDS configuration interval, etc.
If the application served by the Kong Mesh sidecar quits immediately after the SIGTERM signal, there is a high chance that clients will still try to send traffic to this destination.
To mitigate this, we need to either
- Support graceful shutdown in the application. For example, the application should wait X seconds to exit after receiving the first SIGTERM signal.
- Add a pre-stop hook to postpone stopping the application container. Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  template:
    spec:
      containers:
        - name: redis
          image: "redis"
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sleep", "15"]
```
When a `Pod` is deleted, its matching `Dataplane` resource is deleted as well. This is possible thanks to the owner reference set on the `Dataplane` resource.