Multi-zone deployment

Kong Mesh supports running your service mesh in multiple zones, including a mix of Kubernetes and Universal zones. Your mesh environment can include multiple isolated service meshes and workloads running in different regions, on different clouds, or in different data centers. A zone can be a Kubernetes cluster, a VPC, or any other deployment you need to include in the same distributed mesh environment.

If you’re looking for a simpler deployment mode, see Single-zone deployments. In a multi-zone deployment, all data plane proxies running within a zone must be able to connect to the other data plane proxies in the same zone.

 
flowchart TB
    GCP[Global control plane]

    subgraph ZA[Zone A]
        ZCPA[Zone control plane]
        DPPA[Data plane proxies]
        ZEA[Zone egress]
        ZIA[Zone ingress]
    end

    subgraph ZB[Zone B]
        ZCPB[Zone control plane]
        DPPB[Data plane proxies]
        ZEB[Zone egress]
        ZIB[Zone ingress]
    end

    GCP <-->|KDS| ZCPA
    GCP <-->|KDS| ZCPB
    ZCPA <-->|xDS| DPPA & ZEA & ZIA
    ZCPB <-->|xDS| DPPB & ZEB & ZIB
    DPPA -->|outbound| ZEA -->|cross-zone| ZIB -->|inbound| DPPB
    DPPB -->|outbound| ZEB -->|cross-zone| ZIA -->|inbound| DPPA
  

How it works

Kong Mesh abstracts away zones, so your data plane proxies find services wherever they run. You can make a service multi-zone by giving its data plane proxies the same kuma.io/service tag in different zones. If one zone fails, requests to that service automatically fail over to the instances in the remaining zones.
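As a rough illustration of the idea above, the following sketch groups data plane proxies by their kuma.io/service tag and checks which zones can still serve a service after one zone fails. The inventory, zone names, service names, and function names are hypothetical and are not part of any Kong Mesh API.

```python
from collections import defaultdict

# Hypothetical inventory: (zone, kuma.io/service tag) for each data plane proxy.
dataplanes = [
    ("zone-a", "backend"),
    ("zone-b", "backend"),
    ("zone-b", "backend"),
    ("zone-a", "frontend"),
]


def zones_by_service(proxies):
    """Group proxies by kuma.io/service tag and collect the zones serving each tag."""
    result = defaultdict(set)
    for zone, service in proxies:
        result[service].add(zone)
    return dict(result)


def surviving_zones(proxies, service, failed_zone):
    """Return the zones that can still serve `service` once `failed_zone` is gone."""
    return zones_by_service(proxies).get(service, set()) - {failed_zone}


# "backend" runs in both zones, so it survives the loss of zone-a.
print(surviving_zones(dataplanes, "backend", "zone-a"))
# "frontend" runs only in zone-a, so no zone is left to fail over to.
print(surviving_zones(dataplanes, "frontend", "zone-a"))
```

In this model, a service is multi-zone only because the same tag appears in more than one zone. No extra registration step is involved.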

Let’s look at how a service backend in zone-b is advertised to zone-a and a request from the local zone zone-a is routed to the remote service in zone-b.

Destination service zone

When the new service backend joins the mesh in zone-b, the mesh handles it as follows:

  1. The zone-b zone control plane adds this service to the availableServices on the zone-b ZoneIngress resource.
  2. The kuma-dp proxy running as a zone ingress is configured with this list of services so that it can route incoming requests.
  3. This ZoneIngress resource is synchronized to the global control plane.
  4. The global control plane propagates the zone ingress resources and all policies to all other zones over Kong Mesh Discovery Service (KDS), which is a protocol based on xDS.
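The advertisement steps above can be sketched as a small in-memory model. This is not how Kong Mesh is implemented: the class names, the `sync` method, and the use of an instance count as the value stored in `available_services` are assumptions made for illustration. Step 2 (configuring the zone ingress proxy) is not modeled.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ZoneIngress:
    zone: str
    # Service name -> number of instances behind this ingress (assumed shape).
    available_services: Dict[str, int] = field(default_factory=dict)


@dataclass
class ZoneControlPlane:
    zone: str
    ingress: Optional[ZoneIngress] = None
    remote_ingresses: Dict[str, ZoneIngress] = field(default_factory=dict)

    def __post_init__(self):
        if self.ingress is None:
            self.ingress = ZoneIngress(self.zone)

    def register_instance(self, service: str) -> None:
        # Step 1: the zone control plane adds the service to its own ZoneIngress.
        counts = self.ingress.available_services
        counts[service] = counts.get(service, 0) + 1


class GlobalControlPlane:
    def __init__(self, zones):
        self.zones = {z.zone: z for z in zones}
        self.ingresses: Dict[str, ZoneIngress] = {}

    def sync(self) -> None:
        # Step 3: each zone's ZoneIngress is synchronized to the global control plane.
        for name, zone_cp in self.zones.items():
            self.ingresses[name] = copy.deepcopy(zone_cp.ingress)
        # Step 4: the global control plane propagates them to all other zones.
        for name, zone_cp in self.zones.items():
            zone_cp.remote_ingresses = {
                other: ingress
                for other, ingress in self.ingresses.items()
                if other != name
            }


zone_a = ZoneControlPlane("zone-a")
zone_b = ZoneControlPlane("zone-b")
global_cp = GlobalControlPlane([zone_a, zone_b])

zone_b.register_instance("backend")
global_cp.sync()
print(zone_a.remote_ingresses["zone-b"].available_services)  # {'backend': 1}
```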

Source service zone

When the zone-b ZoneIngress resource is synchronized from the global control plane to the zone-a zone control plane, the following happens:

  • Requests from zone-a to the services listed in availableServices are load balanced between local and remote instances of each service.
  • Requests sent to zone-b are routed to the zone ingress proxy of zone-b.

For load balancing, zone ingress endpoints are weighted by the number of instances running behind them, so a zone with two instances receives twice as much traffic as a zone with one instance. You can also favor local service instances with locality-aware load balancing.
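To make the weighting concrete, here is a minimal sketch that turns per-zone instance counts into expected traffic shares. It assumes plain instance-count weighting only and ignores locality-aware load balancing. The function name and zone names are hypothetical.

```python
def traffic_shares(instances_per_zone):
    """Return each zone's expected share of requests, proportional to its instance count."""
    total = sum(instances_per_zone.values())
    if total == 0:
        return {}
    return {zone: count / total for zone, count in instances_per_zone.items()}


# One instance in zone-a, two in zone-b: zone-b gets twice the traffic of zone-a.
shares = traffic_shares({"zone-a": 1, "zone-b": 2})
for zone, share in shares.items():
    print(f"{zone}: {share:.0%}")
```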

When a zone egress is present, traffic routes through the local zone egress before reaching the remote zone ingress.

When using transparent proxying (default in Kubernetes), Kong Mesh generates a VIP and a DNS entry with the format <kuma.io/service>.mesh, and listens on the service VIP port (default 80).
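As a small illustration of the naming format above, the sketch below builds the .mesh hostname and a URL for a service. The helper names and the `backend` service tag are hypothetical. The use of plain HTTP is also an assumption, since the text only specifies the hostname format and the default port.

```python
def mesh_hostname(service_tag: str) -> str:
    """Build the DNS name for a service, following the <kuma.io/service>.mesh format."""
    return f"{service_tag}.mesh"


def mesh_url(service_tag: str, port: int = 80, path: str = "/") -> str:
    """Build an HTTP URL for a service; the port is omitted when it is the default 80."""
    host = mesh_hostname(service_tag)
    authority = host if port == 80 else f"{host}:{port}"
    return f"http://{authority}{path}"


print(mesh_hostname("backend"))          # backend.mesh
print(mesh_url("backend"))               # http://backend.mesh/
print(mesh_url("backend", 8080, "/api"))  # http://backend.mesh:8080/api
```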

A zone ingress is not an API gateway. It only handles cross-zone communication within a mesh. API gateways are supported in Kong Mesh gateway mode and can be deployed in addition to zone ingresses.

Components of a multi-zone deployment

A multi-zone deployment includes the following components, each listed with its responsibilities:

Global control plane
  • Accepts connections only from zone control planes.
  • Accepts creation and changes to policies that will be applied to the data plane proxies.
  • Sends policies down to zone control planes.
  • Sends zone ingresses down to zone control planes.
  • Keeps an inventory of all data plane proxies running in all zones. This inventory is used only for observability and is not required for operations.
  • Rejects connections from data plane proxies.
Zone control planes
  • Accept connections from data plane proxies started within the zone.
  • Receive policy updates from the global control plane.
  • Send data plane proxies and zone ingress changes to the global control plane.
  • Compute and send configurations using xDS to the local data plane proxies.
  • Update the list of services available in the zone in the zone ingress.
  • Reject policy changes that do not come from the global control plane.
Data plane proxies
  • Connect to the local zone control plane.
  • Receive configurations using xDS from the local zone control plane.
  • Connect to other local data plane proxies.
  • Connect to zone ingresses to send cross-zone traffic.
  • Receive traffic from local data plane proxies and local zone ingresses.
Zone ingress
  • Receives xDS configuration from the local zone control plane.
  • Proxies traffic from other zone data plane proxies to local data plane proxies.
Zone egress (optional)
  • Receives xDS configuration from the local zone control plane.
  • Proxies traffic from local data plane proxies to zone ingress proxies from other zones.
  • Proxies traffic from local data plane proxies to external services from the local zone.
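The control-plane connection rules in this list can be summarized in a short sketch. It only models who may connect to which control plane, not application traffic between proxies. The component identifiers and the function name are hypothetical labels chosen for this example.

```python
# Components that receive xDS configuration from their local zone control plane.
XDS_CLIENTS = {"data-plane-proxy", "zone-ingress", "zone-egress"}


def control_connection_allowed(client: str, server: str, same_zone: bool = True) -> bool:
    """Return True if `client` may open a control connection to `server`."""
    if server == "global-control-plane":
        # The global control plane accepts connections only from zone control planes.
        return client == "zone-control-plane"
    if server == "zone-control-plane":
        # A zone control plane serves only the proxies started within its own zone.
        return same_zone and client in XDS_CLIENTS
    return False


print(control_connection_allowed("zone-control-plane", "global-control-plane", same_zone=False))  # True
print(control_connection_allowed("data-plane-proxy", "global-control-plane"))                     # False
print(control_connection_allowed("data-plane-proxy", "zone-control-plane"))                       # True
```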

Failure modes

The following list describes how Kong Mesh behaves when components of a multi-zone deployment become unavailable or lose connectivity. Each failure mode is followed by its impact and by what still works.

Global control plane offline

Impact:
  • Policy updates are impossible.
  • Changes in the service list between zones won’t propagate: new services won’t be discoverable in other zones, and services removed from a zone will still appear available in other zones.
  • Zones can’t be deleted or disabled.

What still works:
  • Local and cross-zone application traffic.
  • Data plane proxy changes continue to propagate within their zones.

Zone control plane offline

Impact:
  • New data plane proxies can’t join the mesh, including new instances (Pod/VM) created by automatic deployment mechanisms such as rolling updates. A control plane connection failure can therefore block application updates.
  • On mTLS-enabled meshes, a data plane proxy may fail to refresh its client certificate before it expires (the lifetime defaults to 24 hours), causing traffic failures.
  • Data plane proxy configuration won’t be updated.

What still works:
  • Communication between data plane proxies.
  • Cross-zone communication.
  • Other zones are unaffected.

Communication between global and zone control plane failing

Can occur when there is a misconfiguration or a network connectivity issue between control planes.

Impact:
  • Policy changes won’t propagate to the zone control plane.
  • ZoneIngress, ZoneEgress, and Dataplane changes won’t propagate to the global control plane, leaving the global inventory of data plane proxies outdated.
  • Other zones won’t see new or removed services from this zone, or changes in instance counts.
  • Local data plane proxies won’t see new or removed services from other zones, or changes in instance counts.

What still works:
  • Data plane proxies can join, leave, and receive configuration updates.
  • Local and cross-zone application traffic.

Communication between two zones failing

Can occur in any of the following situations:
  • There are network connectivity issues between a control plane and a zone ingress or egress from another zone.
  • There are network connectivity issues between a zone egress and a zone ingress from another zone.
  • All zone ingress or egress instances in a zone are down.

Impact:
  • All cross-zone communication fails.
  • With the right resiliency setup (MeshRetry, MeshHealthCheck, MeshLoadBalancingStrategy, and MeshCircuitBreaker policies), the failing zone can be quickly severed and traffic re-routed to another zone.

What still works:
  • Communication and operations within each zone.
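For the zone control plane outage described above, the mTLS certificate lifetime bounds how long existing proxies keep working. The following sketch estimates whether a certificate would expire during an outage. It assumes the 24-hour default lifetime, that `issued_at` is the last certificate issued before the outage began, and that no refresh is possible until the outage ends. The function and variable names are hypothetical.

```python
from datetime import datetime, timedelta

DEFAULT_CERT_LIFETIME = timedelta(hours=24)  # default lifetime stated above


def cert_expires_during_outage(issued_at, outage_end, lifetime=DEFAULT_CERT_LIFETIME):
    """Return True if the certificate expires before the zone control plane is back."""
    expires_at = issued_at + lifetime
    return expires_at < outage_end


issued = datetime(2024, 1, 1, 0, 0)

# The outage ends 18 hours after issuance: the certificate is still valid.
print(cert_expires_during_outage(issued, datetime(2024, 1, 1, 18, 0)))  # False

# The outage ends 30 hours after issuance: the certificate expired in the meantime.
print(cert_expires_during_outage(issued, datetime(2024, 1, 2, 6, 0)))   # True
```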
