Rate limiting strategies

Uses: Kong Gateway

All rate limiting plugins support some subset of the following strategies:

| Strategy | Pros | Cons | Supported in plugins |
|----------|------|------|----------------------|
| local | Minimal performance impact. | Less accurate. Unless there's a consistent-hashing load balancer in front of Kong, it diverges when scaling the number of nodes. | AI Rate Limiting Advanced, Rate Limiting Advanced, Rate Limiting, Response Rate Limiting |
| cluster | Accurate[1], no extra components to support. | Each request forces a read and a write on the data store, so it has the biggest relative performance impact. | AI Rate Limiting Advanced, Rate Limiting Advanced, Rate Limiting, Response Rate Limiting, GraphQL Rate Limiting Advanced |
| redis | Accurate[1], less performance impact than the cluster policy. | Needs a Redis installation. Bigger performance impact than the local policy. | AI Rate Limiting Advanced, Rate Limiting Advanced, Rate Limiting, Response Rate Limiting, GraphQL Rate Limiting Advanced |

[1]: Only when the sync_rate option is set to 0 (synchronous behavior). See the configuration reference for each plugin for more details.
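For orientation, the strategy is selected in each plugin's configuration; the exact field name varies by plugin (for example, the open-source Rate Limiting plugin calls it policy, while Rate Limiting Advanced calls it strategy). The following declarative configuration sketch, with placeholder service and limit values, shows the Rate Limiting plugin using the cluster policy:

```yaml
_format_version: "3.0"
services:
  - name: example-service             # placeholder service
    url: http://upstream.example.com  # placeholder upstream
    plugins:
      - name: rate-limiting
        config:
          minute: 100                 # allow 100 requests per minute
          policy: cluster             # store counters in the Kong Gateway data store
```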

Two common use cases for rate limiting are:

  1. Every transaction counts: The highest level of accuracy is needed. An example is a transaction with financial consequences.
  2. Backend protection: Accuracy is not as relevant. The requirement is only to protect backend services from overloading that’s caused either by specific users or by attacks.

Every transaction counts

In this scenario, because accuracy is important, the local policy is not an option. Consider the support effort you might need for Redis, and then choose either cluster or redis.

You could start with the cluster policy and move to redis if performance degrades significantly.

If you need a very high sync frequency, use redis; very high sync frequencies with the cluster strategy don't scale and aren't recommended. The lower the sync_rate setting, the higher the sync frequency: for example, a sync_rate of 0.1 (10 counter syncs per second) is a much higher sync frequency than a sync_rate of 1 (1 counter sync per second).

You can calculate what is considered a very high sync rate in your environment based on your topology, number of plugins, their sync rates, and tolerance for loose rate limits.
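To make the redis option concrete, here is a hedged sketch of a Rate Limiting Advanced plugin entry using the redis strategy with a moderate sync frequency. The Redis host, limit, and window size are placeholder values, and the exact schema can vary between Kong Gateway versions, so check the plugin's configuration reference:

```yaml
plugins:
  - name: rate-limiting-advanced
    config:
      limit:
        - 100                      # 100 requests ...
      window_size:
        - 60                       # ... per 60-second window
      strategy: redis              # counters are shared through Redis
      sync_rate: 0.5               # sync every 0.5 seconds, i.e. 2 counter syncs per second
      redis:
        host: redis.example.com    # placeholder Redis host
        port: 6379
```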

If you choose to switch strategies, note that you can’t port the existing usage metrics from the Kong Gateway data store to Redis. This might not be a problem with short-lived metrics (for example, seconds or minutes) but if you use metrics with a longer time frame (for example, months), plan your switch carefully.

Backend protection

If accuracy is less important, choose the local policy. You might need to experiment a little before you get a setting that works for your scenario. As the cluster scales to more nodes, more user requests are handled. When the cluster scales down, the probability of false negatives increases. Make sure to adjust your rate limits when scaling.

For example, if a user can make 100 requests every second, and you have an equally balanced 5-node Kong Gateway cluster, you can set the local limit to 30 requests every second. If you see too many false negatives, increase the limit.
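As a sketch of that per-node setting, assuming the open-source Rate Limiting plugin, the configuration could look like the following; tune the numbers to your own traffic and topology:

```yaml
plugins:
  - name: rate-limiting
    config:
      second: 30           # per-node limit from the example above
      policy: local        # counters are kept in each node's memory
      limit_by: consumer   # count per consumer; the plugin falls back to IP if no consumer is identified
```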

To minimize inaccuracies, consider using a consistent-hashing load balancer in front of Kong Gateway. The load balancer ensures that a user is always directed to the same Kong Gateway node, which reduces inaccuracies and prevents scaling problems.

Fallback from Redis

When the redis strategy is used and a Kong Gateway node is disconnected from Redis, the plugin falls back to the local strategy. This can happen when the Redis server is down or the connection to Redis is broken. Kong Gateway keeps local counters for rate limiting and syncs them with Redis once the connection is re-established. Kong Gateway still rate limits, but because the nodes can't sync their counters, users can perform more requests than the configured limit; there is still a limit per node.

Policy strategies

Two common use cases are:

| You need… | Use the following plugin policy strategies… |
|-----------|---------------------------------------------|
| A high level of accuracy in critical transactions. An example is a transaction with financial consequences. | cluster or redis |
| Protect backend services from overloading caused by specific users or attacks. High accuracy is not as relevant. | local |

If the plugin can’t retrieve the selected policy, it falls back to limiting usage by identifying the IP address.

