Rate limit

Global rate limiting architecture overview.

The HTTP rate limit filter will call the rate limit service when the request’s route or virtual host has one or more rate limit configurations that match the filter stage setting. The route can optionally include the virtual host rate limit configurations. More than one configuration can apply to a request. Each configuration results in a descriptor being sent to the rate limit service.

If the rate limit service is called, and the response for any of the descriptors is over limit, a 429 response is returned.
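The decision rule above can be sketched in a few lines. This is only an illustration, not the filter's actual code: the `Code` enum and the `http_status_for` helper are hypothetical names, and the sketch assumes the service returns one result code per descriptor.

```python
from enum import Enum


class Code(Enum):
    """Hypothetical per-descriptor result codes returned by the service."""
    OK = "ok"
    OVER_LIMIT = "over_limit"


def http_status_for(descriptor_codes):
    """Return 429 if any descriptor is over limit, else None.

    None means the filter does not short-circuit and the request
    continues through the filter chain.
    """
    if any(code is Code.OVER_LIMIT for code in descriptor_codes):
        return 429
    return None
```

For example, `http_status_for([Code.OK, Code.OVER_LIMIT])` yields 429, because a single over limit descriptor is enough to reject the request.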

{
  "type": "decoder",
  "name": "rate_limit",
  "config": {
    "domain": "...",
    "stage": "...",
    "request_type": "...",
    "timeout_ms": "..."
  }
}

domain
(required, string) The rate limit domain to use when calling the rate limit service.

stage
(optional, integer) Specifies the rate limit configurations to be applied with the same stage number. If not set, the default stage number is 0.

NOTE: The filter supports stage numbers from 0 to 10, inclusive.

request_type
(optional, string) The type of requests the filter should apply to. The supported types are internal, external, or both. A request is considered internal if x-envoy-internal is set to true. If x-envoy-internal is not set or is false, the request is considered external. The filter defaults to both, in which case it applies to all request types.

timeout_ms
(optional, integer) The timeout in milliseconds for the rate limit service RPC. If not set, this defaults to 20ms.
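The request_type rule can be sketched as follows. This is a minimal illustration under stated assumptions: `filter_applies` is a hypothetical helper, headers are modeled as a plain dict with lowercase keys, and the header value is compared against the literal string "true".

```python
def filter_applies(request_type, headers):
    """Decide whether the filter applies to a request.

    request_type: "internal", "external", or "both" (the default).
    headers: dict of lowercase header names to values (an assumption
             of this sketch).
    """
    # A request is internal only when x-envoy-internal is set to true;
    # a missing or false header makes it external.
    is_internal = headers.get("x-envoy-internal") == "true"

    if request_type == "both":
        return True
    if request_type == "internal":
        return is_internal
    if request_type == "external":
        return not is_internal
    raise ValueError(f"unsupported request_type: {request_type!r}")
```

With `request_type` set to "internal", a request that carries no x-envoy-internal header is treated as external, so the filter skips it.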


The rate limit filter outputs statistics in the cluster.<route target cluster>.ratelimit. namespace. 429 responses are emitted to the normal cluster dynamic HTTP statistics.

Name        Type     Description
ok          Counter  Total under limit responses from the rate limit service
error       Counter  Total errors contacting the rate limit service
over_limit  Counter  Total over limit responses from the rate limit service


The HTTP rate limit filter supports the following runtime settings:

ratelimit.http_filter_enabled
% of requests that will call the rate limit service. Defaults to 100.

ratelimit.http_filter_enforcing
% of requests that will call the rate limit service and enforce the decision. Defaults to 100. This can be used to observe what would happen before fully enforcing the outcome.

ratelimit.<route_key>.http_filter_enabled
% of requests that will call the rate limit service for a given route_key specified in the rate limit configuration. Defaults to 100.
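One way to picture how the first two percentages interact is the sketch below. It is an illustration under assumptions, not the filter's implementation: `feature_enabled` and `handle` are hypothetical names, each percentage is modeled as an independent random draw, and the per-route_key setting is left out.

```python
import random


def feature_enabled(percent, rng=random):
    """Return True for roughly `percent` out of every 100 calls."""
    return rng.randrange(100) < percent


def handle(over_limit, enabled_pct=100, enforcing_pct=100, rng=random):
    """Model one request passing through the filter.

    over_limit: whether the service would report the request over limit.
    Returns "skipped" (service not called), "rejected" (429 returned),
    or "allowed".
    """
    if not feature_enabled(enabled_pct, rng):
        return "skipped"
    # The service is called; the over limit decision is only enforced
    # for the enforcing percentage of requests.
    if over_limit and feature_enabled(enforcing_pct, rng):
        return "rejected"
    return "allowed"
```

Setting the enforcing percentage to 0 while leaving the enabled percentage at 100 corresponds to the dry-run case described above: the service is still called and its statistics are still recorded, but over limit requests are allowed through.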