Rate Limiting in ASP.NET Core
Introduction
Rate limiting is sometimes known as throttling, although they're not quite the same thing: rate limiting is more about fairness, making sure everyone gets a fair share (number) of requests, whereas throttling is about limiting, or cutting down, the number of requests. Still, the two terms are commonly used interchangeably.
Why do we need this? There are many reasons:
- To prevent attacks (DoS, DDoS)
- To save resource usage
- To limit accesses using some strategy
Rate limiting consists of two major components:
- A limiting strategy, which specifies how requests are going to be limited
- A partitioning strategy, which specifies the request groups to which we will be applying the limiting strategy
As with API versioning, caching, API routing, etc, this is something that is usually implemented at the cloud level, in the API gateway (Azure, AWS, and Google all supply such services), but there may be good reasons to implement it ourselves, in our app or in a reverse proxy. Fortunately, ASP.NET Core includes a good library for this! It applies to all incoming requests, whether MVC, Razor Pages, or APIs.
Let's start with the available limiting strategies.
Limiting Strategies
The limiting strategy/algorithm of choice determines how the limiting will be applied. Generally, it consists of a number of requests per time slot, both of which can be configured. Worth noting that the counts are stored in memory, but there are plans to support distributed storage using Redis, which will be useful for clusters. The ASP.NET Core rate limiting middleware includes four strategies:
Fixed Window
The fixed window limiter uses a fixed time window to limit requests. When the time window expires, a new time window starts and the request limit is reset.
Read more here: https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?#fixed
Sliding Window
The sliding window algorithm is similar to the fixed window limiter, but it divides each window into segments. The segment interval is (window time)/(segments per window), and the window slides forward one segment per segment interval. The requests for a window are limited to X requests, counted across its N segments. When the window slides, the segment that falls out of the window (N segments prior to the current one) expires, and the requests it had taken are added back to the current segment's available quota.
Official documentation: https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?#slide
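For illustration, a sliding window policy can be registered with AddSlidingWindowLimiter; the extra option, compared to the fixed window, is SegmentsPerWindow (the policy name and values here are just examples):

```csharp
builder.Services.AddRateLimiter(options =>
{
    //100 requests per 1-minute window, sliding in 6 segments of 10 seconds each
    options.AddSlidingWindowLimiter(policyName: "Sliding100PerMinute", opt =>
    {
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
        opt.SegmentsPerWindow = 6;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 5;
    });
});
```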
Token Bucket
This is similar to the sliding window limiter, but rather than adding back the requests taken from the expired segment, a fixed number of tokens are added each replenishment period. The tokens added each segment can't increase the available tokens to a number higher than the token bucket limit.
Info: https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?#token
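As a sketch (again, policy name and values are just examples), a token bucket policy is registered with AddTokenBucketLimiter; note the options for the bucket size and the replenishment rate:

```csharp
builder.Services.AddRateLimiter(options =>
{
    //the bucket holds at most 100 tokens; 20 tokens are added back every 10 seconds
    options.AddTokenBucketLimiter(policyName: "TokenBucket", opt =>
    {
        opt.TokenLimit = 100;
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.TokensPerPeriod = 20;
        opt.AutoReplenishment = true;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 5;
    });
});
```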
Concurrency
The concurrency limiter limits the number of concurrent requests, which is a bit different from the previous ones. Each request reduces the available permits by one; when a request completes, the count is increased by one again. Unlike the other limiters, which cap the total number of requests for a specified period, the concurrency limiter only caps the number of requests executing at the same time and doesn't care about any time period.
Here is the doc: https://learn.microsoft.com/en-us/aspnet/core/performance/rate-limit?#concur
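And a sketch of a concurrency policy, registered with AddConcurrencyLimiter (name and values are illustrative); notice there is no Window option, since no time period is involved:

```csharp
builder.Services.AddRateLimiter(options =>
{
    //at most 10 requests executing at the same time, with up to 5 queued
    options.AddConcurrencyLimiter(policyName: "Concurrent10", opt =>
    {
        opt.PermitLimit = 10;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 5;
    });
});
```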
Chained Limiters
You can combine multiple algorithms: the framework will treat them as one limiter and apply each of the contained algorithms in sequence. More on this later on.
Rate Partitioner
A rate partitioner maps each request to a partition key and associates a rate limiter (such as the concurrency limiter) with each partition; we'll look at partitioning in detail below.
Applying Rate Limitations
It all starts with the AddRateLimiter() method, to register the services in the Dependency Injection (DI) framework:
builder.Services.AddRateLimiter(options =>
{
    //options go here
});
It must be followed by UseRateLimiter(), to actually add the middleware to the pipeline, and this must come before the endpoint registrations, such as those for controllers:
app.UseRateLimiter();
Let's now see how we can configure the limitations.
Policy-Based
For the fixed window, as an example, it should be something like this:
builder.Services.AddRateLimiter(_ => _.AddFixedWindowLimiter(
    policyName: "Limit10PerMinute", options =>
    {
        options.PermitLimit = 10;
        options.Window = TimeSpan.FromMinutes(1);
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        options.QueueLimit = 2;
    })
);
I won't go into all the details of each algorithm (please have a look at the documentation), but this example uses the fixed window algorithm (method AddFixedWindowLimiter; you can use any other algorithm through AddSlidingWindowLimiter, AddTokenBucketLimiter, or AddConcurrencyLimiter). It limits requests to 10 (PermitLimit) per minute (Window), where the oldest request in the queue is allowed first (QueueProcessingOrder), and there's a queue limit (QueueLimit) of just 2; the rest are rejected. The policy we are creating is called "Limit10PerMinute" (policyName); the name can be anything meaningful, and we'll see how to use it in a moment. We can, of course, have multiple policies:
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("API", opt =>
    {
        opt.PermitLimit = 20;
        opt.Window = TimeSpan.FromMinutes(1);
    });

    options.AddFixedWindowLimiter("Web", opt =>
    {
        opt.PermitLimit = 10;
        opt.Window = TimeSpan.FromMinutes(1);
    });

    options.AddPolicy("Limit3PerIP", ctx =>
    {
        var clientIpAddress = ctx.GetRemoteIpAddress()!;
        return RateLimitPartition.GetConcurrencyLimiter(clientIpAddress, _ =>
            new ConcurrencyLimiterOptions
            {
                PermitLimit = 3
            });
    });

    options.AddPolicy("NoLimit", ctx =>
    {
        return RateLimitPartition.GetNoLimiter("");
    });
});
If we want to configure how the request is rejected, we can use the OnRejected property, as in this example:
builder.Services.AddRateLimiter(_ =>
{
    _.AddFixedWindowLimiter(
        policyName: "Limit10PerMinute", options =>
        {
            //...
        });

    _.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    _.OnRejected = async (ctx, cancellationToken) =>
    {
        await ctx.HttpContext.Response.WriteAsync("Request slots exceeded, try again later", cancellationToken);
    };
});
We are both setting the status code of the response (RejectionStatusCode) to HTTP 429 Too Many Requests (the middleware's default is actually HTTP 503 Service Unavailable, so you'll generally want to set this) and supplying a custom response message.
It could also be a redirect to some page:
ctx.HttpContext.Response.Redirect("https://some.site/unavailable.html", permanent: false);
Global
Having a named policy means that rate limiting is only applied where we explicitly use the policy; the alternative is to have a global rate limiter, for which we don't specify a policy name, and which will apply everywhere, including MVC and minimal APIs:
builder.Services.AddRateLimiter(options => {
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
{
var partitionKey = ""; //<--- more on this in a moment
return RateLimitPartition.GetFixedWindowLimiter(partitionKey: partitionKey, _ => new FixedWindowRateLimiterOptions
{
PermitLimit = 10,
Window = TimeSpan.FromMinutes(1)
});
});
});
Any of the options, like setting the status code or the response message/redirect, can be supplied as well for global limiters.
Chained
A chained partition limiter acts as a single limiter, and each individual limiter will be applied sequentially, in the order of registration. A chained limiter is created using PartitionedRateLimiter.CreateChained; here, for a global limiter only:
builder.Services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
RateLimitPartition.GetConcurrencyLimiter(ctx.GetRemoteIpAddress()!, partition =>
new ConcurrencyLimiterOptions
{
PermitLimit = 3
})
),
PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
RateLimitPartition.GetFixedWindowLimiter(ctx.GetRemoteIpAddress()!, partition =>
new FixedWindowRateLimiterOptions
{
PermitLimit = 6000,
Window = TimeSpan.FromHours(1)
})
));
});
Let's now see how we can partition the requests.
Partitioning Strategies
The idea behind partitioning is that the limitation will be applied, and individual requests counted, per partition ("bucket"); for example, the fixed window count is kept per partition, so one partition may have reached the limit while others have not. The partition key is the partitionKey parameter, and it is up to us to provide it somehow.
Global
If we wish to apply the same limit to all, we just need to set the partitionKey parameter to the same value, which can even be an empty string. For example:
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
{
var partitionKey = ""; //global partition, applies to all requests
return RateLimitPartition.GetFixedWindowLimiter(partitionKey: partitionKey, _ => new FixedWindowRateLimiterOptions
{
PermitLimit = 10,
Window = TimeSpan.FromMinutes(1)
});
});
Health Status
One possible option is to limit the number of accesses (concurrent or per time period) if the health status of the system is degraded or unhealthy. The idea is: if your system is misbehaving, you might want to throttle requests to it. If you've seen my recent post, you know how to check for this. Here's one way to implement this limitation: if the health is degraded or unhealthy (HealthStatus), we apply a certain limiter by setting a common partition key; otherwise, we just return a GUID. As is, returning a GUID essentially means disabling the rate limiter:
var healthChecker = ctx.RequestServices.GetRequiredService<HealthCheckService>();
var healthStatus = healthChecker.CheckHealthAsync().ConfigureAwait(false).GetAwaiter().GetResult();
var partitionKey = (healthStatus.Status == HealthStatus.Healthy) ? Guid.NewGuid().ToString() : healthStatus.Status.ToString();
The HealthCheckService.CheckHealthAsync() call is asynchronous, so we must make it synchronous, because we are in a synchronous context.
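Putting it together, here's a sketch of how this partition key could be wired into a global limiter (the choice of a concurrency limiter and its PermitLimit are just illustrative):

```csharp
builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
    {
        var healthChecker = ctx.RequestServices.GetRequiredService<HealthCheckService>();
        var healthStatus = healthChecker.CheckHealthAsync().ConfigureAwait(false).GetAwaiter().GetResult();

        //healthy: a fresh GUID gives every request its own partition, so no limit is ever hit
        //degraded/unhealthy: a shared key makes all requests compete for the same small limit
        var partitionKey = (healthStatus.Status == HealthStatus.Healthy)
            ? Guid.NewGuid().ToString()
            : healthStatus.Status.ToString();

        return RateLimitPartition.GetConcurrencyLimiter(partitionKey, _ =>
            new ConcurrencyLimiterOptions
            {
                PermitLimit = 5 //illustrative: allow few concurrent requests while degraded
            });
    });
});
```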
Authenticated vs Anonymous Users
Another option would be: we want to offer the same (high) quality of service to authenticated users, while limiting anonymous ones. We could do it like this:
var partitionKey = (ctx.User.Identity?.IsAuthenticated == true) ? Guid.NewGuid().ToString() : "Anonymous";
So, if the user is authenticated, we just return a GUID, which essentially means that a different limits "bucket" will be incremented each time, which is the same as saying that the limits will never be reached. For anonymous users, the same "bucket" is always incremented, which makes it reach the limit much faster.
Per Authenticated User
A related option is to use a limit "bucket" for each authenticated user, and the same for anonymous ones:
var partitionKey = (ctx.User.Identity?.IsAuthenticated == true) ? ctx.User.Identity.Name! : "Anonymous";
Per Header
And if we want to use some custom header, we can as well:
var partitionKey = ctx.Request.Headers["X-ClientId"].ToString();
Per Tenant
If we are running a multi-tenant app, we can use the tenant name as the limits "bucket". Let's suppose we have an ITenantIdentificationService, such as the one I presented in my SharedFlat project, which looks like this:
public interface ITenantIdentificationService
{
string GetCurrentTenant(HttpContext context);
}
If this service is registered in DI, we can get the tenant name like this:
var partitionKey = ctx.RequestServices.GetRequiredService<ITenantIdentificationService>().GetCurrentTenant(ctx);
In any case, you just need to have a way to get the current tenant from the HttpContext, and use it as the partition key.
Per Country
Another option, which I introduce here more for fun: we apply limits per requesting country! We can use the service introduced in my previous post, IGeoIpService. We get the geo location from the requesting client's IP address, and then use the country code as the partition key:
var partitionKey = ctx.RequestServices.GetRequiredService<IGeoIpService>()
    .GetInfo(ctx.GetRemoteIpAddress())
    .ConfigureAwait(false)
    .GetAwaiter()
    .GetResult()
    .CountryCode;
Of course, this call is asynchronous, so we must make it synchronous first. Not ideal, but nothing we can do about it.
Per Source IP
Yet another option is to use the remote client's IP address:
var partitionKey = ctx.GetRemoteIpAddress();
Mind you, the GetRemoteIpAddress() extension method is the same that was introduced here.
Per IP Range
And what if we want to limit just some IP range? We can do it as well:
var ipAddress = ctx.GetRemoteIpAddress();
var startIpAddress = ...; //get a start and end IP addresses somehow
var endIpAddress = ...;
string partitionKey;

if (IPAddress.Parse(ipAddress!).IsInRange(startIpAddress, endIpAddress))
{
    partitionKey = $"{startIpAddress}-{endIpAddress}";
}
else
{
    partitionKey = Guid.NewGuid().ToString(); //the client IP is out of the limited range, which means we don't want it to be limited
}
The IsInRange extension method is:
public static class IPAddressExtensions
{
    public static bool IsInRange(this IPAddress ipAddress, IPAddress startIpAddress, IPAddress endIpAddress)
    {
        ArgumentNullException.ThrowIfNull(ipAddress);
        ArgumentNullException.ThrowIfNull(startIpAddress);
        ArgumentNullException.ThrowIfNull(endIpAddress);
        ArgumentOutOfRangeException.ThrowIfNotEqual((int)ipAddress.AddressFamily, (int)startIpAddress.AddressFamily, nameof(startIpAddress));
        ArgumentOutOfRangeException.ThrowIfNotEqual((int)ipAddress.AddressFamily, (int)endIpAddress.AddressFamily, nameof(endIpAddress));

        if (ipAddress.AddressFamily != AddressFamily.InterNetwork && ipAddress.AddressFamily != AddressFamily.InterNetworkV6)
        {
            throw new ArgumentException($"AddressFamily {ipAddress.AddressFamily} not supported.", nameof(ipAddress));
        }

        //GetAddressBytes returns the address in network (big-endian) order, so a
        //lexicographic byte comparison matches numeric order, for both IPv4 and IPv6
        ReadOnlySpan<byte> ip = ipAddress.GetAddressBytes();
        ReadOnlySpan<byte> start = startIpAddress.GetAddressBytes();
        ReadOnlySpan<byte> end = endIpAddress.GetAddressBytes();

        return (ip.SequenceCompareTo(start) >= 0) && (ip.SequenceCompareTo(end) <= 0);
    }
}
Essentially, it compares the addresses byte by byte, in network order, to check whether the remote client's IP address falls within the range; since the most significant bytes come first, this matches numeric comparison.
Per Domain
Similar to the previous one, if we want to limit per DNS domain name:
var ipAddress = ctx.GetRemoteIpAddress();
string partitionKey;

try
{
    var entry = Dns.GetHostEntry(IPAddress.Parse(ipAddress!));
    partitionKey = string.Join(".", entry.HostName.Split('.').Skip(1)); //just the domain name, no host
}
catch (SocketException)
{
    partitionKey = ipAddress!; //no reverse DNS entry registered, so we fall back to the source IP
}
Keep in mind that Dns.GetHostEntry throws a SocketException when there is no reverse DNS entry, and that a synchronous DNS lookup on every request adds latency.
Applicability
Now, how to actually apply the limitations.
Global
We just saw how to configure a rate limit globally. If nothing else is said, global limiters apply to all endpoints, MVC controllers, Razor Pages, and minimal APIs.
Policy-Based
In order to apply a specific policy to an endpoint, we apply the [EnableRateLimiting] attribute. For MVC controllers, here's how we do it:
[EnableRateLimiting("Limit10PerMinute")]
public IActionResult Get()
{
//
}
It can also be applied to a whole controller, as well as to individual action methods:
[EnableRateLimiting("Limit10PerMinute")]
public class HomeController : Controller
{
[EnableRateLimiting("Web")]
public IActionResult Get()
{
//
}
}
Or globally:
app.MapDefaultControllerRoute().RequireRateLimiting("Limit10PerMinute");
For Razor Pages, we need to apply the attribute to the page model class:
[EnableRateLimiting("Limit10PerMinute")]
public class Index2Model : PageModel
{
}
Mind you, we cannot restrict a single handler method, like OnGet or OnPost, but we can apply it globally:
app.MapRazorPages().RequireRateLimiting("Limit10PerMinute");
For minimal APIs, we add the restriction next to the route declaration:
app.MapGet("/Home", () =>
{
    //
}).RequireRateLimiting("Limit10PerMinute");
And, if we want to exclude some particular action method, we apply a [DisableRateLimiting] attribute:
[DisableRateLimiting]
public IActionResult Get()
{
//
}
And for minimal API endpoints:
app.MapGet("/Home", () =>
{
    //
}).DisableRateLimiting();
What to Do When the Limit Is Reached
So, when the limit is reached, you have a few options:
- Return the default error message, with the status code of your choice
- Return some static content
- Redirect to some URL
Let's explore these options. The first one is straightforward: it's what happens when you do nothing. You may want to customise the returned status code; HTTP 429 (Too Many Requests) is the conventional choice to signal that the request limit has been reached, although the middleware's default is actually HTTP 503 (Service Unavailable):
options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
This is inside the AddRateLimiter() call, RejectionStatusCode is a property of RateLimiterOptions.
The next option, return static content, which is probably some file, can be implemented as this:
static async Task ReturnStaticFile(HttpContext context, string localFilePath, CancellationToken cancellationToken)
{
    var filePath = Path.Combine(context.RequestServices.GetRequiredService<IWebHostEnvironment>().WebRootPath, localFilePath);
    var contents = await File.ReadAllTextAsync(filePath, cancellationToken);
    context.Response.StatusCode = StatusCodes.Status429TooManyRequests; //set the status code before writing to the response body
    await context.Response.WriteAsync(contents, cancellationToken);
}
//...
options.OnRejected = async (context, cancellationToken) =>
{
if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
{
context.HttpContext.Response.Headers.RetryAfter = ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
}
await ReturnStaticFile(context.HttpContext, "ratelimitexceeded.html", cancellationToken);
};
As you can see, it uses IWebHostEnvironment to get the local path of the wwwroot folder, reads all text from the file, and returns it to the browser.
The final option, redirect to some location, can be implemented as:
static Task RedirectToPage(HttpContext context, string location)
{
context.Response.Redirect(location, permanent: false);
return Task.CompletedTask;
}
//...
options.OnRejected = async (context, cancellationToken) =>
{
if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
{
context.HttpContext.Response.Headers.RetryAfter = ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
}
await RedirectToPage(context.HttpContext, "/ratelimitexceeded");
};
As you can see, nothing special here: we just use the Redirect method with the permanent flag set to false, because it's a temporary redirect.
The Retry-After header is a standard HTTP response header that tells the client when it will be possible to request the resource again.
One final word of caution: the order in which the middleware is added to the pipeline matters, so UseRateLimiter must be placed after UseRouting and before MapRazorPages (if you're using Razor Pages):
app.UseHttpsRedirection();
app.UseStaticFiles();
app.UseRouting();
app.UseRateLimiter();
app.UseAuthorization();
app.MapRazorPages();
app.Run();
Also, if you want to redirect to a local action method or to a Razor Page, make sure you add the [DisableRateLimiting] attribute to it:
//ASP.NET Core MVC
[DisableRateLimiting]
public IActionResult RateLimitExceeded() { ... }
//...
//Razor Pages
[DisableRateLimiting]
public class RateLimitingExceededModel : PageModel
{
public void OnGet() { ... }
}
And that's it!
Conclusion
Rate limiting is a powerful tool for your services, useful when your cloud provider's offering is not enough, or when you are hosting on-premises.
Hope you enjoyed this, stay tuned for more!