FastAPI: Custom Middleware Not Working as Expected for Rate Limiting

👀 Views: 79 💬 Answers: 1 📅 Created: 2025-06-09

fastapi middleware rate-limiting python

I'm relatively new to this, so bear with me. I'm trying to implement a custom middleware in my FastAPI application for rate limiting requests, but the limit doesn't seem to be applied correctly. I'm using FastAPI version 0.75.0 and Python 3.9. The middleware should restrict users (identified by IP address) to a maximum of 5 requests per minute. Here's the code I've written so far:

```python
from fastapi import FastAPI, Request, Response
from starlette.middleware.base import BaseHTTPMiddleware
from time import time
from collections import defaultdict

app = FastAPI()


class RateLimitMiddleware(BaseHTTPMiddleware):
    def __init__(self, app):
        super().__init__(app)
        # Maps client IP -> list of request timestamps (seconds since epoch)
        self.user_requests = defaultdict(list)

    async def dispatch(self, request: Request, call_next):
        user_ip = request.client.host
        current_time = time()
        # Clean up old timestamps
        self.user_requests[user_ip] = [
            t for t in self.user_requests[user_ip] if t > current_time - 60
        ]
        if len(self.user_requests[user_ip]) >= 5:
            return Response(
                content="Rate limit exceeded. Try again later.",
                status_code=429,
            )
        self.user_requests[user_ip].append(current_time)
        response = await call_next(request)
        return response


app.add_middleware(RateLimitMiddleware)


@app.get("/")
def read_root():
    return {"message": "Hello World"}
```

When I test this middleware by sending 6 requests in quick succession from the same IP address, I expect the 6th request to be blocked with a 429 status code. However, all requests go through without hitting the rate limit, which suggests that either my timestamp cleanup is not working or the count is not being tracked correctly.

I've tried debugging by printing the `self.user_requests` dictionary after each request, and it seems to contain all timestamps as expected. However, the limit check doesn't trigger as I anticipated. Is there an issue with how the timestamps are being managed or how the requests are being counted?
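To rule out the counting logic itself, here is a minimal sketch of the same cleanup-and-count check pulled out of the middleware so it can be exercised without FastAPI. The class name `SlidingWindowLimiter`, its `allow` method, and the injectable `clock` parameter are my own hypothetical names for illustration, not part of FastAPI or Starlette:

```python
from collections import defaultdict
from time import time


class SlidingWindowLimiter:
    """Same check as the middleware's dispatch, isolated from FastAPI.

    Hypothetical helper for debugging; not a FastAPI/Starlette API.
    """

    def __init__(self, max_requests=5, window_seconds=60, clock=time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so a test can control the time
        self.requests = defaultdict(list)

    def allow(self, key):
        """Return True and record the request if `key` is under the limit."""
        now = self.clock()
        # Keep only timestamps that fall inside the current window.
        recent = [t for t in self.requests[key] if t > now - self.window_seconds]
        self.requests[key] = recent
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


# Six back-to-back requests from one key at a frozen (made-up) time.
limiter = SlidingWindowLimiter(clock=lambda: 1000.0)
results = [limiter.allow("127.0.0.1") for _ in range(6)]
print(results)  # [True, True, True, True, True, False]
```

If this isolated version blocks the sixth call, the counting logic is probably not the problem. In that case my guess is that the cause lies elsewhere, for example whether every request reaches the same middleware instance (each worker process would hold its own in-memory dictionary) or what `request.client.host` actually contains for each request.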
Any insights or suggestions would be helpful! For context: this is happening in both development and production, and the environments involved are macOS, CentOS, and Windows 11. Has anyone else encountered this?