CodexBloom - Programming Q&A Platform

GCP Cloud Run service not scaling up under load - requests timing out even with concurrency set to 1

👀 Views: 79 💬 Answers: 1 📅 Created: 2025-06-09
gcp cloud-run nodejs scaling performance JavaScript

I'm prototyping a solution and have run into an issue with my GCP Cloud Run service: it isn't scaling up as expected under load. I have set the maximum instances to 10 and the concurrency to 1, but when I send a burst of 50 requests, several of them fail with a timeout error ("504 Gateway Timeout"). The service is built with Node.js (version 14) and uses Express.js for the API endpoints.

Here's a snippet of my Dockerfile:

```Dockerfile
FROM node:14
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
CMD [ "node", "server.js" ]
```

I've also set the request timeout to 60 seconds in the Cloud Run settings, but it doesn't seem to make a difference. I load tested the service with Apache JMeter, and requests start failing at around 20 concurrent requests. CPU and memory are at the default settings and look fine.

I'm not sure whether this is a configuration problem on my end or whether something else is affecting the scaling behavior. I would appreciate any insights on how to handle scaling for high traffic on Cloud Run, or on adjustments I should make to my service configuration.
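For reference, here is a rough back-of-the-envelope sketch of what my settings allow, in case it helps frame the problem. It is a simplified model, not a description of how Cloud Run actually schedules requests: it assumes every instance is already warm, and the `estimateBurst` helper and the 15-second handler duration are hypothetical values chosen for illustration, not measurements from my service.

```javascript
// Rough capacity model for a request burst.
// ASSUMPTIONS: no cold starts, all instances available immediately,
// and a hypothetical fixed handler duration (requestSeconds).
function estimateBurst({ burst, maxInstances, concurrency, requestSeconds }) {
  // Upper bound on requests being processed at the same moment.
  const capacity = maxInstances * concurrency;
  // Requests that cannot start right away, even at full scale-out.
  const queued = Math.max(0, burst - capacity);
  // Number of back-to-back "waves" needed to drain the whole burst.
  const waves = Math.ceil(burst / capacity);
  // Best case: each wave starts the moment the previous one ends.
  const lastWaveFinishSeconds = waves * requestSeconds;
  return { capacity, queued, waves, lastWaveFinishSeconds };
}

// Settings from the question: 50 requests, 10 instances, concurrency 1.
// The 15-second duration is a placeholder, not a measured value.
const estimate = estimateBurst({
  burst: 50,
  maxInstances: 10,
  concurrency: 1,
  requestSeconds: 15,
});
console.log(estimate);
```

With these numbers only 10 requests can be in flight at once, so the remaining 40 have to wait for a free instance. If the best-case finish time of the last wave gets close to the 60-second timeout, that might be one explanation for the late requests failing, though I haven't confirmed this is what is happening.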