Hi! I’m running a WebSocket server on Cloud Run. The settings I currently have are:
Max Instances: 10
Concurrency: 1000
Request Timeout: 3600s
During peak hours, the metrics for this service are:
Max CPU usage: 20%
Max memory usage: 30%
Max concurrent requests: 500
Container instances: 12 (??)
Why is Cloud Run scaling the service so heavily when CPU usage, memory usage, and the number of concurrent requests are all well below their respective limits? With a concurrency limit of 1000 and only ~500 concurrent requests, I would expect a single instance to absorb the peak load, yet I’m seeing 12. Am I missing something?
Additional Info:
I am using the Warp library in Rust, which has no internal request limits (a minimal sketch of the server setup is included after this list).
To be very clear: I have already set max concurrency to 1000, and I’m only receiving around 500 concurrent requests. CPU and memory usage never exceed the figures above, and the traffic is not bursty.
I am aware that long-lived WebSocket connections mean instances will be slow to scale down (each instance needs to finish its long-lived requests first), but this should have no impact on scaling up.
I have read the Concurrency and WebSockets pages of the Cloud Run documentation, but couldn’t find anything there that explains this.
I have tried halving the request timeout to 30 minutes, but this made no difference.
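For reference, the server setup is essentially the minimal Warp WebSocket handler below. This is a simplified sketch, not the exact production code; the route name, the echo logic, and the crate choices (tokio, futures-util) are just illustrative.

```rust
use futures_util::{SinkExt, StreamExt};
use warp::Filter;

#[tokio::main]
async fn main() {
    // "/ws" upgrades the request to a WebSocket; Warp itself places no cap
    // on how many connections are served concurrently.
    let ws_route = warp::path("ws")
        .and(warp::ws())
        .map(|ws: warp::ws::Ws| {
            ws.on_upgrade(|socket| async move {
                let (mut tx, mut rx) = socket.split();
                // Echo messages back until the client disconnects.
                while let Some(Ok(msg)) = rx.next().await {
                    if tx.send(msg).await.is_err() {
                        break;
                    }
                }
            })
        });

    // Cloud Run injects the port to listen on via $PORT (8080 by default).
    let port: u16 = std::env::var("PORT")
        .ok()
        .and_then(|p| p.parse().ok())
        .unwrap_or(8080);
    warp::serve(ws_route).run(([0, 0, 0, 0], port)).await;
}
```

Nothing in there throttles connections, so any limit would have to come from Cloud Run itself or from the container’s resources.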
One thought I have: if you’re using multiple vCPUs, is your code actually capable of utilizing all of them? For example, if you set CPU to 4 but your container only ever uses 1 CPU, then even though utilization looks low, the service isn’t actually able to serve more requests concurrently. Some languages don’t do a good job of utilizing multiple CPUs. If that’s the case, try setting CPU to 1 and see if that helps - you would see more instances, but each would be cheaper.
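In Rust specifically, one thing worth checking is how the Tokio runtime is built. The sketch below is only illustrative (run_server() is a placeholder for the real Warp entry point): the default multi-threaded runtime already spawns one worker per available CPU, but a current_thread runtime or an explicit worker_threads = 1 would pin the whole service to a single core even on a 4-vCPU instance.

```rust
use std::thread;

fn main() {
    // How many CPUs the container actually exposes to this process.
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    println!("available parallelism: {cpus}");

    // Size the Tokio runtime explicitly so every vCPU is usable.
    let runtime = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(cpus)
        .enable_all()
        .build()
        .expect("failed to build Tokio runtime");

    runtime.block_on(async {
        // run_server().await; // placeholder for the real Warp server
    });
}
```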
Could it be I/O limits, then? Are you calling a downstream resource that doesn’t scale beyond a certain point? Or a VPC connector with too small an instance size?
I am connecting to another Cloud Run service over WebSocket (the service this thread is about essentially acts as a “passthrough” between that other service and its clients); however, there is only ever a single connection between the two services, regardless of the number of concurrent client requests.
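Roughly, the shape of that passthrough is the sketch below: a single upstream WebSocket is read in one task and its messages are fanned out to every connected client through a broadcast channel. This is illustrative only; the upstream URL, the route name, and the use of tokio-tungstenite for the upstream side are placeholders, not the actual code.

```rust
use futures_util::{SinkExt, StreamExt};
use tokio::sync::broadcast;
use warp::Filter;

#[tokio::main]
async fn main() {
    // One channel shared by every client: the single upstream connection
    // publishes into it, each client connection subscribes to it.
    let (tx, _) = broadcast::channel::<String>(1024);

    // The single upstream WebSocket (tokio-tungstenite assumed purely for
    // illustration; the URL is a placeholder).
    let upstream_tx = tx.clone();
    tokio::spawn(async move {
        let (mut upstream, _) =
            tokio_tungstenite::connect_async("wss://other-service.example.com/ws")
                .await
                .expect("failed to reach upstream service");
        while let Some(Ok(msg)) = upstream.next().await {
            if let Ok(text) = msg.to_text() {
                // Ignore "no receivers" errors; clients come and go.
                let _ = upstream_tx.send(text.to_string());
            }
        }
    });

    // Every client request gets its own broadcast receiver.
    let clients = warp::path("ws")
        .and(warp::ws())
        .map(move |ws: warp::ws::Ws| {
            let mut rx = tx.subscribe();
            ws.on_upgrade(move |socket| async move {
                let (mut client_tx, _) = socket.split();
                // Forward upstream messages until the client drops (or lags).
                while let Ok(text) = rx.recv().await {
                    if client_tx.send(warp::ws::Message::text(text)).await.is_err() {
                        break;
                    }
                }
            })
        });

    warp::serve(clients).run(([0, 0, 0, 0], 8080)).await;
}
```

The broadcast channel is just one way to express the fan-out; the important property is that the number of connected clients never changes the number of upstream connections an instance holds.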
I am not connecting the services together through a VPC, so I don’t think that’s the issue.
I will run some tests and check the behaviour of the service when it is not connected to the external resource; hopefully that will narrow it down.
Some other behaviour I’ve noticed: if I turn on manual scaling set to 1 instance, I eventually receive a “429: No available instance” error. The log message is:
Again, while these errors are occurring, max concurrent connections is well below the limit, as are CPU and memory usage.
Is there a way to reset Cloud Run’s scaling behaviour back to its initial state? I wonder if Cloud Run is “remembering” to scale up at certain times of the day based on previous load, even when it doesn’t need to.