Hey everyone,
We’re running a gRPC service on Cloud Run and keep hitting this error after a few minutes under load: Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted or memory limit exceeded
Our setup:
- 50 requests per second
- Each request takes 3-6 seconds to respond
- Memory: 512 MiB
- CPU: 1
- Max concurrency: 20 per instance
- Max instances: 100
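For context, here is a rough back-of-envelope estimate of what these numbers imply at steady state. It is only a sketch: the helper name `steady_state_load` is made up for illustration, it assumes traffic is spread evenly and every instance runs at full concurrency, and it ignores the memory used by the runtime itself.

```python
import math

def steady_state_load(rps, latency_s, concurrency, memory_mib):
    """Back-of-envelope load estimate (illustrative helper, not a Cloud Run API)."""
    # Little's law: requests in flight = arrival rate * time spent per request
    in_flight = rps * latency_s
    # Instances needed if every instance runs at its max concurrency
    instances = math.ceil(in_flight / concurrency)
    # Upper bound on memory per in-flight request; ignores runtime/base overhead
    mem_per_request_mib = memory_mib / concurrency
    return in_flight, instances, mem_per_request_mib

for latency_s in (3, 6):
    in_flight, instances, mem = steady_state_load(50, latency_s, 20, 512)
    print(f"{latency_s}s latency: {in_flight} in flight, "
          f">= {instances} instances, <= {mem:.1f} MiB per request")
```

With our settings that comes to 150-300 requests in flight, 8-15 fully loaded instances (well under the max of 100), and at most about 25.6 MiB of memory per in-flight request.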
The service works fine for short bursts, but once we hit sustained traffic for a few minutes, multiple instances start throwing this error.
I found that Cloud Run has a 600 Mbps bandwidth limit per instance. Our gRPC responses might be pretty large, so I’m thinking we’re hitting that limit.
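To sanity-check the bandwidth theory, this is roughly how large each response would have to be for one instance to saturate that limit. It is a sketch under stated assumptions: it takes the 600 Mbps figure above at face value, counts only response egress, assumes the instance is pinned at 20 concurrent requests, and the function name `response_mb_to_saturate` is made up for illustration.

```python
def response_mb_to_saturate(limit_mbps, concurrency, latency_s):
    """Average response size (MB) at which one instance would hit the limit."""
    # Convert megabits per second to bytes per second
    limit_bytes_per_s = limit_mbps * 1_000_000 / 8
    # Each of the `concurrency` slots finishes one response every latency_s seconds,
    # so the instance completes concurrency / latency_s responses per second
    bytes_per_response = limit_bytes_per_s * latency_s / concurrency
    return bytes_per_response / 1_000_000

for latency_s in (3, 6):
    size_mb = response_mb_to_saturate(600, 20, latency_s)
    print(f"{latency_s}s latency: ~{size_mb:.2f} MB per response to saturate")
```

By this estimate, a fully loaded instance would need responses averaging somewhere around 11-22 MB each to reach the limit on egress alone, which gives us a number to compare against our actual payload sizes.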
Questions:
- Is this definitely a bandwidth issue, or could it be something else?
- Should we lower the concurrency to spread traffic across more instances?
- Will bumping up memory/CPU help with bandwidth, or are those limits fixed?
- Any other config changes we should try?
We’re a bit stuck here - the service works great until it hits this wall. Any advice would be awesome!
Thanks!