We’re experiencing persistent connection timeouts between our Cloud Run service and our Memorystore Redis instance. Every troubleshooting step we’ve tried has failed, so we need the community’s help.
Problem:
FastAPI service on Cloud Run (cortex-engine-service) fails to connect to Redis (cortex-redis-cache) during startup. Error in logs:
“ConnectionError: Redis is not reachable after retries” after 5 attempts. Results in 503 errors.
Why would VPC connector settings not persist on Cloud Run despite a successful update command?
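VPC connector settings live on the revision template, not on the service as a whole. Running “gcloud run services update” with “--vpc-connector” and “--vpc-egress” creates a new revision that carries those settings, and later “gcloud run deploy” calls normally inherit them unless you change them explicitly. The settings are lost when a later deployment replaces the whole template without them: for example, “gcloud run services replace” with a YAML file that lacks the VPC annotations, a Terraform or CI/CD pipeline that doesn’t declare the connector, or a deploy that passes “--clear-vpc-connector”. In those cases the new revision has no VPC connector, even though the earlier update command succeeded. Also confirm that traffic is actually routed to the revision that has the connector. The details are covered in the Cloud Run VPC connector documentation.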
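To see what the current revision template actually carries, you can inspect the service definition. The sketch below assumes the JSON comes from “gcloud run services describe cortex-engine-service --format=json” and that the connector is recorded under the “run.googleapis.com/vpc-access-connector” and “run.googleapis.com/vpc-access-egress” template annotations. The helper name vpc_settings and the connector name in the sample are hypothetical.

```python
import json

CONNECTOR_KEY = "run.googleapis.com/vpc-access-connector"
EGRESS_KEY = "run.googleapis.com/vpc-access-egress"


def vpc_settings(service: dict) -> dict:
    """Return the VPC connector and egress annotations from the revision template."""
    template = service.get("spec", {}).get("template", {})
    annotations = template.get("metadata", {}).get("annotations") or {}
    return {
        "connector": annotations.get(CONNECTOR_KEY),
        "egress": annotations.get(EGRESS_KEY),
    }


# Illustrative input only; real gcloud output has many more fields,
# and "cortex-connector" is a made-up connector name.
sample = json.loads("""
{
  "spec": {
    "template": {
      "metadata": {
        "annotations": {
          "run.googleapis.com/vpc-access-connector": "cortex-connector",
          "run.googleapis.com/vpc-access-egress": "private-ranges-only"
        }
      }
    }
  }
}
""")

print(vpc_settings(sample))
```

If both values come back as None for the revision that is serving traffic, that revision has no connector attached, whatever the earlier update command reported.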
What hidden networking constraints might block Serverless VPC Access to Memorystore?
VPC Peering State: Memorystore for Redis instances automatically establish a VPC peering connection (e.g., redis-peer-############) with a Google internal VPC network. If this peering is accidentally deleted, connectivity will fail with connection timeouts.
IP Range Exhaustion or Conflicting Routes: The IP range allocated for the VPC connector (192.168.200.0/28) holds only 16 addresses, so it might be exhausted, or it might overlap with existing routes or subnets within the “cortex-private-vpc”.
Region Mismatch: A fundamental requirement is that the VPC connector, Cloud Run service, and Memorystore instance must all be in the same region.
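The range and region checks above can be done offline before touching any cloud resources. This is a minimal sketch using Python’s standard ipaddress module; the list of existing ranges is invented for illustration and should be replaced with the real subnets and routes from “cortex-private-vpc”, and the helper names are hypothetical.

```python
import ipaddress


def find_overlaps(connector_cidr: str, other_cidrs: list[str]) -> list[str]:
    """Return every CIDR in other_cidrs that overlaps the connector range."""
    connector = ipaddress.ip_network(connector_cidr)
    return [c for c in other_cidrs if connector.overlaps(ipaddress.ip_network(c))]


def same_region(*regions: str) -> bool:
    """True only if every component reports the same region."""
    return len(set(regions)) == 1


connector_range = "192.168.200.0/28"

# Hypothetical ranges standing in for the subnets and routes in the VPC.
existing = ["10.0.0.0/24", "192.168.200.0/24", "172.16.0.0/20"]

# A /28 contains 16 addresses in total.
print(ipaddress.ip_network(connector_range).num_addresses)

# Any range listed here conflicts with the connector range.
print(find_overlaps(connector_range, existing))

# Regions of the connector, the Cloud Run service, and the Redis instance.
print(same_region("europe-southwest1", "europe-southwest1", "europe-southwest1"))
```

With the made-up ranges above, only 192.168.200.0/24 is reported as overlapping.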
How can I debug traffic flow between the VPC connector and Memorystore?
Connectivity Tests: This tool can validate the network path from a source representing the Cloud Run service (e.g., an IP within the VPC connector’s allocated range) to the Memorystore Redis private IP. It analyzes configurations and performs live data-plane analysis to pinpoint where traffic is being dropped.
Enable VPC Flow Logs on the subnet used by the connector. This will show whether packets are reaching Redis and if they’re being dropped.
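Alongside those tools, a plain TCP probe run from inside the Cloud Run service can narrow things down quickly. The sketch below is only an illustration: the function name probe is hypothetical, and port 6379 is assumed because it is the default Redis port. It classifies one connection attempt, so you can tell a timeout (packets are probably being dropped or never routed) from a refusal (the host was reached but nothing accepted the connection).

```python
import socket


def probe(host: str, port: int = 6379, timeout: float = 3.0) -> str:
    """Classify a single TCP connection attempt to host:port."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the timeout: typical of a missing route,
        # a detached connector, or a firewall silently dropping packets.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and other socket errors.
        return f"error: {exc}"
```

Calling probe with the Redis private IP from a temporary endpoint or a startup hook, and logging the result, gives a data point to compare against what Connectivity Tests and the flow logs report.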
Any known issues in the europe-southwest1 region with this setup?
The Google Cloud Service Health Dashboard currently reports no major incidents affecting Cloud Run, Memorystore, or Serverless VPC Access in the europe-southwest1 region. However, on June 12, 2025, there was a multi-product incident that impacted Cloud Run and Memorystore in europe-southwest1 as well as other regions. This incident caused 503 errors and intermittent API/UI access issues across multiple Google Cloud products due to an invalid automated quota update to the API management system. Although the incident is resolved, it is a good idea to check whether your Redis instance was affected or if any residual issues remain.
If the issue still persists, I recommend reaching out to Google Cloud Support for further assistance. They have the tools and expertise to delve deeper into the problem and provide specific solutions.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.