I am building an agent using the Google Agent Starter Pack.
I develop and test it locally first, through the Playground and a Telegram bot running in polling mode. In the local environment the entire pipeline is fast and consistent: every LLM request completes in 1–3 seconds.
I then deployed the exact same agent to Cloud Run (asia-southeast1). The code, configuration, model, and parameters are identical to the local version; the only difference is that the agent runs inside Cloud Run.
The issue is that in Cloud Run the same LLM request takes 20–40 seconds instead of the 1–3 seconds I see locally.
The slowdown happens specifically during the call to the Vertex AI model (Gemini 2.5 Flash); all other stages take only a few milliseconds. Logs clearly show the delay between LLM_CALL_START and LLM_CALL_END, and every call returns successfully with no retries and no errors.
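For context, the LLM_CALL_START / LLM_CALL_END markers come from a simple timing wrapper around the model call, roughly like the sketch below. The `call_gemini` function here is a placeholder standing in for the real Vertex AI call, and the marker names are the only part taken from my actual logs:

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def timed_llm_call(fn):
    """Log LLM_CALL_START / LLM_CALL_END and the elapsed wall-clock time."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        log.info("LLM_CALL_START")
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            log.info("LLM_CALL_END elapsed=%.2fs", elapsed)
    return wrapper

@timed_llm_call
def call_gemini(prompt: str) -> str:
    # Placeholder for the real Vertex AI request (Gemini 2.5 Flash);
    # the sleep simulates model latency.
    time.sleep(0.05)
    return f"echo: {prompt}"
```

So the 20–40 seconds measured in Cloud Run is pure wall-clock time spent inside the model call itself, not in my surrounding pipeline code.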
In other words, the local version is fast and stable, while the Cloud Run version is extremely slow despite being completely identical. I need to understand what could cause such a consistent performance difference and how to properly diagnose and resolve it.
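One diagnostic step I am considering (my own assumption about a reasonable approach, not something from the Starter Pack) is to rule out network-path problems by measuring raw TCP connect time from inside the container to the regional Vertex AI endpoint, separately from the model latency itself:

```python
import socket
import time

def tcp_connect_ms(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the TCP connect time to host:port in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

# Example (run from inside the Cloud Run container; adjust to your region):
#   tcp_connect_ms("asia-southeast1-aiplatform.googleapis.com", 443)
```

If the connect time is in the low milliseconds, the delay is happening after the connection is established (TLS, auth, or model-side processing) rather than on the network path.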

