Freezing when using Gemini Flash Preview on Vertex AI Global Endpoint

Hi everyone,

I am encountering an intermittent issue where my requests hang indefinitely without returning a response or immediately throwing an error (such as Resource Exhausted, even though I am not sending that much data).

The Issue: When sending requests to the Flash models, the process often freezes between sending the request and receiving the response. I end up having to restart the script because it just sits there.

Details:

  • Models: I am seeing this primarily with gemini-2.5-flash and gemini-3-flash-preview.

  • Libraries: I am using the google-genai SDK, and I am also trying the Vertex AI Global Endpoint.

  • Behavior: The connection seems to hang open, but no content is generated, so I either have to kill the request manually or wait indefinitely.

Is this a known issue with the Global Endpoint or the Python SDKs right now? Are there any specific timeout settings I should apply to handle these “silent freezes” better?
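
To make the "timeout settings" question concrete, here is a minimal sketch of a client-side deadline with retries, using only the Python standard library. It does not use any google-genai API: `call_with_deadline` and `call_with_retries` are hypothetical helper names, and `fn` stands in for whatever blocking function actually sends the request. The timeout, attempt count, and backoff values are placeholder assumptions, not recommended settings.

```python
import concurrent.futures
import time


def call_with_deadline(fn, timeout_s, *args, **kwargs):
    """Run fn in a worker thread and stop waiting after timeout_s seconds.

    Raises concurrent.futures.TimeoutError if fn has not finished in time.
    The worker thread is NOT killed; it keeps running in the background.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on the (possibly hung) worker when leaving.
        executor.shutdown(wait=False)


def call_with_retries(fn, timeout_s=30.0, attempts=3, backoff_s=1.0):
    """Retry fn when it exceeds the deadline, with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call_with_deadline(fn, timeout_s)
        except concurrent.futures.TimeoutError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(backoff_s * 2 ** (attempt - 1))
```

Usage would look like `call_with_retries(lambda: send_request(prompt))`, where `send_request` is a hypothetical wrapper around the real SDK call. One caveat of this approach: a request that never returns leaves its worker thread alive, and the interpreter may wait for that thread at exit, so a timeout enforced at the HTTP layer is preferable where the SDK supports one.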

Thanks.