Hi everyone,
I am encountering intermittent issues where my requests get stuck or hang indefinitely, without immediately returning a response or throwing an error (Resource Exhausted, even though I am not sending that much data).
The Issue: When sending requests to the Flash models, the process often freezes “in between” the request and response. I end up having to restart the script because it just sits there.
Details:
- Models: I am seeing this primarily with gemini-2.5-flash and gemini-3-flash-preview.
- Libraries: I am using google-genai and also trying the Vertex AI Global Endpoint.
- Behavior: The connection seems to hang open, but no content is generated, eventually leading to a manual timeout or indefinite wait.
Is this a known issue with the Global Endpoint or the Python SDKs right now? Are there any specific timeout settings I should apply to handle these “silent freezes” better?
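As a stopgap, a client-side deadline along the lines of the sketch below would at least turn a silent freeze into an error that can be caught and retried. This is only a rough sketch under assumptions: `call_with_deadline` is a hypothetical helper, not part of any SDK, and the function passed in stands for whatever blocking request the script makes. It does not cancel the underlying connection; it only stops the caller from waiting on it. A timeout option built into the SDK would be preferable if one applies here.

```python
import threading


def call_with_deadline(fn, timeout_s, *args, **kwargs):
    """Run a blocking call in a daemon thread and stop waiting after timeout_s.

    Hypothetical helper: `fn` stands in for the real request function.
    The worker thread is NOT killed on timeout; it is abandoned, and being
    a daemon thread it will not keep the interpreter alive at exit.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds

    if thread.is_alive():
        raise TimeoutError(f"no response within {timeout_s} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Usage would be something like `call_with_deadline(send_request, 60.0, prompt)`, where `send_request` is a placeholder for the actual SDK call, wrapped in a `try/except TimeoutError` that logs and retries.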
Thanks.