Hello everyone,
I am developing a face-swapping application using the Python google-genai SDK. Currently, I’m evaluating the gemini-3.1-flash-image-preview model via Vertex AI.
During my testing, the pipeline frequently crashes, and I am consistently running into this specific error:
Critical Error: 429 RESOURCE_EXHAUSTED. {'error': {'code': 429, 'message': 'Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.', 'status': 'RESOURCE_EXHAUSTED'}}
What are the overall best practices and architectural fixes to handle or prevent these 429 errors when working with preview image models?
I came across the option of Provisioned Throughput, but it is not cost-efficient for our app.
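For reference, the client-side mitigation I see suggested most often for 429s is retrying with exponential backoff and jitter. Below is a minimal sketch of that idea, not something confirmed to solve this case. `RateLimitError` and the `call` argument are hypothetical placeholders (they are not part of google-genai); in practice `call` would wrap the actual SDK request and the `except` clause would catch whatever exception the SDK raises for a 429.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception raised on a 429 response."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=32.0,
                       sleep=time.sleep, rng=random.random):
    """Call `call()` and retry on RateLimitError with exponential backoff.

    The wait before retry number n is a random fraction ("full jitter") of
    min(max_delay, base_delay * 2**n), so concurrent workers do not all
    retry at the same moment. `sleep` and `rng` are injectable for testing.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            # Out of attempts: surface the error to the caller.
            if attempt == max_attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(ceiling * rng())
```

The delays and attempt count here are arbitrary starting values. I would also expect to need a cap on concurrent requests (for example a small worker pool), since backoff alone only spreads the load out and does not reduce it.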
Any advice or suggestions would be greatly appreciated!