My application uses Gemini 2.5 Flash in asia-southeast1. It fails with a 429 error. How can I check the Vertex API rate limit? I researched a bit online and in the community, and asked Gemini 3.0 Pro, but could not find any insight.
Thanks.
@King_Wu: Gemini quotas on Vertex are managed through Pay as You Go and Provisioned Throughput. Gemini 2+ models use Dynamic Shared Quota (DSQ), which dynamically distributes capacity among all users, so there are no predefined quota limits to look up.
For the 429 errors, I'd suggest:
1) Check out the resolutions: Error code 429 | Generative AI on Vertex AI | Google Cloud Documentation.
2) Set up monitoring: Monitor models | Generative AI on Vertex AI | Google Cloud Documentation.
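Since there is no fixed limit to check under DSQ, a common way to handle occasional 429s on the client side is to retry with exponential backoff and jitter. Here is a minimal sketch in Python, not an official recipe. `RateLimitError` and `call_with_backoff` are hypothetical names I made up for illustration; swap `RateLimitError` for whatever exception your client library actually raises on HTTP 429, and tune the delays to your workload.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception your client raises on HTTP 429."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(); on RateLimitError, wait and retry up to max_attempts times.

    The wait before retry k is a random value in [0, min(max_delay,
    base_delay * 2**k)], so retries spread out instead of arriving together.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

You would wrap your model call in a zero-argument function (or a lambda) and pass it as `fn`. If the 429s are frequent rather than occasional, backoff alone probably won't be enough, and the resolutions page above or Provisioned Throughput is the better route.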