429 with no quota or rate limit hit

Hi, we’re on Vertex and sending requests to Gemini 2.5 Pro (GA). We keep getting 429 errors, but we are far from hitting any quotas visible in the quota panel (the highest usage is 2%, on an unrelated quota). Is this a known issue? We are not using a lot of tokens.

Hey,

Hope you’re keeping well.

A 429 from the Vertex AI Gemini API can occur even when you haven’t reached any of the quotas visible in the console, because there are backend rate limits that aren’t exposed in the quota dashboard. These limits can be per-project, per-region, or tied to concurrency on the model, especially for high-demand GA models like Gemini 2.5 Pro.

I’d recommend checking Vertex AI > Monitoring > Requests in the Cloud Console to see your actual request patterns, and trying to lower concurrency or add small delays between calls. If the issue persists, open a support case with your request IDs so the Vertex AI team can confirm whether you’re hitting hidden service limits.
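The retry side of that advice can be sketched roughly like this. This is a minimal Python sketch, not the official client behavior; `RateLimitError` is a stand-in for whatever exception your client raises on a 429 (for the Google client libraries that's typically a resource-exhausted error):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the client library's 429 / resource-exhausted error."""

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying rate-limited calls with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the 429 to the caller
            # 1s, 2s, 4s, ... plus jitter so parallel workers don't retry in sync
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

You'd wrap each Gemini request in `call_with_backoff(lambda: ...)`; the jitter matters when several workers hit the limit at the same moment.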

Thanks and regards,
Taz

Thank you Taz, appreciate your help! We’re monitoring and logging through other services, so we have not configured the monitoring interface per region yet. This makes sense though; I remember seeing more detailed quota information on individual models in the quota panel last year. So the current best practice for keeping track of usage and quotas is to set up Vertex AI Monitoring individually for each model+region combo we’re using, which might give us the necessary info?

In case someone searches and finds this: it seems Vertex AI Monitoring is for custom models, not Gemini models. Gemini on Vertex uses Dynamic Shared Quota (DSQ), and the answer is here:

We understand that encountering a ‘resource exhausted’ 429 error can be frustrating and might lead you to suspect you are hitting some sort of quota limit. However, with DSQ, this is not the case. These errors indicate that the overall shared pool of resources for that specific type (e.g., a particular model in a specific region) at a specific time is experiencing extremely high demand from many users simultaneously.