How are Vertex AI rate limits calculated on GCP?

diegol116 · October 25, 2024, 7:19pm

I’m planning to use Google Cloud Platform’s Vertex AI for a few projects. So, I was looking through the documentation in the section on rate limits and I came across this:

https://cloud.google.com/vertex-ai/generative-ai/docs/quotas

But I haven’t found any information anywhere about the algorithm that enforces these limits. I have two scenarios in mind:

First scenario: The limits reset at fixed times. For example, between 08:00:00 AM and 08:00:59 AM there are 4 million tokens available, and at 08:01:00 AM the token budget is reset.
Second scenario: The limits apply to a window that moves as requests are made, so each request is checked against usage over the preceding minute.

Or maybe it’s different from the scenarios outlined.
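
To make the two scenarios concrete, here is a minimal Python sketch of each one. These are generic textbook rate-limiting schemes (a fixed-window counter and a sliding-window log), not Google's actual implementation, which I haven't found documented. The class names, the `allow` method, and the 60-second window are my own assumptions for illustration; the 4 million token limit is just the example figure from above.

```python
from collections import deque


class FixedWindowLimiter:
    """Scenario 1 (hypothetical): the budget resets at fixed clock boundaries."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.current_window = None
        self.used = 0

    def allow(self, now, tokens):
        # Identify which clock-aligned window this timestamp falls into.
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # A new window started: the whole budget is restored at once.
            self.current_window = window
            self.used = 0
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True


class SlidingWindowLimiter:
    """Scenario 2 (hypothetical): the budget covers the trailing window
    that ends at the moment of each request."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, tokens) pairs, oldest first
        self.used = 0

    def allow(self, now, tokens):
        # Forget usage that has aged out of the trailing window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_tokens = self.events.popleft()
            self.used -= old_tokens
        if self.used + tokens > self.limit:
            return False
        self.events.append((now, tokens))
        self.used += tokens
        return True


# Spend the whole budget at second 59, then try again at second 61.
fixed = FixedWindowLimiter(limit=4_000_000)
sliding = SlidingWindowLimiter(limit=4_000_000)
fixed.allow(59, 4_000_000)
sliding.allow(59, 4_000_000)
fixed_result = fixed.allow(61, 1)      # a new clock minute has begun
sliding_result = sliding.allow(61, 1)  # the burst at second 59 still counts
```

The practical difference shows up at window boundaries: in the first scenario a burst just before the reset can be followed immediately by another full burst, while in the second scenario the earlier burst keeps counting until it is a full minute old.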

I would appreciate it if someone could explain how Google calculates this, or point me to the section of the documentation that covers it, since I haven’t been able to find it.

dawnberdan · October 31, 2024, 4:58pm

Hi @diegol116,

Welcome to Google Cloud Community!

Unfortunately, Google doesn’t publicly disclose the exact algorithm used to calculate these limits. What the documentation does state is that Vertex AI Generative AI quotas are based on the number of requests per minute (RPM) for a base model and all its versions, identifiers, and tuned versions. The quotas apply to requests for a given Google Cloud project and supported region. Additionally, there are separate quotas for specific services like RAG Engine and Gen AI Evaluation Service. Some quotas are shared across all applications and IP addresses within a Google Cloud project.

I hope the above information is helpful.
