Increase quota for gemini-2.5-flash

mevnai · July 30, 2025, 6:32pm

Hi,

I am trying to figure out how to request a quota increase for gemini-2.5-flash, gemini-1.5-flash, and gemini-2.5-pro.
Can someone help me with this?

Thanks,
Bhuvana

LeoK · July 30, 2025, 8:22pm

Hello @mevnai,

Since you tagged vertex-ai-platform, I recommend checking the Vertex AI quotas and limits and verifying whether you can request a quota increase in your project.

Alternatively, if you’re comfortable with a more hacky approach, you can distribute the load across multiple regions to effectively multiply your quota since each region has its own independent limit.
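
As a rough illustration of the multi-region idea, here is a minimal Python sketch that rotates requests across regions in round-robin order. The `RegionRotator` class and the `send(region, prompt)` callable are hypothetical names made up for this example, not part of any Google SDK; you would wrap your own client call in `send`. The region names in the usage note are also only placeholders, so check that your model is actually served in each region you list.

```python
from itertools import cycle
from typing import Callable, Iterable


class RegionRotator:
    """Round-robin over regions so consecutive requests draw on different regional quotas."""

    def __init__(self, regions: Iterable[str]) -> None:
        self._regions = list(regions)
        if not self._regions:
            raise ValueError("at least one region is required")
        self._cycle = cycle(self._regions)

    def next_region(self) -> str:
        # Return the next region in the rotation, wrapping around at the end.
        return next(self._cycle)

    def call(self, send: Callable[[str, str], str], prompt: str) -> str:
        # `send` is a hypothetical callable wrapping your real client;
        # it receives the chosen region and the prompt.
        return send(self.next_region(), prompt)
```

Usage would look something like `RegionRotator(["us-central1", "europe-west4"]).call(my_send, "hello")`, where `my_send` builds a client for the given region. Keep in mind that spreading traffic this way also changes where your data is processed.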

dawnberdan · July 30, 2025, 10:19pm

Hi @mevnai,

To request a quota increase for Gemini 2.5 Flash, Gemini 1.5 Flash, or Gemini 2.5 Pro in Google Cloud, follow these steps:

1. Enable Cloud Billing: Your project must have billing enabled, as quota increases are tied to your billing tier.
2. Know Your Usage Tier: For more information you may check this document.
   Free Tier: ~100 requests/day for Pro, ~1000/day for Flash.
   Paid Tiers (Tier 1–3): Higher limits depending on your cumulative spend.
3. Visit the Quotas Page: Go to the Google Cloud Console Quotas page.
4. Submit a Quota Increase Request: Find the relevant Gemini quota, click the three-dot menu, and select Edit Quota.
5. Wait for Review: If your project meets the requirements, it might be auto-approved. Otherwise, it may undergo manual review.

Additional Tips:

If you’re using Vertex AI, Gemini 2.5 Flash and Pro support Dynamic Shared Quota, so you might not need to manually request increases.
For Gemini Code Assist, upgrading to a Standard or Enterprise plan can also unlock higher limits.
If you’re hitting limits unexpectedly, check for 429 RESOURCE_EXHAUSTED errors and consider switching to Flash models for higher throughput. You can check this thread for more information.
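
For the 429 RESOURCE_EXHAUSTED case, a common client-side mitigation is to retry with exponential backoff and jitter. Below is a minimal sketch of that pattern. `QuotaExhausted` is a hypothetical stand-in for whatever exception your client library raises on a 429, and `call_with_backoff` and its parameters are names chosen for this example only.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on QuotaExhausted with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExhausted:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # The cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": sleep a random time in [0, cap] to avoid synchronized retries.
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

Backoff only smooths out short bursts. If you are consistently above your limit, you still need a higher quota or less traffic.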

mevnai · July 31, 2025, 3:40am

@LeoK , @dawnberdan - Thank you for your response.
I am not able to figure out which service name to select in Quotas & System Limits when submitting the request for increased Gemini usage.
Can you please help me with this?

LeoK · July 31, 2025, 7:45am

@mevnai,

It depends on your use case: Gemini for Google Cloud or Vertex AI.

Gemini for Google Cloud

Gemini for Google Cloud offers generative AI-powered assistance to a wide range of Google Cloud users, including developers and data scientists. To provide an integrated assistance experience, Gemini for Google Cloud is embedded in many Google Cloud products.

If you’re using this, look at the Gemini for Google Cloud API on your GCP project and adjust quotas accordingly.

Vertex AI

If you want to use Gemini models to create your own generative AI application, see Overview of Generative AI on Vertex AI.

In that case, look at the Vertex AI API and use the Filter field on the quotas page to find the quota settings you need.

You’ll likely need to prioritise increasing these two quotas:

Generate content input tokens per minute per region per base_model
Generate content requests per minute per project per region per base_model

These are usually the first to hit limits (tokens or requests).
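
To get a feel for which of the two limits you will hit first, you can compare the request limit with the number of requests your token limit allows at your average prompt size. This is a back-of-the-envelope sketch, not an official formula; `binding_quota` is a hypothetical helper, and the numbers in the usage note are made-up examples, not real quota values.

```python
def binding_quota(
    requests_per_minute: int,
    input_tokens_per_minute: int,
    avg_input_tokens_per_request: int,
) -> tuple[str, float]:
    """Return which quota binds first and the resulting max requests per minute."""
    if avg_input_tokens_per_request <= 0:
        raise ValueError("avg_input_tokens_per_request must be positive")
    # Requests per minute that the token budget alone would permit.
    by_tokens = input_tokens_per_minute / avg_input_tokens_per_request
    if by_tokens < requests_per_minute:
        return "tokens", by_tokens
    return "requests", float(requests_per_minute)
```

For example, with made-up limits of 60 requests/minute and 100,000 input tokens/minute, prompts averaging 4,000 tokens are capped by the token quota at 25 requests/minute, so that is the quota to raise first.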

M_S2 · February 19, 2026, 11:40pm

I am building SaaS products and services on GCP infrastructure and may need quota increases across several microservices. I have read some of the forum threads and there doesn’t seem to be an answer. I have switched a few components between the enterprise vertex-ai route and the gemini-ai API express route, which piggybacks on the Vertex AI component and seems to allow more LLM requests/responses. Throttling still happens, and 429 errors occur because of limits on how many calls can be made to an LLM on GCP. If we are building services that chain multiple calls together because of RAG dependencies, this won’t work.

I am testing with gemini-2.5-flash-lite and could not find anywhere in the IAM quota pages of the Google Cloud console where I can request a quota increase.
