Urgent Need for Quota Increase: GenerateContent Input Tokens (TPM) for Gemini 2.5 Flash

Hello everyone, my name is Aldo, and I am developing my first major API project, PrevisER, which utilizes the Gemini 2.5 Flash model for structured geopolitical analysis.

I am reaching out because, despite upgrading my account, I am currently blocked by a quota limit that prevents me from running my core workload.

Technical Context

  1. Project Workload: My application, PrevisER, performs high-volume analytical simulations that require a rapid sequence of API calls. The structured analysis model demands consistent, high-throughput processing.

  2. Observed Peak Usage: During stress testing, the application demonstrated a peak load of 1,140,000 Tokens Per Minute (1.14M TPM).

  3. Account Status:

    • I recently migrated from the Free Tier to a Paid (Pay-As-You-Go) account to remove the initial restrictions.

    • The quota for the critical metric GenerateContent input token count limit per model per minute for gemini-2.5-flash was automatically raised from 500,000 TPM to 1,000,000 TPM (1M).

The Problem: Blocked at Tier 1

The current limit of 1,000,000 TPM is still insufficient for my required peak usage (1.14M TPM).

When I attempt to formally request a quota increase through the Google Cloud Console interface, I encounter a roadblock: the system prevents me from entering a value higher than 1,000,000. This suggests that my project is currently capped at the Paid Tier 1 maximum limit.
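
While the cap stays at 1,000,000 TPM, one possible stopgap is to throttle on the client side so that bursts are delayed rather than rejected. The following is only a minimal sketch of a sliding-window limiter, not an official mechanism of the Gemini API: the class name `TokenRateLimiter` and its parameters are hypothetical, and it assumes the input token count of each request is known or estimated before the request is sent.

```python
import time
from collections import deque


class TokenRateLimiter:
    """Sliding-window limiter: caps tokens sent in any 60-second window.

    Hypothetical helper for illustration; not part of any Google SDK.
    """

    def __init__(self, tokens_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.limit = tokens_per_minute
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.window = deque()   # (timestamp, tokens) of recent requests
        self.used = 0           # tokens recorded inside the current window

    def _expire(self, now):
        # Drop requests that have left the 60-second window.
        while self.window and now - self.window[0][0] >= 60.0:
            _, tokens = self.window.popleft()
            self.used -= tokens

    def acquire(self, tokens):
        """Block until `tokens` fit in the window, then record them."""
        if tokens > self.limit:
            raise ValueError("single request exceeds the per-minute limit")
        while True:
            now = self.clock()
            self._expire(now)
            if self.used + tokens <= self.limit:
                self.window.append((now, tokens))
                self.used += tokens
                return
            # Wait until the oldest recorded request expires.
            self.sleep(60.0 - (now - self.window[0][0]))
```

Calling `limiter.acquire(estimated_input_tokens)` before each request would spread a 1.14M-token burst over slightly more than one minute. This trades latency for staying under the cap and does not replace a real quota increase.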

My Request

I urgently require access to the next quota tier (Tier 2 or higher) to accommodate my production workload.

  • Target Metric: GenerateContent input token count limit per model per minute

  • Target Model: gemini-2.5-flash

  • Target Quota: 2,000,000 Tokens Per Minute (2M TPM), to provide the necessary safety margin above the 1.14M peak.
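
For reference, the margin implied by these figures can be checked with a few lines of arithmetic. This is just a sketch using the numbers quoted in this post; the variable names are illustrative.

```python
# Figures taken from this post.
peak_tpm = 1_140_000         # observed peak during stress testing
current_limit = 1_000_000    # current Tier 1 cap
requested_limit = 2_000_000  # requested quota

# How far the peak exceeds the current cap.
shortfall = peak_tpm - current_limit
overage_pct = round(shortfall / current_limit * 100, 1)

# Headroom the requested quota would leave above the peak.
headroom_pct = round((requested_limit - peak_tpm) / peak_tpm * 100, 1)

print(f"Shortfall: {shortfall:,} TPM ({overage_pct}% over the current cap)")
print(f"Headroom at requested quota: {headroom_pct}% above peak")
```

In other words, the observed peak is 14% above the current cap, and the requested 2M TPM would leave roughly 75% headroom over that peak.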

Could a member of the Google Cloud or Generative AI API team please assist in manually reviewing and approving this Tier 2 quota increase request?

Thank you for your time and assistance.

Best regards,
Aldo

Hello @aldoskyz8,

First things first :slight_smile: This forum is not the place for requesting URGENT things (especially if we are talking about increasing quotas). We can help as always, but it’s not an official path for requesting standard or urgent support. Regarding your request, according to the documentation [1], you have to meet special conditions to be able to request a tier upgrade. The conditions for Tier 2 and Tier 3 are listed there.

However, let’s ask our AI Guru :robot: @ilnardo92, would you be able to help here by any chance?

[1] https://ai.google.dev/gemini-api/docs/rate-limits#tier-2

Cheers,
Damian | GDE for Google Cloud


Thank you so much, Damian! I was so stuck in the sand! :slightly_smiling_face: (And so sorry for posting in the wrong place. I was in a panic because I didn’t know where to go.)

