Gemini API - 429 Resource has been exhausted (e.g. check quota).

I have implemented logic that uses 15 API keys, all with access enabled. Every request goes to Gemini Pro 1.0, which is Google’s AI. The reason for all this is that with a single API key the entire process takes 4-5 days, since I want to stay within the limit of 60 calls per minute per API key so that Gemini can be used for free. So I now have 15 keys running in parallel, which should cut the time to a few hours. However, I received the message “429 Resource has been exhausted (e.g. check quota).” as well as: “limit ‘GenerateContent request limit per minute for a region’ of service ‘generativelanguage.googleapis.com’ for consumer ‘project_number:11111111111(sample)’”. Could this be because almost all the API keys are associated with one project? How do I increase that limit, and what would it cost?

8 Likes

You are possibly sending too many requests in a short period of time. If possible, can you add a delay between your requests to avoid the error? As for the quota increase, you can follow the instructions provided here. Sample instructions below:

  1. Go to the Quotas page (Go to Quotas). The remaining steps will appear automatically in the Google Cloud console.
  2. On the Quotas page, find the quota you want to increase in the Quota column. You can use the Filter search box to search for your quota.
  3. Select the checkbox to the left of your quota.
  4. Click EDIT QUOTAS. The Quota changes form displays.
  5. In the Quota changes form, enter the increased quota that you want for your project in the New limit field.
  6. Complete any additional fields in the form, and then click DONE.
  7. Click SUBMIT REQUEST.

1 Like
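
For anyone who wants to try the delay suggestion: a small sliding-window limiter keeps you under a given requests-per-minute number without sleeping longer than needed. Below is a minimal Python sketch, not an official client feature. The class name `SlidingWindowLimiter` and its parameters are made up for illustration, and the right `max_calls` value depends on whatever quota your project actually has. Since the error text names the project as the consumer, the limit is presumably counted per project, so one limiter would probably have to be shared by all keys and workers in that project.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period_s`-second window.

    Hypothetical helper for illustration; not part of any Google SDK.
    """

    def __init__(self, max_calls, period_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period_s = period_s
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()          # start times of recent calls
        self._lock = threading.Lock()   # so parallel workers can share it

    def wait(self):
        """Block until another call is allowed, then record it."""
        with self._lock:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._stamps and now - self._stamps[0] >= self.period_s:
                self._stamps.popleft()
            if len(self._stamps) >= self.max_calls:
                # Sleep until the oldest recorded call leaves the window.
                self._sleep(self.period_s - (now - self._stamps[0]))
                now = self._clock()
                self._stamps.popleft()
            self._stamps.append(now)
```

You would create it once, e.g. `limiter = SlidingWindowLimiter(max_calls=60)`, and call `limiter.wait()` right before every request, from every worker.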

I’m running into this same issue. I’m using workflows to do two tasks (via Cloud Functions) that are generative and build on each other. I’m using 2 different API keys, but the second task always fails with the Cloud Function giving me a final error of 429.
Even after implementing 30 seconds of wait time in the workflow, it still happens, with 2 different keys, using 2 different models.

Are we limited at the project level in how often we can call the Gemini generative endpoint?

6 Likes

My theory is that the gemini-1.5-pro-latest endpoint has some other limit that we as users can’t see when using the “generativeai” Python SDK. The only thing that shows up in metrics is failed API calls, but NOT limit hits.

The way around this, I believe, would be to use the Vertex SDK directly, not the GenAI API.

5 Likes

I find this Google AI API vs Vertex AI API split a whole mess. The Google AI API won’t allow us to use PDFs in a prompt, even for gemini-1.5-pro, but we can do it using Vertex AI. Not sure why a company like Google has failed to develop a good API that’s easy to understand.

3 Likes

I’m looking at my quotas and I’m not even seeing anything for GenAI content generation even though the metrics show 400 requests.

2 Likes

Yesterday I used Gemini to translate an eBook. Now every request I send fails with the error 429 Resource has been exhausted (e.g. check quota), i.e. too many requests. I went to the Quotas page, but I can’t find any quota over 90%, so I don’t know which one needs to change.

3 Likes

Hi, I have the same problem. In my case I use Gemini 1.5 Flash at <= 50 requests per minute and I get the same error.

5 Likes

Exactly, I believe I am hitting the 10k quota per day with my paid account, but I can’t find where this quota is shown in GCP, I’ve even asked Gemini itself (which I would assume is trained to help with GCP) and it just points me in the wrong direction :grinning_face: .
I am coming from OpenAI/AWS which is much more simple to manage IMO, and I can’t tell if I am not very smart or if the ecosystem in GCP of quotas/projects/API keys/OAuth is as confusing as I feel it is!

7 Likes

I am struggling to understand the relationships and layout of GCP, AI Studio, and the product line of Vertex/Gemini/etc.

Why can I use an API key to make requests to base Gemini models, but a clunky OAuth is needed for managing and USING fine-tuned models?

Why am I getting “429 Resource has been exhausted (e.g. check quota)”… This is not an acceptable level of logging in my opinion, what quota, where is it managed, and what project/application/credential is the issue?

8 Likes

Honestly, this doesn’t seem like just their AI suite of APIs either. Trying to manage the YouTube Data API for example may be the most unintuitive and strangely documented experience I’ve had with such a major platform.

5 Likes

Yes, I agree with you. It’s really hard to understand the documentation for Gemini; it’s different from the ChatGPT models.

4 Likes

You can read about the quotas for Gemini here: https://cloud.google.com/vertex-ai/generative-ai/docs/quotas

If you go to the Quotas page and search for Gemini, you can find the quota and increase it. Make sure you do it for the right region.

I figured out that in europe-west1 the default is 10 requests per minute where in us-central1 it’s 300.

If you are looking for a specific model, you can type a filter such as: base_model:gemini-pro
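
To put the regional difference in numbers, here is a quick back-of-the-envelope calculation in Python. The 10 and 300 requests-per-minute figures are the defaults I saw, and the 10,000-request job size is only an illustrative assumption; `minutes_needed` is a made-up helper name.

```python
def minutes_needed(n_requests, requests_per_minute):
    """Lower bound on wall-clock minutes if the quota is the only limit."""
    return n_requests / requests_per_minute


# Default per-minute quotas observed per region (may differ for your project).
observed_defaults = {"europe-west1": 10, "us-central1": 300}

for region, rpm in observed_defaults.items():
    print(region, round(minutes_needed(10_000, rpm), 1), "min")
# europe-west1 1000.0 min
# us-central1 33.3 min
```

So the same job is roughly 17 hours in one region and about half an hour in the other, before any retries.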

I faced the same issue; in my case it seems to be some free-tier quota limit. The main problem is that the Quotas page won’t show that you’ve reached the limit. So what’s the point of having it there? Quite misleading.

2 Likes

any luck with this issue anyone?

1 Like

Their APIs, and in particular GenAI vs Vertex, are INSANELY confusing for ME, and I’m a well-seasoned beta tester of every AI product they put out. I participated in a one-on-one research call where I spent an hour telling them all about these kinds of issues, so they are well aware of it!

2 Likes

How do we resolve this issue?
We also integrated a 5-second delay mechanism, but we are still getting the error below:
DEBUG: Retrying due to 503 The model is overloaded. Please try again later., sleeping 0.3s …
2024-11-19 17:44:05,387 DEBUG: Retrying due to 503 The model is overloaded. Please try again later., sleeping 0.3s …

generate_text_manipulation_response
raise google.api_core.exceptions.ResourceExhausted("Quota exhausted after retries.")
google.api_core.exceptions.ResourceExhausted: 429 Quota exhausted after retries.
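
The log above shows retries sleeping only 0.3s each time, which may be too short when the model is overloaded. One thing that might help is exponential backoff with jitter, where the wait roughly doubles after every failure. This is a minimal sketch and not what any SDK does internally: `call_with_backoff` and `RateLimited` are hypothetical names, and `RateLimited` is just a stand-in so the example runs without the SDK. With the real client you would presumably pass something like `retry_on=(google.api_core.exceptions.ResourceExhausted,)` instead.

```python
import random
import time


class RateLimited(Exception):
    """Stand-in for the SDK's 429/503 errors (hypothetical name)."""


def call_with_backoff(fn, max_attempts=6, base_s=1.0, cap_s=60.0,
                      retry_on=(RateLimited,),
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on a retryable error wait up to base_s * 2**attempt seconds."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(cap_s, base_s * 2 ** attempt)
            # "Full jitter": sleep a random fraction of the delay so that
            # parallel workers do not all retry at the same instant.
            sleep(delay * rng())
```

Usage would be something like `call_with_backoff(lambda: model.generate_content(prompt), retry_on=...)`, where `model` is whatever client object you already have.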

Hi,

We received information that the issue is caused by resource problems due to a new way of handling quotas and requests. This will happen for version 002 (when many users try to use 002 at the same time in a specific region), but 001 will still work, as it uses the old way of handling quotas and requests. They are investigating it at the moment, as I understood it.

Other ways to solve it are to try another region (if pay-as-you-go) or to buy dedicated GSUs via Provisioned Throughput (for production environments).
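
The "try another region" idea could look roughly like this in Python. It is only a sketch under assumptions: `first_region_that_works` is a made-up helper, and `call_in_region` is a function you would write yourself that sends one request to the given region (with the Vertex AI SDK, presumably by calling `vertexai.init(project=..., location=region)` before creating the model).

```python
def first_region_that_works(regions, call_in_region, retry_on=(Exception,)):
    """Try each region in order; return (region, result) for the first success."""
    last_error = None
    for region in regions:
        try:
            return region, call_in_region(region)
        except retry_on as err:
            last_error = err  # remember the failure and move on
    # Every region failed: chain the last error for easier debugging.
    raise RuntimeError("all regions failed") from last_error
```

In practice you would narrow `retry_on` to the quota and overload errors only, so that real bugs are not silently retried in every region.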

Working in a pay-as-you-go account using the gemini-1.5-pro-002 model. The only quotas that show up in my Quotas page are for the 2 API services, Generate Content and Search Grounding, both far below the limit, yet I’m still getting 429 errors. The first time I got the error, I waited 24 hours and then it worked again, until later that day when it popped up again. I’m running it on ± 10,000 rows and only have 776 rows left, so it can’t be a per-minute rate issue or a daily limit issue, since I’ve waited another 24 hours and it’s still not working.

I was having this problem too, and found a solution: after your .generate_content(prompt) call, put a time.sleep(1) in there. I went from all but one request returning 429 to none. A 1 second pause might be a deal breaker for some, but should solve at least half the use cases.

1 Like
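
In case it helps, the fixed-pause approach from the post above looks roughly like this as a loop. It is just a sketch: `generate_all` is a made-up helper, and `generate` stands in for whatever makes the request, e.g. a function wrapping `.generate_content(prompt)`.

```python
import time


def generate_all(prompts, generate, pause_s=1.0, sleep=time.sleep):
    """Call generate(prompt) for each prompt, pausing after every call."""
    results = []
    for prompt in prompts:
        results.append(generate(prompt))
        sleep(pause_s)  # fixed pause after each request
    return results
```

With a 1 second pause, a single worker stays below 60 requests per minute, at the cost of one extra second per row.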