textembedding-gecko Quota

stars · August 26, 2023, 2:54am

I keep receiving 429 Quota exceeded errors when trying to create embeddings. Looking at my quotas, I can see “online_prediction_requests_per_base_model” is limited to 5/minute.

This seems to contradict this page, which suggests the limit should be 600. https://cloud.google.com/vertex-ai/docs/quotas

Is there a reason why I cannot receive a higher quota?

Many thanks

lsolatorio · September 1, 2023, 3:14pm

Hi @stars ,

Welcome and thank you for reaching out to our community.

I understand that our documentation can be confusing at times but let me help you get a better picture of our quotas and limits.

The base_model:textembedding-gecko quota is indeed 600 requests per minute, but each request is limited to 5 input texts. This means that you can send a maximum of 600 requests per minute, with a maximum of 5 input texts in each request, as shown in the screenshot that you provided.

Please note that you can also reach out to Vertex AI Support to discuss this in more detail.
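
For readers who want to stay within the limits described in this reply (600 requests per minute, 5 input texts per request), here is a minimal Python sketch of client-side batching and pacing. It is not an official client: `embed_all`, `chunked`, and the `embed_batch` callable are hypothetical names introduced for illustration, and `embed_batch` stands in for whatever function actually calls the embedding model and returns one vector per input text.

```python
import time
from typing import Callable, Iterator, List, Sequence

# Limits as quoted in the reply above; adjust if your project's quota differs.
MAX_TEXTS_PER_REQUEST = 5
MAX_REQUESTS_PER_MINUTE = 600


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],  # hypothetical: your API call
    batch_size: int = MAX_TEXTS_PER_REQUEST,
    requests_per_minute: int = MAX_REQUESTS_PER_MINUTE,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[float]]:
    """Embed `texts` in batches, spacing requests to respect a per-minute quota."""
    min_interval = 60.0 / requests_per_minute  # seconds between request starts
    vectors: List[List[float]] = []
    last_call = None
    for batch in chunked(texts, batch_size):
        if last_call is not None:
            # Wait out whatever remains of the minimum interval since the last request.
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = time.monotonic()
        vectors.extend(embed_batch(batch))
    return vectors
```

With the Vertex AI Python SDK, `embed_batch` could plausibly be a thin wrapper around the embedding model's `get_embeddings` call, but that wiring is an assumption and is left out here. Even pacing of this kind reduces the chance of a 429, but it does not rule one out if other clients share the same project quota.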

stars · September 8, 2023, 3:29pm

Thank you for the clarification!

glaforge · January 26, 2024, 7:39pm

I’ve just discovered that the new limit now seems to be 250 input texts per request, compared to 5 before.
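
To put the change in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes the 600 requests per minute figure mentioned earlier in this thread still applies and that no other limit is hit first; `min_requests` and `min_minutes` are hypothetical helper names, and the 100,000-text workload is only an illustration.

```python
def min_requests(num_texts: int, texts_per_request: int) -> int:
    """Smallest number of requests needed to send `num_texts` inputs."""
    # Ceiling division using integers only.
    return -(-num_texts // texts_per_request)


def min_minutes(num_texts: int, texts_per_request: int,
                requests_per_minute: int = 600) -> float:
    """Lower bound on wall-clock minutes at a fixed per-minute request quota."""
    return min_requests(num_texts, texts_per_request) / requests_per_minute


num_texts = 100_000  # illustrative workload size, not from the thread
for limit in (5, 250):
    print(limit, min_requests(num_texts, limit), round(min_minutes(num_texts, limit), 2))
```

Under these assumptions, 100,000 texts need 20,000 requests at 5 texts per request but only 400 requests at 250 texts per request.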
