Bug - Online prediction requests per base model per minute per region per base_model set to 0!

For the Claude 3.5 Sonnet model, the Online prediction requests per base model per minute per region per base_model quota is set to 0, and I have no way to ask for an increase. Is this a bug?

Hi @DrinkBeer,

Welcome to Google Cloud Community!

The “Online prediction requests per base model per minute per region per base_model” setting being set to 0 for the Claude 3.5 Sonnet model likely indicates that you have reached or are approaching your quota limit for that specific model. Quotas are in place to ensure fair usage and to prevent any single user from overloading the system.

To confirm whether you have reached the quota limit assigned to your project, open the Google Cloud Console, click “IAM & Admin” in the left-hand navigation panel, and then select “Quotas & System Limits.” Use the Filter search box to find your quota.
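
If you would rather script that lookup than use the Filter box, here is a minimal sketch of the filtering step in Python. It assumes you already have the quota records as a list of dictionaries, with field names (`metric`, `isFixed`, `dimensionsInfos`, `details.value`) modeled on the JSON that the Cloud Quotas API returns. The metric name, region, and `base_model` value in the sample data are hypothetical placeholders, so substitute whatever your project actually shows.

```python
from typing import Any


def find_quota_entries(
    quota_infos: list[dict[str, Any]],
    metric_substring: str,
    base_model: str,
) -> list[tuple[dict[str, str], int, bool]]:
    """Return (dimensions, limit, is_fixed) for each matching quota entry."""
    matches = []
    for info in quota_infos:
        # Skip quotas whose metric name does not contain the search text.
        if metric_substring not in info.get("metric", ""):
            continue
        for dim_info in info.get("dimensionsInfos", []):
            dims = dim_info.get("dimensions", {})
            if dims.get("base_model") != base_model:
                continue
            # 64-bit integers are usually serialized as strings in JSON.
            limit = int(dim_info.get("details", {}).get("value", 0))
            matches.append((dims, limit, bool(info.get("isFixed", False))))
    return matches


# Hypothetical sample data (assumed shape, not real project output).
sample = [
    {
        "metric": "aiplatform.googleapis.com/online_prediction_requests_per_base_model",
        "isFixed": False,
        "dimensionsInfos": [
            {
                "dimensions": {"region": "us-east5", "base_model": "anthropic-claude-3-5-sonnet"},
                "details": {"value": "0"},
            },
            {
                "dimensions": {"region": "us-east5", "base_model": "other-model"},
                "details": {"value": "60"},
            },
        ],
    },
]

for dims, limit, is_fixed in find_quota_entries(
    sample, "online_prediction_requests_per_base_model", "anthropic-claude-3-5-sonnet"
):
    status = "fixed" if is_fixed else "adjustable"
    print(f"{dims['region']}: limit={limit} ({status})")
```

A limit of 0 together with a "fixed" status would suggest the quota cannot be raised from the console, which is worth knowing before you file an increase request.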

If you want to increase any of your quotas, you can use the Google Cloud Console to request a quota increase. You may follow the steps in this documentation. Keep in mind that these requests are subject to review and approval and may take some time to process. Additionally, quota increase requests are typically evaluated based on the validity of the business case provided.

If the issue persists, I suggest contacting Google Cloud Support as they can provide more insights to see if the behavior you’ve encountered is a known issue or specific to your project.

I hope the above information is helpful.

The problem is getting hold of Google Cloud Support. Cloud Support won’t accept free accounts, as there is no way to report any problems to them without paying for support, and I am not going to pay for support for something that should be provided for free, especially if it is a bug. It is just one big circle. No one at Google will deal with you unless you pay a huge amount of money. So, unless you have a way around that, this answer is useless.

I have been experiencing the same issue for a week or so. Support has not been helpful at all either. This is concerning to me, as this API is GA and we rely on it for our production load.

You can request a quota increase here: https://console.cloud.google.com/iam-admin/quotas

This way you don’t need a support contract, but you do need to have billing set up, and they can simply decline the request if you don’t have enough billing history.

It is not adjustable. Why provide a model if you can’t use it???