Hello Community,
I am currently using the Search Grounding tool and attempting to implement cost controls by setting a hard limit on my API usage.
The Goal: I want to limit my usage to approximately 5,000 requests per month. To achieve this, I calculated a daily limit (5,000 / 30 ≈ 166) and configured a quota of 166 requests per day in the Google Cloud Console.
The Issue: Despite setting this quota, my actual API usage consistently exceeds the 166-request daily limit.
- The API continues to return successful responses (HTTP 200) well past the limit.
- I am not receiving any HTTP 429 (Too Many Requests) or similar 4xx errors that would indicate the quota has been triggered.
- I can see the usage counter increasing live (with delay) in the console, so the tracking seems active, but the enforcement is missing.
My Question: Is the daily quota for Search Grounding treated as a “soft” limit or is there a significant latency in enforcement? Has anyone else successfully configured a hard stop for this specific API?
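For context, one stopgap that does not depend on the console quota would be a client-side counter that refuses to send a request once the daily budget is used up. The following is only a minimal sketch of that idea in Python, not part of any Google API: `DailyBudget` and `try_acquire` are hypothetical names, the counter lives in memory for a single process, and the 30-day month is an assumption.

```python
import datetime as dt

MONTHLY_BUDGET = 5000                # target from this post
DAILY_LIMIT = MONTHLY_BUDGET // 30   # 166, assuming a 30-day month


class DailyBudget:
    """Hypothetical client-side counter that refuses calls past a daily cap."""

    def __init__(self, limit, today=dt.date.today):
        self.limit = limit
        self._today = today          # injectable clock, useful for testing
        self._day = today()
        self._used = 0

    def try_acquire(self):
        """Count one request and return True, or return False if the cap is reached."""
        now = self._today()
        if now != self._day:         # a new day has started: reset the counter
            self._day = now
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


budget = DailyBudget(DAILY_LIMIT)
if budget.try_acquire():
    pass  # send the Search Grounding request here
else:
    pass  # skip the call or fall back to an ungrounded response
```

This only caps calls made through the guarded code path. With several workers or restarts, the counter would need to live in a shared store (a database row or a cache key, for example) rather than in memory.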
Any advice on how to strictly enforce this limit would be appreciated. Thanks!