Hi Community,
I am blocked trying to get a response from gemini-1.5-flash or gemini-1.5-pro.
I am making a simple call, but I keep getting an intermittent error (sometimes the call works, most of the time it doesn’t).
The error is “google.api_core.exceptions.ResourceExhausted: 429 Resource exhausted. Please try again later.”
I tried:
- checking my quotas & limits in the console: everything is in the green, with less than 20% usage
- changing region
- checking my IAM policy
- checking my API service details:

| Methods | Requests | Errors |
| --- | --- | --- |
| google.cloud.aiplatform.ui.JobService.ListDataLabelingJobs | 9 | 100% |
| google.cloud.aiplatform.v1.PredictionService.GenerateContent | 144 | 24.31% |
| google.cloud.aiplatform.v1.PredictionService.StreamGenerateContent | 2 | 50% |
| google.cloud.aiplatform.v1beta1.GenAiCacheService.GetCachedContent | 4 | 100% |

The code is very simple:
import time
from typing import Optional

import vertexai
from vertexai.generative_models import GenerativeModel, Part

# config, generation_config, safety_settings, log and extract_filename
# are defined elsewhere in the project.


def fetch_and_save_raw_output(
    prompt: str, uri: str, system_prompt: str
) -> Optional[str]:
    vertexai.init(project=config["project_id"], location=config["location"])
    model = GenerativeModel(config["model_name"], system_instruction=[system_prompt])
    document = Part.from_uri(mime_type="application/pdf", uri=uri)
    try:
        response = model.generate_content(
            [prompt, document],
            generation_config=generation_config,
            safety_settings=safety_settings,
            stream=False,
        )
        # finish_reason is set on each candidate, not on the response itself.
        if response.candidates:
            finish_reason = response.candidates[0].finish_reason.name
            if finish_reason == "MAX_TOKENS":
                log.warning(f"Response truncated due to MAX_TOKENS for {uri}")
            elif finish_reason != "STOP":
                log.warning(f"Unexpected finish reason: {finish_reason} for {uri}")
        # Build the raw response with metadata (printed here, not written to disk)
        pdf_name = extract_filename(uri)
        output_dict = {
            "metadata": {"uri": uri, "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")},
            "response": response.to_dict(),
        }
        print(f"output_dict: {output_dict}")
        return pdf_name
    except Exception as e:
        log.exception(f"Error fetching and saving raw output for {uri}: {e}")
        return None
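
Since the error message says to try again later, one workaround is to wrap the call in a retry with exponential backoff. Below is a minimal sketch of that idea, not something taken from the Vertex AI SDK: `call_with_backoff` and all of its parameters are hypothetical names, and the default delays are arbitrary assumptions rather than recommended values.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,      # assumption: arbitrary default
    base_delay: float = 1.0,    # assumption: seconds before the first retry (upper bound)
    max_delay: float = 60.0,    # assumption: cap on any single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exception types with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

In the function above, this would mean passing `lambda: model.generate_content(...)` as `fn` and `(ResourceExhausted,)` from `google.api_core.exceptions` as `retry_on`. Retrying only hides the symptom, though, so it does not explain why the 429 appears while the quota page shows low usage.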