I’m having similar issues and still haven’t found a fix.
Processing the requests one by one works fine:
client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt,
    config={
        "response_mime_type": "application/json",
        "response_schema": RESPONSE_SCHEMA,
    },
)
but when I process a file with batch predictions, the Gemini output breaks: some responses don’t even match the schema, and for some reason Gemini invokes the Grounding with Google Search tool.
# THIS IS HOW I PREPARE THE FILE
import json

with open("input.jsonl", "w", encoding="utf-8") as f:
    for judgment in judgments:
        judgment_dict = {
            "id": judgment["id"],
            "request": {
                "contents": [
                    {
                        "role": "user",
                        "parts": [
                            {"text": prompt.create_prompt(judgment["text_content"])}
                        ],
                    }
                ],
                # REST-style request, so the keys are camelCase
                "generationConfig": {
                    "responseMimeType": "application/json",
                    "responseSchema": JudgmentAnalysisPrompt.RESPONSE_SCHEMA,
                },
            },
        }
        f.write(json.dumps(judgment_dict, ensure_ascii=False) + "\n")
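Before uploading, a quick sanity check I run (a minimal sketch, assuming the file was written as above) is to read the first line back and confirm that parts is a list and the camelCase keys are present:

import json

with open("input.jsonl", encoding="utf-8") as f:
    first = json.loads(f.readline())

req = first["request"]
assert isinstance(req["contents"][0]["parts"], list)  # parts must be a list of part objects
assert "responseMimeType" in req["generationConfig"]  # camelCase keys in the REST payload
print(json.dumps(first, indent=2)[:500])              # eyeball the first request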
# THIS IS HOW I'M SENDING THE BATCH REQUEST
from google.genai.types import CreateBatchJobConfig

job = client.batches.create(
    model="gemini-2.0-flash-001",
    src="gs://.../test/input/input.jsonl",
    config=CreateBatchJobConfig(
        dest="gs://.../test/output/",
    ),
)
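To see whether the job actually finishes (and in what state), I poll it roughly like the google-genai batch examples do; this is a sketch, and the JOB_STATE_* strings are the SDK's JobState values, which may differ by version:

import time

# Poll until the batch job reaches a terminal state.
completed_states = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
while job.state not in completed_states:
    time.sleep(30)
    job = client.batches.get(name=job.name)  # refresh the job object
    print("batch job state:", job.state)

# The per-request results are written as JSONL files under the dest prefix.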
I can’t find any reasonable documentation for Vertex AI batch prediction usage.