I’m having similar issues and still haven’t found a fix.
Processing the requests one by one works fine:
client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt,
    config={
        "response_mime_type": "application/json",
        "response_schema": RESPONSE_SCHEMA,
    },
)
but when I process a file with batch predictions, the Gemini output breaks: some responses don’t even match the schema, and for some reason Gemini invokes the Grounding with Google Search tool.
# THIS IS HOW I PREPARE THE FILE
import json

with open("input.jsonl", "w", encoding="utf-8") as f:
    for judgment in judgments:
        judgment_dict = {
            "id": judgment["id"],
            "request": {
                "contents": [
                    {
                        "role": "user",
                        "parts": [
                            {"text": prompt.create_prompt(judgment["text_content"])}
                        ],
                    }
                ],
                # REST-style request, so the keys are camelCase
                "generationConfig": {
                    "responseMimeType": "application/json",
                    "responseSchema": JudgmentAnalysisPrompt.RESPONSE_SCHEMA,
                },
            },
        }
        f.write(json.dumps(judgment_dict, ensure_ascii=False) + "\n")
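Before uploading, a quick sanity check I run (a minimal sketch, assuming the file was written as above) is to read the first line back and confirm that parts is a list and the camelCase keys are present:

import json

with open("input.jsonl", encoding="utf-8") as f:
    first = json.loads(f.readline())

req = first["request"]
assert isinstance(req["contents"][0]["parts"], list)  # parts must be a list of part objects
assert "responseMimeType" in req["generationConfig"]  # camelCase keys in the REST payload
print(json.dumps(first, indent=2)[:500])              # eyeball the first request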
# THIS IS HOW I'M SENDING THE BATCH REQUEST
from google.genai.types import CreateBatchJobConfig

job = client.batches.create(
    model="gemini-2.0-flash-001",
    src="gs://.../test/input/input.jsonl",
    config=CreateBatchJobConfig(
        dest="gs://.../test/output/",
    ),
)
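To see whether the job actually finishes (and in what state), I poll it roughly like the google-genai batch examples do; this is a sketch, and the JOB_STATE_* strings are the SDK's JobState values, which may differ by version:

import time

# Poll until the batch job reaches a terminal state.
completed_states = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
while job.state not in completed_states:
    time.sleep(30)
    job = client.batches.get(name=job.name)  # refresh the job object
    print("batch job state:", job.state)

# The per-request results are written as JSONL files under the dest prefix.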
I can’t find any reasonable documentation for Vertex AI batch prediction usage.