Hi everyone,
I’m observing an unexpected cost spike with Gemini Batch API image generation on Vertex AI.
After investigating, it seems that in the batch output predictions.jsonl we sometimes get many responses for the same request key (20+ lines with an identical key), each containing a different generated image.
I’m not using any special generation config besides response_modalities=["IMAGE"].
(Note: I don’t set candidateCount; according to the docs, it should default to 1 when unset.)
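For context, this is roughly the shape of one line in my input JSONL — a sketch only: the key, prompt, and exact field names are placeholders based on the batch docs, not my actual payload. It also shows candidate_count pinned explicitly, which is the workaround I’m considering instead of relying on the default:

```python
import json

# Hypothetical single request line in the {"key": ..., "request": ...} JSONL shape.
# "request-001" and the prompt text are placeholders for illustration.
line = {
    "key": "request-001",
    "request": {
        "contents": [{"role": "user", "parts": [{"text": "A watercolor fox"}]}],
        "generation_config": {
            "response_modalities": ["IMAGE"],
            "candidate_count": 1,  # pinned explicitly rather than left to the default
        },
    },
}

with open("requests.jsonl", "w") as f:
    f.write(json.dumps(line) + "\n")
```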
Batch job creation:
from pathlib import Path
from google.genai.types import CreateBatchJobConfig

batch_job = client.batches.create(
    model=model_name,
    src=jsonl_uri,
    config=CreateBatchJobConfig(dest=f"{bucket}/path/to/results/{Path(jsonl_uri).stem}"),
)
Observed Issue
In the batch output written to dest, we now often get multiple output lines with the exact same request key, each containing a different generated image response.
This results in extra generations and extra billing, even though we only requested one output per key.
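For anyone who wants to check their own output for this, here is roughly how I counted the duplicates — a sketch that assumes each output line is JSON carrying the original key in a "key" field (adjust the field name to match your output schema):

```python
import json
from collections import Counter

def count_duplicate_keys(path: str) -> dict[str, int]:
    """Return {key: occurrences} for keys appearing more than once in a JSONL file."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            if line.strip():  # skip blank lines
                counts[json.loads(line)["key"]] += 1
    return {k: n for k, n in counts.items() if n > 1}
```

Running this over our predictions.jsonl is what surfaced keys with 20+ lines each.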
Is this a known issue / incident?