Environment:
- Model: gemini-3-pro-preview
- Endpoint: Vertex AI REST API (global endpoint)
- API: generateContent (also tried streamGenerateContent)
Issue:
Gemini 3 Pro Preview completes its thinking phase but returns zero text parts in the response: the model thinks successfully but then produces no output text.
Token usage from our logs:
prompt=3688, candidates=2945, thoughts=1737, total=8370
Extracted 0 text parts, 1 thinking parts, 0 empty parts
finishReason: STOP
The model consumes ~2,900 candidate tokens and ~1,700 thinking tokens, yet when we parse candidates[].content.parts[], we find only parts flagged thought: true; no text output follows the thinking.
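To make the symptom concrete, here is a minimal sketch of the part-classification logic described above (the response shape follows the generateContent REST schema; the sample response dict below is illustrative, not a captured API payload):

```python
def classify_parts(response: dict) -> dict:
    """Count text, thinking, and empty parts across all candidates."""
    counts = {"text": 0, "thinking": 0, "empty": 0}
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if part.get("thought"):
                counts["thinking"] += 1
            elif part.get("text"):
                counts["text"] += 1
            else:
                counts["empty"] += 1
    return counts

# Illustrative response mirroring what we observe: only a thought part.
response = {
    "candidates": [{
        "content": {"parts": [{"thought": True, "text": "...reasoning..."}]},
        "finishReason": "STOP",
    }]
}
print(classify_parts(response))  # {'text': 0, 'thinking': 1, 'empty': 0}
```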
Request configuration:
{
  "contents": [{"role": "user", "parts": [{"text": "…"}]}],
  "generationConfig": {
    "temperature": 1.0,
    "maxOutputTokens": 65536,
    "thinkingConfig": { "thinkingLevel": "high" }
  },
  "tools": [{ "googleSearch": {} }]
}
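For reference, a minimal sketch of how we assemble this request for the global Vertex AI endpoint (PROJECT_ID and the prompt text are placeholders, and auth — a bearer token from gcloud/ADC — is omitted for brevity):

```python
import json

PROJECT_ID = "my-project"          # placeholder, not our real project
MODEL_ID = "gemini-3-pro-preview"
URL = (
    "https://aiplatform.googleapis.com/v1/"
    f"projects/{PROJECT_ID}/locations/global/"
    f"publishers/google/models/{MODEL_ID}:generateContent"
)

body = {
    "contents": [{"role": "user", "parts": [{"text": "placeholder prompt"}]}],
    "generationConfig": {
        "temperature": 1.0,
        "maxOutputTokens": 65536,
        "thinkingConfig": {"thinkingLevel": "high"},
    },
    "tools": [{"googleSearch": {}}],
}

payload = json.dumps(body)  # POST this to URL with a Bearer token header
print(URL)
```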
What we’ve tried:
- thinkingLevel: "high" and thinkingLevel: "low" - same result
- Streaming vs non-streaming endpoints - same result
- With and without responseMimeType: "application/json" - same result
- Tested across 5 different GCP projects - same result
Expected behavior:
Model thinks, then returns text content in parts[].text.
Actual behavior:
Model thinks (finishReason: STOP), but every returned part is flagged thought: true; zero parts contain actual output text. The response is effectively empty.
Questions:
- Is this a known issue with Gemini 3 Pro Preview on Vertex AI?
- Is there a parameter we’re missing to ensure text output is generated after thinking?
- Any workarounds while this is investigated?
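For anyone else hitting this, a purely client-side stopgap sketch (detect the empty response and retry; all names here are illustrative, and call_model is a placeholder for whatever function issues the actual request):

```python
def extract_text(response: dict):
    """Concatenate non-thought text parts; return None if there are none."""
    chunks = []
    for cand in response.get("candidates", []):
        for part in cand.get("content", {}).get("parts", []):
            if part.get("text") and not part.get("thought"):
                chunks.append(part["text"])
    return "".join(chunks) or None

def generate_with_retry(call_model, max_attempts: int = 3) -> str:
    """Retry when the model returns only thought parts and no text."""
    for _ in range(max_attempts):
        text = extract_text(call_model())
        if text is not None:
            return text
    raise RuntimeError("model returned only thought parts on every attempt")
```

This obviously burns extra thinking tokens on every retry, so it is only a stopgap until the underlying issue is addressed.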
Thanks for any help!