We are using gemini-3-pro-image-preview for a production project (target: 20 RPM). We frequently hit 429 Resource Exhausted errors, but Provisioned Throughput is currently too expensive for our scale.
- What is the ETA for this model to move from Preview to GA (Stable)?
- Besides Provisioned Throughput, what is the best way to stabilize 20 RPM?
- Will the stable version be launched in all regions simultaneously?
You can add retry logic with exponential backoff to your API calls. If you're using the Gen AI SDK, try:
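from google import genai
from google.genai import types

# PROJECT_ID is your Google Cloud project ID.
client = genai.Client(
    vertexai=True,
    project=PROJECT_ID,
    location="global",
    http_options=types.HttpOptions(
        timeout=60 * 1000,  # milliseconds, i.e. 60 seconds
        retry_options=types.HttpRetryOptions(
            attempts=3,         # total attempts, including the first call
            initial_delay=1.0,  # seconds before the first retry
            http_status_codes=[408, 429, 500, 502, 503, 504],
        ),
    ),
)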
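If you are not using the SDK's built-in retries, or want to see roughly what such a policy does, here is a minimal standalone sketch of exponential backoff with jitter. It is an illustration under stated assumptions, not the SDK's actual implementation: RetryableError, backoff_delays and call_with_backoff are hypothetical names, and the status code set simply mirrors the list in the config above.

```python
import random
import time

# Status codes treated as transient; mirrors the list in the SDK config above.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


class RetryableError(Exception):
    """Hypothetical stand-in for an HTTP error carrying a status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def backoff_delays(attempts, initial_delay=1.0, exp_base=2.0, max_delay=60.0):
    """Upper bounds (seconds) for the waits between consecutive attempts."""
    return [min(initial_delay * exp_base**i, max_delay) for i in range(attempts - 1)]


def call_with_backoff(fn, attempts=3, initial_delay=1.0, exp_base=2.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with jittered exponential backoff."""
    delays = backoff_delays(attempts, initial_delay, exp_base, max_delay)
    for attempt in range(attempts):
        try:
            return fn()
        except RetryableError as err:
            last_attempt = attempt == attempts - 1
            if err.status_code not in RETRYABLE_STATUS or last_attempt:
                raise
            # Full jitter: wait a random time up to the exponential bound so
            # that concurrent clients do not all retry at the same moment.
            sleep(random.uniform(0, delays[attempt]))
```

You would wrap your own request function, for example call_with_backoff(lambda: send_request(prompt)), where send_request is whatever function makes the call and raises RetryableError on a 429. The jitter matters when several workers share one quota: without it they tend to retry in lockstep and hit the limit again together.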
Yes, Provisioned Throughput remains the best option for stable throughput.
Stable versions typically launch with regional support that expands gradually, not in all regions at once.