[Question] gemini-3-pro-image-preview 429 RESOURCE_EXHAUSTED — Inquiry on Dynamic Shared Quota Status & GA Release Timeline

Please excuse my limited English — I’ll do my best to describe the issue clearly.


Hi everyone,

We’re using gemini-3-pro-image-preview (Nano Banana Pro) in a production image generation pipeline via the Vertex AI API. Over the past week, we’ve been experiencing a significant spike in 429 RESOURCE_EXHAUSTED errors, concentrated between 15:00–18:00 KST (06:00–09:00 UTC).


Our Setup

  • Platform: Vertex AI API (not AI Studio)
  • Billing: Pay-as-you-go (on-demand), billing enabled, no custom RPM limits configured
  • Error window: KST 15:00–18:00, intermittent but increasingly frequent over the last 14 days
  • Pattern: Works fine outside that window; even within it, some requests succeed while others fail immediately
  • GCP Console quota dashboard: Project-level quota shows no consumption at the time of the 429s — no quota exhaustion on our end
  • API Location: global

Error response we receive:

{
  "error": {
    "code": 429,
    "message": "Resource has been exhausted (e.g. check quota).",
    "status": "RESOURCE_EXHAUSTED"
  }
}

Note that the error body contains no details field specifying which quota dimension was exceeded (e.g., no quota_metric or quota_limit fields), which makes it difficult to diagnose on our end.
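For reference, this is roughly how we probe the body for google.rpc.QuotaFailure details (the names ERROR_BODY and quota_violations are our own, just for illustration); on our responses it always comes back empty:

```python
import json

# The exact body we receive -- note the absent "details" array.
ERROR_BODY = """{
  "error": {
    "code": 429,
    "message": "Resource has been exhausted (e.g. check quota).",
    "status": "RESOURCE_EXHAUSTED"
  }
}"""

def quota_violations(body_json: str) -> list:
    """Return google.rpc.QuotaFailure violations from an error body, if any."""
    details = json.loads(body_json).get("error", {}).get("details", [])
    return [
        violation
        for detail in details
        if detail.get("@type", "").endswith("google.rpc.QuotaFailure")
        for violation in detail.get("violations", [])
    ]

print(quota_violations(ERROR_BODY))  # [] -- nothing tells us which dimension tripped
```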


What We’ve Already Tried

We implemented a retry strategy with fixed delays on 429 errors, retrying after 30 s, 50 s, and 70 s. Unfortunately, this made little to no practical difference during the affected time window: requests that fail tend to keep failing across all retry attempts, which suggests the issue isn’t a brief transient spike but sustained congestion on the shared pool throughout the peak period.
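In case it helps others, this is the jittered exponential backoff we’re moving to instead of fixed delays (a sketch; ResourceExhausted here is a stand-in for the SDK’s 429 exception, and the helper names are ours). It doesn’t fix sustained congestion either, but it at least keeps synchronized retries from piling onto the pool at the same instant:

```python
import random
import time

class ResourceExhausted(Exception):
    """Stand-in for the SDK's HTTP 429 exception (e.g. google.api_core's)."""

def backoff_delays(max_retries=5, base=2.0, cap=60.0, rng=random.random):
    """Full-jitter exponential backoff: uniform(0, min(cap, base * 2**attempt))."""
    return [min(cap, base * (2 ** attempt)) * rng() for attempt in range(max_retries)]

def call_with_backoff(fn, max_retries=5, sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on 429 with jittered backoff; re-raise on final failure."""
    for attempt, delay in enumerate(backoff_delays(max_retries, rng=rng)):
        try:
            return fn()
        except ResourceExhausted:
            if attempt == max_retries - 1:
                raise
            sleep(delay)
```

The `sleep` and `rng` parameters are injectable only so the schedule is easy to test; in production we leave the defaults.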


Our Understanding of the Root Cause

As a preview model, gemini-3-pro-image-preview runs on Dynamic Shared Quota — meaning 429s can occur due to global pool congestion regardless of individual project quota. Given that our project-level quota shows zero consumption when the errors occur, we believe this is a shared pool contention issue, not a per-project limit.

With that in mind, we have three questions:


Q1. Has the shared pool capacity for this model been reduced recently?

Our usage patterns haven’t changed, yet error frequency has risen sharply over the past week. We’re trying to understand whether Google has intentionally reduced or reallocated the global shared pool for this preview model, or whether this is simply a result of increased developer adoption driving higher competition on the shared pool.

If there have been any infrastructure changes on Google’s side, having that reflected in the changelog or an official notice would help us plan accordingly.

Q2. Is there a roadmap or estimated timeline for gemini-3-pro-image-preview to reach GA?

The Dynamic Shared Quota constraint is a fundamental limitation for production workloads with SLA requirements. Provisioned Throughput is currently too costly at our scale, so a GA release with stable per-project quotas would be the most practical path forward.

Any visibility into the GA roadmap — even a rough timeline — would be greatly appreciated.

Q3. Does submitting a quota increase request via GCP Console have any effect for Dynamic Shared Quota models?

Our understanding is that increasing per-project quota doesn’t resolve global pool contention inherent to preview models. If that’s correct, is there any mechanism available — short of Provisioned Throughput — to improve reliability during peak hours?
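One thing we’re experimenting with in the meantime (again just a sketch, with names of our own invention) is a client-side token bucket, so our own bursts don’t make peak-hour contention worse. It obviously can’t add shared-pool capacity, but spreading requests out reduces how many land in the failing window at once:

```python
import time

class TokenBucket:
    """Client-side limiter: allow `rate` requests/sec with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = float(rate)          # tokens refilled per second
        self.capacity = float(capacity)  # maximum burst size
        self.tokens = float(capacity)    # start full
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False (caller should wait) otherwise."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

We gate every generation call on try_acquire() and park the request in a queue when it returns False.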


Would love to hear from others facing the same issue, especially any workarounds you’ve found for peak-hour reliability. Many thanks in advance from Korea.