GCP Batch unable to run jobs for us since 2025 Nov 11th

Since 2025 Nov 11th, our team’s Batch jobs always end up like this: the job stays in the SCHEDULED state for a long time with several VM instances already spun up, but no task is actually running. Later it fails with the following error:

This should not happen, because we only requested some c2-standard-30 or e2-highmem-16 VMs with the Standard provisioning model in us-central1, which should be very easy to satisfy. Could someone DM me and help us take a look? Thanks!

Hi! There are several ways to approach this problem, but from the information you provided I would recommend verifying your job configuration first. The gcloud CLI can list which machine types are offered in each zone, so you can check that your types exist in that region, e.g.:

gcloud compute machine-types list
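
To narrow that listing down to the machine types from the question, a filter along these lines may help. This is only a sketch: the helper names `machine_type_filter` and `list_zones_for_type` are made up for illustration, and the region and machine type are the ones mentioned in the question. Keep in mind that the listing shows where a machine type is offered, not whether capacity is free at that moment.

```shell
# Hypothetical helper: builds a gcloud filter expression that matches one
# machine type in every zone of a region (zone names start with "<region>-").
machine_type_filter() {
  printf 'name=%s AND zone~^%s-' "$1" "$2"
}

# Hypothetical helper: prints the zones of a region that offer the machine type.
# It reports where the type is offered, not whether capacity is free right now.
list_zones_for_type() {
  gcloud compute machine-types list \
    --filter="$(machine_type_filter "$1" "$2")" \
    --format="value(zone)"
}
```

For example, `list_zones_for_type c2-standard-30 us-central1` should print one zone per line; an empty result would suggest the type is not offered in that region at all.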

The machine type is not the only possible cause of a ZONE_RESOURCE_POOL_EXHAUSTED (ZRPE) error. Resource types that are subject to stockouts include:

  • Compute Resources (vCPUs and Memory):
    Specific VM families (e.g., N1, N2, N2D, E2, C2, C3, M1, M2, M3, A2, A3, G2, etc.).
    Specific VM shapes (e.g., n2-standard-64, c2-highmem-32).
    Minimum CPU platforms (e.g., requesting Intel Ice Lake or later).
    The total number of cores or amount of RAM requested.
  • Accelerators:
    Specific types and counts of GPUs (e.g., NVIDIA T4, V100, A100, H100).
    TPUs (Tensor Processing Units).
  • Storage:
    Local SSD: lack of available Local SSD capacity on machines that match the other VM requirements.
    Persistent Disk (PD): while this sometimes manifests as a PD_STOCKOUT, the inability to create the required PD (due to lack of cell-level capacity for HDD, SSD, or IOPS) can cause the VM creation to fail, sometimes still surfacing as a general ZRPE to the user.

Another good test is to temporarily create an instance (or a single-VM regional managed instance group) with the same spec; this would tell you whether the region can currently provide that configuration. From your description it seems it can (“several VM instances already spun up”).
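
A test like this could be sketched as follows. The function name `probe_vm_capacity`, the default instance name `capacity-probe`, and the `GCLOUD` override for dry runs are all assumptions made for illustration; the zone and machine type are the ones from the question. If your Batch job also requests specific disks or accelerators, you would need to add the matching flags for the probe to be representative. The VM is billed while it exists, so the sketch deletes it right after a successful create.

```shell
# Hypothetical helper: tries to create one VM with the job's spec, then deletes it.
# Set GCLOUD=echo for a dry run that only prints the commands.
probe_vm_capacity() {
  zone="$1"
  machine_type="$2"
  name="${3:-capacity-probe}"   # instance name is an assumption; use any unused name
  gcloud_bin="${GCLOUD:-gcloud}"

  if "$gcloud_bin" compute instances create "$name" \
       --zone="$zone" --machine-type="$machine_type"; then
    # Creation succeeded, so the zone could supply this shape; clean up right away.
    "$gcloud_bin" compute instances delete "$name" --zone="$zone" --quiet
  else
    echo "create failed in $zone; check the error for ZONE_RESOURCE_POOL_EXHAUSTED" >&2
    return 1
  fi
}
```

Usage might look like `probe_vm_capacity us-central1-a c2-standard-30`, repeated for each zone the job is allowed to use.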

Once you have ruled out these causes, I encourage you to raise a support ticket with the details of a recent failure.
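
To collect those details, something like the following could be attached to the ticket. It is a sketch under assumptions: `batch_job_events` is a made-up helper name, the `GCLOUD` override exists only to allow a dry run, and the projected fields (`status.state`, `status.statusEvents`) are based on my understanding of the Batch job resource, so adjust the format if your output differs.

```shell
# Hypothetical helper: prints the state and status events of one Batch job.
# Set GCLOUD=echo for a dry run that only prints the command.
batch_job_events() {
  "${GCLOUD:-gcloud}" batch jobs describe "$1" \
    --location="$2" \
    --format="yaml(name,status.state,status.statusEvents)"
}
```

Called as `batch_job_events JOB_NAME us-central1` (with your own job name), this should show the job's state transitions and the failure message with timestamps.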