Until Saturday (Aug 30, 2025) my GKE cluster was running fine with GPU time-sharing enabled: I could schedule 2 pods per NVIDIA L4 GPU node (using maxSharedClientsPerGpu=2). After the cluster auto-upgraded to v1.33.3-gke.1136000, every GPU node advertises only Allocatable: 1 for the GPU resource, and I can no longer co-schedule multiple pods per GPU.
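For reference, here is a minimal sketch of how the time-shared pool is set up and how I check what the nodes advertise. The cluster, pool, zone and machine type below are placeholders, not my real values:

```
# Roughly how the time-shared L4 pool was created; with
# max-shared-clients-per-gpu=2 each node should advertise
# Allocatable: nvidia.com/gpu: 2 (count x clients per GPU).
gcloud container node-pools create l4-timeshare-pool \
  --cluster=my-cluster --zone=us-central1-a \
  --machine-type=g2-standard-8 \
  --accelerator="type=nvidia-l4,count=1,gpu-sharing-strategy=time-sharing,max-shared-clients-per-gpu=2,gpu-driver-version=default"

# What each GPU node actually advertises after the upgrade.
kubectl get nodes -l cloud.google.com/gke-accelerator=nvidia-l4 \
  -o custom-columns='NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```

Since the upgrade, the second command shows 1 for every node instead of 2.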
Since the same time, I have also been getting events from cluster-autoscaler (which I never saw before) stating "max cluster nvidia-l4 limit reached", which doesn't make sense to me, considering the quota (32) is much higher than the number of GPUs actually running (1).
We seem to have the same problem!
Node pools with GPUs are not scaling up, with events saying: "2 max cluster nvidia-l4 limit reached, 2 not ready for scale-up, 6 max cluster nvidia-tesla-t4 limit reached".
Our project quota is at 3 / 32.
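For anyone comparing numbers, this is roughly how we check both places; as far as I understand, the "max cluster ... limit reached" events refer to the cluster autoscaler's own resource limits (the ones configured for node auto-provisioning), not the Compute Engine quota, so both are worth a look. Region, zone and cluster name are placeholders:

```
# Compute Engine quota and usage for L4 GPUs in the region.
gcloud compute regions describe us-central1 --format=json \
  | grep -B1 -A1 "NVIDIA_L4_GPUS"

# Cluster-level autoscaling resource limits, which appear to be what the
# "max cluster nvidia-l4 limit reached" event is actually checking
# (only present if autoscaling resource limits / NAP are configured).
gcloud container clusters describe my-cluster --zone=us-central1-a \
  --format="yaml(autoscaling.resourceLimits)"
```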

I just spoke to support. The workaround they recommend is to either downgrade the node pool or create a new one on a version lower than 1.33.3.
They apparently have multiple clients hitting this issue, and engineering is working on a fix.
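In case it helps anyone else, here is a rough sketch of the second option (a new time-shared pool pinned below 1.33.3). Names, zone, machine type and the exact patch version are placeholders; pick a version from the valid node versions listed for your location:

```
# List node versions you can still pin a new pool to.
gcloud container get-server-config --zone=us-central1-a \
  --format="yaml(validNodeVersions)"

# Replacement time-shared L4 pool on a pre-1.33.3 version.
# --no-enable-autoupgrade keeps it from being bumped again, though this
# may not be allowed on clusters enrolled in a release channel.
gcloud container node-pools create l4-timeshare-132 \
  --cluster=my-cluster --zone=us-central1-a \
  --node-version=1.32.X-gke.XXXXX \
  --machine-type=g2-standard-8 \
  --accelerator="type=nvidia-l4,count=1,gpu-sharing-strategy=time-sharing,max-shared-clients-per-gpu=2,gpu-driver-version=default" \
  --no-enable-autoupgrade
```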