Trouble allocating GPUs to GKE cluster

Hi,

I’m unable to allocate GPU resources for a Pod on a GKE cluster because the cluster autoscaler is failing to scale up the Nodes required to run the Pod. When I inspect the Pod, GKE reports that I’m exceeding a GCE quota, so the node never comes up and the Pod stays unscheduled. However, the event doesn’t say which resource I exceeded, and when I go to the Quotas page, I don’t see any resource over its limit.

I’m not sure if there is a bug in my setup, or if Google Cloud has run out of GPUs in my cluster’s region. I tried switching regions from us-west4 to us-west1, but I saw the same error.

Could someone please help point me in the right direction?

Below is the output from kubectl describe pod:

Below is my manifest file:

Depending on your account, it’s possible that you don’t have quota to use any GPUs.
You should check the quota for “GPUs (all regions)”.

My personal project:

One of my work projects:
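
If it’s easier than hunting through the console, you can also pull the project-wide number from the CLI. A rough sketch (substitute your own project ID; the grep just trims the YAML output around the GPUS_ALL_REGIONS metric):

    gcloud compute project-info describe --project MY_PROJECT_ID \
        | grep -B 1 -A 1 GPUS_ALL_REGIONS

The limit/usage pair it prints is the same “GPUs (all regions)” quota shown on the Quotas page.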

Thanks @garisingh. I’ve checked my quotas via the Quotas & System Limits page, and it says that I have 2 GPUs available.

I’m only requesting one GPU when deploying my Pod, yet it says I’m exceeding a GCE quota limit and the Pod fails to start. The Pod runs perfectly (and instantly) when I remove the GPU request from my manifest file.
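
For anyone comparing notes, the request itself is nothing exotic; the minimal shape of a single-T4 request on GKE (per the GKE docs) is roughly the following. The Pod name, container name, image, and command here are placeholders, not my exact file, and the nodeSelector depends on how your node pools are set up:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-test                                         # placeholder name
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-t4    # ask for a node with a T4 attached
      containers:
      - name: cuda-check
        image: nvidia/cuda:11.0.3-base-ubuntu20.04           # placeholder; any image works for the scheduling test
        command: ["sleep", "infinity"]                       # keep the container alive
        resources:
          limits:
            nvidia.com/gpu: 1                                 # the single GPU being requested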

Any other ideas?

Quick rant: getting GPU-accelerated Pods up and running on GKE has been a nightmare because of this issue… I’ve lost so many hours. I’ve carefully followed all of the documentation and double-checked my quotas, but the autoscaler seems completely unreliable, and neither the console nor the Pod events give anything useful for identifying which resource is being exceeded or what else to try.

What GPU type are you trying to use?

Sorry, I see that it’s a T4. Can you check your T4 GPU quota (not the “GPUs (all regions)” quota) to see whether you have quota for T4s in that region?
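
If you want to check it from the CLI, something along these lines should show the regional metric (the grep just trims the YAML around the NVIDIA_T4_GPUS entry):

    gcloud compute regions describe us-west1 \
        | grep -B 1 -A 1 NVIDIA_T4_GPUS

If the limit there is 0, the per-model quota is what’s blocking the scale-up even though “GPUs (all regions)” looks fine.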
