insufficient nvidia.com/gpu with autopilot

anonymous · February 5, 2023, 9:24pm

Hi,

I am trying to run a pod with GPU support on Autopilot, but I am getting “Insufficient nvidia.com/gpu”. Can you help me understand what I am doing wrong?

This is the pod definition:


apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "registry.k8s.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1

And this is the error I get when I run kubectl describe pods:


Warning FailedScheduling 56s (x3 over 11m) gke.io/optimize-utilization-scheduler 0/2 nodes are available: 2 Insufficient cpu, 2 Insufficient memory, 2 Insufficient nvidia.com/gpu, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.

Can someone give me a hand?

thanks
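
For anyone reading the event above: the FailedScheduling message lists one reason per comma, each prefixed with the number of nodes it applies to. A minimal sketch, assuming that comma-separated layout (the function name `parse_failed_scheduling` is made up for illustration, and reasons that themselves contain commas would need more careful handling), splits it into per-reason counts:

```python
import re


def parse_failed_scheduling(message: str) -> dict[str, int]:
    """Return {reason: node_count} from a FailedScheduling event message."""
    # Ignore the trailing "preemption: ..." part, which repeats the node count.
    head = message.split(" preemption:")[0]
    # Keep only the reason list that follows "nodes are available:".
    _, _, reasons = head.partition("nodes are available:")
    counts: dict[str, int] = {}
    for part in reasons.split(","):
        # Each part looks like "<count> <reason>", optionally ending with a period.
        match = re.match(r"\s*(\d+)\s+(.*?)\.?\s*$", part)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


# The event text quoted in the post above.
MESSAGE = (
    "0/2 nodes are available: 2 Insufficient cpu, 2 Insufficient memory, "
    "2 Insufficient nvidia.com/gpu, 2 node(s) didn't match Pod's node "
    "affinity/selector. preemption: 0/2 nodes are available: "
    "2 Preemption is not helpful for scheduling."
)

if __name__ == "__main__":
    for reason, nodes in parse_failed_scheduling(MESSAGE).items():
        print(f"{nodes} node(s): {reason}")
```

Here every reason applies to both existing nodes, which fits a cluster that has no GPU node yet.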

garisingh · February 6, 2023, 12:09am

I believe it should fail at first, since there are no GPU nodes deployed or available yet. After a little while, you should see a message about triggering autoscaling. Did this not happen?

anonymous · February 6, 2023, 5:14pm

Hi,

Yeah, I saw the message about autoscaling, and that gave me the clue! I had not understood that, because of the autoscaling feature on Autopilot, I needed a quota of 2 GPUs, while I only had a quota of 1.

I was able to fix it by requesting a quota increase.

Thanks for your help
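
For readers who land here with the same symptom, a rough way to compare the regional GPU quota against what is needed is to read the `quotas` list that `gcloud compute regions describe REGION --format=json` prints. The sketch below is only an illustration under stated assumptions: the helper name `gpu_quota_headroom` is hypothetical, `NVIDIA_T4_GPUS` is assumed to be the quota metric for the T4 accelerator requested above, the sample JSON is invented to mirror the situation described (a limit of 1), and the required value of 2 is simply what the poster reported, not a general rule.

```python
import json


def gpu_quota_headroom(region_json: str, metric: str = "NVIDIA_T4_GPUS") -> float:
    """Return limit minus usage for one quota metric of a Compute Engine region."""
    region = json.loads(region_json)
    for quota in region.get("quotas", []):
        if quota.get("metric") == metric:
            return quota["limit"] - quota["usage"]
    raise KeyError(f"quota metric {metric!r} not found")


# Invented sample shaped like the JSON output of `gcloud compute regions describe`.
SAMPLE_OUTPUT = """
{
  "name": "us-central1",
  "quotas": [
    {"metric": "CPUS", "limit": 24.0, "usage": 4.0},
    {"metric": "NVIDIA_T4_GPUS", "limit": 1.0, "usage": 0.0}
  ]
}
"""

if __name__ == "__main__":
    needed = 2  # the value reported in this thread, treated here as an assumption
    headroom = gpu_quota_headroom(SAMPLE_OUTPUT)
    print(f"headroom={headroom}, enough={headroom >= needed}")
```

If the headroom is below what the workload needs, a quota increase request for that metric and region is the fix described above.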
