Hi,
I am trying to run a pod with GPU support, but it fails to schedule with “Insufficient nvidia.com/gpu”. Can you help me understand what I am doing wrong?
This is the pod definition:
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "registry.k8s.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1
And this is the error I get when I run kubectl describe pods:
Warning FailedScheduling 56s (x3 over 11m) gke.io/optimize-utilization-scheduler 0/2 nodes are available: 2 Insufficient cpu, 2 Insufficient memory, 2 Insufficient nvidia.com/gpu, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
Can someone give me a hand?
Thanks