GKE Autopilot cluster can't scale up for a GPU pod

I created a GKE Autopilot cluster with default parameters:

gcloud container clusters create-auto test-autopilot-gpu --location=europe-west4 --project=wwl-ml
gcloud container clusters get-credentials test-autopilot-gpu --location=europe-west4 --project=wwl-ml

Then I tried to deploy a GPU pod there, as described in https://cloud.google.com/kubernetes-engine/docs/how-to/autopilot-gpus

apiVersion: v1
kind: Pod
metadata:
  name: my-gpu-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-a100
  containers:
  - name: my-gpu-container
    image: nvidia/cuda:11.0.3-runtime-ubuntu20.04
    command: ["/bin/bash", "-c", "--"]
    args: ["while true; do sleep 600; done;"]
    resources:
      limits:
        nvidia.com/gpu: 1

However, the pod remains in the Pending state indefinitely. Here is its description:

Name:         my-gpu-pod
Namespace:    default
Priority:     0
Node:         <none>
Labels:       <none>
Annotations:  autopilot.gke.io/resource-adjustment:
                {"input":{"containers":[{"limits":{"nvidia.com/gpu":"1"},"requests":{"nvidia.com/gpu":"1"},"name":"my-gpu-container"}]},"output":{"contain...
              autopilot.gke.io/warden-version: 2.7.52
              cloud.google.com/cluster_autoscaler_unhelpable_since: 2024-03-06T05:21:52+0000
              cloud.google.com/cluster_autoscaler_unhelpable_until: Inf
Status:       Pending
IP:           
IPs:          <none>
Containers:
  my-gpu-container:
    Image:      nvidia/cuda:11.0.3-runtime-ubuntu20.04
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/bash
      -c
      --
    Args:
      while true; do sleep 600; done;
    Limits:
      cpu:                9
      ephemeral-storage:  1Gi
      memory:             60Gi
      nvidia.com/gpu:     1
    Requests:
      cpu:                9
      ephemeral-storage:  1Gi
      memory:             60Gi
      nvidia.com/gpu:     1
    Environment:          <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x75h9 (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kube-api-access-x75h9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              cloud.google.com/gke-accelerator=nvidia-tesla-a100
                             cloud.google.com/gke-accelerator-count=1
Tolerations:                 cloud.google.com/gke-accelerator=nvidia-tesla-a100:NoSchedule
                             cloud.google.com/machine-family:NoSchedule op=Exists
                             kubernetes.io/arch=amd64:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             nvidia.com/gpu:NoSchedule op=Exists
Events:
  Type     Reason             Age                    From                                   Message
  ----     ------             ----                   ----                                   -------
  Warning  FailedScheduling   3m13s (x3 over 3m31s)  gke.io/optimize-utilization-scheduler  no nodes available to schedule pods
  Warning  FailedScheduling   2m54s                  gke.io/optimize-utilization-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
  Normal   TriggeredScaleUp   2m25s                  cluster-autoscaler                     pod triggered scale-up: [{https://www.googleapis.com/compute/v1/projects/wwl-ml/zones/europe-west4-a/instanceGroups/gk3-test-autopilot-gpu-nap-1y8i627v-a8bfd2df-grp 0->1 (max: 1000)}]
  Warning  FailedScaleUp      2m                     cluster-autoscaler                     Node scale up in zones europe-west4-a associated with this pod failed: GCE out of resources. Pod is at risk of not being scheduled.
  Normal   NotTriggerScaleUp  83s                    cluster-autoscaler                     pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) had untolerated taint {cloud.google.com/gke-quick-remove: true}, 18 node(s) didn't match Pod's node affinity/selector, 2 in backoff after failed scale-up
  Warning  FailedScheduling   82s                    gke.io/optimize-utilization-scheduler  0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..

Cloud Logging shows the following errors:

[
  {
    "insertId": "de8e1715-c796-4fbf-a79e-27d6e9d39fed@a1",
    "jsonPayload": {
      "noDecisionStatus": {
        "noScaleUp": {
          "unhandledPodGroups": [
            {
              "rejectedMigs": [
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-4",
                    "zone": "europe-west4-a",
                    "name": "gk3-test-autopilot-gpu-pool-4-5cf4acab-grp"
                  }
                },
                {
                  "mig": {
                    "nodepool": "pool-6",
                    "name": "gk3-test-autopilot-gpu-pool-6-3e8afa7d-grp",
                    "zone": "europe-west4-a"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-2",
                    "name": "gk3-test-autopilot-gpu-pool-2-806b23c7-grp",
                    "zone": "europe-west4-c"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-5-6f0ef260-grp",
                    "nodepool": "pool-5"
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-b",
                    "nodepool": "pool-3",
                    "name": "gk3-test-autopilot-gpu-pool-3-fb1922fe-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-1-a065df10-grp",
                    "nodepool": "pool-1"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-6-e5b063f1-grp",
                    "nodepool": "pool-6",
                    "zone": "europe-west4-c"
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-3-811400b8-grp",
                    "nodepool": "pool-3"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-1-b989b740-grp",
                    "zone": "europe-west4-b",
                    "nodepool": "pool-1"
                  },
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  }
                },
                {
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-2-fd71e860-grp",
                    "zone": "europe-west4-a",
                    "nodepool": "pool-2"
                  },
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "zone": "europe-west4-a",
                    "name": "gk3-test-autopilot-gpu-pool-5-fca82bbc-grp",
                    "nodepool": "pool-5"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-4-9a759708-grp",
                    "zone": "europe-west4-c",
                    "nodepool": "pool-4"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-6",
                    "zone": "europe-west4-b",
                    "name": "gk3-test-autopilot-gpu-pool-6-cbfebf7e-grp"
                  }
                },
                {
                  "mig": {
                    "nodepool": "pool-1",
                    "name": "gk3-test-autopilot-gpu-pool-1-0b03ec88-grp",
                    "zone": "europe-west4-a"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-default-pool-be071803-grp",
                    "zone": "europe-west4-a",
                    "nodepool": "default-pool"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "TaintToleration",
                      "node(s) had untolerated taint {cloud.google.com/gke-quick-remove: true}"
                    ]
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-b",
                    "nodepool": "pool-4",
                    "name": "gk3-test-autopilot-gpu-pool-4-2992e3f5-grp"
                  },
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "zone": "europe-west4-b",
                    "nodepool": "pool-5",
                    "name": "gk3-test-autopilot-gpu-pool-5-3e1c0e68-grp"
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-a",
                    "name": "gk3-test-autopilot-gpu-pool-3-b0feff4d-grp",
                    "nodepool": "pool-3"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-2-4c51e47d-grp",
                    "nodepool": "pool-2",
                    "zone": "europe-west4-b"
                  }
                }
              ],
              "napFailureReasons": [
                {
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config",
                  "parameters": [
                    "europe-west4-a"
                  ]
                },
                {
                  "parameters": [
                    "europe-west4-b"
                  ],
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config"
                },
                {
                  "parameters": [
                    "europe-west4-c"
                  ],
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config"
                }
              ],
              "podGroup": {
                "totalPodCount": 1,
                "samplePod": {
                  "name": "my-gpu-pod",
                  "namespace": "default"
                }
              }
            }
          ],
          "skippedMigs": [
            {
              "mig": {
                "nodepool": "nap-1y8i627v",
                "zone": "europe-west4-a",
                "name": "gk3-test-autopilot-gpu-nap-1y8i627v-a8bfd2df-grp"
              },
              "reason": {
                "messageId": "no.scale.up.mig.skipped",
                "parameters": [
                  "in backoff after failed scale-up"
                ]
              }
            },
            {
              "mig": {
                "nodepool": "nap-1y8i627v",
                "name": "gk3-test-autopilot-gpu-nap-1y8i627v-5ff53eff-grp",
                "zone": "europe-west4-b"
              },
              "reason": {
                "messageId": "no.scale.up.mig.skipped",
                "parameters": [
                  "in backoff after failed scale-up"
                ]
              }
            }
          ],
          "unhandledPodGroupsTotalCount": 1
        },
        "measureTime": "1709702512"
      }
    },
    "resource": {
      "type": "k8s_cluster",
      "labels": {
        "location": "europe-west4",
        "project_id": "wwl-ml",
        "cluster_name": "test-autopilot-gpu"
      }
    },
    "timestamp": "2024-03-06T05:21:52.968881760Z",
    "logName": "projects/wwl-ml/logs/container.googleapis.com%2Fcluster-autoscaler-visibility",
    "receiveTimestamp": "2024-03-06T05:21:53.323492805Z"
  },
  {
    "insertId": "3f2686b3-403d-49b9-945a-56bb8b5c7f53@a1",
    "jsonPayload": {
      "noDecisionStatus": {
        "noScaleUp": {
          "unhandledPodGroups": [
            {
              "napFailureReasons": [
                {
                  "messageId": "no.scale.up.nap.pod.zonal.resources.exceeded",
                  "parameters": [
                    "europe-west4-a"
                  ]
                },
                {
                  "messageId": "no.scale.up.nap.pod.zonal.resources.exceeded",
                  "parameters": [
                    "europe-west4-b"
                  ]
                }
              ],
              "podGroup": {
                "totalPodCount": 1,
                "samplePod": {
                  "namespace": "default",
                  "name": "my-gpu-pod"
                }
              }
            }
          ],
          "unhandledPodGroupsTotalCount": 1
        },
        "measureTime": "1709702555"
      }
    },
    "resource": {
      "type": "k8s_cluster",
      "labels": {
        "cluster_name": "test-autopilot-gpu",
        "project_id": "wwl-ml",
        "location": "europe-west4"
      }
    },
    "timestamp": "2024-03-06T05:22:35.463795917Z",
    "logName": "projects/wwl-ml/logs/container.googleapis.com%2Fcluster-autoscaler-visibility",
    "receiveTimestamp": "2024-03-06T05:22:35.936621027Z"
  },
  {
    "insertId": "7683fa55-47a1-4615-ae07-217dc9552d95@a1",
    "jsonPayload": {
      "noDecisionStatus": {
        "noScaleUp": {
          "unhandledPodGroupsTotalCount": 1,
          "unhandledPodGroups": [
            {
              "napFailureReasons": [
                {
                  "parameters": [
                    "europe-west4-a"
                  ],
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config"
                },
                {
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config",
                  "parameters": [
                    "europe-west4-b"
                  ]
                },
                {
                  "parameters": [
                    "europe-west4-c"
                  ],
                  "messageId": "no.scale.up.nap.pod.zonal.illegal.config"
                }
              ],
              "podGroup": {
                "samplePod": {
                  "namespace": "default",
                  "name": "my-gpu-pod"
                },
                "totalPodCount": 1
              },
              "rejectedMigs": [
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-1",
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-1-a065df10-grp"
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-5-6f0ef260-grp",
                    "nodepool": "pool-5"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-c",
                    "nodepool": "pool-4",
                    "name": "gk3-test-autopilot-gpu-pool-4-9a759708-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "mig": {
                    "nodepool": "default-pool",
                    "zone": "europe-west4-a",
                    "name": "gk3-test-autopilot-gpu-default-pool-be071803-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "TaintToleration",
                      "node(s) had untolerated taint {cloud.google.com/gke-quick-remove: true}"
                    ]
                  }
                },
                {
                  "mig": {
                    "nodepool": "pool-6",
                    "zone": "europe-west4-b",
                    "name": "gk3-test-autopilot-gpu-pool-6-cbfebf7e-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-6-3e8afa7d-grp",
                    "zone": "europe-west4-a",
                    "nodepool": "pool-6"
                  }
                },
                {
                  "mig": {
                    "nodepool": "pool-4",
                    "zone": "europe-west4-a",
                    "name": "gk3-test-autopilot-gpu-pool-4-5cf4acab-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-1-0b03ec88-grp",
                    "nodepool": "pool-1",
                    "zone": "europe-west4-a"
                  }
                },
                {
                  "mig": {
                    "zone": "europe-west4-c",
                    "nodepool": "pool-2",
                    "name": "gk3-test-autopilot-gpu-pool-2-806b23c7-grp"
                  },
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-3",
                    "name": "gk3-test-autopilot-gpu-pool-3-b0feff4d-grp",
                    "zone": "europe-west4-a"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "zone": "europe-west4-b",
                    "nodepool": "pool-2",
                    "name": "gk3-test-autopilot-gpu-pool-2-4c51e47d-grp"
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-2-fd71e860-grp",
                    "zone": "europe-west4-a",
                    "nodepool": "pool-2"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "nodepool": "pool-5",
                    "name": "gk3-test-autopilot-gpu-pool-5-3e1c0e68-grp",
                    "zone": "europe-west4-b"
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "zone": "europe-west4-b",
                    "name": "gk3-test-autopilot-gpu-pool-4-2992e3f5-grp",
                    "nodepool": "pool-4"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "zone": "europe-west4-c",
                    "nodepool": "pool-3",
                    "name": "gk3-test-autopilot-gpu-pool-3-811400b8-grp"
                  }
                },
                {
                  "reason": {
                    "messageId": "no.scale.up.mig.failing.predicate",
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ]
                  },
                  "mig": {
                    "nodepool": "pool-6",
                    "zone": "europe-west4-c",
                    "name": "gk3-test-autopilot-gpu-pool-6-e5b063f1-grp"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "name": "gk3-test-autopilot-gpu-pool-5-fca82bbc-grp",
                    "nodepool": "pool-5",
                    "zone": "europe-west4-a"
                  }
                },
                {
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  },
                  "mig": {
                    "zone": "europe-west4-b",
                    "nodepool": "pool-3",
                    "name": "gk3-test-autopilot-gpu-pool-3-fb1922fe-grp"
                  }
                },
                {
                  "mig": {
                    "nodepool": "pool-1",
                    "name": "gk3-test-autopilot-gpu-pool-1-b989b740-grp",
                    "zone": "europe-west4-b"
                  },
                  "reason": {
                    "parameters": [
                      "NodeAffinity",
                      "node(s) didn't match Pod's node affinity/selector"
                    ],
                    "messageId": "no.scale.up.mig.failing.predicate"
                  }
                }
              ]
            }
          ],
          "skippedMigs": [
            {
              "reason": {
                "parameters": [
                  "in backoff after failed scale-up"
                ],
                "messageId": "no.scale.up.mig.skipped"
              },
              "mig": {
                "zone": "europe-west4-b",
                "nodepool": "nap-1n277bso",
                "name": "gk3-test-autopilot-gpu-nap-1n277bso-88e2c650-grp"
              }
            },
            {
              "reason": {
                "messageId": "no.scale.up.mig.skipped",
                "parameters": [
                  "in backoff after failed scale-up"
                ]
              },
              "mig": {
                "zone": "europe-west4-a",
                "nodepool": "nap-1n277bso",
                "name": "gk3-test-autopilot-gpu-nap-1n277bso-139213a0-grp"
              }
            }
          ]
        },
        "measureTime": "1709702877"
      }
    },
    "resource": {
      "type": "k8s_cluster",
      "labels": {
        "project_id": "wwl-ml",
        "location": "europe-west4",
        "cluster_name": "test-autopilot-gpu"
      }
    },
    "timestamp": "2024-03-06T05:27:57.627433959Z",
    "logName": "projects/wwl-ml/logs/container.googleapis.com%2Fcluster-autoscaler-visibility",
    "receiveTimestamp": "2024-03-06T05:27:58.131601669Z"
  }
]
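
The entries above are long, so it can help to tally only the napFailureReasons per zone. The following is a minimal Python sketch, assuming the entries are available as a JSON array shaped like the one above. The names summarize_no_scale_up and make_entry are illustrative helpers, not part of any GKE tooling, and the sample is a trimmed stand-in for the three entries rather than the full log.

```python
from collections import Counter


def summarize_no_scale_up(entries):
    """Count NAP failure reasons per (messageId, zone) across log entries."""
    counts = Counter()
    for entry in entries:
        no_scale_up = (
            entry.get("jsonPayload", {})
            .get("noDecisionStatus", {})
            .get("noScaleUp", {})
        )
        for group in no_scale_up.get("unhandledPodGroups", []):
            for reason in group.get("napFailureReasons", []):
                # In the entries above, "parameters" holds the zone name.
                for zone in reason.get("parameters", []):
                    counts[(reason.get("messageId"), zone)] += 1
    return counts


def make_entry(message_id, zones):
    """Hypothetical helper: build a trimmed entry with the same nesting as above."""
    reasons = [{"messageId": message_id, "parameters": [z]} for z in zones]
    group = {"napFailureReasons": reasons}
    return {
        "jsonPayload": {
            "noDecisionStatus": {"noScaleUp": {"unhandledPodGroups": [group]}}
        }
    }


ILLEGAL = "no.scale.up.nap.pod.zonal.illegal.config"
EXCEEDED = "no.scale.up.nap.pod.zonal.resources.exceeded"

# Trimmed stand-in for the three log entries above, in the same order.
sample = [
    make_entry(ILLEGAL, ["europe-west4-a", "europe-west4-b", "europe-west4-c"]),
    make_entry(EXCEEDED, ["europe-west4-a", "europe-west4-b"]),
    make_entry(ILLEGAL, ["europe-west4-a", "europe-west4-b", "europe-west4-c"]),
]

for (message_id, zone), n in sorted(summarize_no_scale_up(sample).items()):
    print(f"{zone}  {message_id}  x{n}")
```

With a real export, the sample list would be replaced by json.load() on the exported file. Only two distinct messageId values show up in napFailureReasons here, which narrows down what to look up in the autoscaler documentation.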

I checked that the quota compute.googleapis.com/nvidia_a100_gpus is set to 1, and went through the other possible solutions described at https://cloud.google.com/kubernetes-engine/docs/troubleshooting/autopilot-clusters#scale-up-failed-serial-port-logging , but nothing worked. Could you point me to a solution for this problem?

Hello @TonyF,

Based on the information you provided, the issue most likely falls under one of two causes:

  1. Resource limitation: the resource quota may be too low for the nodes this pod needs. You may need to request additional quota before the cluster can scale up for the pod.
  2. Node pool configuration: review the node pool configuration for any setting that prevents the scale-up.
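
For the first option, the remaining quota can be checked from the regional quota list. The following is a minimal Python sketch, assuming the output of `gcloud compute regions describe europe-west4 --project=wwl-ml --format=json` has been loaded into a dict with a "quotas" list of metric/limit/usage objects. The function name quota_headroom and the numbers in the sample are hypothetical, and the metric names NVIDIA_A100_GPUS and CPUS are assumptions about how the quotas are labeled in that output.

```python
def quota_headroom(region, metric):
    """Return limit minus usage for a quota metric, or None if it is absent."""
    for quota in region.get("quotas", []):
        if quota.get("metric") == metric:
            return quota["limit"] - quota["usage"]
    return None


# Hypothetical numbers, shaped like the "quotas" list in the describe output.
region = {
    "quotas": [
        {"metric": "NVIDIA_A100_GPUS", "limit": 1.0, "usage": 0.0},
        {"metric": "CPUS", "limit": 24.0, "usage": 2.0},
    ]
}

for metric in ("NVIDIA_A100_GPUS", "CPUS"):
    print(metric, quota_headroom(region, metric))
```

Positive headroom only rules out quota as the blocker. The FailedScaleUp event in the question reports "GCE out of resources" for europe-west4-a, which suggests the zone may also have lacked capacity at that time, independent of quota.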