Ideally, it should spin up a node within minutes, but we often see that even after half an hour no new node has been added. When describing the pod, I see this event:
**pod didn't trigger scale-up (it wouldn't fit if a new node is added): 24 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {cloud.google.com/gke-quick-remove: true}**
My question is: what am I missing here? Is there something we can do to reliably and quickly scale up nodes to run our workloads?
Also, when deploying a pod, I see that GKE Warden adds a bunch of extra fields to the pod, such as **cloud.google.com/pod-isolation: '2'**.
What does this field mean, and why is it added?
Another issue/feature I see is that every pod we deploy is spun up on a new node, which makes the creation of new pods slower, since a node has to be scaled up first. Is this because of the cloud.google.com/pod-isolation: '2' annotation?
For the first question, what are your Pod resource requests? I’m wondering if you’re requesting so much that there’s no pre-defined C3 machine type that can handle the size of the Pod.
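For reference, a Pod targeting the Performance compute class looks roughly like the sketch below; the name, image, machine family, and request values are placeholders, so compare your actual requests against the largest predefined C3 shape available in your region:

```yaml
# Minimal sketch of a Pod requesting the Performance compute class on the C3 family.
# The name, image, and resource values are placeholders; if the requests exceed what
# any predefined C3 machine type offers, the autoscaler has no node shape that fits.
apiVersion: v1
kind: Pod
metadata:
  name: perf-workload                # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/compute-class: Performance
    cloud.google.com/machine-family: c3
  containers:
    - name: app
      image: us-docker.pkg.dev/my-project/my-repo/app:latest   # placeholder image
      resources:
        requests:
          cpu: "16"                  # must fit within an available C3 machine type
          memory: 64Gi
        limits:
          cpu: "16"
          memory: 64Gi
```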
For the second, Performance class is specifically a one-Pod-per-node compute class so that you can burst into the entire node at any time without worrying about competing with other Pods. The pod isolation label is probably supporting that, yes.
To spin up nodes in advance, you could deploy Pods with a low PriorityClass that don’t do anything. They’d get evicted by your actual workload Pods if needed. https://cloud.google.com/kubernetes-engine/docs/how-to/capacity-provisioning has the instructions. BUT because of the Performance class pricing model you’ll be paying for the idle node regardless of whether the small Pod is using it.
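Roughly, the capacity-provisioning pattern from that page looks like the sketch below, assuming a negative-priority PriorityClass and a pause-container Deployment sized like your real workload; all names, the priority value, and the resource sizes are placeholders:

```yaml
# Sketch of low-priority placeholder capacity, loosely following the capacity-provisioning guide.
# Names, the priority value, and the resource sizes are placeholders.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: placeholder-priority
value: -10                        # lower than the default (0) so real workloads preempt these Pods
preemptionPolicy: Never           # placeholders should never preempt anything themselves
globalDefault: false
description: "Low priority for idle capacity-provisioning Pods."
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capacity-placeholder
spec:
  replicas: 1                     # one spare node's worth of capacity
  selector:
    matchLabels:
      app: capacity-placeholder
  template:
    metadata:
      labels:
        app: capacity-placeholder
    spec:
      priorityClassName: placeholder-priority
      nodeSelector:
        cloud.google.com/compute-class: Performance   # keep the spare node in the same class
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9            # does nothing; just holds the node
          resources:
            requests:
              cpu: "16"           # size this like your real workload so the spare node can fit it
              memory: 64Gi
```

When a real workload Pod arrives, the scheduler evicts the placeholder, your Pod starts on the already-provisioned node, and the evicted placeholder reschedules and triggers a fresh scale-up in the background.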
Thanks, now I understand the reasoning behind one pod per node. Is there any way we can continue using Performance nodes and deploy multiple pods on them so we don't see the lag of a node scaling up?
Just to make sure I am 100% clear on this: if I spin up a pod that requests a Performance node of e2 with requests and limits of 50MB/50vCPU and 100MB/100vCPU, it would still spin up an e2-medium (this is what we have always seen), and the rest of the resources would just sit idle and hence be wasted. Correct?
Also, I want to make sure that these resources are not shared with other GCP clients.
Not yet, but I believe that product teams are aware that it would be a good capability to have.
Yes, the unused resources would be idle if the machine size that was spun up was bigger than your Pod size and your Pod never burst into the extra capacity.
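If you do want the Pod to use that spare headroom, the general Kubernetes pattern is to set limits above requests so the container can burst; the values below are illustrative, and Autopilot may adjust or enforce limits differently, so check the Performance class docs for the exact bursting behavior:

```yaml
# Illustrative resources block: requests determine the node size the scheduler reserves,
# while higher limits let the container burst into otherwise-idle node capacity.
resources:
  requests:
    cpu: "2"          # what the scheduler reserves
    memory: 4Gi
  limits:
    cpu: "8"          # burst ceiling above the request
    memory: 8Gi
```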
I’m…not sure what you mean by “shared with other GCP clients”. Do you mean whether the underlying VM is dedicated to your project and workloads?