Hi everyone,
I have a question regarding the scheduling logic for GKE Autopilot’s built-in Performance compute class.
The Scenario
I am deploying Pods using the Performance class, but I am not pinning them to a specific machine family. My configuration looks like this:
```yaml
spec:
  nodeSelector:
    cloud.google.com/compute-class: "Performance"
    # Note: no 'cloud.google.com/machine-family' is specified
```
The Question
The documentation states that GKE defaults to the C4 machine series for the Performance class in regions where it is available.
However, if C4 capacity is exhausted in that zone/region, does the scheduler automatically fall back to C3 or C3D? Or is the built-in “Performance” class hard-coded to C4 to the point of failing if it’s out of stock?
I want to know if removing the explicit machine-family label will increase our scheduling resilience, or if we are forced to move to Custom ComputeClasses to define a priority fallback list.
Context
We previously pinned machine-family to C4 in addition to setting the compute class to Performance, but experienced incidents where C4 machine series quota was exhausted in our region, causing workload disruptions.
I know that Custom Compute Classes explicitly allow defining a priority list (e.g., priorities: [C4, C3, C3D]). I am trying to determine if I need to implement a Custom Class to achieve this reliability, or if the built-in Performance class handles this “best available hardware” logic natively.
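For reference, the custom-class fallback we would otherwise build looks roughly like this. This is a sketch based on my reading of the ComputeClass API; the class name `perf-fallback` and the exact field values are my assumptions, not a tested config:

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: perf-fallback   # hypothetical class name
spec:
  # Tried in order: C4 first, then C3, then C3D if capacity is unavailable
  priorities:
  - machineFamily: c4
  - machineFamily: c3
  - machineFamily: c3d
  # Assumed setting: still scale up on any available priority rather than fail
  whenUnsatisfiable: ScaleUpAnyway
```

Pods would then select it with `nodeSelector: cloud.google.com/compute-class: perf-fallback` instead of the built-in `Performance` value, which is exactly the migration I am hoping to avoid if the built-in class already falls back natively.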
I am aware that using Custom ComputeClasses pivots the billing model from Pod-based billing (resource requests) to Node-based billing (paying for the entire underlying instance).
Does using the built-in Performance class also trigger this pivot to node-based billing? Or do we remain on Pod-based billing as long as we use the pre-defined Google classes?
Any official guidance on the most cost-effective way to ensure high availability across machine families in Autopilot would be greatly appreciated.
Thanks.