No burstable workloads after cluster upgrade

Hello, I have a cluster that was created a couple of years ago and I am trying to get burstable workloads working. I have followed the instructions here but am not having any luck.

  • I have upgraded the cluster to 1.33.1-gke.1107000
$ gcloud --project=HIDDEN container clusters describe --region=europe-west2 HIDDEN

# ...
createTime: '2023-02-02T15:48:57+00:00'
currentMasterVersion: 1.33.1-gke.1107000
# ...
  • I can confirm that the cluster is now using cgroupsv2 (it was v1 prior to upgrading)
$ gcloud --project=HIDDEN container clusters describe --region=europe-west2 HIDDEN --format='value(nodePools[0].config.effectiveCgroupMode)'

EFFECTIVE_CGROUP_MODE_V2
  • I have restarted the control plane after the upgrade (including again today, a day after upgrading)
  • There is no efficiency-daemon running in the cluster
$ kubectl get daemonset --namespace=kube-system efficiency-daemon

Error from server (NotFound): daemonsets.apps "efficiency-daemon" not found
  • When I edit a workload and set its limits higher than its requests, the limits are reset to match the requests, and the pods still report “qosClass: Guaranteed”
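
One way to see what the API server actually stored for a pod is to read back its QoS class. This is a minimal sketch, assuming `kubectl` is already pointed at the cluster; the `pod_qos` helper name and its arguments are made up for illustration and are not part of kubectl or GKE.

```shell
# pod_qos: hypothetical helper, not part of kubectl or GKE.
# Usage: pod_qos POD_NAME [NAMESPACE]
# Prints the pod name and the QoS class Kubernetes assigned to it.
pod_qos() {
  kubectl get pod "$1" --namespace="${2:-default}" \
    --output=custom-columns=NAME:.metadata.name,QOS:.status.qosClass
}
```

For example, `pod_qos my-pod my-namespace` (both names are placeholders). A pod whose limits were reset to equal its requests shows Guaranteed; a pod that kept limits above its requests should show Burstable.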

I feel like I’m probably missing something obvious. Does anyone have any tips or pointers? I’m stumped at this point and would greatly appreciate it.

Thanks!

Martin

Are you running Autopilot or Standard?

Autopilot. Sorry, I should have said that in my original post!

Can you check the node version(s)?

kubectl get nodes

Here you go:

$ kubectl get nodes

NAME                                        STATUS   ROLES    AGE     VERSION
gk3-XXXXXXXXXX-nap-XXXXXXXX-XXXXXXXX-XXXX   Ready    <none>   5d16h   v1.33.1-gke.1107000
gk3-XXXXXXXXXX-nap-XXXXXXXX-XXXXXXXX-XXXX   Ready    <none>   5d16h   v1.33.1-gke.1107000

Hmm … everything seems to check out in terms of being eligible for bursting.
Have you tried again recently? The nodes upgrade after the control plane, so they may not yet have been upgraded when you last tried.

You’ll probably need to open a support ticket if it’s still not working.

If you don’t have a support plan, send an email to gke-forum@google.com and we’ll see what we can do. We’ll need to collect some additional information.

I restarted the control plane again an hour ago and there is still no efficiency-daemon, so I will email the address you listed.

Thanks for your help :)

Looks like we have a bug in our enablement logic for clusters created prior to cgroupv2.
We are working on a fix.

In the meantime, you can “force” things:

gcloud container clusters update CLUSTER_NAME \
    --autoprovisioning-cgroup-mode=v2
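
To confirm the change took effect, the two checks from earlier in the thread can be re-run together. The following is only a sketch: `check_bursting` is a hypothetical helper name, the cluster name and region are placeholders, and it assumes `gcloud` is already configured for the right project. It treats the presence of the efficiency-daemon DaemonSet as the sign of bursting being enabled, as the thread does.

```shell
# check_bursting: hypothetical helper, not part of gcloud or kubectl.
# Usage: check_bursting CLUSTER_NAME REGION
# Re-runs the cgroup-mode and efficiency-daemon checks from this thread.
check_bursting() {
  cluster=$1
  region=$2

  # Effective cgroup mode of the first node pool (same query as above).
  mode=$(gcloud container clusters describe "$cluster" --region="$region" \
    --format='value(nodePools[0].config.effectiveCgroupMode)')
  echo "cgroup mode: $mode"

  # A missing efficiency-daemon DaemonSet was the symptom reported here.
  if kubectl get daemonset --namespace=kube-system efficiency-daemon >/dev/null 2>&1; then
    echo "efficiency-daemon: present"
  else
    echo "efficiency-daemon: not found"
    return 1
  fi
}
```

For example, `check_bursting CLUSTER_NAME europe-west2`. The function returns non-zero while the DaemonSet is still missing, so it can be retried after giving the cluster some time to apply the update.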

Thanks for your help, that worked!