I am trying to get a comprehensive list of GCP services that create or manage GCE instances and/or disks as a part of their activation. For example, when you create a GKE cluster it will spin up a VM instance and persistent disk and then manage that as a node.
I am trying to manage operational labels and metadata for all resources at my organization and need insight into which VM instances are managed by other resources vs. directly by a user.
So far I’ve identified:
Google Compute Engine (GCE) and Persistent disks (obviously)
Google Kubernetes Engine (GKE)
Dataproc Clusters
App Engine flexible environment
Compute Engine managed instance groups
Cloud Workstations
AI Platform Training
Cloud Composer
Cloud SQL (persistent disks, not VMs)
Anything I’m missing? I could not seem to find any documentation about this specifically.
Alternatively, you can check in the Console whether a resource is being used by a service: look for a value under “In use by”. In the image below, you can see that three VMs in my Compute Engine project are being used by my GKE cluster.
So the overall suggestion from a support case has been to rely on labels. Unfortunately, there is no consistent convention for those labels. We can reliably identify GKE and Dataproc cluster nodes by the ‘goog-’ prefix on their labels, but other services such as Cloud Workstations don’t use it. We have an issue open for this with GCP, but we’re not sure when or if it will get worked on: https://issuetracker.google.com/issues/304824114?pli=1
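For anyone wanting to try the label approach, here is a minimal sketch of the prefix check. It assumes only what is described above: that service-applied labels start with ‘goog-’. The function names and the label keys in the usage lines are illustrative, and services that do not follow the convention (Cloud Workstations, per the above) will be reported as unmanaged.

```python
from typing import List, Mapping, Optional

# Assumption: labels applied by a managing service start with "goog-",
# as reported above for GKE and Dataproc nodes. Not every service does this.
MANAGED_LABEL_PREFIX = "goog-"


def managed_label_keys(labels: Optional[Mapping[str, str]]) -> List[str]:
    """Return the label keys that carry the service prefix, sorted."""
    return sorted(
        key for key in (labels or {}) if key.startswith(MANAGED_LABEL_PREFIX)
    )


def looks_service_managed(labels: Optional[Mapping[str, str]]) -> bool:
    """True if at least one label key starts with the service prefix."""
    return bool(managed_label_keys(labels))


if __name__ == "__main__":
    # Illustrative label sets; real keys and values will differ per service.
    print(looks_service_managed({"goog-gke-node": "", "team": "ops"}))
    print(looks_service_managed({"env": "prod"}))
```

A False result only means no ‘goog-’ label was found; it does not prove the VM was created directly by a user.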
I also notice that GKE nodes have the “in use by” field but Dataproc cluster nodes don’t.
Unfortunately, we are iterating through hundreds of projects and thousands of resources, so the Console won’t be a good solution. We use the Python API, though, so if the field is returned in the list/get call, that might be helpful.
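In case it helps others doing the same thing at scale, below is a rough sketch of pulling labels for every VM in a project with the google-cloud-compute client and tallying which label keys appear. It assumes the google-cloud-compute package is installed and application default credentials are configured. The function names and the project ID are placeholders. It only reads labels; whether the Console’s “In use by” value is exposed in the list/get response is not something this sketch confirms.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Row = Tuple[str, str, Dict[str, str]]  # (scope, instance name, labels)


def iter_instance_labels(project_id: str) -> Iterator[Row]:
    """Yield (scope, instance_name, labels) for every VM in one project."""
    # Imported lazily so label_key_counts() works without the client library.
    from google.cloud import compute_v1

    client = compute_v1.InstancesClient()
    request = compute_v1.AggregatedListInstancesRequest(project=project_id)
    # aggregated_list pages through all zones; scope looks like "zones/<zone>".
    for scope, scoped_list in client.aggregated_list(request=request):
        for instance in scoped_list.instances:
            yield scope, instance.name, dict(instance.labels)


def label_key_counts(rows: Iterable[Row]) -> Counter:
    """Count how many instances carry each label key."""
    counts: Counter = Counter()
    for _scope, _name, labels in rows:
        counts.update(labels.keys())
    return counts


if __name__ == "__main__":
    # "my-project-id" is a placeholder; loop over your own project list here.
    for key, count in label_key_counts(iter_instance_labels("my-project-id")).most_common():
        print(f"{key}\t{count}")
```

Tallying keys across projects is one way to discover which services apply recognizable labels, given that there is no single convention.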