With Google Cloud Run for Anthos, a service that receives no requests automatically scales down, by default all the way to zero instances. If you want to set the minimum number of instances to zero explicitly, you can do so in the configuration of your Cloud Run service. There are a few ways to accomplish this:
For existing services, you can use the gcloud run services update command with the --min-instances parameter. Here’s an example:
gcloud run services update <service-name> --min-instances 0
For new services, you can use the gcloud run deploy command with the --min-instances parameter during deployment. Here’s an example:
gcloud run deploy <service-name> --image=<image_url> --min-instances 0
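To confirm that either command took effect, one option is to export the service and look for the minScale annotation. This is only a sketch: show_min_scale is a hypothetical helper name, and it assumes gcloud is already configured for the right project and platform. If nothing is found, the annotation is absent, which generally means the platform default applies.

```shell
#!/bin/sh
# Hypothetical helper: print the minScale annotation of a service, if any.
# Assumes gcloud is installed and already points at the right project/platform.
show_min_scale() {
  service="$1"
  # Export the service manifest and keep only the minScale line.
  gcloud run services describe "$service" --format export \
    | grep 'autoscaling.knative.dev/minScale' \
    || echo "minScale annotation not set"
}
```

For example, show_min_scale <service-name> prints the annotation line when it is present.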
Alternatively, you can configure your Cloud Run service through a YAML file: export the current configuration with the gcloud run services describe command, set the autoscaling.knative.dev/minScale annotation in the local file to the minimum instance count, and apply the file back to the service.
First, download the configuration of your service into a file named service.yaml:
gcloud run services describe <service-name> --format export > service.yaml
Here’s an example of updating the service.yaml file:
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: '0'
Finally, you can apply the updated configuration by running the following command:
gcloud run services replace service.yaml
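The export, edit, and replace steps can also be chained in a small script. The following is a minimal sketch, not a definitive implementation: set_min_scale is a hypothetical helper, my-service is a placeholder name, and the sed edit assumes the minScale annotation line already exists in the exported file (if it does not, add it by hand as in the YAML example).

```shell
#!/bin/sh
# Hypothetical helper: rewrite the value of the minScale annotation in a
# local manifest. Assumes the annotation line is already present in the file.
set_min_scale() {
  file="$1"
  value="$2"
  # -i.bak edits in place and keeps a backup; this form works with GNU and BSD sed.
  sed -i.bak -E "s|(autoscaling\.knative\.dev/minScale:).*|\1 '${value}'|" "$file"
}

# Intended usage (left commented so the file can be sourced safely):
# gcloud run services describe my-service --format export > service.yaml
# set_min_scale service.yaml 0
# gcloud run services replace service.yaml
```

Keeping the edit in a separate function makes it easy to inspect service.yaml (and the service.yaml.bak backup) before running the replace step.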
These methods allow you to control the scaling behavior of your Python Flask application in Google Cloud Run for Anthos programmatically, without relying on the console interface.
For more information, see the official Cloud Run documentation on minimum instances.