Vertex AI AutoML Endpoint Cost Optimisation for Idle State

Hi Everyone,

I have trained an image classification AutoML model on Vertex AI using labelled images. The model's prediction endpoint needs to be active and available whenever requests are received.
However, I have noticed that simply keeping the Vertex AI endpoint deployed incurs a cost of approximately £25 per day, even when no predictions are being made. I have explored un-deploying and re-deploying the model to reduce costs, but this process takes at least ten minutes each time, which is not a reliable solution for a production environment that requires responsiveness.
Could you please advise on a better approach or best practices to minimise costs while still ensuring that the endpoint is available when needed? Any recommendations or configuration adjustments that could help would be greatly appreciated.

Thanks in advance


Hi @Vinita_Jhakra,

As confirmed in this public discussion, Vertex AI endpoints do not automatically scale to zero when idle, so a deployed model continues to bill for at least one prediction node even during non-business hours.

However, you could use a schedule to undeploy the model outside business hours and re-create the deployment shortly before it is needed again, as sketched below. Cloud Run is another alternative mentioned in the public discussion that may be worth considering. You can also refer to the Scaling behavior section of the official Vertex AI documentation, which explains how to configure your endpoint's autoscaling (minimum and maximum replica counts).
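
Below is a minimal sketch of what the scheduled undeploy/redeploy cycle could look like with the Vertex AI Python SDK (google-cloud-aiplatform). The project, region, endpoint ID, model ID, and function names are placeholders, not values from your setup; replica counts would need to match how your AutoML model is deployed.

```python
# Hypothetical sketch: undeploy the model outside business hours and
# redeploy it before traffic resumes, to avoid paying for an idle endpoint.
from google.cloud import aiplatform

PROJECT = "my-project"        # placeholder
REGION = "europe-west2"       # placeholder
ENDPOINT_ID = "1234567890"    # placeholder numeric endpoint ID
MODEL_ID = "0987654321"       # placeholder numeric model ID

aiplatform.init(project=PROJECT, location=REGION)


def undeploy_for_off_hours() -> None:
    """Remove all deployed models from the endpoint so no nodes are billed."""
    endpoint = aiplatform.Endpoint(ENDPOINT_ID)
    endpoint.undeploy_all()


def redeploy_for_business_hours() -> None:
    """Re-attach the AutoML model; allow roughly ten minutes for it to become ready."""
    endpoint = aiplatform.Endpoint(ENDPOINT_ID)
    model = aiplatform.Model(MODEL_ID)
    endpoint.deploy(
        model=model,
        min_replica_count=1,   # smallest footprint while deployed
        max_replica_count=1,   # raise this to enable autoscaling under load
        traffic_percentage=100,
    )
```

These two functions could then be triggered on a schedule, for example from Cloud Scheduler via a Cloud Function or Cloud Run job at the start and end of your business hours, keeping in mind the roughly ten-minute warm-up you observed when redeploying.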