Hi Everyone,
I have trained an image classification AutoML model on Vertex AI using labelled images. The model's prediction endpoint needs to be active and available whenever requests are received.
However, I have noticed that simply keeping the Vertex AI endpoint enabled costs approximately £25 per day, even when no predictions are being made. I have explored undeploying and redeploying the model to reduce costs, but this takes at least ten minutes each time, which is too slow for a production environment that needs to respond promptly.
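For reference, the undeploy and redeploy cycle mentioned above can be sketched with the Vertex AI Python SDK (`google-cloud-aiplatform`) roughly as follows. This is an illustrative sketch, not necessarily the exact script in use: the helper names `load_resources`, `undeploy` and `redeploy` are hypothetical, the project, region and ID arguments are placeholders, and it assumes the AutoML image model can be deployed with only replica counts specified.

```python
import time


def load_resources(project, location, endpoint_id, model_id):
    """Look up an existing endpoint and model (all arguments are placeholders)."""
    # Imported here so the other helpers can be read or exercised without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_id)
    model = aiplatform.Model(model_name=model_id)
    return endpoint, model


def undeploy(endpoint):
    """Remove every deployed model from the endpoint; the endpoint itself remains."""
    endpoint.undeploy_all()


def redeploy(endpoint, model, replicas=1):
    """Deploy the model again and return how long the blocking call took, in seconds."""
    start = time.monotonic()
    model.deploy(
        endpoint=endpoint,
        min_replica_count=replicas,  # assumption: a fixed replica count
        max_replica_count=replicas,
        traffic_percentage=100,      # route all traffic to the new deployment
    )
    return time.monotonic() - start
```

Calling `redeploy(endpoint, model)` and logging the returned duration is one way to confirm the roughly ten minute delay before predictions can be served again.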
Could you please advise on a better approach or best practices to minimise costs while still ensuring that the endpoint is available when needed? Any recommendations or configuration adjustments that could help would be greatly appreciated.
Thanks in advance