I encountered unexpected GPU usage on Vertex AI and would appreciate some guidance.
Context:
On August 1, I manually verified in the Google Cloud console that all my projects and resources were shut down.
Despite this, between Aug 1–11, an H100 80 GB GPU allocated for Vertex AI Online/Batch Prediction continued running without my knowledge while I was asleep.
No active resources were visible in the console during this period, yet the usage consumed my free credits and incurred additional charges.
As a student, I was unaware that the hourly cost of the H100 GPU is so high.
Questions:
Could there be background processes or hidden resources that continue running even after a project shutdown?
Is there an auto‑shutdown or idle‑timeout setting for Vertex AI GPU instances to prevent unintended usage?
If this behaviour is not expected, could this case be escalated for review?
Additional context: I already contacted billing support and they provided a 90% adjustment, but there is still around 100,000 KRW outstanding. As a student, this is a significant amount, so any guidance on whether a final adjustment or waiver might be possible would be greatly appreciated.
Could there be background processes or hidden resources that continue running even after a project shutdown?
Yes. When you ‘shut down’ a resource in the console, that usually just means stopping a VM. Vertex AI resources such as Endpoints (for online prediction) and the models deployed to them are not stopped or deleted automatically when a project looks inactive; a deployed model keeps its serving nodes (and their GPUs) running, and billing continues, until you explicitly undeploy and delete it. Batch prediction jobs likewise run until they complete or are cancelled. Manual cleanup of all related resources is therefore essential; stopping a VM alone is not enough.
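To see what can still be billing you, a sketch like the following may help. It uses the google-cloud-aiplatform Python SDK; the project ID and region below are placeholders for your own values, and you should repeat the check for every region you ever used:

```python
# Sketch: inventory the Vertex AI resources that can keep billing
# after a project looks "shut down". Placeholders: my-project, us-central1.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Endpoints bill for as long as a model is deployed to them.
for endpoint in aiplatform.Endpoint.list():
    print("Endpoint still exists:", endpoint.resource_name)
    # Uncomment to actually clean up:
    # endpoint.undeploy_all()  # stops the GPU-backed serving nodes
    # endpoint.delete()

# Uploaded models are cheap to store, but deployed copies are not.
for model in aiplatform.Model.list():
    print("Model still exists:", model.resource_name)

# Batch prediction jobs bill until they finish or are cancelled.
for job in aiplatform.BatchPredictionJob.list():
    print("Batch job:", job.resource_name, job.state)
```

The delete calls are commented out deliberately: list first, confirm what you are looking at, then uncomment the cleanup lines.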
Is there an auto‑shutdown or idle‑timeout setting for Vertex AI GPU instances to prevent unintended usage?
Only Vertex AI Workbench notebook instances support an idle-shutdown setting. Online and batch prediction resources have no auto-stop or idle-timeout option: a deployed endpoint bills continuously until you undeploy the model and delete the endpoint yourself.
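Since there is no built-in idle timeout for prediction endpoints, one workaround is a do-it-yourself guard you run on a schedule (e.g. cron or Cloud Scheduler). This is only a sketch, assuming the google-cloud-aiplatform SDK and placeholder project/region values:

```python
# Sketch of a DIY "auto-shutdown": undeploy every model from every
# endpoint in a region. Run on a schedule you control.
# Placeholders: my-project, us-central1.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

for endpoint in aiplatform.Endpoint.list():
    if endpoint.list_models():  # any models still deployed?
        print("Undeploying everything from", endpoint.display_name)
        endpoint.undeploy_all()  # serving nodes (and GPU billing) stop here
```

Note this undeploys unconditionally rather than detecting idleness; it is a blunt safety net, not a true idle timeout.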
If this behaviour is not expected, could this case be escalated for review?
Additional context: I already contacted billing support and they provided a 90% adjustment, but there is still around 100,000 KRW outstanding. As a student, this is a significant amount, so any guidance on whether a final adjustment or waiver might be possible would be greatly appreciated.
Yes. Since you’ve already received a partial refund, you can still request a further review by reopening your billing support case or starting a new one. Be sure to mention your student status, the unexpected nature of the charges, and the financial impact. Google Cloud sometimes grants full waivers in special circumstances, especially when users clearly didn’t intend to keep resources running. It’s worth explaining that you believed everything was shut down and weren’t aware of the high cost of the H100 GPU.
Additionally, the following documentation may help you investigate your use case:
Thanks again for your reply.
I’ve already contacted the billing support team multiple times (about 4 times), and every time they replied that no further adjustment is possible because a 90% credit was already given.
In this case, is there any way to escalate the issue to a higher team or another channel for review?
I understand the policy, but as a student this remaining cost is still a heavy burden, so I’d appreciate any advice on possible next steps.
Hi, I’m encountering the same problem. I have an H100 80 GB GPU running online/batch predictions non-stop since yesterday. I undeployed the models and even deleted them, but the cost is still going up. How can I manually stop the process from running?
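One likely cause: undeploying or deleting a model does not cancel a batch prediction job that has already started; the job itself has to be cancelled. A minimal sketch with the google-cloud-aiplatform SDK (project and region are placeholders):

```python
# Sketch: cancel any Vertex AI batch prediction job still running.
# Deleting the model does not stop an in-flight job.
# Placeholders: my-project, us-central1.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

for job in aiplatform.BatchPredictionJob.list():
    if job.state.name == "JOB_STATE_RUNNING":
        print("Cancelling", job.resource_name)
        job.cancel()  # billing stops once the job reaches CANCELLED
```

Remember to run the check in every region where you launched jobs, since listings are per-region.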