I trained my model in Vertex AI without any issues, and now I want to save its parameters to a bucket. To do that, I run this:
model.save_pretrained(model_dir)
tokenizer.save_pretrained(model_dir)
print(f"Model artifacts written to: {model_dir}")
Here model_dir is a valid path. I thought it was a permissions issue, but I gave my service job all possible permissions to write to the buckets. I should note that creating or writing a file in the Google bucket works when I do it manually or through the gsutil command line. Does anyone know how to resolve this? I do not get any error logs, so I am really confused.
Hi Gold_diggee,
Welcome to Google Cloud Community!
The standard and recommended method for saving a model from a Vertex AI Custom Training Job is to use the AIP_MODEL_DIR environment variable.
Here’s how it works:
- Vertex AI Provides the Path: When you start a custom training job, Vertex AI sets the AIP_MODEL_DIR environment variable inside your training container, derived from the job's base output directory. This variable contains a GCS URI (e.g., gs://your-bucket/your-job/model/).
- Your Script Saves to the Path: Your training script must be written to read this environment variable and save the final, deployable model artifacts directly to that location.
This single action accomplishes two critical tasks: it persists your model in Google Cloud Storage and simultaneously signals its location to the Vertex AI platform, making it discoverable for seamless model registration and deployment.
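As a rough sketch of the two steps above, the script can read AIP_MODEL_DIR and pass the result to save_pretrained. One thing to watch, and possibly the cause of a silent failure like the one described: save_pretrained writes to the local filesystem, so handing it a gs:// URI would likely just create a local folder inside the container rather than upload anything. The sketch assumes the bucket is available through the Cloud Storage FUSE mount at /gcs/ that Vertex AI custom training provides, and it rewrites the URI to that path. The helper names local_model_dir and save_artifacts and the fallback directory model_output are made up for illustration.

```python
import os

GCS_URI_PREFIX = "gs://"
GCS_FUSE_ROOT = "/gcs/"  # assumption: bucket is reachable via the Cloud Storage FUSE mount


def local_model_dir(default="model_output"):
    """Return a filesystem path for the directory named by AIP_MODEL_DIR.

    Falls back to `default` (a hypothetical local folder) when the variable
    is not set, e.g. when running the script outside Vertex AI.
    """
    model_dir = os.environ.get("AIP_MODEL_DIR", default)
    if model_dir.startswith(GCS_URI_PREFIX):
        # gs://bucket/path -> /gcs/bucket/path
        model_dir = GCS_FUSE_ROOT + model_dir[len(GCS_URI_PREFIX):]
    return model_dir


def save_artifacts(model, tokenizer, model_dir):
    """Write the model and tokenizer files to model_dir."""
    os.makedirs(model_dir, exist_ok=True)
    model.save_pretrained(model_dir)
    tokenizer.save_pretrained(model_dir)
    print(f"Model artifacts written to: {model_dir}")
```

At the end of training you would call save_artifacts(model, tokenizer, local_model_dir()). If the FUSE mount is not available in your setup, another option is to save to a local directory first and then upload the files with the google-cloud-storage client.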
Here’s a similar case that you may find helpful.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.