I’ve been working on an ETL (Extract, Transform, Load) project to prepare a dataset for a fuel demand prediction model. The data, sourced from CSV and JSONL files, was successfully loaded into BigQuery. I then used a SQL query to clean the data: handling negative values, filling in missing payment methods, and unifying the inventory and transaction tables. The final table, Table_Final_DemandaFuel, is ready for machine learning.
The goal is to use this clean data to train a regression model in Vertex AI AutoML Tabular to predict fuel sales (Litros_Vendidos).
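As a way to double-check the cleaning described above, here is a minimal Python sketch that scans exported rows for the issues mentioned (negative or missing target values, missing payment methods). Litros_Vendidos is the real target column; the payment column name `Metodo_Pago` and the row-as-dict shape are assumptions made for illustration only.

```python
from numbers import Real


def find_bad_rows(rows, target="Litros_Vendidos", payment="Metodo_Pago"):
    """Return (row_index, reason) pairs for rows that still look unclean.

    `rows` is assumed to be a list of dicts, e.g. rows exported from the
    final table. `payment` is a hypothetical column name.
    """
    problems = []
    for i, row in enumerate(rows):
        value = row.get(target)
        if value is None:
            problems.append((i, "missing target"))
        elif isinstance(value, bool) or not isinstance(value, Real):
            problems.append((i, "non-numeric target"))
        elif value < 0:
            problems.append((i, "negative target"))
        # Empty string and None both count as a missing payment method.
        if not row.get(payment):
            problems.append((i, "missing payment method"))
    return problems
```

An empty result only means these particular checks pass; it does not rule out other data problems.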
However, the training job keeps failing with a recurring error:
The DAG failed because some tasks failed. The failed tasks are: [exit-handler-1].; Job (project_id = braided-circuit-457918-m3, job_id = <JOB_ID>) is failed due to the above error.; Failed to handle the job: {project_number = 847026632307, job_id = <JOB_ID>}.
It always happens at the 8th step; everything runs fine until that step.
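The message only names exit-handler-1, which, as far as I can tell, is a wrapper that groups other pipeline steps rather than the step that actually broke. The sketch below pulls the task names out of a message shaped like the one above and flags wrapper-style names, so it is clearer when the real error has to be looked up in the logs of a nested task. The function names and the "exit-handler" prefix check are my own assumptions, not part of any Vertex AI API.

```python
import re


def failed_tasks(message):
    """Extract task names from 'The failed tasks are: [a, b]' in a message."""
    match = re.search(r"failed tasks are:\s*\[([^\]]*)\]", message)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def is_wrapper_task(name):
    """Heuristic (assumption): exit-handler-* tasks wrap other steps."""
    return name.startswith("exit-handler")


def summarize(message):
    """Split the failed tasks into wrapper tasks and concrete tasks."""
    names = failed_tasks(message)
    return {
        "wrappers": [n for n in names if is_wrapper_task(n)],
        "concrete": [n for n in names if not is_wrapper_task(n)],
    }
```

If every failed task falls under "wrappers", the message by itself does not say which inner step failed.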
Steps Taken So Far to Fix the Error:
- Data Validation: The final BigQuery table has been validated, and all known data quality issues (negative values, NULLs) have been resolved. The data looks clean.
- Permissions Check: I’ve meticulously added all necessary IAM roles to my service account (847026632307-compute@developer.gserviceaccount.com). This includes:
  - Vertex AI User
  - BigQuery Data Viewer
  - Service Account User
  - Vertex AI Administrator
- Service Agent Verification: I’ve also confirmed that the Google-managed service agent for Vertex AI (service-847026632307@gcp-sa-aiplatform.iam.gserviceaccount.com) exists and has the correct Vertex AI Service Agent role.
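To make the permissions check above repeatable, here is a rough Python sketch that compares a project IAM policy against the roles listed. It assumes the policy is a dict in the shape produced by `gcloud projects get-iam-policy braided-circuit-457918-m3 --format=json`, and that the display names above map to the role IDs shown in the code; `missing_roles` is a name I made up.

```python
COMPUTE_SA = "serviceAccount:847026632307-compute@developer.gserviceaccount.com"
VERTEX_AGENT = (
    "serviceAccount:service-847026632307@gcp-sa-aiplatform.iam.gserviceaccount.com"
)

# Assumed mapping from the role display names in the post to role IDs.
REQUIRED_ROLES = {
    COMPUTE_SA: {
        "roles/aiplatform.user",         # Vertex AI User
        "roles/bigquery.dataViewer",     # BigQuery Data Viewer
        "roles/iam.serviceAccountUser",  # Service Account User
        "roles/aiplatform.admin",        # Vertex AI Administrator
    },
    VERTEX_AGENT: {"roles/aiplatform.serviceAgent"},  # Vertex AI Service Agent
}


def missing_roles(policy, required=REQUIRED_ROLES):
    """Return {member: [missing role IDs]} for project-level bindings."""
    granted = {}
    for binding in policy.get("bindings", []):
        for member in binding.get("members", []):
            granted.setdefault(member, set()).add(binding["role"])
    result = {}
    for member, roles in required.items():
        missing = roles - granted.get(member, set())
        if missing:
            result[member] = sorted(missing)
    return result
```

This only looks at project-level bindings. Roles granted directly on a dataset or bucket would not show up here, so an empty result is a sanity check, not proof that permissions are complete.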
Despite these steps, the error persists, which suggests a deeper issue with the project’s configuration or a backend problem within the Vertex AI service itself.