{"@type":"type.googleapis.com/google.cloud.aiplatform.logging.PipelineCloudLoggingLogEntry", "payload":{…}, "pipelineJobId":"4259905107140804608", "pipelineName":"automl-tabular", "taskId":"-1525248680444035072", "taskName":"get-model-display-name"}

The DAG failed because some tasks failed. The failed tasks are: [get-model-display-name, set-optional-inputs].; Job (project_id = credit-risk-v1, job_id = 4259905107140804608) is failed due to the above error.; Failed to handle the job: {project_number = 763916535827, job_id = 4259905107140804608}

Node details

com.google.cloud.ai.platform.common.errors.AiPlatformException: code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus, cause=null; Failed to handle the pipeline task. Task: Project number: 763916535827, Job id: 4259905107140804608, Task id: 1933515833376505856, Task name: set-optional-inputs, Task state: DRIVER_SUCCEEDED, Execution name: projects/763916535827/locations/us-central1/metadataStores/default/executions/18200162191389704380

Hi, I keep getting this error on both custom training jobs and AutoML. I'm on a paid account, I've requested quota increases multiple times and been approved, but I'm still getting the same error. I changed machine types from N1 and E2 to A2 and still get errors. Please help.

Hi Adeola1,

Welcome to Google Cloud Community!

I would suggest first confirming your actual quota allocation. Navigate to the Quotas page in the Google Cloud Console and check both the limit and the current usage for the 'Custom model training CPUs' metric, making sure you have filtered specifically for the us-central1 region — an approved increase in one region does not apply to others.
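If you prefer the command line, a sketch of the same check is below, assuming you have the gcloud CLI with alpha components installed; PROJECT_ID is a placeholder for your own project.

```shell
# List Vertex AI quota metrics for your project and filter for the
# custom model training CPUs metric that appears in the error message.
gcloud alpha services quota list \
  --service=aiplatform.googleapis.com \
  --consumer=projects/PROJECT_ID \
  --filter="metric:custom_model_training_cpus"
```

Compare the reported limit against the value you were granted in your quota increase approval; if they differ, the increase may have been applied to a different region or metric.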

If you find your current usage is high, proceed to the Vertex AI section and cancel any unnecessary ‘Running’, ‘Queued’, or ‘Failed’ Pipeline and Custom jobs that are also in us-central1, as these may still be holding onto resources. After canceling the jobs, wait for 5-10 minutes to allow the system to release the allocated CPUs, which you can confirm by observing the ‘Current Usage’ drop on the Quotas page.
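The cleanup step above can also be done from the CLI. This is a sketch using the standard `gcloud ai custom-jobs` commands; PROJECT_ID and JOB_ID are placeholders (the numeric job ID comes from the list output). Pipeline jobs are easiest to cancel from the Vertex AI Pipelines page in the console.

```shell
# List custom jobs in us-central1 that are still in an active state
# and may be holding custom_model_training_cpus quota.
gcloud ai custom-jobs list \
  --project=PROJECT_ID \
  --region=us-central1 \
  --filter="state:JOB_STATE_RUNNING OR state:JOB_STATE_PENDING"

# Cancel a specific job by its numeric ID.
gcloud ai custom-jobs cancel JOB_ID --region=us-central1
```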

Once the resources are free, attempt to rerun a single, simple pipeline to avoid competing for the quota.

If the error still persists despite confirming the correct quota and ensuring no other jobs are running, it is time to contact Google Cloud Support. When you do, be prepared to provide them with your Project ID, the failed Job ID, a screenshot of your us-central1 quota page, and a confirmation that you have already tried canceling other active jobs.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.