Job state is set from RUNNING to FAILED for job projects/XXXXXXXXXXXXXX/locations/us-central1/jobs/xxxxxx-orch-202309182321. Job failed due to task failures. For example, task with index 0 failed, failed task event description is Task state is updated from PENDING to FAILED on zones/us-central1-f/instances/5883206691509808443 with exit code 127.
I am just trying to run the sample code below, but with an instance template:
For example, in your case, the job fails because Batch does not support Red Hat yet. Batch tries to install Docker with the default “apt-get” command, the installation fails, and so the container job fails.
Out of curiosity, is Red Hat a required OS for you, or would a replacement OS such as Debian or CentOS work?
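As a side note on the exit code: POSIX shells return 127 when a command cannot be found, which is consistent with “apt-get” being absent on a Red Hat image. A minimal sketch to reproduce that status locally, assuming a POSIX `sh` is available; the command name is hypothetical and chosen only because it should not exist on any machine:

```python
import subprocess


def exit_code_of(command: str) -> int:
    """Run a command line through sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# Hypothetical command name that should not be installed anywhere.
missing = exit_code_of("hypothetical-missing-package-manager install docker")
present = exit_code_of("true")

print("missing command exit code:", missing)
print("existing command exit code:", present)
```

This only illustrates the general meaning of exit code 127. It does not reproduce what Batch runs on the VM.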
Thanks again for your response. Based on your recommendation, I switched my instance template to Debian. The job still fails with the same error, but this time with exit code 100. Have you ever seen this error, and what could it be related to? There is not much information in the logs.
Error:
Job state is set from SCHEDULED to FAILED for job projects/XXXXXXXXXXXXXXX/locations/us-central1/jobs/XXXXXXXXX-orch-202309191306. Job failed due to task failures. For example, task with index 0 failed, failed task event description is Task state is updated from PENDING to FAILED on zones/us-central1-c/instances/8693805527343590318 with exit code 100.
It seems your Batch job with the Debian 10 image failed with the error "Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?". Could you try retrying the failed tasks, as described at https://cloud.google.com/batch/docs/automate-task-retries, to see whether it helps?
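For reference, here is a rough sketch of what a job config with automatic retries might look like, built as a Python dictionary and printed as JSON. It assumes the `maxRetryCount` and `lifecyclePolicies` fields described on the linked page. The helper name, the container image, and the retry count are placeholders, not values from this thread, and the instance template settings are left out:

```python
import json


def build_retry_job_config(image_uri: str, retry_exit_codes: list[int],
                           max_retries: int = 3) -> dict:
    """Build a Batch job config that retries tasks failing with given exit codes.

    Hypothetical helper for illustration; field names follow the Batch
    job JSON format (taskGroups / taskSpec / lifecyclePolicies).
    """
    return {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [{"container": {"imageUri": image_uri}}],
                    # Upper bound on automatic retries for each task.
                    "maxRetryCount": max_retries,
                    # Retry only when the task exits with one of these codes.
                    "lifecyclePolicies": [
                        {
                            "action": "RETRY_TASK",
                            "actionCondition": {"exitCodes": retry_exit_codes},
                        }
                    ],
                },
                "taskCount": 1,
            }
        ],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


# Placeholder image; exit code 100 is the one reported in the failed task above.
config = build_retry_job_config("gcr.io/google-containers/busybox", [100])
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and submitted with `gcloud batch jobs submit JOB_NAME --location us-central1 --config FILE`, after adding the allocation policy that points to your instance template.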