Hi, I have created a Cloud Run job where each task takes a certain number of rows from a BigQuery table, passes them through some APIs, and updates the corresponding records in the table with the API responses.
But sometimes the job retries even though there is no error, and the retried task starts execution from the beginning.
For context: a Cloud Run job has a task timeout and a maximum number of retries. If execution doesn't complete within the stipulated time, the task is retried until the retry limit is exhausted.
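For anyone unfamiliar with where those two settings live, they are flags on the job itself. This is a minimal sketch; the job name, region, and values are placeholders, not taken from the post:

```shell
# Hypothetical job name and region, for illustration only.
# --task-timeout is the per-attempt time limit for each task;
# --max-retries caps how many times Cloud Run re-runs a task
# that fails or times out.
gcloud run jobs update my-bq-api-job \
  --region=us-central1 \
  --task-timeout=20m \
  --max-retries=3
```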
Basically, each job picks up 150 records from the BigQuery table every 20 minutes, passes those records to 3 APIs (outside of GCP) one after the other, and updates the API responses to the corresponding rows in the BigQuery table.
The Cloud Run job is created with 4 CPU and 16 GiB of memory, and a parallelism factor of 5.
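With a parallelism of 5, each task typically claims a disjoint slice of the 150 records. A minimal sketch of that fan-out is below; the slicing helper and the `main` skeleton are hypothetical, but `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` are the environment variables Cloud Run jobs set for each task:

```python
import os


def task_slice(total_rows: int, task_index: int, task_count: int) -> range:
    """Return the row offsets this task is responsible for."""
    per_task = -(-total_rows // task_count)  # ceiling division
    start = task_index * per_task
    return range(start, min(start + per_task, total_rows))


def main() -> None:
    # Cloud Run jobs inject these per task; defaults let the script
    # also run locally as a single task.
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    rows = task_slice(150, task_index, task_count)
    # The BigQuery read, the three external API calls, and the row
    # update for each offset in `rows` would go here (placeholders).
    ...
```

One relevant design point: because a retried task restarts from the beginning, it helps to make each row update idempotent (e.g. skip rows whose response column is already populated), so a retry does not re-call the external APIs for rows that were already processed.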
Does this happen fairly predictably (i.e. most of the time)? Or does it happen sometimes, maybe in clusters (some days the job runs fine, other days it has this strange retry behavior)?
Is there really nothing in the logs?
What is the memory usage like - are you perhaps running out of memory?
Hi, we’d like to take a closer look at what’s going on; if you send me a private message with your project ID I’ll pass it to our engineering team to take a look.