Hi, my use case is the following. I have three Batch jobs A, B, C to run, each with 1,000 tasks. I would like to schedule a sequential execution of A, B, C: the tasks within each job still run in parallel, but B only starts after A fully completes, and likewise C after B.
I am currently using GCP Workflows to do this; however, Workflows has a maximum timeout of 1,800 seconds, which is too restrictive. Can the staff suggest an alternative solution? Thanks very much!
An alternative could be to use Cloud Composer, a managed Apache Airflow service for workflow orchestration. You can find details on the Batch Airflow operator here.
Another alternative is to trigger Workflows from Pub/Sub: Batch jobs can publish Pub/Sub messages on job state changes. The downside is that this has more moving parts.
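To make the Pub/Sub idea concrete, here is a minimal sketch of the subscriber-side logic: Batch publishes a state-change notification, and a small handler decides which job in the A → B → C chain to submit next. The message fields (`jobName`, `newState`) and the job names in `CHAIN` are assumptions for illustration, not the exact Batch notification schema.

```python
import json
from typing import Optional

CHAIN = ["job-a", "job-b", "job-c"]  # assumed job names, run in this order

def next_job_to_submit(message_data: bytes) -> Optional[str]:
    """Given a (hypothetical) state-change message, return the name of the
    next job to submit, or None if nothing should be submitted."""
    event = json.loads(message_data)
    if event.get("newState") != "SUCCEEDED":
        return None  # ignore QUEUED/RUNNING/etc. notifications
    name = event.get("jobName")
    if name in CHAIN and name != CHAIN[-1]:
        return CHAIN[CHAIN.index(name) + 1]
    return None  # last job in the chain, or an unknown job

# Example: when job-a reports SUCCEEDED, the subscriber submits job-b.
msg = json.dumps({"jobName": "job-a", "newState": "SUCCEEDED"}).encode()
print(next_job_to_submit(msg))  # job-b
```

The actual submission of the returned job would still be done with the Batch API from the Pub/Sub subscriber (e.g. a Cloud Function), which is where the extra moving parts come in.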
As a daily user of Workflows, I'd like to have what @gradientopt suggested: increasing the timeout to a much longer range. We manage all our pipelines through GCP Workflows. Several computation-intensive steps are executed through Batch and usually run for hours. (Batch doesn't have an async API that would let us submit a job and then poll its status.)
It would be an important feature for Workflows to allow long-running operations, so that on the user side we don't have to add more moving parts just to implement a "watcher" job for the long-running tasks.
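The "watcher" loop being described can be sketched as follows: submit a job, poll its state until it reaches a terminal state, then move on to the next job. `submit` and `get_state` are injected stand-ins for the real Batch API calls (assumed names, not the actual client library), which keeps the sketch self-contained.

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED"}  # assumed terminal state names

def run_sequentially(jobs, submit, get_state, poll_seconds=0.0):
    """Run jobs strictly one after another; raise if any job fails."""
    for job in jobs:
        submit(job)
        while (state := get_state(job)) not in TERMINAL:
            time.sleep(poll_seconds)  # back off between status checks
        if state == "FAILED":
            raise RuntimeError(f"{job} failed; aborting the chain")

# Usage with fake in-memory jobs that succeed on the second poll:
polls = {}
def fake_submit(job): polls[job] = 0
def fake_state(job):
    polls[job] += 1
    return "SUCCEEDED" if polls[job] >= 2 else "RUNNING"

run_sequentially(["A", "B", "C"], fake_submit, fake_state)
print(polls)  # {'A': 2, 'B': 2, 'C': 2}
```

This is exactly the kind of logic people would rather express inside a single Workflows execution; with the 1,800-second limit it has to live in a separate long-running process instead.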
Same issue. Workflows + Batch is practically unusable for us with the 1,800-second timeout.
On a related note, this limit is not highlighted on https://cloud.google.com/workflows/docs/tutorials/batch-and-workflows, so we did not discover it until we had implemented a prototype version.
For those interested in increased Cloud Workflows timeouts, please DM me with the following details to help inform the Workflows team how best to advise.
For me this does not work, since it just says that the maximum limit is 1800 seconds… Could the staff confirm that this would work? Thanks! @robertcarlos @Shamel @bolianyin
We have a similar issue. Our system receives multiple files from a source, and the files are stored in Cloud Storage. A Cloud Storage state change is published via Cloud Pub/Sub, which triggers a Cloud Workflows execution. If multiple files are received before the current workflow execution completes, we would like the workflow to not accept the trigger event. How can we achieve this?
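One way to get the "drop triggers while busy" behavior is to guard the trigger handler with a non-blocking lock and simply discard events that arrive while it is held. In GCP the lock would have to be shared state (e.g. a Firestore document, or checking for already-running Workflows executions) rather than in-process; this stdlib sketch only illustrates the drop-if-busy logic, and `run_workflow` is a hypothetical callable standing in for starting the execution.

```python
import threading

_busy = threading.Lock()  # stand-in for a shared/distributed lock

def handle_trigger(run_workflow) -> bool:
    """Return True if the workflow was started, False if the event was dropped."""
    if not _busy.acquire(blocking=False):
        return False  # an execution is already in progress; drop this event
    try:
        run_workflow()
        return True
    finally:
        _busy.release()

# Usage: an event that arrives while a run is in progress is dropped.
dropped = []
def long_running_workflow():
    # a second trigger arriving mid-execution is rejected
    dropped.append(handle_trigger(lambda: None))

handle_trigger(long_running_workflow)  # returns True; dropped == [False]
```

Note that dropped events are lost for good with this approach; if the intent is instead to process them later, a queue plus a single consumer would be the safer pattern.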