Dataflow: Failed to start the VM

Hello

Is the Dataflow service available? I’m getting the following error: "Failed to start the VM, launcher-202310040825083211699208715384817, used for launching because of status code: UNAVAILABLE, reason: One or more operations had an error: 'operation-1696433110827-606e59cf4c309-097dcbda-f40d6a73': [UNAVAILABLE] 'HTTP_503'."

Start time: October 4, 2023 at 10:25:10 AM GMT-5
Elapsed time: 15 sec
Region: us-central1


Best regards
David Regalado
Web | Linkedin | Cloudskillsboost


The UNAVAILABLE / HTTP_503 status indicates that the launcher VM for your Dataflow job failed to start. The message doesn’t give the exact cause, but common reasons include:

  1. Compute Engine Metadata Limits: Dataflow uses Compute Engine metadata for pipeline options. If there are many JAR files to stage, the JSON job request can exceed the metadata limits, which cannot be increased. To check the size of your pipeline’s JSON request, run it with the --dataflowJobFile= option and confirm the output file is below 256 KB (see the sketch after this list).

  2. VM Image Issues: There could be compatibility issues with the VM image you’re using.

  3. Network Configuration: Incorrect network configurations can prevent VMs from starting.

  4. Compute Engine Infrastructure: Sometimes, underlying issues with Compute Engine can affect VM startup.
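
If you suspect the metadata limit, one way to check is to have the runner write the job request to a local file and inspect its size. Below is a minimal sketch, assuming a Beam Java pipeline submitted with the DataflowRunner; the class name, the placeholder transform, and the example paths are illustrative only, not taken from the original job:

```java
import java.io.File;

import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class JobFileSizeCheck {
  public static void main(String[] args) {
    // Example invocation (all values are placeholders):
    //   --runner=DataflowRunner --project=my-project --region=us-central1 \
    //   --tempLocation=gs://my-bucket/tmp --dataflowJobFile=/tmp/job-spec.json
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);

    Pipeline pipeline = Pipeline.create(options);
    pipeline.apply(Create.of("placeholder")); // stand-in for the real pipeline
    pipeline.run();

    // When --dataflowJobFile is set, the runner writes the JSON job request
    // to that path during submission; compare its size against the limit.
    String jobFile = options.getDataflowJobFile();
    if (jobFile != null) {
      long sizeKb = new File(jobFile).length() / 1024;
      System.out.println("Job request size: " + sizeKb + " KB (limit is about 256 KB)");
    }
  }
}
```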

Troubleshooting Steps:

  • Verify you have the required permissions to create VMs in the designated Compute Engine zone.
  • Ensure the VM image is Dataflow-compatible.
  • Check your network configuration for accuracy.
  • If you suspect you’re exceeding metadata limits, reduce the number of staged JAR files or adjust your pipeline options.
  • Consider using a different VM image or changing your network configuration if those are potential culprits.
  • Try running the Dataflow job in another region (see the sketch below).
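
As a rough illustration of the last step, the region can be overridden when building the pipeline options; this sketch again assumes a Beam Java pipeline, and us-west1 is only an example region:

```java
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class RunInAnotherRegion {
  public static void main(String[] args) {
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);

    // Equivalent to passing --region=us-west1 on the command line;
    // pick any region supported by Dataflow in your project.
    options.setRegion("us-west1");

    Pipeline pipeline = Pipeline.create(options);
    // ... build the pipeline as usual ...
    pipeline.run();
  }
}
```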

Thanks for providing these alternatives. I tried running the job in another region and it worked!


Best regards
David Regalado
Web | Linkedin | Cloudskillsboost
