Hi there, I want to create an SQL pipeline in Dataform. When I execute a workflow in an Amazon location (“aws-eu-west-1”), I get an exception: “An internal error occurred and the request could not be completed…”. Everything works fine if I use a Google location (“EU”, “europe-west1”, etc.). But I need to use the “aws-eu-west-1” location because my source tables are based on Amazon S3 files. It looks like a dead end to me. Is it not possible to use tables based on Amazon S3 files? Or do I need to do something differently?
P.S. Full error message: “An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 6649076”
The error you quoted (“An internal error occurred and the request could not be completed…”, Error: 6649076) is a generic message that can have many causes. Since the same workflow runs fine in Google locations, it is possible that the error is caused by a problem with the Dataform service in the “aws-eu-west-1” location.
Here are some things you can try to troubleshoot the issue:
Retry the workflow. As the error message suggests, the issue may be transient, and retrying with back-off may resolve it.
Contact support. If retrying does not help, contact support at https://cloud.google.com/support and include the full error text with its error number (6649076).
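The "retry with back-off" advice from the error message could look roughly like the sketch below. It is a generic pattern and not a Dataform or BigQuery API: `retry_with_backoff` and the `operation` callable are hypothetical names, and you would pass in whatever function actually triggers your workflow run. The delays (1 s base, doubling, capped at 32 s, with up to 1 s of jitter) are illustrative values, not figures taken from the SLA page.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=32.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    `operation` is a hypothetical zero-argument callable that starts the
    workflow and raises an exception on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Assumption: every exception is treated as transient here.
            # Real code should catch only the errors worth retrying.
            if attempt == max_attempts:
                raise
            # Exponential back-off: base_delay, 2x, 4x, ... capped at
            # max_delay, plus up to 1 second of random jitter.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay + random.uniform(0, 1))
```

If the error still appears after the last attempt, the exception is re-raised unchanged, which is the point where contacting support makes sense.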
Additional troubleshooting steps:
Check the location setting. As far as I know, the location configured in Dataform determines where the BigQuery jobs run, and BigQuery expects a query to run in the same location as the datasets it reads. Switching to a location such as “EU” would therefore only help if your source tables are also available there.
Use a different data source. If you are able to use a different data source, such as a Google Cloud Storage bucket instead of an Amazon S3 bucket, you can try running your workflow with that data source.
If none of this resolves the issue, you may, as a last resort, need to consider a different ETL tool.