The “defaultLocation” is set to EU because most of our data is stored in the EU. Now I need to work on a table that is stored in “europe-west2”, and this causes the error “Not found: Dataset domain:table_name was not found in location EU at [4:19].”
How can I specify a location for specific .sqlx files so that they run without this error?
This configuration tells Dataform to create or reference the table in the europe-west2 location, overriding the defaultLocation set in the dataform.json file.
Use this approach for each .sqlx file that references a table in a non-default location.
I apologize for the oversight. Apparently, the location parameter is not recognized by Dataform, and specifying a location for individual .sqlx files is not yet supported.
The only way to currently work with datasets in non-default locations is to set the defaultLocation property in the dataform.json file to the location of the dataset you are working with. For example, if you have a dataset named my_dataset in the europe-west2 location, you would set the defaultLocation property as follows:
{
  "defaultLocation": "europe-west2"
}
This would cause all datasets referenced in Dataform to be created or referenced in the europe-west2 location.
So how do I handle this if I want to work with datasets stored in multiple locations in GCP? I noticed that Dataform only runs for datasets that are stored in the same location as the default location.
Handling datasets stored in multiple locations in GCP within Dataform can be challenging due to the current limitations of the platform. However, here are some potential workarounds and strategies you can consider:
Use the defaultLocation property: Set the defaultLocation property in the dataform.json file to the location where the majority of your datasets reside. This sets the default location for all datasets in the project. However, this approach can be limiting if you need to work with datasets in multiple locations.
Separate Projects for Each Location: Consider creating separate Dataform projects for each GCP location. This way, you can set the defaultLocation specific to each project, ensuring that each project is tailored to a specific location.
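Data Transfer: If feasible, consider copying datasets from one location to another within GCP so that they all sit in the same location. This can be done with BigQuery’s cross-region dataset copy, which runs on the BigQuery Data Transfer Service. Note that gsutil copies Cloud Storage objects, not BigQuery datasets, so it only helps if you first export the tables to Cloud Storage and then load them into a dataset in the target location.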
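To make the "separate projects per location" workaround more concrete, here is a minimal Python sketch that groups datasets by location and prints one dataform.json per group. The dataset names, the "my-gcp-project" ID, and the helper functions are all hypothetical placeholders, and the config only includes the keys I believe are standard (warehouse, defaultDatabase, defaultLocation), so treat it as a starting point rather than a verified setup.

```python
import json
from collections import defaultdict

# Hypothetical mapping of dataset name -> BigQuery location; replace with your own.
DATASET_LOCATIONS = {
    "sales": "EU",
    "marketing": "EU",
    "uk_orders": "europe-west2",
}


def group_by_location(dataset_locations):
    """Group dataset names by the location they are stored in."""
    groups = defaultdict(list)
    for dataset, location in dataset_locations.items():
        groups[location].append(dataset)
    return {location: sorted(names) for location, names in groups.items()}


def build_config(gcp_project, location):
    """Build dataform.json content for one single-location Dataform project."""
    return {
        "warehouse": "bigquery",
        "defaultDatabase": gcp_project,  # assumed GCP project ID
        "defaultLocation": location,
    }


if __name__ == "__main__":
    # One Dataform project (and one dataform.json) per distinct location.
    for location, datasets in group_by_location(DATASET_LOCATIONS).items():
        print(f"# Dataform project for {location}: datasets {datasets}")
        print(json.dumps(build_config("my-gcp-project", location), indent=2))
```

Each printed config would go into the dataform.json of its own Dataform project, and the .sqlx files for a dataset would live in the project whose defaultLocation matches that dataset.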