Instead of hardcoding, I’d like to dynamically retrieve the list of tables from a configuration table or INFORMATION_SCHEMA.TABLES and auto-generate these declarations. How can I fetch table names dynamically inside sources.js and use them in declare()?
Would love to hear best practices or workarounds! @ms4446 Any suggestions?
It looks like you are trying to dynamically declare your BigQuery sources in Dataform, retrieving schema and table names programmatically, so that Dataform recognizes new or existing BigQuery tables without manual updates.
Here are the potential ways that might help with your use case:
Query BigQuery INFORMATION_SCHEMA.TABLES: You may want to use the INFORMATION_SCHEMA.TABLES to get a list of tables. In your sources.js file, you can execute a BigQuery query against INFORMATION_SCHEMA.TABLES that retrieves the table names.
Iterate and Declare: After you obtain the table names, you’ll loop through your results and utilize the declare() function to dynamically set up your source declarations for each table.
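To make the iterate-and-declare step concrete, here is a minimal sketch. It assumes the table names are already available as a plain array; how they get there is a separate question. The helper name `declareSources`, the project `my-project`, the dataset `raw_data` and the table names are all hypothetical. `declare()` is Dataform's global function, passed in as a parameter here so the snippet can also run outside Dataform.

```javascript
// Hypothetical helper: declares one Dataform source per table name.
// `declareFn` is expected to be Dataform's global declare(); it is passed in
// so the function can be exercised with a stub outside Dataform.
function declareSources(declareFn, database, schema, tableNames) {
  return tableNames.map((name) => declareFn({ database, schema, name }));
}

// Placeholder list (made-up names); in a real project this would come from
// a config object or a generated file.
const exampleTables = ["orders", "customers", "payments"];

// Inside definitions/sources.js, `declare` exists as a global, so the call
// would look like this. The guard only keeps the snippet runnable elsewhere.
if (typeof declare === "function") {
  declareSources(declare, "my-project", "raw_data", exampleTables);
}
```

Keeping the loop in one small function means the only thing that changes over time is the array of names.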
Note: Ensure that you have the necessary permissions to access tables in BigQuery. To declare a data source in Dataform, you must also have the roles/dataform.editor IAM role on workspaces.
Thank you, @MarvinLlamas , for your detailed response, and sorry for the delayed reply!
I understand the approach of querying INFORMATION_SCHEMA.TABLES to retrieve the table names dynamically. However, the JavaScript in sources.js runs when Dataform compiles the project, so we can’t directly query BigQuery there and store the results in an intermediate variable before iterating over them. Unless I’m missing something, this seems to be a limitation.
Would you happen to have a recommended workaround for dynamically fetching and injecting these declarations into sources.js? Perhaps using an external script or automation process?
You could probably run the INFORMATION_SCHEMA query as part of the build process (assuming the build account has access to BigQuery) and generate or overwrite a config.js that contains the list of tables; the declarations can then be made by iterating over that list. If the list of tables won’t change that often, it still makes sense to define it manually in a JS object so that your code has no side effects (i.e. so that a newly created table can’t unintentionally affect your Dataform runs).
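A rough sketch of that build step, assuming the build job has already run the INFORMATION_SCHEMA.TABLES query and has the rows as JSON (for example from `bq query --format=json`). The function name `renderTablesModule`, the output file name and the sample dataset and table names are assumptions for illustration; `table_schema` and `table_name` are the column names in INFORMATION_SCHEMA.TABLES.

```javascript
// Hypothetical build-step helper: turns INFORMATION_SCHEMA.TABLES rows into
// the source text of a JS module that exports the list of tables.
function renderTablesModule(rows) {
  const tables = rows
    .map((row) => ({ schema: row.table_schema, name: row.table_name }))
    // Sort so the generated file is stable across builds and diffs cleanly.
    .sort((a, b) =>
      `${a.schema}.${a.name}`.localeCompare(`${b.schema}.${b.name}`)
    );
  return (
    "// Generated by the build; do not edit by hand.\n" +
    `module.exports = ${JSON.stringify(tables, null, 2)};\n`
  );
}

// Example rows in the shape the query would return (made-up names).
const sampleRows = [
  { table_schema: "raw_data", table_name: "orders" },
  { table_schema: "raw_data", table_name: "customers" },
];

// A build script would write this string to the generated config file
// (e.g. config.js) before Dataform compiles the project.
console.log(renderTablesModule(sampleRows));
```

Because the generated file is committed or produced before compilation, sources.js only ever reads a static list, and a diff of that file shows exactly which tables were added or removed between builds.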