Best option to implement a Data Dictionary in BigQuery

Hi all,

I am working with BigQuery, Dataform and Dataplex Data Catalog to manage my data and metadata. I have implemented a data dictionary by storing it in a table within the same dataset that I use in Dataform.

The problem is that every time I run Dataform, it recreates the tables in that dataset, and the data dictionary table is deleted along with them. Dataplex Data Catalog handles metadata, but it does not preserve that metadata when the underlying table is deleted, so I need a way to keep the data documentation from being wiped on each run.

What are the best options for keeping a persistent data dictionary in BigQuery that survives Dataform runs? I am also open to any other viable approach.

Many thanks!

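This is what I use the columns{} parameter of the Dataform config block for. The descriptions live in the table definition itself, so Dataform reapplies them every time it recreates the table.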

https://cloud.google.com/dataform/docs/document-tables#add_column_and_record_descriptions
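
For illustration, here is a minimal sketch of what that can look like in a SQLX file. The file name, table, columns, and the raw_customers source are all hypothetical, so adapt them to your own project. It assumes raw_customers is already declared or defined elsewhere in the Dataform repository.

```sqlx
config {
  type: "table",
  description: "One row per customer.",
  columns: {
    customer_id: "Unique identifier of the customer.",
    signup_date: "Date on which the customer first registered.",
    address: {
      description: "Postal address of the customer.",
      columns: {
        city: "City name.",
        country_code: "Two-letter ISO 3166-1 country code."
      }
    }
  }
}

-- Hypothetical example, e.g. saved as definitions/customers.sqlx.
-- raw_customers and all column names are placeholders, not taken from the question.
SELECT
  customer_id,
  signup_date,
  STRUCT(city, country_code) AS address
FROM
  ${ref("raw_customers")}
```

Because the descriptions are part of the definition rather than stored in a separate table, they should be written back to the BigQuery table schema on each run instead of being lost when the table is recreated.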
