Want to move data from native BigQuery to Lakehouse

Folks, newbie here:
Our team is currently exploring approaches for moving data from BigQuery to Lakehouse Iceberg tables, with the goal of enabling querying across multiple engines (such as BigQuery and Spark). Since this is a relatively new offering, we haven’t been able to find as much documentation as we’d like.

I do have a couple of questions regarding the Lakehouse approach:

  • We manage a few large datasets (on the order of 3–4 billion rows per day) with significant historical data stored in BigQuery. If we migrate this data to a standard GCS bucket and connect it to BigQuery through the Lakehouse connector, should we expect any cost savings for query and storage workloads compared to native BigQuery?

  • We currently rely on slot reservations in BigQuery. Would the same query against Lakehouse tables require more slot time, or otherwise impact cost efficiency?

  • From a performance standpoint, what kind of differences should we anticipate, assuming all other factors are comparable?

Additionally, our tables are partitioned by date, and data older than two years is rarely accessed. Does this usage pattern influence the cost or performance trade-offs when considering a Lakehouse architecture? Currently we load the data into native BigQuery using Dataflow/Composer.

I would appreciate it if you could point me to the right forum if this is not the right place.

Thanks again for sharing your insights.
