Datastream granularity explosion when ingesting Cloud SQL Postgres partitioned tables

We use Datastream to mirror (Cloud SQL) Postgres tables in BigQuery. We are hitting an issue when trying to replicate **partitioned** Postgres tables into BQ: our Datastream pipelines never finish backfilling, and the data throughput is 10 or even 100× the total size of the target tables. Is there someone who can help us debug this? We have experimented with our Cloud SQL Postgres setup in accordance with the docs and looked around online, but we haven't found a solution. This problem will be a deal breaker for using Datastream if we can't resolve it.

Handling partitioned tables seems like table stakes for a CDC system, so I'm hoping there is a simple solution.

Hello @Rob_Senseman,

Not knowing exactly what you've already tried, here's what I would attempt:

  • Delete every Datastream-related object:
    • the Datastream stream itself
    • the BigQuery resources
    • the publication and replication slot on PostgreSQL
  • Recreate the publication with publish_via_partition_root=true, as mentioned in the doc you shared (see the SQL sketch just after this list)
  • Create a fresh Datastream stream, targeting (at first) only one table, and set an appropriate max_staleness
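To make the publication step concrete, here is a minimal sketch, assuming logical replication with the pgoutput plugin; the publication, slot, and table names are placeholders for whatever your setup actually uses:

```sql
-- Drop the old publication so nothing keeps publishing at the
-- child-partition level.
DROP PUBLICATION IF EXISTS datastream_publication;

-- Recreate it so changes on child partitions are published under the
-- parent (root) table, which is the table Datastream mirrors into BigQuery.
CREATE PUBLICATION datastream_publication
    FOR TABLE public.my_partitioned_table
    WITH (publish_via_partition_root = true);

-- If you also dropped the replication slot, recreate it before starting
-- the new stream.
SELECT PG_CREATE_LOGICAL_REPLICATION_SLOT('datastream_slot', 'pgoutput');
```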

I think that deleting everything is a good start, to avoid any leftover configuration still being picked up, especially on BigQuery. Also, I remember hitting an issue with runaway throughput last year that turned out to be related to max_staleness.
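For reference, a sketch of setting max_staleness on the BigQuery side; the project, dataset, table names, and the 15-minute interval are placeholders, not recommendations:

```sql
-- Let BigQuery serve results up to 15 minutes stale instead of merging
-- staged CDC changes on every read, which keeps apply costs down.
ALTER TABLE `my_project.my_dataset.my_partitioned_table`
SET OPTIONS (max_staleness = INTERVAL 15 MINUTE);
```

My recollection is that a very low (or unset) max_staleness forces frequent merging of staged changes, which can show up as processed bytes far above the actual table size, so it's worth ruling out before digging deeper.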