Hello everyone!
I could really use some assistance!
I’ve set up a pipeline to copy data from my Cloud SQL (PostgreSQL) instance on GCP to BigQuery. I followed these guides:
- Google Cloud Datastream Documentation (this mentions that PG is supported)
- Reverse Proxy for Cloud SQL Connection
I successfully created a Dataflow template as outlined here: Datastream to BigQuery Template
The pipeline requires two BigQuery datasets: the main dataset and a staging dataset. After getting the pipeline running, I started seeing data being ingested into the staging dataset. The pipeline auto-created the columns and added the _metadata_* fields, but it did not add the _metadata_uuid and _metadata_lsn fields, which is causing the merge step to fail.
I checked the data in the GCS bucket that holds the delta files, and I can see an LSN field present in the metadata JSON there.
Does anyone know why the _metadata_uuid and _metadata_lsn fields were not auto-created in the BigQuery datasets? Any help would be greatly appreciated!
Thanks in advance!
Note: I was able to get the merge step to work by manually adding _metadata_uuid and _metadata_lsn columns, but this feels wrong, especially since I have dozens of tables.
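For anyone hitting the same thing, here is roughly how the manual workaround can be scripted across every table in the dataset instead of clicking through the UI. This is just a sketch using the google-cloud-bigquery Python client; the project/dataset names are placeholders for my setup, and the column types (STRING for the UUID, INT64 for the LSN) are my assumption — check them against the schema the template actually expects before running this.

```python
from typing import List

# Columns the Datastream-to-BigQuery merge step expects for a PostgreSQL
# source but that were not auto-created in my staging/replica tables.
# NOTE: types are assumed (UUID as STRING, LSN as INT64); verify against
# the template's expected schema before applying.
MISSING_COLUMNS = {
    "_metadata_uuid": "STRING",
    "_metadata_lsn": "INT64",
}


def build_alter_statements(project: str, dataset: str, tables: List[str]) -> List[str]:
    """Generate one ALTER TABLE ... ADD COLUMN IF NOT EXISTS statement per table."""
    statements = []
    for table in tables:
        adds = ", ".join(
            f"ADD COLUMN IF NOT EXISTS {name} {col_type}"
            for name, col_type in MISSING_COLUMNS.items()
        )
        statements.append(f"ALTER TABLE `{project}.{dataset}.{table}` {adds}")
    return statements


if __name__ == "__main__":
    # Hypothetical usage; "my-project" and "my_staging_dataset" are placeholders.
    from google.cloud import bigquery

    client = bigquery.Client()
    tables = [t.table_id for t in client.list_tables("my-project.my_staging_dataset")]
    for ddl in build_alter_statements("my-project", "my_staging_dataset", tables):
        client.query(ddl).result()  # run each ALTER TABLE and wait for completion
```

The IF NOT EXISTS clause makes the script safe to re-run, so it can also be pointed at the main dataset if the columns are missing there too.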
Actual error message the job produces (for every table):
Merge Job Failed With BigQuery Exception: com.google.cloud.bigquery.BigQueryException: Query error: Unrecognized name: _metadata_lsn; Did you mean _metadata_ssn? at [1:627] Statement: BEGIN BEGIN TRANSACTION; MERGE …
The _metadata fields that were auto-created are:
"_metadata_timestamp",
"_metadata_read_timestamp",
"_metadata_read_method",
"_metadata_source_type",
"_metadata_deleted",
"_metadata_change_type",
"_metadata_row_id",
"_metadata_scn", <---- THIS IS AN ORACLE FIELD?
"_metadata_ssn",
"_metadata_rs_id"