Conditional Incremental Merge

I have an incremental table in Dataform and need to update the merge statement to make it conditional - if ID is matched update everything BUT Column 1; if ID is not matched insert row.

Please note, there is another condition - when rebuild is incremental the source is data_increment, when we run full refresh the source is data.

The simplified code below works on manual run, but fails on incremental execution, because another merge statement gets wrapped around the conditional one in sqlx:

config {
type: “incremental”,
uniqueKey: [“ID”]
}

${when(incremental(),
`MERGE INTO ${self()} AS t
USING (SELECT * FROM ${ref(“data_increment”)}) AS s
ON t.ID = s.ID

    -- Condition 1: When the ID matches AND the Column2 has changed,
    -- update all fields except Column1 (retain original) and increment the Times_Updated counter.
    WHEN MATCHED 
	AND t.Column2 != s.Column2 
	THEN UPDATE SET
        t.Column2 = s.Column2,
        t.Date = s.Date,
        t.Times_Updated = t.Times_Updated + 1

    -- Condition 2: When the ID matches but the Column2 has NOT changed,
    -- only update other fields like Date (in case the modified date has changed)
    WHEN MATCHED 
	THEN UPDATE SET t.Date = s.Date 
        
    WHEN NOT MATCHED 
	THEN
    INSERT ROW`,

    //when not incremental
    `SELECT * FROM ${ref("data")} WHERE Column1 <> Column2`

}

I saw a few other topics with similar problems, but they did not answer my two main questions:

  1. How do I exclude a column from MERGE if ID already exists (insert all columns if it doesn’t)

  2. How do I retain the capability of full refresh with the conditional source when I modify the merge statement?

Any help is appreciated.

Hi ElenaG,

You can refer to this related case from the GDPF forum that might help you to resolve your issue.

To answer your questions:

1. How do I exclude a column from MERGE if ID already exists (insert all columns if it doesn’t).

  • To prevent a specific column from being updated when a matching ID is found, omit that column from the ‘UPDATE SET’ clause in your ‘MERGE. statement. If the ID is not present, you can insert all columns using the ‘WHEN NOT MATCHED THEN INSERT’ clause.

2. How do I retain the capability of full refresh with the conditional source when I modify the merge statement?

  • As of now, Google Cloud’s official documentation doesn’t directly address this specific scenario. However, the behavior you’re describing is commonly discussed in community forums and GitHub threads, where users share practical workarounds and examples. If you’re looking for guidance, I recommend checking out this page, which covers similar use cases in detail.

For more information, you can check the following documentation below: