@ms4446 @maffka123
Hello, I have a similar to this request :
The logic of the flow :
There are three data transformation steps :
- Receiving raw data into tables raw_table1, raw_table2 tables
- Transforming raw data and inserting the result into trans_table1, trans_table2 tables
- Aggregating transformed data into agg_table
When the raw data from raw_table1 / raw_table2 arrives, we transform this data and inserting it into trans_table1, trans_table2
(this is done by executing the actions trans_table1.sqlx and trans_table2.sqlx). The data from both tables : raw_table1 and raw_table2 arrives at different time, that means that we cannot schedule the actions, but instead, we need to trigger an update automatically upon the data arrival (this is just to say that both actions trans_table1.sqlx and trans_table2.sqlx are executed independently of each other). Also, there is no order of data arrival : the data for raw_table2 can arrive earlier than raw_table1 or backwards. Then there is agg_table with aggregated data, which depends on these two tables (trans_table1 and trans_table2. Agg_table needs to be updated as soon as both trans_table1.sqlx and trans_table2.sqlx are executed (in other words, when the last among the two actions finished successfully its execution).
My question is the following :
How can I execute the action agg_table.sqlx only after making sure both actions trans_table1.sqlx and trans_table2.sqlx were executed successfully ? If I just mention trans_table1 and trans_table2 in the dependencies of action agg_table.sqlx, while executing the actions trans_table1.sqlx and trans_table2.sqlx, agg_table.sqlx will never be executed.
HERE IS THE CODE :
// just to make it more simple, I don’t insert the data from raw_tables but instead insert the values by //hand, the logic stays the same
trans_table1.sqlx :
config {
type: “operations”,
}
TRUNCATE TABLE dataformtraining2024.transformed.trans_table1
;
INSERT INTO dataformtraining2024.transformed.trans_table1
VALUES
(“Steven”, “Spielberg”, “Schindler’s list”),
(“Steven”, “Spielberg”, “Jurassic Park”)
trans_table2.sqlx :
config {
type: “operations”,
}
TRUNCATE TABLE dataformtraining2024.transformed.trans_table2
;
INSERT INTO dataformtraining2024.transformed.trans_table2
VALUES
(“Steven”, “Spielberg”, “Schindler’s list”),
(“Steven”, “Spielberg”, “Jurassic Park”)
agg_table.sqlx
config {
type: “operations”,
dependencies : [“trans_table1.sqlx”, “trans_table2.sqlx”] // mention all dependencies
}
INSERT INTO dataformtraining2024.aggregated.agg_table
SELECT *
FROM dataformtraining2024.transformed.trans_table1
UNION ALL
SELECT *
FROM dataformtraining2024.transformed.trans_table2