I have a table with invoice with a date, net_total and tag-column. the tag column contains a JSON array (but could be converted to comma-separated list as well).
I want to show all invoices grouped by month dimension with its sum and do not need to show the tags. But I want to have a “tag” filter. So all available tags should be visible in the tags dropdown and as soon as I choose one or multiple of those tags, the table should show only invoices having at least this tag.
Problem: when I unpack the list in the invoice table for looker studio, all invoice-tag combinations show up and I could filter by tag. But I need to group it by date and select distinct by invoice-id so invoices do not occur and are not summed up multiple times with multiple valid tags. also, without a tag selected, an invoice should for sure only appear once.
Any hints for me?
1 Like
Hi @sven.nussbaum In Looker Studio the difficulty comes from exploding the JSON array, which multiplies rows and therefore inflates your sums. The key is to separate filtering logic from aggregation logic.
A common approach is:
-
In the data source (BigQuery is ideal), create a derived table that unnests the tags array into one row per invoice–tag combination.
-
Use this exploded table only for the tag filter control.
-
For the main table (grouped by month with SUM(net_total)), use the original invoice-level table (one row per invoice).
-
Blend the two sources in Looker Studio on invoice_id. Apply the tag filter to the blended data so it filters invoices, but keep aggregation based on distinct invoice_id from the base table.
If you cannot use blending, another option in BigQuery is to pre-aggregate first (GROUP BY invoice_id, month, net_total) and then use a WHERE EXISTS clause to filter invoices by selected tags before aggregation. That way each invoice is counted once.
The important principle is: filter at the tag level, aggregate at the invoice level. Avoid summing on the unnested dataset directly, or you will always get duplicated totals.
HI @a_aleinikov, thank you very much so far.
Where I struggled is your 4. point. I have a table-display in looker studio having the dimension date (year,month) and the net_total as measurement. How can I have this table filtered by the selected tag? I cannot “aggregate” by invoice_id in the report table without selecting invoice_id in the dimension, or am I missing something?
I have the tag-datasource like this
SELECT
id,
`date`,
JSON_UNQUOTE(jt.tg) as tagname
FROM
invoices,
JSON_TABLE(tags, '$[*]' COLUMNS (tg JSON PATH '$')) AS jt;
And the invoice data source like
SELECT
i.*
FROM
invoices i;
And I have the blended data source Invoices left outer joined invoicetags on id.
I have a filter dropdown using the invoicetags-table and showing the tagname and a table using “invoices” table with “date” as dimension and net_total as measurement-column.