I am in trouble because the numbers do not match when comparing the data in GA4 and exported BQ. If you have any clues, I would appreciate it if you could let me know.
【premise】
Im running e-commerce site, and using GA4. Also, I set the export from GA4 to BigQuery using GA4 export function. (https://support.google.com/analytics/answer/9358801)
【overview】
Using GA4’s search tool, we created a report that counts the number of sessions for each referrer. Next, in BigQuery to which the GA4 data was exported, the number of sessions was aggregated for each session referrer. Comparing the GA4 report with the BQ results, the number of sessions for a particular session referrer is significantly lower in the BQ results.
【detail】
When aggregating the number of sessions for each referrer of sessions in BigQuery, the key of event_params was grouped by the string_value of source.
[SQL on BQ]
select source ,count(distinct session_id) ,sum(ss_session_start)
from(
select
CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'source') as STRING) as source
,CONCAT(user_pseudo_id,"-", CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') as STRING)) as session_id
from XXXXXXXX.events_*ga
where 1=1 and _table_suffix = "YYYYMMDD"
)
group by source
order by source
[things I tried]
・Depending on the referrer of the session, the figures were almost the same. For example, BQ app is very different, but insta is almost the same.
・The total number of sessions is almost the same, and there are a large number of null session referrers. → Therefore, in GA4, what is classified as app etc. is null on BQ? I’m guessing
・I tried to aggregate the page_location with the utm_source extracted by myself using REGEX, but the same tendency was observed.