Big Query SQL GA4 Data

Hi everyone and infront thanks for your time and help… I have a questions regarding Google Analytics Big Query Data. I want to analyse Google Analytics(4) Data based on different user Segments. I have a raw data table with user_id, event, timestamp/date. Now I want to build three different user segments:

  • Users who triggered event1 then event2 (based on the timestamp)
  • Users who triggered event1 <10 minutes after event2
  • Users who triggered event1 directly after event2
    The raw data is event scoped, so every event is one row and its a normal relational database.

I can query for users who had event 1 and 2, but I don’t know how to query for the different order-szenarios shown above.

The basic query for users with both events would be:
Select
user_pseudo_id
from >table name<
where event in (event1,event2)

Many thanks for your support, Rapid_SQL:)

Hi @Rapid_SQL ,

Welcome to Google Cloud Community.

You can use the following SQL queries to look for the various order situations you described based on the timestamp:

Individuals that started event1 and subsequently event2:

SELECT t1.user_pseudo_id
FROM `table_name` t1
JOIN `table_name` t2 ON t1.user_pseudo_id = t2.user_pseudo_id
WHERE t1.event = 'event1'
AND t2.event = 'event2'
AND t1.timestamp < t2.timestamp

Event1 was started by users 10 minutes after event2:

SELECT t1.user_pseudo_id
FROM `table_name` t1
JOIN `table_name` t2 ON t1.user_pseudo_id = t2.user_pseudo_id
WHERE t1.event = 'event1'
AND t2.event = 'event2'
AND TIMESTAMP_DIFF(t1.timestamp, t2.timestamp, MINUTE) < 10

Users who immediately following event2 initiated event1:

SELECT t1.user_pseudo_id
FROM `table_name` t1
JOIN `table_name` t2 ON t1.user_pseudo_id = t2.user_pseudo_id
WHERE t1.event = 'event1'
AND t2.event = 'event2'
AND TIMESTAMP_DIFF(t1.timestamp, t2.timestamp, SECOND) = 0

These queries produce the user_pseudo_id for the chosen order scenario by self-joining the timestamps of events 1 and 2 for each user and consider that event1 and event2 should be changed to the actual event names you are using in your data, and table_name should be changed to the name of your BigQuery raw data table.