GBQ: Finding the leading page

Google analytics premium account or 360 suite give you access to raw data on hits level. You can track user journey on most granular label, you could imagine. But with great data comes great headache of analysis. And the medicine is GBQ.
Google big query is a server-less data warehouse tool designed by google. BigQuery stores data in SQL query-able tables and allows for powerful and comprehensive analytics. Simple analysis can be carried out in GA using customize report or even built in report but now and than you may want report that cant easily be built in GA. You can connect your GA and GBQ and have access to raw data.
Detail of  GBQ schema for GA sessions data is here . Many fields in this schema are only populated if GBQ and GA are properly configure in your website. So please contact IT team.

How GA sessions data are stored in GBQ.
Sources: https://www.lunametrics.com

One of the analysis I do frequently is tracking the user journey. Lets say you have a banner in your homepage and you want to see how many people land on homepage and than click on the banner. This can be done in GA itself but this is simple example we cant built more complex example making this as base like how long does it take for people to click on that banner?
This can be done for any leading action not just pages. May be you have events setup and you want to track leading/lagging events.

We will be using following function on our code, so please check the google document, if you don't know what they do.
1. TABLE_DATE_RANGE
2. LEAD
3. PARTITION BY
4. REGEXP_MATCH

STEP 1:
Prepare the table with required field and order:

STEP 2:
Get the leading page/item from first table, this is where you cant added any addition columns you want to added, may be 3 leading page or leading page hits time.

STEP 3:
We filter and summarized the data to get information. Here I want to track the traffic to certain page, if they originate from homepage and on daily basis.

This is the power of big query. This was very simple example, you can added multiple lead/lag or multiple fields.

But be warned big query is not free, they charge you per TB of data processing, not export made. Data storage has separate cost.
Cost per 1 TB data processing is 5$ after free tier of 1 TB. Some single query can easily take 10 GB data processing. So be careful with your queries. Always test on simple sample of one day or one week ga session table before running query on whole table.
Check bigquery/pricing for detail.

Quote from book I am reading.
"Isocrates: "Democracy destroys itself because it abuses its right to freedom and equality. Because it teaches its citizens to consider audacity as a right, lawlessness as a freedom, abrasive speech as equality, and anarchy as progress."
― Michael Lewis, Boomerang: Travels in the New Third World

Unknown

I'm a Data-holic. I'm passionate about learning new things and playing with data. This blog is a place where I want to share my data analysis tips and tricks, data visualization and automation tips and tricks.

No comments:

Post a Comment