BigQuery for marketers who don't write SQL
You don't need to become a data engineer. You need five queries. Here's how to get real answers from your GA4 data for $0-10/month.
Last month I watched a marketing director copy numbers from GA4 into a spreadsheet for three hours. She was trying to figure out which landing pages actually drove revenue. The GA4 interface kept timing out, showing sampled data, and hiding the one dimension she needed behind a “not set” label.
I showed her a BigQuery query. Twelve lines of SQL. It ran in four seconds and gave her exact numbers, unsampled, going back fourteen months.
She looked at me like I’d performed a magic trick.
It’s not magic. It’s just that GA4 keeps your data in a warehouse you already have access to, and nobody told you how to use it.
Your GA4 data is lying by omission
Here’s something Google doesn’t advertise: the GA4 interface samples your data once you cross certain thresholds. For a site doing 500K sessions a month, you’re often looking at estimates, not actuals. The data-quality icon at the top of each report tells you whether the numbers are sampled. Most people never notice it.
Beyond sampling, GA4 has hard limits. You can’t easily combine certain dimensions. You can’t look at raw event sequences. You can’t do proper cohort analysis without losing your mind clicking through menus. And those “insights” Google auto-generates? They’re optimized for Google’s interests, not yours.
BigQuery gives you the raw events. Every single one. No sampling, no limits on dimensions, no mysterious aggregation.
What BigQuery costs (less than your coffee)
Google gives you 1TB of free queries per month and 10GB of free storage. For a site with 1-5 million sessions per month, the GA4 export typically uses 2-8GB of storage. Your monthly bill will be somewhere between $0 and $10.
I’m not exaggerating. I have clients doing 3 million monthly sessions whose BigQuery bill is $4.27/month. The most expensive client I’ve seen pays around $35/month, and they have 20 million sessions with a year of historical data.
Setting up the export takes about ten minutes. Go to GA4 Admin > BigQuery Links > Link. Choose daily export (streaming export costs more and you don’t need it). Pick your Google Cloud project. Done.
One thing to know: BigQuery charges by the amount of data your query scans, not by the results returned. So a query that scans your entire events table costs more than one that filters to a specific date range. Always include a date filter. I’ll show you how in every query below.
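The console also protects you here: before you run anything, it estimates how much data the query will scan. Here's a minimal sketch of a properly filtered query (the project and dataset names are placeholders, as in every query in this article) — it scans two days of events instead of the whole table:

```sql
-- Scans only two days of events thanks to the _TABLE_SUFFIX filter.
-- Replace the placeholder with your own project and dataset.
SELECT
  event_date,
  COUNT(*) AS events
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260302'
GROUP BY event_date;
```

Delete the _TABLE_SUFFIX line and the same query scans every day of data you've ever exported — same results for those two days, many times the cost.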
Five queries you can steal right now
I’m going to give you five queries that answer questions GA4 makes unnecessarily hard. Copy them. Paste them into the BigQuery console at console.cloud.google.com/bigquery. Replace your-project.analytics_XXXXXXX with your actual dataset name (you’ll see it in the left sidebar).
You don’t need to understand every line. You need to understand what the query does and how to change the dates.
1. Users who viewed product A then bought product B
This is cross-sell analysis that GA4 simply cannot do in the interface. (Make sure your GA4 ecommerce tracking is set up correctly first, or these queries will return garbage. Also note that the query matches viewers and buyers anywhere in the date window without enforcing order; if you need strict viewed-then-bought sequencing, compare event timestamps too.)
WITH product_viewers AS (
SELECT
user_pseudo_id,
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'item_name') AS viewed_product
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE event_name = 'view_item'
AND _TABLE_SUFFIX BETWEEN '20260101' AND '20260331'
),
purchasers AS (
SELECT
user_pseudo_id,
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'item_name') AS purchased_product
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE event_name = 'purchase'
AND _TABLE_SUFFIX BETWEEN '20260101' AND '20260331'
)
SELECT
pv.viewed_product,
pu.purchased_product,
COUNT(DISTINCT pv.user_pseudo_id) AS users
FROM product_viewers pv
JOIN purchasers pu ON pv.user_pseudo_id = pu.user_pseudo_id
WHERE pv.viewed_product != pu.purchased_product
GROUP BY 1, 2
ORDER BY users DESC
LIMIT 50;
I ran this for a fashion retailer last year. Turns out people who viewed their premium leather jackets almost never bought them, but 23% of those viewers bought a specific mid-range jacket within two weeks. That changed their entire remarketing strategy.
2. Actual time to conversion by channel
GA4’s conversion paths report is a mess. This query tells you the median days between first visit and purchase, broken down by the channel that brought them in.
WITH first_visits AS (
SELECT
user_pseudo_id,
traffic_source.medium AS first_medium,
traffic_source.source AS first_source,
MIN(PARSE_DATE('%Y%m%d', event_date)) AS first_visit_date
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20250401' AND '20260331'
GROUP BY 1, 2, 3
),
conversions AS (
SELECT
user_pseudo_id,
MIN(PARSE_DATE('%Y%m%d', event_date)) AS conversion_date
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE event_name = 'purchase'
AND _TABLE_SUFFIX BETWEEN '20250401' AND '20260331'
GROUP BY 1
)
SELECT
fv.first_medium,
fv.first_source,
COUNT(*) AS conversions,
APPROX_QUANTILES(DATE_DIFF(c.conversion_date, fv.first_visit_date, DAY), 100)[OFFSET(50)] AS median_days_to_convert
FROM first_visits fv
JOIN conversions c ON fv.user_pseudo_id = c.user_pseudo_id
GROUP BY 1, 2
HAVING conversions >= 10
ORDER BY conversions DESC;
This one always surprises people. A B2B SaaS client was convinced their Google Ads drove fast conversions. The data showed a median of 34 days from first click to purchase. Their organic traffic? 12 days. They were giving paid search credit for conversions that organic had been nurturing for weeks.
Want BigQuery set up without the learning curve? I'll configure your GA4 export, write your first queries, and get you unsampled data in an afternoon.
Book a Free Audit →
3. Unsampled funnel analysis
Want to know the real drop-off between steps in your checkout? GA4 funnels sample aggressively on larger sites. This doesn’t.
SELECT
event_name,
COUNT(DISTINCT user_pseudo_id) AS users,
ROUND(COUNT(DISTINCT user_pseudo_id) * 100.0 /
FIRST_VALUE(COUNT(DISTINCT user_pseudo_id)) OVER (ORDER BY
CASE event_name
WHEN 'view_item' THEN 1
WHEN 'add_to_cart' THEN 2
WHEN 'begin_checkout' THEN 3
WHEN 'add_shipping_info' THEN 4
WHEN 'add_payment_info' THEN 5
WHEN 'purchase' THEN 6
END
), 1) AS pct_of_top
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260331'
AND event_name IN ('view_item', 'add_to_cart', 'begin_checkout',
'add_shipping_info', 'add_payment_info', 'purchase')
GROUP BY event_name
ORDER BY
CASE event_name
WHEN 'view_item' THEN 1
WHEN 'add_to_cart' THEN 2
WHEN 'begin_checkout' THEN 3
WHEN 'add_shipping_info' THEN 4
WHEN 'add_payment_info' THEN 5
WHEN 'purchase' THEN 6
END;
Change the event names to match your funnel steps. The key thing here is that pct_of_top gives you the percentage relative to the first step, so you can see exactly where people fall off.
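Not sure what your event names actually are? This quick query lists everything your property collects, so you can map real names to funnel steps before editing the query above:

```sql
-- List every event name in the last month, most frequent first.
SELECT
  event_name,
  COUNT(*) AS events
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260331'
GROUP BY event_name
ORDER BY events DESC;
```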
4. Revenue by landing page (properly attributed)
GA4’s landing page report exists, but connecting it to revenue is surprisingly clunky. This gives you a clean picture.
WITH sessions AS (
SELECT
user_pseudo_id,
(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
-- first page_view in the session, by timestamp = the landing page
ARRAY_AGG(
IF(event_name = 'page_view',
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'),
NULL) IGNORE NULLS
ORDER BY event_timestamp
LIMIT 1
)[SAFE_OFFSET(0)] AS landing_page,
SUM(CASE WHEN event_name = 'purchase' THEN ecommerce.purchase_revenue ELSE 0 END) AS revenue,
MAX(CASE WHEN event_name = 'purchase' THEN 1 ELSE 0 END) AS converted
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260331'
GROUP BY 1, 2
)
SELECT
NET.REG_DOMAIN(landing_page) AS domain,
REGEXP_EXTRACT(landing_page, r'https?://[^/]+(\/[^?#]*)') AS path,
COUNT(*) AS sessions,
SUM(converted) AS conversions,
ROUND(SUM(converted) * 100.0 / COUNT(*), 2) AS conversion_rate,
ROUND(SUM(revenue), 2) AS total_revenue,
ROUND(SUM(revenue) / COUNT(*), 2) AS revenue_per_session
FROM sessions
WHERE landing_page IS NOT NULL
GROUP BY 1, 2
HAVING sessions >= 20
ORDER BY total_revenue DESC
LIMIT 100;
I had a client spending €15K/month on content marketing. This query showed that 3 blog posts out of 200 drove 61% of their organic revenue. The other 197 posts? Almost zero revenue. They didn’t cut the content program, but they sure changed what they wrote about.
5. Session-level user journey
This one is my favorite. It shows you the actual page sequence for each session, so you can see how people move through your site.
WITH session_pages AS (
SELECT
user_pseudo_id,
(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
event_timestamp,
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_title') AS page_title,
REGEXP_EXTRACT(
(SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'),
r'https?://[^/]+(\/[^?#]*)'
) AS page_path
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE event_name = 'page_view'
AND _TABLE_SUFFIX BETWEEN '20260301' AND '20260331'
)
SELECT
user_pseudo_id,
session_id,
COUNT(*) AS pages_viewed,
STRING_AGG(page_path, ' → ' ORDER BY event_timestamp) AS journey
FROM session_pages
GROUP BY 1, 2
HAVING pages_viewed >= 3
ORDER BY pages_viewed DESC
LIMIT 200;
Reading actual user journeys is humbling. You think people follow your carefully designed navigation. They don’t. They bounce between product pages, go back to the homepage twice, visit your shipping policy four times, and then convert from a page you forgot existed.
A few things that will trip you up
The UNNEST thing. GA4’s BigQuery schema stores event parameters as nested arrays. That UNNEST(event_params) pattern is how you dig into them. You’ll use it in almost every query. It looks weird. Just accept it and copy the pattern.
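Here's the pattern in isolation, stripped down to one parameter so you can see the shape before it's buried inside a bigger query:

```sql
-- Dig one parameter (page_location) out of the nested event_params array.
SELECT
  event_name,
  (SELECT value.string_value
   FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page_location
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260301'
LIMIT 10;
```

String parameters live in value.string_value; numeric ones live in value.int_value or value.double_value.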
Date suffixes. Tables are sharded by date with the format events_YYYYMMDD. The _TABLE_SUFFIX filter is how you pick date ranges without scanning everything. Always use it. Forgetting it is the fastest way to blow past your free tier.
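If you'd rather not hard-code dates, you can build the suffix strings with FORMAT_DATE. This sketch covers the last 30 full days; check the console's bytes-processed estimate to confirm the filter prunes the way you expect:

```sql
-- A rolling 30-day window; no hard-coded dates to forget to update.
SELECT COUNT(*) AS events
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN
  FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
  AND FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY));
```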
User identity. user_pseudo_id is a device-level identifier. If someone visits on their phone and their laptop, that’s two user_pseudo_ids. You can set a user_id parameter if you have logged-in users, but most sites don’t.
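You can check how much this matters for your site. This counts device-level IDs against logged-in IDs over a month:

```sql
-- Compare device-level IDs with logged-in user IDs for one month.
SELECT
  COUNT(DISTINCT user_pseudo_id) AS devices,
  COUNT(DISTINCT user_id) AS logged_in_users
FROM `your-project.analytics_XXXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20260301' AND '20260331';
```

If logged_in_users comes back zero, you're not sending user_id, and every query in this article is counting devices, not people.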
The schema is documented. Google actually has decent documentation at support.google.com/analytics under the BigQuery export schema section. Bookmark it. You’ll reference it often in the first few weeks.
Frequently asked questions
Q: How much does BigQuery cost for GA4 data?
Google provides 1TB of free queries per month and 10GB of free storage. For a site with 1-5 million monthly sessions, the GA4 export typically uses 2-8GB of storage, resulting in a monthly bill between $0 and $10. Always include a date filter in queries to avoid scanning unnecessary data and exceeding the free tier.
Q: How do I connect GA4 to BigQuery?
Go to GA4 Admin, click BigQuery Links, then Link. Choose daily export (streaming export costs more and is rarely needed), select your Google Cloud project, and confirm. The setup takes about ten minutes. Data will start appearing in BigQuery within 24 hours.
Q: Do I need to know SQL to use BigQuery with GA4?
You do not need to become a data engineer. Five to ten well-structured queries can answer the most common marketing questions that GA4 makes difficult. Copy and paste queries, change the date ranges and dataset name, and you will get unsampled results in seconds.
Q: Why does GA4 show different numbers than BigQuery?
GA4’s interface samples your data once you cross certain traffic thresholds, showing estimates rather than exact counts. BigQuery contains every raw event without sampling or aggregation limits. The BigQuery numbers are the source of truth and will often differ from what the GA4 interface reports.
When to graduate to a real data stack
These five queries will carry you far. Probably six months to a year. But at some point you’ll notice a few things.
Your queries are getting copy-pasted and slightly modified everywhere. You have fifteen versions of the “sessions” CTE floating around. Someone accidentally changed a filter and got wrong numbers for a month.
That’s when you want dbt (data build tool), the foundation of a proper marketing data warehouse. It lets you define your transformations as version-controlled SQL models that build on each other. Your “sessions” definition lives in one place. Change it once, everything downstream updates.
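To make that concrete, here's a hedged sketch of what a dbt model looks like — the file name and source name are assumptions for illustration, not something your project ships with:

```sql
-- models/stg_sessions.sql (hypothetical file name)
-- One version-controlled definition of "a session".
-- Downstream models reference it with {{ ref('stg_sessions') }}.
SELECT
  user_pseudo_id,
  (SELECT value.int_value
   FROM UNNEST(event_params)
   WHERE key = 'ga_session_id') AS session_id,
  MIN(TIMESTAMP_MICROS(event_timestamp)) AS session_start
FROM {{ source('ga4', 'events') }}  -- source name is an assumption
GROUP BY 1, 2
```

Change this one file, rerun dbt, and every model and dashboard built on top of it picks up the new definition.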
Pair that with Looker Studio (free) or Metabase (also free, self-hosted) for dashboards that people actually use, and you have a proper analytics stack that costs maybe $20/month total.
But don’t start there. Start with these five queries. Get comfortable reading raw event data. Understand what your GA4 actually captures versus what you think it captures. That understanding is worth more than any dashboard.
The gap between “I use GA4” and “I can query my own data” is smaller than you think. It’s five queries and an afternoon. The gap between querying your own data and making better decisions is even smaller.
Go open BigQuery. Paste in query number four. See which landing pages actually make money.
Then tell me the results weren’t worth the twenty minutes it took to get here.
Artem Reiter
Web Analytics Consultant