Recently, I ran into a calculation I had never seen before which is called gross margin (profit) analysis using matching pairs, or “Like for Like” analysis. In case you are not a financial controller, here is an explanation of what it is:
The Business Question
You’ve probably looked at your total sales and total costs to figure out your profit (or gross margin). It’s a great starting point! But what if your profit changes from one month to the next? Just knowing the final number doesn’t tell you the whole story.
Imagine you run an ecommerce store. Last quarter, you made a profit of $100,000. This quarter, you made a profit of $120,000. That’s great! But *why* did your profit go up by $20,000?
* Did you just sell more of everything?
* Did you increase your prices?
* Did you sell more of your fancy, high-profit cakes and fewer of your low-profit cookies?
Answering these questions is what Gross Margin Variance Analysis (also known as Gross Profit Bridge Analysis or Margin Waterfall Analysis) is all about. It’s a way of breaking down the change in your profit into simple, understandable pieces using a like-for-like comparison approach.
The Three Main Variance Effects: Like-for-Like Analysis
Instead of just looking at the total profit, we break it down into three main variance “effects” using like-for-like product comparisons:
1. The Volume Variance Effect (Sales Volume Analysis): This tells you how much of your profit change is due to selling more or fewer items for the same products in both periods (like-for-like). If you sold 1,000 more t-shirts this quarter than last quarter, the Volume Variance will show you how much extra profit you made just from that increase in quantity.
2. The Price Variance Effect (Price Realization Analysis): This tells you how much of your profit change is due to changing your prices on like-for-like products. If you increased the price of your jeans by $5 this quarter, the Price Variance will show you how much extra profit you made from that price change across comparable products.
3. The Product Mix Variance Effect (Mix Shift Analysis): This is the most interesting one. It tells you how much of your profit change is due to a shift in what you’re selling among comparable products. Let’s say you sold the same total number of items this quarter as last quarter, and you didn’t change your prices. But this quarter, you sold a lot more of your high-profit sneakers and a lot fewer of your low-profit socks. Your overall profit would go up, and the Product Mix Variance would show you exactly how much of that increase was due to selling a more profitable mix of products in your like-for-like analysis.
What About New or Discontinued Products? (Non-Like-for-Like Items)
We also have a special category called “Non-Matching Products Variance” (or “Non-Like-for-Like Items”). It’s the profit you made from products you sold this quarter but didn’t sell last quarter (like a new line of winter jackets). It also includes the profit you *lost* from products you sold last quarter but didn’t sell this quarter (like last season’s swimwear). These items fall outside the like-for-like comparison because they don’t exist in both periods.
Why Is Variance Analysis Better Than Just Looking at Total Profit?
By breaking down your profit change into these variance effects using like-for-like product comparisons, you can get a much clearer picture of what’s really driving your business. You can answer questions like:
* “Is our new pricing strategy working?” (Look at the Price Variance)
* “Are our marketing campaigns leading to more sales?” (Look at the Volume Variance)
* “Are our customers starting to prefer our more profitable products?” (Look at the Product Mix Variance)
* “How are new product launches affecting profitability?” (Look at the Non-Like-for-Like Variance)
This kind of gross margin bridge analysis helps you understand the story behind the numbers, so you can make smarter decisions to grow your business.
The reason I ran into this was a company who is using tools like SAP Analytics which requires using aggregation exceptions. While this approach works, it comes with significant limitations:
1. Performance Issues: These calculations are slow because they rely heavily on the system’s calculation engine and proprietary formula syntax, which can create bottlenecks when analyzing large datasets.
2. Maintenance Challenges: Changes to calculations are made through the UI rather than through code. While version history exists, it isn’t tied to standard development workflows (like Git), making it difficult to track changes, collaborate, and maintain consistency across environments.
These limitations make it challenging to iterate quickly on business logic and provide a flexible, performant solution for understanding profit drivers.
-–
How We Built This: A Technical Deep Dive
At its core, this analysis is powered by a dynamic derived table in Looker. A derived table is essentially a custom SQL query that creates a virtual table on-the-fly. Think of it as a sophisticated calculation engine that runs in your data warehouse but is defined and managed within Looker. It is also dynamic as its calculation depends on the input date parameters entered by the end user.
Here’s how we constructed the analysis step-by-step:
STEP 1: base_data - Gathering the Raw Ingredients
WITH base_data AS (
SELECT
ii.product_id, p.category AS product_category, oi.sale_price, ii.cost, oi.returned_at, ({% condition dates_period_1 %} oi.created_at {% endcondition %}) AS is_period_1, ({% condition dates_period_2 %} oi.created_at {% endcondition %}) AS is_period_2
FROM order_items AS oiLEFT JOIN inventory_items AS ii ON oi.inventory_item_id = ii.id
LEFT JOIN products AS p ON ii.product_id = p.id
WHERE
({% condition dates_period_1 %} oi.created_at {% endcondition %} OR {% condition dates_period_2 %} oi.created_at {% endcondition %})
)
What’s happening here?
- We’re joining three core tables: `order_items` (sales transactions), `inventory_items` (cost data), and `products` (product details)
- The `{% condition dates_period_1 %}` and `{% condition dates_period_2 %}` are Looker templated filters - they allow users to dynamically select their comparison periods through the UI without writing SQL
- We create boolean flags (`is_period_1` and `is_period_2`) to tag each transaction with which period it belongs to
- We filter to only include transactions from the two selected periods, making our query more efficient
Why this matters: This CTE establishes our foundation by bringing together all the necessary data points (price, cost, product info) and clearly marking which period each transaction belongs to.
STEP 2: aggregated_data - Rolling Up to Product Level
aggregated_data AS (
SELECT
product_id, product_category, SUM(CASE WHEN is_period_1 THEN CASE WHEN returned_at IS NULL THEN 1 ELSE -1 END ELSE 0 END) as quantity_c1, SUM(CASE WHEN is_period_2 THEN CASE WHEN returned_at IS NULL THEN 1 ELSE -1 END ELSE 0 END) as quantity_c2, SUM(CASE WHEN is_period_1 THEN sale_price ELSE 0 END) as revenue_c1, SUM(CASE WHEN is_period_2 THEN sale_price ELSE 0 END) as revenue_c2, SUM(CASE WHEN is_period_1 THEN cost ELSE 0 END) as cost_c1, SUM(CASE WHEN is_period_2 THEN cost ELSE 0 END) as cost_c2
FROM base_data
GROUP BY 1, 2
)
What’s happening here?
- We’re aggregating all transactions to the **product level** (each row represents one product)
- For each period, we calculate:
- quantity: Number of units sold (with returned items counted as -1 to properly account for returns)
- revenue: Total sales revenue
- cost: Total cost of goods sold
- The suffix `_c1` means “cohort 1” (Period 1) and `_c2` means “cohort 2” (Period 2)
Why this matters: We’re moving from individual transactions to summarized product-level metrics for each period. This aggregation is crucial because our variance analysis compares products between periods, not individual transactions.
STEP 3: matching_product - Identifying Apples-to-Apples Comparisons
matching_product AS (
SELECT
product_id, (CASE WHEN SUM(quantity_c1) > 0 AND SUM(quantity_c2) > 0 THEN 1 ELSE 0 END) as matching_product
FROM aggregated_data
GROUP BY 1
)
What’s happening here?
- We identify products that were sold in **both periods** (quantity > 0 in both periods)
- These are our “matching pairs” - the products we can fairly compare between periods
- Products sold in only one period are flagged with `matching_product = 0`
Why this matters: This is a critical step for accuracy. You can’t calculate a “price effect” for a product that didn’t exist in Period 1, and you can’t calculate a “volume effect” for a product that was discontinued. By separating matching and non-matching products, we ensure our variance calculations are mathematically sound.
STEP 4: data_with_matching_product - Enriching Our Dataset
data_with_matching_product AS (
SELECT
agg.\*,
COALESCE(mp.matching_product, 0) as matching_product
FROM aggregated_data as agg
LEFT JOIN matching_product as mp ON agg.product_id = mp.product_id
)
What’s happening here?
- We’re simply joining the matching product indicator back to our main aggregated dataset
- The `COALESCE` ensures any NULL values become 0 (non-matching)
Why this matters: Now every product row knows whether it’s a matching product or not, which we’ll use to split our calculations in the final measures.
STEP 5: final_data - Calculating Per-Unit Metrics
final_data AS (
SELECT
product_id, product_category, matching_product, quantity_c1, quantity_c2, revenue_c1, revenue_c2, cost_c1, cost_c2, revenue_c1 - cost_c1 as gross_margin_c1, revenue_c2 - cost_c2 as gross_margin_c2, -- Per-unit metrics
CASE WHEN quantity_c1 = 0 THEN 0 ELSE revenue_c1 / quantity_c1 END as price_per_unit_c1, CASE WHEN quantity_c2 = 0 THEN 0 ELSE revenue_c2 / quantity_c2 END as price_per_unit_c2, CASE WHEN quantity_c1 = 0 THEN 0 ELSE (revenue_c1 - cost_c1) / quantity_c1 END as margin_per_unit_c1, CASE WHEN quantity_c2 = 0 THEN 0 ELSE (revenue_c2 - cost_c2) / quantity_c2 END as margin_per_unit_c2 FROM data_with_matching_product )
What’s happening here?
- We calculate gross margin (revenue - cost) for each period
- We calculate per-unit metrics:
- `price_per_unit`: Average selling price per unit
- `margin_per_unit`: Average gross margin per unit
- We use `CASE WHEN quantity = 0 THEN 0` to avoid division by zero errors
Why this matters: Per-unit metrics are the foundation of variance analysis. To isolate the effect of price changes, we need to know the average price per unit. To isolate the effect of volume changes, we need to know the average margin per unit. These calculations set us up for the final variance formulas.
STEP 6: Final SELECT - Adding Time Dimensions
SELECT
*,
DATE_DIFF(DATE({% date_end dates_period_1 %}), DATE({% date_start dates_period_1 %}), MONTH) as months_c1,
DATE_DIFF(DATE({% date_end dates_period_2 %}), DATE({% date_start dates_period_2 %}), MONTH) as months_c2
FROM final_data
What’s happening here?
- We calculate the number of months in each period using `DATE_DIFF`
- These are used to normalize our effects to **average monthly** impacts
Why this matters: If Period 1 was 3 months and Period 2 was 4 months, raw comparisons would be misleading. By calculating monthly averages, we can fairly compare periods of different lengths.
The Variance Formulas: Breaking Down the Change
Once we have our base data prepared, we leverage the power of ythe semantic layer to expose these as Looker measures that calculate the three main effects:
Volume Effect
measure: volume_effect {
type: sum
sql: ((${quantity_c1} / NULLIF(${months_c1},0)) - (${quantity_c2} / NULLIF(${months_c2},0)))
* ${margin_per_unit_c2}
* ${matching_product} ;;
}
Formula: (Avg Monthly Qty Period 1 - Avg Monthly Qty Period 2) × Margin Per Unit Period 2 × Matching Flag
Translation: “How much more/less profit did we make just from selling more/fewer units, holding everything else constant?”
Price Effect
measure: price_effect {
type: sum
sql: (${price_per_unit_c1} - ${price_per_unit_c2}) * (${quantity_c1} / NULLIF(${months_c1},0)) * ${matching_product} ;;
}
Formula: (Price Per Unit Period 1 - Price Per Unit Period 2) × Avg Monthly Qty Period 1 × Matching Flag
Translation: “How much more/less profit did we make from price changes, using Period 1 volume as the base?”
Product Mix Effect
measure: product_mix_effect {
type: sum
sql: (${margin_per_unit_c1} - ${margin_per_unit_c2})
* (${quantity_c1} / NULLIF(${months_c1},0))
* ${matching_product} ;;
}
Formula: (Margin Per Unit Period 1 - Margin Per Unit Period 2) × Avg Monthly Qty Period 1 × Matching Flag
Translation: “How much more/less profit did we make from changes in product profitability (cost changes, promotional activity, etc.)?”
Non-Matching Products
measure: non_matching_products {
type: sum
sql: ((${gross_margin_c1} / NULLIF(${months_c1},0)) - (${gross_margin_c2} / NULLIF(${months_c2},0)))
* (1 - ${matching_product}) ;;
}
Formula: (Avg Monthly Margin Period 1 - Avg Monthly Margin Period 2) × Non-Matching Flag
Translation: “What’s the profit impact from products that only existed in one period (new products or discontinued products)?”
the complete view is:
view: gross_margin_effects {
derived_table: {
sql:
-- STEP 1: Base data with period indicators
WITH base_data AS (
SELECT
ii.product_id,
p.category AS product_category,
oi.sale_price,
ii.cost,
oi.returned_at,
({% condition dates_period_1 %} oi.created_at {% endcondition %}) AS is_period_1,
({% condition dates_period_2 %} oi.created_at {% endcondition %}) AS is_period_2
FROM ${order_items.SQL_TABLE_NAME} AS oi
LEFT JOIN ${inventory_items.SQL_TABLE_NAME} AS ii ON oi.inventory_item_id = ii.id
LEFT JOIN ${products.SQL_TABLE_NAME} AS p ON ii.product_id = p.id
WHERE
({% condition dates_period_1 %} oi.created_at {% endcondition %} OR {% condition dates_period_2 %} oi.created_at {% endcondition %})
),
-- STEP 2: Aggregate metrics to product-category level (across all customers)
aggregated_data AS
SELECT
product_id,
product_category,
SUM(CASE WHEN is_period_1 THEN CASE WHEN returned_at IS NULL THEN 1 ELSE -1 END ELSE 0 END) as quantity_c1,
SUM(CASE WHEN is_period_2 THEN CASE WHEN returned_at IS NULL THEN 1 ELSE -1 END ELSE 0 END) as quantity_c2,
SUM(CASE WHEN is_period_1 THEN sale_price ELSE 0 END) as revenue_c1,
SUM(CASE WHEN is_period_2 THEN sale_price ELSE 0 END) as revenue_c2,
SUM(CASE WHEN is_period_1 THEN cost ELSE 0 END) as cost_c1,
SUM(CASE WHEN is_period_2 THEN cost ELSE 0 END) as cost_c2
FROM base_data
GROUP BY 1, 2
),
-- STEP 3: Calculate matching products (sold in both periods)
matching_product AS (
SELECT
product_id,
(CASE WHEN SUM(quantity_c1) > 0 AND SUM(quantity_c2) > 0 THEN 1 ELSE 0 END) as matching_product
FROM aggregated_data
GROUP BY 1
),
-- STEP 4: Join matching product info
data_with_matching_product AS (
SELECT
agg.*,
COALESCE(mp.matching_product, 0) as matching_product
FROM aggregated_data as agg
LEFT JOIN matching_product as mp ON agg.product_id = mp.product_id
),
-- STEP 5: Calculate per-unit metrics and final data
final_data AS (
SELECT
product_id,
product_category,
matching_product,
quantity_c1,
quantity_c2,
revenue_c1,
revenue_c2,
cost_c1,
cost_c2,
revenue_c1 - cost_c1 as gross_margin_c1,
revenue_c2 - cost_c2 as gross_margin_c2,
-- Per-unit metrics
CASE WHEN quantity_c1 = 0 THEN 0 ELSE revenue_c1 / quantity_c1 END as price_per_unit_c1,
CASE WHEN quantity_c2 = 0 THEN 0 ELSE revenue_c2 / quantity_c2 END as price_per_unit_c2,
CASE WHEN quantity_c1 = 0 THEN 0 ELSE (revenue_c1 - cost_c1) / quantity_c1 END as margin_per_unit_c1,
CASE WHEN quantity_c2 = 0 THEN 0 ELSE (revenue_c2 - cost_c2) / quantity_c2 END as margin_per_unit_c2
FROM data_with_matching_product
)
-- STEP 7: Final select with month calculations
SELECT
*,
DATE_DIFF(DATE({% date_end dates_period_1 %}), DATE({% date_start dates_period_1 %}), MONTH) as months_c1,
DATE_DIFF(DATE({% date_end dates_period_2 %}), DATE({% date_start dates_period_2 %}), MONTH) as months_c2
FROM final_data
;;
}
filter: dates_period_1 {
type: date
default_value: "2025/01/01 to 2025/03/31"
description: "Select the first period for comparison (baseline) - Default: Q1 2025"
}
filter: dates_period_2 {
type: date
default_value: "2025/04/01 to 2025/06/30"
description: "Select the second period for comparison - Default: Q2 2025"
}
dimension: product_id {
type: number
sql: ${TABLE}.product_id ;;
primary_key: yes
}
dimension: product_category {
type: string
sql: ${TABLE}.product_category ;;
}
dimension: matching_product {
hidden: yes
type: number
sql: ${TABLE}.matching_product ;;
}
dimension: months_c1 {
hidden: yes
type: number
sql: ${TABLE}.months_c1 ;;
}
dimension: months_c2 {
hidden: yes
type: number
sql: ${TABLE}.months_c2 ;;
}
dimension: quantity_c1 {
hidden: yes
type: number
sql: ${TABLE}.quantity_c1 ;;
}
dimension: quantity_c2 {
hidden: yes
type: number
sql: ${TABLE}.quantity_c2 ;;
}
dimension: revenue_c1 {
hidden: yes
type: number
sql: ${TABLE}.revenue_c1 ;;
}
dimension: revenue_c2 {
hidden: yes
type: number
sql: ${TABLE}.revenue_c2 ;;
}
dimension: cost_c1 {
hidden: yes
type: number
sql: ${TABLE}.cost_c1 ;;
}
dimension: cost_c2 {
hidden: yes
type: number
sql: ${TABLE}.cost_c2 ;;
}
dimension: gross_margin_c1 {
hidden: yes
type: number
sql: ${TABLE}.gross_margin_c1 ;;
}
dimension: gross_margin_c2 {
hidden: yes
type: number
sql: ${TABLE}.gross_margin_c2 ;;
}
dimension: price_per_unit_c1 {
hidden: yes
type: number
sql: ${TABLE}.price_per_unit_c1 ;;
}
dimension: price_per_unit_c2 {
hidden: yes
type: number
sql: ${TABLE}.price_per_unit_c2 ;;
}
dimension: margin_per_unit_c1 {
hidden: yes
type: number
sql: ${TABLE}.margin_per_unit_c1 ;;
}
dimension: margin_per_unit_c2 {
hidden: yes
type: number
sql: ${TABLE}.margin_per_unit_c2 ;;
}
measure: total_gross_margin_c1 {
type: sum
sql: ${gross_margin_c1} ;;
label: "Total Gross Margin - Period 1"
value_format_name: usd
}
measure: total_gross_margin_c2 {
type: sum
sql: ${gross_margin_c2} ;;
label: "Total Gross Margin - Period 2"
value_format_name: usd
}
measure: months_c1_max {
hidden: yes
type: max
sql: ${months_c1} ;;
}
measure: months_c2_max {
hidden: yes
type: max
sql: ${months_c2} ;;
}
measure: gmii_diff {
type: number
sql: ${total_gross_margin_c1} - ${total_gross_margin_c2} ;;
group_label: "Gross Margin Effects"
label: "Gross Margin Difference (Total Period)"
description: "Total difference in gross margin between Period 1 and Period 2"
value_format_name: usd
}
measure: volume_effect {
type: sum
sql: ((${quantity_c1} / NULLIF(${months_c1},0)) - (${quantity_c2} / NULLIF(${months_c2},0))) * ${margin_per_unit_c2} * ${matching_product} ;;
group_label: "Gross Margin Effects"
label: "Volume Effect (Avg Monthly)"
description: "Average monthly impact of volume changes for products sold in both periods"
value_format_name: usd
}
measure: price_effect {
type: sum
sql: (${price_per_unit_c1} - ${price_per_unit_c2}) * (${quantity_c1} / NULLIF(${months_c1},0)) * ${matching_product} ;;
group_label: "Gross Margin Effects"
label: "Price Effect (Avg Monthly)"
description: "Average monthly impact of price changes for products sold in both periods"
value_format_name: usd
}
measure: product_mix_effect {
type: sum
sql: (${margin_per_unit_c1} - ${margin_per_unit_c2}) * (${quantity_c1} / NULLIF(${months_c1},0)) * ${matching_product} ;;
group_label: "Gross Margin Effects"
label: "Product Mix Effect (Avg Monthly)"
description: "Average monthly impact of margin/cost changes for products sold in both periods"
value_format_name: usd
}
measure: non_matching_products {
type: sum
sql: ((${gross_margin_c1} / NULLIF(${months_c1},0)) - (${gross_margin_c2} / NULLIF(${months_c2},0))) * (1 - ${matching_product}) ;;
group_label: "Gross Margin Effects"
label: "Non-Matching Products (Avg Monthly)"
description: "Average monthly impact of products sold in only one period (new products or discontinued products)"
value_format_name: usd
}
}
-–
Once all this is defined in the semantic layer, Looker’s Explore interface completely abstracts this complexity away from the end user and enables them to self serve on this complex calculation. Here’s how:
The `dates_period_1` and `dates_period_2` filters appear as simple date pickers in the UI.
Each time they change the dates, Looker automatically re-runs the entire derived table with the new parameters. It’s like having a custom report generator that adapts to any comparison you want to make.
1. Volume Variance (-$19,701/month):
- Major volume declines in Accessories (-$3,245), Outerwear & Coats (-$2,695), and Fashion Hoodies & Sweatshirts (-$2,405)
- Only 3 categories showed positive volume effects: Sleep & Lounge (+$383), Blazers & Jackets (+$199), and Clothing Sets (+$99)
- This is the **biggest driver** of margin decline
2. Price Variance (-$1,718/month):
- Mixed results across categories
- Blazers & Jackets saw massive price pressure (-$1,290), likely due to heavy promotional activity
- Outerwear & Coats had positive price realization (+$791), suggesting successful premium pricing
- Overall net negative indicates competitive pricing pressure
3. Product Mix Variance (-$976/month):
- Blazers & Jackets (-$828) shows customers shifting to lower-margin styles within the category
- Outerwear & Coats (+$431) indicates a favorable shift toward higher-margin products
- Overall negative suggests margin dilution from promotional items
4. Non-Matching Products (-$4,538/month):
- Shorts showed +$843 from new product launches, offsetting volume declines
- Outerwear & Coats (+$906) benefited from new higher-margin styles
- Several categories (Sweaters -$1,894, Dresses -$521) lost margin from discontinued products
Looker also pushes the derived table execution to the data warehouse (BigQuery in this case), where it can leverage:
- Columnar storage for efficient aggregations
- Distributed computing for large datasets
- Query caching for frequently-used comparisons
All of this analysis happened in minutes, without writing SQL, and with full transparency into how the numbers were calculated. Let me know what you thought about it in the comments!
