DIY Granular COGS Tracking

Case Study: Cookie Monster’s Monster Cookies

Jeff Goldkamp
Apr 5, 2019 · 6 min read


We’ll implement an end-to-end system that tracks variable costs for different goods sold at a transaction level. The results give a detailed view of variable costs to help with strategic decision making for departments like procurement, supply chain, and inventory management.

The files for this case-study can be found here: GitHub Repository Ro-COGS-Tracking.

Disclaimer: The case study and datasets are completely made up for the sake of demonstrating how to implement a COGS tracking system.


Controlling Cost of Goods Sold (COGS hereafter) is a requirement for running a sustainable business. Many factors play into COGS so they are often measured at a high-level, which consequentially sacrifices any insights into costs that change across products and transactions. Therefore, it is advantageous for companies to manage their costs at a granular level to continue delivering superior services in a sustainable manner.

The material cost of actual goods sold is just one of many variable costs that can eat into a product’s gross margins. Packaging, labor, and shipping also contribute to variable costs. Let’s define gross margin as:

Gross Margin = Transaction Amount - Variable Costs

Variable costs include material, labor, payment processing fees, and shipping for each transaction. The transaction amount is the amount charged after discounts.

Case Study: Cookie Monster’s Monster Cookies LLC

Cookie Monster’s Monster Cookies LLC (CMMC hereafter) wants to share their superior cookies in the best way possible. An important piece in doing so is controlling the COGS as CMMC expands their cookie offerings.

Data Sources

CMMC maintains 4 Google Sheets to keep track of their products (cookies), variable costs (ingredients, labor, packaging), and transactions.

Inventory Item Sheet

This sheet tags products with various material costs that are associated with each. We see that CMMC offers five cookie varieties with ingredients spanning from 1–3 each. They all use the same CMMC label and are packaged individually in the same type of packaging. If a customer orders multiple cookies, then they’re batched into the same batch packaging.

Inventory Cost Sheet

This sheet enumerates the field-value pairs of the Inventory Item Sheet and assigns a price for each period (monthly). We see that there are two types of INGREDIENT01 in column C above, so we have two columns in B:C labeled “INGREDIENT01 TOP_SECRET_01" and “INGREDIENT01 TOP_SECRET_02”. The same pattern repeats for inventory items INGREDIENT02, INGREDIENT03, LABEL_TYPE, SOLO_PACKAGING, and BATCH_PACKAGING.

Labor and Services Sheet

Cookie Monster actually contracts out CMMC’s baking and retail services in order to prevent himself from eating all the cookies. We see that CMMC cookies are baked and sold in two different bakeries, which is why there are two records for each period with their respective costs.

(baking cost is per cookie)

Bakery Orders Sheet

Neither bakery has a great POS system in place, so poor Elmo consolidates the monthly sales by hand in the below sheet to record monthly transactions.

Loading Datasets into Snowflake

Cookie Monster has much more important things to do (like eating cookies) than manually load these datasets each month. That’s why he created an Airflow DAG to automatically import each Google Sheet in to his analytics warehouse Snowflake. You can reference the attached file named “”, but the basic idea is that each sheet has it’s own operator like so:

Dependencies are set as upstreams to look like the following:

This directed graph depicts the automatic data load process that can be set to run at custom schedules.

The last rectangle is the operator that combines all the above data and calculates our order-level gross margins. Since Snowflake does not support dynamic pivoting in SQL, we generate the dynamic query with the following python script.

Dynamic Pivoting in Snowflake with Python

Within the attached “” script we have the below two helper functions. These act as our substitute for dynamic pivoting in Snowflake by fetching whatever column names exist in a given table, formatting them, and calling those results in a query generator.

We call the above functions in the query generator on the inventory_cost_table, which was covered above as the “Inventory Cost Sheet”.

A template is defined for calling each cost category in the dynamic list and then the margin query begins. CTE1 selects from the bakery_orders table and assigns the variable cost to each bakery transaction by product_id and period. The for loop inserts the individual item costs as columns through the template for the corresponding product_id and period. If the product doesn’t use the inventory item or a price is missing, a 0 is assigned in each case statement. The escape_for_like() is another helper function that can be viewed in the attached script.

The last part of the query generator is where we prepare the results and categorize the costs with a bit of hard-coding. A useful extension would add an additional cross-table that links each variable cost to a specific category. We also multiply costs by quantity sold for those that scale with amount purchased. A case statement is used for other conditional costs like packaging. Single quantity orders have the solo packaging cost assigned where any orders with larger quantities receive batch packaging.

The output from this query yields our final order-level margin table.

Gross margin here is given as a percentage of the amount charged

COGS Analysis

Now that we have our order-level costs assigned, we can extract some detailed COGS insights such as:

  • Variable cost composition by product
  • Margins across specific cost categories
  • Month over month gross margin trends by product

Of course you can slice and dice the order margin table as you wish in SQL, but CMMC uses a killer BI platform called Looker for users to really dive into the data and create customized graphs.

CMMC’s unit margins (gross margin shown as a percent of revenue)
CMMC’s margin breakdowns

Ro Data Team Blog

Ro Data Team Blog: data analytics, data engineering, data science

Jeff Goldkamp

Written by

Ro Data Team Blog

Ro Data Team Blog: data analytics, data engineering, data science

More From Medium

More from Ro Data Team Blog

More on Looker from Ro Data Team Blog

More on Looker from Ro Data Team Blog

Looker Alerts with Dynamic Thresholds

More on Looker from Ro Data Team Blog

More on Looker from Ro Data Team Blog

How to Set Up Customer Retention Analysis in Looker


Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade