DIY Granular COGS Tracking

Case Study: Cookie Monster’s Monster Cookies

Jeff Goldkamp
Apr 5, 2019 · 6 min read
Image for post
Image for post

Purpose

We’ll implement an end-to-end system that tracks variable costs for different goods sold at a transaction level. The results give a detailed view of variable costs to help with strategic decision making for departments like procurement, supply chain, and inventory management.

The files for this case-study can be found here: GitHub Repository Ro-COGS-Tracking.

Disclaimer: The case study and datasets are completely made up for the sake of demonstrating how to implement a COGS tracking system.

Motivation

Controlling Cost of Goods Sold (COGS hereafter) is a requirement for running a sustainable business. Many factors play into COGS so they are often measured at a high-level, which consequentially sacrifices any insights into costs that change across products and transactions. Therefore, it is advantageous for companies to manage their costs at a granular level to continue delivering superior services in a sustainable manner.

The material cost of actual goods sold is just one of many variable costs that can eat into a product’s gross margins. Packaging, labor, and shipping also contribute to variable costs. Let’s define gross margin as:

Gross Margin = Transaction Amount - Variable Costs

Variable costs include material, labor, payment processing fees, and shipping for each transaction. The transaction amount is the amount charged after discounts.

Case Study: Cookie Monster’s Monster Cookies LLC

Cookie Monster’s Monster Cookies LLC (CMMC hereafter) wants to share their superior cookies in the best way possible. An important piece in doing so is controlling the COGS as CMMC expands their cookie offerings.

Data Sources

CMMC maintains 4 Google Sheets to keep track of their products (cookies), variable costs (ingredients, labor, packaging), and transactions.

Inventory Item Sheet

This sheet tags products with various material costs that are associated with each. We see that CMMC offers five cookie varieties with ingredients spanning from 1–3 each. They all use the same CMMC label and are packaged individually in the same type of packaging. If a customer orders multiple cookies, then they’re batched into the same batch packaging.

Image for post
Image for post

Inventory Cost Sheet

This sheet enumerates the field-value pairs of the Inventory Item Sheet and assigns a price for each period (monthly). We see that there are two types of INGREDIENT01 in column C above, so we have two columns in B:C labeled “INGREDIENT01 TOP_SECRET_01" and “INGREDIENT01 TOP_SECRET_02”. The same pattern repeats for inventory items INGREDIENT02, INGREDIENT03, LABEL_TYPE, SOLO_PACKAGING, and BATCH_PACKAGING.

Image for post
Image for post

Labor and Services Sheet

Cookie Monster actually contracts out CMMC’s baking and retail services in order to prevent himself from eating all the cookies. We see that CMMC cookies are baked and sold in two different bakeries, which is why there are two records for each period with their respective costs.

Image for post
Image for post
(baking cost is per cookie)

Bakery Orders Sheet

Neither bakery has a great POS system in place, so poor Elmo consolidates the monthly sales by hand in the below sheet to record monthly transactions.

Image for post
Image for post

Loading Datasets into Snowflake

Cookie Monster has much more important things to do (like eating cookies) than manually load these datasets each month. That’s why he created an Airflow DAG to automatically import each Google Sheet in to his analytics warehouse Snowflake. You can reference the attached file named “blog_load_cost_info.py”, but the basic idea is that each sheet has it’s own operator like so:

Image for post
Image for post

Dependencies are set as upstreams to look like the following:

Image for post
Image for post

This directed graph depicts the automatic data load process that can be set to run at custom schedules.

Image for post
Image for post

The last rectangle is the operator that combines all the above data and calculates our order-level gross margins. Since Snowflake does not support dynamic pivoting in SQL, we generate the dynamic query with the following python script.

Dynamic Pivoting in Snowflake with Python

Within the attached “blog_create_order_margin_table.py” script we have the below two helper functions. These act as our substitute for dynamic pivoting in Snowflake by fetching whatever column names exist in a given table, formatting them, and calling those results in a query generator.

Image for post
Image for post

We call the above functions in the query generator on the inventory_cost_table, which was covered above as the “Inventory Cost Sheet”.

Image for post
Image for post

A template is defined for calling each cost category in the dynamic list and then the margin query begins. CTE1 selects from the bakery_orders table and assigns the variable cost to each bakery transaction by product_id and period. The for loop inserts the individual item costs as columns through the template for the corresponding product_id and period. If the product doesn’t use the inventory item or a price is missing, a 0 is assigned in each case statement. The escape_for_like() is another helper function that can be viewed in the attached script.

Image for post
Image for post

The last part of the query generator is where we prepare the results and categorize the costs with a bit of hard-coding. A useful extension would add an additional cross-table that links each variable cost to a specific category. We also multiply costs by quantity sold for those that scale with amount purchased. A case statement is used for other conditional costs like packaging. Single quantity orders have the solo packaging cost assigned where any orders with larger quantities receive batch packaging.

Image for post
Image for post

The output from this query yields our final order-level margin table.

Image for post
Image for post
Gross margin here is given as a percentage of the amount charged

COGS Analysis

Now that we have our order-level costs assigned, we can extract some detailed COGS insights such as:

  • Variable cost composition by product
Image for post
Image for post
  • Margins across specific cost categories
Image for post
Image for post
  • Month over month gross margin trends by product
Image for post
Image for post

Of course you can slice and dice the order margin table as you wish in SQL, but CMMC uses a killer BI platform called Looker for users to really dive into the data and create customized graphs.

Image for post
Image for post
CMMC’s unit margins (gross margin shown as a percent of revenue)
Image for post
Image for post
CMMC’s margin breakdowns

Ro Data Team Blog

Ro Data Team Blog: data analytics, data engineering, data…

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store