Machine Learning for How Many Donuts, Bagels, Etc. to Make Each Day

Thomas Haley
Published in ConchaML
10 min read · Dec 14, 2020

This is a guide to using machine learning to answer the question: “How many should I make?” Every cafe, bakery, and restaurant has to figure this out for any product they need to make or get delivered in the morning, and then sell that day (e.g. muffins, premade sandwiches, scones, quiche slices). I wrote code to solve this problem and named it “Concha” (conchas are one of my favorite things to buy with my coffee). This guide shows cafe managers how to run the Concha code and get a list for each day of how many of each product to make.

How it Works

Concha works by looking at the sales history to see how much of each item sold in the past, then uses machine learning to predict how much to make on future days to maximize profit. Specifically, Concha looks at the weather and the day of the week for each date in the sales history and “learns” the patterns. It then uses the predicted weather and the day of the week for the next six days to make the estimate for the future.
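To make the inputs a little more concrete, here is a minimal sketch of turning one date and its weather into a row of numbers a model could learn from. The field names (temp_f, precip_in) and the one-column-per-weekday encoding are my assumptions for illustration, not Concha’s actual feature set.

```python
from datetime import date

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def make_features(day: date, temp_f: float, precip_in: float) -> dict:
    """Build one feature row from a date and that day's weather (hypothetical fields)."""
    row = {f"is_{name}": 0 for name in DAYS}
    row[f"is_{DAYS[day.weekday()]}"] = 1  # weekday(): Monday is 0, Sunday is 6
    row["temp_f"] = temp_f
    row["precip_in"] = precip_in
    return row

# December 12, 2020 was a Saturday; the weather values are made up
features = make_features(date(2020, 12, 12), temp_f=68.0, precip_in=0.0)
print(features)
```

The same function works for the past (using recorded weather) and for the next six days (using the forecast), which is what lets a model trained on history make predictions for the future.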

The sales history is imported directly from Square, the Concha code runs on Google’s machine learning computers (for free), and the predictions Concha makes are saved on your Google Drive.

Why?

This question is not new, so why go through the trouble of using advanced data science to answer it?

  • Money: Let’s consider Donut Friend in L.A.: on a Tuesday they need to plan how many Fudgegazi donuts to make for the next Saturday. They decide on 30. If that’s not enough, they might run out at 1:00 PM and miss 19 sales, a loss of $57 in profit. If 30 is too many, they might have 8 donuts left over that they can’t sell the next day, losing $16. The better they can estimate how much to make, the more money they make. It’s impossible to predict exactly how much Concha will improve profits for each business, but I created a cafe simulator to see what difference it might make. When I ran the numbers, Concha improved profits by about 8%. At Starbucks, that would mean a profit increase of ~$30,000,000 per year. Speaking of which…
  • Starbucks: At some level, every coffee house, bakery, and donut shop is competing against the crowned mermaid. It has many advantages over smaller retailers (lower supply costs, rent negotiating power), but now you can leverage deep learning to optimize production. Judging by how often it runs out of chicken wraps before 11 AM, Starbucks doesn’t.
  • Power: There is power in data to solve problems. Usually companies use that data on us to maximize their profits. Now you can use your data to maximize your profits.
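The Donut Friend arithmetic can be written out as a small function. This is only a sketch: the per-donut figures are the ones implied by the example ($57 / 19 = $3 of profit per missed sale, $16 / 8 = $2 lost per leftover), and the function name is mine.

```python
def money_lost(made: int, demand: int, profit_per_sale: float, cost_per_leftover: float) -> float:
    """Dollars lost compared with making exactly the right amount."""
    if demand > made:
        return (demand - made) * profit_per_sale    # sold out: missed sales
    return (made - demand) * cost_per_leftover      # leftovers: wasted product

# Made 30, but 49 customers wanted one: 19 missed sales
print(money_lost(30, 49, profit_per_sale=3.0, cost_per_leftover=2.0))  # 57.0

# Made 30, but only 22 sold: 8 left over
print(money_lost(30, 22, profit_per_sale=3.0, cost_per_leftover=2.0))  # 16.0
```

Notice that the two kinds of mistakes don’t cost the same. That imbalance is why the best amount to make is not simply the average demand.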

Setup

Setting up is about as complicated as changing a tire: easy if you’ve done it before, but an odd experience if you haven’t. This will walk you through the steps.

If you run into any bumps, I’m happy to answer questions or jump on a video chat to help out. You can reach me at thomashaleyds@gmail.com.

What you’ll need:

  • Google Drive (to store the predictions)
  • Square Point-of-Sale at any location where you want to make predictions

Google Colab Notebooks

Colab is a tool Google created to allow anyone to run code for free on their Google Cloud computers. A Colab file is called a “notebook”. It’s called a notebook because it lets you run code in sections (called “cells”), instead of all at once. We’ll use Colab to run Concha.

How Colab Works

If you haven’t used Colab before, this is a great way to learn: Practice Notebook. (To keep this guide open, open the link in a new tab: hold down CTRL, or Command on a Mac, when you click it, or right-click the link and pick the new-tab option.) There are two cells in this notebook. The first one says “Hello” when you click the play triangle in the upper left of the cell.

Colab runs code one cell at a time.

If it’s your first time running the notebook, Colab will first warn you about running the code:

I wrote the code on GitHub.

The notebook is loaded from GitHub because that’s where most open source projects (code that anyone can see and use) are stored. (The source code is here: Concha code on GitHub.) The next cell shows how Colab can use data you type into the text line on the right side of the cell.

Colab can use text you enter.
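If you’re curious what a cell with a text line looks like in code, here is a minimal sketch (not the practice notebook’s exact code). The variable name and greeting are just examples.

```python
# The "#@param" comment is what Colab turns into a text box beside the cell.
# Outside Colab it is just an ordinary Python comment.
name = "Thomas"  #@param {type:"string"}

greeting = f"Hello, {name}!"
print(greeting)
```

Whatever you type into the text box replaces the value between the quotes, and the cell uses it the next time you press play.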

Setup: Saving Your Own Copy

The next part of the guide will show how to set up Concha. Open up the setup Colab notebook here in a new tab. When you first open the notebook, you are seeing a template version I put up, but in order to keep your information after setup, save the notebook to your Google Drive with the “Copy to Drive” button.

Save your own copy of the setup notebook.

The version in your drive will look the same, but the “Copy to Drive” button will have disappeared.

Notebook on Drive will save updates.

Now close the template version (when I have both open I forget which is which). The notebook is saved on your Drive in a folder called “Colab Notebooks” with the name “Copy of 02_setup_do_once.ipynb”.

Connecting Computers

Computers use “Open Authorization” (OAuth) to give each other permission to share data. We need to create two connections:

  1. Connect Colab to Google Drive (to save the predictions).
  2. Connect Colab to Square (to get the sales history).

We need to connect the computer running the code on the Colab notebook to your Google Drive. (Which seems bizarre because this Colab notebook file is already in your Drive — it’s set up that way to be extra careful.) To connect the notebook computer to your Drive, Google uses four steps:

  1. When you run the first cell, the notebook will give you a link to request a connection to your Drive.

When you run the cell, Drive gives you a request-to-connect link to follow.

  2. Google Drive will confirm you want to connect it to the notebook.

  3. Google Drive will give you a confirmation code.

Copy this code so you can paste it back in the notebook.

  4. Copy the code, paste it in the notebook, then press “Enter”. Once Drive is connected, the cell will say Mounted at /content/drive.

Paste the code here and hit “Enter”.
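For reference, connecting Drive from a Colab cell generally comes down to one call to Colab’s drive.mount. The sketch below wraps it so the same code also runs harmlessly outside Colab. The wrapper function is my own illustration, not necessarily how the setup notebook is written.

```python
try:
    from google.colab import drive  # only available inside a Colab runtime
except ImportError:
    drive = None

MOUNT_POINT = "/content/drive"

def connect_drive() -> bool:
    """Mount Google Drive when running in Colab; return False anywhere else."""
    if drive is None:
        return False
    drive.mount(MOUNT_POINT)  # walks you through the authorization steps
    return True

connected = connect_drive()
print("Drive connected:", connected)
```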

Run the next cell to import the Concha code.

This cell downloads the Concha code.

Square

The reports available in the Square dashboard don’t have the level of detail Concha needs to make predictions. Specifically, in order to estimate on which days product supply ran out before the end of the day, and to estimate how much would have sold if the supply hadn’t run out, Concha uses the timestamp and quantity for each sale. This next section shows how to get your sales history.
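To see why the timestamps matter, here is one simple way to guess that a product ran out: check whether its last sale of the day came long before closing. This is a sketch of the idea only. The three-hour threshold, the function name, and the sample data are assumptions, and Concha’s own estimate may work differently.

```python
from datetime import datetime, timedelta

def likely_sold_out(sales, closing_time: datetime, quiet_hours: float = 3.0) -> bool:
    """Guess whether a product ran out: True if its last sale of the day
    came more than `quiet_hours` before closing (an assumed threshold)."""
    if not sales:
        return False
    last_sale = max(ts for ts, _qty in sales)
    return closing_time - last_sale > timedelta(hours=quiet_hours)

# Made-up sales for one product on one day: (timestamp, quantity)
saturday = [
    (datetime(2020, 12, 12, 7, 15), 2),
    (datetime(2020, 12, 12, 9, 40), 1),
    (datetime(2020, 12, 12, 12, 55), 3),
]
closing = datetime(2020, 12, 12, 18, 0)

print(likely_sold_out(saturday, closing))  # True: nothing sold after 12:55 PM
print(sum(qty for _ts, qty in saturday))   # 6 units sold before the shelf went quiet
```

A daily total alone would only say “6 sold”. With the timestamps you can tell that 6 was probably the supply, not the demand.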

Activate the Square Data Connection

The first step is to tell Square you want a data connection. Go to the Square site (here: https://developer.squareup.com/apps) and create an “Application”. You can give it any name you like.

Create an “app” to do the importing.

On your new connector’s dashboard, toggle over to “Production” on the top menu and “OAuth” on the left menu.

Set up your application.

We’ll set it up so we can follow the same authorization steps we just used to connect the Colab notebook’s computer to Google Drive.

Set the “Production Redirect URL” to “https://www.google.com”, and save the update.

Copy the “Production Application ID” and “Production Application Secret” fields (click “Show” first to reveal the secret) and paste each one into the next cell in the Colab notebook. When you run the cell, it will give you a link to follow.

Run the cell to create the request-to-connect URL.
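A request-to-connect link is just Square’s OAuth authorize address with your Application ID attached. Here is a rough sketch of building one. The two read-only permissions are my guess based on what the token can do (read locations and orders), and the Application ID is a placeholder.

```python
from urllib.parse import urlencode

def build_authorize_url(application_id: str) -> str:
    """Build a Square OAuth request-to-connect link for read-only access."""
    params = {
        "client_id": application_id,
        # Assumed permissions: read business locations and read orders
        "scope": "MERCHANT_PROFILE_READ ORDERS_READ",
        "session": "false",
    }
    return "https://connect.squareup.com/oauth2/authorize?" + urlencode(params)

# "sq0idp-EXAMPLE" is a placeholder, not a real Application ID
url = build_authorize_url("sq0idp-EXAMPLE")
print(url)
```

The Application Secret never goes in this link. It is only used later, when the confirmation code is traded for an access token.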

When you click on this link you are requesting that Square use the connection you just created. Square will ask you to log in and ask if you want to give yourself access to your data. If you say “yes”, Square will drop you off at “www.google.com” (because that’s what you put in the “Redirect URL” field). Now you’ll see that the URL has a “…code=xxxx…” part at the end. That’s the confirmation code.

URL with the confirmation code.

Copy the whole URL and paste it into the next cell, then run it.
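You paste the whole URL, rather than just the code, because pulling the code out is easy for a program to do. A minimal sketch using Python’s standard library (the function name and the example URL are made up):

```python
from urllib.parse import urlparse, parse_qs

def extract_code(redirect_url: str) -> str:
    """Return the value of the "code" query parameter from the redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    if "code" not in query:
        raise ValueError("No confirmation code found in the URL")
    return query["code"][0]

# Made-up URL in the general shape of a redirect with a code attached
pasted = "https://www.google.com/?code=sq0cgp-EXAMPLE&response_type=code"
print(extract_code(pasted))  # sq0cgp-EXAMPLE
```

Trading that code for the access token is a separate request to Square, which is not shown here.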

You should see a new “access_token” after running the cell. You now have (limited) access to your own data. This access token can only read the store locations and order history. All the connector information is saved in concha_planners/importers/square.json on your Google Drive.

To check that the connection is working, run the next cell. The notebook will import the business locations listed in your Square account and list them.

If you can see this list, the connection works.

Weather

The National Oceanic and Atmospheric Administration (NOAA) provides the weather data Concha needs. You can get an API (“application programming interface”) key here. They’ll email it to you and then you can copy it (the email will call it a “token”) and paste it into the next cell. When you run the cell, the token/key is saved on your Drive.

Paste the API key/token NOAA sends you here, then run the cell.
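Here is a sketch of how a token like this is typically used, assuming it is for NOAA’s Climate Data Online web service, which expects the token in a request header. The code only prepares a request for daily weather observations and does not send it. The station ID and dates are illustrative, and Concha’s own weather requests may look different.

```python
from urllib.parse import urlencode
from urllib.request import Request

CDO_DATA_URL = "https://www.ncdc.noaa.gov/cdo-web/api/v2/data"

def build_weather_request(token: str, station_id: str, start: str, end: str) -> Request:
    """Prepare (but do not send) a request for daily weather observations."""
    params = urlencode({
        "datasetid": "GHCND",      # daily summaries dataset
        "stationid": station_id,
        "startdate": start,
        "enddate": end,
    })
    # Climate Data Online expects the token in a "token" header
    return Request(f"{CDO_DATA_URL}?{params}", headers={"token": token})

# Placeholder token and an illustrative station ID
request = build_weather_request(
    "YOUR_NOAA_TOKEN", "GHCND:USW00023174", "2020-12-01", "2020-12-07"
)
print(request.full_url)
```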

Final Setup Check

Run the last cell. You should see the weather forecast near the first location for the next six days.

Everything works! You did it!

The next part of the guide shows how to make the predictions. Open up the next notebook for making predictions (here) in a new tab, and save a copy to your Google Drive.

This is the notebook that will make predictions.

On your Drive copy of the notebook, run the cells to connect to Drive and download the Concha code.

Predictions

The code that does the prediction work in Concha is called a “planner”. Each planner is assigned a location so it knows which sales history to learn from. Here’s the list of the locations on my Square account.

Locations from Square

Create A New Planner

Let’s make a new planner. planner_name can be any name you would like, but the location_name has to match one of the locations named in your list. In my case “Concha West” is my location_name and so I chose “conchawest” as the name of the planner I’m making for that location.

Running this cell creates a new planner.

Use An Existing Planner

If you have already created a planner, this next section explains how to import the most recent sales history, train the machine learning models, and make predictions for how much of each product to make.

I know I have an existing planner called “conchawest”, so I’ll start up that planner and import all the new sales transactions at the planner’s location.

When the cell runs, it shows the first ten rows for the imported sales history. Because my planner was called “conchawest”, the sales history will be saved in concha_planners/conchawest/history/.

Planner sales history files in Google Drive

The next cell does the actual machine learning and makes predictions for how much to make.

This cell creates the predictions.

The predictions for this planner are stored in concha_planners/conchawest/history/ on my Google Drive. When I open them up in Drive (using Sheets), they look like this.

Optimal Production Predictions

The column “production” is how much the machine learning model thinks you should make of each product on each day to maximize profit.
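If you would rather have a per-day bake list than a spreadsheet, the predictions can be regrouped with a few lines of Python. This sketch uses made-up rows. Only the “production” column name comes from this guide, and the “date” and “product” column names are assumptions, so check them against your own file.

```python
import csv
import io
from collections import defaultdict

# Made-up rows; only the "production" column name comes from this guide
sample_csv = """date,product,production
2020-12-15,Concha,24
2020-12-15,Blueberry Muffin,12
2020-12-16,Concha,30
"""

def bake_list(csv_text: str) -> dict:
    """Group predicted production by date: {date: {product: quantity}}."""
    plan = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        plan[row["date"]][row["product"]] = int(row["production"])
    return dict(plan)

for day, items in bake_list(sample_csv).items():
    print(day, items)
```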

Tuning the Products

There are two things we’ll want to adjust before actually running this on a regular basis: the properties of each product, and which products need a prediction. First let’s see a list of the products in the sales history by running the next cell.

How to list the products in the sales history.

Set the Product Price, Cost, and Batch Size

The optimal amount of product to produce depends on three things:

  1. The sale price of the product.
  2. The cost to make a batch of the product.
  3. The production batch size.

You can set the product properties for each product by name. When you run the cell, it saves the settings to the planner’s file (so you only have to set them once).

These muffins cost $12 to make per batch of 6, and sell for $4 each.

Here’s the data science behind what Concha does: for each product, Concha approximates the conditional probability density function (pdf) of demand, conditioned on the day of the week and the weather. It then chooses the point on the pdf curve that maximizes profit, based on the marginal profit, which depends on the sale price, the cost to produce a batch, and the batch size.
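To show how price, batch cost, and batch size pull the answer around, here is a simplified stand-in for that last step. Instead of a learned pdf, it uses a handful of made-up demand numbers for similar days and tries every whole number of batches to find the one with the highest average profit. This is a textbook “newsvendor”-style calculation, not Concha’s actual model, and the function names are mine.

```python
def expected_profit(quantity: int, demand_samples, price: float, unit_cost: float) -> float:
    """Average profit over the demand samples if we make `quantity` units."""
    total = 0.0
    for demand in demand_samples:
        sold = min(quantity, demand)  # can't sell more than we made
        total += sold * price - quantity * unit_cost
    return total / len(demand_samples)

def best_production(demand_samples, price: float, batch_cost: float, batch_size: int) -> int:
    """Try whole batches and return the quantity with the highest expected profit."""
    unit_cost = batch_cost / batch_size
    max_batches = -(-max(demand_samples) // batch_size)  # ceiling division
    candidates = [b * batch_size for b in range(max_batches + 1)]
    return max(candidates, key=lambda q: expected_profit(q, demand_samples, price, unit_cost))

# Made-up demand for one product on similar days (same weekday, similar weather)
demand = [8, 10, 11, 13, 14, 17, 20]

# Muffins from the example above: $12 per batch of 6, sold for $4 each
print(best_production(demand, price=4.0, batch_cost=12.0, batch_size=6))  # 12
```

With a $2 unit cost and a $4 price, an extra muffin is only worth making while the chance of selling it is at least 50%, which is why the answer lands near the middle of the demand numbers. A cheaper batch or a higher price would push it up.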

Limit Which Products To Predict

Muffins, donuts, and quiches need to be made or stocked each morning, but some products can be made as needed (e.g. coffee) and don’t need a prediction. You can specify which products need predicting to speed up the calculations. You can put the products you want to predict into a comma-separated list and enter them on the next cell.

Limit the products to predict.

Each product name needs quotes around it, with commas between the names and square brackets around the whole thing. Now if you go back and run the cell that makes the predictions, it will only do the products you listed.
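For example, a correctly formatted list looks like the first line of code below. The product names and variable names are hypothetical. Use the exact names from your own sales history.

```python
# Hypothetical product names; quotes around each, commas between, square brackets around all
products_to_predict = ["Concha", "Blueberry Muffin", "Quiche Slice"]

# Everything not in the list gets skipped when making predictions
all_products = ["Concha", "Drip Coffee", "Blueberry Muffin", "Latte", "Quiche Slice"]
skipped = [name for name in all_products if name not in products_to_predict]
print(skipped)  # ['Drip Coffee', 'Latte']
```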

Contact

I hope you find this helpful! This project is experimental, and I would love your help making it better! Please comment on this page or contact me at thomashaleyds@gmail.com.
