SVM — Cupcakes or Muffins? — Start To Finished

A Complete Colab Notebook Using Scikit-learn — Based on Alice Zhao's Post — AISeries — Episode #07

J3
Jungletronics
6 min read · May 27, 2021


Hi, this time we try an interesting project about delicacies o/

Please kindly watch this video from Alice Zhao. Get the Colab file here.

By the end, we will have a model that automatically predicts whether a given recipe is a muffin recipe or a cupcake recipe.

Muffins vs Cupcakes: What’s the difference? Image from cozinhatecnica.com

This analysis tries to answer these questions: Is a muffin just a cupcake with random bits of stuff in it? Is a cupcake just a muffin with frosting?

Although the cupcake is usually decorated, the two are very similar in shape and appearance. The real difference is in the preparation technique.

A cupcake is a cup-sized cake: its recipe uses the techniques for preparing cake batter. A muffin, on the other hand, is considered a quick bread.

Let’s get it on!

7 Steps for ML Solution

In Machine Learning these principles apply:

1# Gathering Data

The quality and quantity of the data that you gather will directly determine how good your predictive model can be.

Alice Zhao found data by googling ‘basic muffin recipe’ and ‘basic cupcake recipe’.

Here is an example from bakedbyanintrovert.com

Basic Muffin Recipe

Ingredients

2 cups (260 g) all-purpose flour
½ cup (100 g) granulated sugar
2 teaspoons baking powder
½ teaspoon salt
¾ cup (180 ml) milk, room temperature
½ cup (114 g) unsalted butter, melted and cooled
2 large eggs, room temperature
2 tablespoons coarse sugar, optional

She found, surprisingly, that the muffin and cupcake recipes use just eight ingredients and, also surprisingly, that all twenty recipes were different, with no duplicates.

She then normalized the data by converting each recipe from amount-based to percentage-based values.

Amount-based                  Percentage-based
Flour    Sugar        ...     Flour    Sugar    ...    Total
½ cup    ½ teaspoon   ...     12%      15%      ...    100%
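
The conversion from amounts to percentages can be sketched as follows. This is only an illustration, not necessarily how Alice Zhao did it: it assumes the amounts were already converted to a common unit (grams here), and the `to_percentages` helper and the sample numbers are made up for the example.

```python
import pandas as pd


def to_percentages(amounts: pd.DataFrame) -> pd.DataFrame:
    """Rescale every row so that its values add up to 100."""
    row_totals = amounts.sum(axis=1)
    return amounts.div(row_totals, axis=0) * 100


# Hypothetical amounts in grams for two recipes (illustrative numbers only)
amounts = pd.DataFrame(
    {"Flour": [260.0, 190.0], "Sugar": [100.0, 200.0], "Butter": [114.0, 115.0]},
    index=["muffin_example", "cupcake_example"],
)

percentages = to_percentages(amounts)
print(percentages.round(1))
# Each row of the result should now add up to 100
print(percentages.sum(axis=1))
```

Dividing by the row total makes recipes of different batch sizes comparable, which is why each line of the final data set adds up to 100%.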

At the end of the day, this is what the data looks like:

Here are the type and the 8 ingredients. Each row adds up to 100%.

Importing the libraries: Open your Google Colab and type:

# Packages for analysis
import pandas as pd
import numpy as np
from sklearn import svm
# Packages for visuals
import matplotlib.pyplot as plt
import seaborn as sns; sns.set(font_scale=1.2)
# Allows charts to appear in the notebook
%matplotlib inline
# Pickle package
import pickle

Importing the data:

Download this .csv file, then upload it by clicking on this icon in your Colab:

recipes_muffins_cupcakes.csv
# Read in muffin and cupcake ingredient data
df = pd.read_csv('recipes_muffins_cupcakes.csv')
df
[See data set above]

2# Preparing The Data

Data preparation is the process of transforming raw data so that data scientists and analysts can run it through machine learning algorithms to uncover insights or make predictions.

sns.pairplot(df)
Seaborn pair plot: the most promising combination appears to be Sugar x Flour; the two tend to be inversely related (low negative correlation).
# Plot two ingredients (Flour and Sugar)
sns.lmplot(x='Flour', y='Sugar', data=df, hue='Type', palette='Set1', fit_reg=False, scatter_kws={"s": 70});
These two ingredients are the most likely to differentiate cupcakes from muffins; muffins sit at the bottom right because they tend to have more flour and less sugar than cupcakes :)
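
The relationship read off the plots can also be checked numerically. The sketch below uses a small made-up table named `sample` as a stand-in for `df` (the numbers are illustrative, not the real data set), so the values you get on the real data will differ.

```python
import pandas as pd

# Hypothetical stand-in for df (illustrative percentages, not the real data)
sample = pd.DataFrame({
    "Type": ["Muffin", "Muffin", "Muffin", "Cupcake", "Cupcake", "Cupcake"],
    "Flour": [55, 47, 50, 39, 34, 38],
    "Sugar": [12, 18, 15, 26, 30, 24],
})

# Pearson correlation between the two columns; a negative value means
# that recipes with more flour tend to have less sugar
flour_sugar_corr = sample["Flour"].corr(sample["Sugar"])
print(f"Flour vs. Sugar correlation: {flour_sugar_corr:.2f}")

# Average share of each ingredient per recipe type
type_means = sample.groupby("Type")[["Flour", "Sugar"]].mean()
print(type_means)
```

On the real notebook you would replace `sample` with `df`.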

3# Choosing a Model

Model selection is a process that can be applied across different types of models (e.g. logistic regression, SVM, KNN, etc.). Here we use an SVM.

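SVM - Support Vector Machine
# Specify inputs for the model
# ingredients = recipes[['Flour', 'Milk', 'Sugar', 'Butter', 'Egg', 'Baking Powder', 'Vanilla', 'Salt']].as_matrix()
# Note: despite the "_test" suffix, these arrays hold the whole data set and are used to fit the model
X_test = df[['Flour','Sugar']].to_numpy()
#print(X_test)
# Encode the labels: 0 = Muffin, 1 = Cupcake
y_test = np.where(df['Type']=='Muffin', 0, 1)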
#print(y_test)
# Feature names
recipe_features = df.columns.values[1:].tolist()
recipe_features
['Flour', 'Milk', 'Sugar', 'Butter', 'Egg', 'Baking Powder', 'Vanilla', 'Salt']

4# Training And Fitting The Model

Training means learning (determining) good values for all the weights and the bias from labeled examples.

# Fit the SVM model
model = svm.SVC(kernel='linear')
model.fit(X_test, y_test)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='linear',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
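
To see what "weights and bias" mean for a linear SVM, here is a minimal sketch that fits the same kind of model on six made-up points and prints what it learned. The arrays `X` and `y` and the name `clf` are assumptions for this example only; they are not the notebook's data.

```python
import numpy as np
from sklearn import svm

# Hypothetical [Flour, Sugar] percentages (illustrative, not the real data)
X = np.array([[55, 12], [47, 18], [50, 15], [39, 26], [34, 30], [38, 24]])
y = np.array([0, 0, 0, 1, 1, 1])  # 0 = muffin, 1 = cupcake

clf = svm.SVC(kernel='linear')
clf.fit(X, y)

# For a linear kernel the learned parameters are directly available
print('weights:', clf.coef_[0])        # one weight per feature
print('bias:', clf.intercept_[0])
print('support vectors:')
print(clf.support_vectors_)            # the points that define the margin
```

These three attributes (`coef_`, `intercept_`, `support_vectors_`) are the ones used later to draw the separating line and its margins.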

5# Evaluating The Model

Evaluation helps to find the model that best represents our data and to estimate how well the chosen model will work in the future.
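
The notebook evaluates the model visually. If you also want a number, one common option (not part of the original notebook) is cross-validation. The sketch below is a rough illustration on synthetic data: the two clusters and their centers are invented for the example.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import cross_val_score

# Synthetic [Flour, Sugar] data: 10 "muffins" and 10 "cupcakes"
# (cluster centers are made up for this illustration)
rng = np.random.default_rng(0)
muffins = rng.normal(loc=[50, 15], scale=2.0, size=(10, 2))
cupcakes = rng.normal(loc=[37, 27], scale=2.0, size=(10, 2))
X = np.vstack([muffins, cupcakes])
y = np.array([0] * 10 + [1] * 10)  # 0 = muffin, 1 = cupcake

# 5-fold cross-validation: fit on 4/5 of the rows, score on the remaining 1/5
scores = cross_val_score(svm.SVC(kernel='linear'), X, y, cv=5)
print('accuracy per fold:', scores)
print('mean accuracy:', scores.mean())
```

With only twenty recipes, each fold is tiny, so treat the resulting accuracy as a rough indication rather than a precise estimate.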

Here we created a hyperplane along with the dotted lines for the margins; on the left, you can see our SVM model, and on the right, you can see that we’ve maximized the margin. Here is the code:
# Get the separating hyperplane
w = model.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(30, 60)
yy = a * xx - (model.intercept_[0]) / w[1]
# Plot the parallels to the separating hyperplane that pass through the support vectors
b = model.support_vectors_[0]
yy_down = a * xx + (b[1] - a * b[0])
b = model.support_vectors_[-1]
yy_up = a * xx + (b[1] - a * b[0])
# Plot the hyperplane
sns.lmplot(x='Flour', y='Sugar', data=df, hue='Type', palette='Set1', fit_reg=False, scatter_kws={"s": 70})
plt.plot(xx, yy, linewidth=2, color='black')
# Plot the margins as dashed lines
plt.plot(xx, yy_down, 'k--')
plt.plot(xx, yy_up, 'k--');
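
For readers wondering where `a` and `yy` come from, here is the algebra behind the code above, written out as a short derivation. The symbols are my own notation: (w_0, w_1) stands for `model.coef_[0]`, c for `model.intercept_[0]`, x for Flour, y for Sugar, and (s_0, s_1) for a support vector.

```latex
% Decision boundary of a linear SVM in two dimensions
\begin{aligned}
w_0\,x + w_1\,y + c &= 0 \\
y &= -\frac{w_0}{w_1}\,x - \frac{c}{w_1}
   = a\,x - \frac{c}{w_1},
   \qquad a = -\frac{w_0}{w_1} \\
% Parallel line (same slope a) through a support vector (s_0, s_1)
y_{\text{margin}} &= a\,x + \left(s_1 - a\,s_0\right)
\end{aligned}
```

The second line is exactly the expression for `yy`, and the third is the expression for `yy_down` and `yy_up`, evaluated at the first and last support vector.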

6# Hyperparameter Tuning

A hyperparameter is a parameter whose value is used to control the learning process.

You can try other combinations of hyperparameters.
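
One way to try several combinations systematically is a grid search. The sketch below is not part of the original notebook: it uses synthetic data, and the candidate values for `kernel` and `C` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# Synthetic [Flour, Sugar] data (cluster centers are made up)
rng = np.random.default_rng(0)
muffins = rng.normal(loc=[50, 15], scale=2.0, size=(10, 2))
cupcakes = rng.normal(loc=[37, 27], scale=2.0, size=(10, 2))
X = np.vstack([muffins, cupcakes])
y = np.array([0] * 10 + [1] * 10)  # 0 = muffin, 1 = cupcake

# Candidate hyperparameter values (arbitrary, for illustration)
param_grid = {'kernel': ['linear', 'rbf'], 'C': [0.1, 1, 10]}

# Fit one model per combination and score each with 5-fold cross-validation
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X, y)

print('best combination:', search.best_params_)
print('best mean accuracy:', search.best_score_)
```

On data this easy to separate, several combinations will likely tie; the search becomes more informative on harder data sets.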

7# Prediction

The model's predict function can be used to predict outcomes for new inputs.

We’ve created a function called muffin_or_cupcake, and all we need to do is input the amounts of flour and sugar:

# Create a function to guess when a recipe is a muffin or a cupcake
def muffin_or_cupcake(flour, sugar):
    # predict returns an array with one label per input row (0 = muffin, 1 = cupcake)
    if model.predict([[flour, sugar]])[0] == 0:
        print('You\'re looking at a muffin recipe!')
    else:
        print('You\'re looking at a cupcake recipe!')

Now let’s use this function. Suppose our recipe is 50% flour and 20% sugar:

# Predict if 50 parts flour and 20 parts sugar
muffin_or_cupcake(50, 20)
You're looking at a muffin recipe!

Let’s plot the graph to confirm:

# Plot the point to visually see where the point lies
sns.lmplot(x='Flour', y='Sugar', data=df, hue='Type', palette='Set1', fit_reg=False, scatter_kws={"s": 70})
plt.plot(xx, yy, linewidth=2, color='black')
plt.plot(50, 20, 'yo', markersize=9);
Yep! The new recipe lands on the muffin side of the line, which agrees with the model's prediction!
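
The notebook imports pickle but the cells shown here never use it. One plausible use, sketched below as an assumption rather than as the original author's code, is to save the fitted model so it can be reused without retraining. The data, the name `clf` and the helper `classify_recipe` are made up for this example.

```python
import pickle

import numpy as np
from sklearn import svm

# Hypothetical [Flour, Sugar] percentages (illustrative, not the real data)
X = np.array([[55, 12], [47, 18], [50, 15], [39, 26], [34, 30], [38, 24]])
y = np.array([0, 0, 0, 1, 1, 1])  # 0 = muffin, 1 = cupcake
clf = svm.SVC(kernel='linear').fit(X, y)


def classify_recipe(fitted_model, flour, sugar):
    """Return 'muffin' or 'cupcake' instead of printing a message."""
    label = fitted_model.predict([[flour, sugar]])[0]
    return 'muffin' if label == 0 else 'cupcake'


# Serialize the fitted model to bytes and load it back
saved_bytes = pickle.dumps(clf)
restored = pickle.loads(saved_bytes)

print(classify_recipe(restored, 55, 12))
print(classify_recipe(restored, 34, 30))
```

Returning the label instead of printing it makes the function easier to reuse and to test. Only unpickle files you trust, since loading a pickle can execute arbitrary code.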

I will not extend this any further… If you like, see the other AISeries episodes, linked below :)

There you’ll find some more interesting information about this cutting-edge technology: AI, since the beginning of time ;)

More AI history:
1950-Alan Turing: Can Machines Think?
1956-John McCarthy: Can Machines Behave like Humans?
1970-Failure of Projects: Machine Translation (AI Winter)
1997-Deep Blue: Defeated Chess World Champion Garry Kasparov
2011-IBM/Watson: Beats the brightest trivia minds at Jeopardy.
2016-DeepMind/AlphaGo: Defeated Go World Champion Lee Sedol
2018-DeepMind/AlphaStar: Defeated StarCraft II professional player Grzegorz -MaNa- Komincz

OK! That’s all!

I hope you enjoyed that lecture.

If you find this post helpful, please click the applause button and subscribe to the page for more articles like this one.

Until next time!

I wish you an excellent day!

Download The File For This Project

30_muffins_Cupcakes_SVM.ipynb

Credits & References

Based on: Support Vector Machines: A Visual Explanation with Sample Python Code by Alice Zhao

Related Posts

00#Episode — AISeries — ML — Machine Learning Intro — What Is It and How It Evolves Over Time?

01#Episode — AISeries — Huawei ML FAQ — How do I get an HCIA certificate?

02#Episode — AISeries — Huawei ML FAQ Again — More annotation from Huawei Mock Exam

03#Episode — AISeries — AI In Graphics — Getting Intuition About Complex Math & More

04#Episode — AISeries — Huawei ML FAQ — Advanced — Even More annotation from Huawei Mock Exam

05#Episode — AISeries — SVM — Credit Card — Start to Finished — A Complete Colab Notebook Using the Default of Credit Card Clients Data Set from UCI

06#Episode — AISeries — SVM — Breast Cancer — Start to Finished — A Complete Colab Notebook Using the Default of Credit Card Clients Data Set from UCI

07#Episode — AISeries — SVM — Cupcakes or Muffins? — Start To Finished — Based on Alice Zhao post (this one)

Download All The Files For This Project

“The secret of getting ahead is getting started.” — Mark Twain.


Hi, Guys o/ I am J3! I am just a hobby-dev, playing around with Python, Django, Ruby, Rails, Lego, Arduino, Raspy, PIC, AI… Welcome! Join us!