Don’t Second Guess Yourself: Machine Learning 101

Nila Sadeeshkumar
Published in HackHer413
5 min read · Jul 8, 2019

Hey everyone, my name is Nila Sadeeshkumar and I am a rising senior at the University of Massachusetts, Amherst studying Computer Science with a focus in Data Science. I am also currently the Assistant Director for HackHer413. During this summer, I am interning at Liberty Mutual and am working on an Agile team that is developing a data ingestion pipeline for the product ByMile.

In a recent article I read, researchers found a substantial difference between how men and women apply for jobs. Men usually apply for a job if they meet at least 60% of its required qualifications, while women feel they need to meet all of the requested qualifications before they apply. Men tend to jump at new things they want to learn without questioning themselves, while women tend to second-guess themselves, feeling that they aren’t capable.

As minorities in tech, we need to stop second-guessing ourselves.

When I first heard about machine learning and blockchain technology, I immediately wanted to learn more. But since I didn’t feel proficient in intermediate CS skills such as web development, I asked myself how I could possibly move on to more advanced topics like machine learning. By staying in this mindset, I kept second-guessing myself and held back my own technical growth.

During my junior year, I finally took a machine-learning-based course and realized how narrow my perspective had been. I learned so much, so quickly, that I realized I wanted to specialize in machine learning. If I had stopped second-guessing myself earlier, I could have made this discovery sooner and been further along already.

If you feel like you are not smart enough or experienced enough to begin learning something that sounds challenging, I am here to tell you that you are wrong! If you want to learn something, go for it and don’t hold yourself back for any reason.

So, today I am going to walk you through a very simple “Hello World” machine learning project to help you begin exploring and grow instead of being afraid like I was.

What is Machine Learning?

What if I asked you to write code that labels an image as either an apple or an orange? You could start by writing rules based on assumptions about each fruit, but there would always be an exception to the rule.

Machine Learning is the study of algorithms that learn from examples and experience instead of relying on hard-coded rules.

Prerequisites

  1. Open up Jupyter Notebooks and hit “Try Jupyter with Python”
  2. Wait for it to start the server and open up a Jupyter notebook
  3. Click on File > New to open a new Python notebook (.ipynb)
  4. Then, import sklearn
  5. And, hit run!
This is my Jupyter Notebook (.ipynb) file with sklearn imported.
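
A minimal check for steps 4 and 5 might look like the cell below. Printing the version is just an extra, illustrative way to confirm that the import worked; the exact number you see depends on your installation.

```python
# If this cell runs without an ImportError, sklearn is ready to use.
import sklearn

# Printing the version is optional; it simply confirms which release is installed.
print(sklearn.__version__)
```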

Now let’s begin our first Machine Learning Project

We are going to create a classifier that takes data as input and assigns a label to it as output. The classifier we build will learn from the samples in the training data and create its own rules based on them.

We are going to work with the famous Iris flower data set. The data set contains 50 samples from each of three species of Iris: Setosa, Versicolour, and Virginica. Each sample includes four measurements: sepal length, sepal width, petal length, and petal width. Using these measurements, or features, we can classify every sample as Iris Setosa, Iris Versicolour, or Iris Virginica.

The machine learning algorithm will take the data and create rules based on it to label every iris flower with the proper species. We will explore how below.

Load and Examine the Data
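
Here is a minimal sketch of this step, assuming we use the copy of the Iris data set that ships with sklearn (load_iris). The variable name iris is just a choice for illustration.

```python
from sklearn.datasets import load_iris

# Load the Iris data set bundled with sklearn.
iris = load_iris()

# The four measurements (features) recorded for every flower.
print(iris.feature_names)

# The three species (labels) we want to predict.
print(iris.target_names)

# Number of samples and number of features per sample.
print(iris.data.shape)

# The measurements and the label of the very first flower.
print(iris.data[0], iris.target[0])
```

The labels in iris.target are stored as the numbers 0, 1, and 2, which index into iris.target_names. Note that sklearn spells the second species "versicolor".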

Understanding the data set

Split Up the Data Set into Training and Test Sets
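
One simple way to do the split is sketched below. It assumes we hold out the first flower of each species (rows 0, 50, and 100) as the test set and train on everything else; the row choice and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Assumption: hold out one flower per species. The data set is ordered by
# species in blocks of 50, so rows 0, 50, and 100 cover all three.
test_idx = [0, 50, 100]

# Training set: every row except the held-out ones.
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)

# Test set: only the held-out rows.
test_data = iris.data[test_idx]
test_target = iris.target[test_idx]

print(train_data.shape, test_data.shape)
```

For a larger, randomly chosen test set, sklearn also provides train_test_split in sklearn.model_selection.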

Splitting up the data set into training and test set

Train the Classifier

We create a decision tree classifier. Then we fit it to the training data, which is how it creates its rules.
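
A minimal sketch of these two steps, assuming the same hold-out split of rows 0, 50, and 100 (an illustrative choice). Setting random_state=0 is also an assumption, added only so the tree comes out the same on every run.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumed split: one flower per species held out for testing.
test_idx = [0, 50, 100]
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)

# Create the classifier, then fit it so it learns rules from the training data.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(train_data, train_target)

# The labels the classifier has learned to tell apart.
print(clf.classes_)
```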

Make Predictions based on our test data
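
Once the classifier is trained, we can ask it to label flowers it has never seen. The sketch below assumes the same illustrative split (rows 0, 50, and 100 held out) and compares the predictions with the true labels.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumed split: one flower per species held out for testing.
test_idx = [0, 50, 100]
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)
test_data = iris.data[test_idx]
test_target = iris.target[test_idx]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(train_data, train_target)

# Predict a species for each held-out flower.
predictions = clf.predict(test_data)

print("True labels:     ", test_target)
print("Predicted labels:", predictions)

# Fraction of test flowers that were labeled correctly.
print("Accuracy:", accuracy_score(test_target, predictions))
```

If the two printed arrays match, the classifier labeled every held-out flower correctly.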

Visualize the Decision Tree Classifier

This is how we print out the decision tree classifier we created to see the rules it learned from the data.
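
If you don't have a plotting setup handy, a text version of the tree is a quick alternative to a picture. The sketch below uses sklearn's export_text for that; it is a swapped-in, text-only technique rather than a graphical rendering, and it assumes the same illustrative split and random_state as before.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Assumed split: one flower per species held out for testing.
test_idx = [0, 50, 100]
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(train_data, train_target)

# Render the learned rules as indented text, one line per node.
rules = export_text(clf, feature_names=iris.feature_names)
print(rules)
```

Each indented line is a question about one measurement, and each "class" line is a leaf where the tree makes its decision. For a picture instead of text, sklearn.tree.plot_tree (which needs matplotlib) is another option.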
Decision Tree Classifier Visualization

Above, you can see the decision tree classifier that was created from the training data. It sets rules based on the data provided to help predict what type of iris a sample could be.

Let’s look at an example from the test data set! The first sample has the values [5.1, 3.5, 1.4, 0.2], which correspond to [sepal length, sepal width, petal length, petal width].

Using this data, let’s follow the decision tree to see what type of iris it predicts the sample to be. We start at the root node and check whether the petal length is less than 2.45 cm. In this case, our petal length is 1.4 cm, which is less than 2.45 cm, so the tree classifies our sample as Setosa, which is correct!
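
We can check the same walk-through in code. The sketch below assumes the illustrative split from earlier (rows 0, 50, and 100 held out) and asks the tree about this one sample. It also prints the question asked at the root node; depending on how ties are broken during training, that first question may be about petal length or petal width, since either one separates Setosa cleanly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumed split: one flower per species held out for testing.
test_idx = [0, 50, 100]
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(train_data, train_target)

# [sepal length, sepal width, petal length, petal width]
sample = [5.1, 3.5, 1.4, 0.2]

# predict expects a list of samples, so we wrap our single sample in a list.
predicted = clf.predict([sample])[0]
predicted_name = iris.target_names[predicted]
print(predicted_name)

# Node 0 is the root: look up which feature it tests and at what threshold.
root_feature = iris.feature_names[clf.tree_.feature[0]]
root_threshold = clf.tree_.threshold[0]
print(root_feature, "<=", root_threshold)
```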

Decision Tree Classifier Example Sample Classification

Congratulations, you have now completed your very first Machine Learning Project. Now, you can continue your learning journey :)

If you got stuck anywhere or something didn’t work, check out my code here on GitHub.

I would not call myself a pro in machine learning because I am also still learning. But, what I am here to tell you is that you should not hold yourself back or second guess yourself. If you want to learn something, then go for it. It may or may not be for you but you will never know if you don’t try!

Continue growing and check out more resources in machine learning!
