Day 1: Hello@ML

Krrish Dholakia
Journey Into Vision and AI
Oct 1, 2017

Disclaimer

This is my attempt to expand my own knowledge of artificial intelligence by sharing, on a weekly basis, what I learn from various online resources. I will always include links to the material I am using, so that you can go straight to people who know far more about ML/AI than I currently do. If you’d like me to elaborate on something, or if I have misinterpreted something and explained it incorrectly, please leave a comment! This will only help me improve the quality of the content on this publication.

Resource Materials

A great resource for getting started with Machine Learning is Andrew Ng’s course, available on both Coursera and YouTube. I already have some experience with the Coursera version, but having started watching the YouTube lectures, I prefer them: I enjoy the work Ng does with proofs and the student-teacher interactions in the course.

Another resource that I have begun using for Machine Learning is the course created by Udacity and Georgia Tech, which is available for free on the Udacity website.

Introduction

Let’s begin by defining what exactly machine learning is:

A field of study that gives machines the ability to learn without being explicitly programmed

- Arthur Samuel

This should give us a basic idea of what ML is and help us understand the purpose behind a lot of the work that we will be doing over this journey.

In Week 1, my rudimentary understanding of machine learning is that, given a set of data, you’re working towards developing a hypothesis (a prediction) for a certain event. The way to determine whether this hypothesis is accurate is to compare it to the true event and see how close the hypothesis came to predicting it.

This idea of measuring the difference between the hypothesis and the true event works both for predicting quantities like prices (housing, stocks, etc.) and for true/false events (where the hypothesis is either 1 or 0, and the true event is either 1 or 0).
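To make the comparison concrete, here is a minimal sketch that measures how far a hypothesis is from the true event. The absolute difference used here is my own assumption (the post does not commit to a particular measure), and all the numbers are made up for illustration.

```python
def absolute_error(predicted, actual):
    """Return how far a prediction is from the true value."""
    return abs(predicted - actual)


# Price-style prediction (hypothetical numbers, for illustration only).
price_error = absolute_error(predicted=120.0, actual=100.0)

# True/false prediction encoded as 1 or 0:
# the error is 0 when the guess is right and 1 when it is wrong.
correct_guess = absolute_error(predicted=1, actual=1)
wrong_guess = absolute_error(predicted=0, actual=1)

print(price_error, correct_guess, wrong_guess)
```

The same function covers both cases, which is why a single idea of "difference" can apply to prices and to true/false events alike.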

Arriving at our hypothesis

So the question that now comes to mind is: how do we arrive at this hypothesis?

Let’s explain this with an example:

Let’s say that, for some reason, we’re building a model that predicts the price of chairs. How do we go about this?

First, we have to decide what data we’re going to be looking at.

So we define the “features” of our data.

The logic behind this is that to develop our hypothesis we need a certain set of inputs, and therefore must determine what these inputs should be.

For now, let’s keep it simple and keep it to just 1 feature:

Since it’s a chair, something we may care about in predicting its price could be:

  • Price of wood used

To train a machine, we must also determine how many training examples we’re going to use.

In the interest of simplicity, let’s keep it at just 10 for now (in the real world you would be using a training set several orders of magnitude larger than this one).

This is what our table looks like right now:

Current representation of data

Now that we’ve got our features, let’s see how we go about arriving at our predictive model:

There are several techniques we could choose to arrive at this prediction. For today we’re only going to look at a pretty straightforward one:

Linear Regression

We start by plotting our data on a two-dimensional graph, where the x-axis is the price of wood and the y-axis is the real price of the chair.

What a scatter plot of given data could look like (not an accurate representation)

We can then try to simply fit a line through the scatter plot.

I’m assuming that you are already familiar with the equation:

y = mx + b (1)

  • y = prediction of what the price will be (read off the y-axis)
  • m = slope of the line
  • x = x-value (here, the price of wood)
  • b = y-intercept
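As a quick illustration of equation (1), the sketch below turns the line into a function that predicts a chair price from the price of wood. The function name and the values of m and b are hypothetical, chosen only to show the arithmetic.

```python
def predict_price(wood_price, m, b):
    """Equation (1): y = m * x + b, with x being the price of wood."""
    return m * wood_price + b


# Hypothetical slope and intercept, not learned from any real data.
m = 2.0   # each extra unit of wood cost adds 2 units to the chair price
b = 10.0  # predicted price when the wood costs nothing

predicted = predict_price(30.0, m, b)
print(predicted)  # 2.0 * 30.0 + 10.0 = 70.0
```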

Here’s what the data may look like if we try to pass a line through it (the graph is not an accurate representation of the data).

The line is found by adjusting its slope and intercept until it passes through, or as close as possible to, the scattered points.

However, as you may have guessed already, this may not always be an accurate way of predicting prices.
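One common way to find the line that passes closest to the points is ordinary least squares, which picks the slope and intercept that minimise the sum of squared vertical distances to the points. The post does not say which method is used, so treat this as one possible sketch; the training data below is invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m * x + b for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line always passes through the mean point.
    b = mean_y - m * mean_x
    return m, b


# Hypothetical training set of 10 chairs (made-up numbers).
wood_prices = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]
chair_prices = [32, 35, 41, 47, 49, 55, 60, 67, 70, 81]

m, b = fit_line(wood_prices, chair_prices)
print(f"slope m = {m:.3f}, intercept b = {b:.3f}")
```

Even the best-fitting line will miss most of the points by some amount, which is one reason the prediction is not always accurate.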

At this point, for the sake of extensibility and clarity, I’m going to rewrite our previous equation (1) as:

hypothesis = theta0 + theta1 * x (2)

where:

  • Hypothesis from (2) = y from (1)
  • theta0 from (2) = b from (1)
  • theta1 from (2) = m from (1)
  • x from (2) = x from (1)
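The renaming in the list above changes only the symbols, not the prediction. A small sketch, with hypothetical function names and parameter values, shows that both forms give the same number:

```python
def line(m, b, x):
    """Equation (1): y = m * x + b."""
    return m * x + b


def hypothesis(theta0, theta1, x):
    """Renamed form: theta0 plays the role of b, theta1 the role of m."""
    return theta0 + theta1 * x


x = 30.0  # hypothetical price of wood
from_line = line(m=2.0, b=10.0, x=x)
from_hypothesis = hypothesis(theta0=10.0, theta1=2.0, x=x)

# Same inputs, same prediction: only the names differ.
assert from_line == from_hypothesis
print(from_hypothesis)
```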

For these initial days, we’re going to maintain this linear relationship between the hypothesis and the features. As we delve into more advanced concepts, I think this relationship may change depending on the techniques we use.

Thank you for taking the time to read this! If you have any doubts or questions, feel free to leave them in the comments. I’ll be adding more content soon!

Relevant links:

Andrew Ng’s Coursera course: https://www.coursera.org/learn/machine-learning

Andrew Ng’s YouTube videos: https://www.youtube.com/watch?v=UzxYlbK2c7E&list=PLA89DCFA6ADACE599

Udacity+Georgia Tech Machine Learning Course: https://classroom.udacity.com/courses/ud262
