An Intro to Machine Learning

Week 7: AI6 Ilorin

The Scenic Route
ai6-ilorin
Feb 12, 2020


Six weeks later, and I finally found the will to open one of Dr. Andrew’s many videos on Machine Learning. The facilitators of AI Saturdays Ilorin were kind enough to supply us with resources that would enhance and accelerate our learning process — talk about ‘lecturers’ who are genuinely interested in your growth, even going as far as tracking said growth with a feedback strategy!

I missed one class and as expected, I was disoriented in the weeks to come. The class I missed, Week 5, was the very introduction to Machine Learning. Thankfully, there were videos hidden in a forgotten folder in my workspace.

What exactly is Machine Learning?

According to Arthur Samuel in 1959, Machine Learning gives computers the ability to learn without being explicitly programmed.

Remember this?

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. — Tom Mitchell

I totally understand the above statement now. Funny how everything becomes clear as day, after some time.

Let me explain:

Imagine you want to build a model for your forum. You want the model to distinguish which registered users on the forum are spam and separate them from the users that are not. The classification of each registered user as spam or not spam is the task T.

The computer needs to learn. Learning means improving its performance at a particular task with experience. It needs to experience that which needs to be learnt (the task, T) before it can improve its performance (P). That experience, E, is the data available for the machine to learn from. And the performance measure P refers to how well the model does at distinguishing registered users.
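Mitchell’s definition can be sketched in code. Below is a minimal, entirely made-up illustration in Python: the labelled data is the experience E, the classification job is the task T, and accuracy is the performance measure P. The numbers, threshold rule, and names are my own inventions, not anything from the class.

```python
# Experience E: past forum users as (posts_per_hour, links_per_post), with known labels.
experience = [
    ((0.5, 0.1), "not spam"),
    ((0.4, 0.0), "not spam"),
    ((9.0, 4.0), "spam"),
    ((7.5, 3.2), "spam"),
]

# A trivial "learner": average the link rate of known spammers, use half of it as a cutoff.
spam_link_rates = [features[1] for features, label in experience if label == "spam"]
threshold = sum(spam_link_rates) / len(spam_link_rates) / 2

# Task T: classify a user as "spam" or "not spam".
def classify(user):
    return "spam" if user[1] >= threshold else "not spam"

# Performance measure P: accuracy on the known examples.
correct = sum(classify(features) == label for features, label in experience)
accuracy = correct / len(experience)
print(accuracy)  # 1.0 on this toy data
```

Feed it more experience E and recompute the threshold, and P can improve: that loop is the learning Mitchell describes.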

Machine Learning is an application of AI. It focuses on the development of computer programs that can access data and use it to learn for themselves. As the models we build are exposed to new data, they (the models) are able to independently adapt. They learn from previous experience to produce reliable results. Machine Learning is employed in various applications of day-to-day life, ranging from virtual personal assistants like Alexa and Siri, to video surveillance, to face recognition like Clearview AI, the tech start-up that helps law enforcement match photos of unknown people to their online images.

Machine Learning is categorized into 4 types:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Semi-supervised Learning
  4. Reinforcement Learning

The two types we’re more focused on are Supervised and Unsupervised Learning.

In Supervised Learning, users input correct answers (known data) to learn from, while the machine uses that information to guess new answers. Supervised Learning falls into two classes: Regression and Classification.

Linear Regression in ML

In the world of big data, powerful computers and AI, Linear Regression is one of the fundamental statistical and machine learning techniques.

First of all, it is important to understand that Regression, in itself, searches for relationships among variables. There are two kinds of variables in this discourse: the dependent variable y and the independent variable x. Problems differ by virtue of the number of independent variables they have.

An example:

Employee salaries are examined in relation to a number of factors, like the employee’s level of experience, their role, gender, and perhaps the city where their branch is situated. We are trying to determine the relationship between these factors, or features, and the salary.

Here, the salary is the dependent variable y, depending on the level, gender, or role. The factors mentioned above are the independent variables x.

It could also be about forecasting the prices of buildings based on their locale, number of rooms, distance to ‘happening’ places, even the age of the building. The price (y) is dependent on the independent features (x). That is, the price of a building or piece of land depends on any or all of the features mentioned at the beginning of this long paragraph. A house in Lekki, despite having the same number of rooms as a house in Gaa-Akanbi, will never be the same price. Not while Gaa-Akanbi is Gaa-Akanbi.

So, the aim of Regression Analysis is to:

  • Answer whether these independent features really influence the price of a house, and how they influence it.
  • Show how these variables are related.
  • Forecast a new price using a new set of predictor variables.
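All three aims can be seen at once in simple linear regression with one feature. The figures below are invented for illustration: price in millions (y) against number of rooms (x), fitted with the classic least-squares formulas (slope = covariance of x and y divided by variance of x).

```python
# Invented toy data: number of rooms (x) vs. price in millions (y).
rooms  = [2, 3, 4, 5, 6]
prices = [10, 14, 18, 22, 26]  # a perfectly linear toy relationship

n = len(rooms)
mean_x = sum(rooms) / n
mean_y = sum(prices) / n

# Least-squares estimates: slope = cov(x, y) / var(x); intercept from the means.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(rooms, prices)) \
        / sum((x - mean_x) ** 2 for x in rooms)
intercept = mean_y - slope * mean_x

print(slope, intercept)       # 4.0 2.0 — how rooms relate to price
print(intercept + slope * 7)  # 30.0 — forecast for an unseen 7-room house
```

A non-zero slope answers whether the feature influences the price (aim one), the fitted line shows how the variables are related (aim two), and plugging in a new x forecasts a new price (aim three).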

Dependent variables can be discrete, continuous, or categorical. Continuous outputs are exactly like prices; a price could range from a thousand dollars to a million dollars.

Discrete outputs are limited to black-or-white answers with no grey area. For example, in determining whether a tumor is malignant or benign (y) depending on its size (x), the output is limited to one of two answers and nothing more.

Categorical outputs allow for a few more scenarios, but nowhere near the range of continuous outputs.

Since Week 5, we’ve learnt simple Linear Regression (having a single independent variable, x). We learnt about its cost function and gradient descent all through the next week. It is important to reiterate what the cost function and gradient descent are useful for.

While I’m still working out the math of it all (currently scouring my brother’s room for a New General Maths or Further Maths textbook), I’ve been able to comprehend the English, thanks to Adnan Haddy’s tutelage and a few Medium articles.

A cost function helps the learner correct its behaviour to minimize mistakes. It shows you how wrong your model is in terms of its ability to estimate the relationship between your variables x and y. It basically tells you how badly your model is performing: the ultimate ML critic.

Models learn by minimizing this cost function. So the idea is to minimize it!

To minimize the cost function, you use gradient descent. Gradient descent gives the model directions for reducing its errors.
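Those “directions” can be written out for the simple case. For one-variable linear regression with mean squared error as the cost, the gradients with respect to the slope w and intercept b have a known closed form, and each step nudges both a little downhill. The data and learning rate below are made up for the sketch; this is not our class code.

```python
# Invented toy data following y = 3x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 4.0, 7.0, 10.0]

w, b = 0.0, 0.0   # start with a bad guess
lr = 0.05         # learning rate: how big each downhill step is
n = len(xs)

for _ in range(2000):
    # Cost J = (1/n) * sum((w*x + b - y)^2); its gradients:
    grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
    grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
    # Step opposite the gradient — the direction that reduces the error.
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # close to 3.0 and 1.0
```

If the learning rate is too large the steps overshoot and the cost can blow up; too small and the descent crawls. That trade-off is most of the art here.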

It wasn’t until Week 7, transitioning from Linear, Multiple and Polynomial Regression to Logistic Regression, that I began to grasp things well enough. Multiple Regression is just regression with two or more independent variables, and Polynomial Regression adds higher powers of them. Simple in English, but harder to interpret in Math.

Logistic Regression, with its neatly named Sigmoid Function, is the most recently treated topic in our AI class. I recall the instructor standing over a flurry of arrays and unfinished equations on a whiteboard.
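The Sigmoid Function itself is small enough to write out. It squashes any real number into the interval (0, 1), which is why Logistic Regression uses it to turn a linear score into something readable as a probability. A quick sketch (the decision rule shown is the common 0.5 cutoff, not necessarily what we used in class):

```python
import math

def sigmoid(z):
    """Squash any real z into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Large negative scores head towards 0, large positive ones towards 1; zero sits at 0.5.
print(sigmoid(0))   # 0.5
print(sigmoid(6))   # roughly 0.9975
print(sigmoid(-6))  # roughly 0.0025

# In Logistic Regression, a linear score w*x + b is passed through the sigmoid,
# and a common decision rule predicts the positive class when the result is >= 0.5.
def predict(w, b, x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0
```

That squashing is what separates it from Linear Regression: the output stops being an unbounded price-like number and becomes a malignant-or-benign style answer.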

A typical AI Saturday.

The class ended with us being grouped into threes for class projects, to be presented in about three weeks’ time.

I really need to start posting our group pictures.
