A Brief History of Machine Learning by Kamal Hawking

Kamal Raydan
Published in Zaka
10 min read · Dec 15, 2020
Photo by Clarisse Croset on Unsplash

Trivia: Machine learning (ML), deep learning (DL), Elon Musk, Google, Dark Souls 2: Scholar of the First Sin, AlphaGo, (Hanson Robotics’) Sophia, Pac-Man, Kamal — you’ve probably heard of at least two of these nine names and expressions, and you’re wondering what the link between them is.

Here’s the answer: Artificial Intelligence (AI)! All of the above are either subsets, users, admirers, or fearers of AI, a wonderful technology that could be the greatest incubator for human technological and social progress, or a futuristic weapon condemning us to an eternal cyber-doom.

Let’s not be too melodramatic for now, and think of AI at the macro scale for a quick second. In simple terms, artificial intelligence is giving machines the ability to learn and perform human-like tasks.

So, when, what & how do machines learn? Let’s dive in, shall we!

Outline of the blog:

  1. When?
  2. What?
  3. How?
  4. What to take away
Figure 1. Can you guess what’s happening here? | Dataversity

When?

So, you’ve probably come across the term Machine Learning at some point and thought,

“Is this the end?”

Good news, it still isn’t… yet.

Machine Learning is a term that was coined in 1959 by Arthur Samuel, who described it as:

“The field of study that gives computers the ability to learn without being explicitly programmed.”

This captures the essence of what Machine Learning tries to achieve. Samuel, while working for IBM, wrote a program to play the game of Checkers. While he wasn’t a Checkers aficionado himself, he had the program play thousands of games against itself until it racked up enough “experience” to challenge people at it. The premise of the program was simple: its main goal was to play favorable moves that put it in a position to win. Even computer programs enjoy rewards!

In fact, Alan Turing, a polymath, coined the term “learning machine” nearly a decade before Samuel, in his 1950 paper “Computing Machinery and Intelligence,” which introduced the now widely known Turing Test in an attempt to frame the question of Artificial Intelligence.

All this is to say that Machine Learning is not a new concept. Only recently, thanks to Moore’s Law, has greater processing power allowed Artificial Intelligence to rightfully gain major traction; it is now found everywhere. In fact, Thomas Bayes, born in 1701, laid the groundwork for what became Bayes’ Theorem, published in a paper dating back to 1763 and later used for classification and decision making.

What?

As previously hinted, ML is a field that uses statistical models to allow machines to mimic human task performance without needing explicit, hand-coded rules for every case.

Over the decades, as the theory behind Machine Learning algorithms developed, so did their definitions.

Figure 2. Machine Learning in a nutshell | pinterest

ML [the fidget spinner-looking center] breaks down into three main categories

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

Conveniently, the list happens to be arranged in increasing order of complexity, from Supervised Learning being the least complex to Reinforcement Learning being the most complex (relatively speaking).

So let’s dissect each of the learning approaches to get a better understanding of them.

2.1 Supervised Learning Algorithms

The world of Supervised Learning (SL) is an intuitive one albeit nuanced in its own way. It is a type of learning that feeds [learns] off of data that is labeled.

How so?

Before I get into what it means for data to be labeled, I think it is important to first classify [get it?] the categories [I did it again…] of problems that come under Supervised Learning.

There are mostly two types of problems you will encounter when dealing with Supervised Learning:

  1. Classification-type problems
  2. Regression-type problems

Classification is the act of classifying certain observations into a category. Usually it is a 0/1, Dog/Cat, Yes/No, Spam/Not Spam type of category, which we call Binary Classification. You could, however, have multiple classes to identify, such as Bird/Cat/Mouse/Dog or Yes/No/Maybe, in which case it is called Multi-Class Classification.

There are many algorithms available to tackle classification problems, some of which are:

  1. Logistic Regression
  2. Support Vector Machines
  3. Neural Networks, etc.
Figure 3. Classification | Medium

On the other hand, Regression is a problem whereby we fit a representative line to the data so that we are able to interpolate or extrapolate new, real-valued outputs. (Despite what the name suggests, the model isn’t a “lesser” form of the data; the term comes from Francis Galton’s “regression to the mean.”)

As with everything, there are many algorithms that are equipped to take on regression problems, some of which are but not limited to:

  1. Simple & Multiple Linear Regression
  2. Support Vector Regression
  3. Neural Networks…
Figure 4. Simple Linear Regression | Medium

So now that we know what types of problems might arise, let’s go back and explore what it means for data to be labeled.

In a given dataset, you usually have a bunch of features [also called predictors] and the variable to be predicted [the dependent variable]. A feature is a measured characteristic of our problem that helps describe each instance. In Supervised Learning, the dependent variable is known and used while training; hence the learning is supervised and the data is labeled.

So let’s say you’re trying to classify a salmon vs. bass fish, some of the features would be lightness, width, number of fins, and so forth.
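To make “labeled” concrete, here is a minimal sketch of that salmon-vs-bass classifier using Scikit-Learn’s LogisticRegression. The measurements are entirely made up for illustration; real fish data would obviously look different.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical labeled dataset: each row is [lightness, width] for one fish
X = [[0.9, 2.1], [0.8, 2.4], [0.7, 2.2],   # salmon-like measurements
     [0.3, 4.0], [0.2, 4.5], [0.4, 4.2]]   # bass-like measurements
# The dependent variable below is what makes this data "labeled"
y = ["salmon", "salmon", "salmon", "bass", "bass", "bass"]

classifier = LogisticRegression()
classifier.fit(X, y)                       # supervised: learns from X AND y

# Predict the class of a new, unseen fish
print(classifier.predict([[0.85, 2.3]]))   # a salmon-like fish
```

The key point is that `fit` receives both the features and the labels; that pairing is exactly what “supervised” means.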

2.2 Unsupervised Learning Algorithms

Unsupervised Learning gets a tiny bit more abstract since the model we deploy learns off of data that isn’t labeled. As you’ve probably guessed, data that is not labeled refers to data that only has a bundle of features but no dependent variable to help categorize those features.

What ends up happening now is that it’s up to the algorithm to find the hidden relations between the features. This is important because, when faced with millions of data points and high-dimensional feature spaces, the structure of it all is not so evident. Therefore, we pass it on to an Unsupervised Learning model in order to capture some of the non-trivial insights hidden within the data.

To deal with this, two widely used ML concepts called clustering and association are deployed.

Clustering is a way of throwing a bunch of features, with nothing to characterize them, onto your model and allowing it to come up with clusters of data that are similar in their own ways. An example of an algorithm deployed to tackle this problem is called K-Means Clustering.

Figure 5. K-Means Clustering (Notice the colors?) | Javatpoint
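As a minimal sketch with made-up, clearly separated points, Scikit-Learn’s KMeans can be asked to discover the groups on its own; note that no labels are ever provided:

```python
from sklearn.cluster import KMeans

# Unlabeled data: feature pairs only, no dependent variable
X = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],   # one natural group
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]   # another natural group

# Ask K-Means to discover 2 clusters by itself
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)  # e.g. [0 0 0 1 1 1]: points in the same group share a cluster id
```

Which cluster gets id 0 and which gets id 1 is arbitrary; only the grouping itself is meaningful.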

Association rule learning, on the other hand, while it may seem similar to Clustering, serves a different purpose. Let’s take a clear example on Association from Wikipedia:

“For example, the rule {onions, potatoes} => {burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities.”

One of many algorithms used to tackle Association rule learning is the Apriori Algorithm, which is mainly used to find trends in databases of transaction histories, as seen below:

Figure 6. Association Rule Learning using Apriori | kdnuggets
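The two quantities at the heart of a rule like {onions, potatoes} => {burger}, namely support and confidence, can be computed by hand. A minimal sketch with a made-up basket list:

```python
# Hypothetical supermarket transactions (made-up data for illustration)
transactions = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"burger", "beer"},
]

antecedent = {"onions", "potatoes"}
consequent = {"burger"}
rule = antecedent | consequent

# Support: fraction of all baskets containing the whole itemset
support = sum(rule <= t for t in transactions) / len(transactions)

# Confidence: of the baskets with onions+potatoes, how many also had burger?
confidence = sum(rule <= t for t in transactions) / sum(antecedent <= t for t in transactions)

print(support, confidence)  # support = 0.4, confidence is about 0.67
```

Apriori’s contribution is not these formulas but an efficient way to avoid counting every possible itemset in a huge database.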

In a formal sense, association rule learning is a method for discovering interesting relationships between variables, whereby observing one variable lets us infer another through the discovered relationship.

2.3 Reinforcement Learning

Reinforcement Learning (RL) is probably one of the coolest ML techniques out there. Remember, back in the day, when you couldn’t finish that level in Mario Kart in record time without failing miserably? Well…

Figure 7. Mario Kart + R.L | Kevin

RL is a somewhat complicated topic with deep mathematical underpinnings, but at its core it consists of an agent and an environment: the agent performs an action (A_t), and the environment updates the agent’s state and either penalizes or rewards the agent based on that action. [Simplistically speaking]

Figure 8. Reinforcement Learning | Nuggets
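That act/observe/reward loop can be sketched with tabular Q-learning, one of the simplest RL algorithms. Everything here is made up for illustration: a five-state line where reaching the rightmost state pays +1 and every other step costs a little.

```python
import random

# Tiny made-up environment: states 0..4 on a line; reaching state 4 pays +1.
# The agent moves left (action 0) or right (action 1); each step costs -0.01.
def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
    reward = 1.0 if next_state == 4 else -0.01
    done = next_state == 4
    return next_state, reward, done

# Q-table: estimated future reward for each (state, action) pair
Q = [[0.0, 0.0] for _ in range(5)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = Q[state].index(max(Q[state]))
        next_state, reward, done = step(state, action)
        # Nudge Q toward: observed reward + discounted best future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

# After training, every non-terminal state prefers action 1 (move right)
print([q.index(max(q)) for q in Q[:4]])
```

The `step` function plays the role of the environment and the Q-table update plays the role of the agent learning from its rewards and penalties.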

This is where we will leave it… for now.

How to Build & Train ML Models in Python

3.1 How do Machine Learning Techniques Learn?

Having read this far into the blog, you might be itching to see some sample code, or better yet, the specific classes that take care of the “learning” for you. Well, you’ve come to the right section!

Note: These modules are designed so that no time needs to be wasted creating Machine Learning frameworks from scratch. So if you want a deeper idea of how these things work, visit Scikit-Learn’s documentation site.

I will briefly be showcasing the different classes that Scikit-Learn [A Machine Learning library] & Keras [A Deep Learning library] have to offer with regards to:

  1. Simple Linear Regression
  2. Decision Trees
  3. Neural Networks

The rest of the algorithms usually follow the same idea, just from different classes.

3.1.1 Simple Linear Regression (S.L.R.)

In order to begin using the method suggested by SLR, you must first import the right class

from sklearn.linear_model import LinearRegression

Then, in order to use the Linear Regression class, you must instantiate an object from the class — this is basically creating a new empty model ready for you to train

regressor = LinearRegression()

Note: Sometimes when calling the constructor of a class, you can tune its parameters to improve your model if needed.

Having done this, you can now call functions such as regressor.fit() & regressor.predict() to be able to create your model and predict values, respectively!
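Putting the two calls together, a minimal end-to-end run might look like this; the numbers are made up and deliberately follow y = 2x + 1:

```python
from sklearn.linear_model import LinearRegression

# Made-up training data that follows y = 2x + 1 exactly
X = [[1], [2], [3], [4]]   # features must be 2-D: one column, four samples
y = [3, 5, 7, 9]

regressor = LinearRegression()
regressor.fit(X, y)              # "training" = finding the best-fit line

print(regressor.predict([[10]]))  # extrapolates to roughly [21.]
```

Because the data is perfectly linear, the fitted line recovers the slope of 2 and intercept of 1, so the prediction at x = 10 lands on 21.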

3.1.2 Decision Trees (D.T.)

The same goes for Decision Trees. First, you will have to identify whether you need a Classification-based DT or a Regression-based DT (here we use a Regression-based DT):

from sklearn.tree import DecisionTreeRegressor

From there, we instantiate an object from the DecisionTreeRegressor class by calling its constructor and setting an argument called “random_state”.

I explain the concept of “random_state” in my Intro to Data Science (Part II) blog in the Modeling & Training section!

regressor = DecisionTreeRegressor(random_state = 0)

Now, as with all the others, you can continue on to training your model and predicting continuous values using the same regressor.fit() & regressor.predict() methods.
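A minimal end-to-end sketch with made-up data shows the tree’s characteristic behavior: it carves the feature space into regions and predicts a constant value per region.

```python
from sklearn.tree import DecisionTreeRegressor

# Toy training data forming two clearly separated value plateaus
X = [[1], [2], [3], [10], [11], [12]]
y = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]

regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)

# The tree learns a split between x=3 and x=10
print(regressor.predict([[2.5], [11.5]]))  # → [ 5. 20.]
```

Unlike the straight line of Linear Regression, the tree’s predictions are piecewise constant, which is why it handles step-like data so naturally.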

3.1.3 Neural Networks

Neural Networks are by far the most interesting, in my humble opinion. The workflow to use their powers is the same, with only a couple of differences. The first is that there is more than one class to import, since a neural network is an entity, and entities have components.

The second is that there are two APIs for you to choose from when considering Neural Networks, and they are:

  1. Sequential API: This is an API that covers the basic feed-forward neural network, nothing crazy.
  2. Functional API: This API, on the other hand, gives you the flexibility to move layers around and connect them in a way you see fit.

Each API has its own syntax and its own use cases.

Once you’ve chosen your API, you can import the necessary classes.

Sequential API

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

Then, as is getting common by now, we can instantiate the “Sequential” class and perform various things with it.

model = Sequential()

#Create hidden layers
#add an input layer of 2 neurons and a hidden layer with 4 neurons
model.add(Dense(4, input_dim=2, activation = 'relu'))
#add another hidden layer with 4 neurons
model.add(Dense(4, activation = 'sigmoid'))
...
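Under the hood, each Dense layer is just a matrix multiplication plus a bias vector, passed through its activation function. A rough NumPy sketch of the forward pass of the two layers above, using randomly chosen stand-in weights (a real Keras model would learn these during training):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.5, -1.2])   # one sample with 2 input features

# Dense(4, input_dim=2, activation='relu'): weight matrix is 2x4, bias has 4 entries
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
h1 = relu(x @ W1 + b1)

# Dense(4, activation='sigmoid'): weight matrix is 4x4, bias has 4 entries
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)
h2 = sigmoid(h1 @ W2 + b2)

print(h2.shape)  # (4,) -- four sigmoid outputs, each squashed into (0, 1)
```

This is exactly the computation `model.add(Dense(...))` wires up for you, repeated once per layer.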

Functional API

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

Create the input layer, hidden layers, then instantiate an object from the functional class “Model”,

#Input Layer
input_layer = Input(shape=(2,)) #e.g. 2 input features, as in the Sequential example
#Create Hidden Layers and connect different layers together
layer1 = Dense(...)(input_layer)
layer2 = Dense(...)(layer1)
...
#Instantiate Object
model = Model(inputs = input_layer, outputs = ...)

Conclusion

So the main takeaway from this entire blog should be that there is this thing called Machine Learning and that it splits into three main categories: Supervised Learning, Unsupervised Learning & Reinforcement Learning. Each of those categories, like a person, has its own personality [definition], its own use cases &, of course, its own superpowers [algorithms], with ups and downs to each one! The more you read about them and apply them, the more they become second nature.

The field of Artificial Intelligence is large yet “irreplaceably” rewarding. The very thing you used to consider magic as a kid is now within arm’s length. So what are you waiting for? Go spill some magic!

If you have any questions, please don’t hesitate to reach out to me on Instagram, LinkedIn, or by mail!

Interested to start your journey in Machine learning? Register today to Zaka’s live virtual Artificial Intelligence Bootcamp and get ready to develop your knowledge in AI!

Don’t forget to support with a clap!

You can join our efforts at Zaka and help democratize AI in your city! Reach out and let us know.

To discover Zaka, visit www.zaka.ai

Subscribe to our newsletter and follow us on our social media accounts to stay up to date with our news and activities:

LinkedIn | Instagram | Facebook | Twitter | Medium
