A Detailed Overview of Artificial Intelligence

22 min read · Nov 17, 2022

AI (artificial intelligence) has been called the most misunderstood technology of our era, and with good reason…

It seems like everywhere you look, there are stories about how AI is going to take over the world and destroy humanity as we know it. But the truth is, AI is just a tool — and like any tool, it can be used for good or for evil, depending on who is using it and for what purpose.

A huge chunk of this mistrust comes from the fact that people just don’t get how AI works. It’s not magic; it’s math.

This article will clear things up, hopefully giving you a thorough understanding of AI and how it fundamentally works. Scroll down to the bottom for a TL;DR!

Introduction to Artificial Intelligence

Let’s start with the basics: think of Artificial Intelligence like a child.

It learns slowly from its surrounding environment, pulling information from different factors to feed its development. As a result, as the young child grows into an adult, they will be able to contribute to society and find work.

A time lapse of two kids growing up to 16 years of age

Artificial intelligence (AI) is simply the ability of a computer program or a machine to “think” and “learn”. Note, artificial intelligence isn’t actually “intelligent”.

AI is mostly just math + data + algorithms

However, it appears intelligent because it is able to identify a collection of complex patterns humans can’t easily detect or understand.

These patterns are derived from processes such as deep learning, machine learning, and artificial neural networks, which is where the term “artificial” in “artificial intelligence” comes from. If you think of AI as a powerful statistical method, you’ll be closer to the mark.

How Artificial Intelligence Generally Works

The model on the right shows the general process on how AI functions

In a nutshell, AI works by combining large amounts of data with fast, iterative processing and algorithms, allowing the software to learn automatically from patterns or features in the data.

AI = Data + Algorithm + Iterative Processes

Background Information

Before we get into the technicalities of algorithms and how they function, there are a few things we need to cover: computing power, data, and mathematics.

These topics sound really nerdy, but I promise they are important!

Computing Power

Computing power is simply computer performance; the amount of useful work accomplished by a computer system. Generally computer performance is estimated in terms of accuracy, efficiency and speed of executing computer program instructions.

With the rise of artificial intelligence as well as many other emerging technologies, there is an increasing demand for these powerful computers and, as a result, processing power is rapidly increasing.

There’s even a law about it: Moore’s law!

Quick history lesson: in 1965, Gordon Moore predicted that the number of transistors in a dense integrated circuit (IC) would double every year. In 1975, he revised that to a doubling about every two years, which is the version most people know today.

A dense integrated circuit (IC)

An IC is a chip that contains a high density of transistors and other electronic components. The high density of components on an IC allows it to perform complex functions while occupying an extremely small amount of space.

Rather than a law of physics, Moore’s law was an observation and projection of a historical trend. The need for computation power is incredibly important.
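To make the doubling concrete, here is a small sketch of the arithmetic behind a "doubles every two years" trend. The function name and the starting count of 1,000 transistors are hypothetical, picked only to keep the numbers easy to follow.

```python
def projected_transistors(start_count, years, doubling_period_years=2):
    """Project a transistor count, assuming one doubling per period."""
    doublings = years / doubling_period_years
    return start_count * 2 ** doublings

# Hypothetical starting point: 1,000 transistors.
for years in (2, 10, 20):
    print(years, "years:", round(projected_transistors(1_000, years)))
```

Ten doublings over twenty years multiplies the starting count by 1,024, which is why a steady doubling adds up so quickly.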

You can think of computational power in AI as being like the horsepower in a car.

The more horsepower a car has, the more powerful it is and the faster it can go. The same is true for computational power in AI. The more computational power an AI system has, the more powerful it is and the faster it can process information.

To put this into perspective, let’s talk about simulating molecules.

Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. These simulations are incredibly important because by understanding the dynamics of molecules, scientists can learn about the properties of matter and the interactions between molecules. This knowledge can be used to develop new materials and more effective drugs.

The thing is, such simulations require massive amounts of computing power, because interactions among three or more particles quickly become hairy and complex.

For instance, we can’t yet simulate even a molecule as small as caffeine exactly.

That limits how precisely we can model the way caffeine interacts with our bodies. Even with today’s quantum computers, these simulations are still approximate.

Data

With the massive revolution in technology, I’m pretty sure everybody knows what data is, but its importance is often overlooked. Data is essentially raw information that can be interpreted in many ways. It’s knowledge!

There’s a reason why big corporations are obsessed with customer data, and will do anything to get their hands on it. With that knowledge, corporations are able to make informed decisions about the company’s direction, strategies, and operations.

In terms of AI, data is fundamentally how models learn and improve.

Math

Gradient descent and matrix multiplication are both often found at the heart of machine learning techniques

Oh no… It’s math.

If you hate math, AI is bad news for you, because mathematics plays a huge part in an AI’s ability to do everything! Read, write, debate, you name it.

Without a comprehensive understanding of math, all we see is the magic behind AI, never actually understanding the process that creates it.

Breakdown of Artificial Intelligence

Diagram of a breakdown of the process of Artificial Intelligence

AI can be split into two major concepts: Machine Learning and Deep Learning.

Machine Learning takes care of the strategy of training the model and Deep Learning carries out all the dirty work (basically the math).

Deep learning is to machine learning what the roots are to a tree. Just as the roots of a tree give the tree its strength and stability, deep learning provides the foundation for strong and accurate machine learning.

Machine learning

Machine learning is exactly what it sounds like: “machine” “learning”, where a machine learns from data by itself instead of being told exactly what to do at every step.

Machine Learning is like a child growing up. The child starts off not knowing anything, but as it is exposed to more and more data (experiences, information, etc.), it gradually learns and improves.

There are three main types of Machine Learning:

The three main types of Machine Learning: Supervised, Unsupervised and Semi-supervised Learning

Supervised Machine Learning
Unsupervised Machine Learning
Semi-Supervised Learning

Honourable Mention (Reinforcement Learning)

Let’s Start with Supervised Machine Learning

General Supervised Machine Learning Model Diagram

Supervised machine learning uses labeled datasets to train algorithms to classify data or predict outcomes accurately.

Think of supervised machine learning like teaching a child how to read. You start by teaching them the alphabet, then how to put those letters together to form words, and finally how to read sentences and paragraphs. With each step, you are providing the child with more and more information until they are able to read on their own.

Supervised machine learning can be split into 2 models:

Regression Model ▶️ Understand the relationship between dependent and independent variables
Classification Model ▶️ Accurately assign data into different categories or classes

Comparison between classification and regression model

1. Regression Model

Regression models are used to understand the relationship between dependent and independent variables.

They are widely used in scenarios where the output needs to be a continuous numeric value, for instance, height or weight. The ultimate goal of the regression algorithm is to plot a best-fit line or curve through the data.

The main regression models can be split into:

a. Linear Regression ➡️Identify the relationship between two variables.

b. Logistic Regression ➡️Used when dependent variable is categorical or has binary outputs like ‘yes’ or ‘no’.

Linear Regression

Linear Regression is a tool to identify the relationship between two variables, typically used for making future predictions.

Linear regression is simply a line on a graph that shows the relationship between two variables, one on the x-axis and one on the y-axis. In supervised learning, linear regression is used to find the line of best fit for a set of data.

This line is then used to make predictions about new data.
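Here is a minimal sketch of finding a line of best fit with ordinary least squares and then using it for a prediction. The data (hours studied vs. test score) is made up for illustration, and `fit_line` is a hypothetical helper name, not something from a particular library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data that follows score = 50 + 2 * hours exactly.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for a new point: 6 hours
print(slope, intercept, predicted)
```

Real data will not sit perfectly on a line, but the same formula still finds the line that keeps the squared errors as small as possible.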

Logistic Regression

Logistic regression is used when the dependent variable is categorical or has binary outputs like ‘yes’ or ‘no’.

Logistic regression is similar to a person sorting a stack of papers into two piles, one for “to be filed” and one for “to be shredded.” The person looks at each piece of paper and decides which pile it goes into based on its contents. In the same way, logistic regression looks at each piece of data and decides which category it belongs to.

This is why it is so useful for solving binary classification problems.
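To show the two-pile idea in code, here is a rough sketch of logistic regression with a single input feature, trained with plain gradient descent. The tiny dataset, the learning rate, and the number of steps are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squash any number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, learning_rate=0.5, steps=200):
    """Fit a weight and bias by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / len(xs)
        b -= learning_rate * grad_b / len(xs)
    return w, b

def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Made-up data: negative inputs belong to pile 0, positive inputs to pile 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b = train_logistic(xs, ys)
print([predict(w, b, x) for x in xs])
```

The model outputs a probability, and the 0.5 cutoff is what turns that probability into a "yes" or "no" pile.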

2. Classification Model

The Classification Model is a type of supervised learning algorithm that is used to accurately assign data into different categories or classes.

It recognizes specific entities and analyzes them to conclude where those entities must be categorized.

Think of the classification model simply as a filter that sorts items into different categories.

Some of the classification algorithms are as follows: k-nearest neighbor, naive bayes, tree-based algorithms.

Types of Classification Algorithms In order from Tree-Based, K-Nearest Neighbor and Naive Bayes (based on Bayes’ Theorem of probability)
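As one example of a classification algorithm, here is a small sketch of k-nearest neighbor: a new point gets the label that is most common among the k closest training points. The points, the labels, and the `knn_predict` name are made up for illustration.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Squared distance gives the same ordering as true distance.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up training data with two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 7)))
```

There is no training step here at all: the algorithm simply stores the labeled examples and compares new points against them.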

Advantages and Disadvantages of Supervised Machine Learning

Supervised machine learning seems pretty cool; however, it’s not foolproof, and that’s where many other machine learning methods come into play.

Different methods have different strengths and weaknesses, meaning picking a certain method to match your situation is very important.

With that in mind let’s talk about the advantages and disadvantages of supervised machine learning.

Advantages of Supervised Machine Learning

1. Supervised machine learning can learn complex models

Supervised machine learning algorithms can learn from labeled data very effectively, meaning that they can learn complex relationships between features and target labels. This enables them to build complex models that can accurately predict the target labels.

2. Supervised machine learning can make accurate predictions

Supervised machine learning algorithms have been designed to make predictions that are as accurate as possible. They do this by learning from labeled data and using that knowledge to make predictions.

Disadvantages of Supervised Machine Learning

1. Supervised machine learning is time-consuming

Supervised machine learning can be extremely time-consuming because it requires a lot of data to train the model. Additionally, this data must be labeled, and data preparation involves several steps that often take more time than other parts of the machine learning process.

2. Supervised machine learning is expensive

As mentioned before, supervised machine learning requires a lot of labeled data, which is not only time consuming but expensive to obtain.

Collecting data in general can be costly in terms of time and resources. If the data is sensitive or private, special care must be taken to ensure that it is collected and stored securely. Some data may be difficult to obtain because it is proprietary or guarded closely by its owner.

3. Supervised machine learning requires enough labeled data

Supervised machine learning only deals with labeled data, but not all datasets have labels. As stated previously, labeling that data can be very time-consuming and expensive, so supervised machine learning is limited by the labeled data it has available.

Primary Problem with Supervised Machine Learning

Examples of Under-fitting, appropriate-fitting and over-fitting in learning models

The main problem with supervised learning is overfitting. Too much emphasis on getting the function right makes it too right.

That’s right, supervised learning models are perfectionists.

Overfitting can lead to poor generalization. This means that the model will perform well on the training data but will not be able to generalize to new data. Overfitting can be caused by a variety of factors, including having too many features, having too few training examples, or having an overly complex model.

A common analogy for overfitting is to think of it as memorizing a set of training data rather than learning from it.
If a model memorizes the training data, it will not be able to generalize to new data. This is similar to how a student who memorizes a set of facts for a test will not be able to apply that knowledge to new situations.
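The memorizing-versus-learning idea can be sketched with two toy "models". This is a deliberately simplified illustration, not a real training procedure: the data, the rule behind it (label 1 when x is at least 5), and both model functions are made up.

```python
def true_label(x):
    """The hidden rule that generated the (made-up) data."""
    return 1 if x >= 5 else 0

train_xs = [1, 2, 3, 7, 8, 9]
test_xs = [0, 4, 5, 6, 10]
train = [(x, true_label(x)) for x in train_xs]
test = [(x, true_label(x)) for x in test_xs]

# "Memorizer": stores every training answer and guesses 0 for anything unseen.
memory = dict(train)

def memorizer(x):
    return memory.get(x, 0)

# "Generalizer": learns a single threshold halfway between the two classes.
largest_zero = max(x for x, y in train if y == 0)
smallest_one = min(x for x, y in train if y == 1)
threshold = (largest_zero + smallest_one) / 2

def generalizer(x):
    return 1 if x >= threshold else 0

def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

print("memorizer:  ", accuracy(memorizer, train), accuracy(memorizer, test))
print("generalizer:", accuracy(generalizer, train), accuracy(generalizer, test))
```

Both models are perfect on the training data, but only the one that learned a rule holds up on examples it has never seen. That gap between training and test accuracy is the usual warning sign of overfitting.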

However, nothing is perfect, which is why there are many different approaches to machine learning, such as unsupervised and semi-supervised learning.

Let’s take a dive into unsupervised machine learning!

Unsupervised Machine Learning

Unsupervised machine learning is the opposite of supervised machine learning (thus the name).

Instead of using labeled data like its big brother (supervised machine learning), it uses unlabeled datasets. This makes these algorithms very useful for discovering hidden patterns or data groupings.

Think of it as when a child is left alone in a room with a bunch of toys. The child will explore the toys and figure out how they work without any guidance from an adult.

Again there are two major types:

Clustering Unsupervised Machine Learning ▶️ Clusters/groups data based on their similarities or differences
Dimensionality Reduction Unsupervised Machine Learning ▶️ It reduces the number of data inputs to a manageable size

Clustering

Clustering is a technique which clusters/groups data based on their similarities or differences.

Clustering algorithms are used to process raw, unclassified data objects into groups represented by structures or patterns in the information.

Clustering algorithms can be categorized into a few types; however, let’s focus on exclusive clustering and overlapping clustering.

1. Exclusive clustering

Exclusive clustering is a form of grouping that stipulates a data point can exist only in one cluster. This can also be referred to as “hard” clustering.

Think of exclusive clustering in AI as a group of friends who only hang out with each other and never talk to anyone else.

2. Overlapping clustering

Overlapping clustering differs from exclusive clustering in that it allows data points to belong to multiple clusters.

For example, imagine taking a group of people and putting them into groups based on their shared interests. Those groups would overlap, because some people would belong to more than one group.
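The difference between the two styles comes down to how a point is assigned. Here is a rough sketch with fixed, made-up cluster centers on a number line. The inverse-distance weighting used for the overlapping case is just one simple choice for illustration, not the only way to do it.

```python
def hard_assign(point, centers):
    """Exclusive clustering: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))

def soft_assign(point, centers):
    """Overlapping clustering: return a membership weight for every center."""
    # Closer centers get larger weights; the tiny constant avoids dividing by zero.
    closeness = [1.0 / (abs(point - c) + 1e-9) for c in centers]
    total = sum(closeness)
    return [value / total for value in closeness]

centers = [0.0, 10.0]  # made-up cluster centers

print(hard_assign(2.0, centers))   # belongs to exactly one cluster
print(soft_assign(2.0, centers))   # mostly the first cluster, partly the second
print(soft_assign(5.0, centers))   # halfway between: split evenly
```

With hard assignment a point halfway between two groups still has to pick one, while soft assignment lets it belong partly to both.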

Dimensionality reduction

Usually more data = more accurate results…

However, this isn’t always true: too many features can hurt the performance of machine learning algorithms (e.g. through overfitting) and can also make datasets difficult to visualize.

Dimensionality reduction is a technique used when the number of features, or dimensions, in a given dataset is too high.

It reduces the number of data inputs to a manageable size while also preserving the integrity of the dataset as much as possible.

Think of dimensionality reduction in AI like compressing a file. By reducing the number of dimensions, you can represent the data more efficiently, which can lead to faster processing times and improved performance.
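One well-known way to do this is principal component analysis (PCA), which keeps the directions in which the data varies the most. Below is a minimal sketch for the simplest possible case, squeezing 2-D points down to 1-D. The data and the function name are made up, and real projects would normally use a library rather than hand-written math like this.

```python
import math

def reduce_to_one_dimension(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix
    var_x = sum(x * x for x, _ in centered) / n
    var_y = sum(y * y for _, y in centered) / n
    cov_xy = sum(x * y for x, y in centered) / n
    # Angle of the principal axis of a 2x2 symmetric matrix
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))
    return [x * direction[0] + y * direction[1] for x, y in centered]

# Made-up points that all lie on the diagonal line y = x.
points = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
reduced = reduce_to_one_dimension(points)
print(reduced)
```

Because these points all sit on one line, a single number per point is enough to describe them, so nothing is lost by dropping from two dimensions to one.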

Advantages and Disadvantages of Unsupervised Machine Learning

Advantages of Unsupervised Machine Learning

Unsupervised machine learning requires less manual data preparation (i.e., no hand labeling) than supervised machine learning. It is capable of finding previously unknown patterns in data, which supervised models are not designed to look for and which are difficult for humans to identify.

Unsupervised machine learning has the unique capability to look at data from different perspectives.

Disadvantages of Unsupervised Machine Learning

Unsupervised machine learning generally produces less accurate results, because the input data is not known or labeled by people in advance.
Another problem is feedback: without labels, there is no direct way to check whether the patterns it finds are correct.

Unsupervised learning in AI is like a child learning to play a musical instrument without any guidance from a parent or teacher. The child may be able to figure out some basic notes and rhythms on their own, but without any feedback or direction, they are unlikely to ever become a proficient musician.

Unsupervised machine learning is really cool and so is supervised machine learning, but they all have their downsides… But what if we took the best of both worlds?

Introducing Semi-supervised machine learning!

Semi supervised learning

Semi-Supervised Machine Learning Model Diagrams

Here with semi-supervised machine learning we take the best of both worlds: it uses labeled data to ground predictions, and unlabeled data to learn the shape of the larger data distribution.

Honestly if this isn’t the future of AI I don’t know what is.

Semi-supervised learning in AI can be thought of as a middle ground between supervised and unsupervised learning.

Supervised learning relies on a dataset that has been labeled by humans, while unsupervised learning relies on a dataset that has not been labeled. Semi-supervised learning algorithms make use of both types of data to learn patterns and improve results.

Semi-supervised learning in AI is like a child learning to read with the help of a parent. The child is given some basic instruction and guidance, but is ultimately left to figure out most of the details on their own.
In the same way, a semi-supervised learning algorithm is given a few labels to work with, but must learn the rest of the information by itself.

It uses a small amount of labeled data and a large amount of unlabeled data, which provides the benefits of both unsupervised and supervised learning while avoiding the challenges of finding a large amount of labeled data.

How Does Semi-Supervised Learning Work?

Examples of Semi-Supervised Models and Analogies

Process of Semi-Supervised Learning

Train the model with the small amount of labeled training data just like you would in supervised learning, until it gives you good results.
Then use it with the unlabeled training dataset to predict the outputs, which are pseudo labels since they may not be quite accurate.
Link the labels from the labeled training data with the pseudo labels created in the previous step.
Link the data inputs in the labeled training data with the inputs in the unlabeled data.
Then, train the model the same way as you did with the labeled set in the beginning in order to decrease the error and improve the model’s accuracy.
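The steps above can be sketched in a few lines. To keep it short, this example assumes a very simple stand-in model (a nearest-centroid classifier on a single number) and made-up data. The function names are hypothetical.

```python
def train_centroids(examples):
    """Train the stand-in model: the mean input value for each label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# A small labeled set and a larger pool of unlabeled inputs (all made up).
labeled = [(1.0, "a"), (2.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.5, 3.0, 8.0, 11.0]

# Step 1: train on the labeled data only.
first_model = train_centroids(labeled)

# Step 2: predict pseudo labels for the unlabeled data.
pseudo_labeled = [(x, predict(first_model, x)) for x in unlabeled]

# Steps 3 and 4: link the labeled data with the pseudo-labeled data.
combined = labeled + pseudo_labeled

# Step 5: train again on the combined set.
final_model = train_centroids(combined)
print(pseudo_labeled)
print(final_model)
```

In practice the pseudo labels can be wrong, so it is common to keep only the predictions the model is most confident about before retraining.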

Honorable mention: Reinforcement Learning

The Picture in the middle is a Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks

Reinforcement Learning is a special kind of AI that works in an environment and gets “rewards” and “punishments” based on what it does. Over time, it slowly starts taking more rewarding actions and fewer punishing ones.

Think of reinforcement learning as training your dog tricks! You reward them when they do the right action and don’t when they do the wrong action.

Reinforcement learning is widely used in robotics, and also in areas like game playing!
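Here is a minimal sketch of that reward loop, using one of the simplest reinforcement learning setups: choosing between a few actions with fixed rewards (often called a multi-armed bandit) and an epsilon-greedy strategy. The reward values, the function name, and the settings are all assumptions made for illustration.

```python
import random

def run_bandit(rewards, steps=100, epsilon=0.1, seed=0):
    """Learn which action pays best by trying actions and tracking rewards."""
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions  # learned value of each action
    counts = [0] * n_actions       # how often each action was tried
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: pick at random
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        reward = rewards[action]
        counts[action] += 1
        # Update the running average of rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Made-up environment: action 0 gives no treat, action 1 gives a treat.
estimates, counts = run_bandit(rewards=[0.0, 1.0])
print(estimates, counts)
```

The occasional random choice matters: without some exploring, the learner could settle on the first action that looked okay and never find a better one.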

Now that we’ve covered machine learning, let’s dive into the second half of AI

Deep Learning

Deep learning is to machine learning what the roots are to a tree. Just as the roots of a tree give the tree its strength and stability, deep learning provides the foundation for strong and accurate machine learning.

Deep Learning (Neural Network) Number Classifier and Picture of Dog Classifier

Deep learning is a subcategory of machine learning that’s focused on brain-inspired algorithms called neural networks, which learn from data rather than being explicitly programmed.

Neural networks tend to get a bad reputation 😢 , because they are pretty complicated, hard to understand, and challenging. That’s unfortunate, because they are fundamental building blocks of artificial intelligence.

Neural networks in AI are like the human brain!

Relation of a neuron to a node in neural networks

The brain is composed of many interconnected neurons that process information and transmit signals.

A neural network is a series of algorithms that recognize relationships in a set of data through a process that mimics the way the human brain operates.

The input layer is like our five senses, which take in information from the world.
The hidden layer is like the brain, which processes that information.
The output layer is like the muscles, which take action based on the information.

Parts of a Neural Network

There are typically three parts in a neural network:

Input Layer ▶️ Brings the initial data into the system for further processing by subsequent layers of artificial neurons
Hidden Layers ▶️ Located between the input and output of the algorithm, in which the function applies weights to the inputs
Output Layer ▶️ The final layer in the neural network where desired predictions are obtained

Parts of a Neural Network (Input, Hidden and Output)

The little dots are called units or nodes. The units are connected to other units with varying connection strengths (or weights).

Weights and Biases

1. Weights control the signal (or the strength of the connection) between two neurons.
2. Biases are constants added on top of the weighted inputs. You can think of a bias as the weight on an additional input into the next layer that always has the value of 1.

In other words, a weight decides how much influence the input will have on the output.
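Here is what that looks like for a single neuron, as a minimal sketch with made-up numbers. It multiplies each input by its weight, adds everything up, and then adds the bias. The second call shows the same bias treated as the weight on an extra input that is always 1.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias (no activation function yet)."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

inputs = [1.0, 2.0]
weights = [0.5, -1.0]  # the second input has twice the influence, in the negative direction
bias = 0.25

result = neuron_output(inputs, weights, bias)

# Same calculation, with the bias written as a weight on a constant input of 1.
same_result = neuron_output(inputs + [1.0], weights + [bias], 0.0)

print(result, same_result)
```

A real neuron would usually pass this sum through an activation function before handing it to the next layer.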

Weights and Biases Diagram and Real application example

Weights and biases can be thought of as the dials and switches that control how a machine learning algorithm works.

Just as you can tweak the settings on a stereo to make it sound better or worse, you can adjust the weights and biases of an AI algorithm to make it perform better or worse.

How Does it Improve an AI Algorithm?

Think of it this way: if you have a stereo with the volume turned all the way up, and the bass and treble turned all the way down, it’s going to sound pretty terrible. But if you adjust the settings so that the volume is at a reasonable level, and the bass and treble are at moderate levels, it’s going to sound much better.

The same is true for weights and biases in AI. If they are set to values that don’t suit the data, the algorithm is going to perform poorly. But if you tune them until they fit the data well, the algorithm will perform much better.

How Do You Choose the Weights and Biases?

Different methods of choosing/calculating the right weights and biases

1. Trial and error: This is the most basic approach. You just try different values and see what works best.

2. Gradient descent: This is a more sophisticated approach that measures how the error changes as each weight and bias changes, then nudges them a small step in the direction that reduces the error, over and over again.

3. Evolutionary algorithms: This is a more advanced approach that creates many candidate sets of weights and biases, keeps the best performers, and mutates them to generate new candidates that are better than the current ones.
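To give a feel for gradient descent, here is a minimal sketch that learns a single weight so that weight × x matches the target y. The data (which follows y = 3x), the learning rate, and the step count are assumptions chosen for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=100):
    """Learn one weight by repeatedly stepping against the error gradient."""
    weight = 0.0  # start with a deliberately bad guess
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the weight
        gradient = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        weight -= learning_rate * gradient
    return weight

# Made-up data that follows y = 3 * x.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 6.0, 9.0]

learned_weight = gradient_descent(xs, ys)
print(learned_weight)
```

Real networks do exactly this, just with many weights and biases at once, which is where all the matrix math comes in.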

Process of Neural Networks

Neural networks feed information forward.

Input Layer▶️Hidden Layer▶️Output Layer

Simple Diagrams of Neural Network Models

Let’s walk through a simple process of a neural network.

1. Neural networks receive input, which is then processed by a series of hidden layers.
2. The hidden layers extract features from the input data and pass them to the output layer.
3. The output layer produces the final result, which is sometimes passed back to the input layer. (When the output is fed backwards, the network is called a recurrent neural network.)
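The walk-through above can be sketched as a tiny feedforward network with two inputs, two hidden units, and one output. The weights and biases here are hand-picked, hypothetical values (a trained network would learn them), and ReLU is just one common choice of activation function.

```python
def relu(value):
    """A common activation: keep positive values, turn negatives into 0."""
    return max(0.0, value)

def identity(value):
    return value

def layer(inputs, weights, biases, activation):
    """Compute one layer: a weighted sum plus bias for each neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hand-picked (hypothetical) parameters: one row of weights per neuron.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, 1.0]
output_weights = [[2.0, 1.0]]
output_biases = [-1.0]

def forward(inputs):
    """Feed the inputs forward: input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_weights, hidden_biases, relu)
    return layer(hidden, output_weights, output_biases, identity)

print(forward([2.0, 1.0]))
print(forward([1.0, 2.0]))
```

Swapping the two inputs changes the answer, because each input travels through different weights on its way to the output.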

Now that we discussed the general concept of neural networks let’s explore different types of neural networks, how they work and how they are used!

2 Major types of neural network (CNN and RNN)

1. Convolutional Neural Networks (CNN)

A convolutional neural network is a class of neural networks, most commonly applied to analyze visual imagery.

CNNs are particularly useful for finding patterns in images to recognize objects, faces, and scenes.

Example of What’s Happening in a Convolution Neural Network (A look through the different layers)

In an image, a pixel is affected by all the pixels surrounding it. It’s not simple sequential data. So convolutional neural networks look at “windows” of pixels instead of one pixel at a time.

The layers of convolutional neural networks and the process of “Pooling”

Some filters detect edges and lines, while others may recognize curves. Pooling then shrinks each filter’s output by keeping only a summary (such as the largest value) of each small region.

A CNN works like a guessing game with zoomed-in pictures. You only see parts of the whole at once, but you are able to identify certain features that later fit together.
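Here is a rough sketch of those two steps on a tiny made-up image: sliding a small filter over windows of pixels, then max pooling the result. It assumes a square image and a square filter, and the filter values are hand-picked to respond to a vertical edge. A real CNN would learn its filter values during training.

```python
def convolve(image, kernel):
    """Slide the kernel over the image and sum the products in each window."""
    # Like most deep learning code, this does not flip the kernel.
    k = len(kernel)
    size = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(size)
        ]
        for r in range(size)
    ]

def max_pool(feature_map, window=2):
    """Keep only the largest value in each non-overlapping window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(window) for j in range(window))
            for c in range(0, len(feature_map[0]) - window + 1, window)
        ]
        for r in range(0, len(feature_map) - window + 1, window)
    ]

# Made-up 5x5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked filter that lights up where brightness jumps from left to right.
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = convolve(image, vertical_edge)
pooled = max_pool(feature_map)
print(feature_map)
print(pooled)
```

The filter output is only non-zero where the dark and bright regions meet, and pooling keeps that information while shrinking the grid.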

2. Recurrent Neural Networks (RNN)

RNNs are commonly used in speech recognition and natural language processing. Recurrent neural networks recognize data’s sequential characteristics and use patterns to predict the next likely scenario.

Recurrent Neural Network Diagram (Feedforward)

RNN works on the principle of saving the output of a particular layer and feeding this back to the input in order to predict the output of the layer.

Hence the name “recurrent”: information from earlier steps keeps getting fed back into the network.

It’s like having a conversation with someone. The more you talk to them, the more you learn about them. The same is true for recurrent neural networks. The more data you give them, the more they learn.
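The feeding-back idea can be sketched with a single recurrent unit that carries a hidden state (its memory) from one step of a sequence to the next. The weights here are made-up values for illustration, and `tanh` is one common choice for keeping the state in a bounded range.

```python
import math

def rnn_step(input_value, hidden_state, w_input, w_hidden, bias):
    """Combine the current input with the previous hidden state."""
    return math.tanh(w_input * input_value + w_hidden * hidden_state + bias)

def run_rnn(sequence, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Process a sequence one item at a time, carrying the state forward."""
    hidden_state = 0.0  # no memory before the first item
    states = []
    for value in sequence:
        hidden_state = rnn_step(value, hidden_state, w_input, w_hidden, bias)
        states.append(hidden_state)
    return states

print(run_rnn([1.0, 0.0]))  # the second state still "remembers" the first input
print(run_rnn([0.0, 1.0]))  # same items, different order, different result
```

Even when the current input is zero, the state from the previous step still shapes the output, which is what makes these networks suited to ordered data like speech and text.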

What Role Do Neural Networks Play in Machine Learning and the Development of an AI Model?

The machine learning approach determines which type of neural network to use and how it is built and trained (the overall strategy). The neural network then carries out those actions (the math).

Looking all around us, AI is capable of some pretty amazing things. Hopefully this article gave you a much better understanding of how AI works!

Artificial Intelligence Breakthrough through 540 billion parameters

At its core, AI is all about making computers smarter. And there are a few different ways to do that. One is by giving them the ability to learn on their own. Another is by making them better at solving problems.

But ultimately, it all comes down to making computers think more like humans. By doing that, we can make them even better at doing all the things we’ve mentioned above.

And that’s why AI is so exciting. It has the potential to change the world as we know it.

TL;DR (Too Long Didn’t Read)

Artificial intelligence (AI) is simply the ability of a computer program or a machine to “think” and “learn”
It works by combining large amounts of data with fast, iterative processing and algorithms. AI = Data + Algorithm + Iterative Processes

Background Information

Computing power: With the rise of artificial intelligence, there is an increasing demand for these powerful computers
Data: Data is essential for training models but often overlooked.
Math: Without a comprehensive understanding of math, all we see is the magic behind AI, never actually understanding the process that creates it.

Breakdown of AI

AI can be split into two major concepts: Machine Learning and Deep Learning.
Machine Learning takes care of the strategy of training the model and Deep Learning carries out all the dirty work (basically the math).

Machine Learning

Machine Learning is like a child growing up. The child starts off not knowing anything, but as it is exposed to more and more data (experiences, information, etc.), it gradually learns and improves.
Three Main types:

Supervised Machine Learning: Uses only labeled datasets
Unsupervised Machine Learning: Uses only Unlabeled data sets
Semi-Supervised Learning: Takes the best of both worlds (the future of AI)

Honorable mention: Reinforcement Learning

Reinforcement Learning is a special kind of AI, which works in an environment, and gets “rewards” and “punishments”.

Deep Learning

Deep learning is a subcategory of machine learning that’s focused on brain-inspired algorithms called neural networks. They are fundamental building blocks of artificial intelligence.
3 parts to a Neural Network

The input layer is like our five senses, which take in information from the world.
The hidden layer is like the brain, which processes that information.
The output layer is like the muscles, which take action based on the information.

Layers are made up of nodes (also called units) that connect to each other. These connections are tied to weights and biases.
Weights and biases: weight decides how much influence the input will have on the output.

2 Main Types of Deep Learning (Neural Networks)

CNN: A convolutional neural network is a class of neural networks, most commonly applied to analyze visual imagery.
RNN: Recurrent neural networks are commonly used in speech recognition and natural language processing.

Thank you so much for reading this article, A Simple Overview of Artificial Intelligence, and learning about machine learning and deep learning! You now have a pretty good overview of AI.

Hopefully you enjoyed dipping your toe into the Artificial Intelligence world.

~ Written by Shirley Yang