Artificial Intelligence Explained Part 1

I wanted to write a few posts explaining the ‘magic’ behind artificial intelligence for people who, like me, aren’t math or programming geniuses.

This is the first post in the series. It starts from the beginning, assuming little-to-no knowledge of what A.I is.

What is Artificial Intelligence or A.I?

Google has a pretty good definition of what A.I is:

The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.

However, one misconception people often have is that all A.I is artificial general intelligence.

Artificial General Intelligence

What is Artificial General Intelligence? Again, Google has a pretty solid definition:

Artificial general intelligence (AGI) is the intelligence of a machine that could successfully perform any intellectual task that a human being can.

Essentially, AGI is the A.I you see in the movies, A.K.A fully sentient beings that are usually hostile towards the human race 🔫 🔪 🔥.

For better or worse, AGI is far from existence 😱. At the time of writing this post, there is no such thing as artificial intelligence that is sentient in the sense of being human-like.

However, what has excelled in recent years is Weak AI or Narrow AI, which is, as the name suggests, weak and narrow (to a degree).

Weak AI / Narrow AI

I like to define Narrow AI (I prefer this term over Weak AI) as artificial intelligence that can perform one single (or narrow) task, such as recognizing the difference between cat and dog images or converting sound to text.

Unlike AGI, Narrow AI is a reality today. Siri on the iPhone is an example of a combination of multiple Narrow AIs, such as speech to text among others, that are connected to a database in the cloud. However, Siri is in no way aware of her surroundings, and she has no senses or feelings. Narrow AI like Siri can often be tricked easily.

A few more examples of Narrow AI include:

Much of Narrow AI fits under the category of Machine Learning, which we will explore in the rest of this post.

Machine Learning

Machine Learning is a subset of A.I that can be defined broadly as machines (or computers) learning to analyze specific sets of data. The best way to explain what machine learning does is to give an example.

Imagine a dataset of house auctions that contains the following information per sale:

  • The sale price of the house
  • The size of the entire property

Machine learning, in this case, could be used to develop an algorithm that can predict the sale price of a house based on the size of the property. But how would this work intuitively? We can find out by first visualizing our dataset.

Because our dataset contains two variables per house sale, we can visualize the information by plotting it on a graph.

In high school, you may have learned about linear regression, or in simple terms, a line of best fit. Just by looking at this data, we can see that it follows a trend where the house price is (to an extent) directly proportional to the property size, or in other words, the larger the property size, the higher the cost of the house.

Intuitively, what a machine learning algorithm does is attempt to find the line of best fit in a given dataset and then make predictions using that line of best fit.

So, our algorithm would first make an effort to find the line of best fit:

And then, if we gave it an input of 0.55 acres, the machine learning algorithm would draw a line starting from the input value of 0.55 on the x axis until it intersects with the line of best fit, and then it would trace a line across until it intersects the y axis, yielding a prediction of $13.5 million for a property size of 0.55 acres:

Notice that once the algorithm has found a line of best fit, it can make a prediction for any property size.
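If you like seeing things in code, here is a minimal sketch of the idea in Python. The five house sales are made up purely for illustration (they are not the data from the graph), and the helper names `fit_line` and `predict` are my own. It uses the standard least-squares formula, which is one common way to find a line of best fit.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = how x and y vary together, divided by how much x varies on its own.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Read the predicted y value off the line for a given x."""
    return slope * x + intercept


# Made-up sales: property size in acres, sale price in millions of dollars.
acres = [0.20, 0.35, 0.50, 0.65, 0.80]
prices = [6.0, 9.5, 12.0, 16.0, 19.0]

slope, intercept = fit_line(acres, prices)
predicted_price = predict(slope, intercept, 0.55)
print(f"Predicted price for 0.55 acres: ${predicted_price:.1f} million")
```

The "learning" is the call to `fit_line`, and the "prediction" is just plugging a new property size into the line it found.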

This is in essence how a machine learning algorithm learns to make predictions. Intuitively, it seems pretty simple, right? 🎉

Except there are a few caveats…

Machine Learning Challenges

Although the previous example seemed simple, machine learning faces the following challenges:

  • Dataset size
  • The dimension problem
  • Nonlinearity
  • Computational power

Let’s dissect each of them one by one 😎

Dataset size

As we saw in the previous example, before a machine learning algorithm can make predictions, it needs a line of best fit (or a relationship of some sort). Well, unfortunately, discovering that line of best fit can be quite a challenge.

We will get into how exactly a line of best fit can be calculated in machine learning, but in general, finding a quality line of best fit requires large amounts of data. For example, an algorithm Google developed used a dataset that contained roughly 1.2 million images.

Of course, not every machine learning algorithm requires datasets that large. However, depending on the situation, if only limited data is available, it can be impossible to create a decent model.

The Dimension Problem

The Dimension problem is one of the most challenging aspects of machine learning in my opinion. The easiest way I can think to explain it is like so:

Whenever we plot data on a graph, imagine each axis is a dimension. Recall in our previous example we have both an x and y axis, thus, our previous example can be considered two dimensional.

However, if we were to add another variable to our previous data set we could no longer use a two-dimensional graph to plot our data.

If you remember, our data set contained the following information per house sale:

  • The sale price of the house
  • The size of the entire property

Now, if we add another variable, in this case, the number of rooms in each house, we would then have three variables per house sale in our dataset:

  • The sale price of the house
  • The size of the entire property
  • The number of rooms in the house

Because we represent each variable on an axis and we have three variables, we would naturally have three axes (x, y & z) if we were to plot our data on a graph.

Our data set plotted on a graph would look something like this:

Suddenly, our machine learning algorithm is no longer dealing with a two-dimensional graph but a three-dimensional graph. I’m sure you would agree that, just by looking at this graph, there is no clear, easy-to-see relationship in the data.

This is, in essence, the dimension problem. Often in machine learning, we deal with multiple variables, and the number of dimensions doesn’t stop at three. We can easily have four, five, six or even twenty variables, which results in a lot of dimensions that are difficult to visualize.

So in summary, the more dimensions, the harder it can be to find relationships within data and the more computing power is needed. Machine learning can get complicated, fast 😳.
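To make this a bit more concrete, here is a small sketch of what a prediction looks like once there is more than one input variable. With two inputs (property size and number of rooms), the line of best fit becomes a flat plane, and the prediction is a weighted sum of the inputs. The weights and bias below are hypothetical numbers I picked for illustration, not values learned from real data, and `predict_price` is my own helper name.

```python
def predict_price(features, weights, bias):
    """Weighted sum of the input variables plus a constant offset.

    Works for any number of input variables, as long as there is
    one weight per variable.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical values chosen only for illustration.
weights = [20.0, 0.5]  # millions of dollars per acre, and per room
bias = 1.0             # base price in millions of dollars

house = [0.5, 4]       # 0.5 acres, 4 rooms
price = predict_price(house, weights, bias)
print(f"Predicted price: ${price} million")
```

Adding a fourth or a twentieth variable only means adding more entries to `features` and `weights`. The code barely changes, but the algorithm now has that many more numbers to figure out, and we can no longer draw the result on a graph.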

Nonlinearity

In our previous example, I showed how the relationship within the data was a line of best fit. Another challenge in machine learning is that the relationships within data are not always linear.

This means relationships in our data will start to look like this:

Yeah, weird lines everywhere.

If we combine nonlinear relationships with multiple dimensions, we get something like this:

Pretty crazy, right? One of the reasons machine learning is fantastic is that it can find amazing relationships within huge amounts of data that are not necessarily obvious to us humans.
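Here is a small sketch of why nonlinearity is a problem for a plain line of best fit. The data is made up: the y values are simply the x values squared, which gives a U-shaped curve. The sketch first fits a straight line to it, and then fits a line to the squared x values instead, which is one simple trick for handling a curve (an assumption of mine for illustration, real models use many other approaches). The helper names are my own.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the line's predictions and the real values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up nonlinear data: a U-shaped curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

# Attempt 1: a straight line through the raw data.
slope, intercept = fit_line(xs, ys)
line_error = mean_squared_error(xs, ys, slope, intercept)

# Attempt 2: feed the model x squared instead of x.
squared_xs = [x ** 2 for x in xs]
curve_slope, curve_intercept = fit_line(squared_xs, ys)
curve_error = mean_squared_error(squared_xs, ys, curve_slope, curve_intercept)

print(f"Straight line error: {line_error:.2f}")
print(f"Curved model error:  {curve_error:.2f}")
```

The straight line ends up flat and misses most of the points, while the curved model matches this toy data exactly. Real data is never this tidy, but the idea is the same: the shape of the model has to match the shape of the relationship.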

Computational power

The last challenge I would like to point out for now is the everlasting concern of computational power.

I’ll get into how exactly machine learning algorithms find relationships in data in another post. For now, I would like to stress the idea that it is extremely rare for machine learning algorithms to find perfect relationships within data.

Because our datasets can get extremely complex with multiple dimensions and nonlinear relationships, it is practically impossible for our algorithms to perform brute force computations that calculate every possible relationship to find the perfect fit for the data.

The reason is that such calculations can easily take hundreds of years to complete, and I’m not exaggerating.
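To get a feel for the numbers, here is a rough back-of-the-envelope sketch. All three inputs are assumptions I made up for illustration: twenty variables, only ten candidate values to try for each one, and a computer that can check one billion combinations per second.

```python
def brute_force_years(num_variables, candidates_per_variable, checks_per_second):
    """Estimate how many years it takes to try every combination of values."""
    # Every extra variable multiplies the number of combinations.
    combinations = candidates_per_variable ** num_variables
    seconds = combinations / checks_per_second
    seconds_per_year = 60 * 60 * 24 * 365
    return seconds / seconds_per_year


# Assumed scenario: 20 variables, 10 candidate values each, 1 billion checks per second.
years = brute_force_years(20, 10, 1e9)
print(f"Roughly {years:,.0f} years")
```

Even with only ten values per variable, which is far too coarse to find a perfect fit, the estimate lands in the thousands of years under these assumptions.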

Machine learning algorithms use special techniques such as gradient descent (which we will get into in another post), and as a result of these techniques, the algorithms usually only find rough approximations of the relationships within the data.

It is rare for an algorithm to find perfect relationships, and honestly, perfect relationships are usually not worth the time it takes to find them.
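As a small preview, here is a minimal sketch of gradient descent finding a line of best fit. It starts with a bad guess and nudges the slope and intercept a little on every step in the direction that reduces the error. The data is made up so that the perfect answer is known (slope 2, intercept 1), and the learning rate and step counts are values I chose for this toy example, not recommended settings.

```python
def gradient_descent_line(xs, ys, learning_rate=0.05, steps=2000):
    """Approximate the line of best fit by repeatedly reducing the squared error."""
    slope, intercept = 0.0, 0.0  # start from a deliberately bad guess
    n = len(xs)
    for _ in range(steps):
        # How far off is the current line for each data point?
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        # Take a small step downhill.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data that follows y = 2x + 1 exactly.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

rough = gradient_descent_line(xs, ys, steps=20)
close = gradient_descent_line(xs, ys, steps=2000)
print("After 20 steps:  ", rough)
print("After 2000 steps:", close)
```

After a handful of steps, the answer is only roughly right. More steps get it closer, which is exactly the trade-off between time spent and the quality of the relationship found.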

The time it takes for computers to find relationships within data depends on the computational power of the machine, as well as on the size of the dataset and the complexity of the model. To give you an idea, for larger datasets it can take a computer anywhere from a few hours to a few days to find a relationship decent enough to make accurate predictions.

Conclusion

Machine learning on its own is a huge topic containing many different models that serve different tasks. A computer vision machine learning model would look different to a model that learns to play chess.

I hope this post helped give you a quick introduction to A.I and a basic intuition behind machine learning.

In the next post, I plan to tackle the different types of machine learning models and artificial neural networks.

Thanks for reading 🙏🏼