Artificial Intelligence Is the Future…So, How Does It Work Again?

If you’ve ever wondered how A.I. works, this is the place for you!

Sterling Kalogeras
The Startup
7 min read · Jan 5, 2021

When you hear the term “artificial intelligence”, what is the first thing that you think of? Is it a robot that checks humans into a hotel? Or is it a newscast that’s covering how people are losing jobs to machines?

Sophia is a robot from Hanson Robotics that demonstrates how artificial intelligence may be used in the future. Sophia humanoid robot - World Investment Forum 2018 (44775984264) by UN Photo is licensed under CC BY-SA 2.0

Regardless of which one you thought of, before discussing the pros and cons of artificial intelligence (or A.I., for short), it’s important to have a general understanding of exactly what A.I. is.

So that’s what I’m here to do today: I’m going to briefly explain two important topics in the world of artificial intelligence: machine learning and unsupervised vs. supervised learning.

If you already consider yourself an A.I. expert, this article can serve as a refresher. Otherwise, power yourself on and welcome to the world of artificial intelligence!

Machine Learning

Unfortunately, some problems are too big or too complicated for humans to spell out every rule a machine should follow. That’s where machine learning comes in!

You can think of machine learning like going to school. A student learns from facts, studies for a test, takes a test, and then uses that information outside the classroom. This is very similar to the process of machine learning!

Suppose we want to create a model that lets a machine tell whether it is being shown pancakes or waffles.

The first thing we may want to do is to collect data on each one. For example, we may collect the thickness of each food in centimeters and time needed to cook the food in minutes.

We then want to gather this data with one more piece of information: whether the food is a pancake or a waffle, of course! Once we do this, we have what we call training data.
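To make the idea of training data concrete, here is a minimal sketch of what such a table could look like in Python. The numbers are made up purely for illustration, and the names `training_data`, `features`, and `labels` are my own choices, not part of any library.

```python
# Made-up measurements for illustration only (not real data).
# Each row: (thickness in cm, cooking time in minutes, label).
training_data = [
    (0.8, 3.0, "pancake"),
    (1.0, 3.5, "pancake"),
    (0.7, 2.5, "pancake"),
    (2.5, 5.0, "waffle"),
    (2.8, 5.5, "waffle"),
    (2.2, 4.5, "waffle"),
]

# Separate what the machine gets to see (the measurements)
# from the answer we already know (the label).
features = [(thickness, minutes) for thickness, minutes, _ in training_data]
labels = [label for _, _, label in training_data]

for row, label in zip(features, labels):
    print(row, "->", label)
```

Keeping the measurements and the answers in separate lists makes it easy to show the machine only the measurements and check its guess against the answer afterwards.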

Next, we move on to data preparation. This is where we, you guessed it, prepare the data for the training phase (which of course we’ll discuss shortly). For example, you might want to randomize the rows of the table with our data, since we don’t want the machine’s guess about a food to be influenced by the items that come before and after it.

Something to note is that we should collect roughly the same amount of data about pancakes and waffles so that the model works in the real world. What I mean by this is that you don’t want to give the machine 99 pancakes to train with and only 1 waffle, because this will make it harder for the machine to guess accurately when put to the test.
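The two preparation steps above, shuffling the rows and checking that both foods are represented fairly evenly, can be sketched in a few lines of Python. The rows are made up for illustration, and the "no more than twice as common" check is a rule of thumb I am assuming here, not a standard threshold.

```python
import random
from collections import Counter

# Made-up rows for illustration: (thickness_cm, cook_minutes, label).
rows = [
    (0.8, 3.0, "pancake"), (1.0, 3.5, "pancake"), (0.7, 2.5, "pancake"),
    (2.5, 5.0, "waffle"), (2.8, 5.5, "waffle"), (2.2, 4.5, "waffle"),
]

# Shuffle a copy so the original order (all pancakes first) cannot matter.
shuffled = rows[:]
random.Random(42).shuffle(shuffled)  # fixed seed so the run is repeatable

# Count how many examples we have of each food.
counts = Counter(label for _, _, label in shuffled)
smallest, largest = min(counts.values()), max(counts.values())

# Hypothetical rule of thumb: flag the data if one food is
# more than twice as common as the other.
is_balanced = largest <= 2 * smallest
print(counts, "balanced:", is_balanced)
```

With 99 pancakes and 1 waffle, the same check would report that the data is not balanced.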

Take data preparation seriously. If data preparation is not done well, it could ruin the whole model. Photo by Campaign Creators on Unsplash

Going back to the process, a common choice is to use about 80% of the data for training and keep the remaining 20% for evaluation. Think of it this way: we don’t want to use the same data for evaluation because the machine could then just memorize the questions.

In school, questions on tests are never the same as the questions on students’ homework assignments. If they were, students could then just memorize the questions for the test! We don’t want this for students or machines.
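Here is a minimal sketch of the 80/20 split described above. The function name `split_data` and its parameters are hypothetical, and the ten rows are placeholders rather than real measurements.

```python
import random

def split_data(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows, then cut it into training and evaluation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Ten placeholder rows: (row id, label). Real rows would hold measurements.
rows = [(i, "pancake" if i % 2 == 0 else "waffle") for i in range(10)]

train_rows, eval_rows = split_data(rows)
print(len(train_rows), "rows for training,", len(eval_rows), "rows for evaluation")
```

Because every row lands in exactly one of the two parts, nothing the machine is tested on was seen during training.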

We’ll then choose a model to train with. We’ll discuss the differences between the main types of learning in the second half of the article. For now, just realize that you can pick a supervised approach or an unsupervised approach.

Training comes next. The main idea of training is that we want to incrementally improve the model’s ability to distinguish between pancakes and waffles.

How training works is that we adjust some values inside the model, let the machine make a guess, and then compare its predictions with the answers we know. In this case, the values we adjust control how much weight the model gives to the thickness of the food and to the time needed to cook it.

After seeing how the machine does, we adjust the values to make the machine’s predictions more accurate. One of these cycles is called a training step. We continue training until we’re confident in the machine’s ability to distinguish between the two foods.
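The guess, compare, adjust cycle can be sketched with a perceptron, one of the simplest learning rules. This is my choice of technique for illustration; the article does not depend on any particular algorithm. The rows are made up, and `predict`, `train`, `max_steps`, and `learning_rate` are hypothetical names.

```python
# Made-up rows: (thickness_cm, cook_minutes, answer); 0 = pancake, 1 = waffle.
training_rows = [
    (0.8, 3.0, 0), (1.0, 3.5, 0), (0.7, 2.5, 0),
    (2.5, 5.0, 1), (2.8, 5.5, 1), (2.2, 4.5, 1),
]

def predict(weights, bias, thickness, minutes):
    """Guess 1 (waffle) if the weighted score is positive, else 0 (pancake)."""
    score = weights[0] * thickness + weights[1] * minutes + bias
    return 1 if score > 0 else 0

def train(rows, max_steps=1000, learning_rate=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for step in range(max_steps):          # one pass over the data = one training step
        mistakes = 0
        for thickness, minutes, answer in rows:
            guess = predict(weights, bias, thickness, minutes)
            error = answer - guess         # 0 if right, +1 or -1 if wrong
            if error != 0:
                mistakes += 1
                # Nudge the adjustable values toward the right answer.
                weights[0] += learning_rate * error * thickness
                weights[1] += learning_rate * error * minutes
                bias += learning_rate * error
        if mistakes == 0:                  # stop once every guess is correct
            break
    return weights, bias

weights, bias = train(training_rows)
print("learned weights:", weights, "bias:", bias)
```

Note that the measurements themselves never change. Only the weights and the bias are adjusted from one training step to the next.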

When you learn how to ride a bike, you keep practicing until you’re confident you can ride by yourself. You don’t just ride on your own after one attempt! This applies to training a machine, too.

Once training is complete, we use evaluation to test our machine against data that has never been used for training. This is supposed to be representative of how well our model will work in the real world.
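Evaluation usually boils down to counting how often the machine is right on data it has not seen. In the sketch below, `simple_model` is a hypothetical stand-in for a trained model (it just checks whether the food is thicker than 1.6 cm), and the held-out rows are made up.

```python
def simple_model(thickness, minutes):
    """Hypothetical stand-in for a trained model: thick foods are called waffles."""
    return "waffle" if thickness > 1.6 else "pancake"

def accuracy(model, rows):
    """Fraction of rows where the model's guess matches the known answer."""
    correct = sum(1 for thickness, minutes, answer in rows
                  if model(thickness, minutes) == answer)
    return correct / len(rows)

# Made-up evaluation rows the model has never seen during training.
eval_rows = [
    (0.9, 3.2, "pancake"),
    (2.4, 5.2, "waffle"),
    (1.8, 3.0, "pancake"),  # an unusually thick pancake
    (2.6, 4.8, "waffle"),
]

print("accuracy:", accuracy(simple_model, eval_rows))
```

The thick pancake is the one row this stand-in model gets wrong, so it scores 3 out of 4.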

One of the last steps is called parameter tuning (you may also see it called hyperparameter tuning). In this step, we see if we can improve the model. This means that we change some of the settings that went into training. For example, we may change the number of training steps we run.
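As a rough sketch of parameter tuning, the code below trains the same simple model with several different numbers of training steps and keeps the setting that scores best. The data is made up, the perceptron-style update is my own choice for illustration, and all names are hypothetical. In practice, people often set aside a separate chunk of data just for tuning so that the final evaluation stays untouched.

```python
# Made-up rows: (thickness_cm, cook_minutes, answer); 0 = pancake, 1 = waffle.
train_rows = [(0.8, 3.0, 0), (1.0, 3.5, 0), (0.7, 2.5, 0),
              (2.5, 5.0, 1), (2.8, 5.5, 1), (2.2, 4.5, 1)]
eval_rows = [(0.9, 3.2, 0), (2.4, 5.2, 1)]

def guess(model, thickness, minutes):
    w_thick, w_time, bias = model
    return 1 if w_thick * thickness + w_time * minutes + bias > 0 else 0

def train(rows, steps, learning_rate=0.1):
    """Run a fixed number of training steps and return the adjusted values."""
    model = [0.0, 0.0, 0.0]  # weight for thickness, weight for time, bias
    for _ in range(steps):
        for thickness, minutes, answer in rows:
            error = answer - guess(model, thickness, minutes)
            model[0] += learning_rate * error * thickness
            model[1] += learning_rate * error * minutes
            model[2] += learning_rate * error
    return model

def accuracy(model, rows):
    return sum(guess(model, t, m) == a for t, m, a in rows) / len(rows)

# Try several settings for the number of training steps and score each one.
results = {steps: accuracy(train(train_rows, steps), eval_rows)
           for steps in (1, 5, 25, 125, 625)}
best_steps = max(results, key=results.get)
print(results, "best setting:", best_steps)
```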

Just like tuning a guitar makes it sound better, parameter tuning will make your model better. It’s a crucial step. Photo by Alexis Baydoun on Unsplash

Finally, it’s the moment we’ve been waiting for! We can test the model in the real world! Get those kitchens ready because we’re going to need some pancakes and waffles!

Unsupervised vs. Supervised Learning

Two popular machine learning techniques are unsupervised and supervised learning. I’m going to explain the main idea of each of these techniques, discuss potential problems with each of them, and explore a potential solution to these problems.

Unsupervised Learning

When you use unsupervised learning, it means that you are working with data that is not labeled. For example, looking at our pancake and waffle example from earlier, we might have information on a food’s thickness and time needed to cook, but our training data does not tell us whether the food actually is a pancake or a waffle.

Because the data is not labeled, the machine figures out for itself how to sort the data. In other words, the machine will guess if a piece of data is similar to another piece of data, and if it’s not, the machine will separate it into a different group.

The process starts by telling the machine to sort the data into a certain number of groups. Then the machine will go ahead and figure out how to sort the data into those groups.

This technique is quick, but there is a major problem with it: since the data is not labeled, we can’t be sure whether or not the data is actually sorted well.

Another problem is that many methods must be told how many groups to sort the data into, but sometimes the whole point of sorting data is to figure out how many groups there are.
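One well-known way to do this kind of sorting is k-means clustering. The article does not name a specific algorithm, so treat this as one possible sketch. The points are made-up measurements with no labels, and starting from the first `k` points is a simplification I am assuming to keep the example short and repeatable.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_rounds=100):
    """Sort points into k groups by repeatedly moving each group's center."""
    centers = [points[i] for i in range(k)]  # simple start: the first k points
    groups = None
    for _ in range(max_rounds):
        # Put every point in the group whose center is closest.
        new_groups = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        if new_groups == groups:             # nothing moved, so we are done
            break
        groups = new_groups
        # Move each center to the average of the points in its group.
        for c in range(k):
            members = [p for p, g in zip(points, groups) if g == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return groups, centers

# Made-up, unlabeled measurements: (thickness_cm, cook_minutes).
points = [(0.8, 3.0), (2.5, 5.0), (1.0, 3.5), (2.8, 5.5), (0.7, 2.5), (2.2, 4.5)]

groups, centers = kmeans(points, k=2)
print(groups)
```

Notice that we had to pass `k=2` ourselves, which is exactly the drawback described above. The machine also only returns group numbers; it has no idea which group is "pancake" and which is "waffle".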

So while the premise of unsupervised learning is relatively simple, it does have some potential drawbacks.

Supervised Learning

Supervised learning is a technique that is used when you have labeled data. This means that we already know what’s what for the training data.

Notice how the data in this picture is labeled with labels like session length and page views. This makes this data a great candidate for supervised learning. Photo by Luke Chesser on Unsplash

The algorithm here guesses where each piece of data goes, and then we tell it how to improve next time by revealing the answers. Each time we go through the process, the algorithm tends to improve. Eventually, it may be able to sort the training data almost perfectly.

Unfortunately, that can actually be a huge problem. Believe it or not, the algorithm can be too perfect! For example, let’s look at the pancake and waffle example. If we give the algorithm a picture of French toast, the machine won’t know what to do with it, because it has only ever learned about pancakes and waffles. The algorithm may put it in a random place. This is why it’s important to not over-train the algorithm: a model that fits its training data too closely tends to struggle with anything new.
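The French toast problem is easy to reproduce in a toy sketch. Below, a very simple classifier picks whichever known food is closest to the new measurement. The "typical" values and the French toast measurement are made up, and `classify` is a hypothetical helper.

```python
import math

# Made-up "typical" measurements learned from labeled data:
# (thickness_cm, cook_minutes) for each food the model knows about.
typical = {"pancake": (0.8, 3.0), "waffle": (2.5, 5.0)}

def classify(thickness, minutes):
    """Return the known label whose typical measurements are closest."""
    return min(typical, key=lambda label: math.dist((thickness, minutes), typical[label]))

french_toast = (2.0, 4.0)  # made-up measurement of a food the model never saw
guess = classify(*french_toast)
print("The model calls French toast a", guess)
```

The model only knows two answers, so it has to pick one of them even though neither is right.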

Another problem is that, for some problems, not all of your data will be labeled. Supervised learning needs labels, so you won’t be able to use the algorithm on the unlabeled part of the data. Bummer.

Semi-Supervised Learning

Fortunately, there is a new technique in town that is still being researched: semi-supervised learning.

The algorithm works if you have labels for only some of your data. The basic premise is that you use the unsupervised learning approach to sort the data into groups, but then you look up the label for a certain piece of data in a group. Then you can assume that the other data in the same group should share the same or a similar label.
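Here is a minimal sketch of that idea, under the assumption that an unsupervised step has already assigned every item to a group. The group numbers, the two known labels, and the function name `spread_labels` are all hypothetical. Real semi-supervised methods are more sophisticated than this.

```python
from collections import Counter

# Hypothetical output of an unsupervised step: the group number of each item.
group_of = [0, 1, 0, 1, 0, 1]

# The few labels we actually know: item position -> label.
known_labels = {0: "pancake", 3: "waffle"}

def spread_labels(group_of, known_labels):
    """Give every item the most common known label found in its group."""
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(group_of[index], Counter())[label] += 1
    group_label = {group: counter.most_common(1)[0][0]
                   for group, counter in votes.items()}
    # Items in a group with no known label stay unlabeled (None).
    return [group_label.get(group) for group in group_of]

all_labels = spread_labels(group_of, known_labels)
print(all_labels)
```

Two hand-made labels were enough to label all six items, which is the appeal of the approach when labeling everything by hand would take too long.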

This may be the algorithm of the future. As datasets get bigger, it will take way longer to label data. Simple solution: let computers label data themselves, right? Not quite, as they can’t label data well on their own, so it would be impractical for them to try to do so.

Again, this technique is still being researched, but it is a promising one.

What Now?

Well, now you should have a basic understanding of the fundamentals of artificial intelligence. Of course, there are more specific topics to explore, like neural networks, but for now, congratulations and welcome to the world of AI!

If you want to reach out, you can check out my LinkedIn and my newsletter. Thanks for reading!

Sterling Kalogeras
The Startup

18-year-old innovator with a love for computer science, math, and government.