# Machine Learning: A Primer

## an introduction for both technical and non-technical readers

May 27, 2018 · 12 min read

# How does machine learning work?

Attention all math-shy readers: I regret to inform you that a basic understanding of a few key mathematical concepts is required to fully understand most machine learning algorithms. But fear not! The required concepts are simple and draw on classes you’ve probably already taken: linear algebra, calculus, probability, and statistics.

## Regression Algorithms

Possibly the most popular machine learning algorithms, regression algorithms are supervised learning algorithms. Linear regression predicts a continuous outcome (such as a price) from one or more input variables. Logistic regression, on the other hand, is used specifically to predict discrete values, such as yes or no. Both are known for their speed: they are simple to train and are generally among the faster machine learning algorithms.
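To make the two flavors concrete, here is a minimal sketch in plain Python. It fits a straight line with ordinary least squares (one input variable only) and then shows how a logistic function turns a weighted input into a discrete 0 or 1. The housing numbers and the `weight` and `bias` values are made up for illustration; a real logistic regression would learn those values from labeled data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for one input: y is approximated by slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def logistic(z):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predict_class(x, weight, bias, threshold=0.5):
    """Logistic regression prediction: a probability, cut off at a threshold."""
    return 1 if logistic(weight * x + bias) >= threshold else 0

# Hypothetical data: home size in square metres -> price in thousands.
sizes = [50, 60, 70, 80]
prices = [150, 180, 210, 240]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 65 + intercept  # continuous prediction for a 65 m^2 home

# Hypothetical, hand-picked parameters (not learned here).
label = predict_class(2.0, weight=1.0, bias=-1.0)  # discrete prediction: 0 or 1
```

The first half returns a number on a continuous scale; the second half returns one of two classes, which is the distinction between linear and logistic regression.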

## Instance-Based Algorithms

Instance-based algorithms use specific instances of the provided data to predict an outcome. The most famous instance-based algorithm is k-Nearest Neighbors, also known as kNN. Used for classification, kNN measures the distance between a new data point and the labeled points it has already seen, then assigns the new point to the group that is most common among its k closest neighbors.
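A short sketch of kNN in plain Python may help. It assumes Euclidean distance and a simple majority vote, which are common choices but not the only ones. The points and the "cat" and "dog" labels are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points.

    train: list of (features, label) pairs, where features is a tuple of numbers.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled data: two well-separated groups.
training_data = [
    ((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
    ((8, 8), "dog"), ((8, 9), "dog"), ((9, 8), "dog"),
]

near_cats = knn_predict(training_data, (2, 2))
near_dogs = knn_predict(training_data, (7, 9))
```

Note that there is no separate training step: the algorithm simply keeps the labeled instances and consults them at prediction time.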

## Decision Tree Algorithms

A decision tree predicts an outcome by asking a series of simple questions about the data, with each answer branching off toward the next question in a tree-like structure. A single simple tree is often a fairly “weak” learner, so ensemble methods take groups of weak learners and have them work together to form one strong algorithm. A popular example is the Random Forest algorithm, in which each tree is trained on a randomly chosen subset of the data and features, and the trees vote on the final answer. This tends to lead to a strong predictor. Consider identifying an animal: we can observe numerous common traits (like eyes that are or are not blue), none of which would be enough on its own to identify the animal. When we put all these observations together, however, we are able to form a more complete picture and make a much more accurate prediction.
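The idea of combining weak learners can be sketched in a few lines of plain Python. This is a deliberately tiny stand-in for a random forest, not a full implementation: each "tree" is a one-question stump that splits a single randomly chosen feature at its mean, trained on a random resample of the data, and the stumps vote. The data, labels, and function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Weak learner: split one feature at its mean, predict the majority label on each side."""
    threshold = sum(row[feature] for row in rows) / len(rows)
    left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
    fallback = Counter(labels).most_common(1)[0][0]
    left_label = Counter(left).most_common(1)[0][0] if left else fallback
    right_label = Counter(right).most_common(1)[0][0] if right else fallback
    return feature, threshold, left_label, right_label

def stump_predict(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))           # pick one feature at random
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature))
    return forest

def forest_predict(forest, row):
    """Majority vote across all stumps."""
    votes = Counter(stump_predict(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical data: two numeric traits per animal.
rows = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]
forest = train_forest(rows, labels)
small_animal = forest_predict(forest, (1.5, 1.5))
large_animal = forest_predict(forest, (8.5, 8.5))
```

An individual stump can be wrong (for instance, when its resample happens to contain only one class), but the vote across many stumps smooths those mistakes out.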

## Bayesian Algorithms

Unsurprisingly, these algorithms are explicitly based on Bayes’ theorem. The most popular is Naive Bayes, which is often used in text analysis. Many spam filters, for example, use Bayesian algorithms: they learn from messages that users have labeled by class (spam or not spam), then compare new messages against that labeled data and categorize them accordingly.
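Here is a rough sketch of a Naive Bayes text classifier in plain Python. It assumes a word-count (multinomial) model with add-one smoothing, which is one common variant. The four training messages are invented, and real spam filters use far more data and more careful text processing.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (text, label) pairs. Returns the counts needed to classify."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Pick the class with the highest log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing out the probability.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical user-labeled messages.
labeled_messages = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch with the team tomorrow", "ham"),
]
model = train_naive_bayes(labeled_messages)
first = classify(model, "free money")
second = classify(model, "team meeting tomorrow")
```

The "naive" part is the assumption that each word contributes independently to the score, which is rarely true of real language but works surprisingly well in practice.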

## Clustering Algorithms

Clustering algorithms focus on finding commonalities between elements and grouping them accordingly. A common clustering algorithm is K-Means Clustering. In K-Means, an analyst selects the number of clusters (denoted by the variable K), and the algorithm groups the elements into clusters according to their distance from each cluster’s center, measured in the space of the data’s features.
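A bare-bones version of K-Means can be sketched as follows. It assumes Euclidean distance, random starting centers drawn from the data, and a fixed number of iterations rather than a convergence check. The points are made up for illustration.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Alternate between assigning points to the nearest center and re-centering."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the closest center.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each center to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, k=2)
groups = {frozenset(cluster) for cluster in clusters}
```

Unlike the supervised algorithms above, nothing here is labeled: the analyst only chooses K, and the groups emerge from the distances alone.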

## Deep Learning and Neural Network Algorithms

Artificial neural network algorithms are based on the structure of biological neural networks. Deep learning builds on the neural network model by stacking many more layers. Deep networks are large, extremely complex neural networks, and some are trained with small amounts of labeled data and much larger amounts of unlabeled data. Both neural networks and deep learning models take many inputs, which pass through several hidden layers before resulting in one or more outputs. These connections are loosely modeled on the way the human brain processes information and makes logical connections. In addition, successive hidden layers are often smaller and capture more nuanced features of the data.
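The flow from inputs through hidden layers to an output can be sketched in plain Python. This shows only the forward pass of a small fully connected network with sigmoid activations. The weights and biases below are arbitrary, hand-picked numbers; in a real network they would be learned during training, which this sketch leaves out.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the inputs through each layer in order and return the final activations."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 2 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0], layers)
```

Each entry in `layers` is one layer's weights and biases, and the shrinking layer sizes (3, then 2, then 1) mirror the narrowing of hidden layers described above.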

## Other Algorithms

The diagram below is the best one I’ve found to show the major machine learning algorithms, their categories, and their relationships with each other.

# Why is machine learning important?

## Internet of Things

The term Internet of Things, or IoT, refers to the network-connected physical devices in your home and office. A popular IoT device is the smart lightbulb, sales of which have skyrocketed over the last few years. With advances in machine learning, IoT devices are smarter and more sophisticated than ever. Machine learning has two main applications in IoT: making your devices better and gathering your data. Making the devices better is straightforward: machine learning can personalize your environment, e.g. by using facial recognition software to sense who is in the room and adjusting the heat and AC accordingly. Gathering data is even more straightforward: by keeping network-connected devices (like an Amazon Echo) powered on and listening in your home, companies like Amazon can gather key demographic information to pass on to advertisers, such as what television shows you watch, what time you wake up and go to sleep, and how many people live in your home.

## Chatbots

We have seen a proliferation of chatbots in the last few years, and sophisticated language processing algorithms are improving them every day. Companies use chatbots, both in their own mobile apps and in third-party apps like Slack, to provide virtual customer service that is faster and more efficient than a traditional (human) representative. To order a shirt from the clothing company H&M, for example, you can now tell its chatbot in natural language what you’d like and what size you need, then order the item without ever leaving the chat screen.

## Autonomous (a.k.a. Self-Driving) Cars

My personal favorite of the next big machine learning projects is one of the furthest from widespread production. Nevertheless, self-driving cars are currently in development at several huge companies, including General Motors (through its Cruise subsidiary), Uber, and Tesla. These cars use technologies made possible by machine learning for navigation, maintenance, and safety procedures. One example is traffic sign recognition, which uses supervised learning algorithms trained on a labeled data set of standard signs to identify and interpret the signs the car encounters. Thus, the car sees a stop sign and recognizes that it does, in fact, signify stop, rather than yield, one way, or pedestrian crossing.
