Classification Models using Neural Networks

Sean Gahagan
2 min read · Oct 20, 2022


My last note began looking at neural networks, how they work, and some of the nuance around leveraging them.

This note will look at how neural networks can be used for classification models.

Classification with Neural Networks: What is it?

When we looked at classification models earlier, we treated them as a set of hypothesis functions, each one predicting whether or not a thing was a specific type of thing. Instead of using multiple hypothesis functions to classify multiple types of things, you can use one neural network with multiple output nodes in its output layer (see illustration).

A simple illustration of a neural network structured for multi-class classification

With this structure, each output node calculates the probability (perhaps using a sigmoid function, or a softmax across the output layer) that a given thing is a different specific type of thing. The model then predicts that the given thing is the specific type of thing with the highest calculated probability.

In the illustration’s example, our model is predicting whether a new thing is a pedestrian, a car, or a truck, based on the thing’s height, weight, and width. One output node calculates the probability that the thing is a pedestrian, another calculates the probability that it’s a car, and the last calculates the probability that it’s a truck. Then, since the highest probability (0.75) is for the truck, our model predicts that the thing is a truck.
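A minimal sketch of this idea in Python with NumPy, using made-up weights (a real model would learn them through training) and sigmoid activations for a tiny 3-input, 3-output network:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so each output reads as a probability
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative, randomly chosen weights: 3 inputs -> 4 hidden units -> 3 output nodes
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(3, 4)), np.zeros(3)

CLASSES = ("pedestrian", "car", "truck")

def predict(x):
    hidden = sigmoid(W1 @ x + b1)      # hidden-layer activations
    probs = sigmoid(W2 @ hidden + b2)  # one probability per output node / class
    # Predict the class whose output node has the highest probability
    return CLASSES[int(np.argmax(probs))], probs

# Hypothetical scaled features: height, weight, width
label, probs = predict(np.array([0.3, 0.9, 0.8]))
```

The key step is the `argmax` at the end: whichever output node produces the largest value determines the predicted class.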

Scaling Up

In our example, our neural network is only using 3 inputs, but one of the main applications for neural networks in classification can use thousands of inputs: image classification. These models will often take each color dimension (i.e., a number for red, green, and blue) of each pixel in an image as a unique input to help predict what things are in the image.

Up Next:

The next note in this series will step away from supervised learning (which was the focus of the last 7 notes) and look at a specific type of unsupervised learning.

Past Notes in this Series:

  1. Towards a High-Level Understanding of Machine Learning
  2. Building Intuition around Supervised Machine Learning with Gradient Descent
  3. Helping Supervised Learning Models Learn Better & Faster
  4. The Sigmoid function as a conceptual introduction to activation and hypothesis functions
  5. An Introduction to Classification Models
  6. Overfitting, and avoiding it with regularization
  7. An Introduction to Neural Networks
