Building Intuition: What is a Neural Network?

TL;DR: a machine that can learn patterns from large amounts of data

Yujian Tang
Plain Simple Software
2 min read · Nov 12, 2023


Neural networks are the state-of-the-art technique for machine learning. The technique is so powerful, and so popular, that nearly every big tech company uses it in some capacity. What started as a way to classify data that was not linearly separable soon became the de facto model for every kind of complex data imaginable.

The concept of a neural network is derived from an idea of how the brain works. A neural network is made up of many individual perceptrons, or neurons, and this collection of neurons can be trained to recognize patterns in a data set.
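To make that concrete, here is a minimal sketch of a single perceptron in NumPy. The weights and bias are hand-picked for illustration (they make the neuron compute a logical AND); a real network would learn them from data.

```python
import numpy as np

def perceptron(x, w, b):
    """A single perceptron: a weighted sum of inputs followed by a step activation."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Illustrative (not learned) weights that make this neuron compute a logical AND
w = np.array([1.0, 1.0])
b = -1.5

print(perceptron(np.array([1, 1]), w, b))  # 1
print(perceptron(np.array([0, 1]), w, b))  # 0
```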

[Image: sample deep neural network diagram, from Stack Exchange]

The image above shows a sample diagram of a basic deep neural network, that is, a network with at least one hidden layer between the input and output layers. Since the 1970s, many types of neural networks have evolved. In the 1990s and 2000s, we built Recurrent Neural Networks (RNNs) for language and Convolutional Neural Networks (CNNs, or ConvNets) for vision.
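To see why the hidden layer matters, here is a sketch of a tiny 2-2-1 network that computes XOR, the classic example of data that is not linearly separable and therefore out of reach for any single perceptron. The weights are hand-picked for illustration rather than learned:

```python
import numpy as np

def step(z):
    return (z > 0).astype(float)

# Hand-picked weights for a 2-2-1 network that computes XOR
W1 = np.array([[1.0, 1.0],    # hidden unit 1: acts like OR
               [1.0, 1.0]])   # hidden unit 2: acts like AND
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -1.0])    # output: OR and not AND, i.e. XOR
b2 = -0.5

def forward(x):
    h = step(W1 @ x + b1)      # hidden layer
    return step(W2 @ h + b2)   # output layer

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, int(forward(np.array(x, dtype=float))))  # 0, 1, 1, 0
```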

As language models progressed, we found that RNNs suffered from sequence transduction issues: the size of the input sequence and the size of the output sequence had to match. In addition, they lost context over time through vanishing gradients.
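A rough numerical sketch of the vanishing-gradient problem: backpropagating through an RNN multiplies the gradient by the recurrent weight matrix once per time step, so if that matrix's largest singular value is below 1, the gradient shrinks exponentially with sequence length. The matrix size and scale below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16)) * 0.1   # illustrative recurrent weights, spectral norm < 1
grad = rng.normal(size=16)            # a stand-in gradient at the final time step

# Backprop through t time steps multiplies by W.T at each step
for t in [1, 10, 50, 100]:
    g = grad.copy()
    for _ in range(t):
        g = W.T @ g
    print(f"after {t:3d} steps, gradient norm ~ {np.linalg.norm(g):.2e}")
```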

These issues were solved by Transformer, or encoder-decoder, models. These models "encode" the entire context of the input into a matrix, or set of vectors. This context is then sent to a "decoder," along with the output of a mechanism called "self-attention," to make a prediction about something like the next token (in the case of GPT).
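The heart of that mechanism is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, from the original Transformer paper. Here is a minimal NumPy sketch; the random 4-token, 8-dimensional inputs are placeholders:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how much each query attends to each key
    return softmax(scores) @ V        # weighted sum of the value vectors

# Placeholder inputs: 4 tokens, 8-dimensional embeddings.
# Self-attention means Q, K, and V all come from the same sequence.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)       # (4, 8): one output vector per token
```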

GPT is a decoder-only model that takes in a set of embeddings and generates its best guess at the next token. That brings us roughly up to where we are today (2023).
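Schematically, decoder-only generation is a loop: feed in the tokens so far, pick the model's best guess for the next token, append it, and repeat. In the sketch below, toy_next_token_logits is a made-up stand-in for a real model, not GPT's actual API:

```python
import numpy as np

VOCAB = ["<eos>", "neural", "networks", "learn", "patterns"]

def toy_next_token_logits(tokens):
    # Placeholder for a real decoder: it just cycles through the vocabulary.
    return np.eye(len(VOCAB))[(tokens[-1] + 1) % len(VOCAB)]

def generate(prompt, max_new_tokens=4):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = toy_next_token_logits(tokens)
        next_token = int(np.argmax(logits))   # greedy decoding: take the top token
        if VOCAB[next_token] == "<eos>":      # stop at the end-of-sequence token
            break
        tokens.append(next_token)
    return " ".join(VOCAB[t] for t in tokens)

print(generate([1]))  # "neural networks learn patterns"
```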

Like this article? Follow me, Yujian Tang, for more posts about NLP, Software, and Growth. Make sure to follow Plain Simple Software for more software articles too!
