Perceptrons in Swift
Understanding the inner workings of basic neural networks.
With the new machine learning (ML) frameworks introduced during WWDC 2017, iOS/macOS/tvOS programmers now have the tools to simplify the implementation of powerful ML algorithms within their apps. A simple hand-waving explanation of how neural networks work will get you surprisingly far, but a little more knowledge and experimentation will help remove some of the mystery behind how they actually work. This first article will implement a simple perceptron network using a Swift Playground to show how a basic network works.
Perceptrons were invented in the late 1950s and implemented directly in hardware to explore machine vision. Perceptrons represent a simplified model of a biological neuron. A biological neuron has multiple inputs and a single output. Different types of neurons, or cells, can perform different types of functions based on the data provided by their multiple inputs.
We’ll use a software model of a perceptron in the form of a very simple artificial neural network (ANN). Our perceptron model will have just two inputs and one output. Input and output values are restricted to either a 0 or a 1. Each input will have an associated value or weight (w).
In use, the perceptron multiplies each input (x) by its associated weight (w). The results are added together to create a single weighted sum. A bias (b) is added to the weighted sum, and the result is fed to a simple activation function: if the result is greater than 0, the output is 1. Otherwise, the output is 0.
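In code, that whole computation fits in a single function. This is an illustrative sketch (the function and parameter names are my own, not from the article's listings):

```swift
// Step activation over a weighted sum plus a bias (illustrative names).
func perceptronOutput(inputs: [Int], weights: [Double], bias: Double) -> Int {
    // Multiply each input by its weight and sum the results.
    let weightedSum = zip(inputs, weights).reduce(0.0) { $0 + Double($1.0) * $1.1 }
    // Activation: output 1 if the sum plus the bias is positive, else 0.
    return (weightedSum + bias) > 0 ? 1 : 0
}
```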
Configuring the Perceptron
Inputs with the associated weights and the bias control the behavior of the perceptron. Input values change as new data is presented to the network, but the weights and biases remain constant while the network is in use. An input weight determines the importance of its input. For example, if an input is required, the weight might be set to a large positive number. For an input that is not required, the weight might be close to 0. If it’s important for an input to be absent, the weight might be a large negative number.
The bias controls how easily the perceptron is “activated” (output driven to 1). A large negative bias will deactivate the perceptron unless the weighted sum is large enough to offset the bias (the weighted sum plus the bias must be positive for the perceptron to be activated).
Let’s create a two-input AND gate. For the weights (w) we’ll use a value of 2 and a bias (b) of -3. The weighted sum plus the bias will be: (2 × x1) + (2 × x2) - 3.
From inspection, it’s easy to see that if both inputs are 1, the weighted sum is 4. Adding the bias of -3 to the weighted sum gives us 1 which is greater than 0, so the output will be 1. If either input is 0, the sum plus the bias is -1 making the output 0. If both inputs are 0, the weighted sum plus the bias is -3 and the output is 0.
A Simple Perceptron in Swift
Let’s build a simple perceptron using Swift. Copy the following code into an empty Swift Playground:
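(The original listing doesn't appear in this copy of the article. The sketch below is a minimal stand-in consistent with the description above; the `Perceptron` name and its members are assumptions, not the article's original code.)

```swift
// A minimal perceptron: two or more weighted inputs, a bias,
// and a step activation. Inputs and outputs are 0 or 1.
struct Perceptron {
    var weights: [Double]
    var bias: Double

    // Multiply each input by its weight, sum, add the bias,
    // then apply the step activation: 1 if positive, else 0.
    func output(_ inputs: [Int]) -> Int {
        let weightedSum = zip(inputs, weights).reduce(0.0) { $0 + Double($1.0) * $1.1 }
        return (weightedSum + bias) > 0 ? 1 : 0
    }
}
```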
Add the following code to build an AND gate:
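(This listing is also missing from this copy; here is a self-contained sketch that produces the output shown below. The `Perceptron` type is an assumed minimal implementation, repeated here so the snippet runs on its own.)

```swift
// Assumed minimal perceptron (not the article's original code).
struct Perceptron {
    var weights: [Double]
    var bias: Double
    func output(_ inputs: [Int]) -> Int {
        let sum = zip(inputs, weights).reduce(0.0) { $0 + Double($1.0) * $1.1 }
        return (sum + bias) > 0 ? 1 : 0
    }
}

// Weights of 2 and a bias of -3 model a two-input AND gate.
let andGate = Perceptron(weights: [2, 2], bias: -3)
for (a, b) in [(0, 0), (0, 1), (1, 0), (1, 1)] {
    print("AND:[\(a),\(b)] = \(andGate.output([a, b]))")
}
```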
AND gate output:
AND:[0,0] = 0
AND:[0,1] = 0
AND:[1,0] = 0
AND:[1,1] = 1
Change both of the input weights to -2 and set the bias to 3. The perceptron now models a NAND gate (a negated AND gate), which is a universal gate that can be used to build all other gates. Add the following code to the Playground:
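(Again, the listing is missing from this copy; the sketch below is self-contained, with the assumed minimal `Perceptron` type repeated so it runs on its own.)

```swift
// Assumed minimal perceptron (not the article's original code).
struct Perceptron {
    var weights: [Double]
    var bias: Double
    func output(_ inputs: [Int]) -> Int {
        let sum = zip(inputs, weights).reduce(0.0) { $0 + Double($1.0) * $1.1 }
        return (sum + bias) > 0 ? 1 : 0
    }
}

// Weights of -2 and a bias of 3 model a two-input NAND gate.
let nandGate = Perceptron(weights: [-2, -2], bias: 3)
for (a, b) in [(0, 0), (0, 1), (1, 0), (1, 1)] {
    print("NAND:[\(a),\(b)] = \(nandGate.output([a, b]))")
}
```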
NAND gate output:
NAND:[0,0] = 1
NAND:[0,1] = 1
NAND:[1,0] = 1
NAND:[1,1] = 0
Just for fun, let’s build a half adder using perceptrons modeling NAND gates. A half adder adds two binary inputs. The output from an adder is a sum and a carry. The sum output is 1 when one, and only one, of the inputs is 1. Otherwise, it is 0. The carry output is 1 only when both inputs are 1.
The image on the left represents a NAND gate version of a half adder, which we’ll implement using perceptrons configured to model NAND gates, as shown on the right.
Add the following lines of code to your Playground:
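(The listing is missing from this copy; the self-contained sketch below wires five NAND-configured perceptrons into the classic half adder and produces the output shown below. The `Perceptron` type is again an assumed minimal implementation.)

```swift
// Assumed minimal perceptron (not the article's original code).
struct Perceptron {
    var weights: [Double]
    var bias: Double
    func output(_ inputs: [Int]) -> Int {
        let sum = zip(inputs, weights).reduce(0.0) { $0 + Double($1.0) * $1.1 }
        return (sum + bias) > 0 ? 1 : 0
    }
}

// Every gate in the half adder is the same NAND-configured perceptron.
let nand = Perceptron(weights: [-2, -2], bias: 3)

// Classic NAND-only half adder: sum = a XOR b, carry = a AND b.
func halfAdder(_ a: Int, _ b: Int) -> (sum: Int, carry: Int) {
    let n1 = nand.output([a, b])
    let sum = nand.output([nand.output([a, n1]), nand.output([b, n1])])
    let carry = nand.output([n1, n1])   // NAND of itself acts as NOT
    return (sum, carry)
}

for (a, b) in [(0, 0), (1, 0), (0, 1), (1, 1)] {
    let result = halfAdder(a, b)
    print("\(a) + \(b) => Sum: \(result.sum) Carry: \(result.carry)")
}
```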
Half adder output:
0 + 0 => Sum: 0 Carry: 0
1 + 0 => Sum: 1 Carry: 0
0 + 1 => Sum: 1 Carry: 0
1 + 1 => Sum: 0 Carry: 1
Since NAND gates are universal gates, it’s possible to use perceptrons to simulate all logic gate circuits. But perhaps the best use of perceptrons is to develop a basic understanding of artificial neural networks. Perceptrons are limited in functionality in part by their binary output. A small change in a weight or bias can completely flip the output. To be most useful, we need a network whose output changes slowly when making small changes to the weights and biases. This allows the network to be more easily configured (trained).
The modern ANNs supported by the new Apple frameworks use floating point values for weights and biases, and a variety of activation functions whose outputs allow for gradual training. We’ll build one in the next article.
Historically, perceptrons were implemented directly in hardware, and it’s rather easy to model them in software. But more importantly, they give us a basic understanding of how ANNs work. In the next article we’ll dive deeper into the details of modern ANNs, using Swift as our language of choice. Our non-optimized implementation will allow us to focus on how they work, so that when we deploy the highly optimized and efficient Apple ML frameworks we’ll at least have a basic understanding of all the bit twiddling taking place.