From Perceptrons to Tesla Vision

Part 1. An Origin Story

Ronald Boothe
ILLUMINATION
Nov 12, 2023


Tesla Model. Image by author.

Tesla recently announced that the next version of its “Tesla Vision (formerly Full Self Driving)” package, Version 12, will no longer be a rule-based system designed by human computer programmers, but will instead be based entirely on an Artificial Neural Network (ANN).

https://www.teslarati.com/tesla-full-self-driving-12-rules-based/

Before retiring, I worked as an academic visual scientist, and this gave me a front-row seat to observe some of the historical development and application of ANNs to the domain of visual perception, including machine vision systems such as those currently being implemented by Tesla. In this essay I will give a brief summary, from my own personal perspective, of how these perceptual ANNs originally came about: an origin story.

The scientific study of visual perception has a long history, going back to psychophysical studies carried out in the 19th Century. Continuing into the early decades of the 20th Century, scientific studies related to visual perception were mostly carried out by scientists working in isolation from one another in traditional academic departments such as Physiology, Anatomy, and Psychology.

Physiologists and anatomists typically did not have much interest in developing theories of perception, concentrating instead on the more limited goal of trying to understand the properties of the presumed underlying biological substrate: the neural pathways that process signals from the eyes. Meanwhile, psychologists working on visual perception usually had little interest in relating their psychological theories to the underlying neural substrate, concentrating instead on rule-based cognitive models.

Then, in the second half of the 20th Century, a new interdisciplinary field called Cognitive Neuroscience emerged, and visual scientists became increasingly interested in developing cross-disciplinary models that incorporated concepts from both the neurosciences and cognitive psychology. They began to construct models based on a seminal concept first proposed by McCulloch and Pitts in 1943: a formal neuron that tried to model some of the information processing capabilities of a biological neuron.

A Schematic representation of a Biological Neuron. Figure adapted from Boothe.¹

Communication of signals from one neuron to the next in biological tissue takes place at a synapse. An axon from a presynaptic neuron sends electrical signals to its terminal located near the cell body of a postsynaptic neuron. When these electrical signals arrive, they cause vesicles in the terminal to release neurotransmitter molecules, which pass across a synaptic cleft and influence the membrane of the postsynaptic neuron, which can in turn generate electrical signals to be sent down its own axon. Some synapses are excitatory, meaning they increase the probability that the postsynaptic neuron will generate its own electrical signal. Others are inhibitory, decreasing that probability. The strength of the influence of a synapse on the postsynaptic membrane varies, with some synapses having a large effect while others have very little.

A formal neuron is an attempt to model these characteristics of biological neurons in terms of information flow.

A Schematic representation of a Formal Neuron. Figure adapted from Boothe.¹

Formal Neuron i in the above diagram is shown receiving input from some finite number of other formal neurons, σ₁ through σₙ. The input from each neuron j is given a weight, Wᵢⱼ, that can be thought of conceptually as the synaptic efficacy of the connection from neuron j to neuron i. A weight of 0 would mean the synapse has no effect on the postsynaptic neuron, a positive weight would have an excitatory effect, and a negative weight an inhibitory effect. The weighted inputs are collected at the cell body according to some formula, typically:

$$h_i = \sum_{j=1}^{n} W_{ij}\,\sigma_j$$

where hᵢ is the effect produced on the postsynaptic cell after the weighted inputs have been collected from presynaptic neurons σ₁ through σₙ. Once the value for hᵢ is determined, an output signal (σᵢ) is generated based on some function Ψ, typically something along the lines:

$$\sigma_i = \Psi(h_i - T_i)$$

where Tᵢ is the threshold of neuron i, conceptualized as the activation level of a biological neuron that is needed to generate an electrical spike that can leave its cell body and travel down the axon.
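For readers who prefer code to notation, the weighted-sum-and-threshold description above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names (`net_input`, `step`, `formal_neuron`) are hypothetical, and Ψ is taken to be a simple step function that outputs 1 when the summed input reaches the threshold and 0 otherwise. Other choices of Ψ are possible.

```python
from typing import Callable, Sequence


def net_input(weights: Sequence[float], inputs: Sequence[float]) -> float:
    """Collect the weighted inputs at the 'cell body': h = sum of W_j * sigma_j."""
    if len(weights) != len(inputs):
        raise ValueError("each input needs exactly one weight")
    return sum(w * s for w, s in zip(weights, inputs))


def step(x: float) -> int:
    """Assumed form of Psi: fire (1) when x >= 0, stay silent (0) otherwise."""
    return 1 if x >= 0 else 0


def formal_neuron(
    weights: Sequence[float],
    inputs: Sequence[float],
    threshold: float,
    psi: Callable[[float], int] = step,
) -> int:
    """Output of the formal neuron: Psi(h - T)."""
    return psi(net_input(weights, inputs) - threshold)


# Two excitatory inputs (weight +1) and one inhibitory input (weight -2).
weights = [1.0, 1.0, -2.0]
print(formal_neuron(weights, [1, 1, 0], threshold=1.5))  # 1: excitation exceeds T
print(formal_neuron(weights, [1, 1, 1], threshold=1.5))  # 0: inhibition cancels it
```

In the second call the inhibitory input is active, so the summed input drops to 0 and the neuron stays silent even though both excitatory inputs are on.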

One of the first attempts to apply a formal neuron to a model of visual perception was the Perceptron, proposed by Frank Rosenblatt in the 1950s.²

Rosenblatt’s Perceptron, adapted from Boothe.¹

A Perceptron is composed of a formal neuron, i, that receives inputs from input elements, σ₁ through σₙ, which respond to patterns of light stimulation in the environment, analogous to neurons in the retina that have visual receptive fields. Rosenblatt argued that a properly constructed Perceptron should be able to act as a visual pattern detector, for example, to detect the presence or absence of a circle.

However, in a book published in 1969,³ Marvin Minsky and Seymour Papert carried out a formal analysis of the capabilities of the Perceptron. They demonstrated that it was impossible for the Perceptron to solve a simple Boolean exclusive OR (XOR) problem, such as deciding whether a circle or a square, but not both, was present in a scene.
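The difficulty can be seen with a little algebra. Suppose a single threshold unit has two inputs (circle present, square present), weights w₁ and w₂, and fires when the weighted sum reaches a threshold T. To stay silent when nothing is present, T must be greater than 0. To fire for a circle alone and for a square alone, both w₁ and w₂ must be at least T. But then w₁ + w₂ is at least 2T, which is greater than T, so the unit also fires when both shapes are present. The sketch below, which is my own illustration rather than Minsky and Papert's analysis, simply searches a grid of candidate weights and thresholds. The grid range and the names `fires` and `find_unit` are arbitrary choices, and a grid search is a demonstration, not a proof.

```python
from itertools import product

# The four possible scenes: (circle present, square present).
PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Desired output for each scene, in the same order as PATTERNS.
TARGETS = {
    "AND": (0, 0, 0, 1),
    "OR": (0, 1, 1, 1),
    "XOR": (0, 1, 1, 0),
}


def fires(w1: float, w2: float, t: float, s1: int, s2: int) -> int:
    """Single threshold unit: output 1 when the weighted sum reaches t."""
    return 1 if w1 * s1 + w2 * s2 >= t else 0


def find_unit(target, grid):
    """Return the first (w1, w2, t) on the grid that reproduces target, or None."""
    for w1, w2, t in product(grid, repeat=3):
        outputs = tuple(fires(w1, w2, t, s1, s2) for s1, s2 in PATTERNS)
        if outputs == tuple(target):
            return (w1, w2, t)
    return None


# Candidate values from -4.0 to 4.0 in steps of 0.5 (an arbitrary choice).
GRID = [k / 2 for k in range(-8, 9)]

for name, target in TARGETS.items():
    solution = find_unit(target, GRID)
    print(name, "solvable" if solution is not None else "no solution on this grid")
```

AND and OR each have many solutions on the grid, while XOR has none, as the algebra above predicts for any choice of weights.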

This was considered to be a fatal flaw, and most vision scientists abandoned further work with the Perceptron for the next decade. Anatomists and physiologists went back to studying the neural pathways that process signals from the eye. Cognitive psychologists went back to working on rule-based models of perception. That all changed in the 1980s, when it was discovered that slight alterations in the architecture of the Perceptron would allow it to solve not only the XOR problem but, in principle, any well-behaved mathematical function, i.e. any problem whose solution can be characterized formally in terms of an algorithm.
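As a small preview, here is one way such an alteration can work. I am assuming here that the change is the addition of an intermediate ("hidden") layer of formal neurons between the inputs and the output neuron. The weights and thresholds below are set by hand for illustration, not learned, and the names `unit` and `xor_net` are hypothetical.

```python
def unit(weights, inputs, threshold):
    """One formal neuron with a step output: 1 when the weighted sum reaches threshold."""
    h = sum(w * s for w, s in zip(weights, inputs))
    return 1 if h >= threshold else 0


def xor_net(circle: int, square: int) -> int:
    """Two-layer network: 'circle or square, but not both'."""
    inputs = (circle, square)
    # Hidden neuron 1 detects "at least one shape present" (OR).
    any_shape = unit((1.0, 1.0), inputs, threshold=0.5)
    # Hidden neuron 2 detects "both shapes present" (AND).
    both_shapes = unit((1.0, 1.0), inputs, threshold=1.5)
    # The output neuron is excited by the OR neuron and inhibited by the AND neuron.
    return unit((1.0, -1.0), (any_shape, both_shapes), threshold=0.5)


for circle, square in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(circle, square, "->", xor_net(circle, square))
```

The output neuron fires only when exactly one shape is present, because the inhibitory connection from the "both shapes" neuron cancels the excitation from the "at least one shape" neuron.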

I address this issue in Part 2 of this series of posts about Perceptrons and Tesla Vision, where I discuss the topic of ANN architectures.

In Part 3 I discuss issues of Partial Completion and Dynamics in ANNs.

In Part 4 I discuss how connections get formed in biological and artificial neural networks.

Ronald Boothe, psyrgb@emory.edu

NOTES:

  1. Boothe, R. G. (2002). Perception of the Visual Environment. Springer-Verlag, New York.
  2. Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408.
  3. Minsky, M., & Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press, Cambridge, MA.

Ronald Boothe

Professor Emeritus, Emory University, Atlanta, GA, USA