Vision needs brain science to boost AI by emulating human-level sight
Animals survive because they recognize the world through their senses and move effectively in response. A brain theory needs to explain how this works, and the explanation starts with representation.
When a computer scientist is asked how to represent a 3D object, the default approach is to use 3D coordinates and other attributes. That turns the object into a mathematical problem.
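As a minimal sketch of this first approach (all names here are my own, illustrative ones): an object is a set of 3D vertex coordinates plus attributes, and once it is stored that way, questions about it reduce to mathematics, such as rotation by a matrix.

```python
import numpy as np

# A minimal explicit 3D representation: the object is a set of
# coordinates plus attributes, so reasoning about it becomes math.
class Object3D:
    def __init__(self, vertices: np.ndarray, color: str):
        self.vertices = vertices  # shape (N, 3): x, y, z per vertex
        self.color = color        # an example non-geometric attribute

    def rotate_z(self, theta: float) -> "Object3D":
        """Rotate the object about the z-axis by theta radians."""
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, -s, 0.0],
                        [s,  c, 0.0],
                        [0.0, 0.0, 1.0]])
        return Object3D(self.vertices @ rot.T, self.color)

# A unit cube as its eight corner vertices.
cube = Object3D(np.array([[x, y, z] for x in (0, 1)
                          for y in (0, 1) for z in (0, 1)], dtype=float),
                color="red")
turned = cube.rotate_z(np.pi / 4)  # the same object, handled mathematically
```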
When today's AI experts are asked the same question, they ask for data: lots of 2D images of the object from various angles, plus labels. That turns the representation of 3D objects into a statistical problem. An artificial neural network links an image to a label via the probability that the image matches the labeled examples in its training data.
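By contrast, the statistical approach stores no geometry at all. Here is an illustrative sketch (a toy linear classifier, not any particular production architecture): flattened 2D images go in, a probability per label comes out, and that probability is the only sense in which the network "links" an image to a label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a labeled image dataset: 100 random 16x16 "images"
# from 3 classes. A real dataset would hold photos of the object from
# many angles, each paired with a label.
X = rng.normal(size=(100, 16 * 16))          # flattened 2D images
y = rng.integers(0, 3, size=100)             # integer class labels

W = np.zeros((16 * 16, 3))                   # one weight column per label

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)     # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# A few steps of gradient descent on cross-entropy: the classifier
# learns P(label | image) purely from the statistics of the data.
for _ in range(200):
    p = softmax(X @ W)                       # predicted label probabilities
    p[np.arange(len(y)), y] -= 1.0           # gradient of cross-entropy
    W -= 0.01 * (X.T @ p) / len(y)

new_image = rng.normal(size=(1, 16 * 16))
print(softmax(new_image @ W))                # a probability for each label
```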
A third approach is Patom theory (PT), a brain model that represents vision as a collection of real-world objects, each recognizable through any associated sense: vision, hearing, touch, and more. When you ask people questions about vision, they tend to rely on their knowledge of the world, for example to judge relative sizes. And the deficits seen after brain damage expose significant detail about where visual recognition takes place.
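To make the contrast concrete, here is a speculative sketch of that third idea as a data structure. This is my illustration only, not PT's actual mechanism: a real-world object is stored once and linked to patterns from every sense, so a match in any single modality retrieves the same object.

```python
# Speculative illustration only; not Patom theory's actual mechanism.
# One stored object is linked to patterns from multiple senses, so a
# match in any single modality retrieves the whole object.
from dataclasses import dataclass, field

@dataclass
class WorldObject:
    name: str
    patterns: dict[str, set[str]] = field(default_factory=dict)  # sense -> patterns

    def link(self, sense: str, pattern: str) -> None:
        self.patterns.setdefault(sense, set()).add(pattern)

class ObjectStore:
    def __init__(self):
        self.objects: list[WorldObject] = []

    def recognize(self, sense: str, pattern: str) -> WorldObject | None:
        # Any one sense can trigger recognition of the whole object.
        for obj in self.objects:
            if pattern in obj.patterns.get(sense, set()):
                return obj
        return None

store = ObjectStore()
dog = WorldObject("dog")
dog.link("vision", "four-legged-furry-shape")
dog.link("hearing", "bark")
store.objects.append(dog)

print(store.recognize("hearing", "bark").name)  # "dog", found via sound alone
```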