Mathesis: Elements of Learning and Intelligence
A formal and interdisciplinary theory of learning and intelligence that combines biology, neuroscience, computer science, engineering and various branches of mathematics to provide a unifying framework, direction and a broader horizon for neural network and machine learning research.
By John (Ioannis) A. Drakopoulos
Learning and intelligence are essential for the future of ecommerce. The increasingly complex ways that buyers and sellers engage online, a number of existing and emerging issues and opportunities, and the range of applications (including natural language processing, search, computer vision, fraud/authenticity and virtual reality) require ecommerce platforms to be flexible, sophisticated and, ultimately, intelligent. The associated scientific fields have evolved without an underlying theory, driven primarily by empirical and incremental research. This has hindered progress in many ways. The future requires a different approach.
Mathesis (Hellenic Μάθησις, “learning”) is an interdisciplinary theory of learning and intelligence that combines natural and formal science, including biology, neuroscience, computer science, engineering and various branches of mathematics. It attempts to unify all of learning under a common framework and provide the missing theoretical basis that can drive research and applications. The theory has been developed over multiple decades and demonstrates the importance of rigor and perseverance, as well as the potency of interdisciplinarity over complexity.
Synthesis and coherence are two of the theory’s foundational principles, and they underpin the larger framework. The theory starts with a simplified definition of learning as optimization and a formal definition of neural networks. It derives axiomatic principles from biology and neuroscience, emphasizes the role of synthesis and its consequences, introduces the concept of coherence, and applies it to learning.
Synthesis
Mathesis defines neural networks as compositions of simpler neural structures that may receive input from multiple sources through cross-connections and aggregating functions. The theory discusses locality and receptive fields, provides a critique of convolutional neural networks, and examines their advantages, drawbacks and flaws.
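For readers who think in code, here is a minimal sketch of that idea: two simpler structures feed a third through cross-connections, with concatenation as the aggregating function. The module names, layer sizes and the use of PyTorch are illustrative assumptions for this post, not the theory's formal definition.

```python
# Illustrative sketch: a network composed of simpler neural structures
# whose outputs are merged by an aggregating function.
import torch
import torch.nn as nn

class SimpleStructure(nn.Module):
    """A small building block: one linear layer with a nonlinearity."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.layer = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return torch.relu(self.layer(x))

class CompositeNetwork(nn.Module):
    """Two sub-structures cross-connect into a third, which aggregates
    their outputs (here, by simple concatenation)."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.branch_a = SimpleStructure(in_dim, hidden_dim)
        self.branch_b = SimpleStructure(in_dim, hidden_dim)
        self.head = SimpleStructure(2 * hidden_dim, out_dim)

    def forward(self, x):
        a = self.branch_a(x)
        b = self.branch_b(x)
        merged = torch.cat([a, b], dim=-1)  # aggregating function
        return self.head(merged)

net = CompositeNetwork(in_dim=16, hidden_dim=32, out_dim=10)
y = net(torch.randn(4, 16))  # forward pass on a batch of 4 inputs
```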
The theory initially treats learning as optimization, which is a dramatic but necessary simplification for the gradual expansion of the theory. It considers existing learning paradigms and discusses the computational limitations of learning, as well as its inherently local — and thus coherent — nature.
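As a concrete, deliberately simplified illustration of learning as optimization, the sketch below fits a linear model by gradient descent on a mean-squared-error objective. The synthetic data, learning rate and step count are arbitrary choices for the example, not prescriptions from the theory.

```python
# Learning as optimization: adjust parameters to minimize a loss.
import torch

x = torch.randn(100, 3)                      # synthetic inputs
true_w = torch.tensor([[1.0], [-2.0], [0.5]])
y = x @ true_w + 0.1 * torch.randn(100, 1)   # noisy targets

w = torch.zeros(3, 1, requires_grad=True)    # parameters to learn
for step in range(200):
    loss = ((x @ w - y) ** 2).mean()         # objective to minimize
    loss.backward()
    with torch.no_grad():
        w -= 0.1 * w.grad                    # gradient descent update
        w.grad.zero_()
```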
Mathesis considers biology to be imperative for learning and intelligence and borrows biological principles to construct a partial axiomatic basis. It derives an analogous learning theory from cell theory and proposes energy, heredity, homeostasis and evolution as essential concepts and processes in learning systems. It demonstrates the thermodynamic inefficiency of von Neumann architectures (which are fundamentally incompatible with neural networks), and sets the stage for a new computing architecture.
Cell theory and our axiomatization imply that learning and intelligence constitute a large-scale synthesis of simpler elements. Synthesis requires recursive structure, synergy, state and the convenience of a neural algebra. It thrives in diversity and leads to the synergy conjecture that divides the space of interactions into three main regions: detachment, synergy and mimesis.
Coherence
Coherence represents the theoretical core of Mathesis and is one of its most fundamental contributions. Coherence underlies all learning phenomena, unifies them under a common framework, explains their structure in a fashion similar to comparative anatomy, and predicts or explains various empirical results and observations in machine learning. It also provides a foundation for the creation of more sophisticated methods, neural structures and architectures. Most of Mathesis is effectively a transition from the current empirical and almost random state of the art into coherent entities. This includes coherent functions, coherent learning calculi, coherent structure, coherent plasticity and growth, and coherent evolution.
Coherence provides a scientific definition for generalization in the same way that evaporation and condensation provide a scientific explanation for rain. Locality is shown to be a special case of coherence. The same applies to regularization. Virtually everything in learning relates to coherence because learning is coherence.
Common operators can be used to manipulate coherence and define coherent learning methods, such as synaptic, neural or dendritic coherence. These operators are applied to parameters, their values, gradients and updates. Most known learning algorithms are special cases of synaptic coherence.
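The article does not spell out these operators, so the sketch below only illustrates the general shape of the idea as we read it: a hypothetical coherence operator is applied to each parameter's gradient before the update, and choosing the identity operator recovers plain stochastic gradient descent as a special case. The function names are placeholders, not the theory's actual definitions.

```python
# Hypothetical sketch of an operator applied to gradients before the update.
import torch
import torch.nn as nn

def coherence_op(grad):
    # Placeholder synaptic-coherence operator; the identity recovers plain
    # SGD. A real operator would couple or constrain gradients coherently.
    return grad

def coherent_sgd_step(params, lr=0.01):
    """One update step with the operator inserted before the update."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * coherence_op(p.grad)
                p.grad.zero_()

# Toy usage: one step on a small linear model.
model = nn.Linear(4, 1)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
coherent_sgd_step(model.parameters())
```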
Initial Evaluation
Mathesis introduces new local neural structures such as flowers, sunflowers, corollas and striders. Based on the theory, we combine such structures into a larger synergetic model, train it with synaptic coherence, and measure error rates on MNIST and ImageNet. The network has achieved a new record on MNIST. Our experiments on ImageNet are still at an early stage but have produced some modest gains so far. We will update the evaluation as more results become available.
What’s Next
Our next step is to apply Mathesis to eBay data across various domains, starting with computer vision and machine translation, and then progressing to search and search-related problems. The overall synthesis (elements, structure, synergy, diversity), learning parameters and methods that we have used in our experiments are not optimal and may evolve significantly in the coming months and years. The theory is not meant to replace evolution, but rather to initiate, fuel and drive it. As the technological landscape continues to change in both scope and complexity, Mathesis will provide the necessary evolutionary framework.
Originally published at https://tech.ebayinc.com on January 25, 2021.