Keywords and notes on a humanoid AI

Wolfgang Stegemann, Dr. phil.
Published in Neo-Cybernetics
Jul 25, 2024

In these rough considerations, I address the question of how an artificial intelligence (AI) that is more similar to the human brain could be developed. To do this, it is important to understand how the brain differs from conventional computers.

1. Basic differences between brain and computer

Traditional computers work according to the “process principle”. This means:

· They execute commands step by step.

· Each task is broken down into smaller subtasks and processed one after the other.

· The processing is linear and predictable.

The human brain, on the other hand, functions according to the “excitation principle”:

· Many processes are taking place at the same time.

· Stimulus processing, thinking, remembering and evaluating happen simultaneously or in one process.

· Processing is parallel and highly networked.

A vivid example of this difference is the heart’s response to physical exertion. The heart immediately adjusts its rhythm without conscious, step-by-step processing. Similarly, the brain reacts immediately to stimuli without going through a linear, step-by-step process.

2. Three-dimensional network architecture of the brain

An essential aspect of the brain is its three-dimensional structure:

· It consists of about 86 billion neurons.

· These neurons are connected in a complex, three-dimensional network.

· The connections (synapses) between neurons are not static, but are constantly changing (neuroplasticity).

In my considerations, I propose to replicate this structure in AI systems:

· A dense, three-dimensional artificial neural network is designed.

· In this network, stimuli are supposed to generate specific “figures” or activation patterns.

· Similar stimuli would cause similar activation patterns.

This approach is fundamentally different from traditional AI architectures, which are often based on flatter, less dynamic structures.

3. The concept of the “interpreter”

In the human brain, there are structures that mediate between different brain regions. An example of this is the entorhinal cortex, which plays an important role in memory formation by mediating information between the hippocampus and the neocortex.

The entorhinal cortex (EC) plays an important integrative role in the brain:

· Integration of subcortical and cortical processes: The EC acts as an interface between the hippocampus and the neocortex, which puts it in a unique position to integrate information from different brain regions.

· Far-reaching connections: The EC has numerous connections to other brain areas, which underlines its role as an integrator of various neuronal processes.

· Memory function: The EC is closely linked to the hippocampus and plays an important role in memory processes, especially declarative and episodic memory.

· Spatial navigation: The EC is crucial for spatial orientation, indicating its ability to process and integrate complex information from different brain regions.

· Affective and behavioral regulation: Through its serotonergic and dopaminergic neurons, the EC is also involved in the regulation of emotions and behavior.

In my AI model, an equivalent structure is needed: an “interpreter”. It is intended to:

· “Read” and interpret the activation patterns in the network.

· Recognize similarities between different patterns.

· Enable contextual interpretations.

4. The Attention Mechanism as a Possible Interpreter

As a promising approach for this “interpreter”, I have identified the attention mechanism used in modern AI architectures such as transformers.

The Attention Mechanism:

· Allows the system to focus on the most relevant parts of the input data.

· Dynamically weights the importance of different parts of the input.

· Can capture relationships between distant elements in the data.

My suggestion is to store the results of the attention mechanism as a kind of metastructure. This metastructure would evolve and refine over time, with the “interpreter” emerging as an emergent property from the totality of these stored experiences.
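
To make this more concrete, here is a minimal sketch in Python: standard scaled dot-product attention is computed, and its attention maps are folded into a slowly evolving running average that plays the role of the metastructure. The Metastructure class, the exponential-moving-average update and all parameter values are illustrative assumptions of mine, not a fixed part of the proposal.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

class Metastructure:
    """Hypothetical store that accumulates attention maps over time.

    The 'interpreter' is imagined here as nothing more than the slowly
    evolving running average of past attention patterns."""
    def __init__(self, size, decay=0.99):
        self.memory = np.zeros((size, size))
        self.decay = decay

    def update(self, attention_weights):
        # Exponential moving average: old experience fades, new experience is folded in
        self.memory = self.decay * self.memory + (1 - self.decay) * attention_weights

# Example usage with random token embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))            # 8 tokens, 16-dimensional embeddings
meta = Metastructure(size=8)
_, attn = scaled_dot_product_attention(X, X, X)
meta.update(attn)
```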

5. Integrating Piaget’s Learning Principles

To improve the efficiency and scalability of our system, we integrate concepts from Jean Piaget’s cognitive development theory:

a) Assimilation:

· New experiences are placed in existing cognitive structures.

· In the AI context: Similar experiences are consolidated and reinforced.

b) Accommodation:

· Existing structures are adapted if new experiences do not fit in.

· In the AI context: The system can develop new categories or patterns of interpretation.

A concrete example of a mathematical model that can capture complexity reduction and accommodation is the self-organizing map (SOM). These neural networks learn by representing input data on a low-dimensional map, which compresses and abstracts the data. Adapting the map to new data can be seen as an analogy to accommodation, where new experiences are integrated into the existing schema and the schema is adjusted accordingly [1].
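
As an illustration, the following minimal SOM sketch in Python/NumPy shows the accommodation-like weight update: each input pulls the best-matching unit and its grid neighbours toward itself. The grid size, learning-rate schedule and neighbourhood schedule are arbitrary choices made for illustration.

```python
import numpy as np

def train_som(data, grid_shape=(10, 10), dim=3, epochs=20,
              lr0=0.5, sigma0=3.0, seed=0):
    """Minimal self-organizing map: each input pulls the best-matching
    unit and its grid neighbours toward itself (an accommodation-like update)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((*grid_shape, dim))
    # Grid coordinates of every unit, used for the neighbourhood function
    coords = np.stack(np.meshgrid(*[np.arange(s) for s in grid_shape],
                                  indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            t = step / n_steps
            lr = lr0 * (1 - t)                 # learning rate decays over time
            sigma = sigma0 * (1 - t) + 0.5     # neighbourhood radius shrinks
            # Best-matching unit (BMU): the unit whose weights are closest to x
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(dists.argmin(), grid_shape)
            # Gaussian neighbourhood around the BMU on the grid
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
            # Accommodation: the map adapts its structure to the new input
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

som = train_som(np.random.default_rng(1).random((200, 3)))
```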

Through these processes, our AI system would:

· Learn continuously from experience.

· Hide unimportant details and reinforce essential patterns.

· Perform a kind of “distillation” of experiences, similar to the human brain [2].

6. Potential benefits and challenges

Advantages of this approach:

· Higher adaptivity: The system could flexibly adapt to new situations.

· Improved generalization capability: It could derive general principles from specific experiences.

· Emergent creativity: By combining different abstract concepts, new, unexpected solutions could emerge.

· Efficient use of resources: By reducing redundant information, storage capacity would be used optimally.

Challenges:

· Developing efficient mechanisms for assessing how essential a piece of information is.

· Finding the right balance between detail preservation and abstraction.

· Implementation of “forgetting” as an active process to optimize the storage structure.

It must be emphasized that such an interpreter, i.e. something similar to an ego, is not a substantial entity, and certainly not a quantitative one, but a dynamic system whose emergent property is to constantly reinterpret.

A major challenge here is the question of how the same stimulus can reliably recreate the same figure in a three-dimensional network.

Here is a possible solution to this problem, based on the concept of “dynamic attractors”:

Dynamic Attractors:

Imagine that each stimulus does not create a rigid three-dimensional figure, but forms a dynamic attractor in the neural network. This attractor would be a stable state toward which the system tends when presented with similar inputs.

Probabilistic activation:

Instead of deterministic activation, each stimulus could trigger probabilistic activation of neurons in a specific area of the network. The probability of activation would be highest in the center of the attractor and would decrease outwards.
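
A minimal sketch of such probabilistic activation, assuming a 3D grid of neurons and a Gaussian fall-off around the attractor centre (both of which are assumptions made for illustration):

```python
import numpy as np

def probabilistic_activation(grid_shape, center, sigma=2.0, seed=None):
    """Activate neurons in a 3D grid stochastically: the activation
    probability is highest at the attractor centre and falls off
    with a Gaussian profile toward the edges."""
    rng = np.random.default_rng(seed)
    coords = np.stack(np.meshgrid(*[np.arange(s) for s in grid_shape],
                                  indexing="ij"), axis=-1)
    dist = np.linalg.norm(coords - np.array(center), axis=-1)
    p_active = np.exp(-(dist ** 2) / (2 * sigma ** 2))
    # Each neuron fires independently with its local probability
    return rng.random(grid_shape) < p_active

# The same stimulus (same centre) yields similar, but not identical, figures
fig_a = probabilistic_activation((20, 20, 20), center=(10, 10, 10), seed=1)
fig_b = probabilistic_activation((20, 20, 20), center=(10, 10, 10), seed=2)
overlap = np.logical_and(fig_a, fig_b).sum() / np.logical_or(fig_a, fig_b).sum()
```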

Self-Organizing Maps:

Let’s implement self-organizing maps (SOMs) within the 3D network. These would adapt to incoming stimuli and map similar stimuli to neighboring regions.

Hebbian Learning with Topological Component:

Let’s use a modified form of Hebbian Learning that not only amplifies the strength of the connections between neurons that are active at the same time, but also takes into account topological proximity. This would lead to similar stimuli activating similar spatial patterns.
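
One possible form of this modified rule, sketched in NumPy: the classic Hebbian co-activity term is multiplied by a Gaussian of the spatial distance between neurons. The specific proximity kernel and all parameter values are illustrative assumptions.

```python
import numpy as np

def hebbian_topological_update(W, activity, positions, lr=0.01, tau=2.0):
    """Hebbian update scaled by spatial proximity: connections between
    co-active neurons are strengthened more strongly if the neurons are
    also close together in the 3D layout."""
    # Classic Hebbian term: outer product of simultaneous activity
    hebb = np.outer(activity, activity)
    # Topological term: Gaussian of the Euclidean distance between neuron positions
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    proximity = np.exp(-(d ** 2) / (2 * tau ** 2))
    return W + lr * hebb * proximity

rng = np.random.default_rng(0)
n = 50
W = np.zeros((n, n))
positions = rng.random((n, 3)) * 10        # neurons scattered in a 3D volume
activity = (rng.random(n) < 0.2).astype(float)
W = hebbian_topological_update(W, activity, positions)
```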

Fuzzy Boundaries:

Let’s define the boundaries of the “figure” not sharply, but as a probability distribution. This allows a certain flexibility in the reactivation while maintaining the core structure.

Topological Persistence:

Let’s use concepts from topological data analysis to identify and preserve the essential features of the figure. These persistent features would serve as anchors for reconstruction.
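
As a simple, self-contained illustration, the following sketch computes 0-dimensional persistence, i.e. the lifetimes of connected components as the distance scale grows, using a union-find structure. Full topological data analysis would also track higher-dimensional features (holes, voids), so this is only the simplest case.

```python
import numpy as np

def zero_dim_persistence(points):
    """0-dimensional persistent homology of a point cloud: track when
    connected components merge as the distance scale grows. Long-lived
    components are the 'persistent features' that could anchor a figure."""
    n = len(points)
    # All pairwise edges, sorted by length (the filtration)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))

    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []  # every component is born at scale 0; record when it dies
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n-1 merge events; one component survives indefinitely

pts = np.random.default_rng(0).random((30, 3))
print(zero_dim_persistence(pts)[:5])
```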

Quantum mechanical inspiration:

Inspired by quantum mechanics, one could introduce the concept of superposition. The “figure” exists in a superposition of possible states until it “collapses” due to a specific context or additional information.

Fractal Compression:

Let’s use fractal compression algorithms to store the essential features of the figure in a compact form. When reactivated, the algorithm would reconstruct the complete figure from this core information.

Contextual Priming:

Let’s integrate a system for contextual priming that increases the probability of correct reactivation by taking into account the current context (e.g., other stimuli present at the same time or the overall state of the system).
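
One way such priming could look in code: raw match scores between a stimulus and the stored figures are boosted according to how well each figure’s typical context fits the current context. The scoring scheme and the weighting factor are illustrative assumptions.

```python
import numpy as np

def primed_reactivation_probs(match_scores, context_vec, pattern_contexts, beta=2.0):
    """Contextual priming: raw match scores between a stimulus and stored
    patterns are boosted by how well each pattern's typical context matches
    the current context, then normalised into reactivation probabilities."""
    context_sim = pattern_contexts @ context_vec / (
        np.linalg.norm(pattern_contexts, axis=1) * np.linalg.norm(context_vec))
    logits = match_scores + beta * context_sim   # priming raises context-consistent patterns
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

rng = np.random.default_rng(0)
match_scores = rng.random(5)                 # similarity of a stimulus to 5 stored figures
pattern_contexts = rng.normal(size=(5, 8))   # typical context of each stored figure
current_context = pattern_contexts[3] + 0.1 * rng.normal(size=8)
print(primed_reactivation_probs(match_scores, current_context, pattern_contexts))
```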

Adaptive Resonance:

Let’s implement a form of adaptive resonance theory, in which incoming stimuli are compared with stored patterns. If the match is sufficient, the stored pattern is updated and reinforced instead of creating an entirely new one.
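
A simplified, ART-inspired sketch of this matching step: the input is compared with stored prototypes, and only if the best similarity exceeds a vigilance threshold is the existing prototype updated; otherwise a new category is opened. This is a loose illustration, not a faithful ART implementation.

```python
import numpy as np

def art_like_update(prototypes, x, vigilance=0.8, lr=0.3):
    """Adaptive-resonance-style matching: update the best-matching prototype
    if it passes the vigilance test, otherwise create a new category."""
    if prototypes:
        sims = [x @ p / (np.linalg.norm(x) * np.linalg.norm(p)) for p in prototypes]
        best = int(np.argmax(sims))
        if sims[best] >= vigilance:
            # Resonance: refine the stored pattern instead of creating a new one
            prototypes[best] = (1 - lr) * prototypes[best] + lr * x
            return best
    prototypes.append(x.copy())              # mismatch: open a new category
    return len(prototypes) - 1

rng = np.random.default_rng(0)
prototypes = []
for _ in range(20):
    stimulus = rng.normal(size=10)
    art_like_update(prototypes, stimulus, vigilance=0.6)
```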

These approaches together could create a robust system capable of generating similar three-dimensional figures when presenting the same stimulus repeatedly, while also providing the necessary flexibility and adaptability needed for a humanoid AI.

In summary, our approach aims to create a new generation of AI systems that come closer to human thinking and learning. Instead of just processing data, these systems should be able to “understand” information holistically and learn adaptively from experience.

The next steps include the development of theoretical foundations as well as concrete implementation strategies.

In order to deepen the theoretical foundations and develop concrete implementation strategies, the following steps could be taken:

1. Deepening the theoretical basics:

a) Cognitive science models:

- Further development of models that formalize Piaget’s assimilation and accommodation processes.

- Integration of theories of implicit learning and intuition.

- Self-interpretation of the interpreter.

b) Information-theoretical consideration:

- Investigation of information density and flows in biological neural networks.

- Development of mathematical models for the emergence of meaning from activation patterns.

c) Complexity theory:

- Analysis of the emergence of emergent properties in complex systems.

- Investigation of self-organization principles in neural networks.

2. Development of concrete implementation strategies:

a) Architectural design:

- Conception of a flexible, three-dimensional network architecture that allows dynamic reconfiguration.

- Development of algorithms for efficient management and updating of the network structure.

b) Attention mechanism extension:

- Implementation of a multi-level attention system that takes into account local and global contexts.

- Integration of feedback loops to continuously optimize attention control.

c) Metastructure development:

- Design of a hierarchical data structure for storing and organizing attention results.

- Implementation of mechanisms for dynamic adaptation and evolution of this metastructure.

d) Abstraction and reduction mechanisms:

- Development of algorithms for the automatic identification and extraction of essential patterns.

- Implementation of procedures for contextual compression of information.

e) Learning strategies:

- Developing training methods that combine both supervised and unsupervised learning.

- Implementation of mechanisms for continuous learning and adaptation on the fly.

f) Evaluation methods:

- Development of test scenarios and metrics to evaluate the adaptivity and generalization capability of the system.

- Implementation of procedures for visualizing and interpreting the internal representations of the system.

g) Hardware optimization:

- Investigation of neuromorphic computing approaches for the efficient implementation of the proposed architecture.

- Exploration of quantum computing technologies for the simulation of complex, high-dimensional networks.

To advance these aspects, an interdisciplinary collaboration of neuroscientists, cognitive scientists, computer scientists and mathematicians would be required. Experimental studies could be conducted in parallel with theoretical work and software development to continuously validate and refine the concepts.

Iterative prototyping and rigorous testing phases would be crucial to demonstrate the practical feasibility and performance of the proposed approach. A special focus should be on the scalability and efficiency of the system to ensure that it works effectively even with complex tasks and large amounts of data.

Remarks:
— — — — — — — — — — — — — — — — — — — — — — — — — — — —
[1] In order to mathematically model Piaget’s principle of accommodation, various approaches from the theory of complex dynamical systems and artificial intelligence can be used. Here are some possible approaches:

1. Differential equations:

One could use a system of nonlinear differential equations to describe the adaptation of cognitive structures over time, for instance

dS/dt = f(S, E) − αS

where S represents the state of the cognitive schema, E the environmental influences, f a nonlinear function of adaptation, and α a decay rate.

2. Neural networks:

Self-organizing maps (SOMs) or other types of artificial neural networks can model the adaptation and reorganization of cognitive structures. The weight adjustment of the neurons would represent accommodation:

Δwᵢ = α (x − wᵢ)

where wᵢ denotes the weights, x the input vector and α the learning rate.

3. Bayesian models:

These can model the adaptation of beliefs (schemata) to new experiences:

P(H | E) = P(E | H) · P(H) / P(E)

where H represents the hypothesis (schema) and E the new experience.

4. Information theory approaches:

Accommodation could be modeled as minimizing the Kullback-Leibler divergence between the current model and the new data:

D_KL(P ‖ Q) = Σₓ P(x) · log(P(x) / Q(x))

where P represents the current model and Q the adapted model.

These mathematical models can depict the continuous adaptation and expansion of cognitive structures in the sense of Piaget’s accommodation principle. They capture the dynamic nature of the learning process and the interaction between existing schemata and new experiences.
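
As a quick numerical illustration of the fourth approach, here is a minimal function computing the Kullback-Leibler divergence between two discrete distributions; the example distributions are made up purely for illustration.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) between two discrete distributions; accommodation is
    framed as choosing the adapted model Q that keeps this divergence small."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

current_model = [0.7, 0.2, 0.1]     # P: belief before the new experience
adapted_model = [0.5, 0.3, 0.2]     # Q: candidate schema after accommodation
print(kl_divergence(current_model, adapted_model))
```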

— — — — — — — — — — — — — — — — — — — — — — — — — — — —

[2] To reduce complexity by changing the topology, a method called “Topological Data Analysis” (TDA) can be used. This method uses concepts from algebraic topology to simplify complex data sets and capture their basic structure.

The basic idea of TDA is to examine the “shape” or structure of data by looking for topological properties such as connected components, holes or higher-dimensional voids. This makes it possible to identify the essential features of a complex system and reduce its complexity.

Some main steps of TDA are:

Data representation: The data is embedded in a metric space.

Filtration: Different scales are considered to analyze the data structure at different levels of detail.

Topological summarization: The topological properties are extracted, often in the form of persistence diagrams or barcodes.

Interpretation: The topological information obtained is used to gain insights into the underlying structure of the data.

Piaget’s principle of accommodation can be linked to a topological analysis of the brain, where the spatial arrangement of information within a limited space plays a key role.

Topological reorganization:

Piaget’s accommodation describes how existing cognitive structures are adapted to integrate new information. In the brain, this could correspond to a reorganization of synaptic connections and neural networks without significantly changing the total number of neurons.

Information coding through spatial patterns:

In a voxel (three-dimensional volume element) of the brain, complex information could be encoded by different spatial arrangements of synaptic connections and activation patterns. This corresponds to a topological change at the microscopic level.

Dynamic reconfiguration:

The brain could dynamically reconfigure its internal topology to integrate new information or modify existing concepts. This would allow flexible adaptation to new experiences without changing the basic structure.

Multidimensional representation:

Topological analysis could help to understand multidimensional representations of information in the brain. Complex concepts could be represented by the connections between different neuronal ensembles in different brain regions.

Efficiency through topological optimization:

By changing the topology, the brain could optimize the efficiency of information processing and storage by shortening or strengthening the “paths” between relevant information units.

Cross-scale analysis:

Topological data analysis could be applied to examine changes both at the level of individual synapses and at the level of larger neuronal networks, thus obtaining a comprehensive picture of information processing.
