Part 4: Artificial Intelligence & Artificial General Intelligence: Moving the Research Base Forward.

dawnalderson
DataDrivenInvestor


Welcome to Part 4 of this series, where I aim to discuss the final themes: Algorithms/Nets/GANs (Engineering), Cognition and An Apparent Trajectory for Development. These themes stem from original research into the work undertaken by OpenAI between December 11th, 2015 and June 11th, 2018, written up in my study OpenAI Blogs: A Micro-Event Analysis.

This paper series (Part 1, Part 2 and Part 3) elaborates further on the outcomes of that research in order to address the aim: how the research base for Artificial Intelligence (AI) and Artificial General Intelligence (AGI) might move forward. The focus is on outlining the main current challenges relating to the development of mainstream, at-scale AI and AGI. The overall purpose of this series of papers is to consider the affordances of a paradigm shift that can enable advanced development and, in turn, inform how the research base might move forward.

The table below serves as a reminder of the final themes from the original OpenAI research, and of the benchmark the organizations were measured against for inclusion: a literature/research base and/or applications aligned with five or more items (see the green column).

All of the companies/organizations in the table show an established or emerging stance for the themes Algorithms/Nets/GANs (Engineering), Cognition and An Apparent Trajectory for Development, evident in their research base and/or applications. To reiterate, companies such as Microsoft, Amazon, and IBM do not.

Algorithms/Nets/GANs (Engineering)

Established Algorithm Development

Based on the very simple premise that computers can only do what they are told to do, we can surmise this is because a coder has told them what to do. Machine learning models are built from data, and the role of the algorithm is to make decisions about that data. Code built into an algorithm for such a purpose might look something like the following C++ Implementation of Decision Tree Algorithm:
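
A minimal sketch of the same idea, written here in Python rather than C++ and with toy data: a decision tree repeatedly searches for the feature/threshold split that best separates the class labels.

```python
# Hedged, minimal sketch of the split-finding step at the core of a decision
# tree; toy data and a Gini impurity criterion, illustrative only.
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Try every feature/threshold pair; keep the split with the lowest
    weighted impurity (this is the decision the algorithm makes about the data)."""
    best = (None, None, np.inf)
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            left, right = y[X[:, feature] <= threshold], y[X[:, feature] > threshold]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (feature, threshold, score)
    return best

# Toy example: two features, two classes; the tree splits cleanly on feature 0.
X = np.array([[2.0, 1.0], [3.0, 1.5], [10.0, 4.0], [11.0, 3.5]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))
```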

In the next section, you will see a ‘Star’ diagram (let’s call it that, it sounds starry) showing the movement of algorithmic development; more to the point, how algorithms have evolved along with computing (I referred to compute in Part 3 of this series).

Given that all companies/organizations were rated as established for the named themes (the Engineering strands, Cognition and An Apparent Trajectory for Development), I shall begin with Reinforcement Learning (RL). I will then move across the items in the diagram with a synthesis of examples pertaining to each company/organization, aiming to highlight affordances that point to an evident paradigm shift, with the purpose of considering how the research base might move forward.

RL is a strand of machine learning that includes algorithms with built-in rewards and penalties. It shares principles with Social Learning Theory (e.g. Bandura, 1971), and the first building block for development at OpenAI was the introduction of Gym:

‘Gym is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Pinball.’

Following on from this, during April 2016, OpenAI Gym Beta became the very first output released into the public domain:

‘…a toolkit for developing and comparing reinforcement learning (RL) algorithms. It consists of a growing suite of environments (from simulated robots to Atari games), and a site for comparing and reproducing results… compatible with algorithms written in any framework, such as Tensorflow…’

[GoogleAI] ‘TensorFlow™ is an open source software library for high-performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.’
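
A minimal random-agent loop against the classic Gym API of that era gives a feel for the toolkit just described; CartPole-v0 stands in here for the Atari environments mentioned above.

```python
# Hedged sketch: a random agent interacting with a Gym environment via the
# classic reset()/step() loop. No learning happens yet; this only shows the
# interface that RL algorithms plug into.
import gym

env = gym.make("CartPole-v0")
observation = env.reset()
total_reward = 0.0
for _ in range(200):
    action = env.action_space.sample()                  # random policy
    observation, reward, done, info = env.step(action)  # environment responds
    total_reward += reward
    if done:
        break
print("episode return:", total_reward)
```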

Indeed, it was Google DeepMind who introduced a general reinforcement learning algorithm that masters chess and shogi through self-play. A convolutional neural network (CNN) was used, trained via reinforcement learning.

Play is a popular way to get machines to learn, whether structured (because the rules of the game are given and serve as a boundary for activity), as we see in DeepMind’s work, or unstructured, as in the work at DARPA:

‘Behind the USC researchers’ robotic limb is a bio-inspired algorithm that can learn a walking task on its own after only five minutes of “unstructured play” — or conducting random movements that enable the robot to learn its own structure as well as its surrounding environment. The robot’s ability to learn-by-doing is a significant advancement towards lifelong learning in machines.’

Two points can be carried forward here.

First, with regard to how ‘play’ is defined. Child development theories highlight different modes of play in the child’s repertoire that aid physical and cognitive development. Piaget outlined stages/milestones and claimed the child develops mental schemas/schemata to support development (see Part 3 of this series). He focused on identifiable behavioral features and characteristics to inform his theory; for example, at the beginning of physical development the child spends time imitating what they see, which can aid fine and gross motor skill development (Jean Piaget: Play, Dreams, and Imitation in Childhood, 1951). Additionally, games with rules serve as a starting point for entry into learning about strategy, although the child, according to Piaget’s theory of development, will at the very beginning of engagement with such games rely on structural cues such as social interaction, sticking to the rules or not, and learning about a sense of fairness, driven by intuition and experience. Over time, the maturing child will be able to engage at a logical, strategic and competitive level. Skill development, therefore, happens within structured play. When we compare this to DeepMind’s structured setting, it can be argued that unstructured play entails the machine finding the rules for itself, first.

The second point I would like to make relates to the bio-inspired algorithms utilized by DARPA, and links to how Piaget constructed his theories. Piaget relied on observations of children to inform his theories; according to the academic literature, bio-inspired algorithms are likewise informed by observation, in this case of the social behavior of animals: Bio-inspired computing: Algorithms review, deep analysis, and the scope of applications (Ashraf Darwish, 2018):

‘…nine bio-inspired optimization algorithms are presented… Genetic Bee Colony (GBC) Algorithm, Fish Swarm Algorithm (FSA), Cat Swarm Optimization (CSO), Whale Optimization Algorithm (WOA), Artificial Algae Algorithm (AAA), Elephant Search Algorithm (ESA), Chicken Swarm Optimization Algorithm (CSOA), Moth flame optimization (MFO), and Grey Wolf Optimization (GWO) algorithm which have been inspired by the social behavior of animals. There are several simulation stages involved in the developing process of these algorithms which are (i) observation of the behavior and reaction of the animals in the nature, (ii) designing a model that represent the behavior of these animals (iii) converting into mathematical module with some assumptions and setting up of the initial parameters, (iv) developing the pseudo code to simulate the social behavior of these animals (v) testing the proposed algorithm theoretically and experimentally, and redefine the parameter settings to achieve better performance of the proposed algorithm.’
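
As a hedged, generic sketch of the swarm idea behind these algorithms (not a faithful implementation of any one of the nine), a population of candidate solutions can repeatedly drift toward the best individual found so far, with shrinking random exploration, to minimise an objective:

```python
# Generic swarm-style optimiser, illustrative only: agents imitate the best
# "animal" observed so far while keeping some randomness for exploration.
import numpy as np

def objective(x):
    return np.sum(x ** 2, axis=-1)                 # toy fitness: distance from the origin

rng = np.random.default_rng(0)
population = rng.uniform(-5, 5, size=(30, 2))      # 30 agents in a 2-D search space
for step in range(100):
    fitness = objective(population)
    leader = population[np.argmin(fitness)]        # the current best individual
    jitter = rng.normal(0, 0.5 * (1 - step / 100), population.shape)
    population += 0.3 * (leader - population) + jitter   # move toward the leader, plus noise

print("best solution found:", population[np.argmin(objective(population))])
```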

A possible recommendation follows from the analysis above: when the notion of play is used for reinforcement learning in machines, it is worth considering what types of play are being characterized, and why.

Moreover, we might wish to ask how observations of animal behavior can transfer to inform algorithmic design in terms of human behaviors. Let’s consider, briefly, the role of Generative Adversarial Networks (GANs) and Convolutional Neural Networks (CNNs), and then look more closely at the shift from RL towards RL combined with Deep Learning (DL) and the deep Q-learning algorithm (DQN). I have circled where we are on the Star diagram; once I have covered all the stars, I will move on to compute.

It is the use of convolutional neural networks (CNNs) for visual imagery, based on simulated biological processes such as neuron-connection patterns akin to those in the animal/human visual cortex, that informs algorithmic design here. Machine learning techniques such as GANs use an adversarial set-up to enable the machine to generate images based on the images it is given as input. Vehicles such as electric vehicles and Space X’s Crew Dragon use such machine-learned imagery in their navigation algorithms, for example.

In short, for the companies in the table above, whether their purpose in using these nets is established or emergent, it is the GANs (Goodfellow et al., 2014) that learn to generate images and to distinguish real from generated ones, and it is the CNNs that process the spatial structure of images, akin to neuron-connection patterns in the visual cortex.
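
To make the adversarial set-up concrete, here is a hedged, minimal sketch (PyTorch is assumed as the dependency; the data and network sizes are toy choices, not any company's production model): a generator learns to mimic a simple 1-D data distribution while a discriminator learns to tell real samples from generated ones.

```python
# Minimal GAN training loop after Goodfellow et al. (2014), toy 1-D data.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0     # "real" data drawn from N(2, 0.5)
    fake = G(torch.randn(64, 4))              # generated samples from random noise

    # Discriminator: push real samples toward 1 and fakes toward 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to fool the discriminator into outputting 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("mean of generated samples:", G(torch.randn(1000, 4)).mean().item())  # drifts toward 2.0
```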

Since 2014, GAN research has moved on. Even so, an apparent complexity remains in 2019, in terms of what Rich Sutton suggests here:

‘we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead, we should build in only the meta-methods that can find and capture this arbitrary complexity. Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done.’

The Star diagram shows progress in the form of DQN. In 2015 (the paper link is here), Google’s DeepMind developed ‘a novel artificial agent, termed a deep Q-network’, again based on the same premise of ‘behaviors’ discussed above. The following extract is taken from the research paper:

‘The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behavior, of how agents may optimize their control of an environment.’

The related blog claims, ‘the work represents the first demonstration of a general-purpose agent that is able to continually adapt its behavior without any human intervention, a major technical step forward in the quest for general AI.’

The implication is that agent adaptation achieved with DQN in the machine, via a reproducibility paradigm, can transfer to robotics development. However, adaptability, it is recognized, requires a shift from a predominant focus on algorithms reliant on psychological features to ones that include behavioral characteristics too.

During May 2017, OpenAI released the blog OpenAI Baselines: DQN and the associated code. It is apparent that OpenAI moved the research for DQN development forward in 2017:

Just as Piaget’s (ibid) observations of children’s behaviors/actions informed his theories about cognitive development, OpenAI, as you can see above, included splitting the neural network to focus on ‘behaviors/action’. This, in turn, contributed a huge leap forward for development: one strand observes/evaluates the other, at a meta level, to inform action.
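
The split described in the Baselines release is commonly known as a dueling architecture: one stream estimates how good the state is, the other how much better each action is than average, and the two are recombined into Q-values. A minimal sketch of that recombination, with made-up numbers:

```python
# Hedged sketch of the dueling recombination: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
import numpy as np

def dueling_q(value, advantages):
    """Combine a state-value estimate with per-action advantage estimates."""
    return value + advantages - advantages.mean()

print(dueling_q(value=1.5, advantages=np.array([0.2, -0.1, 0.5])))  # -> [1.5 1.2 1.8]
```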

The example below shows OpenAI Gym/Universe training Deep Q Learning (DQN) of a simulated lunar lander using CPU on an iMac:
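
A rough, hedged sketch of such a training loop, using Gym's LunarLander-v2 (which needs Gym's Box2D extra) and PyTorch as an assumed dependency, is shown below; the hyperparameters are illustrative rather than those used in the demo.

```python
# Compact DQN sketch for LunarLander-v2: epsilon-greedy exploration, an
# experience-replay buffer, and a periodically-updated target network.
import random
from collections import deque

import gym
import numpy as np
import torch
import torch.nn as nn

env = gym.make("LunarLander-v2")
q_net = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))    # 8-D state -> 4 actions
target = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))
target.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer, gamma, eps = deque(maxlen=50_000), 0.99, 1.0

for episode in range(300):
    state, done = env.reset(), False
    while not done:
        if random.random() < eps:                                          # explore
            action = env.action_space.sample()
        else:                                                              # exploit learned Q-values
            action = q_net(torch.as_tensor(state, dtype=torch.float32)).argmax().item()
        next_state, reward, done, _ = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state

        if len(buffer) >= 1_000:                                           # learn from replayed experience
            batch = random.sample(buffer, 64)
            s, a, r, s2, d = (torch.as_tensor(np.array(x), dtype=torch.float32) for x in zip(*batch))
            q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                y = r + gamma * target(s2).max(1).values * (1 - d)         # Bellman target
            loss = nn.functional.mse_loss(q, y)
            opt.zero_grad(); loss.backward(); opt.step()

    eps = max(0.05, eps * 0.99)                                            # decay exploration
    if episode % 10 == 0:
        target.load_state_dict(q_net.state_dict())                         # refresh target network
```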

More recently, GoogleBrain (March 2019) has shown the importance of distributional learning for linking RL and DL. The work spans 18 months of research carried out with a focus on distributional algorithms. One algorithm

‘explicitly performs gradient descent on a distributional loss (something neither C51 nor QR-DQN does). The result is an algorithm we’ve named S51 (Bellemare et al., AISTATS 2019)’

Essentially, this work is of particular importance due to its recognition of the value of distributional learning; moreover:

‘were able to show that S51 has convergence guarantees when combined with linear function approximation. Along the way, we also gathered evidence that there are pathological examples where the predict-and-distill approach is a worse approximation than directly predicting expected values, a natural consequence of what one reviewer called being “more prone to model misspecification”.’

And

‘…distributional RL does learn richer representations. Below is a visualization of the features learned when predicting the value distribution of the random policy using C51 (left), or using QR-DQN (right).’
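
The core distributional idea can be illustrated with a hedged toy example: rather than a single expected return, the agent's head outputs a probability mass over a fixed support of possible returns (C51 uses 51 such 'atoms'), and the familiar scalar value is simply the expectation of that distribution. The numbers below are made up for illustration:

```python
# Toy categorical return distribution in the spirit of C51; not a trained model.
import numpy as np

atoms = np.linspace(-10, 10, 51)            # fixed support of 51 possible returns
probs = np.exp(-0.5 * (atoms - 3.0) ** 2)   # a made-up predicted distribution centred near +3
probs /= probs.sum()

expected_return = np.dot(atoms, probs)      # what a classic DQN head would report
print("expected return:", round(expected_return, 2))
print("probability of a negative return:", round(probs[atoms < 0].sum(), 3))
```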

Ultimately, what the research suggests is that distributed representations can have an impact on the depth, richness and quality of learning. Similarly, seminal education literature highlights how cognition can be developed for optimal learning opportunities. For example, in research I undertook with university students, I made reference to David Perkins (Harvard Graduate School of Education) and Richard Pea (Stanford):

‘The developmental approach-plus extends the theory to suggest cognition is not solely dependent on one’s ability to think in a certain way; cognition involves not only thinking but also other people, symbolic media and exploiting the environment and artifacts (Pea, 1993). In other words, the developmental approach-plus considers the relevance of the immediate physical and social resources outside of the person as a confluence of features; conceptualized as ‘distributed cognition, which takes place mostly in situations of authentic and extended inquiry’ (Perkins, 1993, p.93).’

Referring back to the Star diagram, I want to turn to the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). In the original research for this paper series, the theme Processing included a focus on computing:

‘…latency is the speed at which information travels/processes across a network. If we combine the two, inefficient algorithms and high latency, this contributes to poor computer processing… in the OpenAI blog: Faster Physics in Python, it is suggested GPU rendering has a faster processing speed than CPU, and so by definition, users can expect lower latency. Further, other examples of GPU hardware-dependent development work can be found in the blogs: Proximal Policy Optimization, where it is stated:

‘We’re also releasing a GPU-enabled implementation of PPO, called PPO2. This runs approximately 3X faster than the current PPO baseline on Atari.’
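
For context, the quantity PPO and PPO2 optimise is a clipped surrogate objective; the hedged sketch below computes it for a few toy probability ratios and advantage estimates (nothing here is specific to the GPU implementation):

```python
# PPO's clipped surrogate objective: min(r * A, clip(r, 1-eps, 1+eps) * A),
# averaged over samples; toy numbers for illustration.
import numpy as np

def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    clipped = np.clip(ratio, 1 - epsilon, 1 + epsilon) * advantage
    return np.mean(np.minimum(ratio * advantage, clipped))

ratio = np.array([0.8, 1.0, 1.5])        # new-policy / old-policy probability ratios
advantage = np.array([1.0, -0.5, 2.0])   # advantage estimates
print(ppo_clip_objective(ratio, advantage))
```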

As you can see, CPU/GPU computing aids the speed and scale at which machine learning algorithms run. Yet, some might say, with regard to GANs, there is a mismatch between what the research base offers and what gets applied for impact/value added in the public domain; the inference is that GANs have shown limited applied transferability as an entity in themselves. Similarly, according to Rich Sutton:

‘Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters, in the long run, is the leveraging of computation. These two need not run counter to each other, but in practice, they tend to.’

Recent research referenced by OpenAI focuses on energy-based models (EBMs); the paper is here. The main affordance of EBM functionality is that hidden variables can be included in the design, not too dissimilar to the notion of degrees of freedom utilized in control-theory algorithms applied in aerodynamics (see, for example, this Ph.D. submission to NASA, 1996, p.14 onward):

Additionally, I also wrote about degrees of freedom in Part 3, in reference to quantum computing:

Yet, in terms of efficient compute, while EBMs can incorporate Adaptive Computation Time plus Adaptive Generation, shifting the research base forward beyond GAN constraints, self-regulation for efficiency in computing remains absent; labeling can be out of sync due to variation in what needs to be recorded at the same time (courtesy of @Calvinn_Hobbes on Twitter):
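
As a hedged illustration of how an EBM generates samples, the sketch below runs Langevin dynamics: repeated gradient steps that lower a learned energy, plus injected noise. The energy function here is a hand-written stand-in, not a trained model.

```python
# Langevin sampling from a toy energy function; illustrative only.
import numpy as np

def grad_energy(x):
    return x - 2.0                               # gradient of E(x) = 0.5 * ||x - 2||^2

rng = np.random.default_rng(0)
x = rng.normal(size=2)                           # start from noise
step = 0.1
for _ in range(200):
    x = x - 0.5 * step * grad_energy(x) + np.sqrt(step) * rng.normal(size=2)

print("sample after Langevin steps:", x)         # drifts toward the low-energy region near 2
```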

The Star diagram shows progress toward the existence of neuromorphic chips. Let’s start with that item first, then I will address spiking neurons.

A good example of how the CPU has evolved is in its use for electric vehicles. In the article Hybrid/Electric-Vehicle Software Delivered in the GreenBox, it is stated: ‘The GreenBox contains a 64-bit, quad-core, ARM Cortex-A system with interfaces and analog support’:

ARM architecture can support a neuromorphic chip too. Essentially, this type of hybrid system can compute while running on the CPU, with a large amount of information designed as shortcode contained on a neuromorphic chip, whereby the repertoire of capability, affordances and functionality can become a four-way pipeline: CPU processing for general software, convolutional neural network(s), a binarized neural network with time-domain analog processing, and digital mixed-signal processing, as I outlined in Part 1 of this series. The related paper reference is here:

The inference is that computing can be manipulated with shortcode on a chip that runs over the CPU, in parallel or not, depending on the TDNN coprocessor.
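
A hedged sketch of the binarized-network part of that pipeline: when weights and activations are constrained to +1/-1, the dot product at the heart of a layer reduces to cheap sign-and-count operations of the kind a small coprocessor can run alongside the CPU (the numbers below are illustrative):

```python
# Binarized dot product: agreements minus disagreements between sign vectors.
import numpy as np

def binarize(x):
    return np.where(x >= 0, 1, -1)

activations = binarize(np.array([0.3, -1.2, 0.7, 0.05]))
weights = binarize(np.array([-0.4, -0.9, 0.2, 0.6]))
print("binarized dot product:", int(np.dot(activations, weights)))  # 3 agreements - 1 disagreement = 2
```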

There is no reason whatsoever why the same or a similar architecture could not be developed, based on the use of EMG signals to work out the code for neural signals, placing the algorithm/net information on a chip, together with the additional hydraulics technology used for Boston Dynamics’ BigDog:

To reiterate: given we know CPU/GPU computing aids the speed and scale at which machine learning algorithms run, and that combined processing can run over the CPU, for example, the shortcode on a chip can also be made to run as efficiently as possible. Within chip development, the Star diagram shows ‘spiking’. In 2013, Navaridas et al. published the paper SpiNNaker: Fault tolerance in a power- and area-constrained large-scale neuromimetic architecture:

‘SpiNNaker is an application specific design intended to model large biological neural networks — the name “SpiNNaker” being derived from ‘Spiking Neural Network architecture’. It consists of a toroidal arrangement of processing nodes, each incorporating a purpose-built, multi-core System-on-Chip (SoC) and an SDRAM memory (Fig. 1). Neurons are modeled in software running on embedded ARM968 processors; each core is intended to model a nominal 1000 neurons. Small-scale SpiNNaker systems have successfully been used as control systems in embedded applications [1], providing robots with real-time stimulus-response behavior as described in [2]. However, the ultimate aim of the project is to construct a machine able to simulate up to 10⁹ neurons in real time. To put this number in context some small primates have brains with slightly lower neuron counts whereas the human brain has roughly 86 times this number [3]. To reach this number of neurons more than one hundred thousand integrated circuits will be needed (half of which are SpiNNaker chips and the other half SDRAMs).’

Navaridas et al. show, in an overall view of SpiNNaker, the toroidal arrangement of processing nodes and how the ‘traffic’ is processed.
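
A hedged sketch of why the toroidal layout matters for traffic: on a wrap-around grid every node has the same number of neighbours and no node sits at an 'edge', which bounds the longest route a packet must travel (the grid size below is illustrative, not SpiNNaker's actual dimensions):

```python
# Neighbours on a 2-D torus: coordinates wrap around at the edges.
def torus_neighbours(x, y, width=8, height=8):
    return [((x + 1) % width, y), ((x - 1) % width, y),
            (x, (y + 1) % height), (x, (y - 1) % height)]

print(torus_neighbours(0, 0))   # a 'corner' node still has four neighbours
```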

For a closer look at the basic tenet of the torus, Minhyong Kim explores in his work the link between pure math and physics, considering space and related interstices:

So it goes: such architecture on a chip (bearing in mind SpiNNaker was developed in 2013) can also be integrated with the four-way pipeline I mentioned earlier. Logic says so.

Moving on, we now arrive at the final development: the Brain-Computer Interface (BCI)/Brain-Machine Interface (BMI) paradigm on a chip. This paradigm shift relates to the theme from the original research I undertook with the OpenAI data, namely Cognition. Before I delve into BCI/BMI chip functionality, I will briefly refer to other related work about cognition in the machine.

The OpenAI data analysis revealed a focus on Language/Communications Development, Meta-Learning and Learning Transfer. Similarly, SingularityNET/Hanson Robotics developed its own open-source artificial general intelligence project, OpenCog; a good example of an algorithm framework is on this page for RelEx, where it is stated that it

‘is well-suited for question-answering and semantic comprehension/reasoning systems’.

The aim of SingularityNET is to develop, in OpenCog, theoretically driven algorithms for application to robotics. As a consequence of this aim, we see that Hanson Robotics has created Sophia, for example, whose question answering is currently pre-scripted, though the goal is to move beyond that.

Shifting the Paradigm

Returning to the Brain-Computer Interface (BCI)/Brain-Machine Interface (BMI) paradigm on a chip. According to Fetz (2007):

‘Successful operation of brain-computer interfaces (BCI) and brain-machine interfaces (BMI) depends significantly on the degree to which neural activity can be volitionally controlled.’

Where Fetz states ‘volitionally controlled’, given we find ourselves in 2019 and in view of the literature we have addressed thus far, it is fair to suggest we can re-phrase this as: volitionally controlled on a chip. We can also define volition to include human agency/human acts of agentive activity. Fetz, writing in 2007 (12 years ago), continues:

‘Neurons in sensory association areas are also volitionally activated in conjunction with cognitive imagery… Thus, internal representations of stimuli and movements often employ many of the same neurons involved in overt sensory or motor behavior. Beyond representations of sensory and motor events, internal cognitive activity like ‘thinking’ must also have neural correlates and these also represent volitionally controllable processes…Thus, conventional experiments have revealed a range of circumstances in which central control of neural activity is evident. Volitional input could be considered to reflect an activating modality existing in addition to the better-studied sensory and motor modalities. The degree to which it is available for BCI/BMI control signals remains to be empirically determined. Conventional experiments, such as those described, are typically designed around a particular behavior, and indirectly reveal the volitional components of correlated neural activity. Reversing this paradigm, biofeedback experiments directly elicit the volitional control of neural activity and allow the correlated behavior to emerge.’

Essentially, Fetz suggests that volitional thought/cognition can be controlled to determine behavior with the aid of reinforcement activity.

An Apparent Trajectory for Development

The final theme in the table that has led this paper series is An Apparent Trajectory for Development. The OpenAI data showed direction with regard to seeking eventual adaptation/adaptive capabilities in the machine. This is also established or emerging in the body of work across the companies in the table:

NASA, Space X, Neuralink & DARPA

In order to think about the application of the Basic biofeedback paradigm, NASA will be considered next.

A good example is in the plan to send the Mars 2020 rover to the Red Planet, landing in early 2021; it will carry with it a small helicopter. According to Discover:

‘The helicopter must have built-in systems to safely land itself if it encounters any errors. So the Mars Helicopter will be autonomous, flying itself on short flights. It will receive commands and communicate via the rover, but has its own solar-powered batteries for power and a heater to keep it warm on cold Martian nights.’

Such autonomy can be achieved with the biofeedback paradigm, whereby human-brain neural activity is modeled and volition controlled, which in turn can be placed on a spiking chip, akin to the way the autonomous navigation processes for Space X’s Crew Dragon-2 operate.

Homing in, a good algorithm example for Space X can be considered when thinking about the Crew Dragon-2 vehicle. In the video time-lapse footage below, Crew Dragon can be seen docking with the International Space Station (ISS). In order to do this, many algorithms were deployed, akin to what I explained in Part 1 when I referred to Von Neumann’s work in relation to compute and the function of shortcode.

When Crew Dragon departed the ISS, Max Fagin posted on Twitter (Mar 7, 2019; the detail below, with image) to explain why the vehicle climbed to a slightly higher orbit to pass above the station; the mathematics you can see (including, if you look closely, sine/cosine functions) would have taken the form of transform algorithms.

‘Unavoidable artifact of orbital mechanics. Dragon is docked to the prograde port of the station. Departing the station means adding a few ~m/s to its orbital velocity, which means it will climb to a slightly higher orbit, and pass above the station.’
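
A hedged back-of-the-envelope check of the effect Fagin describes, using the vis-viva equation with an ISS-like orbit and an assumed 1 m/s prograde burn (both figures are illustrative):

```python
# Small prograde burn from a circular orbit: the semi-major axis grows, the
# perigee stays where the burn happened, and the apogee climbs above the station.
import math

mu = 3.986e14                   # Earth's gravitational parameter, m^3/s^2
r = 6_371_000 + 408_000         # ISS-like circular orbit radius, m
v = math.sqrt(mu / r)           # circular orbital speed (~7.7 km/s)

dv = 1.0                        # assumed ~1 m/s departure burn
a_new = 1 / (2 / r - (v + dv) ** 2 / mu)    # vis-viva: new semi-major axis
apogee_raise = 2 * (a_new - r)              # apogee climbs by twice the change in semi-major axis
print(f"apogee raised by roughly {apogee_raise / 1000:.1f} km")   # a few kilometres above the station
```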

More specifically, transform algorithms enable the mathematics to be expressed with matrices, clustering the computations so that they are shorter and speedier to process; see, for example, Matrix Kronecker Product and Its Properties (Ali N. Akansu and Richard A. Haddad, in Multiresolution Signal Decomposition, 2001), e.g.:
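
As a small worked illustration of the Kronecker product (using numpy's kron; the matrices are illustrative, not taken from Akansu and Haddad), each entry of the first matrix is replaced by a scaled copy of the second, turning two small transforms into one larger, separable one:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# kron(A, B) replaces every entry a_ij of A with the block a_ij * B.
print(np.kron(A, B))
```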

It is notable in the above example that image arrays, built from the related matrices, can be the computational outcome of 2D transforms and can affect how the kernel operates.

It is fair to suggest, therefore, that reverse engineering played a part: once we knew the vehicle had successfully docked at the ISS, it was then a matter of departure, and standard operating procedure followed for its return trajectory back to Earth.

Keeping in mind the notion of reverse engineering, and given we have an idea as to its success, Ray Kurzweil states that the process of reverse engineering and redesign will also encompass the brain. So it goes that, if all of the regions of the brain can be reverse engineered, just like the engineering for the Crew Dragon vehicle, models and simulation can provide the software, hardware and algorithmic methods, such as micro-code/processing, to simulate brain activity via a chip, such as a biofeedback-paradigm chip, as well as a mesh, for example.

Relatedly, in very simple terms, when we consider Neuralink, although algorithmic detail is not explicit in the literature or research base, the science is in place, stemming from theory; application may follow in the foreseeable future, as Elon Musk announces development milestones. What might be the implications of this for SingularityNET/Hanson Robotics?

Connected with the possibility of reverse engineering all regions of the brain, it has been suggested that algorithmic methods could simulate emotional intelligence. Emotional intelligence enables humans to adapt to their environments and to be adaptive in their relations with others.

DARPA’s work includes established, bio-inspired AI algorithms; and this year, work is focused on the next generation of AI algorithms and applications, such as explainability and common-sense reasoning, with the purpose of furthering adaptation capabilities.

In sum, and to conclude this paper series: it is clear the paradigm shift for the development and application of AI, with the aid of machine learning, is about achieving adaptation in the machine. The progress made to date, including re-usable vehicles for space exploration, shows chip development can enhance engineering, so much so that re-usability can be defined in action as adaptability. To reach eventual, applied artificial general intelligence (AGI), adaptability is a requirement. Therefore, if a machine can adapt to change direction and complete a round route autonomously, then such adaptation can take place in robots. The paradigm has shifted already; how it maps among humans, not solely in space/on Mars, remains to be seen. On March 26th, 2019, the NY Times shared an overview of Robotics at @GoogleAI: progress in using machine learning to get robots to start doing practical things, from navigating rooms to handling everyday objects, in a way that works well alongside people (for more about the latest developments in robotics research at GoogleBrain/AI, see the following link). Therefore, how might the research base move forward? First and foremost, with a determined focus on ethics.
