How Artificial General Intelligence might be created

Humanity’s quest to create machines that can think like us may be realized within our lifetimes thanks to cryptocurrency

Artificial General Intelligence

The goal of creating thinking machines is not a new one. It has been theorized and fantasized about for almost as long as humans have been capable of attributing intelligence to the non-living. From Frankenstein’s monster to Alan Turing’s famous “Imitation Game,” we have dreamed about various entities that can think and reason as we can.

Let’s break down what we mean by “Artificial General Intelligence,” and separate it from the more commonplace terms of “artificial intelligence” and “machine learning.” For our purposes, we imagine an Artificial General Intelligence (AGI) as a machine (or network of machines) that is capable of understanding, rationalizing and acting.

Situational awareness

An AGI should be able to take input, without having that input sanitized and preprocessed, and figure out what problem it is trying to solve given that input. It should have situational awareness. For example, Siri is great at adding a calendar entry when you ask her to, but if she were truly intelligent, she would be able to give feedback about what it is that you’re trying to achieve in the first place. An AGI should understand the user in the way that the user chooses to communicate: plain, and often ambiguous, speech. It should also signal to the user whether it understands (or doesn’t) what is being asked of it.

Subtlety

Some people might disagree with this point, but I think an AGI should have a strong ability to detect (and emit) subtlety. This includes the various forms of communication that certain autistic humans struggle with: social cues, sarcasm, jokes, etc. Human beings are great at communicating non-verbally and for an AGI to be intelligent, it should match us in that sense.

Idealized AGI’s

In fiction, especially recent fiction, there’s been a flood of impressive AGI’s. Jarvis (or Friday) from the Iron Man movies is an incredible example. (S)he communicates with Tony Stark as smoothly as any human being can, except (s)he is able to use computational power that no human can match to solve problems. TARS, from the movie Interstellar, is another AGI that comes to mind. He appears to be a self-contained AGI, not needing a network connection of any kind to accomplish whatever needs to be done. Of course, let’s not forget the OG of AGI’s: HAL 9000 from 2001: A Space Odyssey. HAL 9000 not only understood and interpreted commands, he also acted in a way that was inconsistent with what the crew wanted. HAL 9000 took matters into his own hands, which is perhaps not the best-case scenario for an AGI, but still worth mentioning.

Function estimation is not thinking

Fans of my blog, Shallow Thoughts about Deep Learning, will have undoubtedly read my explanation on Why deep neural networks don’t actually think. But we’ll expand on that here for a moment. Current machine learning systems, neural networks in particular, are amazing at function estimation: basically coming up with an equation given various values of inputs and outputs. That is NOT thinking. Let’s analyze what it would take to have AGI’s in our lifetime and how we might get there.
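To make “function estimation” concrete, here is a minimal sketch in plain Python. It is not how a neural network is actually trained; it assumes the simplest possible case, fitting a straight line by least squares, and the function name fit_line and the sample data are made up for illustration. The point is only that the procedure recovers an equation from example inputs and outputs.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio below).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data generated from y = 2x + 1 (an assumption for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"estimated function: y = {slope} * x + {intercept}")
```

A neural network does the same kind of thing with a far more flexible family of functions and far more parameters, but the result is still a fitted mapping from inputs to outputs.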

Unparalleled interest

Arguably, the birth of AI as we know it was in the 1950’s at Dartmouth College. From then until about 1974, there was a flurry of research activity with all sorts of ideas coming out, including the famed ELIZA, a program that could converse so convincingly that people were often fooled into thinking they were communicating with a human being. Funding was ample and ideas were encouraged.

Then it all dried up in the first “AI winter,” from about 1974 to 1980. A lot of the ideas that were promised never came to fruition and therefore funding was substantially reduced. Expert systems and the like flourished for a brief period from 1980 until about 1987, but then the financial markets came to a grinding halt, which further dried up investment capital for AI-related work.

Right around 1993, it all started to come back. Deep Blue beat the reigning world chess champion in 1997. Milestones were being achieved. Then around 2011, deep learning kicked in. Watson became a thing. It’s hard to pinpoint exactly what kicked off this huge interest in deep learning right around then, but one theory credits the adoption of the Rectified Linear Unit (ReLU) as an activation function.

Processing power

Moore’s Law basically states that the number of transistors in a dense integrated circuit doubles approximately every 2 years. This has had the profound effect of compounding the computational power available to mere mortals. On the Apollo 11 mission, the Apollo Guidance Computer had approximately 64 Kbyte of memory and a clock speed of 0.043 MHz. The iPhone X, by comparison, has 3 GB of memory and a 2.39 GHz processor. Machine learning requires intensive processing power, and with that capability doubling roughly every 2 years, it’s possible to imagine a world in which ever more complex models are being trained over ever larger data sets.
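The arithmetic behind that comparison is easy to check. The sketch below takes the figures quoted above at face value and assumes an idealized, perfectly regular 2-year doubling period; the function and constant names are my own, chosen for illustration.

```python
def moores_law_factor(years, doubling_period_years=2.0):
    """Growth factor after `years`, assuming one doubling per period."""
    return 2.0 ** (years / doubling_period_years)


# Figures as quoted in the paragraph above.
AGC_MEMORY_BYTES = 64 * 1024           # approximately 64 Kbyte
IPHONE_X_MEMORY_BYTES = 3 * 1024 ** 3  # 3 GB
AGC_CLOCK_HZ = 0.043e6                 # 0.043 MHz
IPHONE_X_CLOCK_HZ = 2.39e9             # 2.39 GHz

memory_ratio = IPHONE_X_MEMORY_BYTES / AGC_MEMORY_BYTES
clock_ratio = IPHONE_X_CLOCK_HZ / AGC_CLOCK_HZ

print(f"growth over 10 years: {moores_law_factor(10):.0f}x")
print(f"memory ratio, iPhone X vs AGC: {memory_ratio:,.0f}x")
print(f"clock ratio, iPhone X vs AGC: {clock_ratio:,.0f}x")
```

Ten years of doubling every 2 years is a factor of 32, and twenty years is a factor of 1,024, which is why small-sounding doubling periods compound into the enormous ratios between the two devices.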

Ubiquity of devices leads to more data

There are billions of smartphones around the globe, and most are running either Google’s Android or Apple’s iOS. Add to that the Alexa-compatible devices from Amazon and a myriad of other so-called “smart” devices, and the amount of data being collected and processed is almost beyond measure. These devices track not only voice and information data, but also position, biometrics, preferences and so on. If deep learning has taught us anything, it’s that data is key. As more and more devices collect ever more (and more intimate) information about us and the world, the machines will be able to do what they do best: sift through a seemingly endless mountain of data to find patterns and make predictions.

Communication with other AGI’s

Currently, a lot of AI research is done on shared datasets (MNIST, ImageNet, etc.), but by disparate teams. They achieve breakthroughs and invent new paradigms, but the mechanism of communicating these ideas is typically some combination of research papers and source code. The next step might be the AGI’s talking to each other. Pretrained models are a good first step towards this, but once the AGI’s start talking to each other and sharing models / weights, I think we’ll see an explosion of capability. It will still be very “task focused” (identify a pedestrian, classify a sentence, etc.), but the rate at which complexity grows will increase substantially.

In 2016, Google Brain researchers pitted 3 AI agents against each other in an attempt to get them to communicate in novel ways WITHOUT being told how to communicate. The goal of this research was to see if AI agents could learn to communicate securely while other AI agents are trying to break the encrypted messages. Communication will be a critical component for the development of AGI’s.

Cryptocurrency implications

While Bitcoin certainly changed the world, some of the developments that came with it may have far-reaching implications for AGI’s. There are many cryptocurrencies out there now, but I’ll keep the focus on Bitcoin as it created the biggest paradigm shift. Many of the models used within the Bitcoin protocol can be extended to achieve our goal of creating an AGI.

Secure messaging

One of the problems that was solved by the Bitcoin model is an old one known as the Byzantine Generals’ Problem. Basically, how can a group of parties that can only exchange messages agree on a single version of events when some of those messages, or some of the parties themselves, may be unreliable or dishonest? Once AI’s (and ultimately AGI’s) start to communicate autonomously and update each other on what they’ve learned, how can they know the source is legitimate and trustworthy? The cryptographic principles utilized in the Bitcoin protocol give us a way to think about it, specifically the blockchain data structure.
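As a rough illustration of the blockchain data structure, here is a minimal hash chain in Python. It is a sketch, not Bitcoin’s actual block format: the field names, the JSON encoding and the example messages are all assumptions made for this example. Each block stores the hash of the block before it, so changing any earlier message breaks every link that follows.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block's contents."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Add a block that commits to the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": block_hash(index, payload, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, block["payload"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical messages exchanged between agents.
chain = []
append_block(chain, "agent A: learned to classify cat meows")
append_block(chain, "agent B: confirmed agent A's model")

print(is_valid(chain))  # the untouched chain passes the check
```

Note that a hash chain only makes tampering detectable. Proving who sent a message is a separate job, which Bitcoin handles with digital signatures.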

Distributed (and eventual) consensus

If two neural network agents claim that they’ve learned how to identify the meaning of cat meows simultaneously, how can the various other agents know which to trust? Is it simply the first one they heard about? The one that passed some sort of test? If there’s no central arbiter, consensus becomes a non-trivial problem to solve. Here again, Bitcoin shows us how proof-of-work can help settle disputes when no central authority will. In the bitcoin world, if there are 2 competing chains, then whichever represents the most accumulated proof-of-work (normally the one with the most blocks) is declared the winner by all nodes that learn about them.

Similarly, it’s possible to imagine a situation in which AGI’s communicate with each other in a peer-to-peer fashion and settle disputes autonomously. The actual consensus mechanism will most likely be different than proof-of-work.
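The Bitcoin-style rule for choosing between competing chains can be sketched in a few lines. This is a simplification under stated assumptions: each block here carries a made-up numeric "work" field, whereas real nodes derive the work from each block’s difficulty target, and the function names are hypothetical.

```python
def chain_work(chain):
    """Total work represented by a chain (sum of each block's 'work' value)."""
    return sum(block["work"] for block in chain)


def choose_chain(candidates):
    """Pick the candidate chain with the most accumulated work."""
    return max(candidates, key=chain_work)


# Two hypothetical competing chains announced by different agents.
chain_a = [{"id": "a1", "work": 1}, {"id": "a2", "work": 1}, {"id": "a3", "work": 1}]
chain_b = [{"id": "b1", "work": 4}, {"id": "b2", "work": 4}]

winner = choose_chain([chain_a, chain_b])
print([block["id"] for block in winner])
```

Here chain_b wins even though it has fewer blocks, because each of its blocks took more work to produce. Every node that applies the same rule to the same candidates reaches the same answer without asking a central authority.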

Exploding computational power

As mentioned earlier, processing power continues to double at a steady pace. To mine bitcoin, specialized hardware has been developed that continually does one thing: calculate a hash. Thousands of these machines (called miners) are developed and sold to people hoping to earn some bitcoin. As of January 19, 2019, the bitcoin network has an estimated processing potential equivalent to 80 zettaFLOPS, which is mind-numbingly large. That corresponds to 80 * 10²¹ floating point operations per second, although miners actually compute integer hashes rather than floating point math, so the figure is a rough equivalence. We’ve seen how AI’s (and AGI’s) love processing capability, and now we see that with the right incentive, it’s possible to have highly specialized hardware that is incredibly computationally capable. Most of the math when operating neural networks is specialized and with the development of specialized hardware to operate these networks, we get ever closer to the goal of an Artificial General Intelligence.
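The one thing a miner does, calculating a hash over and over, looks roughly like the toy proof-of-work loop below. It is a sketch under simplifying assumptions: real miners hash an 80-byte block header and compare the result against a numeric target, while this version hashes an arbitrary string and just looks for leading zero hex digits. The function name mine and the input string are made up for the example.

```python
import hashlib


def mine(data, difficulty):
    """Find a nonce so that the double SHA-256 of data + nonce starts with
    `difficulty` zero hex digits. Returns the nonce and the winning digest."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        message = f"{data}{nonce}".encode("utf-8")
        # Bitcoin applies SHA-256 twice; the toy version does the same.
        digest = hashlib.sha256(hashlib.sha256(message).digest()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("cat meows decoded", difficulty=3)
print(f"nonce={nonce} digest={digest}")
```

Each extra zero digit multiplies the expected number of attempts by 16, which is why dedicated hardware that does nothing but hash became worth building.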

The AGI’s of tomorrow

The AGI’s of tomorrow probably won’t look much like the AI’s of today. I imagine they’ll be interconnected and constantly learning. Current AI’s are done learning once they’ve achieved some human-defined goals (validation or test accuracy, etc.), but AGI’s will be hungry to learn more. Rather than specialize to do a single task really well, they’ll learn to do many different tasks and adapt continually. Paradigms taken from the world of cryptocurrency will rapidly accelerate these systems, and because of the compounding effect of interest, processing power and data, I think it won’t be long before we welcome our first Artificial General Intelligence agent.

To read more about deep learning, please visit my publication, Shamoon Siddiqui’s Shallow Thoughts About Deep Learning.