The many things AI needs…
It is always interesting to think about the next thing in AI: will it be something incremental, or something entirely different? I like to think about the latter, but let’s briefly touch on the former before moving on.
Incremental advances: The easy prediction here is to take current AIs/products and assume that they will gradually become better at what they do. The top tier of such products are prediction AIs that employ Artificial Neural Networks (ANNs) and algorithms for things like weather prediction, bureaucratic interactions, mundane tasks like search, and not-so-mundane tasks like object detection and driving a car. If this prediction ages well, we will become a more efficient species in many respects, but we will also need to reorganize ourselves accordingly. If, for instance, self-driving cars become predominant, the car industry, governments and those affected by the loss of human jobs will need to adapt.
“It’s hard to make predictions — especially about the future.”
Robert Storm Petersen
Here’s the thing: we have a really mixed track record of predicting the future of technology. So rather than fantasize about something that might or might not happen (fun, but probably best left to Sci Fi), a better alternative might be to list the things that current AIs don’t accomplish, or accomplish poorly (a list that is probably incomplete and, with time, eventually wrong). The theory is that if we solve these problems, new AIs and products will appear, or, as the popular quote says:
“The best way to predict the future is to invent it.”
Alan Kay et al.
Let’s take vision, which is probably the most researched domain in both computational neuroscience and AI. A camera matching human vision (at least in resolution) is more or less common these days; what’s not common is the sophisticated processing that your eyes, and roughly a third of your brain, perform. In comparison, computer vision uses shallow, task-specific algorithms, which are for the most part useless if you change the task.
A circle-detecting script is useless for depth perception unless you code it to also detect depth, and that new script is in turn useless for text recognition, and so on; we haven’t written the one program that does it all.
Artificial Neural Networks (ANNs) can process a lot of information and in many cases perceive something as belonging to a certain class (usually a single class or family), but this takes training, which in turn takes time. In contrast, we get our training through a mix of biological evolution and our daily life and experiences (more on this later), and we can also modify what we see based on higher (top-down) cognitive processes like conscious or subconscious attention.
A common example of top down subconscious perception: You just bought a new car and chose RED as the paint color, for the next few days you can't help but notice how many RED CARS are on the road, they have always been there, it's just that with your recent purchase the color RED + CAR has gained importance in your mind and that is what your senses are now tuned for.
The point here is that we perceive the world in a different way, at least where vision is concerned: it’s an active top-down/bottom-up affair that we haven’t quite been able to emulate.
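The red-car example can be sketched as a salience computation whose feature weights are set from the top down. This is only an illustration: the scene items, features and weight values below are all invented.

```python
# Sketch of top-down attention as feature reweighting.

def salience(item_features, priors):
    """Score an item by how strongly its features match current priors."""
    return sum(priors.get(f, 0.0) for f in item_features)

# A toy visual scene: each item is a set of features.
scene = [
    {"car", "red"},
    {"car", "blue"},
    {"tree", "green"},
]

# Baseline priors: nothing is especially important.
priors = {"car": 0.1, "red": 0.1, "blue": 0.1}
baseline = [salience(item, priors) for item in scene]

# After buying a red car, a higher-level goal boosts those features.
priors["red"] = 1.0
priors["car"] = 0.5
after = [salience(item, priors) for item in scene]

# The red car now dominates the salience ranking, even though the scene
# itself has not changed: perception was re-tuned from the top down.
```

The point of the sketch is that the bottom-up input (`scene`) is identical before and after; only the top-down weights moved.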
The other senses are a mixed bag: touch, smell, taste and the less popular senses, like the sense of balance, are mostly either absent or fully solved. Hearing is somewhere in the middle: sound can be recorded at higher resolution than we can perceive, and in some cases (like cochlear implants) the neural code can even be emulated, yet as we go up the ladder of tasks we perform with our hearing, AIs fall short.
Think about a crowded room: you can easily pick out the voice of your friend on the other side of the room and make sense of voices you have never heard before, even when spoken with heavy accents; you can also perceive the emotional content in voices to derive further meaning.
Besides the addition of top-down influence to perception, we also apply a variety of computational operations to stimuli from the environment. Vision, for instance, is not simply a screen in our head; rather, light information is parsed into different maps thanks to a combination of specific neurons (color and luminosity neurons) and neural networks that process raw inputs into shapes, edges, movement and so on, hence the sparse nature of complex perception.
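A minimal sketch of this "one input, many maps" idea: the same raw image is turned into separate luminance and edge maps, loosely analogous to how early vision splits light into parallel feature streams. Pure Python on a toy 4-pixel-wide image.

```python
# Toy grayscale image: a dark left half and a bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Map 1: luminosity (here just the raw intensity, normalized to 0..1).
luminance = [[px / 9 for px in row] for row in image]

# Map 2: horizontal edges, via a simple left-right difference
# (a crude stand-in for edge-detecting neurons).
edges = [
    [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
    for row in image
]

# Each map carries a different, sparse slice of the same scene: 'edges'
# is large only at the boundary between the dark and bright halves.
```

Real early vision computes many more such maps (orientation, motion, color opponency), but the principle is the same: parallel, specialized transformations of one input.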
Current AIs might have a mixed record when trying to emulate our individual senses, but where they really start to show how much room there is for improvement is when we consider how they integrate with each other, or rather how they don’t integrate with each other.
A simple integration task: upon hearing a certain word, you (or the AI) should point to a matching picture; you are also tasked with pressing a button if the image is a repeat.
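One naive way to sketch this task is to assume both modalities already map into a shared embedding space, with a memory of what has been seen for the repeat button. Everything here is hypothetical: the vectors are invented, and real systems would have to learn such a common code, which is exactly the hard part.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy shared space: both modalities map into the same 3-d code.
audio_embedding = {"dog": [0.9, 0.1, 0.0]}          # from the speech system
image_embeddings = {                                 # from the vision system
    "picture_of_dog": [0.8, 0.2, 0.1],
    "picture_of_car": [0.0, 0.1, 0.9],
}

seen = set()  # memory of already-pointed-at pictures

def point_to(word):
    """Pick the image best matching the heard word; flag repeats."""
    target = audio_embedding[word]
    best = max(image_embeddings,
               key=lambda k: cosine(target, image_embeddings[k]))
    repeat = best in seen
    seen.add(best)
    return best, repeat

first, repeat1 = point_to("dog")    # matches the dog picture, not a repeat
second, repeat2 = point_to("dog")   # same picture again, now a repeat
```

The sketch works only because both dictionaries were written in the same coordinate system by hand; getting two independently trained systems to agree on such a code is the open problem.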
Integration, like vision, seems like a simple problem until you realize that in order to integrate both modalities the way we do, you have to find some common code that both can employ to talk to each other, and the systems approach to AI is currently in its infancy. Take a deep breath: what follows are straight-up missing links in current AIs, especially if you consider that each level builds on top of the previous one, a preliminary stack of sorts if you will:
A big missing piece is memory, and let’s start by pointing at the elephant in the room: computers can store vast amounts of information, and we long ago passed the mark where we had the upper hand, so what is the problem, then?
It all boils down to how we acquire and store memories in conjunction with sensory modalities and other factors unique to us biological beings.
We get better at certain tasks the more we practice them, and so do computers via machine learning; that much is true. But we do it online, in real time and, most importantly, in a way that is unique and meaningful to each individual.
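The batch/online contrast can be shown in miniature: a batch learner needs the whole dataset up front, while an online learner updates its estimate one observation at a time, the way daily experience trickles in. The quantity being learned here (a running mean) is deliberately trivial; it stands in for any incrementally updated model.

```python
def batch_mean(data):
    """Batch learning: requires all the data at once."""
    return sum(data) / len(data)

class OnlineMean:
    """Online learning: incrementally track a mean, one sample at a time."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        # Standard incremental-mean update: no stored history needed.
        self.mean += (x - self.mean) / self.n
        return self.mean

data = [2.0, 4.0, 6.0]
learner = OnlineMean()
for x in data:
    learner.update(x)   # the estimate is usable after every single sample

# Both converge to the same answer, but only the online learner had a
# working estimate at every intermediate step.
```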
Do you remember that character in the movie, show or book you saw or read a few weeks ago? The one you felt strongly about? Now ask your phone the same question... it’s such an unfair competition.
A good place to start bridging this gap would be to emulate the different types of biological memory (keep in mind, though, that the complex stages of memory consolidation are also not being emulated); here’s an overview:
This table might not make much sense if you are not into biological learning and memory; the point is that there are at least nine (more if you divide by domain) different ways to store information in your brain, here roughly divided by duration, domain and access.
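As a rough sketch, the standard taxonomy can itself be written down as a data structure, grouped by duration, with access (conscious vs implicit) marked on the long-term branch. The exact carving differs between textbooks; this follows the common declarative/non-declarative split.

```python
# A sketch of human memory systems, grouped by duration.
memory_systems = {
    "sensory": [                       # hundreds of milliseconds to seconds
        "iconic (visual)",
        "echoic (auditory)",
    ],
    "short_term": [                    # seconds to minutes
        "working memory / phonological loop",
    ],
    "long_term": {                     # hours to a lifetime
        "declarative (conscious access)": [
            "episodic (events)",
            "semantic (facts)",
        ],
        "non-declarative (implicit)": [
            "procedural (skills, habits)",
            "priming",
            "classical conditioning",
            "habituation / sensitization",
        ],
    },
}
```

Counting the leaves gives nine distinct systems, and an artificial architecture aiming to emulate biological memory would arguably need a counterpart (and a consolidation pathway between them) for each.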
Intelligent behavior ( Towards AGI )
Let’s say we solved or emulated the sensory, integration and memory aspects. We would end up with an interesting entity/program, which I propose is sufficient for basic artificial consciousness and intelligence (CAI), but not necessarily for Artificial General Intelligence (AGI), which in simple terms means just that: you give a program a problem (any problem) and it solves it.
Akinetic mutism is a seemingly horrible condition in which you can perceive external stimuli but lack the ability to speak or move. Paradoxically, recovering patients don’t describe a hell where they were locked in their minds (unlike locked-in syndrome), but rather a lack of anything meaningful to report... nothingness.
Akinetic mutism might be an extreme disorder, yet it serves as an example of what is left when there is no internal volition, no inner compass or goals that give us meaning and, more importantly, direction, which seems to be a prerequisite for intelligence.
What we want from an AGI is to somehow extract the general-intelligence bits without the baggage of individual or species goals. To be honest, I don’t think we fully understand the problem yet (goals might not be the sole requirement; systems that include goals might be), so what we are left with are individual strands of intelligent behavior: programs that can’t generalize.
Personality & Inner Monologue
Speaking of things we don’t fully understand, absent from the previous discussion is our own entity, not just in the particular sense that we all have different experiences, but in general: the inner world of thoughts and dreams that we experience every day.
A singular thought might be easy to emulate, but our inner world is far more complex: the voices in our head, the images and sounds that we experience with or without external stimuli, are tied to our experiences and controlled, again, by a mix of top-down/bottom-up mechanisms. And if this wasn’t a weird enough arrangement, there’s also agency involved.
To illustrate how bizarre our inner world is, let’s focus on a single element, covert speech (the voices in your head), and imagine we are instructed to recreate it. The research is scant, so this is how I interpret my own covert speech: there is usually a single voice, but the content can take multiple forms; the one that persists through time is usually the one closest to my current sensory experience. Other voices do bubble up, each somehow different in content and based mostly on experience or sentiment, and the conscious use of my body is mostly, but not uniquely, controlled by the persistent voice.
Behind the curtain of this and other covert modalities, like working memory and the phonological loop (seeing with your mind’s eye, having a song stuck in your head), lie complex and dynamic neural networks we’d need to recreate or extract at the algorithmic level. This is a somewhat daunting task if you consider, for instance, that when covert speech is absent (due to acquired or congenital reasons) other modalities take over: mute/deaf subjects still have thoughts, and blind people can still imagine things. Plasticity (your changing brain, see the memory section) also needs to be accounted for.
In graphical form, this challenge consists of recreating the complex inner processing our brains do, which changes through time, in contrast with a simpler, static system that generates simpler behavior.
If thoughts seem like an intractable problem, then feelings and emotions present a new type of challenge. The first and more immediate one is emulating them virtually, and, it bears repeating, we need to somehow incorporate them into all the previously mentioned elements.
Adrenaline and the fear response: you see something you interpret as a threat and adrenaline kicks in, a fear response that primes your body to fight or flee. But how would we code such a response? Neurotransmitters affect selected neurons or neuron populations by increasing or decreasing their activity*. If you think of your body as hundreds if not thousands of systems, a hormone or neurotransmitter has the effect of changing the response of a number of these systems, so what we are really after is a piece of software/hardware that can reconfigure itself based on experience or a certain pre-programmed routine. That could be a first take on artificial feelings, but it’s far from clear whether this is doable at scale; after all, we possess a varied repertoire of emotions. (* A simplification.)
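A first pass at that idea can be sketched as a global signal that computes nothing by itself but reconfigures the gains of many subsystems at once. The subsystems and gain values below are invented for illustration.

```python
# Sketch of a neuromodulator as a global gain controller.

class System:
    """One of the body's many subsystems, with a tunable response gain."""
    def __init__(self, name, gain):
        self.name = name
        self.gain = gain

    def respond(self, stimulus):
        return stimulus * self.gain

body = {
    "heart_rate":  System("heart_rate", 1.0),
    "muscle_tone": System("muscle_tone", 1.0),
    "digestion":   System("digestion", 1.0),
}

def release_adrenaline(body, level):
    """Globally re-tune subsystems: amplify some, suppress others."""
    body["heart_rate"].gain *= 1.0 + level            # up-regulated
    body["muscle_tone"].gain *= 1.0 + level           # up-regulated
    body["digestion"].gain *= 1.0 / (1.0 + level)     # down-regulated

release_adrenaline(body, 0.5)
# The same unit stimulus now produces a different whole-body profile:
# heart rate and muscle tone respond more strongly, digestion less.
responses = {name: s.respond(1.0) for name, s in body.items()}
```

The "emotion" here is nothing more than a coordinated shift in many gains at once, which is one reading of what a piece of self-reconfiguring software would have to do; scaling this to a full repertoire of emotions is the open question.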
A second and perhaps even more intriguing question is whether we need to incorporate emotions into a virtual or artificial construct in the first place: are they required for intelligence?
As mentioned above, emotion allows us to fine-tune our behavior based on experience; emotional events have a fast track to long-term memory, and so evolution has deemed them important. Their utility is hard to isolate, since we are social animals and most if not all of our emotional baggage subserves our behavior and interactions with others. Empathy, for instance, allows us to recreate what others experience to a certain degree, and in its simplest form this constitutes an exchange of information (a frown conveys information without a word). So I guess yes, emotion is required for intelligence, but with the caveat that we are not talking about logic and problem solving at the individual level but at the group level (many frowns, and the group needs to change its behavior).
“Life is without meaning. You bring the meaning to it. The meaning of life is whatever you ascribe it to be. Being alive is the meaning.” Joseph Campbell
The above quote helps frame what I’ve always believed to be the thorniest aspect of future AIs (and a favorite of Sci Fi): how to somehow infuse the human experience into an artificial entity, how to give it meaning.
Take context and logic. In a social environment we are required to behave in certain ways; we are surrounded by formalities and little rituals (handshakes, deference, eye gaze, speech tone, volume, hand movements, mirroring and so on). All of these behaviors are perfectly logical within a certain context but can quickly become incorrect in a different one, and understanding this simple fact requires all of the above systems plus a complex awareness of other entities as they relate to oneself. What we want from an AI (in order to better solve our problems) is not only to be logical in purely mathematical or abstract terms, but to be logical in human terms and to understand our context. In theory, all this could be achieved by a complex, mutating logical table, but alas, we haven’t produced it yet, nor do we know if it would be enough.
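The "mutating logical table" idea can be sketched naively: the same behavior is judged appropriate or not depending on context, and feedback from the group can rewrite the table. The contexts, behaviors and rules below are all invented; a real system would need vastly more entries and some way to generalize beyond them.

```python
# A context-dependent behavior table that mutates with social feedback.
rules = {
    ("business_meeting", "handshake"): True,
    ("business_meeting", "hug"):       False,
    ("family_dinner",    "hug"):       True,
    ("family_dinner",    "handshake"): False,
}

def appropriate(context, behavior):
    # Unknown (context, behavior) pairs default to "not appropriate":
    # cautious behavior in unfamiliar situations.
    return rules.get((context, behavior), False)

def feedback(context, behavior, frowns):
    """Many frowns, and the group's reaction rewrites the table."""
    if frowns > 2:
        rules[(context, behavior)] = False

# A behavior that was logical in this context...
feedback("business_meeting", "handshake", frowns=5)
# ...is now marked inappropriate there, while staying valid elsewhere.
```

The obvious weakness, and the author's point, is that a literal lookup table can never be complete: human context-awareness generalizes to situations no table has ever listed.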
In short, so many of the things we seem to want from AI hinge on the AI understanding or experiencing life as we do, producing that could be the ultimate challenge of AI.
Epilogue: Why ?
If this list looks daunting, it’s because it is: cracking the big AI problems is perhaps the ultimate human intelligence challenge. On the flip side, every aspect mentioned above could spawn hundreds if not thousands of new inventions once solved, so I believe we could be rewarded at each step of the way. I should also mention that nothing we’ve covered so far is impossible; it will just take time.
Achieving this is no small thing, but it’s also not a panacea: we have a very bad track record of using new inventions in wholly positive ways, and I suspect this will be the case with AI, even though the goal should be to better ourselves and our environment.
Thanks for reading.
It is hard to come up with reading references for something that doesn’t exist yet, but here’s a curated attempt.
I think a good overview of neuroscience is helpful both for inspiration and to better understand more advanced concepts; here are two places you can start:
Kandel, Eric R., and Sarah Mack. Principles of Neural Science. McGraw-Hill Medical, 2014.
Purves, Dale, et al. Neuroscience. Sinauer Associates, 2018.
The systems view of the brain is a work in progress, and as such the knowledge is dispersed; here are two good starting points:
Buzsáki, György. Rhythms of the Brain. Oxford University Press, 2006.
Sporns, Olaf. Networks of the Brain. MIT Press, 2010. (See also his Discovering the Human Connectome.)
Memory is somewhat better understood, but critical pieces are still missing:
Gluck, Mark A., et al. Learning and Memory: from Brain to Behavior. Worth Publishers, 2008.
On the other side of the table, AI is a fast-changing field, especially when it comes to neural networks, but that’s not the whole story: the field as a whole consists of a myriad of algorithms and techniques for intelligent behavior. A good introduction:
Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2010.