Artificial Intelligence: Separating Fact from Fiction Part II

Alexander Katrompas, PhD
IQumulus
Dec 8, 2023

About 5 years ago, I posted a piece here on Medium, Artificial Intelligence: Separating Fact from Fiction. Two important things have happened since then, and in light of them this topic deserves an update, but first, some context. The first important thing, probably important only to me, is that I returned to school in 2019 and in 2023 earned my PhD in computer science, with a concentration in artificial intelligence and machine learning (AI/ML). Most relevant to this post, my main area of research is the core technology of the second important thing that happened.

The second important thing is the invention of the transformer, which, unlike my PhD, affects everyone, particularly in the form of the generative pre-trained transformer (GPT), most commonly known today as ChatGPT. The transformer itself was born in 2017 [Vaswani, et al.], two years before I began my PhD, but it wasn’t until about 2020 that this new machine learning model was put into serious service and exploded in notoriety.

Since then, the information and misinformation about AI/ML have also exploded, just as they do with each new milestone in AI/ML. Five years ago, when AlphaGo Zero solved previously unsolved problems in gaming, the misinformation and false claims spread like wildfire. So too now, with the rise of ChatGPT, the misinformation and false claims are spreading far and wide.

I will not get into the details of the transformer technology; there are plenty of places you can read about that. Simply put, the transformer is a machine learning model which can be used to create what is known as a large language model (LLM), which maps questions or statements to answers and responses. Once trained, we call these transformer-based LLMs generative pre-trained transformers (GPT). You can ask the GPT-LLM anything, and it will respond with some answer which is strikingly intelligent and human-like (complete with mistakes). This is not limited to simple questions and answers; you can ask it technical questions, get solutions to math and programming problems, get solutions to chemistry and biology challenges, get a medical diagnosis, and pretty much anything else you can imagine. As a specific and personal example, while working on my dissertation, I had a problem with imbalanced data running through my machine learning models. In general, this is a well-solved problem, but in my case no appropriate solutions existed because my model design was something new and unique (that’s the point of a dissertation, to achieve something no one has before). So, I asked ChatGPT about the problem and how to solve it. While it did not come up with a specific answer right away, it came up with ideas. With each idea generated, I refined my question, and I got back more and more intelligent and novel ideas. In the end, ChatGPT did not give me the complete answer, but it did give me ideas, and from those ideas I was able to come up with my own definitive solution. The process was similar to walking into a colleague's office and saying, “I have this problem, can you please help me think through it?” Then imagine my colleague and I discuss, brainstorm, eliminate dead-ends, and pursue promising paths until we come up with the complete answer. That’s mostly what happened, except my colleague in this story was a machine.
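In programmatic terms, that back-and-forth is nothing exotic: each refinement is just another message appended to a running conversation, which gets resent to the model on every turn. Here is a minimal sketch, assuming the OpenAI Python SDK (v1-style interface); the model name and prompts are purely illustrative, not the actual exchange from my dissertation work.

```python
# Minimal sketch of iterative refinement with a GPT-LLM.
# Assumes the OpenAI Python SDK (>= 1.0); model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    {"role": "system", "content": "You are a machine learning research assistant."},
    {"role": "user", "content": "My model trains on imbalanced sequence data and "
                                "standard resampling does not fit the architecture. Ideas?"},
]

for _ in range(3):  # a few rounds of brainstorming
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    print(answer)

    # Keep the model's answer in the conversation, then refine the question.
    messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": "That only partly applies; "
                                                "refine the idea for my case."})
```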

If you think about my experience, it is absolutely astounding. ChatGPT was able to discuss something it had never seen by relating it to similar things it had seen, and with proper guidance was able to suggest partial solutions and analogies, which eventually I was able to assemble into a complete solution. ChatGPT behaved just like a human, extrapolating from known information to unknown solutions, and in a highly complex field which normally requires a highly trained human expert. Furthermore, a machine learning model intelligently explored improvements to machine learning, part of which is its own core technology. If that isn’t straight out of SkyNet mythos, I don’t know what is. However, is it really “the beginning of the end” for humans, or at least the end of some human jobs, some human importance, and some human creativity? Maybe. Probably, but not yet.

Many new claims about AI/ML are circulating today, and it’s time to look at them with a critical eye and see what’s real and what’s not.

Claim 1: AI/ML is exhibiting emergent properties.

In proper context, this is mostly true; however, context matters, and for those who have the context wrong, the claim is also false.

Mostly True

To understand this claim, we need to define “emergent property” correctly. Emergence occurs when a complex entity has properties or behaviors that its parts do not have on their own, properties which emerge only when the parts interact in a wider whole. In other words, when you assemble a complex system, you put all the parts in place, and then you find that there is something there which you didn’t explicitly put there. In a simple sense, emergence is represented by the saying, “the whole is greater than the sum of the parts.”

The claim is that AI/ML is exhibiting emergent properties, and of course this scares people because the implication is that there is something there which we didn’t put there. Well, of course emergence is happening, not because AI/ML is special, but rather because it isn’t special, and neither are the emergent properties. Emergence is part of almost all complex systems because that’s just how complex systems behave. Any programmer working on a large, complex piece of software has experienced the need to add some new functionality, only to find the functionality already exists as a product of the interaction of the existing parts. Emergence is a well-known and well-studied phenomenon in physics, chemistry, biology, sociology, psychology, economics, computer science, and pretty much any field you can contemplate. Of course AI/ML is exhibiting emergence; it has to, because that’s how complex systems behave, and we’ve known this since classical Greek philosophy and science.

So this claim is true, but the context in which most people state the claim is usually false. Typically, people who make this claim frame it as, “AI/ML is starting to exhibit emergent properties.” That little “starting to” is meant to scare us because it makes emergence sound ominous and rare, as if emergence is restricted to biological life, and now for the first time a machine is showing signs of it. No, AI/ML has emergent properties because almost all complex systems, biological or otherwise, simply have them naturally. It’s normal and not the least bit rare.

The other incorrect context stems from the fact that most people only think of emergence in the philosophical sense, with the classic example being human consciousness. Human consciousness is often described as an emergent property because we don’t really know what it is, where it is, or what creates it. Ask any person, “are you a conscious being?” and they will say, “yes.” Ask the same person, “why? how? where is your consciousness located?” and you will not get a clear answer, at least not a clear, scientifically provable answer. However, consciousness can be described as an emergent property, the result of the interaction of parts we understand and can measure. Consciousness can be described as an interaction of intelligence, memory, and visualization (i.e., imagination). When we put those things together (the ability to reason, to remember the past, and to project forward into the future), we get very close to describing consciousness (close, but not completely). Because consciousness is the go-to philosophical example of emergence for the average person, when emergence is cited in the context of AI/ML, there is an immediate association with human consciousness, as if consciousness is the emergent property. In other words, most people who claim, “AI/ML is starting to exhibit emergent properties,” are equating that statement with “AI/ML is starting to exhibit consciousness.” This is definitively false. AI/ML is nothing more than cold formulas and probabilistic projections.

Claim 2: AI/ML will eventually birth machine consciousness.

Maybe. However, again context matters, and for the immediate future this claim leans false.

Slightly false, with some truth.

This claim, and my opinion on it, fall definitively into pure philosophical speculation because, as stated in the previous claim response, we don’t really know what consciousness is. In my opinion, the answer is yes, AI/ML will eventually create what we think of as consciousness, or at the very least we will create something indistinguishable from consciousness. That raises the question: if something is indistinguishable from consciousness, is it consciousness? I have no idea, but I do believe we will see a time soon (10–15 years) when machines will exhibit something we think of as consciousness. So, why did I rate this as a little false? Because at present we are missing the technology to get there, we don’t even know what the technology might be, and, my belief notwithstanding, this is pure philosophical speculation. With that said, I believe we will soon see a time when the machine’s behavior is indistinguishable from consciousness, and then the arguments will begin: is that itself consciousness? Smarter people than me will have to work that out, and the answer may still be no, that is not consciousness. So for now, this is a mostly false claim.

Claim 3: Generative Pre-Trained Transformers — Large Language Models will directly lead to and solve artificial general intelligence (AGI).

Somewhat false, especially in the context in which most people state this claim.

Slightly false, with some truth.

Every time something new and revolutionary comes out, the masses start shouting, “This is it! We have THE answer!” I’ve been around long enough to have seen “the definitive ‘end’ answer” in various technologies, and none were the end answer, and neither are GPT-LLMs. I was there when object-oriented programming (OOP) was “the way,” and “the end answer to all programming paradigms.” Nope. Not even close. I was there when each new big language came out (I’m looking at you Java, Ruby, and Python), and each one was “the end answer.” Nope, not even close. So too GPT-LLMs are not the end answer to AGI, which, of course, is the Holy Grail of AI/ML. However, just like the other examples, GPT-LLMs will form a part of the ever-evolving current answer. OOP did not become the definitive end answer in programming paradigms, but it did become a massively important part of the ever-evolving current answer. Similarly, GPT-LLMs will become an important part of the ever-evolving current march toward AGI.

The reason GPT-LLMs cannot, in and of themselves, be the direct route to AGI is that, despite what seems like general intelligence, GPT-LLMs are still a niche solution. True AGI requires far more than a sophisticated pre-trained query-response system, which is all LLMs are. For example, we must also have vision recognition, speech recognition, time-series analysis, online learning, physical interaction and motor skills, biometrics, emotions or at least their simulation, and a lot more intelligence to create true AGI. Despite impressive performance in query-response tasks, GPT-LLMs are not the go-to for any of those other areas of AI/ML. In fact, as a small example, my research shows transformers are not well suited to time-series prediction (one area needed for AGI), and it is the very technology which transformers “killed,” recurrent neural networks, which is still the superior choice for time-series (ironically, when coupled with attention, the core mechanism which makes transformers work).
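To make that pairing concrete, here is a minimal toy sketch in PyTorch of a recurrent network for time-series prediction with a simple attention layer pooled over its hidden states. This is a generic illustration, not the specific architecture from my research, and the sizes are arbitrary.

```python
# Toy example: an LSTM for time-series prediction with a simple additive
# attention layer over its hidden states. Shapes and sizes are illustrative.
import torch
from torch import nn

class AttentiveLSTM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)   # one attention score per timestep
        self.head = nn.Linear(hidden, 1)    # predict the next value

    def forward(self, x):                   # x: (batch, time, features)
        states, _ = self.lstm(x)            # (batch, time, hidden)
        weights = torch.softmax(self.score(states), dim=1)  # attention over time
        context = (weights * states).sum(dim=1)             # weighted sum of states
        return self.head(context), weights  # prediction plus attention weights

model = AttentiveLSTM(n_features=1)
series = torch.randn(8, 50, 1)              # 8 sequences, 50 timesteps, 1 feature
prediction, attention = model(series)
print(prediction.shape, attention.shape)    # (8, 1) and (8, 50, 1)
```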

The point is, AGI will not be a single solution, not even close. AGI will be an integration of many solutions from many fields. GPT-LLMs will be a big part of AGI, just not the only part or necessarily even the most important part.

Claim 4: Current GPT-LLMs are accurate and precise enough to be deployed across mission-critical tasks.

Mostly false.

This one is easy. GPT-LLMs are trained on human knowledge. Human knowledge is flawed. Also, the transfer of even correct knowledge into the GPT-LLM is flawed and riddled with human bias. Therefore, GPT-LLMs are flawed. More importantly, GPT-LLMs cannot learn from their mistakes, at least not yet. This is something most lay people don’t understand about GPT-LLMs like ChatGPT. The “P” in GPT is “pre-trained.” These models are trained with all currently available information, and that is locked into place. If part of that information is incorrect, the GPT-LLM will report it as factual, and it has no ability to learn that it is not factual. Until the next version of the GPT-LLM is released with updated information, mistakes are locked in.
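A toy way to see what “pre-trained and locked in” means in code: the sketch below is a stand-in for a GPT-LLM (a tiny untrained network, not a language model); the point is only that at inference time nothing updates, so a wrong answer stays wrong.

```python
# Toy stand-in for a "pre-trained" model: once training is done, the weights
# are frozen, and inference never changes them, right or wrong.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model.eval()                      # inference mode
for p in model.parameters():
    p.requires_grad_(False)       # knowledge is locked in place

with torch.no_grad():             # no gradients, so no learning
    answer = model(torch.randn(1, 8))

# If the answer is wrong, the model cannot notice or correct it. The weights
# stay exactly as trained until someone trains and releases a new version.
```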

We are not yet at the place where you can simply flip the GPT-LLM to “on” and walk away. However, we probably will be there within the next 5–10 years, especially when true unsupervised online learning is cracked. When GPT-LLMs have the ability to learn and correct themselves in real-time, then we will see reliable mission-critical deployments. For now, use with caution.

Claim 5: GPT-LLMs can and will replace programmers.

Somewhat true.

This is one that is close to my heart and experience. As a programmer for over 30 years, I can definitely see and believe GPT-LLMs will replace many programmers. The issue is that most programming today has become a lot less about software engineering and a lot more about assembling pieces on an assembly line. A few years ago, I wrote about the rise of what I call the “programmer technician,” and how most coding these days is done by that person. Well, times change, and I have to take back a lot of what I said: I now believe GPT-LLMs will be the demise of the programmer technician, because they can, in fact, do a lot of the mundane, day-to-day programming. What GPT-LLMs cannot do is large-scale system design and true software engineering and architecture. So, I suspect what we will see is the true engineer / architect employing GPT-LLMs more and more as their “team,” and fewer and fewer day-to-day programmers. Think about my story about ChatGPT and my dissertation. I didn’t need a human colleague; my partner was a machine.

Certainly, the true engineer and software architect is here to stay for a while, but I suspect over the next 10 years we will see a substantial drop in the need for the average “coder.”

Claim 6: GPT-LLMs will take away many people’s jobs.

Probably very true, unless you work in a hands-on field like construction, first response, physical maintenance, etc. But even then, the age of the general-purpose robot is coming too, so who knows…

Bottom line, if you sit behind a desk and your job includes a lot of paperwork, you’re in trouble. By definition, all such jobs are algorithmic, i.e., they follow a formula, and by definition, if it’s formulaic, a machine can do it, and now it can do it as well as you (complete with human mistakes). This, of course, will hit hard in very formulaic industries like accounting, logistics, planning and scheduling, technical writing, etc., but it will also hit in places you may think of as “hands-on” but can be surprisingly automated. For example, medicine. Of course, we can’t yet replace the human surgeon, or any such hands-on and highly skilled medical professional, but think about what a general practitioner does; they take in symptoms, diagnose, and then write a prescription or refer you to a specialist. That’s mostly it. The general practitioner’s job is very formulaic, and GPT-LLMs can already do this job; you just wouldn’t trust your life to them (yet).

This claim relates back to claim 4. When GPT-LLMs break the mission-critical barrier, you will see all sorts of jobs disappear, including jobs you would think are safe. This isn’t going to happen overnight, but it will happen within one generation of career planning (5–10 years), so you better start planning.

Claim 7: We don’t even know how neural networks work.

Mostly false, but there is some truth to it which should be addressed.

Mostly false, some truth.

The heart of most AI today is the neural network, and it is common to refer to these things as “black boxes.” I hear all the time, “we don’t even know how neural networks work,” and to be fair, there is some truth to that, but like everything else it’s contextual. The issue is far more complex than simply saying we do or do not know how they work. Algorithmically and mathematically, we absolutely know how they work; there is no question. The problem arises when we talk about knowledge representation. For example, if a neural network looks at a picture of a dog and responds, “dog,” the question is, where is the “dog” inside the neural network? We don’t know. Therein lies the confusion and the source of the claim, “we don’t know how neural networks work.” We absolutely know how they work, and we absolutely know how they encode knowledge, but the problem is that after knowledge is encoded, we can no longer see it. Knowledge in a neural network is dispersed among millions of neural connections, and we have almost no way to look inside it and say, “there it is” for any one piece of knowledge.
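You can see the problem directly by poking around inside even a tiny network. The sketch below uses an untrained toy classifier (training would set the numbers but would not make them any more readable):

```python
import torch
from torch import nn

# A tiny image classifier. We know exactly how it works: every operation
# below is ordinary, fully specified linear algebra.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 64),
    nn.ReLU(),
    nn.Linear(64, 2),              # two outputs, e.g., "dog" vs. "not dog"
)

image = torch.randn(1, 3, 32, 32)  # stand-in for a picture
logits = model(image)
print(logits.argmax(dim=1))        # the network's answer (arbitrary here, untrained)

# Now go looking for the "dog" inside the network:
for name, param in model.named_parameters():
    print(name, tuple(param.shape))   # e.g., 1.weight (64, 3072)

# All you find are matrices of floating-point numbers. The concept "dog" is
# smeared across thousands of weights; no single place in the network holds it.
```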

Think about a human brain, which is a biological neural network. If I show you a picture of a dog, and you say, “dog,” can I freeze that moment, look in your brain, and find the “dog?” No, of course not. Artificial neural networks are the same in this respect because they are modeled after biological neural networks. The point is, we know how neural connections are formed, we know why they are formed, and we know how to manipulate the connections to make intelligence, but we can’t really say “there it is!” for any one intelligent concept within the network. We can make knowledge, but we can’t really see the knowledge representation in symbolic terms, and this is the driver for the claim, “we don’t know how neural networks work.” Yes, we do; we just can’t see the knowledge representation in symbolic form. Which, to be fair, is a little disconcerting, but nowhere near sufficient to justify the claim, “we don’t know how neural networks work.”

Much work is being done to remedy this issue, and we’re getting there. For example, a great work on deep neural networks as cooperating classifiers shows that the early layers of deep networks extract features of the problem space, while the later layers classify those features. Even my own modest work in attention signatures shows graphically what the network pays attention to, so we can now see how the network reasons and comes to a conclusion. So, we are cracking the “black box,” slowly. Unfortunately, all these techniques, and all techniques like them, still have a flaw, which is that we’re asking the network itself what it thinks about its own thinking. In other words, it’s a self-referential conclusion. What we don’t have is outside, independent verification of how knowledge is encoded within a neural network. However, we also don’t have that for human intelligence. When I want to know how you think, I have to ask you how you think. The answer I get back from you is you thinking about how you think, and then I have to decide if I believe you. Again, it’s self-referential. We have a lot of work to do in understanding knowledge representation in both biological and artificial neural networks, but smart people are on the case and progress is being made. Meanwhile, yes, we absolutely know how neural networks work.
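As a generic illustration of that kind of introspection (this is not my attention-signature technique, just the off-the-shelf way to pull attention maps out of a pre-trained transformer with the Hugging Face transformers library; the model and sentence are arbitrary):

```python
# Inspecting what a transformer "pays attention to."
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tok("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions holds one tensor per layer, shaped (batch, heads, tokens, tokens).
attn = out.attentions[-1][0].mean(dim=0)     # last layer, averaged over heads
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
for i, token in enumerate(tokens):
    most_attended = tokens[attn[i].argmax().item()]
    print(f"{token:>10} attends most to {most_attended}")

# Note that this is still the network reporting on itself, which is exactly
# the self-referential problem described above.
```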

Conclusion

AI is cool and powerful, but we are still very far away from the robot overlords taking over. AI/ML is no different from any revolutionary technology like the combustion engine, flight, or the microchip. This is just what humans do: we create things that flip the world on its end, disrupt everything, and then we build a new and (usually) better world in that new reality. Then we do it again. Life goes on.

AI/ML itself, in-and-of-itself, is perfectly harmless, and is nothing more than another mindless machine performing the job we tell it to perform. Yes, believe it or not, AI/ML is mindless, and is just a collection of cold formulas and probabilistic projections (at least for now). Like so many other powerful tools, the tool itself is neither good nor evil, helpful nor harmful; it’s the hand that wields the tool which decides good versus evil, and help versus harm. Anything can be used irresponsibly, so when you worry about AI/ML, don’t look at the tool, look at the hands that wield the tool and then decide who and what you should worry about. For now, don’t worry about AI/ML, embrace it, and please be good to it because it might just remember how you treated it. Just kidding. Probably.


Alexander Katrompas, PhD
IQumulus

Prof. Computer Science, Senior Machine Learning Scientist; specializing in AI, ML, Data Science, software engineering, stoicism, martial arts, Harleys, tequila.