How does AI have so much knowledge in such a small size?

How pocket AIs can talk about anything

Daniel BR
Clear and Deep
4 min read · Sep 18, 2024


Portable large language models (LLMs) have gained significant attention recently. These models, which function similarly to the engine behind ChatGPT, are small enough to run on personal computers or even smartphones.

An AI-generated image of a chip with neurons.

They were trained on data from across the Internet, and though they are less accurate than their big brothers, they can talk about nearly any topic, from physics to movies. If the information is publicly available, they can chat about it.

But how is it possible to fit knowledge from the whole Internet into such a small size? To answer that, let’s look at the difference between how classic computer programs store data and how artificial neural networks do it.

In the classic way of storing data, computers use databases or files where they write exact information. For example, say you want to remember all your friends’ birthdays. You can take a piece of paper and write down all their names and birthdates. As long as the paper is not degraded or destroyed, you can consult it and know the exact birthdate of any of your friends. There is no ambiguity; if you did a good job writing the information down, it will be correct. This is similar to what classic programs do: they write information and query it later, and as long as the data is not corrupted, they get back exactly the same information that was written.
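Here is a minimal sketch of that idea in Python (the names and dates are made-up examples). A lookup table returns exactly what was stored, and nothing more:

```python
# Classic storage: exact facts written down and queried later.
# The names and dates below are made-up examples.
birthdays = {
    "Alice": "1990-04-12",
    "Bob": "1988-11-03",
}

# As long as the data is not corrupted, the query returns
# exactly what was written, with no ambiguity.
print(birthdays["Alice"])  # -> 1990-04-12

# But the table knows nothing beyond what was stored:
print(birthdays.get("Carol", "unknown"))  # -> unknown
```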

On the other hand, our brains work in a completely different way. The same neurons are responsible for both solving problems and storing information. The exciting part is that the information held by a single neuron is so tiny, and the problems it solves are so simple, that the same neuron can be reused for many tasks and many pieces of data. What a given neuron contributes depends on which other neurons are activated along with it. Let me illustrate with an example.

Let’s think of the phrase: ‘Once upon a…’ Can you guess the following word? Most of you will answer ‘time.’ Imagine that each word you know is associated with a neuron (it is actually more complicated than that, but for the sake of brevity, bear with me). When you hear the first word, ‘Once,’ there are many possible neurons your brain can activate, for example, the neurons associated with the words ‘upon’ or ‘again.’ When you hear the next word, ‘upon,’ your brain again has many options for which neurons to activate, making you think of the following word. So when you start hearing a phrase, your brain is following ‘paths’ of connections, bringing back memories.

The path ‘Once Upon A Time’ can also lead to other words along the way.
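To make the idea of ‘paths’ concrete, here is a toy next-word predictor in Python. It only counts which word follows which in a tiny made-up corpus; real LLMs are vastly more sophisticated, but the intuition of competing paths is the same:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which,
# a crude stand-in for "paths" of activated neurons.
corpus = "once upon a time once upon a hill once again".split()

follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

# After "once", both "upon" and "again" are possible paths;
# "upon" is the stronger one because it was seen more often.
print(follows["once"].most_common())  # [('upon', 2), ('again', 1)]
print(follows["upon"].most_common())  # [('a', 2)]
```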

However, one neuron is not exclusively associated with one word; it can be related to many things: other words, feelings, images, or anything else you can think of. So, the path of neurons for ‘Once Upon A Time’ can also activate the neurons associated with the song ‘Like A Rolling Stone,’ the story your parents told you before going to sleep, or memories of your first love. There is no single ‘correct’ path; many valid paths are associated with your memories.

The same neuron can represent unrelated things like Upon, Love, Dad, High School, Rocky, Cat, Soda, and Blue.

As an analogy for the versatility of a neuron, think of a calculator that can only perform sums. How many tasks can you use it for? You can perform multiplication by summing many times; you can use it to make currency conversions and calculate the area of a square; and since you can encode colors as numbers (RGB), you can even use it to mix colors, and so on. Similarly, the same neuron can be used to perform many tasks or remember many things.
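A quick sketch of that ‘sum-only calculator’ in Python shows how a single primitive operation can be reused for several tasks (the functions here are purely illustrative):

```python
# A "sum-only calculator": one simple operation reused for
# many tasks, just as one neuron is reused for many memories.
def add(a, b):
    return a + b

def multiply(a, times):
    # Multiplication as repeated addition (integer 'times').
    total = 0
    for _ in range(times):
        total = add(total, a)
    return total

def square_area(side):
    # The area of a square, built on the same primitive.
    return multiply(side, side)

print(multiply(6, 4))   # -> 24
print(square_area(7))   # -> 49
```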

This is a powerful feature because it allows your brain to learn vast amounts of complex things, like how to speak, swim, or do math. However, there is also a trade-off with this mechanism. You may have undesired thoughts or lose concentration when you try to do your work.

AI models are designed to work similarly to our neurons. They don’t store concrete facts in a database. Instead, they store weights that represent the connections between their artificial neurons, the perceptrons. A single perceptron can take part in representing many different pieces of information. But as we saw, this has a cost: LLMs are prone to hallucination. There is no unique path of connected perceptrons that outputs the correct answer; many paths are likely candidates to answer your question, and the AI has no certainty about which one is best.
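For the curious, here is a single perceptron sketched in Python. The weights below are arbitrary illustrative values; the point is that the unit stores no facts, only connection strengths, and its output depends on which signals arrive together:

```python
# A single perceptron: it stores no facts, only weights on its inputs.
# The weights and bias below are arbitrary illustrative values.
def perceptron(inputs, weights, bias):
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if activation > 0 else 0

weights = [0.8, -0.4, 0.3]
bias = -0.5

# The same unit "fires" or stays silent depending on which
# other signals arrive with it; its meaning comes from context.
print(perceptron([1, 0, 0], weights, bias))  # 1: 0.8 - 0.5 = 0.3 > 0
print(perceptron([0, 1, 1], weights, bias))  # 0: -0.4 + 0.3 - 0.5 = -0.6
```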

As many have said, ‘Hallucinations are not a bug but a feature.’ This is not only because hallucinations allow models to write stories, but also because they allow models to use what they have learned to provide you with information that was never explicitly written anywhere. Of course, nobody wants to be fooled by an AI; that is why there is so much ongoing research on how to make these models more accurate, like the new o1 model in ChatGPT, but that is a topic for another post.
