A Short Concise Primer On Generative AI for People Who Are Too Busy To Deal With All This. Part 2

Vlad G
7 min read · Jan 19, 2024

--

Education continues! Intelligence goes on! Jokes don’t stop! I need to get a life or something.
NOTE: if you didn’t read the first part — it’s here: Part 1

In the previous post, I talked about AI models and how they're trained by driving through a bazillion roads. Let's dial in a little closer. When a language model "learns," it looks through the texts supplied to it and builds a sort of map of which words are more likely to appear after which other words. Of course, it's way more complicated than this explanation, but for the sake of this conversation, it's good enough. Remember, simple is the enemy of stupid.
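
If you want to see that "map" idea with your own eyes, here's a toy sketch. It just counts which word follows which in a tiny made-up corpus — real models learn vastly richer patterns with billions of numeric weights, but the core intuition is similar.

```python
from collections import Counter, defaultdict

# Toy "map" of which word tends to follow which.
# The corpus is made up; real models train on millions of texts.
corpus = "the cat sat on the mat because the cat was tired"
words = corpus.split()

next_word_counts = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    next_word_counts[current][nxt] += 1

# After "the", which word is most likely?
print(next_word_counts["the"].most_common(1))  # [('cat', 2)]
```

In this tiny world, "cat" follows "the" twice and "mat" only once, so the model's best guess after "the" is "cat." Scale that up a few billion times and you get the gist.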

This happens when you train a language model — you give it a few million texts to read, from simple Dr. Seuss-level stuff to Kafka, James Joyce, and Lenin. The model is then trained very much like a child. Most of us don't even remember how we learned to read and write, especially at school — all those exercises on "complete the word," "finish the sentence," or "insert the missing word" — but that's exactly what happens to the model next. It looks for and learns patterns: sentence structures, how words are commonly used together, and the use of words in context. As it learns, it tries to form sentences by guessing the next best word and is corrected when it guesses wrong — all the way until it guesses right most of the time. Just don't ask what you'll be doing on Friday night — it'll take until Monday morning to tell you you'll be watching Netflix at home alone.
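
Here's a cartoon of that "guess, get corrected, improve" loop. The guessing rule and the toy sentences are made up for illustration; a real model adjusts millions of numeric weights rather than a count table, but the feedback cycle is the same shape.

```python
from collections import Counter, defaultdict

# A cartoon of "guess the next word, get corrected" training.
counts = defaultdict(Counter)
sentences = ["the cat sat on the mat", "the dog sat on the rug"] * 50

def guess(prev):
    # Guess the follower seen most often so far; "???" if never seen.
    return counts[prev].most_common(1)[0][0] if counts[prev] else "???"

correct = total = 0
for sentence in sentences:
    ws = sentence.split()
    for prev, actual in zip(ws, ws[1:]):
        if guess(prev) == actual:
            correct += 1           # the model guessed right
        counts[prev][actual] += 1  # the "correction": learn the real answer
        total += 1

print(f"guessed right {correct} times out of {total}")
```

Early on it guesses wrong constantly; after enough corrections, easy patterns like "sat → on" become automatic, while genuinely ambiguous spots (what follows "the"?) stay hit-or-miss. That's the Friday-night problem in miniature.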

Obviously, if you overload the model’s training with poetry, it won’t be any good at answering questions on car mechanics or quantum physics. If most of your training materials are skewed one way or another, the model trained on those materials will also be skewed. This is not necessarily a bad thing — sometimes, you need to limit the model’s scope to a specific knowledge domain. For example — you want to restrict the model to knowledge of laws and regulations and avoid external bias of people who misinterpret or misunderstand the laws — which happens quite often when you ask for legal advice on social media. By supplying only validated and verified information for training purposes, you can have a narrowly tailored model that would be very good for answering questions on a very specific topic. Yeah, it won’t rewrite your meeting minutes as haiku, but it will be able to concisely answer questions on traffic law. An additional benefit is that training these use-case-specific models costs way less than training a general-purpose model like ChatGPT or Claude. Still more expensive than your average taco from the truck, though. Side note: Ask your Shakespeare LLM for a grilled cheese sandwich recipe. “Thou shalt take the Holy Bread of the Wheat Fields, two slices in number, and the number of slices should be two.”

As I have said before, this explanation is seriously simplified, but it underlines an important point: the AI model is trained very similarly to us humans. It's very different from what we've come to accept for decades — that a computer needs to keep "all the information in its databases" to operate. Generative AI changes this paradigm. There is no longer actual data stored — what's "stored" in the computer's memory are the learnings, not the information itself. The AI model cannot recite the whole "Hitchhiker's Guide to the Galaxy" on request. Don't panic.

Here's another thing — you know how you like to quote your favorite movie, but you don't really remember all of its dialog (unless it's Rocky Horror Picture Show, Predator, or Princess Bride, of course). The model can do the same, given how popular some quotes are. Another point is tone and voice. If you have the complete collected works of Marx & Engels (39 volumes), Vladimir Lenin (55 volumes), and Joseph Stalin (13 volumes), you can use them to train your very personal Communist Comrade LLM for whatever purposes you might want. That's a joke. Please, don't do it.

Guess what? A similar story happens when you need to train an AI model to generate images. To do this training, you need to prepare the images and label them properly so the model can tell whether this is a photo of the Cassiopeia or Pegasus constellations or "The Starry Night" by Van Gogh. Do you know where you can find well-labeled images? In the museum! You are taking the AI model to a museum so that it can learn from other pictures. Do you know who else goes to museums? People go to museums. Art students, specifically, go to art museums to learn about… wait for it… art! The AI model, similar to humans, learns to "see" art by noticing the patterns and intricacies of the images. If you feed the model pictures of waifus exclusively, all it will be able to generate is waifus. Don't google it. Seriously, don't! You don't need that in your browser's history.
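
What does a "well-labeled" training set actually look like? Roughly like this sketch — pairs of image and caption. The filenames and captions here are made up stand-ins (no pixels involved); the point is just that every picture comes with a text description the model can learn from.

```python
# A cartoon of a labeled training set for a text-to-image model.
# Each "image" (a filename standing in for actual pixels) is paired
# with a caption describing what's in it. All entries are invented.
dataset = [
    ("starry_night.png", "swirling night sky over a village, Van Gogh style"),
    ("cassiopeia.png", "long-exposure photo of the Cassiopeia constellation"),
    ("sunset_beach.png", "orange sunset over a calm sea"),
]

# Training pairs captions with visual features; here we can only
# illustrate the lookup that the labels make possible.
captions = {image: text for image, text in dataset}
print(captions["starry_night.png"])
```

The real training step compares each caption against the visual elements in its image, over and over, until "sunset" reliably lines up with that giant cosmic lightbulb.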

Speaking of art and art students, do you know what Caravaggio's students did when they learned from him? They tried to copy his style. Same with any other artist, visual or otherwise. Students learn by copying the style of the master, eventually moving from one master to another to adopt a variety of styles. In other words, "learn the rules so you know what you will be breaking." It's pretty much what the text-to-image AI model does when it learns. During training, the AI model analyzes how textual descriptions correspond to visual elements in the image. Of course, it's easier with things like "sunset," where AI can easily spot a giant cosmic lightbulb sneaking out like it's a bad date. It's a little harder when you hit the abstract art department and you're trying to convince a box of wires that what looks like a toddler's fruit salad puke is called Kandinsky and is actually a pinnacle of human creativity.

Again, the actual paintings are not stored in "memory databanks," "databases," "secret information storage facilities," or other sneaky places anti-AI Luddites would have you believe. The learnings are. In the same way that a decent enough photographer can mimic a few photography styles, given enough experience and practice, generative AI takes this ability to eleven. Because it can learn and experiment thousands of times faster, it can adopt a myriad of styles as quickly as a human adopts one. All that matters is how good the learning materials are. Because one day, you may ask your Generative AI companion to create an abstract painting for your New York City apartment, get this magical interweaving of lines and colors, and only discover a few days later that it's a 2019 Brooklyn bus map.

Now, it's really easy to realize that when you're asking your favorite AI — be it Midjourney, Stable Diffusion, or DALL-E — to give you a picture of a woman with two octopi and a bunch of green hills behind her, Hokusai style, the AI doesn't actually go check whether there's a picture out there it can redraw. It uses its learnings of what that Hokusai style is in terms of visual elements, what an octopus looks like in general and in that style, what people in general look like, and, specifically, how Hokusai depicted women in his works. Based on all that analysis and data, the Generative AI generates an image. "I'm an artist; it's how I see it." Depending on how creative you want the AI to get, your request must be more or less detailed. If you really know what you want, you have to be very specific and clear; if you want the artificial genius to go wild, you can go wild, too. There's a fine line with each AI where your instructions do more harm than good. Some generative models require sophisticated language to extract concepts precisely as you want them. Sometimes, you ask for a spiritual image similar to the Sistine Chapel painting of God creating Adam, and you get two gym bros lying on the floor high-fiving each other after an intensive leg day.
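
To make the "specific vs. go wild" trade-off concrete, here's a hypothetical prompt builder. The function and its fields are invented for illustration — no image generator's API looks exactly like this — but it shows how each detail you pin down leaves the model less room to improvise.

```python
# Hypothetical prompt builder: the more fields you fill in, the less
# the model has to guess. Field names are made up for illustration.
def build_prompt(subject, style=None, details=None):
    parts = [subject]
    if details:
        parts.append(", ".join(details))
    if style:
        parts.append(f"in the style of {style}")
    return ", ".join(parts)

vague = build_prompt("a woman")
specific = build_prompt(
    "a woman with two octopi",
    style="Hokusai",
    details=["green hills behind her", "woodblock print texture"],
)
print(vague)     # a woman
print(specific)
```

The vague prompt invites the model to go wild; the specific one reads "a woman with two octopi, green hills behind her, woodblock print texture, in the style of Hokusai" and pins it down — right up until you over-specify and cross that fine line where instructions do more harm than good.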

The hardest part so far has been reasoning. The news may tell you that LLMs can reason, but just like most of what the media tells you, it's not exactly true. Your AI is a giant electronic toddler that consumes megawatts of electricity and gigabytes of data instead of Oreos and ginger ale. It just got a library card to the whole internet, and it eats texts off the web like Pringles. It doesn't keep a copy of any of it, and it is a big dreamer — so much so that a 2023 word of the year was "hallucinate," which is what happens to generative AI when it ODs on information. Keep that in mind.

--