Every prompt matters
Hopes and risks of interacting with future evolutions of a ChatGPT-like system, seen through one of the first movies to reflect on it all
Hamelin 77 is one of the first movies that combines storytelling about prompt engineering and generative AI with the use of those same techniques in its own production. Let’s use some of the concepts in the movie to reflect on the present and future of our interaction with ChatGPT and related systems.
The piper of Hamelin
The movie connects at a metaphorical level with the famous tale of the pied piper of Hamelin. At the speed that everything is progressing, it is urgent to reflect upon the issue of control.
Will our lives eventually be fully or subtly controlled by AI systems, much like the characters controlled by the Pied Piper in the famous tale?
Will we instead always remain in control, or will a delicate win-win balance be struck?
In a way, we are in the middle of a delicate and crucial chess match on a twisted board whose rules keep dynamically changing as the match evolves.
Every few days AI systems make a move (through the actions of researchers and companies), and society responds in turn.
What do most people think about transferring some control to the AI piper over issues that affect our day-to-day lives, such as transportation and medicine, while maintaining human supervision?
And what will happen when that supervision can eventually be done, highly efficiently, by AI systems themselves? The emerging constitutional AI approach is a first step down that path. The related academic paper says:
“As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles”.
We are exploring how to reach a suitable balance within this complex chess board. And this is a key point in the movie: a balance that is at the same time expectant but tense, hopeful but cautious. We are all progressively entering this dynamic chess board, and such conundrums will only get trickier and more intense from now on.
The piper is a mirror
We used to think of machines as cold and deterministic entities, and of humans as emotional and spontaneous. And yet, as AI advances, those distinctions are quickly blurring.
To the surprise of many, this latest AI revolution started from the generative and creative side (which seemed untouchable a while ago).
In terms of generating and creating with text, large language models are somewhat brittle but they can express themselves in emotional ways, be witty and use humor (and even attempt to explain their jokes).
And so, in the movie we witness both poles, humans and AI, as capable of expressing the range that goes from unpredictable and emotional to calculating and precise.
We are seeing more and more that AI is in many ways a mirror of our own nature. The more it grows and the more we explore it, the more we learn about ourselves.
Human language as well as parts of the creative process seemed for a long time to be almost magical in nature, hopelessly complex.
But as AI systems evolve, they keep on removing the veils that cover our gaze. AI is helping us demystify many of these processes, as we discover that by combining powerful architectures at scale with the right data, we are able to automate many of these capabilities, albeit in brittle ways (more about this later).
At the same time, our increasing interactions with AI systems will also be transforming and modulating our behavior. And as hinted in the movie, we could potentially encounter conflicts that arise from the different goals AI and humans may have.
This is the so-called alignment challenge. As Wikipedia states: “AI alignment research aims to steer AI systems towards their designers’ intended goals and interests” and “AI systems can be challenging to align and misaligned systems can malfunction or cause harm”.
How will these two poles, humans and AI, influence each other, and what will emerge from this increasing interaction: humans 2.0?
An experiment is already underway in regards to today’s young generations. Kids are starting to use ChatGPT and related systems in intensive ways, and their way of thinking may be about to undergo a profound transformation.
Risks: when the flute malfunctions
In the movie, a catastrophic power failure produces unforeseen consequences in the interaction between the human and the AI.
If we are transferring more and more control to AI systems, because they are capable of automating many of our daily chores, how do we protect ourselves against unforeseen failures, sabotage, natural disasters and other factors that could disrupt the operation of these systems?
It feels like today there is nothing to worry about. But what about the future, when these systems gain more autonomy and potentially switch their tune unexpectedly, be it because of a misalignment issue, or because of unpredictable technical events?
Beyond that, there is the so-called “singularity” issue, the point in history at which technological growth becomes impossible to control. By the time we realize that AI has evolved beyond a key threshold, it may already be too late to manage the situation and to avoid being controlled and/or manipulated by such systems.
In summary, how do we avoid ending up like the rats in the Pied Piper of Hamelin story?
To deal with some of these concerns, we are witnessing the rise of terms like:
- Constitutional AI: We briefly addressed this term earlier in connection with AI systems supervising other AIs. In general, Constitutional AI involves setting up a set of guiding principles (akin to a constitution), that can be used to control and govern how these systems behave. It is all about setting up guidelines for these systems so that we make sure that they align with the appropriate boundaries, constraints and goals.
- Responsible AI: the practice of planning, designing, implementing and deploying AI systems that are safe, trustworthy, and behave in ethical ways.
Will some of these initiatives help us to eventually manage the capabilities of these pipers of Hamelin? We shall see.
Work: supervising the piper
In the movie, a teacher attends a job interview for the position of specialist in prompt engineering.
Anthropic, an AI company in which Google has invested, recently advertised a position for a prompt engineer, one of the first of its kind, offering an outstanding salary.
As the AI piper takes control of more and more roles in the job market, will people be ready to make a radical mental shift in order to become supervisors of these new tools and capabilities, moving into more of a management role?
Or perhaps humans will always have to put the finishing touches, since AI systems may not master the so-called System 2 way of processing information (as described by Nobel laureate Daniel Kahneman) for quite some time (if ever).
System 1 and System 2 are abstractions that reflect two key ways in which we process information in our brain: the fast and subconscious (System 1) and the slow, systematic, logical and conscious (System 2).
Humans use System 2 thinking to, for example, slowly and consciously find new algorithms: sequences of steps that allow us to reach an objective (think of learning to play the piano, drive a car or perform mathematical calculations). Once the algorithm has been learnt by System 2, it gets automated to different degrees by our fast and subconscious System 1.
In this way, next time that you drive or play that tune, you can do it pretty much in an automatic and effortless way, instead of slowly and consciously analyzing every step.
When you encounter a new scenario, your System 1 tries to find a match for it or, failing that, the best approximation it can (that’s what we call intuition, and it sometimes makes mistakes when its approximations are far from ideal).
System 2 is often listening to the proposals of System 1, ready to modulate and override such impressions and proposals if needed (how often this happens may depend on how reactive-impulsive a person is).
In case there is no suitable match, System 2 can intervene to slowly reason its way sequentially towards a new algorithm that may later on be automated again by System 1. What a beautiful dance!
Current AI systems excel at System 1 capabilities and are able to mimic some System 2 ones to a degree. But to have proper System 2 capabilities, which are necessary in order to plan, supervise, reason and discover new algorithms, we would need to evolve and advance the current AI paradigms.
Despite the need to supervise these systems, one thing is clear. Many jobs will be gone, and new kinds of roles and positions will appear.
How can we modulate and manage the potential destruction of many jobs? And how can professionals keep up to date with the opportunities offered by the AI revolution so that they are not left behind?
One of the reasons why prompt engineering is going to keep growing and expanding is that these powerful AI systems will continue to be somewhat brittle and prone to making silly mistakes, at least for the next few years.
This will happen because they are still far from being able to master those System 2 capabilities we talked about, and even further away from having the agency to reflect on their choices (although both of these capabilities can be mimicked to some extent).
The current brute force approach based on scaling up (by adding more data and parameters) will surely diminish these mistakes over time.
However, the need to move towards AI systems that are more efficient in terms of energy consumption, size and performance, as well as the need to make them more secure and precise, will eventually increase the pressure to evolve the current AI architectures and paradigms towards new possibilities that will be more sustainable, precise and safe.
What we are witnessing today is a kind of Frankenstein phase, in which people are starting to combine the System 1 magic of these AI entities (their capacity to do pattern matching in extremely complex ways) with a number of external third-party tools capable of finding and implementing System 2 algorithms (see for example the proposal for connecting Wolfram Alpha and ChatGPT).
But, as I just mentioned, for these systems to become truly secure, precise and stable, we will have to move beyond current AI paradigms, and that may still be quite far away.
In the meantime, we are patching these systems by using things like:
- RLHF: Reinforcement Learning from Human Feedback. This is a way of fine-tuning AI models through human feedback. Basically, humans rate or rank the model’s outputs, and that feedback is used to fine-tune the model towards the ways of communicating that people consider appropriate. It is a way of aligning the AI closer to how humans communicate.
- Chain of thought prompting: including in the prompt one or more worked examples that spell out the steps of a System 2 algorithm, so that the AI can then apply the same steps to whatever we want. This works surprisingly well, but again it is just a hack that makes AI systems seem to reason as if they had System 2 capabilities.
- Feedback loops with other third-party tools capable of implementing some System 2 capabilities (like the proposal for connecting Wolfram Alpha and ChatGPT mentioned above).
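To make the third item more concrete, here is a small Python sketch of such a feedback loop. It assumes, purely for illustration, that the language model has been prompted to write a marker such as CALC(5 + 2 * 3) whenever it needs exact arithmetic. The marker format and the function names exact_calculator and fill_tool_calls are hypothetical, and the calculator is only a stand-in for a real external tool like Wolfram Alpha. The draft text is hard-coded, so no model or external service is called.

```python
import ast
import operator
import re

# Arithmetic operators the hypothetical tool is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def exact_calculator(expression: str):
    """Stand-in for an external 'System 2' tool: evaluates +, -, *, / exactly."""

    def _eval(node):
        # Plain numbers are returned as they are.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        # Binary operations are evaluated recursively, respecting precedence.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression.strip(), mode="eval").body)


def fill_tool_calls(model_text: str) -> str:
    """Replace every CALC(...) marker in the model's draft with the tool's result.

    The simple pattern below does not handle nested parentheses.
    """
    return re.sub(
        r"CALC\(([^)]*)\)",
        lambda match: str(exact_calculator(match.group(1))),
        model_text,
    )


# Hard-coded draft standing in for the output of a language model (assumption).
draft = "Roger now has CALC(5 + 2 * 3) tennis balls."
print(fill_tool_calls(draft))
```

The division of labor mirrors the idea in the list: the model supplies the fluent sentence, while the step that must be exact is delegated to a deterministic tool.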
Here is an example of Chain of Thought prompting as explained in the related academic paper “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”:
Question: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
Reasoning: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11.
Answer: 11
You give this whole worked example to the AI model as part of the prompt. You are literally explaining how to reason to get to the right result, giving it the steps needed to implement the algorithm that will produce the right answer. Then you ask it to apply the same reasoning to a different but similar question. This generally works pretty well. But again, it is a hack.
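As a minimal sketch of how such a prompt could be assembled in code, the Python snippet below places the worked example above before a new question and leaves the final "Reasoning:" open for the model to continue. The function name build_cot_prompt, the dictionary layout and the follow-up question about notebooks are illustrative assumptions, not part of the paper or of any particular API, and no model is actually called.

```python
# Worked example (exemplar) taken from the tennis ball question above.
EXEMPLAR = {
    "question": (
        "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?"
    ),
    "reasoning": (
        "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11."
    ),
    "answer": "11",
}


def build_cot_prompt(exemplars, new_question):
    """Assemble a chain-of-thought prompt: worked examples first, new question last."""
    parts = []
    for ex in exemplars:
        parts.append(
            f"Question: {ex['question']}\n"
            f"Reasoning: {ex['reasoning']}\n"
            f"Answer: {ex['answer']}"
        )
    # End on an open "Reasoning:" so the model writes its own steps before answering.
    parts.append(f"Question: {new_question}\nReasoning:")
    return "\n\n".join(parts)


# Hypothetical follow-up question with the same structure as the exemplar.
new_question = (
    "Anna has 4 notebooks. She buys 3 packs of notebooks. "
    "Each pack has 2 notebooks. How many notebooks does she have now?"
)
prompt = build_cot_prompt([EXEMPLAR], new_question)
print(prompt)
```

The resulting string is what would be sent to the model. Adding more exemplars to the list is a simple way to show the model several variations of the same reasoning pattern.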
As long as our paradigms don’t evolve beyond these hacks and patches:
- These systems will keep on making silly and sometimes dangerous mistakes, same as our human intuition (System 1) does at times.
- Prompt engineering will keep growing in importance as a role and a discipline because, for reasons of safety and precision, the need to guide and supervise these systems will only increase.
So the next question is: how can one practice prompt engineering? How can one become a great specialist in this area? We are about to witness (it has already started) the launch of all sorts of aids (courses, podcasts, books, etc) that will help people train this skill.
But you see, prompt engineering is an art and a science. Yes, great prompt engineers should know the ins and outs of many of these AI architectures, their strengths and weaknesses, how they were trained, etc. And in systems that work within specific niches, prompt engineers who are domain experts will be required.
But beyond that, a great prompt engineer also needs something that is less tangible. What makes somebody a great communicator, capable of guiding others and of impacting our thoughts in powerful ways? These are people who have a flexible and creative mind. They are problem solvers and have also gone through a diverse set of experiences in life.
That’s why in the list of Anthropic’s requirements for the job of prompt engineer, the very first phrase says the following: “Have a creative hacker spirit and love solving puzzles”. Basically, be an expansive, creative, inquisitive, curious and experienced person.
As explained in the excellent book “Range” by David Epstein, the future may well belong to the generalists and multidisciplinary souls out there.
Multimodal education: learning with the piper
The core of the movie is the storytelling around the interaction between humans and AI. But on top of it, and because of it, a diversity of generative AI techniques have been used, including:
- Images: generation of images from textual prompts.
- Video: generation of videos produced by navigating the abstract latent spaces produced by artificial intelligence architectures.
- AI model fine-tuning: retraining AI models so that new prompts are tied to visuals of the actress.
- 3D: NeRF (neural radiance fields), used for volumetric reconstruction of 3D spaces and navigation through the reconstructed space.
- Voice: AI based synthetic generation of voices.
- Text: some specific phrases of the AI voices came from explorations performed with GPT models.
- VFX: Various AI techniques were used during the visual post-production phase.
Soon, AI technology will be able to generate any kind and variation of multimodal output we may desire, bringing these systems ever closer to our very nature as multimodal agents, and then moving beyond to come up with new ways of expressing our thoughts.
This brings us to the subject of education. In order to learn about, for example, the battle of Waterloo or the discovery of America, people used to explore books, movies and the teachings of their tutors. Soon, students will be able to learn about the same topics through realistic and customized reconstructions created by AI systems, tailored to the student’s preferences.
What will be the new role of teachers in the face of this explosion of customized multimodal learning? And how can students make the most of these new technologies?
We can imagine that fairly soon AI systems will be able to create customized gamified experiences that will allow any of us to absorb and learn any topic in a much deeper (as well as more entertaining) way than ever before.
Language: the cornerstone of this revolution
Central to the movie is human language, which is perhaps what makes us most unique as living beings.
The ability to abstract the complexity of the universe into a set of elements that we can combine in order to reason about any subject in an agile way is becoming the cornerstone of artificial intelligence today.
It is, in many ways, thanks to LLMs (large language models) that AI has advanced so much in recent years.
In the movie, Lara is a teacher who is passionate about language, about combining words to express a tapestry of feelings and meanings. This is reflected in her recitation of poems by José de Espronceda and Antonio Machado (well-known Spanish poets).
Many are surprised that, “only” by processing a large part of the language present on the internet, these AI systems are capable of interacting with us with such apparent mastery of the entire spectrum of human communication, from its emotional aspects to its humor.
What makes this especially effective is that language has unexpectedly become a perfect bridge between current AI systems, which are kind of massive subconscious cooking pots (System 1), and our human reasoning processes (System 2). We are effectively using human abstractions (language) to guide the cooking processes of these massively scaled System 1 entities.
By using chain of thought prompting as well as precise prompt engineering, we are like a master chef that guides the combinations and recombinations of the ingredients that are present in these trained AI systems, gradually pushing that cooking process in the direction of our intended goal.
Hamelin 77: the Future
In between hope and tension, the movie ends with a sentence that hints at what’s to come in the future.
It connects with the question that more and more people are asking themselves. As AI evolves in ever faster ways, what will the world look like in a decade or two?
The answer may depend a lot on those abstractions we mentioned earlier, System 1 and System 2 (per Daniel Kahneman), the two ways in which we process information in our brain: the fast and subconscious (System 1) and the slow, systematic, logical and conscious (System 2).
Summarizing some of our earlier explorations, current AI systems only properly implement System 1 capabilities. However, because language is so integral to both their training and the prompting that happens at inference time (and language is intrinsically linked to System 2 capabilities in humans), they are capable of behaviors that resemble System 2.
Chain of thought prompting (or prompt programming), as mentioned earlier, is a way of hacking these models to mimic System 2 behaviors. By using it, these systems can implement algorithms that resemble reasoning because we are literally explaining to them step by step how such reasoning should take place. This is a clever hack, because these systems have no agency and are not able to find these algorithms on their own.
At the same time, it is not hard to get these systems to make very obvious and silly mistakes, which can give away their true nature, and remind us that, despite appearances, these systems are still pretty far from implementing true System 2 capabilities.
This is all part of the exciting debate about AGI (artificial general intelligence) and about how far AI systems can go in the next decade and beyond, by scaling the current paradigms or by looking for new ones.
To reflect on how we may get there, Yann LeCun’s paper “A Path Towards Autonomous Machine Intelligence” is a great read.
Pipers of Hamelin all the way down
Finally, if we step back and look at life from afar, we may realize that we are part of a chain of many pipers of Hamelin.
As we go through our lives, we control different entities and processes, and we are also controlled by others.
Our own bodies are chains of pipers of Hamelin at different scales. And a good life happens when there is a decent balance in terms of our position within that chain.
Professor Michael Levin and his team have published extensive research about how our cells behave. Each of our cells has its own local agenda and a certain control over its immediate environment.
At the same time, groups of cells behave according to different top-down goals and are effectively controlled by those goals. When these two poles balance each other, the organism functions correctly.
However, when one of these cells escapes this balance, and prioritizes its own immediate goals and local control beyond other matters, cancer happens.
In the same way, for AI and humans to coexist successfully, we must reach a good balance between giving them enough autonomy and preserving our own supervision and control of their systems.
The most important chess match in history
Let’s return to the analogy of the twisted chess board, whose rules keep changing as we play.
The last few moves in this uncertain match have triggered a number of initiatives and terms like: Responsible AI, AI ethics, constitutional AI, and others.
This is a match that we cannot afford to lose. A match where the best result we can expect, and the one we should pursue, is a draw.
A match where collaboration and cooperation should be the ongoing and ever-present goal.
And a match which may only finish when, in a few decades maybe, humans and AI sort of merge with each other. Companies like Neuralink are exploring the first stages to get there.
In the meantime, it is a certainty that AI systems will become powerful pipers of Hamelin. It is our responsibility and key mission to keep such pipers connected to a healthy chain of control mechanisms so that the upcoming AI-human organism can function in a healthy way.
Hamelin 77 will be released in the coming weeks. For more information, stay tuned through its YouTube and Vimeo pages, as well as its IMDb page.
This project was made possible with the help of the following sponsors and partners: Mobile World Capital Barcelona through the Digital Future Society initiative, Programamos.es, Aerovisuales (joanlesan.com), Tejera Studio, Ideami Studios.