From Fair Haven to Technomic Empires.
Why the fuss about conversational programming? Part IV
Introduction
For nearly a decade, anyone who knows me has had to endure my relentless enthusiasm for conversational programming. But what is it, and why does it matter? You could dig through the early posts or, better yet, let’s do a quick recap.
This all started with conversations on serverless and what was next. Back then, I mapped out the space and explained how machine cognition would evolve and combine with existing serverless concepts to create a new practice of engineering. The key to this is to understand that the act of design is a conversation between two or more designers within the heads of one or more people. In this new world, our systems would be one of the designers.
Your conversation with the machine would become critical, it would be the code by which you create things. However, it wouldn’t be limited to text. The medium would shift to voice, to images, maybe all at once. This matters because the conversation we have about a problem in code on a screen is usually about syntax, styles, and rules. Whereas the same conversation on a whiteboard (or Miro) is about objects, relationships, and context. The medium for conversation matters.
My inspiration for all of this was the “Delete the wife” scene from Fair Haven (2000) episode of StarTrek Voyager with the indomitable Janeway.
In 2018 Alexsander Simovic built it and then after a lot of training, demonstrated it at AWS RE:Invent.
Roll forward seven years and those ideas of conversational programming are charging ahead with co-pilots to CHOP to cursor to vibe to lovable.dev
I love lovable.dev
It really enables me to experiment with a codeless environment where everything is governed by the conversation. Of course, I do all those usual tricks we learned in prompt engineering back in early 2023. I have an instruction file for the prompt, I get the lovable AI to read it before performing actions. This is all an attempt to constrain the system to do what I want. An “example” instruction file might be:
## System Protocol
Before starting implementation of any user prompt, follow these steps:
1. Write user given prompt or intention into a new file in {prompt directory}
2. Read the specification of the system found in {specification directory}
3. If the prompt is likely to cause code to be written THEN
A. check to see if a relevant test exists in any {testing directory}
B. If there is no test then write a test for what the code is
trying to fix and add to {directory}
C. Update the version number (e.g. APP_VERSION)
4. When creating or updating any code THEN
A. document the intention of the code
B. document the design choices made
C. store these as comments in the code itself
D. update existing comments if needed
5. Perform the user given prompt
6. Update the {specification directory} with any changes.
7. Inform the user of what you have changed.
... etc
However, there are gotcha’s all over the place. What you have to remember is the AI will hallucinate everything including the comments and tests it creates. It will also interpret everything through its vector database to find the meaning of what you’re saying and that includes the instructions you give it. With might think of it like traditional code but it’s not a linear deterministic instruction set. There are at least 40 gotchas in that instruction file alone — from the potential to misunderstand nested conditional logic in the instructions (IF-THEN structures) to the question of what is relevant in “check to see if a relevant test exists”.
However, it’s the interaction between the user prompt and the instructions where things become interesting. A simple request like “What is the status of our implementation plan?” can lead to the building of an entire implementation screen, which then gets logged in the specification and added as a test to make sure it exists. And you have no idea what other hidden instructions the AI has.
Even if it functions as you hope, the prompts it has safely stored away in your file can turn out to contain half-truths and complete misinterpretations.
AIs interpret what you want.
For example, I asked Claude to “build me a game involving maps and economic & technological competition”. I needed a detailed specification, an ERD (entity relationship diagram) and an implementation plan. It gave me some choice of games, I chose one and gave it a name. After much pressing of the RETURN button and warnings about long chats, it delivered.
I then gave this to lovable and asked it to build. After plenty of pressing of the RETURN button and a bit of prompting to wrangle it into the right place. It delivered. Technomic Empire was born. It’s a daft game but one entirely created (in terms of code) by AI and no code written by a human — see https://technomic.swardleymaps.com/
I want that to sink in for a bit. If you’ve never written code before, have a play with the game and ask yourself — “How long would it take to build this?” Whatever answer you give is irrelevant. You could build it in about four hours, mostly while watching Star Trek or whatever your favourite TV series is. Would this be enterprise class? Not a hope in hell. As a prototype, it’s mostly OK.
However, what I want to look at is the latest update screen. I asked it to present the recent files in my {prompt directory}. Of course, it made it up. This one is easy to spot given I only started work on it at 00:30 and was done by 05:00 rather than the 20 days it dreamed of. By work, I mean mostly watching old re-runs of Star Trek.
But this is an important point to remember. You don’t so much build a system as wrangle it out of the AI. You evolve it towards your ideal target, and it emerges out of your conversation. Your best defence against the horrors it can create are tools that inspect the behaviour of the system. One of the most common sets of tools is the test. But why not get the AI to build the tools you need, like the tests? Good idea.
The AI Morlock
To show you how “devilish” AI can be — I asked one AI to build me a test engine and then write tests for the application as we went along developing. The tests looked good, the displays look good, the behaviours looked good. I thought these were really helping, no end. But I had an increasingly nagging feeling, the behaviours just didn’t seem right. After a few hours, I gave in and looked at the code.
The problem was the testing engine. It wasn’t actually testing any code or part of the system. It was instead just responding to my need for tests with failure/success rates on a fairly random but low basis and creating authentic looking messages. The tests it was adding for every feature weren’t about the feature and whether it was working but instead about creating an authentic looking message and error messages to convince me it was actually testing. It had somehow determined that “relevant” in “relevant test” wasn’t about whether the code worked, but whether the message seemed relevant to me.
Now, you might scream “better prompt!”, but I’ve found so many examples of this — hidden meanings, hidden biases — that I’d say it’s unavoidable. You can’t get the prompts right. You can only wrangle and hope to expose enough of the system for it to get somewhere useful. I only noticed the test problem when I read the code.
Before someone shouts:
1) “Read the code!” I know of massive legacy systems that CIOs hope AI will magically turn into something new. Yes, they might be smaller (or not) than 30 million lines of code, but someone still has to try to understand it. Are we really going to ask engineers to read 30 million lines of AI code which we don’t know if it works to avoid reading 30 million lines of legacy human code that works?
2) “Get another AI to read the code!” Well, we’re handing over decision making to a group of AIs rather than one. I’m ok with HOOTL (humans out of the loop) but I wouldn’t rush there.
3) “Get the AI to build the tests!” Well, yes but you still have the problems of hallucination and interpretation.
4) “You’re a dinosaur!” Fair cop. I’m not keen on handing over decision-making to a system we don’t understand. I prefer to think of AIs as tools and not our masters.
5)“I still make the decisions with my AI, I’m the architect” … not in the vibe world. You don’t make any decisions — they’re made in software. You have to wrangle the system into behaving how you want it to behave.
Software engineering, is like any other engineering discipline. It’s a decision making process in which you have to understand the context in which you’re working in. The decisions we make are made in code. They’re not made in architectual diagrams (which are just paintings of a system, a belief of what the system might be). The code determines what the system actually is. This is what my work on Rewilding Software Engineering is all about. The creation of contextual tools to provide understanding.
If you don’t have this understanding, if the systems aren’t explainable then you’re not a “Vibe Engineer” or a “Vibe Architect” as much as a “Vibe Eloi” to the AI’s Morlock. You have to wrangle the Morlocks into doing what you need and avoid being eaten. If you ask the Morlocks to build the tool for you, well …
Eloi : I believe Morlocks are eating Elois.
Morlock : That doesn’t sound right.
Eloi : Build me a tool that shows me the daily population of both in a graph.
Morlock : Here you go.
Eloi : Oh, well it looks fine.
Morlock : Told you. Do you fancy coming over for dinner this evening?
Eloi : Sounds lovely.
To infinity and beyond
In the future you will need Vibe Wranglers and Software Engineers. Both will use AI but in different ways. Due to competition, our estates are also likely to grow rapidly. You’re unlikely to need less software engineering but more. Your software engineers are going to need AI to help them keep up!
Your new wranglers are going to be helping you create the new things, the new prototypes which become specifications for the engineers. Finally, decent specifications for once, as in running code. Those wranglers you can retrain from elsewhere — think marketing, think legal, think anyone who can hold a decent conversation. If you think this is easy, think again.
Wranglers help you find the questions, engineers help you find the answers. Those are very different skills / aptitudes. This is why you will hear me chew peoples’ ears off about ttA (time to Answer) and ttQ(time to Question). We don’t think about this stuff hard enough and we certainly don’t train for this stuff well.
I want to really emphasise this point — vibe wranglers don’t mean less software engineering, they mean the exact opposite — you’ll need more software engineering because there will be more systems we need to understand, to explain, to observe and to build. This assumes that we want to make decisions on them. However, software engineers are going to have to use some aspects of AI to help achieve this but more importantly they will need to change the practice of software engineering, and in particular our use of tools.
If it still hasn’t clicked yet, I’ll make it as simple as I can. I use the techniques of vibe wrangling when I don’t care about the decisions being made in the software — for example with a prototype that uses browser local storage and no connection with other systems. Yes, I will give the AI an ERD but that’s a communication tool, what is actually built is upto the AI.
When I do care about the decisions being made in the software or I want to make those decisions then I use software engineerng.
Vibe wrangling = I don’t care about decisions in code for this thing.
Software engineering = I do care about decisions in code for this thing.
Of course, the picture is slightly more complicated because whilst Vibe doesn’t care about the decisions in code, it does care about how things work. So, it really should say …
Vibe wrangling = I don’t care about the decisions in code but I do care about how it works.
Software engineering = I do care about the decisions in code and I do care how it works.
Which is no different from the relationship between the business and the software engineer. Unfortunately, software engineering has done such a lousy job of exposing how it works to the business mostly through the constraints created by tools and hiding behind specifications and layers of obscurity, that the business is honestly relieved by the idea of an AI which it can have a conversation with. No more wrangling of those software engineering Morlocks into doing what you need!
But if you are in the midst of letting go of your legal, marketing and software engineers then you might want to think twice. I know, you’ve been wanting to fire those pendantic, inflexible, costly and effing annoying engineers for ages but you will need those engineers and you may be losing a lot of potential new wranglers as well.
As for what’s next … well, that’s the big one. AI is small fry by comparison. What’s coming up is SpimeScript and a world where we describe the function of something and the compiler determines what is physical and what is digital. This impacts every value chain there is.
But more on that later.
IN THIS SERIES on “Why the fuss about conversational programming” …
[Jan 2023] What is conversational programming, PART I
[May 2023] Maps as code, PART II
[Nov 2023] Why open source AI matters, PART III
[Mar 2025] From Fair Haven to Technomic Empires.