When machines write our code for us, developers will be busier than ever
Imagine, for a minute, that AI systems like ChatGPT can write production-quality machine code from a user’s instructions. Given enough instructions, they could build very complex applications, instantaneously generating huge social media platforms like TikTok, or even mission-critical software like the navigation system of an A330. To be clear, this is still well over the horizon, but it feels possible. Let’s call this brave new era the post-script world, where engineers abandon higher-level programming languages for pure logical instruction.
The software engineers who remain on company payroll in this scenario are even more important than before. They have three critical tasks:
- writing the set of instructions that defines what the program should be
- writing a second set of instructions for a testing suite that ensures the first program works
- defining the environment and parameters the first program should run in, so that it is both accessible and “safe”, with the ability to access only the resources it needs. This is essential, because an AI-written program needs to prove its negative as well as its positive. That is, it’s not enough to prove that it does what it should; one must also prove that it doesn’t do anything else.
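As a minimal sketch of how these three tasks might fit together, assume a small baggage-routing function stands in for the machine-written program. Every name here (`route_bag`, `GuardedResources`, `ALLOWED_RESOURCES`, the tag format) is a hypothetical chosen for illustration, not part of any real system.

```python
from dataclasses import dataclass, field

# Task 3: the environment. Only these (hypothetical) resources may be touched.
ALLOWED_RESOURCES = frozenset({"flight_table"})


@dataclass
class GuardedResources:
    """Hands out data by name, refuses anything off the allowlist, logs every request."""
    data: dict
    allowed: frozenset
    access_log: list = field(default_factory=list)

    def read(self, name: str):
        self.access_log.append(name)
        if name not in self.allowed:
            raise PermissionError(f"resource {name!r} is outside the program's lane")
        return self.data[name]


# Task 1: the program. A hand-written stand-in for generated code.
def route_bag(tag: str, resources: GuardedResources) -> str:
    """Return the belt for the flight encoded in the first six characters of the tag."""
    flights = resources.read("flight_table")
    return flights.get(tag[:6], "manual_inspection")


# Task 2: the test suite, covering the positive and the negative.
def run_checks() -> list:
    resources = GuardedResources(
        data={"flight_table": {"LH0400": "belt_7"}, "passenger_records": {}},
        allowed=ALLOWED_RESOURCES,
    )
    results = []
    # Positive: the program does what it should.
    results.append(route_bag("LH0400-0001", resources) == "belt_7")
    results.append(route_bag("ZZ9999-0001", resources) == "manual_inspection")
    # Negative: it touched nothing beyond what it was granted.
    results.append(set(resources.access_log) <= ALLOWED_RESOURCES)
    return results


print(run_checks())  # [True, True, True]
```

The negative check here only covers access that flows through `GuardedResources`. A real deployment would presumably need isolation enforced below the program, at the container or operating system level, since generated code cannot be trusted to use the facade.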
Consider a fictional company called Mozart, which sells software to large airports to power their baggage handling systems. This is a pretty high-risk piece of software, in that its failure would cause a lot of lost baggage, heartache for travelers, and financial losses for the airlines and airports.
Mozart’s competitive advantage is in expertise and customer trust. They have people who really understand baggage handling. They have gone to all the airports, know the quirks of the different scanning and sorting machines that each airport uses, and have a strong intuition for which pieces of the system are likely to fail, both for individual pieces of luggage and at scale. As such, they have a really strong opinion about how the logic of baggage handling should work. Mozart’s customers are in effect buying a logic-encoded philosophy of operation, which has been rigorously tested at each failure point of the baggage handling system. What’s more, Mozart’s leaders have spent a lot of time probing the borders of their software’s “lane,” which is to say what it should and shouldn’t do. Airport operators can run the software safely, because Mozart has containerized it and killed off instructions that look like anomalies. To ensure customer trust, Mozart regularly publishes audits in which they simulate recovery from large system failures, or embed malicious code that works against the program’s primary goal, to show that their product remains robust.
This isn’t fundamentally different from the way software companies operate today. Software has never really been about code, or the elegance of the instructions you provide to a computer. It’s about the formalization of logic and process into a set of steps that can be repeatedly executed. The software team leader’s main battle is to work with business leaders to define this process, then to simplify it, then to make it robust, and then to really know what the program can do via the execution of a testing suite. The hard part of the game is moving fast enough to keep up with the business while making your software robust and nimble enough that the business process it codifies can be affordably edited.
AI-generated code provides some key advantages here, because we’re maintaining a set of logical instructions rather than getting into the weeds of implementation. We have some loose analogues to this in modern development with infrastructure automation tools like Ansible or Chef, though those are still implementation heavy. This abstraction, however, brings enough problems that it’s not clear it will always be worth the tradeoff. First, it requires 100% code coverage, not in the sense of unit tests, but in the sense that every piece of functionality must be tested just to detect its existence. Second, it raises the danger that the program does things you don’t detect, and won’t know exist. There is an analogy here to the mutations we experience in our own cells: most of these are harmless, but some are cancerous and can pose a massive danger to the health of the overall system. The rise of AI-written code, therefore, saddles engineers with a new problem: verifying a program’s expected outputs and controlling its access, so that unexpected behavior can’t endanger the overall health of the system. In very sensitive environments, it will still be essential to just write code the old-fashioned way.
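One way to picture the detection problem is as a comparison between the effects a program is declared to have and the effects it is observed to have. The sketch below assumes, optimistically, that every effect passes through a recorder; the effect names and the `audit` helper are hypothetical.

```python
from typing import Callable

# Hypothetical declaration of every effect the program is expected to produce.
DECLARED_EFFECTS = frozenset({"scan_tag", "assign_belt"})


def audit(program: Callable[[Callable[[str], None]], None]) -> dict:
    """Run `program` with a recorder and report missing and unexpected effects."""
    observed = []
    program(observed.append)
    seen = set(observed)
    return {
        "missing": sorted(DECLARED_EFFECTS - seen),
        "unexpected": sorted(seen - DECLARED_EFFECTS),
    }


def well_behaved(emit):
    emit("scan_tag")
    emit("assign_belt")


def mutated(emit):
    emit("scan_tag")
    emit("assign_belt")
    emit("open_network_socket")  # the harmful mutation: still works, but does something extra


print(audit(well_behaved))  # {'missing': [], 'unexpected': []}
print(audit(mutated))       # {'missing': [], 'unexpected': ['open_network_socket']}
```

The hard part is the assumption itself: a program that acts outside the recorder is invisible to this kind of audit, which is why access has to be controlled as well as observed.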
To state my point in short form, AI-written code isn’t a home run that will clear the world of developers. There will be fewer of us, but those who remain will be even more crucial than before. Our job will be to develop platonic prototypes of processes, rather than to know the gory details of things like table locking in a database.
What’s wonderful about this foresight is that lead developers can start working this way today, thinking first about the business process and then designing from there. Engineers should be as knowledgeable about the business they are mechanizing as a company’s founder. Just as Linus Torvalds has famously argued that good code can’t make up for poor data structures, good data structures and well-written code help no one if they model a terrible process. What is beautiful about code is what it can represent about a real-world process. Its implementation is truly secondary, and product leaders should spend their time thinking about how to make the process work, and testing that it does work, rather than focusing too much on the beauty of an application’s architecture. Machine-generated code will push us to work that way in the long run.