Toward AI Automation of Software
Beyond Lambda, Pi and Rho: the Phi Calculus
A previous article on rho calculus can be found here
In the Beginning… Was the Command Line
Blockchain developers may be quick to dismiss an interesting return to the command line. Back in the 1960s, computing was largely a command line experience. This was not entirely bad — in fact it allowed the notion of interactive human-machine conversations, and the industry was taking major steps toward AI at the time. Databases and operating systems were built around this approach. However, text-driven CLIs have obvious expressive limits, and with the introduction of the Apple Macintosh and Microsoft Windows we entered the visual age, which engaged both sight and touch via screen and pointing device. Yet visual GUIs lack composability, and Microsoft is now bumping up against complexity limits (try navigating the hundreds of windows and dialog boxes in their latest products).
Engaging *All* the Senses
Maybe the answer is all of the above. I believe a next-generation computing platform should engage all of the senses:
- command line interface — TALKING and LISTENING
- graphical user interface — SIGHT and TOUCH
Yes, these sorts of capabilities used to be associated with operating systems but we can no longer look to conformist Silicon Valley for much in the way of infrastructure innovation (the short-sighted Sand Hill Road crowd is understandable but the lack of interest within academia is disturbing). So despite billions of dollars in “investments”, we see an industry forced to cobble bespoke stacks and frameworks on the backs of OSS developers laboring for free. From this has emerged grassroots hybrid systems such as Jupyter Lab or Apache Zeppelin — driven mostly by the NYC data sciences community. The question is how we can extend these ideas to general computing.
Structural Limits of Lambda Calculus
Once you accept that your big tangled hairball of software code might also be a form of “data”, you can perhaps appreciate why the data sciences approach might also work for general software development. Unfortunately most of the west coast software industry is still trapped in something called lambda (𝝀) calculus (not to mention von Neumann) — a direct result of the industry’s punchcard legacy — which is all about submitting jobs to a machine and hoping to see lights flash correctly in response. In lambda world, programming skills largely equate to how well your star devs can imagine what a list of instructions might do (versus simply running them).
So what plumbing am I talking about? Perhaps a more specific example might help…
Synchronous Control Flow
Back when early lambda programmers had to code by kerosene lantern, monochrome computer screens were so dimly lit we had to use semicolons to find the (horse-drawn) carriage returns:
let x = proc1();
let y = proc2(x);
let z = proc3(y);
Writing it this way means the programmer wants these instructions to run ‘sequentially’ as follows:
- Run proc1() and put the result in variable x
- Call proc2() with the value of x and put the result in variable y
- Call proc3() with the value of y and put the result in variable z
My rather un-artsy control flow dependency graph is below, where we read the functions and parameters from top left to bottom right:
Basically we are saying it is pointless to run proc2() until we get the output from proc1(), and so forth. An implied linear dependency tree is the main reason we traditionally write sequential instructions; calls are synchronous because it makes no sense to continue to the next instruction until the prior result comes back. It is also easier for humans to think imperatively, and we are ordinarily reluctant to fiddle with ordering because it might introduce strange bugs. But it may not scale.
Asynchronous Control Flow
Okay, so we adopt asynchronous programming. Let’s assume our procedures are now re-engineered and declared async somehow, and therefore it is possible to effectively run everything after the first instruction in parallel:
let x = await proc1();
let y = proc2(x); // async call
let z = proc3(x); // async call
The dependency tree now looks like this:
The programmer is now telling the computer it is safe to call proc2 and proc3 in parallel. But if the machine could see that proc3 requires x or y, it could have determined the ordering on its own.
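As a sketch of that idea (proc1/proc2/proc3 remain made-up placeholders), here is roughly what a machine could do if the dependencies were declared as data rather than encoded in call order — a tiny scheduler walks the graph and runs whatever is independent in parallel:

```javascript
// Hypothetical sketch: the dependency graph is data, and a small
// scheduler derives the execution plan. proc bodies are stand-ins.
const procs = {
  x: { deps: [], fn: async () => 1 },
  y: { deps: ['x'], fn: async (x) => x + 1 },   // plays the role of proc2
  z: { deps: ['x'], fn: async (x) => x * 10 },  // plays the role of proc3
};

async function run(graph) {
  const results = {};
  const started = {}; // name -> promise, so shared deps run only once
  const resolve = (name) => {
    if (!started[name]) {
      const { deps, fn } = graph[name];
      // y and z both wait on x, but run in parallel with each other
      started[name] = Promise.all(deps.map(resolve))
        .then((args) => fn(...args))
        .then((v) => (results[name] = v));
    }
    return started[name];
  };
  await Promise.all(Object.keys(graph).map(resolve));
  return results;
}
```

Calling `run(procs)` yields `{ x: 1, y: 2, z: 10 }` without the programmer ever writing an `await` between the individual procedures — the ordering falls out of the declared dependencies.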
Function Calls Are Considered Harmful
What’s really going on here is that the computer is a bit too “dumb” to figure out dependency paths on its own — so the software developer is forced to play babysitter in ways that confuse code with execution details. For starters, pure functions can be memoized (replaced by a cache), but should we tinker with the caller or the callee? Worse, we are also starting to fiddle with the type system. Suppose we call an asynchronous function which returns a Promise. That means the caller is supposed to know implementation details about the callee and must decide whether to decorate the call with await. Talk about leaky abstractions. Is Promise&lt;float&gt; equivalent to float? Well, it depends on the situation, and there is a long-running debate about why Promises are not truly covariant (I’m not sure Futures are much better). Oh, and exceptions (which are also out-of-band/async) have a similar problem. And then you have microservices. It’s almost like something else should be handling all of this mess.
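One common way to paper over the sync/async split in JavaScript — a sketch, not a cure — is Promise.resolve(), which accepts plain values and promises alike, so the same call site works either way:

```javascript
// Sketch: the caller should not need to know whether the callee is
// sync or async. Promise.resolve() normalizes both cases -- at the
// cost of making *every* call site async. The procs are made up.
const syncProc = () => 42;
const asyncProc = async () => 42;

async function call(fn) {
  // Promise.resolve() passes promises through and wraps plain values,
  // hiding the sync/async distinction from this point upward
  return Promise.resolve(fn());
}
```

Both `await call(syncProc)` and `await call(asyncProc)` yield 42, but note the leak has only moved up a level: `call()` itself is now async, so its callers inherit the same decision.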
In a Decentralized System, Who Is The “Real” Type Authority?
The lack of strong typing almost makes distributed systems a non-starter. Although few like to admit it, any program that connects to an external datastore should probably inherit the type system of that datastore. Otherwise your normalized data has a very denormalized type system. In other words, .NET devs suddenly using “var” all over the place is not exactly laziness if the program is pulling from SQL Server. Of course, since most programming languages have such poor built-in data handling, you often still need to know when you are dealing with things like arrays. But even “iterability” only makes sense on legacy (single-threaded) von Neumann machines. Parallel-friendly category theory operations like “map” or “all” still force devs to refactor every time the code goes from scalar to vector (heaven forbid we mention ‘set’ theory), and then there are the various algebraic interpretations of what “+” means for something as basic as numbers vs strings.
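The scalar-to-vector refactor tax and the overloaded “+” can both be sketched in a few lines of JavaScript (addAll and double are made-up illustrations, not library functions):

```javascript
// "+" silently changes meaning depending on the operand types
const addAll = (a, b) => a + b;
addAll(1, 2);       // 3    -- numeric addition
addAll('1', '2');   // "12" -- same operator, string concatenation

// logic written for a scalar must be manually "lifted" with map()
// once the data becomes a vector -- the language does not do it for us
const double = (n) => n + n;
double(5);          // 10       -- scalar version
[5, 6].map(double); // [10, 12] -- caller must remember to refactor
```

Nothing here is exotic; the point is that the shape of the data leaks into the calling code, which is exactly the refactoring churn described above.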
Moving Past Lambda
Wow, the async world certainly opens up a can of worms. It appears that trying to write a few ‘lines’ of code reveals (1) we have no stable execution plan and (2) no solid type system. In other words, no coherent definition of ‘code’ and no coherent ‘data’. So what does that say about trying to build an entire production system? You have to wonder if Silicon Valley really was built on a pile of sand.
Infrastructure vs Language
Jumping to higher order calculi has long been a challenge because of a west coast tendency to throw a flashy new programming language at every new problem. Exotic 200MPH race cars like Haskell and Julia look great on the showroom floor but you probably don’t want to take them grocery shopping.
The main technical debt with ‘programming languages’ is that they tend to be rather drastic “all or nothing” propositions that don’t play well with legacy systems, incremental adoption or hybrid models. Besides, CIOs are already dealing with language fatigue.
How about if we let the computer sort all this out?
After the 2008 crash, Wall Street started abandoning lambda calculus in a serious way. But how did NYC manage to pull this off? Although Goldman developed their own programming language, their real secret sauce was the system that powered it. That is, banks recognized the limits of programming languages and pushed higher computing models into the underlying infrastructure.
NOTE: I should point out that Serverless computing (aka FaaS or Function-as-a-Service) is also heading this direction in that the programmer simply defines lambda functions and the infrastructure handles the rest. Of course, the strength of that infrastructure quickly comes into question.
The Pi (𝞹) or “Process” Calculus
In fairness, NYC had already started moving past lambda when it entered the reactive world of Excel spreadsheets. The humble spreadsheet is still largely unappreciated by Silicon Valley, which has always found it bafflingly difficult to compete against. How can someone possibly run a company on crappy Excel spreadsheets instead of my million-dollar software? We now see Airtable, SmartSheet and Zapier finally starting to tackle this.
The powerful calculus behind spreadsheets is largely captured by the work of Robin Milner, Tony Hoare, J Paul Morrison, Carl Hewitt and others (perhaps not surprisingly, Hewitt was banned from Wikipedia for trying to point out his research in this area). These are all variations on a theme I will just lump into 𝞹 (process) calculus, and involve expressing computing as data moving between runtime ‘processes’ — or perhaps more accurately ‘actors’ and ‘messages’ (in the case of Excel, the cells are very simple ‘actors’).
The trouble of course is that “runtime” behavior is very hard to visualize when staring at inert source code. Most programming is akin to an empty Excel spreadsheet… devoid of data… almost impossible to follow... The accounting department would need a team of scientists just to close out the monthly books. Without data, how the hell can anyone tell if the spreadsheet is correct? Well, we can try to build graphs — and the process calculus is most recognizable for its Biztalk-like flow visualization graphs:
But we are still on the outside looking in. Can we do better?
Data Driven Execution
How can developers get closer to the runtime experience? Take our earlier example and suppose we had some sort of Excel-like data triggers. That is, proc2() and proc3() only fire when x experiences some sort of state change:
proc1() => x
change x => proc2(x) => y
change x => proc3(x) => z
Of course, this implies assembling one hell of a dependency graph (which is why these Wall Street systems are so massive).
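A minimal sketch of such a trigger cell in JavaScript (the cell API and the proc stand-ins are invented for illustration — real Wall Street systems are, as noted, far more elaborate):

```javascript
// Excel-like reactive cell: setting x fires every subscribed
// procedure, mirroring the "change x => proc2(x)" rules above
function cell(initial) {
  let value = initial;
  const watchers = [];
  return {
    get: () => value,
    onChange: (fn) => watchers.push(fn),
    set: (v) => {
      value = v;
      watchers.forEach((fn) => fn(v)); // the data change drives execution
    },
  };
}

const x = cell(0);
const results = {};
x.onChange((v) => (results.y = v + 1));  // stands in for proc2
x.onChange((v) => (results.z = v * 10)); // stands in for proc3
x.set(5); // a single state change fires both dependents
```

After `x.set(5)`, both dependents have run with no explicit call sequence written by the programmer — the “program” is just the wiring.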
It would be much easier to climb inside and see what is going on. Unfortunately lambda does not really allow ways to “inject” oneself into its fixed control flow. Plus, you need some sort of way to store and manipulate data while you are developing your program. Even if you could fire up the debugger to test a function, you still need to be able to call other functions to assemble the necessary parameters to send to your target function. This gets messy fast.
Carl Hewitt argues this is because runtimes lack a formal way to express the dimension of time. But Hewitt wants to represent time as Promises or Futures, while most ‘functional programming’ (FP) is really about removing the von Neumann time dimension entirely so that the code is “declarative”, i.e. runs instantly / in zero time. Seems like opposing forces.
Speaking of clarity, mathematical proofs are so handy in programming because they can declaratively describe qualities about a series of state transitions (e.g. instructions) without actually having to churn through the time dimension manually and clutter up the code accordingly. Ideally, you try to avoid nasty surprises by considering all possibilities in advance. Unfortunately, this is a bit of a circular argument. Bugs and features are two sides of the same coin. This is probably why Einstein said the “only source of knowledge is experience”.
Along these lines, the Paramount movie Arrival suggests the entire linear notion of time is relative to the observer (if not outright fake) and touches upon the controversial Sapir-Whorf hypothesis, which may suggest that a lot of the Curry-Howard correspondences we find between programming and logical proofs arise simply because humans express both on a two-dimensional surface (again, back to the punchcard). This raises the question of what sort of proofs/code you could write using additional dimension(s).
But what did Einstein mean about ‘experience’? Hewitt suggests his actors are observers. This is a big problem — both Turing and von Neumann recognized that trying to observe a system invites spooky quantum trouble. Indeed, reactive programming still does not get much rigorous treatment in computer science because it requires the ability to observe a system without a purely deterministic model of computation. Sure, anyone can demonstrate a deterministic Excel spreadsheet but they always assume a known origin. What if someone is changing the cells as you go along? It’s called non-determinism and can be a recipe for disaster when trying to do proofs.
For starters, how do reactive (or any) programs make reliable observations about runtime state? Developers would recognize this capability as “introspection” or “reflection” but programming languages are notorious for secondary treatment of data handling — e.g. something as trivial as reading a value gets messy fast. I realize Silicon Valley has no time for database-y transactional calculus stuff — it’s all left to “higher-level” programming. Well that argument might have been passable 20 years ago but at some point yesterday’s higher-level needs to be today’s low-level hardware. In fact, this suggests major flaws in how we build hardware. Take the venerable mutex lock:
Mutex locks are needed when there is some sort of race condition. Suppose Thread A reaches the mutex region first. What happens to Thread B? We are taught that B blocks. But should it block?
Well it’s called a race condition for a reason — and Larry Ellison became a billionaire by realizing that Thread B should never block, just take the prior value/initial state and keep going.
Wut wut? So if Thread B can keep going, you would think the supposedly performance-hungry chip makers would be pushing for improvements here. But how long can B keep going before something starves? Maybe the question is when Thread B actually needs the value of var. Does Thread B even belong in our model of computation, or is it just an implementation artifact of the response to the var change? Which comes first: the thread or the data change? The code or the data?
Hardware fans might recognize some of this as transactional memory…
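The Oracle-style “never block, take the prior value” idea can be sketched as a multiversion cell (a toy model of multiversion concurrency control — the API names here are invented):

```javascript
// Toy MVCC cell: a writer stages a new version, while a concurrent
// reader never blocks -- it simply sees the last committed version
// (the "prior value") and keeps going.
function mvccCell(initial) {
  const versions = [initial]; // committed history
  let pending = null;         // uncommitted write in progress
  return {
    read: () => versions[versions.length - 1], // never blocks
    beginWrite: (v) => { pending = v; },
    commit: () => { versions.push(pending); pending = null; },
  };
}

const v = mvccCell(1);
v.beginWrite(2);        // "Thread A" is inside the critical section
const seen = v.read();  // "Thread B" reads the prior value: 1
v.commit();
const after = v.read(); // subsequent reads now see 2
```

The reader pays nothing for the in-flight write; the open question from the text — how stale `seen` is allowed to get before something starves — is exactly what the real systems have to manage.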
Here Comes the Tick!
You could also argue that “code” itself is really an expression of Hewitt’s time dimension, since it requires time to execute. If our reactive system contains an observer that must “see” something — that implies a time delay, if only long enough to let the state change propagate down a wire. But since there are no signs of ‘reactive hardware’ from the chip giants on the horizon (well maybe not), current implementations are not that simple and our model must include some sort of coordinator, e.g. a clock “tick”. The desire of course is for state changes to fire immediately — zero clock — or worst case assign a clock per thread — and you have the Wall Street high-frequency trading industry in a nutshell.
I neglected to address the earlier question. Which is the model and which is the artifact? Where is Richard Feynman? Do we need a hadron collider? Seems like the Process Calculus leaves us wanting.
The Reflective (Rho) Calculus
In 2005, Lucian (Greg) Meredith and Matthias Radestock were the first to formally address the gaps between Process Calculus and the realities of reading state in reactive systems (speaking of hadron colliders, Mike Stay also has extensive research in this area that readers may find interesting and also gets into ‘topologies’, something I explore with Single Level Memory).
Although the folks at Cardano are using a similar calculus cooked up around 2011, I’d say Rho is the cleanest modern formal description of the “reactive” calculus without dragging in full transactional semantics. As I’ve mentioned ad nauseam before, a lot of these same problems already exist in the database world. Before in-memory databases and persistent memory came along, thinking of programming in terms of SQL didn’t make much sense:
SELECT x FROM memory
Syntax aside, I realize most programmers don’t think about the correspondences between traditional ‘coding’ and database operations. Here we assume memory is an in-memory region not unlike a table. It is important to remember that VMs like Ethereum are essentially “persistent memory” machines.
The bigger problem is that most programs start from a cold state and therefore also have a bunch of DDL going on e.g.:
CREATE OBJECT memory (x BIGINT)
The above is equivalent to a variable declaration. Somewhere else, someone sets x to a meaningful value:
UPDATE memory SET x = 5
Putting a database trigger on our memory object is reminiscent of data-driven execution, except we have no idea yet who wants to consume the value of x. At least we can agree on the set of SQL instructions required. We just don’t know the order in which to apply them.
Rules, not Instructions
A central premise of Rholang is that if humans can just supply a set of rules, it is far easier and safer for the ‘computer’ to figure out the dependency graph and sequence things properly given available computing resources. Removing humans from a lot of imperative coding would certainly reduce the amount of code we have to write and the chance of errors.
Because Rholang is asynchronous anyway, it is pointless trying to read the code in any particular order. It is really more of a specification than a programming language.
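To make “no particular order” concrete, here is a sketch (in JavaScript, not Rholang) of rules listed in a deliberately shuffled order, with a tiny engine deriving a valid execution sequence itself — essentially a topological sort:

```javascript
// Rules as data, listed in arbitrary order. Each rule declares what it
// produces ("out") and what it needs ("in") -- nothing else.
const rules = [
  { out: 'z', in: ['y'] },
  { out: 'x', in: [] },
  { out: 'y', in: ['x'] },
];

function schedule(rules) {
  const order = [];
  const ready = new Set();
  let remaining = [...rules];
  while (remaining.length) {
    // pick any rule whose inputs are all satisfied; the order of the
    // source listing is irrelevant, only the dependencies matter
    const next = remaining.find((r) => r.in.every((d) => ready.has(d)));
    if (!next) throw new Error('cycle: no runnable rule');
    order.push(next.out);
    ready.add(next.out);
    remaining = remaining.filter((r) => r !== next);
  }
  return order;
}

schedule(rules); // -> ['x', 'y', 'z'] despite the shuffled source order
```

The human supplies the rules; the engine owns the sequencing — which is the division of labor the section argues for.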
Rholang VM and Artificial Intelligence
In fact, Meredith made a point of separating instructions with vertical bars to get developers away from the idea of sequential flow, not unlike Prolog. In other words, Rholang starts to remind us of an AI system.
Huh? I thought Blockchain and AI were only used in the same sentence by aggressive marketing. Well maybe blame Meredith for venturing into this rabbit hole…
I would say AI is the logical end-state of Hewitt’s actor-driven model. Applying some reactive “intelligence” to actors in a distributed system is almost unavoidable because managing complexity is still a very real problem.
“Correct By Construction” vs Reactive Calculus
Okay so we agree that writing rules is less error prone than coding. But how do you know if your rules are correct? Who is policing the police? Remember the mathematical proofs dilemma — you can have “correct” code that your customers will still insist on calling a “bug”. Rholang itself is not exactly easy to grok, as human brains (unlike heptapods) prefer to think sequentially. This means Rholang code will inevitably contain mistakes and developers will have no choice but to “load the Excel with data” e.g. observe the system in action. Of course, the RChain effort is working to mitigate this. But does this mean the calculus is incomplete?
Getting back to Hewitt, one wonders whether humans or human-like agents can play the role of reactive Actors within the system and what that looks like. Maybe we need an “observable” calculus unto itself.
The Interactive (Phi) Calculus
This takes us to the final calculus, the phi (𝜙) or philosophy calculus. I realize this might seem a bit fuzzy-wuzzy, as most systems are assumed static and therefore relatively easy to apply proofs against. However, Charles Bachman pointed out that databases are dynamic systems best explored interactively in his 1973 Turing Award paper, The Programmer as Navigator.
It is probably no coincidence that the blockchain world is seeing a revival of REPL-based interfaces, starting with geth. If we accept that programmers always start from the base state of an empty program (‘null’ in Rholang), then they MUST have the ability to mutate the system, if for no other reason than adding code. This means we are almost certainly thrust into a non-deterministic, non-algorithmic, interactive world of Actors that simultaneously create and perceive… and hence the phi symbol from the Warner Bros movie Inception:
What we are doing is bringing the software development process itself into our model… including the human (or automated) programmer. Otherwise either (1) the system can never change or (2) the model is incomplete — which I call the Bootstrap Paradox. This brings us to Peter Wegner’s paper “Why Interaction Is More Powerful than Algorithms”, which I think any owner of a self-driving car would agree with.
Non-determinism is unruly almost by definition but so is trying to meet software deadlines. Human agents are unpredictable. So I think the hallmark of interactive systems is the notion of the “soft” error state — the ability to gracefully handle unexpected situations and recover. But you can really only do that if you keep your dependency trees shallow and isolated from each other…e.g. the opposite of lambda.
Rather than repeat too much of Wegner here, I would highly encourage readers to give the paper a download.
Inception started as a basic thought experiment — can I create an empty program and add lines of code to it — on the fly — while keeping it running? This is not unlike how blockchain and database systems work. You don’t bring down the entire Oracle database just to add a table or a row.
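The thought experiment can be sketched in a few lines (the registry API here is invented for illustration): an “empty program” that keeps running while code is added to it, with missing definitions handled as soft errors rather than crashes.

```javascript
// An empty program that stays live while code is hot-added to it,
// much like adding a row to a running database
const program = new Map(); // name -> function, starts empty ("null")

function define(name, fn) {
  program.set(name, fn); // add code on the fly, nothing restarts
}

function invoke(name, ...args) {
  if (!program.has(name)) return undefined; // soft error state, no crash
  return program.get(name)(...args);
}

invoke('greet');          // undefined -- not defined yet, system keeps going
define('greet', (who) => `hello ${who}`);
invoke('greet', 'world'); // "hello world"
```

Calling something before it exists degrades gracefully instead of bringing the system down — the “soft” error state discussed above — and new behavior arrives without a restart.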
Why bother modeling the human programmer at all? Isn’t that just academic overkill? Inception comes from the world of AI —that is, can we break down programming tasks into simple interactive instructions and let the computer observe what changes a developer is making to the system and eventually learn to automate those steps?
Along the way, Inception’s attempt to move development closer to the runtime opened up a number of other questions about how we think about computing. You can find us on Discord at https://discord.gg/skh9zhb.