For Software 2+2 May Not Be 4 Anymore

Deterministic Behavior of Software Has Come into Focus with AI.


Ada Lovelace spent more than a year thinking about the world’s first program. She published it in 1843. Since then, there have been many definitions of software, and the industry has generally accepted a set of characteristics for good software: it should complete a specific task, it should be robust to small errors, it should be debuggable and maintainable, and it should have clear metrics that track its success, among others.

So far, no one has bothered to say that, in addition to all such characteristics, good software should exhibit deterministic behavior. Determinism has evolved as a philosophical concept, distant from the hard, cold math that is the basis of any software. An agent is deterministic if its output depends solely on its inputs. In other words, if we give the same set of inputs to a deterministic agent a thousand times, it will produce the same output a thousand times. And if we provide a set of inputs, a deterministic agent will produce the same set of outputs regardless of the sequence in which those inputs arrive.

So far, there has been no reason to question the deterministic nature of software. After all, everything translates down to the stark world of zeroes and ones. In fact, this deterministic behavior of software has been a great friend to engineers. If two plus two is not four for some piece of code, that has been the surest sign that the code is not working. The entire discipline of unit testing, an integral part of software development, is based on this very concept.
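To make this concrete, here is a minimal sketch of a unit test; the add function and the test are my own illustration, not anything from a particular codebase. The assertion is only meaningful because the code under test is deterministic: the same inputs must always produce the same output.

import unittest

def add(a: int, b: int) -> int:
    """Deterministic: the output depends only on the inputs."""
    return a + b

class TestAdd(unittest.TestCase):
    def test_two_plus_two_is_four(self):
        # Meaningful only because add() is deterministic: if 2 + 2 ever
        # comes out as anything other than 4, the code is broken.
        self.assertEqual(add(2, 2), 4)

if __name__ == "__main__":
    unittest.main()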

So far, large-scale production systems had not met artificial intelligence.

In May 2016, a Tesla Model S crashed into a semi-trailer truck at 74 miles per hour without even trying to brake or perform evasive maneuvers. The AI behind Tesla’s famed Autopilot system had mistaken the truck for a signboard. Acting responsibly, the company completely overhauled its software. And yet, something similar happened in March 2019 and June 2020. While Tesla’s accidents made headlines, these incidents are far from isolated in the world of AI.

By its very definition, artificial intelligence is a type of software that learns from its inputs and changes its output based on what it learns. In other words, it is no longer deterministic. If we provide the same input to an AI system multiple times, the output could change at every iteration. For a set of inputs, the sequencing matters: the set of outputs could be different for each permutation of the inputs. This has created completely new challenges for software engineers and made the role of a business leader more important to software development.
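As a rough illustration (a toy of my own construction, not code from any real system), consider a model that keeps learning online. Ask it the same question after each new observation and the answer keeps shifting, and feeding it the same observations in a different order leaves it in a different state:

class OnlineEstimator:
    """A toy learning system: an exponential moving average that updates forever."""

    def __init__(self, alpha: float = 0.5):
        self.alpha, self.value = alpha, 0.0

    def learn(self, x: float) -> None:
        self.value = (1 - self.alpha) * self.value + self.alpha * x

    def predict(self) -> float:
        return self.value

m = OnlineEstimator()
for x in [4.0, 3.0, 5.0, 10.0]:
    m.learn(x)
    print(m.predict())  # 2.0, 2.5, 3.75, 6.875: same question, different answers over time

m2 = OnlineEstimator()
for x in [10.0, 5.0, 3.0, 4.0]:  # same observations, different order
    m2.learn(x)
print(m.predict() == m2.predict())  # False: the sequencing matters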

  • First is the exponentially higher cost of training and of dealing with inaccuracies. The rigor in training an AI system must be extremely high before we can deploy it. The testing itself is more complex by a few orders of magnitude. We must also use every possible trick in the world to maximize the system’s accuracy. Still, we must make peace with the fact that the system will never be 100% accurate. Business leaders proficient with AI systems think about a human in the loop at this stage. Despite all such efforts AI systems are still prone to misjudgment, but at this point we hope that the error rate is lower than that of a human. After all, many human drivers collide with trucks with no software involved.
  • Another facet of this problem is data completeness. When training the system, how can we know that we have thought through every possible scenario, including the corner cases, and provided enough data for the machine to learn from? Did the Autopilot data science team not think that the machine should learn to tell a truck, with its wheelbase and bed, from a signboard, with its poles and display? If they did, did they make sure that the machine was able to learn the difference from the training data? Some data scientists object that the whole point of AI is for humans not to have to think about these corner cases, although I hear that argument more from research teams than from business teams.
  • And then there is data drift. An AI system keeps learning forever, and a production-scale AI system sees an inordinate amount of data continuously. Over time the data moves away from its initial characteristics; that is just the nature of the real world. As a result, the data science team has less and less control over the system’s behavior, and less and less knowledge of it, which becomes scary. I would guess that the later Tesla accidents had something to do with similar drift. To fight back, teams use continuous testing, human validation, and rigorous monitoring systems (a minimal sketch of such monitoring follows this list). A school of thought that I subscribe to prescribes using AI to learn business behavior and then running those learned models, not the AI itself, in production.
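To show what fighting back can look like, here is a minimal monitoring sketch of my own. The population stability index is a commonly used drift score, but the feature data, the 0.25 threshold, and the review action are illustrative assumptions rather than anything from Tesla or this article.

import numpy as np

def psi(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index: how far has live data drifted from training data?
    Rough rule of thumb: below 0.1 is stable, 0.1 to 0.25 is moderate, above 0.25 needs attention."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected) + 1e-6
    o_pct = np.histogram(observed, bins=edges)[0] / len(observed) + 1e-6
    return float(np.sum((o_pct - e_pct) * np.log(o_pct / e_pct)))

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)  # what the model learned from
live_feature = rng.normal(loc=0.6, scale=1.3, size=10_000)      # what production sees today

score = psi(training_feature, live_feature)
if score > 0.25:
    print(f"PSI = {score:.2f}: the data has drifted; route to human review and retraining")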

All this because we decided to give up the iron-clad comfort of determinism.

These are hard problems, and yet they are not the biggest one. The biggest problem is that business leaders must now insist that AI systems be built in-house. If we stick to the traditional SaaS model, the already insane number of uncontrollable variables goes up exponentially: now we must worry not only about our own accuracy, data completeness, and data drift, but also about those of every other customer of our vendor. This, combined with challenges in data availability and data security, has led to AI being solutions-driven software rather than product-driven software. In fact, there are only three archetypes of AI software that can scale in a venture-backed model.

(Only for nerds) A common objection to my point of view comes from computer scientists who claim, rightly, that at the current state of the art even AI software is deterministic. Take a deep neural network. We set it up with some initial conditions and then feed it a sequence of training input-output pairs. Then, for a set of inputs, it predicts a set of outputs. As I have written multiple times, no matter how many times we train the system, the output for a given input will be the same as long as we start from the same initial conditions and use the same training data in the same order. At the end of the day, the most complicated AI models use nothing more than middle-school arithmetic.
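Here is a small sketch in plain NumPy, my own construction rather than anyone’s production code, that makes the point: train the same tiny network twice from the same initial conditions on the same data, and the learned weights come out identical, bit for bit.

import numpy as np

def train_tiny_net(seed, X, y, epochs=200, lr=0.1):
    """One hidden layer, plain gradient descent. With a fixed seed (initial
    conditions) and fixed training data, every run performs exactly the same
    arithmetic and therefore produces exactly the same weights."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], 4))
    W2 = rng.normal(size=(4, 1))
    for _ in range(epochs):
        h = np.tanh(X @ W1)                                    # forward pass
        err = h @ W2 - y                                       # prediction error
        grad_W2 = h.T @ err / len(X)                           # backprop through output layer
        grad_W1 = X.T @ ((err @ W2.T) * (1 - h ** 2)) / len(X) # backprop through tanh layer
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, W2

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

run_a = train_tiny_net(42, X, y)
run_b = train_tiny_net(42, X, y)
print(all(np.array_equal(a, b) for a, b in zip(run_a, run_b)))  # True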

I concede that today’s AI systems are also, in an absolute sense, deterministic. However, we are woefully short on accuracy, predictability, and explainability of these systems, especially when the number of variables involved is high, and even more so when we don’t fully understand the variables. There is also the continuously changing (learning) nature of many of these systems. It is these shortcomings that show up as non-deterministic behavior.

As in real life, if someone knew every variable in the world of AI and how they all relate to one another, they would be a god. We mere mortals should be content that two plus two may not be four anymore.
