I was recently posed the question, “how do we define standards for AI?” I work primarily in the space of Deep Learning, and Deep Learning is one specific set of tribes under the much wider umbrella of what is known as Artificial Intelligence (AI).
The term Artificial Intelligence is itself quite old; it was coined over half a century ago:
In fact, the idea of understanding human thought goes back much earlier in history:
“The design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic … and, finally, to collect … some probable intimations concerning the nature and constitution of the human mind.”
- George Boole, An Investigation of the Laws of Thought (1854)
We can go even further back to René Descartes in the 17th century, and even all the way back to Aristotle (384–322 BC).
Western Civilization has built up a ton of intellectual baggage in its understanding of how the human brain works. This accounts for the decades of work in GOFAI (Good Old-Fashioned AI), where the approach is essentially to work top-down from formal logic toward intuition and instinct.
Alan Turing, the father of modern computing, anticipated a computational model of the brain. His unpublished papers, discovered 14 years after his unfortunate death, anticipated the development of connectionist architectures, the architectures better known today as Deep Learning:
Therefore, the question I am seeking to answer regarding the standardization of AI is not how to standardize every method labelled under the massive AI umbrella, but rather what the challenges are in standardizing Deep Learning.
The first question to ask is “Why do we need standardization?” Standardization is associated with interoperability. So in the context of Deep Learning, what does interoperability mean, and how can we achieve more of it? Ever since 2012, the technology stack for Deep Learning has become significantly richer and more advanced.
Here’s a rough sketch for a DL stack in 2018:
This does not include all the other orchestration requirements that come from the data engineering or Big Data universes. It also does not include the application-specific layers, such as visualization and active learning, that may also be required for a comprehensive solution. In other words, it is a vast landscape that is still evolving at a breakneck pace.
The insight we can get from looking at the stack above is that many existing standards are already being adopted and can be leveraged. As a result, one already has a considerable launching pad for exploring DL standardization independently of the standardization of other AI fields.
Interoperability standards are important not only for the global community, but also for any individual organization or company. To scale development, one needs interoperability to maximize the opportunity for reusable development. The important question for an individual organization is “where do you draw the line for interoperability?” This is a key architectural question with big ramifications for one’s ability to scale execution.
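As a concrete illustration of where such a line can be drawn, consider a shared model-exchange format: frameworks that agree on a common serialized description of a network (as real-world efforts like ONNX do) can interoperate without exposing their internals. The sketch below is a hypothetical, minimal descriptor; its schema and field names are illustrative assumptions of mine, not an actual standard:

```python
import json

# Hypothetical framework-neutral model descriptor. The schema and field
# names here are illustrative assumptions, not the ONNX format itself.
model_spec = {
    "format_version": "0.1",
    "graph": [
        {"op": "MatMul", "inputs": ["x", "W1"], "outputs": ["h"]},
        {"op": "Relu",   "inputs": ["h"],       "outputs": ["a"]},
        {"op": "MatMul", "inputs": ["a", "W2"], "outputs": ["y"]},
    ],
    "weights": {"W1": "weights/W1.npy", "W2": "weights/W2.npy"},
}

# Any tool that reads and writes this shared format can interoperate with
# any other, with no knowledge of each other's internal representations.
serialized = json.dumps(model_spec)
restored = json.loads(serialized)
assert restored == model_spec
print(restored["graph"][1]["op"])  # Relu
```

The design choice embodied here is that the interoperability line sits at the serialized artifact, not at any framework’s API, which is exactly the kind of boundary decision each organization has to make for itself.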
Independent of technology, there is also a need for common terminology. This space may be moving at a rapid pace; however, we are all in need of a DL ontology:
This is sorely needed to be able to quickly capture and make use of the latest developments in research. It does not help if research uses terminologies that are more novel than consistent.
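As a minimal sketch of what one small corner of such an ontology could look like, the snippet below maps synonymous research terms onto canonical concepts. The concept names and synonym lists are illustrative assumptions on my part, not an established vocabulary:

```python
# Illustrative sketch of a terminology map: the canonical concept names
# and their synonym lists below are assumptions for illustration only,
# not an established DL ontology.
CANONICAL_CONCEPTS = {
    "dense layer": ["fully connected layer", "linear layer", "affine layer"],
    "feature map": ["activation map", "channel output"],
    "attention":   ["self-attention", "intra-attention"],
}

def canonicalize(term: str) -> str:
    """Return the canonical name for a term, or the term itself if unknown."""
    t = term.lower().strip()
    for canonical, synonyms in CANONICAL_CONCEPTS.items():
        if t == canonical or t in synonyms:
            return canonical
    return t

print(canonicalize("Linear Layer"))  # dense layer
```

Even a lookup table this simple hints at the payoff: with an agreed vocabulary, new research results could be indexed against known concepts instead of multiplying names for the same idea.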
We can also speak about standardization at a level above the technology stack, that is, from the perspective of the industrial processes used to develop these new DL-based systems. A 2017 paper, “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain” by Tianyu Gu, Brendan Dolan-Gavitt and Siddharth Garg, provides a good starting point for discussing the need to focus on the quality of the data used to train DL systems. Perhaps ideas from the world of biotechnology manufacturing controls can provide better insight into what needs to be considered here.
A focus on process controls also brings into question what standardizations we need to put in place regarding safety, performance, latency, correctness, bias and even privacy. In fact, there is a lot to talk about regarding how we handle data and data provenance. This is even more important with machine learning methods like Deep Learning, which derive their behavior directly from the data they are trained on.
We need to standardize the best practices of Deep Learning development so that not only can more teams accelerate their work, but innovative solutions can also be developed independently and plugged in to accelerate a much larger process. In conventional software development, we have a more mature conceptual framework that has evolved over time; we are able to mix and match tooling such as IDEs, code checkers, testing frameworks, continuous integration, profiling and performance monitoring. Deep Learning introduces new kinds of requirements, so we need to understand what these are and standardize the classes of tools needed to meet them.
It is always instructive to take a look at current standardization in the automotive field. For this, we can learn from the Society of Automotive Engineers (SAE). SAE has an international standard, SAE J3016, which defines six levels of driving automation. This can also be useful in classifying the levels of automation in domains other than self-driving cars. A broader prescription is as follows:
Level 0 (Manual Process)
The absence of any automation.
Level 1 (Attended Process)
Users are aware of the initiation and completion of the performance of each automated task. The user may undo a task in the event of incorrect execution. Users, however, are responsible for the correct sequencing of tasks.
Level 2 (Attended Multiple Processes)
Users are aware of the initiation and completion of a composite of tasks. The user, however, is not responsible for the correct sequencing of those tasks. An example would be the booking of a hotel, car and flight, where the exact ordering of the bookings may not be a concern of the user. However, failure of this composite task may require more extensive manual remedial action. An unfortunate example of a failed remedial action is the re-accommodation of a paying United Airlines customer.
Level 3 (Unattended Process)
Users are only notified in exceptional situations and are required to do the work in these conditions. An example of this is in systems that continuously monitor security of a network. Practitioners take action depending on the severity of the event.
Level 4 (Intelligent Process)
Users are responsible for defining the end goals of the automation; however, all aspects of process execution, as well as the handling of in-flight exceptional conditions, are handled by the automation. The automation is capable of performing appropriate compensating actions in the event of in-flight failure. The user, however, is still responsible for identifying the specific contexts in which the automation can be safely applied.
Level 5 (Fully Automated Process)
This is a final and future state where human involvement is no longer required in the process. Of course, this may not truly be the final level, because it does not assume that the process is capable of optimizing itself to make improvements.
Level 6 (Self Optimizing Process)
This is automation that requires no human involvement and is also capable of improving itself over time. This level goes beyond the SAE requirements, but it may be needed in certain high-performance, competitive environments such as Robocar races and stock trading.
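The taxonomy above is simple enough to transcribe directly into code. The sketch below encodes the seven levels as an enum; the names and the `requires_human` helper are my own shorthand for the descriptions above, not part of SAE J3016:

```python
from enum import IntEnum

class AutomationLevel(IntEnum):
    """Levels of process automation, extending the spirit of SAE J3016.

    The member names are my own shorthand for the levels described in
    the text; only the six driving-automation levels come from SAE.
    """
    MANUAL = 0             # Level 0: no automation at all
    ATTENDED = 1           # Level 1: user sequences and supervises each task
    ATTENDED_MULTIPLE = 2  # Level 2: system sequences a composite of tasks
    UNATTENDED = 3         # Level 3: user intervenes only on exceptions
    INTELLIGENT = 4        # Level 4: system handles in-flight exceptions
    FULLY_AUTOMATED = 5    # Level 5: no human involvement required
    SELF_OPTIMIZING = 6    # Level 6: also improves itself over time

def requires_human(level: AutomationLevel) -> bool:
    """A human stays in the loop for anything below full automation."""
    return level < AutomationLevel.FULLY_AUTOMATED

print(requires_human(AutomationLevel.UNATTENDED))     # True
print(requires_human(AutomationLevel.SELF_OPTIMIZING))  # False
```

Using an ordered enum makes the key property of the taxonomy explicit: the levels are comparable, so policies such as “anything below Level 5 needs an operator on call” become simple comparisons.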
Ethics and Benefit to Humanity
Ultimately, however, any form of AI standardization should be framed around how we can best steer AI (or AGI) development for the maximum benefit of humanity. It does not help if our standardization leads to more advanced autonomous weaponry, or to more advanced ways to predict, and thus manipulate, human behavior.
The challenges of AI standardization cover many levels of concern. However, the effort should ultimately be guided by the need to accelerate the development of human-beneficial AI, and not the other kind.
To scale Deep Learning development into a practice that is predictable, reliable and efficient will require standardization. The intent of standardization is to maximize participation of many independent parties. It is a common language or a coordination mechanism for parties to accelerate progress. Accelerated progress is necessary for Deep Learning to not just be confined to research labs but also to be industrialized and available to many.