Design is the keystone to revealing the potential of AI

Benoit Vidal
Published in Dataveyes Stories
Oct 9, 2017 · 10 min read
This cover represents the most frequently used terms in this article, their relationships and their musicality.

(A French version of this article is available here)

Ever since the Industrial Revolution, we have constantly strived to optimize the way we work. From the first scientific organizations of production, such as Taylorism and Fordism, to today’s popular agile methods, we have looked for ways to break down tasks and allocate resources in pursuit of greater efficiency. Nowadays, new data processing techniques, grouped under the term “artificial intelligence”, are challenging our relationship to work. So much so that some herald the advent of a fourth industrial revolution.

Without going that far, the significant advances in AI deserve examination. Although AI dates back to the 1950s, the newer techniques derived from it, such as machine learning, have profoundly changed the way we think about and build computer systems.

AI is the new black (box)

Usually, software is explicitly programmed to follow procedures developed by an engineer. With machine learning, the system learns by itself: the engineer specifies only the target objective, and the system then has to learn how to reach that goal from a set of training data. It achieves this by performing intertwined operations on very large datasets, which makes it hard to trace the path such a system follows towards a decision.

The operation of machine learning algorithms is much less explicit than that of classical procedural algorithms. Deep learning algorithms in particular may appear to us as black boxes. These algorithms rest on deeply interconnected neural networks with up to several hundred million links. Each connection can learn, and contributes by specializing in one part of the problem. This is why it is so difficult for us to understand and represent these complex models. I have often heard: “We can build these models, but we do not know how they work”.

Schema of an AI system

We cannot live with impenetrable systems

The algorithms in question are not reserved for a few data-rich professions or for mathematical research. Two recent examples show how deeply they touch what makes up our society and our identities:

— This software used by American police to predict future criminals, exhibiting bias against black people in its risk assessment.

— More recently, this algorithm that can “detect” the sexual orientation of a person from a simple photo of their face.

Of course, the technical specifications and the source code sometimes give insiders enough information to work with a machine learning model. But this is not enough for the great majority of us, data specialists or not, to understand these models, and therefore to trust their results. Not even data scientists can claim to read these algorithms like an open book.

However, our need to see clearly into these algorithms is all the more justified given that they are not as reliable and robust as they seem.

Machine learning emerged as a promise to solve problems that were until now unsolvable by humans. When a very large number of criteria are involved in the determination of a phenomenon, and can only be evaluated through gigantic samples, machine learning algorithms achieve useful modeling where classical algorithms reach their limits. Nevertheless, their modeling is only autonomous in a deterministic view of the world, that is to say, assuming that the analyzed data comprehensively describe the many facets of the problem to be solved.

In real life, however, the universe is not so perfectly delineated: it is difficult to draw all the contours of a problem clearly, and it is difficult to list all the possible answers on which an algorithm must be trained. Some answers may still be unknown, or may be neglected because of our own human biases. Machine learning algorithms mechanically reproduce these biases, and can only model part of reality: the part we were able to expose them to. They give us the impression of understanding the world better, when in reality they merely reproduce the facets of the world we already know. This is called quantification bias: the unconscious belief that drives us to value what we can measure far more than what we cannot. In an “AI-first” world, this phenomenon is exacerbated.

“What gets measured gets... attention.”

— Not Peter Drucker

As a result, it is difficult, if not impossible, to prove that a machine learning model will work in all the cases for which it was designed, as some situations may be poorly represented in the training data. Unfortunately, correcting what goes wrong in such cases is a complicated task: since the underlying structure of the system is extremely complex, finding the sources of bias amounts to looking for a needle in a haystack. Not to mention the difficulty of assembling a large and healthy training dataset, representative of the problem one seeks to solve.

For these reasons, we need machine learning systems to be more intelligible, for both data scientists and the general public, because an algorithm that can be interpreted is an algorithm that can be improved. We are still a long way off. Yet I think this is a crucial point for the broad adoption of machine learning: when our societies realize that sophisticated algorithms have invaded both professional and intimate spheres, they may reject these black boxes en masse.

This poses two challenges.

On the one hand, machine learning specialists need to better control the algorithms they design, be able to explain their decisions, and discuss them with a wide audience, in order to obtain safer and less discriminatory models.

On the other hand, the other professions working with data scientists, as well as everyone in their daily lives, must be able to appreciate the effects of algorithmic models and understand them well enough to gain genuine insight into them.

To succeed on these two points, I am convinced that we need to improve the way we interact with these complex systems: we must create mediation interfaces for algorithms.

HDI: the science of mediation between humans and data

We have developed the habit of using the term “Human-Data Interactions” (HDI) to describe what we do at Dataveyes. HDIs encompass all the devices designed to improve the way we understand and use the information contained in data. I encourage you to read our article on the subject, which explains this term and our approach.

In a conventional HDI approach, one without machine learning, humans are at the heart of the system and do most of the work: they get information by interacting with the data, and use it to make informed decisions.

Schema of an “HDI” approach (without AI)

One may assume that an AI system and an HDI approach are incompatible, or even opposed, since in AI systems humans seem to disappear from the processing chain.

On the contrary, I think they are in fact complementary rather than contradictory, provided roles are redistributed. Humans and machines must each be responsible for meeting needs of different nature and complexity: the responsibility of learning, calculating, classifying, and so on, should lie with AI systems. And humans should bear the responsibility of understanding, analyzing, sensing or experiencing reality. The two should therefore work together in confidence.

Confidence and understanding are linked. Understanding is an essential objective of human-data interactions because it makes trust possible. HDIs address this issue through design: the design of interfaces, interactions, and information flows must provide humans with the understanding necessary to perform their role beside data-rich systems.

Taming AI through design

What interfaces can help very different audiences — data scientists on one hand, and non-experts on the other — to better understand algorithms? Interfaces that focus on data visualization and interactivity.

Such interfaces do not aim to teach us how to read the mathematical formulas of algorithms. Instead, they display how algorithms transform data; they help us understand and deduce what these algorithms produce and how the system operates, by showing us the structures, groups, hierarchies, distributions, relationships and correlations in a dataset.

Often, these interfaces feature filters, zooms, sliders to drag, buttons to click, and so on, because an important aspect of human-data interactions lies in interactivity. When a visualization simulates data that evolve over time, or when it lets us vary parameters, it reveals the influence of input variables on output variables. The interface gives us a feel for the sensitivity of the data, and allows us to build a mental picture of their interdependencies.

To see a case study on multi-device interactions with data, I encourage you to consult this article on our website: http://dataveyes.com/#!/en/case-studies/maquette-pedagogique.

Such interfaces can remain simple and playful, allowing the general public to gain insight into the main algorithmic systems around them. They provide non-experts with the level of information necessary for a trusting relationship with data.

In their most advanced versions, visualization interfaces can also help data scientists evolve their work methods.

Towards better control of AI by data scientists

To meet the challenges of AI in our societies, data scientists must be able to thoroughly audit the mechanisms of machine learning and provide context for the answers it offers. To do this, they must extend their scope of intervention to incorporate good practices (two of which are sketched in code after this list):

— Properly preparing data and ensuring its quality. In most cases, algorithms are not fed with raw data, but with data re-processed to bring out features relevant to the problem.

— Auditing the operation of the whole system by probing its internal workings.

— Analyzing results by testing their relevance.

With appropriate visualization tools, data scientists can see what happens to the data when they initiate and set up a machine learning model. Such visualizations foster understanding on both a global and a local scale: at any given moment, they display the whole dataset as well as a granular view of specific parts of it. They allow data experts to more easily test the stability of a learning algorithm, or important specific cases, at each processing step.

These days, such visualization tools are increasingly used by specialists to carry out the design and training of machine learning models. Many initiatives are heading in this direction:

— Fast Forward Labs recently released a research paper on the concept of Interpretability in machine learning models.

— Google has shared open source visualization work on machine learning, with Facets and TensorFlow.

— Uber is building its own machine-learning-as-a-service platform with a strong visualization focus.

— The open source deep learning platform H2O.ai has also put significant effort in this direction.

— Finally, Distill.pub aims to be a platform dedicated to explaining machine learning.

Thus, a new approach is emerging, where the AI system is supervised by data scientists who are themselves assisted by interfaces, upstream (input) and downstream (output).

Schema of a better AI, assisted by humans (data scientists)

Extending our natural abilities with AI

If machine learning algorithms are better when they are assisted by humans, the converse is also true: humans can go beyond their limits thanks to machine learning systems.

Joi Ito, the director of the MIT Media Lab, offered an interesting perspective on the design principles for working with AI. According to him, humans should concentrate on the concept of extended intelligence rather than on robotics and AGI (artificial general intelligence), for it is in the nature of humans to use technology as an extension of themselves.

I would go further by drawing a parallel with the concept of geosophy, introduced by the geographer John Kirtland Wright in 1947 in his essay Terrae Incognitae. For him, geography is an “academic” science that must be enriched by what he calls geosophy. Geosophy extends the knowledge of geographers to peripheral conceptions: those of farmers, fishermen and business leaders, but also poets, novelists, painters, and so on. J.K. Wright does not hesitate to forge a link between imagination, intuition, knowledge and science. He thus argues for the widest possible knowledge of territory.

This approach is similar to the one we propose when discussing human-data interactions. HDIs include data science as geosophy includes geography; but they also go beyond data science, integrating the point of view of end users: practitioners of all trades in contact with data, but also inhabitants of smart cities and connected houses, drivers of autonomous cars, users of networks, citizens, etc.

Like geosophy, HDIs mix instinct and knowledge. They use machine learning models to improve our perceptions, and develop our intuitions. In so doing, they enable us to extend our decision-making and creative capabilities, to solve problems more efficiently, and to understand in depth the world we live in.

In business, HDIs bridge the gap between data specialists and domain experts. They enrich a shared knowledge of data, fostering a data literacy essential to improving everyone’s work. Organizations that smartly combine design and machine learning will therefore gain a definite advantage over the others.

So it is not man versus machine, but rather man AND machine, each trying to learn from the other. Our future will be algorithmic, and design will assume an increasingly important role in it. Not only because it allows a safer and more productive relationship with machine learning, but above all because it enables more collaborative work.

HDI + AI, a brand new world to discover

How will design and machine learning enable physicians, technicians, farmers, designers, musicians, etc. to improve their professions? At Dataveyes, the company I co-founded, we are interested in the complete spectrum of interactions between humans and data: from the practices of computer engineers to those of experts in all fields, to users of the products and objects of tomorrow. We are firmly convinced that we can open up new possibilities by designing AI systems that integrate the human dimension at their core.
