On bias, black-boxes and the quest for transparency in Artificial Intelligence
An increasing number of researchers, practitioners and policy makers are realizing that much needs to be done to deal with bias in data and algorithms, and to promote transparency of AI models. Only in this way can the proper use of AI be ensured, and only then can we expect AI to benefit people’s lives and support fundamental human rights.
Opacity in Machine Learning, the so-called black-box effect, is often mentioned as one of the main impediments to transparency in Artificial Intelligence. Machine Learning algorithms are developed with the main goal of improving functional performance. This leads to complex functions that are optimised to provide the best possible answer to the question at hand (e.g. recognise pictures, analyse x-ray images, classify text...). They do so by fine-tuning outputs to specific inputs, approximating an unknown function without giving any insight into the structure of the function being approximated.
On the other hand, machine learning algorithms are trained with, and reason about, data that is generated by people, with all its shortcomings and mistakes. We all use heuristics to form judgements and make decisions. Heuristics are simple rules that enable efficient processing of inputs and usually guarantee an appropriate reaction. Heuristics are culturally influenced and reinforced by practice, which means they can turn into bias or stereotypes when they reinforce a misstep in thinking or a basic misconception of reality. Bias is therefore natural in human thinking and an unavoidable part of data collected from human processes.
Because the aim of any machine learning algorithm is to identify patterns or regularities in data, it is only natural that these algorithms will identify bias. Removing the algorithmic black box will not eliminate the bias. You may be able to get a better idea of what the algorithm is doing, but it will still enforce the biased patterns it ‘sees’ in the data. What we really want is not that the machine stops using heuristics, but that it does not act on bias, i.e. that it is not prejudiced.
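To make this point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are hand-made, the group labels "A" and "B" are hypothetical, and the "model" is just a table of observed hire rates. The model is fully transparent, since every learned number can be read directly, yet it reproduces the skew in its training data.

```python
from collections import defaultdict

# Hypothetical, hand-made records: (group, qualified, hired).
# Qualified candidates from group "B" were hired less often than
# equally qualified candidates from group "A".
HISTORY = [
    ("A", True, True), ("A", True, True), ("A", True, True), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", False, False),
]

def fit_rate_table(records):
    """Learn the observed hire rate for each (group, qualified) pair."""
    counts = defaultdict(lambda: [0, 0])  # key -> [number hired, total seen]
    for group, qualified, hired in records:
        counts[(group, qualified)][0] += int(hired)
        counts[(group, qualified)][1] += 1
    return {key: hired / total for key, (hired, total) in counts.items()}

def predict(table, group, qualified):
    """Recommend hiring when the learned rate is at least 0.5."""
    return table.get((group, qualified), 0.0) >= 0.5

table = fit_rate_table(HISTORY)

# Every learned rate is open to inspection, so nothing is hidden here.
for key in sorted(table):
    print(key, round(table[key], 2))

# Two equally qualified candidates still get different recommendations.
print(predict(table, "A", True), predict(table, "B", True))
```

Being able to read the table shows where the different recommendations come from, but it does not change them. That takes a decision about how the data and the learning process are treated.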
Transparency is then better served by proper treatment of the learning process than by removing the black box. Trust in the system will improve if we can ensure openness of affairs in all that is related to the system. The following principles for the design of AI systems should be required of all models that use human data, affect human beings, or can have other morally significant impact:
1. Openness of data
o Which data was used to train the algorithm?
o Which data does the algorithm use to make decisions?
o How is this data governed (collection, storage, access...)?
o What are the characteristics of the data? How old is the data, where was it collected, by whom, how is it updated...?
o Is the data available for replication studies?
2. Openness of processes
o What are the assumptions?
o What are the choices? What are the reasons for choosing, and the reasons not to choose?
o Who is making the design choices? And why are these groups involved and not others?
o How are the choices being determined? By majority, by consensus, is a veto possible...?
o What are the evaluation and validation methods used?
o How are noise, incompleteness and inconsistency being dealt with?
3. Openness of stakeholders and stakes
o Who is involved in the process, and what are their interests?
o Who is paying and who is controlling?
o Who are the users, and how are they involved (voluntary, paid, forced participation)?
A Design for Values approach to AI models ensures that these principles are analysed and reported at all stages of system development.
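One way to make such reporting checkable is to keep the answers to the questions above in a structured record. The following is only a sketch under stated assumptions: the class name `TransparencyRecord`, its field names and the helper `unanswered` are hypothetical labels chosen here to mirror the questions in the list, not an established standard or library.

```python
from dataclasses import dataclass, fields

@dataclass
class TransparencyRecord:
    """Hypothetical record with one free-text answer per question in the list."""
    # 1. Openness of data
    training_data: str = ""
    decision_data: str = ""
    data_governance: str = ""
    data_characteristics: str = ""
    replication_access: str = ""
    # 2. Openness of processes
    assumptions: str = ""
    design_choices: str = ""
    decision_makers: str = ""
    decision_procedure: str = ""
    evaluation_methods: str = ""
    data_quality_handling: str = ""
    # 3. Openness of stakeholders and stakes
    participants_and_interests: str = ""
    funding_and_control: str = ""
    users_and_involvement: str = ""

def unanswered(record):
    """Return the names of all questions that are still blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]

# Example: a record in which only one question has been answered so far.
record = TransparencyRecord(training_data="Loan applications, 2010 to 2015 (made-up example)")
missing = unanswered(record)
print(len(missing), "questions still open")
```

A check like `unanswered` could be run at each stage of development, so that open questions are visible instead of being silently skipped.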
Alongside these requirements, we need to rethink the optimization criteria for Machine Learning. As long as the main goal of algorithm design is to improve functional performance, algorithms will surely remain black boxes. Demanding the observance of ethical principles, and putting human values at the core of system design, calls for a shift in the mindset of researchers and developers towards the goal of improving transparency rather than performance. This will lead to a new generation of algorithms that can turn Machine Learning into Valuable Learning.