AI and the Social Sciences, Part II

Danil Mikhailov
5 min read · Jan 15, 2019

--

In my first post, I talked about bias in algorithms. This is a significant issue in a society being rapidly transformed by AI. As my colleague Nicola Perrin and I argued in an opinion piece for the Guardian newspaper in the UK, one of the biggest potential problems would be if the public were so spooked by the constant drip-drip of stories about AI's unintended negative consequences that its huge untapped potential for good, particularly in healthcare, were never realised. So, what can be done?

The social sciences have the right methodological tools to analyse the potential impacts of new AI applications on society, and there has been a spike in scholarly publications on this topic in the past couple of years. Governments and learned societies have set up new centres and institutes to focus on the ethics of big data and AI, such as the Ada Lovelace Institute in the UK. But the speed at which academics review new technologies and publish their results is much slower than the speed at which new algorithmic technologies are being developed.

At best, social scientists working in traditional academic ways are able to analyse the social impacts of AI retrospectively, long after the technology is taken up and used in society. Finding that a given technology is biased against a whole slice of society after it has become pervasive is too late; the harm will already have been done. The addictiveness of modern technologies and the hockey-stick nature of their rapid adoption means that even once we know that harm is being caused, withdrawing the technology from use is not simple.

A far better approach would be for social scientists to work side by side with software developers and data scientists, analysing the potential impacts of a technology as it is being designed. This is early enough to influence the direction of the design and avoid the worst unintended consequences. The catch is that, under this model, social scientists would need to adjust the way they work radically.

Typically, software development and data science teams work in an iterative, agile way, producing code fast and testing it repeatedly. They hold to a “fail faster” philosophy that tries to surface errors as early in the development cycle as possible. Catching errors early costs far less than catching them late, after the product has launched. Social scientists, however, are often forced to work much more slowly, following funding and publication cycles that require them to jump through a number of hoops to get their work accepted and peer-reviewed before it is published. This delays the point at which social science work becomes public. Software developers and data scientists, on the other hand, have a culture of making their code publicly available as early as possible.

The difference in speed is further exacerbated by different cultural norms. There is a strong aversion within academia to being ‘scooped’, which leads to greater secrecy while an idea or argument is being developed. A growing Open Science movement is trying to change this norm, but it is still very much a minority practice, though one that is accelerating via great programmes like Wellcome Open Research. In software development, open source communities are much better established, and even the most competitive private tech companies routinely use code from such communities and contribute back to them. Examples include the Apache Software Foundation’s web server, which underpins much of the Web, and the Linux operating system. For open source communities, the code or the model is the main event, rather than the idea or argument, and the best contributors make their work openly available in online repositories such as GitHub. There they are recorded as the originators and build their authority through the number of other developers who use their code or commit modifications on top of it; having their code adopted by others is a good thing.

Another key difference that needs to be overcome for software developers and social scientists to work effectively together is the way research ideas are arrived at. Data science and software development work with probabilistic, emergent models: gather lots of data, start looking for patterns, then adjust direction according to what emerges. The theory emerges from the data. Social scientists, particularly those of the quantitative persuasion, follow the scientific method of positing a hypothesis and then testing it through an experiment. Even qualitative social scientists like myself, who might use more emergent approaches such as ethnography and grounded theory in our studies, still write up the results of those studies in a structured way, informed by the existing literature and constrained by the discipline’s norms of language and presentation.

This structure around research is seen in the social sciences as important for maintaining academic authority and objectivity, but it does have the consequence that social scientists require considerably more thought and preparation before research starts than a data scientist or software developer might be comfortable with. This is neither bad nor good; arguably, a bit more thought about consequences from the data scientists creating the next piece of transformative AI is no bad thing. But getting the balance right between theoretical grounding and preparation on the one side, and speed, agility and iteration on the other, will be a challenge.

These examples are, of course, an over-simplification, and there are plenty of grey areas and blended approaches in both domains. Many social scientists accelerate their impact on the world by using Twitter, Reddit or blogs to get their points across quickly before following up with an academic publication. There is also a growing positive trend of using pre-publication platforms such as F1000. But the above distinctions in speed and approach between social scientists on one side and data scientists and developers on the other are real enough.

Requiring social scientists to work at the rhythm and speed of software developers or data scientists would be a radical adjustment to the way they work. However, there is already a precedent for this in the shape of the User Research discipline. In the digital industry, User Research is the practice of studying and measuring the needs, motivations and experiences of the people using a new piece of technology. It has become a dominant force in software development for the simple reason that the difference between a hundred users and a million for a new app or interface is often how easy and enjoyable it is to use.

As a discipline, User Research is grounded partly in social science approaches and borrows significantly from sociology and anthropology. This means that User Research practitioners working with digital teams can serve, with a few tweaks, as a template for how social scientists can work closely and successfully with data scientists and software developers. In my next blog I will put forward a new methodology for how this can be accomplished, one that I have started to trial in the data science team I run at the Wellcome Trust.

--

Danil Mikhailov

Anthropologist & tech. ED of data.org. Trustee at 360Giving. Formerly Head of Wellcome Data Labs. Championing ethical tech & data science for social impact.