Why we’ll need more than data diversity to avoid prejudiced AI

Chris Wigley — Solution Partner, QuantumBlack

Earlier this year, a company survey asked 1,000 American consumers whether they could name a female tech leader. Only eight per cent said they could, and of those, a quarter named ‘Siri’ or ‘Alexa’.

It’s yet another sad reflection of the tech sector’s lack of diversity, which has been plain to see for some time. But as we enter an era where AI and its applications will have an ever-increasing impact on our day-to-day lives, it’s never been more important to address the imbalance. We risk exacerbating existing biases if we do not ensure that the variables driving future AI innovation are diverse by design.

Put simply, we face a now-or-never opportunity to ensure our data, and how it is applied, is as diverse as our society.

This was the theme of QuantumBlack’s latest event in our Operating at the Boundaries lecture series, held in partnership with the Royal Institution of Great Britain, the acclaimed home of British science. I was delighted to be joined by two renowned speakers: Diana Biggs, Head of Digital Innovation for HSBC Retail Banking in UK & Europe, and Gina Neff, Senior Research Fellow and Associate Professor at the Oxford Internet Institute and at the Department of Sociology, University of Oxford.

It would be impossible to distil the entire evening’s discussion into a single article, but I wanted to share the key takeaways that stuck with me long after I left the event.

Why data diversity matters

Firstly, some perspective. While there have been rapid developments in AI in recent years, we’re still some way off a machine that can interpret information about the world with the same scope as humans, or even animals. When given narrow tasks, such as processing data to drive cars or play chess, machines triumph. But in general intelligence (the ability to observe, learn and apply information about the wider world) current machines trail behind rats.

Read the headlines and general intelligence is the hoped-for (and sometimes feared) aspiration for AI. Yet we already see the impact that narrow AI algorithms have on our lives, from automatically targeting marketing at customer segments to driving language translation tools.

As Gina pointed out, diversity problems have already been observed with narrow AI algorithms. Translation tools have been criticised as sexist for automatically gendering language based on the content: a gender-neutral Turkish phrase such as ‘o bir doktor’ (‘they are a doctor’) becomes ‘he is a doctor’, whereas ‘o bir hemşire’ (‘they are a nurse’) becomes ‘she is a nurse’.

These translation algorithms rely on patterns in language from across the internet — and this exemplifies the problem. If a machine is free to interpret data from the world with no checks and balances against bias, then of course the results will reflect the biases found in that data. The majority of online content says nurses are female and doctors are male, so the algorithm calculations reflect that.
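To make the mechanism concrete, here is a minimal sketch in Python. It is not how any real translation tool works: the tiny corpus, its skew and the function names are all invented for illustration. It only shows that a rule which picks the most frequent pattern in its data, with no check against bias, will hand back whatever imbalance that data contains.

```python
from collections import Counter

# Hypothetical toy corpus; the skew is invented purely for illustration.
CORPUS = [
    "he is a doctor", "he is a doctor", "she is a doctor",
    "she is a nurse", "she is a nurse", "he is a nurse",
]


def pronoun_counts(corpus, occupation):
    """Count the opening pronoun of every sentence that mentions the occupation."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if occupation in words:
            counts[words[0]] += 1
    return counts


def most_likely_pronoun(corpus, occupation):
    """Pick the majority pronoun, with no check or balance against bias."""
    return pronoun_counts(corpus, occupation).most_common(1)[0][0]


print(most_likely_pronoun(CORPUS, "doctor"))  # he
print(most_likely_pronoun(CORPUS, "nurse"))   # she
```

Nothing in the rule itself mentions gender. The gendered output comes entirely from the counts, which is why the make-up of the data matters so much.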

This matters. As Gina highlighted, if these issues are already found in nascent machine learning, imagine the impact that inherent bias could have when general intelligence AI is dictating everyday life. The models that will dictate how an AI behaves will determine the decisions it makes, so it’s never been more important to ensure the personnel and data building these models are reflective of our diverse society — and are calibrated to counteract the biases we already experience.

So what can we do?

Our speakers proposed a range of solutions across the evening, but the crucial factor was time. We’re on the cusp of the next phase of the digital revolution — so how do we avoid AI reproducing the inequalities already seen in society?

Diversity of perspective in the talent and leadership pipeline is key, and education plays an integral role in this. White men dominate computer science university courses, but as Diana mentioned, a shift in mindset is needed well before students make decisions about college specialisms. Steps must be taken early in a child’s education to explain the opportunities afforded by pursuing STEM subjects.

Diana also argued the business world must encourage more diverse candidates into leadership positions, but companies must purposefully seek these individuals out and understand the opportunities and language they respond to. HSBC has started programmes to look specifically at which candidate segments may question whether they can fulfil their potential in the organisation, and has launched specific training to help diverse individuals rise up the career ladder without having to hide or sacrifice the characteristics which make them different in the first place.

All three of us were in agreement — diversity is a fundamental requirement for success in innovation and business.

A homogeneous team, even one composed of top talent, will never achieve as much as a diverse group. Lionel Messi is renowned as one of the best footballers of his generation, but a squad of Messis would falter. The same is true at QuantumBlack: we recruit for a range of skill sets, and our business relies on balancing teams of specialists such as data engineers, data scientists, designers and software engineers.

But we also hire with diversity of experiences, perspectives and backgrounds at the forefront of our minds. As I mentioned during the event, this isn’t a CSR tick box exercise but a way for us to bring together fresh points of view in order to solve challenges — it’s a business necessity, and it’s in the wider corporate world’s interest to recognise this.

Personnel aside, future AI developments should also be working from diverse data sets. At QuantumBlack, our projects involve a vast array of incredibly complex data sets comprised of different variables.

A recent project tasked us with protecting banks and consumers from fraudulent activity. We started by aggregating and analysing two years’ worth of payments data, which involved pulling together vast amounts of information. By examining behaviour patterns from a wider group, we were able to develop an algorithm and model, enabling us to spot questionable transactions.

We were covering millions of transactions from thousands of different people, and triangulating across these data sets gave us different perspectives and insights on how different people behave with their own financial transactions, as well as an assurance that we were not over-reliant on one source. One person’s transaction data would have been enough to spot future questionable activity in their account, but it wouldn’t have told the whole story.
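As a rough illustration of pooling data across many people, the Python sketch below flags payments that sit far from a baseline built from everyone’s transactions. This is not the model from the project: a simple z-score against a pooled mean is a stand-in technique, and the customer names, amounts and threshold are all invented assumptions.

```python
from statistics import mean, stdev

# Hypothetical payment amounts per customer; every value is invented.
PAYMENTS = {
    "customer_a": [12.0, 15.5, 11.0, 14.0],
    "customer_b": [40.0, 38.5, 42.0, 41.0],
    "customer_c": [9.0, 13.0, 10.5, 950.0],
}


def pooled_baseline(payments):
    """Mean and standard deviation of amounts pooled across all customers."""
    amounts = [amount for history in payments.values() for amount in history]
    return mean(amounts), stdev(amounts)


def flag_questionable(payments, threshold=2.0):
    """Return (customer, amount) pairs more than `threshold` standard
    deviations from the pooled mean. The threshold is an arbitrary choice."""
    mu, sigma = pooled_baseline(payments)
    return [
        (customer, amount)
        for customer, history in payments.items()
        for amount in history
        if abs(amount - mu) > threshold * sigma
    ]


print(flag_questionable(PAYMENTS))  # [('customer_c', 950.0)]
```

A real system would weigh far more than the amount, but the sketch shows the basic idea of judging one transaction against behaviour drawn from a wider group.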

It was a fascinating event, and I’d like to take another opportunity to thank our speakers. The full session will soon be available on The Royal Institution’s YouTube channel, and we’ll be announcing future QuantumBlack events in the coming months.

A long-held QuantumBlack mantra has been that interesting things occur at the boundaries of disciplines, and combining the perspectives of both the academic and business worlds will be necessary if we’re to address technology’s diversity challenge. In the meantime, we all have a responsibility to recognise our own inherent biases and question how they influence our decisions.



QuantumBlack, AI by McKinsey, helps companies use data to drive decisions. We combine business experience, expertise in large-scale data analysis and visualisation, and advanced software engineering know-how to deliver results.
