Misconceptions that Kill Progress in Personal Data

Alan Mitchell
Published in Mydex
Mar 24, 2021

This is the fourth in a series of blogs which provide edited extracts of key points made by Mydex CIC in its response to the UK Government consultation around a new National Data Strategy.

This blog focuses on common misconceptions that need to be addressed if we are to progress. Previous blogs focused on how to unleash the full potential of personal data, on why every citizen should be provided with their own personal data store, how to achieve these changes at scale, and the sheer size of the social and economic opportunity.

To catch up on progress on our Macmillan My Data Store Pilot click here.

It wasn’t so long ago that sober, intelligent, responsible adults believed the sun orbited the earth, that if you were ill the best cure was to have a leech suck your blood, and that it was the right thing to do to hunt and burn witches at the stake. They also gave patronage to alchemists who claimed they could turn lead into gold and astrologers who claimed they could foretell the future. These conventional wisdoms were deeply and dangerously wrong, but the majority of the great and good went along with them without a second thought. If everybody believes it, it must be true. Right?

Something similar is happening with debates about personal data today. Mydex CIC’s submission on the UK Government’s proposed National Data Strategy highlighted three areas where this is the case: a naive belief in the magic cure-all powers of ‘Big Data’ and artificial intelligence; confusion about what citizens ‘controlling’ their data really means; and doubt about whether citizens are actually capable of doing so. This blog summarises what we said on these issues.

Beware AI hype

Many narratives about a national strategy for data follow a familiar refrain. Artificial intelligence can gain much better insights from data and make much better decisions than human beings can. The opportunities are endless. To enable the analytics that will drive these insights we need to gather together as much data as possible.

There is so much wrong with this trope that we will need to devote future blogs to it. But basically, it contains three core errors.

  1. Claims for what AI can actually do are being wildly inflated. AI is good at solving certain types of problems (which require lots of computation and where judgements based on context or values are not required). But most of the important problems our society faces do not fit this specification.
  2. By the nature of the processes needed to deliver AI, policies designed to promote it automatically favour existing data monopolies, thereby exacerbating the extreme imbalances of power and reward that already exist.
  3. AI operates by crunching huge amounts of data. But when it comes to actually applying any insights or decisions generated to specific individuals, we hit a transition point. At this point we are no longer dealing with ‘big’ statistical data or (misnamed) artificial ‘intelligence’, we are dealing with specific bits of data about specific people: we are dealing with personal data.

This last point needs some elaboration.

Economically speaking, by far the most important and valuable uses of personal data do not come from ‘insights’ derived from ‘analytics’. Businesses like Google and Facebook may loom very large in people’s consciousness, but their business models are organised around advertising, which accounts for less than 2% of all economic activity.

The really big uses of personal data lie elsewhere: in public administration, health, financial services, education, transport, retailing and so on. The key point about uses of personal data in these arenas is that they are operational. Service providers use data to plan, organise and implement activities and to undertake associated administration. Such activities include ensuring they are dealing with the right people, making decisions relating to eligibility and access to services, configuring service provision to the needs and circumstances of particular individuals, planning and organising the delivery of such services, keeping associated records, undertaking billing, dealing with queries, etc.

A key characteristic of these activities is that completing each one successfully does not require the aggregation of large amounts of data. Instead, they each require the ability to access and use exactly the right data at the right time.

What makes the construction of such data-driven services possible is not the aggregation of the biggest possible databases but the availability of safe, efficient data logistics: the ability to get exactly the right data to and from the right people and places in the right formats at the right times.

Establishing such personal data logistics infrastructure has almost nothing to do with ‘big data’ or ‘artificial intelligence’ and is entirely overlooked by AI enthusiasts. Personal data stores provide this infrastructure. It is what is really needed to accelerate improvement in the biggest part of the data picture: operational uses of personal data for the purposes of improved service provision.

Do individuals really want to control their data?

Most debate about the issue of citizen ‘control’ of their personal data is fundamentally confused. There are two very different types of ‘control’:

  • Individuals trying to control data that is collected about them and used by organisations
  • Individuals trying to use their data to manage their lives better, which means they have to be able to exercise control over it

Virtually all current debate about ‘control’ focuses on the first, narrow, meaning. Virtually all of the value of exercising control derives from the second meaning, which is nearly universally ignored. Personal data stores empower individuals with the second type of positive control: the ability to use their own data for their own purposes; to add value in their lives.

A second set of misplaced assumptions follows. When individuals seek to control data that organisations collect about them, the process is usually adversarial. The individual is trying to stop the organisation doing things with data that the individual doesn’t like. But when individuals seek to use their own data for their own purposes, on most occasions they positively want to share data with bona fide service providers because these service providers can help them add value.

If a National Data Strategy frames debate and policy making about citizens controlling their data in the first narrow, adversarial way it will never unleash the full potential of personal data.

Are individuals capable of controlling their data?

A related, common misconception is that individuals do not want to or are unable to exercise control over their data, because doing so is too difficult and complex.

The opposite is true. Processes currently used by organisations to collect and use data manufacture complexity, thereby imposing significant burdens on individuals.

Current burdens lie in four main areas:

  1. Individuals are required to fill in multiple forms when seeking to access and use services, often having to provide the same information many times over, on occasion even to the same organisation.
  2. Individuals are required to prove claims that they make about themselves (e.g. about their age or address), often by presenting physical documents in person.
  3. Individuals face overly complex and cumbersome consent processes, including privacy policies and terms and conditions that are evasively worded and difficult to read and understand. These are often deliberately made excessively complex by organisations seeking to induce individuals to consent to data sharing for reasons and purposes that go beyond accessing the data that’s needed to provide the service the individual is seeking.
  4. The organisation-centric nature of today’s data ecosystem means that individuals have to manage each relationship with each organisation separately — so that the above burdens of information provision, proofs of claims and consent are multiplied many times over as individuals deal with multiple organisations. Most individuals have 100 or more such data relationships, so to effectively ‘control’ all their data, they need to undertake these processes 100 times over. The transaction costs are so high that hardly anyone bothers.

Personal data stores greatly reduce and often eliminate these manufactured burdens. They do so by:

  1. Standardising data sharing agreements around ‘safe by default’: only sharing data necessary for the purposes of service provision.
  2. Automating data sharing processes so that they work for and on behalf of individuals ‘while they sleep’. For a parallel, think of standing orders and direct debits, which provide individuals with full and complete control over their money, but work in standardised, automated ways that mean the individual no longer has to think about their operation.
  3. Creating single, centralised tools for the citizen for managing relationships with suppliers (equivalent to setting up, changing or ending direct debits). For example, consent management dashboards enable individuals to see and simply manage all their data sharing agreements with suppliers in one place (rather than having to log in to dozens of separate accounts).
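
For readers who think in code, the three mechanisms above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of how the Mydex platform or any real personal data store is built: the class names, method names and field names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SharingAgreement:
    """Hypothetical agreement: one supplier, one purpose, a fixed set of fields."""
    supplier: str
    purpose: str
    allowed_fields: frozenset
    active: bool = True


@dataclass
class PersonalDataStore:
    """Illustrative store holding one person's data and all their agreements."""
    data: dict
    agreements: dict = field(default_factory=dict)

    def grant(self, supplier, purpose, fields):
        # 'Safe by default': only fields named in the agreement can be shared.
        self.agreements[supplier] = SharingAgreement(
            supplier, purpose, frozenset(fields)
        )

    def revoke(self, supplier):
        # Comparable to cancelling a direct debit: one action, in one place.
        if supplier in self.agreements:
            self.agreements[supplier].active = False

    def share(self, supplier):
        # Automated: once an agreement exists, no further effort is needed
        # from the individual. No active agreement means nothing is shared.
        agreement = self.agreements.get(supplier)
        if agreement is None or not agreement.active:
            return {}
        return {
            key: value
            for key, value in self.data.items()
            if key in agreement.allowed_fields
        }

    def dashboard(self):
        # A single view of every data sharing relationship.
        return [
            (a.supplier, a.purpose, sorted(a.allowed_fields), a.active)
            for a in self.agreements.values()
        ]
```

In this sketch, a supplier granted `name` and `address` for billing would receive only those two fields from `share`, and would receive nothing at all after `revoke`. The individual sees and manages every agreement through the one `dashboard` view rather than through a separate account with each organisation.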

Summary

It is not possible to make good policy decisions about priorities for investments, grants, innovation and research projects or rules and regulations if the grounds for these decisions are faulty. Currently, effective policy making is hampered by widespread misunderstandings about where the biggest economic opportunities lie, the nature of issues such as control, and the role of citizens in the workings of the data economy. This is resulting in misallocation of available resources and overlooked opportunities. For a National Data Strategy to succeed, a fresh assessment of the assumptions that lie behind current debates about personal data is essential.
