Published in Digital Diplomacy

What does data ethics have to do with border control?

Photo by Jason Leung on Unsplash

Data ethics has become something of a buzzword: as the importance of AI in areas like government, healthcare and finance grows, so do concerns about what, how and why data is used. But data ethics is not a new topic of concern; it has existed for far longer than the recent rise of AI technologies. Taking a look at the UK Border Agency’s 2009 Human Provenance Pilot Project helps us understand the importance of data ethics in technologies beyond AI, and why undertaking a data ethics analysis is essential to uncovering the impacts of data-driven border control.

What is the HPPP?

In February 2009, the UK Border Agency (UKBA) began a project known as the Human Provenance Pilot Project (HPPP). The HPPP was designed to use asylum seekers’ genetic and isotope data to determine their national provenance, and therefore their right to claim asylum in the UK. It gave UKBA case workers permission to request samples of saliva, nails and hair from asylum seekers whose stories they deemed suspicious. These samples would then be tested to determine whether an individual claiming to be fleeing persecution in one country actually came from another. In particular, one goal of the project was to determine whether asylum seekers claiming to be from Somalia were actually from Kenya, which would undermine their claim to asylum. The first test of the pilot took place in September 2009, and the study ended in March 2010. A total of 38 people were tested. The project received extensive criticism from the scientific community at the time of its deployment, particularly over the validity of the methods used. However, the UKBA persisted with the project until the end of the pilot and has not ruled out the possibility of reintroducing something similar in future. Therefore, a critical evaluation of the project, as well as an exploration of potential mitigating factors, remains pertinent today.

The data ethics considerations

As highlighted by criticism of the project at the time of its deployment, the HPPP raises a number of significant data ethics considerations. Here, I will focus on three: consent, data methods, and wider impacts.

a) Consent

Consent is a central concept in data privacy: it establishes a mechanism which enables individuals to retain a level of control over how their data is collected, controlled and used. Legal privacy scholar Arthur R. Miller argues that “the basic attribute of an effective right to privacy is the individual’s ability to control the circulation of information relating to [themselves].” However, the importance of consent is not always recognised under the law, and although the GDPR sets a high standard for consent, it is only one of six lawful bases for processing personal data.

Whilst the HPPP did not need to use consent as a lawful basis for processing asylum seekers’ data, the project’s Deputy Director stated that “all samples will be processed voluntarily”. Therefore, the notion of consent was alluded to in the project’s aims. However, the protocol also instructed that “if an asylum applicant refused to provide samples for the isotope analysis and DNA testing the case owner could draw a negative inference as to the applicant’s credibility.” Therefore, it is difficult to support the claim that participation in the project was voluntary since participants were penalised for opting out. As a result, the HPPP breached individuals’ ability to control how their information was collected, controlled and used, and hence their right to privacy.

b) Data methods

The HPPP used several data methods to assess the relationship between DNA and nation of provenance. However, unreliable use of data can lead to unethical data-driven border control. There are three key reasons why this was the case in the HPPP: the counting problem, causal inference, and false negatives.

The counting problem: The inputs chosen for the HPPP were hair, nail and saliva samples, which were used for DNA and isotope testing. However, geneticists and isotope specialists have described the use of such data to determine national provenance as “naïve” and “flawed”. This is partly because the methods used are not scientifically valid: there is no scientifically accepted evidence that isotope signatures at birth are still present in adult samples. It is also because determining ethnic origin is not the same as determining nationality. There are several reasons for this: people migrate across borders, borders change over time, countries are often composed of varied and changing ethnic groups, and populations in border regions are more likely to have similar ethnic origins to their national neighbours. Therefore, the DNA data that was being ‘counted’ for data analysis provided an unreliable input for determining nation of origin, penalising individuals for failing a flawed test.

Causal inference: Advances in genomic technology have improved the ability of genetic data to reliably predict the composition of a person’s heritage. For example, a study on the ethics of DNA testing demonstrated the ability of consumer-grade genetic testing from companies such as 23andMe to reliably predict an individual’s genetic heritage (see figure 1).

Figure 1: 23andMe study

However, this research highlighted that such tests provide no evidence of an individual’s nationality. For example, whilst the test could determine an individual’s Greek genetic heritage, it could not distinguish whether a subject was a Greek national or an American national with Greek heritage. Furthermore, the study highlights the potential diversity of individuals’ genetic backgrounds, any or none of which may match their nationality. This research exposes the risk of conflating genetic heritage with nationality, as was the case in the HPPP. While there may be a correlation between DNA results and national citizenship, this does not establish a causal relationship between the two. Consequently, using these results to determine a person’s nationality, and therefore their right to asylum, is an inaccurate and invalid data method.
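The gap between heritage and nationality can be sketched in a few lines of Python. This is only an illustration: the records below are invented for the example and are not data from the HPPP or from any study, and the function name `nationality_given_ancestry` is hypothetical. It shows that even when a test identifies ancestry perfectly, the same ancestry label can map to several nationalities.

```python
from collections import Counter, defaultdict

# Invented (ancestry, nationality) pairs, for illustration only.
# They are NOT real data and carry no statistical weight.
records = [
    ("Greek", "Greece"),
    ("Greek", "United States"),
    ("Greek", "Greece"),
    ("Greek", "Australia"),
    ("Somali", "Somalia"),
    ("Somali", "Kenya"),
    ("Somali", "Somalia"),
    ("Somali", "Kenya"),
]


def nationality_given_ancestry(pairs):
    """Return, for each ancestry, the share of records holding each nationality."""
    by_ancestry = defaultdict(Counter)
    for ancestry, nationality in pairs:
        by_ancestry[ancestry][nationality] += 1
    shares = {}
    for ancestry, counter in by_ancestry.items():
        total = sum(counter.values())
        shares[ancestry] = {nat: n / total for nat, n in counter.items()}
    return shares


shares = nationality_given_ancestry(records)
for ancestry, dist in shares.items():
    print(ancestry, dist)
```

In this toy data, knowing someone’s ancestry leaves their nationality undetermined: the ancestry result narrows the possibilities but cannot settle which passport a person holds.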

False negatives: In Machine Learning (ML), a table known as a confusion matrix is used to assess how successful a classification model’s predictions are. It summarises the number of true positive, true negative, false positive and false negative results among a model’s predictions. Machine Learning models aim to maximise the number of true positives and negatives, and minimise the number of false positives and negatives. However, not all errors have equal impacts. In the HPPP, a false negative (a genuine claimant wrongly classified as not coming from the country they claim) can have dramatic and potentially life-threatening consequences. If a model incorrectly classifies an individual as having a national provenance that is different from their claim, this could result in the rejection of their asylum claim and deportation to the dangerous conditions they were fleeing. Therefore, any such process should account for the possibility of incorrect classification by giving claimants the opportunity to appeal the decision and have it reviewed by a person (i.e. a ‘human in the loop’).
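To make the unequal weight of errors concrete, here is a minimal Python sketch of a confusion matrix with a cost attached to each kind of error. Everything in it is an assumption made for illustration: the labels are invented, `True` is taken to mean “the claim is genuine”, and the cost values (10 for a false negative, 1 for a false positive) are arbitrary placeholders rather than anything used by the UKBA.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count TP, TN, FP and FN, where True means 'the claim is genuine'."""
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for a, p in zip(actual, predicted):
        if a and p:
            counts["TP"] += 1  # genuine claim, accepted
        elif not a and not p:
            counts["TN"] += 1  # non-genuine claim, rejected
        elif not a and p:
            counts["FP"] += 1  # non-genuine claim, accepted
        else:
            counts["FN"] += 1  # genuine claim, wrongly rejected
    return counts


def weighted_error_cost(counts, fn_cost, fp_cost):
    """Total cost of errors when the two error types are weighted differently."""
    return counts["FN"] * fn_cost + counts["FP"] * fp_cost


# Invented labels for eight hypothetical cases (not real data).
actual = [True, True, True, True, False, False, True, True]
predicted = [True, False, True, True, False, True, False, True]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
cost = weighted_error_cost(counts, fn_cost=10, fp_cost=1)

print(dict(counts), accuracy, cost)
```

The point of the sketch is that accuracy treats every mistake alike, whereas a cost-weighted view makes the two wrongly rejected genuine claims dominate the total. How large the false negative weight should be is an ethical judgement, not a technical one.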

The UKBA responded to criticism of their methodology by stating that “Ancestral DNA testing will not be used alone but will be combined with language analysis, investigative interviewing techniques and other recognised forensic disciplines,” suggesting that negative DNA results would need to be confirmed by other methods. However, linguistic tests are also a contested method of determining nationality. Combining them with DNA results may therefore only compound a case worker’s confidence that the claimant is lying, even when that confidence rests on ‘false negative’ results. This is particularly problematic in cases where Machine Learning or other automated decision-making technologies are used, since the case worker is likely to be influenced by automation bias: the propensity for humans to favour suggestions from automated decision-making systems.

c) Wider impacts

The HPPP not only affects the decision-making process of the UK Border Agency’s asylum vetting service, but also has wider socio-political implications. Firstly, the use of DNA testing for asylum vetting redefines the categories of identity on which immigration and asylum decisions are based. By conflating nationality with genetic ancestry, such procedures open the door to eugenic nationalist practices which promote certain ancestral lines or stigmatise others. A project does not need to have this aim to produce negative consequences for specific ethnic groups: although the intention of the HPPP was to detect fraud, it resulted in a discriminatory screening policy which stigmatised Somalis considered to be lacking ‘pure’ ancestry.

Secondly, policies which seek to centre the asylum vetting process around the detection of fraud reinforce the idea that asylum seekers are mostly ‘bogus’ refugees seeking admission to the country for economic, not humanitarian reasons — a stereotype which is loaded with political implications. Therefore, projects like the HPPP shape public and policy discourses, and thus the ability of asylum seekers to escape dangerous or harmful conditions.

Potential ways forward

For the time being, the UK Border Agency has decided not to take forward DNA/isotope testing for country-of-origin identification purposes. However, they have not ruled out the possibility of doing so in the future. Therefore, an analysis of potential mitigating factors remains pertinent.

Some migrants who have difficulty demonstrating their place of origin may benefit from technologies such as DNA testing, where no alternative evidence can be found. While alternative methods are preferable, since they are likely to be more accurate and do not conflate place of origin with genetic ancestry, DNA tests may support specific edge cases if they are used to promote inclusion rather than exclusion. Firstly, however, it is essential to ensure that these DNA tests are reliable. During the HPPP, scientists suspected that the testing procedures were conducted by private laboratories, which were subject to less regulatory oversight than public ones. Any future use of DNA testing for ancestry by the state should ensure that laboratories are subject to robust regulatory oversight mechanisms.

Secondly, if DNA testing is to be used, individuals with substantive expertise must play an important role in the design of the programme. For example, the UKBA may wish to consult a population geneticist with expertise in a particular population, rather than relying on consumer-grade genomic ancestry tests.

Thirdly, the issue of consent must be adequately addressed. It should be clear to both the asylum seeker and the case worker that any refusal to engage in the DNA testing process has no bearing on the outcome of the case. The asylum seeker should also be given a clear explanation of the DNA testing process and how it affects their claim before consent is given.

Finally, and perhaps most importantly, asylum vetting processes should give preference to data that is adequate for determining an individual’s right to asylum. As highlighted throughout this analysis, there is a clear difference between ancestral heritage and nationality. Asylum vetting agencies should refrain from using processes which are unreliable. Instead, they should seek to use data which has a clear causal relationship with the individual’s nation of origin.


Despite claims that data-driven services create more ‘objective’ decision-making processes, examples such as the HPPP serve as a reminder of the socio-political nature of data science. Analysing the data ethics considerations of such projects can help us to identify the intended and unintended consequences of data use, and what governance mechanisms can be put into place to ensure more ethical outcomes.

As we’ve seen from an analysis of the HPPP, there are some clear data ethics considerations for the use of DNA and isotope testing in asylum vetting services. These include:

  • The importance of consent in data privacy and how contextual factors (such as the threat of deportation) impact consent mechanisms;
  • The role of data methods in exacerbating inequality in the asylum vetting process: namely how the selection of input variables impacts the reliability of a model, the risk of inferring causal relationships from correlated data, and the importance of addressing false negatives; and
  • The wider socio-cultural impacts that such data-driven processes can have on ideas of identity, ethnicity and nationality.

Finally, it’s important to recognise the value of mitigating factors such as regulatory oversight, the consultation of subject matter experts, better mechanisms of consent and more robust use of data. Further analysis should also investigate alternative methods of verifying nation of provenance, how to build inclusion into the process, and the role of public opinion and policy in technology-driven asylum vetting services.

Only by conducting a thorough data ethics analysis of such technologies can we begin to unpick the ethical considerations of technology solutions which are often shrouded in technical language, and start to design technologies that promote fairness, inclusion, privacy and transparency.




Sophie Taylor

Sophie is a Senior Digital Ethics and Innovation Consultant at Sopra Steria.
