Putting an exact monetary value on an individual’s DNA data may seem as dystopian as it is technically challenging. However, over the past several years, we have begun to see just how much organisations are willing to pay to gain access to an individual’s most valuable piece of personal information. In this blog, we will look at some of these examples and attempt to answer the question:
How much is my DNA really worth?
Genomics has enormous potential to re-shape drug discovery and development by addressing some of the life sciences industry’s biggest issues, moving the industry from ‘blockbusters’ to ‘niche busters’ by leading the drive towards personalised medicine. This industry trend has made genomic data an immensely valuable dataset to pharmaceutical and biotechnology companies, many of which have invested many millions of dollars in gaining access to the wealth of information that this dataset holds.
Perhaps the most notable example is the $300 million deal between pharmaceutical giant GlaxoSmithKline (GSK) and 23andMe in 2018. Many 23andMe customers were surprised to learn that their most personal data had been shared for profit. The deal first highlighted to the public the potential monetary value of their DNA data, as well as raising considerable privacy concerns.
In this 23andMe/GSK deal, $300M for access to 23andMe’s 4 million person database equated to ~$75 per person. This, however, is not the only example.
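The per-person figure is simply the deal value divided by the number of people in the database. As a minimal sketch of that arithmetic (the function name `price_per_unit` is our own illustrative choice, and the inputs are the rounded figures quoted above):

```python
def price_per_unit(deal_value_usd: float, participants: int) -> float:
    """Return the deal value divided by the number of individuals covered."""
    if participants <= 0:
        raise ValueError("participants must be a positive number")
    return deal_value_usd / participants


# Rounded figures quoted above for the 23andMe/GSK deal.
gsk_ppu = price_per_unit(300_000_000, 4_000_000)
print(f"23andMe/GSK: ~${gsk_ppu:,.0f} per person")
```

The same calculation can be applied to any deal where both the headline value and the database size are known.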
Below is a table that outlines many of these types of deals that have taken place over the past several years. These include not only consumer DNA testing companies but also privately and even publicly-owned national DNA sequencing initiatives.
If we were to use just the examples in the table above, we could calculate the average price per unit (PPU), and so the amount that your DNA is really worth, to be ~$3,300. However, doing so would be a rather extreme guesstimate. As you can see, the PPU paid by pharmaceutical companies to gain access to genomic data varies considerably, from as little as $75 to as much as $20,000.
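To see why a simple average is a shaky summary when prices span such a wide range, here is a small sketch using only the four per-person figures quoted in this post ($75, $150, $300 and $20,000). This is a subset of the deals, not the full table, so it is not meant to reproduce the average quoted above. It only illustrates how a single very large value pulls the mean away from what is typical.

```python
from statistics import mean, median

# Per-person prices quoted in this post; a subset of the deals, not the full table.
quoted_ppu_usd = {
    "23andMe/GSK": 75,
    "FinnGen": 150,
    "UK Biobank (exome)": 300,
    "23andMe/Genentech": 20_000,
}

prices = list(quoted_ppu_usd.values())
print(f"mean:   ${mean(prices):,.2f}")
print(f"median: ${median(prices):,.2f}")
```

With these four values, the one $20,000 deal drags the mean far above the median, which is why any single 'average' price should be read with caution.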
Some of the reasons for these extreme differences are explored below:
Genomic data type
As you may know, there is more than one way to ‘read’ DNA from a biological sample (e.g. saliva/blood), each of which generates data of varying value to the life sciences industry.
Genotyping involves reading and identifying individual positions in DNA called single nucleotide polymorphisms (SNPs), which vary within populations and can explain certain traits (e.g. eye colour, inherited disease, etc). Although there are over 660 million SNPs in the human genome, this method generates data that is considered the least valuable to researchers, hence the $75 paid in the 23andMe/GSK deal and $150 in the FinnGen deal. The DECODE/Amgen PPU figure can be considered an outlier here, as it was the first deal of its kind (in 2012) and part of an entire company acquisition.
Exome sequencing involves sequencing all of the protein-coding regions of genes in DNA. These regions are made up of around 30 million base pairs and make up ~1% of the entire genome. This method identifies genetic variants that alter protein sequences, and does so at a much lower cost than whole-genome sequencing. The extra value of this dataset over genotyped data is reflected in the larger $300 PPU of the UK Biobank deal.
This leaves whole-genome sequencing (WGS). This technique sequences all 3 billion base pairs, i.e. ~100% of a genome. Although there is much of the genome we don’t understand yet, WGS gives us the opportunity to change this over time. Therefore, it is by far the most valuable genomic data type as both a research and clinical tool. This is evidenced by the elevated PPU amounts outlined in the 23andMe/Genentech, GMI/WuXi NextCODE and 2019 UK Biobank deals.
In terms of quantity, the logical assumption (as evidenced in the table) is that the larger the genomic database, independent of data type, the more valuable it is. Big data allows researchers to identify meaningful patterns and novel mutations that can be targeted with drug treatment. One example is how researchers were able to use large public databases to identify differences in the molecular signatures of breast tumours in younger compared with older women. The only exception in this table is the 23andMe/Genentech deal, with a $20,000 PPU for just 3,000 individuals, but we’ll address this later on.
Other health data
Having access to genomic data on its own is also of limited use, which is why in all these deals either medical records or self-reported health data were also shared with pharmaceutical companies. This type of data integration allows for identification of genotype-phenotype interactions that deepen our understanding of the role of genetics and genomics in complex human phenotype outcomes. Medical records can be considered more valuable than self-reported health data (e.g. 23andMe), as the latter can suffer from misinformation. For example, an estimated two-thirds of people with self-reported psoriasis likely did not actually have psoriasis (Tsoi et al. 2017). On the other hand, self-reporting can be an effective way to gain wider and useful datasets from study participants, including lifestyle information (e.g. smoking, exercise regime, diet, etc).
Ability to re-contact
From the table above, it appears that a clear value-add of gaining access to genomic databases is the ability to re-contact study participants to gather more information. All the national DNA sequencing initiatives retain some ability to do this, which is reflected in their higher PPU figures. However, the clearest example is the 23andMe/Genentech deal, which explicitly involved re-contacting participants and collecting new information from them. As mentioned previously, this led to a PPU of $20,000.
A key reason national (public and private) DNA sequencing initiatives have been, and continue to be, established around the globe is the value that homogeneous genomic databases have to researchers. People from countries like Iceland, Ireland and Finland have very similar genetic make-ups, meaning more powerful genotype-phenotype associations can be made. The same applies to groups of individuals who suffer from the same condition, which further explains why Genentech paid such a large PPU for access to 3,000 Parkinson’s disease patients.
The genomic data market is still maturing, and quoting an exact dollar value for an individual’s DNA data would remain an educated guess as things currently stand. However, using the examples outlined above, there are certain features that we can say maximise the value of this data.
For an individual to reap the most value from their DNA data, they should have their whole genome sequenced and have it included in a large and homogeneous genomic database with a ‘concentration’ of individuals with specific high-value traits or conditions for pharmaceutical research. In addition, other health and lifestyle data should be integrated to give researchers a richer healthcare dataset, and researchers should be able to re-contact the individual directly to obtain updated data over time.
Privacy and ownership
Aggregating highly sensitive datasets and giving researchers the ability to directly re-contact individuals may supercharge the value of DNA data, but it opens up considerable privacy and security issues from a cybersecurity perspective. Your medical information alone is reportedly worth 10x more than your credit card number on the black market. Adding your entire genomic data, lifestyle data and a direct link to you as an individual is a scary thought, to put it lightly.
In addition, in the current model you, as a contributor to these genomic databases, do not retain any ownership rights and are not entitled to any of the value that sharing your most personal data has helped create. In none of the deals above were individuals remunerated when their data was sold. In the case of 23andMe, individuals actually pay for the service on top of this.
Current initiatives do not have a viable solution to these problems.
Genomes.io allows sensitive user genomic data to be encrypted and securely stored within their own ‘DNA Vault’ to which they can control access from their mobile device. No one is able to access these fully encrypted virtual machines (including our team), until explicit permission is granted by the individual ‘private key’ holder. Even when granting third-party access, raw user data never leaves this secure storage environment, meaning individuals can contribute anonymised snippets of their data to researchers and get paid for doing so, without the possibility of ever being re-identified.
This gives only trusted professionals (e.g. researchers) the ability to gain access to whole genome sequence data, enriched with other health and lifestyle data in a repeat-consent model that enables re-contact (user controls access via mobile), without being able to identify the individual. Our model also provides researchers with a more efficient and ethical genomic data acquisition model, in which they do not have to take liability for security or ownership of the data.
So, although we may not be able to put an exact amount on your DNA data quite yet, Genomes.io technology means you are able to maximise the value of it, protect it and most importantly own it and benefit from its value, whilst responsibly contributing to important scientific and medical research.
We are now backed by a lead investment from TenX Health, a London-based venture capital fund focussed on health technology, and we also want to give our Crowdcube investors an opportunity to invest at this early stage of our exciting venture. To help us on our journey to give 1 billion people control over their most valuable piece of personal information, become an investor today: