Think Before You Share Your Genome

Do companies deserve a slice of your genetic information? Photo by Chinh Le Duc.

If you live in the United States, the National Institutes of Health (NIH) is questing for your genome.

Their All of Us program looks to collect medical information — including the genomes — of one million Americans. This information will be made available to researchers to search for new drugs and cures for some of the most common diseases, like cancer and heart disease.

The NIH has incredibly strict privacy safeguards. Scientists must apply for access to this data. If their application is approved, they only receive de-identified data (no names or addresses associated), and they must keep the data stored on secure servers.

These restrictions are cumbersome and frustrating, and will likely slow down the pace of research. Some scientists argue that all of the data should be released openly. After all, they posit, people are willing to share vast amounts of information on Facebook in exchange for a nice app that lets them post updates to their friends. Who wouldn’t openly share their genome to help improve medical research?

But many would argue that Facebook’s stance on data sharing hasn’t been entirely positive. Facebook allows for targeted advertising, increases the spread of fake news, and tracks what you do — not just on Facebook itself, but all across the web.

Should you allow your genome to be openly released?

Data sharing in recent years

Think of these nachos as your personal habits, and the hungry hands as advertisers. Photo by Herson Rodriguez.

The question of sharing — what, when, how much — has become an ever-present concern in the last few years. There was the Equifax data breach in 2017, which exposed the personal information of over 143 million Americans. More recently, it came out that Facebook lets anyone search for you by phone number — and if you use 2-factor authentication to “keep your account safe”, you’re searchable and you cannot opt out. Until just recently, the Android TV app allowed any user to see all photos uploaded by other users to Google Photos — even if those pictures were private.

Information has value. With information like a home address and a Social Security number, it’s possible to steal an individual’s identity, open credit accounts, and rack up unauthorized charges. Inadvertently shared photos can let others track your location, see your belongings, or gain blackmail material to use against you. The ability to find someone on a social network raises concerns about stalking.

But are people also sharing their genetics? The answer, it turns out, is yes.

Genetic data sharing is on the rise

More and more Americans are sending out tubes of spit for DNA testing.

Companies like Ancestry and 23andMe have sold millions of these test kits. In exchange for a tube of spit, these companies read hundreds of thousands of points on your genome, looking to see what base (A, T, G, or C) you have at specific positions. Different populations tend to have different bases at these positions, allowing these companies to tell you about where your ancestors originated.
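To make that idea concrete, here is a toy sketch in Python. It is not how any of these companies actually compute ancestry. The populations, positions, and frequencies are all invented for illustration, and it pretends you carry a single base at each position. It simply scores how well a sample's bases fit each population's typical bases and picks the best fit.

```python
import math

# Hypothetical base frequencies at three made-up genome positions.
# Real tests look at hundreds of thousands of positions; these numbers are invented.
BASE_FREQUENCIES = {
    "population_a": {
        "pos1": {"A": 0.9, "G": 0.1},
        "pos2": {"C": 0.8, "T": 0.2},
        "pos3": {"G": 0.7, "T": 0.3},
    },
    "population_b": {
        "pos1": {"A": 0.2, "G": 0.8},
        "pos2": {"C": 0.3, "T": 0.7},
        "pos3": {"G": 0.4, "T": 0.6},
    },
}


def log_likelihood(observed, frequencies):
    """Sum the log-probabilities of the observed bases under one population."""
    total = 0.0
    for position, base in observed.items():
        # A base never seen in this population gets a tiny, non-zero probability.
        total += math.log(frequencies[position].get(base, 1e-6))
    return total


def best_matching_population(observed):
    """Return the population whose frequencies best explain the observed bases."""
    scores = {
        name: log_likelihood(observed, freqs)
        for name, freqs in BASE_FREQUENCIES.items()
    }
    return max(scores, key=scores.get), scores


# A made-up sample: one base read at each of the three positions.
sample = {"pos1": "G", "pos2": "T", "pos3": "T"}
best, scores = best_matching_population(sample)
print(best)
```

With only three positions the answer is a rough guess at best; the more positions are read, the more confident this kind of comparison can become.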

Of course, ancestry isn’t the only information available in your genome. Companies like 23andMe also promise to estimate your risk of developing various diseases — the company recently announced that their tests will estimate the risk of developing diabetes. The company will also tell you about your risk of developing Parkinson’s or Alzheimer’s.

While you were eating leftover Thanksgiving stuffing, Ancestry sold over a million genetic testing kits. Photo by Chelsea Shapouri.

The kits are wildly popular. In 2017, Ancestry reported that it sold more than 1.5 million genetic testing kits, just on the weekend after Thanksgiving. 23andMe offers scientists the opportunity to explore a cohort of more than 5 million samples — for a price.

This is the real business of these genetic testing companies. 23andMe isn’t valued at 2.5 billion dollars because it sells a kit for a hundred bucks to consumers. The real value of 23andMe, Ancestry, and other companies lies in the data that they’ve collected.

Similar to the NIH, these companies claim that when they share your genetic data, it’s shared anonymously, with identifying details removed. And while the companies require an application before granting access to this data, they don’t stop individuals from uploading their genetic information to other websites, like GEDmatch, which offer fewer privacy controls. (This, for example, is how the police caught the Golden State Killer.)

Consider the implications of sharing

Even if you’re not a serial killer, you may still want to think twice about using one of these services — or about pushing for increased sharing and openness when it comes to genetic sequence data.

I’ve written about this before; your genome isn’t as immediately valuable as, say, your credit card information, or your Social Security number. A hacker cannot snatch your genome and use it to open a fraudulent account in your name. Your health insurance company cannot mine your genome to figure out whether they should raise your insurance costs, thanks to the Genetic Information Nondiscrimination Act (GINA) of 2008.

However, it’s worth keeping in mind that GINA stops health insurers from charging more based on genetic information — but it doesn’t cover life or disability insurance. Bills have been proposed to also block companies that provide life insurance or disability insurance from using genetic information to set pricing, but they haven’t yet been passed.

Furthermore, although the NIH promises that any shared information will be “de-identified”, with information like name, address, and phone number removed, researchers have shown that it is often possible to identify a person from their genetic sequence alone. As one report on this research puts it:

For at least the past decade, researchers have demonstrated that by cross-referencing anonymous DNA data with datasets that include personal information, such as voter or census rolls, they can correctly “re-identify” significant portions of participants.
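The cross-referencing described above can be illustrated with a small Python sketch. Everything in it is hypothetical: the records, the names, and the choice of ZIP code, birth year, and sex as the fields that both datasets share. Published re-identification studies use more involved methods, so treat this only as an illustration of why removing names does not guarantee anonymity.

```python
from collections import defaultdict

# Hypothetical "de-identified" research records: no names, but some details remain.
research_records = [
    {"sample_id": "S1", "zip": "02139", "birth_year": 1975, "sex": "F"},
    {"sample_id": "S2", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"sample_id": "S3", "zip": "94110", "birth_year": 1990, "sex": "F"},
]

# Hypothetical public roll that lists names alongside the same details.
voter_roll = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "Brian Example", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "zip": "60614", "birth_year": 1990, "sex": "F"},
]

# Assumed shared fields; real datasets may overlap on different details.
SHARED_FIELDS = ("zip", "birth_year", "sex")


def key_of(record):
    """Build a lookup key from the fields both datasets have in common."""
    return tuple(record[field] for field in SHARED_FIELDS)


def reidentify(records, roll):
    """Map sample IDs to names when exactly one person on the roll matches."""
    names_by_key = defaultdict(list)
    for person in roll:
        names_by_key[key_of(person)].append(person["name"])

    matches = {}
    for record in records:
        candidates = names_by_key.get(key_of(record), [])
        if len(candidates) == 1:  # an unambiguous match puts a name on the sample
            matches[record["sample_id"]] = candidates[0]
    return matches


reidentified = reidentify(research_records, voter_roll)
print(reidentified)
```

In this toy data only one sample matches a single name. The second sample fits two people on the roll and the third fits nobody, so neither is named. Every extra shared detail shrinks the pool of candidates.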

The All of Us project from the NIH will be a valuable, powerful dataset for researchers. Yes, it forces researchers to jump through hoops to obtain access to that data, which may result in slightly slower progress for medical advancements. However, this inconvenience shouldn’t be enough reason to remove all privacy restrictions and release the data openly. Unlike an address, phone number, or credit card information, a genome can never be changed: once it’s public, it can’t be pulled back.