Big Data, Genealogy and Genetic Testing Offer Insights To Past And Future

Everyone who knew my grandmother knew she was French. She spoke the language fluently, studied at Le Cordon Bleu in Paris and earned a Master’s Degree in French Literature, an uncommon feat for a woman at the time. You can imagine my surprise when I learned she was half Spanish.

Her identity was French, but her genetics were more nuanced. Knowing relatives tells us one story about where we come from, if we’re lucky enough (or unlucky as the case may be) to have access to them. But this isn’t the same as understanding our origins. This deeper question takes us into genealogy.

Modern genealogists have added the internet and genetics to their records about the past. With technology, people can trace their family tree and uncover genetic insights against the uncertain backdrop of big data.

Technology Makes Tracing the Family Tree More Convenient

From storing records electronically to software programs designed specifically for family history, technology is making it easy for people everywhere to discover their roots. You can even order a genetic test for your dog. Online genealogy forums were crowdsourcing long before websites like Kickstarter and Indiegogo existed.

Genealogy brings interesting questions into the mix of big data. Who owns information about the past? HIPAA protects a person’s health information for 50 years after death. Perhaps as we gather and compile more information, even people who lived well before computers and electricity will participate in big data, too. Combined with records of the living, this would amount to extending an individual’s big data back several generations.

Of course, the availability of genetic testing could be incredibly valuable to adopted children and those who don’t know their medical history. Would big data make it easier to track down long-lost relatives with statistical certainty? There are privacy regulations within the system, but they have been known to change. And we don’t yet understand big data’s effect on open source.

In New York City, government agencies envision data sharing across social services. Social change enthusiasts hope big data can help explain why young people drop out of school. That sounds like a good thing, but what if big data became part of the college application process as well? According to this story from Marketplace, that has already started.

Genetic Testing For the History of Humanity

The human race can trace its roots as well. National Geographic recently launched an effort to give more context to genetic data. They documented the DNA of over half a million people from around the world. Their test, available for $199, takes your story past the last few generations and back tens of thousands of years. Click here to watch a quick video about the project.

What the video doesn’t cover, but the user agreement probably does, is who owns the data. It might not be a big deal to have information stored about who is more Neanderthal than whom. It’s a shame money’s an object. There are a few people I’d like to wager on that one.

Researchers from the Smithsonian used genetics to peel back the layers of one of humanity’s seedier moments: the slave trade. The genomes of modern Caribbean people hold tiny clues to the dates of historic migrations. Because we know the general history of slavery and genocide, the DNA offered specific clues about population movements as well as massive drop-offs. Centuries after the slave trade silenced and dehumanized these people, their DNA lives on, a living record of the past.

In the Moment: The Present and Future

A few years back, several governments started using DNA tests to confirm family relationships in refugee applications. Most countries have programs that allow refugees to join family members who are already established there. Due to cultural differences and simple desperation, people sometimes lie. In 2008, genetic testing halted a family reunification program. You can read the response from the UN Refugee Agency here. In addition to raising issues such as privacy and informed consent, DNA testing in this context ignores a painful reality of most conflicts: rape and orphaned children.

Of all the questions surrounding big data, reliability is a good one to look at. It will affect the data’s value, but only if enough people understand the difference between correlation and causation. Let’s not forget the importance of a random sample when using big data to generalize. It’s so easy to make decisions based on numbers and forget to evaluate the quality of those numbers.