Using Big Data to Hack Autism

It’s been 10 years since Michael Wigler had a breakthrough revelation in autism genetics — one that arguably launched the field as we know it.

In April 2007, Wigler and his then colleague, Jonathan Sebat, reported that ‘de novo’ mutations, which arise spontaneously instead of being inherited, occur more often in people with autism than in typical people. The mutations they noted were in the form of ‘copy number variants’ (CNVs), deletions or duplications of long stretches of DNA. CNVs crop up frequently in cancer, an earlier focus of Wigler’s work. But his finding that they are also involved in autism came as a surprise to those in the field. “Genetics was striking out with other efforts based on transmission and inheritance,” Wigler says. “In that vacuum, the new idea was quickly embraced.”

The discovery quickly led to further advances. Focusing primarily on de novo mutations, three teams of scientists, including one led by Wigler, began hunting for genes that contribute to autism. Their approach was efficient: Rather than looking at the entire genome, they scoured the 2 percent that encodes proteins, called the exome. And they looked specifically at simplex families, which have a single child with autism and unaffected parents and siblings. The premise was that comparing the exomes of the family members might expose de novo mutations in the child with autism. The approach yielded a bumper crop: Based on data from more than 600 families, the teams together predicted that there are hundreds of autism genes. They identified six as leading candidates. Some of the genes identified at the time, such as CHD8, DYRK1A and SCN2A, quickly became hot areas of research.

In 2014, the number of strong candidates jumped higher. In two massive studies analyzing the sequences of more than 20,000 people, researchers linked 50 genes to autism with high confidence. Wigler’s team looked at simplex families and found rare de novo mutations in 27 genes. In the second study, researchers screened for both inherited and de novo mutations and implicated 33 genes. The two studies identified 10 genes in common.

Two years ago, the tally of autism gene candidates shot up again. Deploying statistical wizardry to combine the data on de novo and inherited mutations, along with CNV data from the Autism Genome Project, researchers pinpointed 65 genes and six CNVs as being key to autism. They also identified 28 genes that they could say with near certainty are ‘autism genes.’

“For so long, we’ve been saying if we could just find these genes, we’d be able to really make some headway,” says Stephan Sanders, assistant professor of psychiatry at the University of California, San Francisco, who co-led the study. “Suddenly, you’ve got this list of 65-plus genes, which we know have a causative role in autism, and as a foundation for going forward, it’s amazing.”

These advances establish beyond doubt that autism is firmly rooted in biology. “More and more, we are erasing this idea of autism being a stigmatizing psychiatric disorder, and I think this is true for the whole of psychiatry,” Sanders says. “These are genetic disorders; this is a consequence of biology, which can be understood, and where traction can be made.”

This is just the start, however. As scientists enter the next chapter of autism genetics, they are figuring out how to build on what they have learned, using better sequencing tools and statistics, bigger datasets and more robust models. For example, they are looking for common variants, each of which is found in more than 1 percent of the population and which may contribute to autism when inherited en masse. And they are also starting to look beyond the exome to the remaining 98 percent of the genome they have largely neglected thus far.

“Most of the genetic advances fall into a category of large-effect-size de novo variants, which is only one piece of the puzzle,” says Daniel Geschwind, professor of human genetics at the University of California, Los Angeles.
