How do scientists find genes that cause diseases?

Charlotte Guzzo
Published in Sano Genetics
3 min read · Jan 23, 2018

We introduce Genome-Wide Association Studies (GWAS)

We often come across reports linking genetic markers to a certain disease. What are these markers, and how is their relationship with disease determined?

Genetic variation in the population

On average, a person shares 99.5% of their DNA with any other human being. The remaining genetic variation occurs naturally in the population, most often in the form of single nucleotide polymorphisms (SNPs). Human DNA is essentially like a book written with four letters, A, G, T and C, combined in different ways to spell a set of words. Some of these words are genes, each with a unique position and function. Others are regulatory regions, which control the genes themselves. A SNP is a position where one of the letters in either type of word differs in some people compared to the rest. Occasionally, this sort of genetic variation increases (or decreases) the risk of developing a certain disease or trait. For example, people carrying the letter ‘C’ at the SNP rs2237892 were reported to have an increased susceptibility to type 2 diabetes compared to people with a different letter at that same DNA position, which makes this SNP a genetic marker of diabetes.
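To make the book analogy concrete, here is a minimal Python sketch that compares the same stretch of DNA in several people and reports the positions where the letters differ. The sequences and the names `sequences` and `find_variable_positions` are invented for illustration; real studies work with millions of positions and dedicated tools.

```python
from collections import Counter

# Made-up, aligned DNA snippets from four hypothetical people.
sequences = {
    "person_1": "ATGCCGTA",
    "person_2": "ATGCCGTA",
    "person_3": "ATGTCGTA",
    "person_4": "ATGCCGTA",
}

def find_variable_positions(sequences):
    """Return {position: letter counts} for positions where people differ."""
    variable = {}
    # zip(*...) walks through the sequences one position (column) at a time.
    for position, letters in enumerate(zip(*sequences.values())):
        counts = Counter(letters)
        if len(counts) > 1:  # more than one letter seen at this position
            variable[position] = counts
    return variable

for position, counts in find_variable_positions(sequences).items():
    print(f"position {position}: {dict(counts)}")
```

Positions are counted from zero, so the single differing letter in this toy data sits at position 3.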

Genome-wide Association Studies

SNP-disease associations such as the one above are identified by genome-wide association studies (GWAS). Here, an extensive set of SNPs spread throughout the human genome is profiled in the DNA of a large number of people with a certain disease or trait, such as diabetes, as well as in a group of people without it. The two groups are preferably matched in age, sex and geographical location, as differences in these factors can distort the analysis. If a specific SNP version is markedly more common in one group than in the other, in this case the ‘C’ version, or C allele, of rs2237892 in people with diabetes, it is said to be associated with an increased risk of getting the disease. It is certainly a ‘the more the merrier’ situation: the more people included in the study, the greater the chance that the genetic association is real. Large numbers help to average out differences in disease onset that are influenced by external factors such as exercise or diet, or even by other genes. It is worth noting that not everyone with type 2 diabetes carries the C allele of rs2237892, and not every person with the C allele develops diabetes. However, carrying it increases your chances of developing the disease compared to other people.
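The comparison between the two groups can be sketched for a single SNP as a simple test on allele counts. The example below is a simplified illustration, not the method of any particular study: it assumes a plain chi-square test on a 2x2 table of allele counts, the counts are made up, and the function name `allelic_association` is hypothetical. Real GWAS pipelines repeat a test like this for every SNP and usually adjust for factors such as age, sex and ancestry.

```python
import math

def allelic_association(case_risk, case_other, control_risk, control_other):
    """Compare risk-allele counts between cases and controls for one SNP.

    Returns (odds_ratio, chi2, p_value) from a 2x2 chi-square test
    with one degree of freedom and no continuity correction.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    # Odds of carrying the risk allele in cases divided by the odds in controls.
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic for a 2x2 table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Made-up allele counts; each person contributes two alleles.
odds_ratio, chi2, p_value = allelic_association(
    case_risk=1300, case_other=700, control_risk=1200, control_other=800
)
print(f"odds ratio = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

An odds ratio above 1 means the allele is more common in the disease group. Dividing every count by ten leaves the odds ratio unchanged but shrinks the chi-square statistic tenfold, which is why larger studies give more convincing associations.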

A Manhattan plot of a GWAS for eye colour
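A Manhattan plot shows one point per SNP, placed by its position in the genome, with height equal to -log10 of its p-value, so the strongest associations rise like skyscrapers. The sketch below computes those heights for a few made-up results. The SNP names, positions and p-values are invented, `manhattan_points` is a hypothetical helper, and the cutoff of 5e-8 is the conventional genome-wide significance threshold, assumed here rather than taken from the article.

```python
import math

# Conventional genome-wide significance cutoff (an assumption in this sketch).
GENOME_WIDE_THRESHOLD = 5e-8

# Made-up results: (SNP name, chromosome, position, p-value).
results = [
    ("snp_a", 1, 1_200_000, 0.21),
    ("snp_b", 2, 5_400_000, 3e-4),
    ("snp_c", 15, 28_000_000, 1e-30),
    ("snp_d", 15, 28_100_000, 4e-9),
]

def manhattan_points(results, threshold=GENOME_WIDE_THRESHOLD):
    """Return (name, chromosome, position, height, significant) per SNP."""
    points = []
    for name, chromosome, position, p_value in results:
        height = -math.log10(p_value)  # smaller p-value, taller point
        points.append((name, chromosome, position, height, p_value < threshold))
    return points

for name, chromosome, position, height, significant in manhattan_points(results):
    flag = "significant" if significant else "not significant"
    print(f"{name} chr{chromosome}:{position} height={height:.1f} ({flag})")
```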

How can this information be used?

GWAS are very useful in predicting predispositions to disease and may lead to novel therapeutic avenues as well as a better understanding of the disease itself. They are also effective in identifying genetic markers for various other traits, such as height, hair and eye colour, and even behaviours such as a tendency to smoke. Companies have used such findings to build personalised health and ancestry profiles for consumers.

However, this information should be interpreted with care. As these studies are correlative, they do not necessarily identify the SNP that directly contributes to disease risk. Scientists are therefore keen to take the studies further and investigate the reasons behind SNP-disease associations. In the diabetes study, it was also reported that the SNP lies in a gene, KCNQ1, that could be involved in insulin secretion, which may explain its role in diabetes. Researchers are also delving into more complex associations: a recent study, for example, identified multiple genes that act together, through several SNPs, to affect human insulin resistance.

To conclude, decades of work have established that our genetics contribute to every aspect of our life, including disease. What is now being uncovered is the complexity of these associations, as well as the impact of non-genetic factors, and the potential of these discoveries.
