How Does Whole-Genome Sequencing Help Me Resolve Epidemiological Problems?
Hello, I’m Rou-Ya Bai, an undergraduate on a research adventure at National Tsing Hua University. No exaggeration, I’m pretty fortunate to be living in this era, where whole-genome sequencing(WGS) is widely used for extensive purposes in the biological field.
Most of you probably have heard of the term, WGS, when skimming the latest news, or academic papers related to cancer prediction or predicting genetic disorders.
Have you ever known that the potential of WGS can be used to infer the transmission of a communicable disease? I’m not sure whether it is beyond your imagination. For me, it truly was before I stepped into the current lab.
Owing to the decrease in the cost of whole-genome sequencing, research requiring whole-genome data has become readily available to be conducted in recent decades.
.
1. Why does WGS matter in the genomic field?
It is a huge progress that every researcher in the bioinformatic field is eager to witness because it means the information they are hungry to explore underlying human genes is getting more and more complete. For genetic epidemiologists (if I count), we are no longer forced to choose a certain segment of genes as the target due to devices with limited calculating power. However, it’s not always “complete” to some extent.
.
2. Whole-genome datatype is not always “whole”.
When trying to use whole-genome to conduct research, regardless of humans or pathogens, we often have to “mask” or “remove” some segments of sequences highly repeated, and unreliable to make the results clear and accurate. Therefore, remember when people say “whole-genome” datatype, it is not always “whole” as you read. It’s quite literally “whole” after they process the sequences.
Take my research as an example, what I’m doing is inferring the level of transmission of tuberculosis (TB) over the years in Taiwan. It is about 4.4M base pairs that tuberculosis owns, which is nearly 145 times longer than COVID-19 does. Such a long sequence has some repeat regions affecting the results. So, I would mask these regions when doing the bioinformatic process, for example, alignment.
.
3. Why do I prefer the method of whole-genome sequencing?
Biologists say DNA has lots of information. Okay, that sounds reasonable for any of us having biology class before college. But, what exactly that information is? If you’re a student interested in Life Science, what would you want to know from genes before writing a proposal?
Here are my opinions. First of all, within an individual, his or her DNA might underly the information about inherited diseases or mutations causing a tumor to grow. When it comes to a population, the information can be expanded to see not only the incidence of specific diseases but also the genetic relationship with one another. This concept can be applied to infer the relationship between TB among patients.
Try to conjure up what I just said. If I only use a specific segment of the TB sequence, called “genotyping”, and compare it to "WGS”, what will happen? It represents that I only have the result about the relationship of “this segment”, not that of “TB”. You may imagine how much bias it would create to affect our results.
.
All in all, whole-genome data is necessary for me to infer the transmission of TB cases, even using the current sample to estimate the “time” of the outbreak of TB.
Nonsense?
Let me convince you next time.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Hi, I’m Rou-Ya Bai, an undergraduate researcher at National Tsing Hua University. If you have any problems with my research, feel free to ask.
Email: zionoahfb@gmail.com