Science, Data, and Data Science

It’s hard to appreciate, but the rate of scientific and technological progression of the human race is increasing at an amazing, almost frightening pace. The computing power that sent us to the moon in the sixties can now fit comfortably in our pockets. My father was a computer enthusiast, and taught me how to build a computer when I was but a wee youth. In those days, half a gig of RAM was considered exorbitant; what could possibly need that much computing power? Now, of course, such numbers are considered quaint at best, if not laughable. Twenty years ago, I would never have guessed the world as it exists today, with nearly the entire breadth of human knowledge available at the tips of our fingers and on our phones. And I believe that looking back in twenty years, I will have a similar feeling.

This progress is also evident in medicine and health. There are many metrics we can use to evaluate this, but a simple one is global life expectancy.

Looking at the past two hundred years, world life expectancy has jumped from below 30 years to over 70 years, more than doubling. Modern medicine has improved by leaps and bounds as our understanding of biology and chemistry has grown, along with our understanding of public health, sanitation, and other related factors.

As an undergraduate, I received my degree in biological engineering, specifically with a focus on biomedical engineering. This is the field of drug development, medical devices, and improved imaging and diagnostics. The reason I was drawn to this field was my fascination with the human body. Despite all the technological advances I just mentioned, the human body remains, in my opinion, a more advanced machine than anything we have built ourselves. The human brain (very loosely estimated to operate on the order of exaFLOPs, although applying machine performance metrics to wetware is spurious) is more efficient and more adaptable than the best supercomputers, and performs orders of magnitude better. The human body performs a myriad of functions, such as self-repair, detection, and motion, at a level far beyond the machines we invent that are dedicated to just one of these tasks.

Our understanding of the human body is still very much in its initial stages, as much remains unknown or mysterious to us. Biomedical engineering has only recently emerged as a standalone field, arising as an interdisciplinary focus combining engineering and medicine (…somewhat obviously). One of the largest areas of overlap is the application of data science to the information being gathered by biologists and chemists in wet labs. The amount of raw data being generated today is staggering.

One of the best examples of this is the completion of the Human Genome Project in 2003. One of the largest undertakings in biological research, spanning thirteen years of effort and cooperation among labs across six countries, it successfully sequenced the human genome. Looking strictly at the information generated by that project, the human genome could fit on somewhere in the range of 3 to 4 DVDs (which is fascinating in its own right, as DNA itself is capable of storing petabytes of information). Today, modern genomics labs can generate terabytes of information each day from experiments, as the cost of sequencing decreases and technology improves.

However, all of this information does nothing unless it is properly interpreted and applied. One of the biggest areas of research is epigenetics: the study of how the genome is expressed, and how this expression changes over time or in response to different situations. Essentially, this is how the body can respond to different stimuli or stages of development. The genes you express as a baby will be drastically different from those you express as an adult. Understanding how and why these changes occur may help reverse aging, or facilitate the regrowth of damaged nerves. Finding better therapies for genetic diseases such as cancers is also dependent on our understanding of epigenetics. However, the number of variables to consider for even a single gene is staggering, and a gene acting in isolation is rarely found in the body anyway. Genes are often parts of interconnected networks that affect one another. Analyzing these relationships and dependencies is no easy feat, but with monumental effort, we are making progress.

This is where data science and its techniques have proven invaluable. Wading through the endless seas of numbers to pick out the key variables and correlations is essential to turning raw data into translatable, real-world applications. When we can definitively identify three genes as being critical to cancer development, then we can create the drug to shut those off (although, of course, that is easier said than done!). But isolating those three genes from the hundreds or thousands tested has become a major obstacle that needs to be overcome. As data analytics, and the technology to support it, have improved, this task has become more feasible each day, leading to new discoveries.
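To make the idea of picking a few key genes out of hundreds a little more concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the data is synthetic, the gene names (gene_7, gene_42, gene_311) and the helper functions (pearson, top_genes, make_synthetic) are hypothetical, and ranking genes by simple correlation is a stand-in for the far more careful statistics a real study would need.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def top_genes(expression, outcome, k=3):
    """Rank genes by |correlation| with the outcome and return the top k names.

    expression: dict mapping gene name -> list of values, one per sample.
    outcome:    list of values, one per sample.
    """
    scores = {gene: abs(pearson(values, outcome))
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def make_synthetic(n_samples=200, n_genes=500, seed=0):
    """Build fake data in which three (hypothetical) genes drive the outcome."""
    rng = random.Random(seed)
    expression = {
        f"gene_{i}": [rng.gauss(0, 1) for _ in range(n_samples)]
        for i in range(n_genes)
    }
    drivers = ["gene_7", "gene_42", "gene_311"]  # planted signal, assumption
    outcome = [
        sum(expression[g][s] for g in drivers) + rng.gauss(0, 0.5)
        for s in range(n_samples)
    ]
    return expression, outcome, drivers
```

Calling make_synthetic() and then top_genes(expression, outcome) should put the three planted genes at the top, but only because the signal was planted there. Real expression data is noisier, genes interact with one another, and correlation alone does not show that a gene drives a disease.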

Genetics is not the only area of medicine that is benefiting from the application of data science. Hospitals, for example, are in the process of transitioning from paper to electronic records. A major benefit of this, besides hopefully saving a forest’s worth of trees, is consolidating each individual’s medical history in one place that is easy to access. By applying analytics to a patient’s medical history and lifestyle, medical care can be personalized to best fit individual needs. Combining all of this information can also allow for analysis of the population as a whole. This is especially relevant today, as the world is in the grip of a pandemic. Who is most at risk of COVID-19? What factors can affect this? Our ability to answer these questions has never been better, thanks to advances in data science.

As someone who was educated in biology and is in the process of learning data science, I see vast opportunities. I decided to leave the academic environment in part so that I could be in a place where my efforts would lead to real-world impact. “Big Data” is becoming a part of nearly every aspect of every business and effort, and as I outlined, medicine and biology are no exception. There are so many opportunities to use both of my areas of expertise, whether in a hospital tailoring treatments to patients, or at a pharmaceutical company developing new and improved drugs.

I believe that the increase in life expectancy that we’ve seen in the past 200 years won’t be stopping soon. As I said, I can’t imagine what the world will look like in 20 years, but I do believe that our rate of progress in medicine will continue to accelerate. And life expectancy is but one focus area. Improving quality of life and making advanced health care more accessible in parts of the world that do not currently enjoy such privileges is becoming more of a concern for many. One of the labs I had the opportunity to interact with was developing diagnostic and testing equipment that could operate remotely, and that could be brought to people who might not be able to travel to hospitals or afford the associated costs. With the power of the internet and this technology, these tests could help reach the people most at risk of diseases such as malaria or tuberculosis.

As I look to my own future, I hope that I can be involved in endeavors such as these. With the skill set I am developing, I believe I am placing myself in a position where I can meaningfully contribute to improving people’s health. While my personal impact may be small in the scheme of the human race as a whole, it is still a part of that effort, and it is why I decided to study biology those many years ago. I don’t know where I’ll be specifically in 20 years, but I know that if I’m in such a position, it will be enough for me.

Thanks for reading!

ciao