Bioinformatics.

Michael Hall
Published in PhD Files
4 min read · Aug 16, 2017

The path you should consider if you love technology.

The question I dread the most is “So what do you do?”

My reason for dreading this perfectly valid question, though, is not what you would expect. Generally, the conversation plays out something like this…

Person: “So what do you do?”
Me: “I’m doing my Masters at the moment.”
Me (to myself): “Please don’t ask me what I am doing my Masters in!”
Person: “Oh cool. What are you doing your Masters in?”
Me: “Ahhhh, Bioinformatics?”

Then one of two things happens…

Person: “Oh cool.” (Eyes glaze over)

or

Person: “What is that? It sounds hard.”
Me: “It’s like a mixture between Biology, Computer Science, and Statistics.”
Person: “Oh cool.” (Eyes glaze over)

Understandably, but sadly, this is a common occurrence for people in my field. The reason I say understandably is that the name bioinformatics sounds complicated.
You have the “bio” part, which immediately makes people think

“Biology. Shit I hated that at school!”

And then there’s “informatics”. Most people probably don’t have anything they immediately associate this word with. As such, their brain starts reeling, which explains the glazed eyes or the “that sounds hard!” comment.

I totally understand this. When I first heard of bioinformatics, my brain did that thing it does when I read hard-to-pronounce names such as Hermione: a weird mumbling noise that resembles someone speaking with their mouth full.

I hope to set the record straight here and explain what bioinformatics is and why people should care. If you’re new to science and technology and want to find an area where you can work on complex problems that have direct impact, this is the field for you!

The term “bioinformatics” was coined back in the 1970s by Paulien Hogeweg and Ben Hesper, and its definition has changed over time since then (you can read an essay about the history of the term here). As the field stands today, the most concise definition I have come across is:

“The science of collecting and analysing complex biological data such as genetic codes.”

Let’s break down the concise definition and see what bioinformatics is.

The collection of biological data refers, in many cases, to a bioinformatician working with a wet-lab scientist to try to answer a question about biology. Wet-lab scientists work in an actual laboratory doing experiments with cells, bacteria, mice, etc., and generate data. A dry-lab scientist is someone who works on that data from a computer. Broadly speaking, bioinformaticians do not generate their own data in the lab but have close collaborators who do, although some certainly straddle both wet and dry labs. This data can take many forms. Genetic codes, as mentioned in the definition, are obtained by sequencing the DNA of an organism. There is also data in the form of fluorescence microscope images, or measurements of how X-rays bounce off proteins, which are used to determine protein structure.
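To make “genetic codes” a little more concrete, here is a minimal sketch of what sequence data can look like once it lands on a bioinformatician’s computer. It assumes the data arrives as plain text in the common FASTA layout (a header line starting with “>”, followed by lines of bases). The records are made up purely for illustration, and the helper names `read_fasta` and `gc_content` are my own, not part of any library.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header: emit the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            # Sequence lines may be wrapped, so collect and join them later.
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def gc_content(sequence):
    """Return the fraction of bases that are G or C (0.0 if empty)."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence) / len(sequence)


# Toy records invented for this example; real files hold far longer sequences.
EXAMPLE_FASTA = """>toy_sequence_1
ATGCGC
GGAT
>toy_sequence_2
ATATATAT
"""

records = list(read_fasta(StringIO(EXAMPLE_FASTA)))
for name, seq in records:
    print(name, len(seq), round(gc_content(seq), 2))
```

GC content is about the simplest summary you can compute from a sequence. Real projects would normally lean on an established library rather than a hand-rolled parser like this one.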

The collection of this data is not as simple as “Here’s a file. Can you analyse it?” Bioinformaticians need a solid understanding of biology in order to know what the data represents. How was it gathered? Could this method of gathering introduce biases? What kind of information are you expecting to get from the data, given the conditions surrounding its generation?

This close connection with the wet-lab scientists creating the data is a lot of fun. You are constantly learning and being humbled. It gives you a greater appreciation of the hard work that goes into that file sitting on your computer screen and, for me, it makes me want to return the effort by wringing as much information out of it as I can to make their work worthwhile.

The analysis of this data is intellectually addictive. I often find myself losing all track of time as I bury myself in the challenge of understanding how all this information explains biology. And that’s the best bit. In analysing data in bioinformatics, you are helping to explain another piece of the puzzle that is biology! I want you to read that again, pause for a moment, and contemplate.

You are helping to explain another piece of the puzzle that is biology.

I won’t go into details here about the analysis of biological data, except to say that bioinformatics is very diverse and includes approaches from quantitative sciences such as computer science, mathematics, and physics, to name a few. At the end of the day, though, the central goal is increasing our understanding of biology.

If you want to know more about the specifics of analysis, stay tuned, as I will be writing future posts that address this.

I’m sure many people in the field will have their complaints about this definition, but you get that when trying to define just about anything in science. The definition is not important; what is important is encouraging people to get involved and excited about the field. I hope I have been able to whet your appetite a little here.

Bioinformatics has a sub-field for basically anything in the world of biology that you can gather data on. If you like the idea of algorithm development, you can work on more efficient ways of predicting protein structure. Or if you prefer machine learning, you can apply the latest image recognition techniques to classifying images from microscopes.

Whether you’re a university student trying to decide what classes to take, a tech industry engineer looking for something that will contribute to society in a more meaningful way, or a biologist who loves the challenges of biology but not the lab work, you’ll find something to ignite your passion in bioinformatics.


PhD Student in Bioinformatics — University of Cambridge and EMBL-EBI.