Sidebar: What’s a Gene?

Andrew Evans
Published in Hack My Cancer
Feb 16, 2016

I’m going to write a few posts on common terms that we hear all the time but may not know much about or have given much thought. So here we go with the first one: the gene.

When you delve into the details of your particular cancer, or get genetic testing results, or hear about promising new therapies, there is usually a lot of talk about genes — usually with cryptic alphabet-soup names, like BRAF, MEK, BRCA2, or BAP1. Ever wonder what that’s all about? Well, read on…

In Cancer Biology for Patients — Episode 1, we started to talk a bit about how our genome is organized — all the DNA that forms the instructions for making you resides in 46 long stringy molecules (23 pairs of them, actually) called chromosomes (we’ll do another Sidebar just on those — they’re interesting too). Taken all together, these 46 molecules house 6 billion “letters” of your DNA code — two copies — one copy from mom, another from dad.

We also talked about how there are four “letters” in the code (A, C, G and T) and how the chromosome is just a long string of these. This is a key point about DNA: it’s a chemical representation of information. That’s 6 billion letters of information (most of which is duplicated in the two copies). But as of right now, we don’t know what it all means. In fact, we currently believe that the majority of it might actually just be gobbledygook. But interspersed among long stretches of nonsense, there are segments of the DNA that we know to contain useful information. These are the genes.
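To make the “information” idea a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is my own illustration, not a precise measurement: it assumes each position is simply one of four letters (so it takes 2 bits to store) and uses the round 6 billion figure from above.

```python
import math

LETTERS = "ACGT"
GENOME_LETTERS = 6_000_000_000  # round figure used in this post (both copies)

# With four possible letters, each position needs log2(4) = 2 bits.
bits_per_letter = math.log2(len(LETTERS))

total_bits = GENOME_LETTERS * bits_per_letter
total_gigabytes = total_bits / 8 / 1e9  # 8 bits per byte, 1e9 bytes per GB

print(f"{bits_per_letter:.0f} bits per letter")
print(f"about {total_gigabytes:.1f} GB for all {GENOME_LETTERS:,} letters")
```

Under those assumptions the raw text comes out to roughly 1.5 GB, which would fit on a cheap USB stick. Storing it is the easy part; working out what it means is the hard part.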

Genes carry the instructions for making proteins, which are the spare parts that the machinery of our cells is made of. Through decades of spectacular biochemical hacking and study, we’ve catalogued a lot of stuff in the human genome. Currently, we believe there are about 20,000 genes that are active and useful in the genome. That sounds like a lot, doesn’t it? Well, don’t get too cocky, because tomatoes have over 31,000 genes, and wheat has over 90,000!

We’ve also found that there is a lot of evolutionary “junk” in our genetic attic. Sprinkled around all that gobbledygook are old, broken genes that are not activated (their letter sequence does not attract the right machinery in the cell), but we think they are kept around as “raw material” for evolution, so we’re really genetic hoarders. “I know this thing is broken, but I’ll just throw it on the pile over here with all the other broken ones. Never know if it might come in handy 40,000 years from now!” Reactivation of these dormant genes can also happen in some cancers: they can get “un-broken” and cause havoc.

During all of this study, researchers of course had to come up with ways to name the genes so that they could refer to them later. With 20,000 genes, that’s not a trivial task. In the heady early days of genetic study, when much research was carried out with fruit flies, no naming conventions had been established, so researchers just called genes whatever they felt like. They had some pretty random stuff on their minds, apparently: “sonic hedgehog,” “faint sausage,” “tribbles,” and “cheap date,” to name a few. Because some of these genes have lasted in similar forms down through evolutionary time, we have versions of some of them too, and at first it seemed natural to carry these names over to the human versions as well, for consistency if nothing else. That seems cute until you have to explain to a grieving parent that their child has a bad copy of the “sonic hedgehog” gene that’s causing severe brain defects.

So now researchers are a bit more restrained with gene naming, and they use cryptic abbreviations instead (the human version of “sonic hedgehog” is now just called SHH, for instance). These abbreviations are referred to as gene symbols (kind of like stock ticker symbols). They are usually a few letters long and sometimes end in a number, usually to denote closely related or associated genes, like BRCA1 and BRCA2 (both of which are early-onset breast cancer-related genes). Sometimes the names relate to disease association (as with BRCA1 and BRCA2: BReast CAncer), and other times they are just an abbreviation for the protein product the gene codes for (like MAPK, which is the gene for a protein called Mitogen-Activated Protein Kinase). Other times they’re really cryptic or refer to historic names like “sonic hedgehog.” Sometimes they even refer to relationships to other genes, like the BAP1 gene, which is short for “BRCA1-associated protein 1.” So it’s kind of all over the map.

So now when you read genetic test results or hear about genes being targeted by a certain drug, Google the gene symbols and see what they actually mean!

Why so much interest in individual genes in genetic tests or as drug targets? Well, as we discussed above, genes are the instructions for making important spare parts for cells. What happens if those instructions get garbled? It depends. Some errors that show up in genes don’t do anything bad at all. In fact, they don’t even change the final protein product the cell makes (these are called “silent mutations”). Others change the protein that’s made, but not in ways that seem to make a material difference to how the protein does its job. Still others change the way the protein works, sometimes slightly and sometimes dramatically. Sometimes the changes disable the protein entirely. Mutations that alter or disable a protein are, of course, the ones most likely to make bad things happen, and in fact they feature prominently in most cancers, as we shall see.
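Here is a small Python sketch of those outcomes. It leans on a detail not covered above: the cell reads a gene three letters at a time, and each triplet (a “codon”) stands for one protein building block (an amino acid) or a stop signal. The table below is only a tiny slice of the real 64-entry codon table, and the 9-letter “gene” and the helper names `translate` and `mutate` are made up purely for illustration.

```python
# A tiny slice of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "ATG": "Met",
    "GAA": "Glu", "GAG": "Glu",   # two spellings, same amino acid
    "GTG": "Val", "GTA": "Val",
    "TAA": "STOP", "TAG": "STOP",
}

def translate(dna):
    """Read the DNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

def mutate(dna, position, new_letter):
    """Return a copy of the DNA with one letter swapped (0-based position)."""
    return dna[:position] + new_letter + dna[position + 1:]

original = "ATGGAAGTG"                  # hypothetical toy gene: Met-Glu-Val
silent   = mutate(original, 5, "G")     # GAA -> GAG, still Glu
altered  = mutate(original, 4, "T")     # GAA -> GTA, Glu becomes Val
disabled = mutate(original, 3, "T")     # GAA -> TAA, an early stop signal

for label, dna in [("original", original), ("silent", silent),
                   ("altered", altered), ("disabled", disabled)]:
    print(f"{label:9} {dna}  ->  {'-'.join(translate(dna))}")
```

Each of the three mutations changes exactly one letter, yet the first leaves the protein untouched, the second swaps one building block, and the third cuts the protein short after its first building block. Whether a real one-block swap matters depends on the protein and on where in it the change lands.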

So to recap:

  • Genes are stretches of DNA code on our chromosomes that contain instructions for making important parts
  • We have about 20,000 genes (wheat puts us to shame though)
  • Most of the genome, however, isn’t active genes
  • There are old broken genes lying around that can provide fuel for evolution but also can cause trouble
  • Scientists have weird taste in gene names (clinical geneticists are not amused)
  • Human genes now have abbreviations like stock ticker symbols

Andrew Evans
Hack My Cancer

Bioinformatician, startup cofounder, American expat in Europe, cancer survivor