Understanding a genome browser: what are tracks?
I am just starting to learn the jargon biologists use, so looking at a genome browser is intimidating. UCSC (University of California, Santa Cruz) has some very useful video tutorials, but they assume some existing knowledge, so they are not ideal for biology beginners. The following is an attempt to make sense of the language biologists love. Open this genome browser and let’s dig in.
Tracks
Tracks are horizontal layers, each showing a different type of data in a genome browser. A genome browser is sometimes described as a genome “map”, and the comparison is useful: a traditional map built from geospatial data also has “layers” displayed on top of each other. On a traditional map, the different layers/datasets have one thing in common: the location data. Similarly, in a genome map, the reference genome is the one thing that all the tracks have in common.
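To make the layering idea concrete, here is a minimal Python sketch. It assumes a toy model in which each track is a named list of features, and every feature is located by the same reference coordinates (chromosome, start, end). The track names, feature names, and coordinates are invented for illustration; this is not real annotation data or the browser's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    chrom: str   # chromosome name in the reference genome
    start: int   # 0-based start position (inclusive)
    end: int     # end position (exclusive)
    name: str


# Hypothetical tracks: different kinds of data, same coordinate system.
tracks = {
    "genes": [
        Feature("chr1", 100, 500, "geneA"),
        Feature("chr1", 800, 1200, "geneB"),
    ],
    "variants": [
        Feature("chr1", 250, 251, "var1"),
        Feature("chr2", 250, 251, "var2"),
    ],
}


def features_in_window(tracks, chrom, start, end):
    """Return, per track, the names of features overlapping [start, end) on chrom."""
    view = {}
    for track_name, features in tracks.items():
        view[track_name] = [
            f.name
            for f in features
            if f.chrom == chrom and f.start < end and f.end > start
        ]
    return view


# "Zooming" to chr1:200-600 shows what every track has in that window.
view = features_in_window(tracks, "chr1", 200, 600)
print(view)
```

The point of the sketch is that the tracks know nothing about each other. They line up only because each feature is expressed in the coordinates of the same reference genome, much like map layers that line up because they share latitude and longitude.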
Reference Genome
Biologists aim to create a reference genome for each species. Individuals of a species have slightly different genomes. For example, two individuals might have different versions of a gene. So, when creating a reference genome, biologists need to decide which version becomes part of it. Clearly, there is some subjectivity involved, and the chosen version is better thought of as a representative one than as a truly “ideal” one. But the reference genome acts as a standard, so all biologists studying the same species can use the same reference genome. As a consequence, all their insights can be “mapped” to the reference genome and displayed interactively in a genome browser.
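A rough way to picture "mapping" to a reference is to describe each individual only by where they differ from it. The Python sketch below assumes a drastically simplified setting: made-up ten-letter sequences of equal length that differ only by single-letter substitutions. Real genomes also have insertions and deletions and need proper alignment tools, so treat this as an illustration of the idea, not of how biologists actually do it.

```python
# Made-up sequences for illustration only.
reference = "ACGTACGTAC"

individuals = {
    "person_1": "ACGTACGTAC",  # identical to the reference
    "person_2": "ACGAACGTAC",  # differs at position 3
    "person_3": "ACGTACCTAC",  # differs at position 6
}


def differences_from_reference(reference, sequence):
    """List (position, reference_base, observed_base) for every mismatch.

    Positions are 0-based. Assumes both sequences have the same length.
    """
    if len(reference) != len(sequence):
        raise ValueError("this toy comparison needs equal-length sequences")
    return [
        (pos, ref_base, obs_base)
        for pos, (ref_base, obs_base) in enumerate(zip(reference, sequence))
        if ref_base != obs_base
    ]


variants_by_person = {
    name: differences_from_reference(reference, seq)
    for name, seq in individuals.items()
}
print(variants_by_person)
```

Because every difference is recorded as a position on the same reference, results from different people (or different labs) can be placed side by side, which is what makes displaying them as tracks possible.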
Next Steps
- Work through other tutorials for the UCSC Genome Browser, like this YouTube playlist by Genomics Guru.