How to Analyse Your Own DNA: A Personal Experience

Several months before my PhD defence


I had two reasons for analysing my DNA. First, I wanted to find out whether the cases of ovarian cancer or pre-cancerous conditions that occurred in all my close female relatives after the age of 40 are hereditary or sporadic. Second, plain curiosity: until then I had been like a shoemaker without shoes =)

Choice of analysis

Before deciding which company to give our money to, we need to find out which method it will use for DNA testing. I cannot skip an introduction here, so let’s start with the basics.

Picture from wikipedia: how DNA is organised in our cells
Set of chromosomes that humans have
Genomic variation (picture from “Analysis of genetic variation and potential applications in genome-scale metabolic modeling”, Cardoso et al.). Short variants are shown in panel A, long variants in panels B and C. “Wild-type” here denotes the “reference” healthy human genome.
Our genome: can you still read the initial message? A space here represents a gap. There is no actual gap in our DNA, but it is a convenient way to show two genomic pieces side by side. Placing the two lines so as to maximize the similarity between the words, inserting spaces where needed, is called alignment (or mapping).
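To make the idea of alignment a bit more concrete, here is a minimal sketch of a classic global alignment (the Needleman-Wunsch approach) applied to two toy phrases. The phrases, the function name `align` and the scoring values (+1 for a match, -1 for a mismatch or a gap) are my assumptions for illustration only; real read mappers are far more sophisticated. Gaps are printed as "-" rather than spaces so they stay visible.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # letter against letter
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Walk back from the bottom-right corner to recover one best alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Toy "reference" and "sample" phrases (hypothetical, not real DNA)
reference = "the quick brown fox"
sample = "the quik brown fax"

aligned_ref, aligned_sample, best = align(reference, sample)
print(aligned_ref)
print(aligned_sample)
print("score:", best)
```

The two printed lines have the same length, and the "-" marks the place where a letter is missing from the sample relative to the reference.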
  • microarrays: they analyse only hundreds of thousands of DNA letters, but these letters are not randomly selected from the human genome. Most of them are meaningful and may have consequences. For example, we may design an array so that it detects the change from “b” to “d” in the word “brown”, which turns it into “drown”. A huge change, isn’t it? Let our array also detect the change from “fox” to “fax”, but not from “fox” to “fix”: the array is simply not designed to check this change. This is a good moment to say that some changes in the genome do not change anything for the organism. If, for example, we misspell “brown” as “brawn”, we can still understand the message, so this variant is considered “benign”. The idea of detecting only those genomic changes that lead to phenotypic changes therefore makes sense;
  • targeted short-read sequencing (NGS gene panel or whole-exome): the actual sequence of your DNA is determined for several genes or for all of them (whole exome). For example, suppose we know that the words “fox” and “brown” are responsible for hereditary cough. We design a panel that captures only these two genes and reads only them (in jargon, we “sequence only these genes”, which means “determine their sequence”). So if “fox” changes to “fax”, “pox” or “ox”, we will know it (unlike with microarrays). Whole-exome usually means that we sequence the whole original phrase, skipping the abracadabra in between, since we have no idea what it means anyway;
  • whole-genome short-read sequencing (NGS): you read the whole set of texts that make up your DNA. Of course, the more DNA you read, the more you pay. Why would you do a whole-genome analysis then? In a lot of cases the abracadabra between the words actually has a meaning, and in some cases we can even figure out what it is;
  • long-read single-molecule sequencing (third-generation sequencing): short-read NGS-based methods have one disadvantage. The machine that reads the DNA can read only short pieces of it, 100–300 letters, and this seriously complicates the discovery of large variants. Long reads address that problem. On the other hand, NGS-based methods detect short variants much more accurately. So, in my opinion, long-read sequencing is not directly useful for direct-to-consumer genetic tests;
  • hybrid approaches (e.g. short-read and long-read sequencing of the same DNA) have a lot of advantages and one drawback: price.
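The difference between a microarray and targeted sequencing from the list above can be sketched in a few lines of Python. This is only a toy model built on the “brown”/“fox” analogy: the reference and sample phrases, the probe set and the function names are all made up for illustration and do not correspond to any real array or gene panel.

```python
# Toy "genome": each word plays the role of a gene (hypothetical data)
REFERENCE = ["the", "quick", "brown", "fox"]
SAMPLE = ["the", "quack", "drown", "fix"]

# A microarray only checks specific, pre-designed changes
ARRAY_PROBES = {("brown", "drown"), ("fox", "fax")}

# A gene panel reads the full sequence of the selected genes only
PANEL_GENES = {"brown", "fox"}


def microarray_calls(reference, sample, probes):
    """Report only the changes the array was designed to detect."""
    return [(r, s) for r, s in zip(reference, sample) if (r, s) in probes]


def sequencing_calls(reference, sample, targets):
    """Report any change inside the targeted genes."""
    return [(r, s) for r, s in zip(reference, sample) if r in targets and r != s]


array_hits = microarray_calls(REFERENCE, SAMPLE, ARRAY_PROBES)
panel_hits = sequencing_calls(REFERENCE, SAMPLE, PANEL_GENES)
# "Whole-exome" in this toy model: every word is a target
exome_hits = sequencing_calls(REFERENCE, SAMPLE, set(REFERENCE))

print("microarray:", array_hits)
print("gene panel:", panel_hits)
print("whole exome:", exome_hits)
```

In this sketch the array sees “brown” to “drown” but misses “fox” to “fix”, the panel sees both, and only the exome-style run also notices the change in “quick”, a word outside the panel.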


