Should we be worried about CRISPR/Cas9 off target effects?
This week a short communication came out in Nature Methods on off-target (OT) effects following CRISPR/Cas9 editing in the mouse zygote. The paper has made a few headlines in media outlets, leading to a fall in the stock prices of biotechnology companies such as Editas Medicine or Caribou Biosciences. The paper made several key observations, which I commented on via Twitter and in the media here or here. This piece details my initial arguments about this paper.
The key findings of the Nature Methods paper:
The authors co-injected 3 ng/µl of px335 plasmid (from the Zhang lab) encoding Cas9D10 and a single guide RNA cloned into this plasmid, along with 3 ng/µl of Cas9 protein and 1 µM single-stranded oligonucleotide, into zygotes of the FVB/J mouse strain. The experiment aimed to rescue blindness by replacing an allele in the Pde6b gene. Following the generation of the edited mice, they checked 2 founders for OT effects using whole genome sequencing. They eliminated common genetic variation by filtering variants against FVB/J strains and 36 additional mouse strains sequenced by the Sanger Institute. They found a large number of variants (~1,700), mostly in non-coding sequences. Many of these variants were shared between the 2 founders. The distribution of these variants was > 90% single nucleotide variants (SNVs) vs < 10% indels (Figure 1 in the paper). Interestingly, they noted that none of these variants were predicted by OT predictors.
This paper has generated a lot of interest and also a lot of skepticism. I will try here to dissect some of the findings and answer the following question: Should we be worried about CRISPR/Cas9 OT effects?
A CRISPR/Cas9 construct may persist up to a week in cells.
The Nature Methods paper raised an interesting question: How long does a CRISPR/Cas9 construct last in a cell? The short answer is that it depends on the delivery method.
Studies such as the one from the Thermo Fisher R&D team have compared DNA, RNA and protein modes of delivery of the Cas9 enzyme. They clearly showed that Cas9 delivered as DNA in the form of a plasmid remained longer in the cell than Cas9 mRNA or protein and generated significantly more OT effects (Figure 3).
In our experience, we have seen similar trends. We could see persistence of the plasmid for up to a week in a cell. Of course, the longer Cas9 is active in the cell, the more likely it is to create random mutations in the genome and OT effects. We and others have also noted that an increased concentration of Cas9 enzyme can lead to OT effects. What is a little atypical in the injection protocol of the Nature Methods paper is that they co-injected Cas9 protein along with px335 into the mouse zygote. The co-delivery of Cas9 and Cas9D10 could certainly increase OT. In our hands, using the Cas9D10 Px461 plasmid (Cas9D10 enzyme fused with GFP), we could reach an editing rate of up to 60% (depending on the efficiency of the guide RNA) and generate point mutations, which was a little surprising to us. Does this explain the full magnitude of the OT effects reported in the Nature Methods paper? The clear answer is no.
Bioinformatic filtering and the identification of true causative variants: the ENU tale:
Before going more into details on the bioinformatic pipeline, variant filtering and the sample size issue, I would like to tell a tale on a project I conducted from 2008 to 2016.
In 2008 we conducted a large-scale mutagenesis screen in mice on the SJL/J background using N-ethyl-N-nitrosourea (ENU). ENU is an alkylating agent that generates a semi-random mutation in the genome roughly every 1 Mb. We phenotyped these mice for red blood cell abnormalities or resistance to the malaria parasite. Once we identified an interesting phenotype, we exome sequenced the mice to find the causative mutations underlying the observed phenotype. Our bioinformatic pipeline was published 2 years ago in BMC Genomics. For this work, the challenge we rapidly faced was the lack of a reference genome for SJL/J and the paucity of variants available in databases such as dbSNP to filter out false positives (this was in 2010). We ended up with many false positive variants that we thought were causative. The only way to reduce the false positive rate was to increase the number of mice sequenced. We could substantially reduce the false positive rate after sequencing over 100 mouse exomes from > 50 founders. We also noted that available databases such as dbSNP, or the sequences of available mouse strains, made little difference in reducing our false positive rate. Many variants were private to a mouse strain in our animal facility and were not described in any database. The only way to capture those private variants was to sequence a significant number of mice. In our hands it was > 50 exomes.
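The cohort-filtering idea can be sketched in a few lines of Python. This is a minimal illustration and not our published pipeline: variants are reduced to (chromosome, position, ref, alt) tuples, and the names `filter_candidates` and `max_cohort_carriers`, as well as all coordinates, are assumptions made up for the example.

```python
from collections import Counter

def filter_candidates(sample_variants, cohort, max_cohort_carriers=0):
    """Keep variants of one mouse that are absent (or nearly absent)
    from the other mice sequenced in the same facility.

    sample_variants: set of (chrom, pos, ref, alt) tuples for the mouse of interest
    cohort: list of sets, one per other mouse (e.g. from different founders)
    max_cohort_carriers: hypothetical threshold; a variant carried by more
        cohort mice than this is treated as a private strain variant or a
        recurrent artefact and is discarded
    """
    carriers = Counter()
    for other in cohort:
        carriers.update(other)  # a set contributes at most 1 per variant
    return {v for v in sample_variants if carriers[v] <= max_cohort_carriers}

# Toy data: coordinates are invented for illustration only.
mouse = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr7", 300, "G", "A")}
cohort = [
    {("chr1", 100, "A", "G")},
    {("chr1", 100, "A", "G"), ("chr7", 300, "G", "A")},
]
candidates = filter_candidates(mouse, cohort)
print(sorted(candidates))
```

In this toy example only the chr2 variant survives, because the other two are also seen in cohort mice. The more mice in `cohort`, the more private and recurrent variants are caught, which is the effect we observed when going from a handful of exomes to more than 50.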
Here comes my point on the sample size issue. FVB/J is an inbred mouse strain relatively commonly used for work on transgenesis or retinal degeneration. The authors of the Nature Methods paper sequenced 2 founders and one control mouse from their facility. As mentioned above, their filtering eliminated some variants but not all, in fact far from all. Their published list clearly contains false positives, for the following reasons: 1) Recurrent false positives from our exome data, such as olfactory genes or duplicated genes, are listed as CRISPR induced (Suppl Table 1). 2) They detected too many homozygous SNVs (90%). Cas9 creates indels in the large majority of cases, and the indel rate was too low (< 10%). For a co-delivery of Cas9 along with Cas9D10, this is clearly unusual. 3) These variants were not detected by prediction software. Given that Cas9 is sensitive to the number, position and distribution of mismatches, this is a little unusual. 4) Many variants (too many) were shared between the 2 founders. In our hands (we have generated > 100 edited mouse lines since 2014 using DNA plasmid, mRNA or protein delivery), we have only exceptionally seen homozygous SNVs, or CRISPR-related variants shared between 2 founders, at a desired modified locus.
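Two of these sanity checks, the SNV vs indel split and the fraction of variants shared between founders, are easy to compute from two variant lists. The sketch below is only an illustration under assumptions: variants are (chromosome, position, ref, alt) tuples, a variant counts as an SNV when ref and alt are both one base long, and the function names and toy coordinates are invented.

```python
def is_snv(variant):
    """A variant is treated as an SNV when ref and alt are single bases."""
    _chrom, _pos, ref, alt = variant
    return len(ref) == 1 and len(alt) == 1

def summarize(founder_a, founder_b):
    """Return (snv_fraction, shared_fraction) over the union of both founders."""
    union = founder_a | founder_b
    if not union:
        raise ValueError("no variants to summarize")
    shared = founder_a & founder_b
    snv_fraction = sum(is_snv(v) for v in union) / len(union)
    shared_fraction = len(shared) / len(union)
    return snv_fraction, shared_fraction

# Toy data: invented coordinates, one deletion and three SNVs in total.
founder_a = {("chr1", 10, "A", "G"), ("chr3", 55, "C", "T"), ("chr5", 80, "GAT", "G")}
founder_b = {("chr1", 10, "A", "G"), ("chr3", 55, "C", "T"), ("chr9", 12, "T", "C")}
snv_fraction, shared_fraction = summarize(founder_a, founder_b)
print(snv_fraction, shared_fraction)
```

A list dominated by SNVs and largely shared between independently injected founders looks more like inherited background variation than like independent Cas9 cutting events, which is the argument made in points 2 and 4.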
The right control mice and a larger sample size would have considerably reduced the number of variants described in this short communication.
As mentioned above, one critical issue in this short communication is the small sample size. In our ENU experiment, we could only partially eliminate false positives after sequencing over 50 exomes. A sample size of n = 3 with only a single guide RNA clearly shows that this study is underpowered and therefore does not give any meaningful indication of OT following pronuclear injection of a CRISPR/Cas9 construct.
In short, a larger sample size, additional single guide RNAs and the inclusion of a sham-injected control would have considerably reduced the number of false positive variants detected and told a very different story on OT.
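A back-of-the-envelope calculation shows why one control mouse is not enough. This is a deliberately simplified sketch: it assumes a pre-existing variant is carried by a fixed fraction of the colony and that control mice are sampled independently, neither of which is taken from the paper. The function name `escape_probability` is hypothetical.

```python
def escape_probability(carrier_frequency, n_controls):
    """Probability that a pre-existing colony variant is absent from every
    sequenced control mouse, so that it survives filtering and can be
    mistaken for a CRISPR-induced mutation when seen in a founder.

    carrier_frequency: assumed fraction of colony mice carrying the variant
    n_controls: number of independently sampled control mice
    """
    if not 0.0 <= carrier_frequency <= 1.0:
        raise ValueError("carrier_frequency must be between 0 and 1")
    if n_controls < 0:
        raise ValueError("n_controls must be non-negative")
    return (1.0 - carrier_frequency) ** n_controls

# Compare one control against larger control groups for a variant
# assumed to be carried by half of the colony.
for n in (1, 5, 50):
    print(n, escape_probability(0.5, n))
```

Under these assumptions, a variant carried by half of the colony escapes a single control mouse half of the time, while it almost never escapes 50 controls.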
Have other investigators found such a magnitude of off-target effects after CRISPR/Cas9 delivery?
One important question is whether this unusual OT rate has been observed before. The simple answer is NO.
For instance, this report published in Scientific Reports examined the number of OT events after injection of a CRISPR/Cas9 construct using exome sequencing. They found 3 indels: one was probably carried over from the mouse background, a second was a de novo mutation and the third was Cas9 related. Additional reports, such as that one in Nature Methods, reached similar conclusions. We conducted whole exome sequencing experiments on some of our CRISPR/Cas9 edited mice. Similarly to those published reports, the number of OT events was less than 5. Finally, the International Mouse Phenotyping Consortium (IMPC) reported similar results from their CRISPR/Cas9 nodes.
The number of OT events reported in the literature after CRISPR/Cas9 editing is generally low and predictable, as Cas9 is highly sensitive to mismatches.
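To make the mismatch argument concrete, here is a minimal sketch of the check that sits at the core of OT prediction: count the mismatches between the guide and a candidate genomic site and verify the NGG PAM. It is a simplification. Real predictors also weight the position of each mismatch, and the cut-off `max_mismatches=3`, the function names and the sequences are all assumptions chosen for the example.

```python
def count_mismatches(guide, site):
    """Number of positions at which guide and candidate site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))

def has_ngg_pam(pam):
    """True when the 3-nt sequence following the site matches NGG."""
    pam = pam.upper()
    return len(pam) == 3 and pam[1:] == "GG"

def is_plausible_off_target(guide, site, pam, max_mismatches=3):
    """Hypothetical rule: NGG PAM and at most max_mismatches mismatches."""
    return has_ngg_pam(pam) and count_mismatches(guide, site) <= max_mismatches

# Invented 20-nt guide and two invented candidate sites.
guide = "GACCTGAAGTTCATCTGCAC"
close_site = "TTCCTGAAGTTCATCTGCAC"  # 2 mismatches
far_site = "TTGGAGAAGTTCATCTGCAC"    # 5 mismatches
print(is_plausible_off_target(guide, close_site, "TGG"))
print(is_plausible_off_target(guide, far_site, "TGG"))
```

A variant that sits nowhere near a site passing even a permissive version of this check is hard to attribute to Cas9, which is why variants unrelated to any predicted site deserve suspicion.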
Why this discrepancy between the report published in Nature Methods and previous observations?
From reading this report published 3 days ago in Nature Methods, it appears clear to me that: 1) The method of Cas9 delivery was unusual and is likely to have generated more OT than normally reported. 2) The low sample size, with one control sequenced and only one guide RNA tested, makes for a seriously underpowered study and meaningless results. 3) The poor experimental design and bioinformatic filtering failed to distinguish between a) variants that are private to the FVB littermates housed in the animal house, b) de novo mutations that appeared after injection and c) variants that were generated by the Cas9 nuclease. This led to an abnormal number of mutations being reported. I would predict that very few, if any, of these variants are CRISPR related.
Should we be worried about off target effects?
This leads to our central question: Should we be worried about off-target effects? The OT issue has been known for years in the scientific community working on genome editing technology, and a large body of work has been dedicated to reducing OT. Amongst many examples, the delivery of ribonucleoprotein complexes instead of mRNA or plasmid, or the use of Cas9 mutants such as eSpCas9 or SpCas9-HF1, has been proven to reduce OT considerably. For more details, Addgene published an excellent post on this.
The novelty here is that non-specialists of genome editing technology and the general public are discovering off-target effects, which is not such a bad outcome. Many will realise that the safe use and efficient delivery of CRISPR technology for therapeutic purposes still has a long way to go. As many have mentioned before, a careful prediction and evaluation of possible OT is of course critical for any therapeutic purpose.
The next question is whether we should systematically evaluate OT in the knockout and knockin mice we are generating. The simple answer is NO. OT mutations are usually eliminated from the mouse genome by breeding for one or two generations with the parental strain background.
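The dilution by breeding can be put into numbers with simple Mendelian arithmetic. The sketch below assumes that each OT mutation is heterozygous, unlinked to the edited locus and to the other OT mutations, and that each generation is a backcross to an unedited parental-strain mouse. The function names are hypothetical.

```python
def retained_probability(n_backcrosses):
    """Chance that one heterozygous, unlinked OT mutation is still present
    in a given descendant after n backcrosses to the parental strain."""
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 0.5 ** n_backcrosses

def clean_offspring_probability(n_mutations, n_backcrosses):
    """Chance that a descendant carries none of n_mutations independent,
    unlinked OT mutations after n backcrosses."""
    if n_mutations < 0:
        raise ValueError("n_mutations must be non-negative")
    return (1.0 - retained_probability(n_backcrosses)) ** n_mutations

# Three hypothetical OT mutations, one versus two backcross generations.
for generations in (1, 2):
    print(generations, clean_offspring_probability(3, generations))
```

Because the edited allele segregates independently of unlinked OT mutations under these assumptions, genotyping a modest number of offspring is usually enough to find animals that keep the desired edit and have lost the few OT mutations a founder may carry.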
What would be the next step for us?
We, and possibly other investigators, will compare their results to existing SNV databases. On our side, their results are being compared to our 3,000 mouse exomes. I anticipate that many of those variants will be found in our database. Many of the genes described are familiar to us and are recurrently detected as false positives. I will update this post with our latest findings and will certainly post our work on bioRxiv or elsewhere, hopefully soon.
A final word on the media hype.
The following words are a personal take on this paper and on how the findings were communicated to the general audience. Those who are not interested in my personal opinions should feel free to stop reading here.
I find it absolutely astonishing that this paper got published in Nature Methods. This is a terrible paper, and as a reviewer I would have dismissed it in the first round of review. There is a worrying trend among ‘high impact’ journals to promote hype over good science. The publication of this paper is clearly a failure of the peer review process. These journals are also dismissive of their own mistakes, and this is highly irritating. As an example, the NgAgo paper was published a year ago in Nature Biotechnology and, after many unsuccessful attempts at replicating those results, the paper has still not been retracted! Many times I have heard, after so many scandals such as STAP and so on, that none of these ‘high impact’ scientific journals are trusted and that scientists in general are not trusted either. As scientists, we should be really worried about this trend.
Conflict of interest:
I have no conflict of interest to declare. I do not hold any stocks or shares in genome editing companies. I am not a board member of any company. I have never met or communicated with any of the authors of this Nature Methods piece. Finally, my views do not represent those of my employer.