A Tale of Mutants: Sequencing the Full-Length Influenza Virus at the Single Cell Level

Liz T
PacBio
6 min read · Jun 17, 2019

It was February 2019 in Seattle. Outside the Fred Hutchinson Cancer Research Center, it was snowing, a rare event for the temperate Emerald City. My colleague Jason Underwood led me through the maze of corridors with nondescript rooms (though I interned at Fred Hutch when I was a grad student in Seattle, I routinely got lost inside) until we finally arrived at Jesse’s office.

Jesse Bloom and his postdoc Alistair Russell had been studying the growth of the H1N1 influenza virus, and the cellular response to it, using short-read RNA-seq data (Russell et al. 2018). They infected A549 cells with H1N1 and used the 10X Chromium system with Illumina sequencing to profile viral infection and cellular response at the single cell level.

The single cell RNA-seq revealed extreme cell-to-cell variation in viral load and immune response. At 10 hours post-infection, most cells had very little viral mRNA, while a small fraction of cells were heavily dominated by viral transcripts.

Figure 4 from Russell et al. (2018), using single cell RNA-seq to quantify the fraction of viral mRNA at different times post-infection. A549 cells were infected with H1N1 influenza. The single cell library was generated on the 10X Chromium system and sequenced using Illumina short reads.
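To make "fraction of viral mRNA" concrete, here is a minimal sketch of how such a per-cell number could be computed from a table of UMI counts. This is an illustration only, not the pipeline used in the paper: the function name, the input layout, and the toy counts are all made up, and viral transcripts are flagged simply by matching the standard names of the eight influenza A gene segments.

```python
from collections import defaultdict

# Standard names of the eight influenza A gene segments, used here
# only as a simple way to flag viral transcripts (an assumption).
VIRAL_GENES = {"PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"}


def viral_fraction_per_cell(umi_counts):
    """Return {cell_barcode: fraction of UMIs that come from viral genes}.

    umi_counts is an iterable of (cell_barcode, gene, n_umis) tuples.
    This layout is hypothetical; real pipelines use sparse matrices.
    """
    viral = defaultdict(int)
    total = defaultdict(int)
    for cell, gene, n_umis in umi_counts:
        total[cell] += n_umis
        if gene in VIRAL_GENES:
            viral[cell] += n_umis
    # Skip cells with no counts to avoid dividing by zero.
    return {cell: viral[cell] / total[cell] for cell in total if total[cell] > 0}


# Toy data: one lightly infected cell, one dominated by viral transcripts.
counts = [
    ("cellA", "ACTB", 90), ("cellA", "NP", 10),
    ("cellB", "ACTB", 40), ("cellB", "HA", 35), ("cellB", "NS", 25),
]
fractions = viral_fraction_per_cell(counts)
```

With the toy counts above, `cellA` ends up at 10% viral mRNA and `cellB` at 60%, which is the kind of spread between cells that the figure summarizes.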

Collaborating with Cole Trapnell (a faculty member at UW Genome Sciences), Alistair and Jesse looked into what was causing the extreme heterogeneity. Differences in the timing of infection were ruled out. Differences in the initial infection dose were not the major contributor either. The only thing they could determine from the short-read data, which provided only 3’ gene count information, was that the absence of certain viral genes partially explained the variability in viral load. Still, without access to the full viral sequences, they could not associate mutations across multiple viral genes.

Another surprise was that very few viral infections activated an innate immune response in the infected cells. Here too the short-read data fell short: it only allowed Alistair and Jesse to see whether a gene was completely absent, which made it hard to determine whether features of the virus, such as mutations, determined when a cell activated an immune response.

All these unsolved mysteries led Alistair and Jesse to conclude that short-read gene counting was not enough to get an unbiased view of all the viral mutations occurring at the single cell level.

Inspired by my colleague Jason’s work on the Iso-Seq method (PacBio’s full-length RNA sequencing method), they decided to use accurate long reads to sequence the full viral transcriptome. They infected A549 reporter cells with H1N1 and enriched for interferon response. To further enrich for viral transcripts, they tried several PCR-based methods (recorded in excruciating detail in the publication) while still keeping the 3’ end that contained the 10X single cell barcodes and UMI information.

Figure 2 from Russell et al. (2019) showing the experimental design. A549 reporter cells were enriched for interferon (IFN) response. Single cell data was generated using the 10X Chromium system followed by Illumina and PacBio sequencing.
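Why does keeping the 3’ end matter? Because that is where the cell barcode and UMI live, so every full-length read can be traced back to its cell. The sketch below shows the idea under loudly stated assumptions: the read is in mRNA-sense orientation, the poly(A) tail is followed by the reverse complement of the UMI and then of the cell barcode, and the barcode and UMI are 16 and 10 nt long. These lengths and the function names are assumptions for illustration; this is not the authors' pipeline nor any PacBio or 10X tool.

```python
import re

BARCODE_LEN = 16  # assumed cell barcode length
UMI_LEN = 10      # assumed UMI length

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T, N."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_read(read, min_poly_a=15):
    """Split a sense-oriented read into (transcript, cell_barcode, umi).

    Returns None if no poly(A) run of at least min_poly_a bases is
    followed by enough sequence to hold the barcode and UMI.
    Limitation of this sketch: the poly(A) match is greedy, so a tag
    whose first base after the tail is 'A' would be shifted by one.
    """
    tag_len = BARCODE_LEN + UMI_LEN
    candidates = [
        m for m in re.finditer("A{%d,}" % min_poly_a, read)
        if len(read) - m.end() >= tag_len
    ]
    if not candidates:
        return None
    tail = candidates[-1]  # use the poly(A) run closest to the 3' end
    # On the sense strand the tag reads revcomp(UMI) + revcomp(barcode),
    # so reverse-complementing it yields barcode followed by UMI.
    tag = reverse_complement(read[tail.end():tail.end() + tag_len])
    return read[:tail.start()], tag[:BARCODE_LEN], tag[BARCODE_LEN:]


# Toy read built from made-up sequences.
toy_barcode = "ACGTACGTACGTACGG"
toy_umi = "GATTACAGCC"
toy_transcript = "ATGGCTTCGGATCC"
toy_read = (toy_transcript + "A" * 20 + reverse_complement(toy_umi)
            + reverse_complement(toy_barcode) + "AGATCGGAAG")
result = split_read(toy_read)
```

On the toy read, `result` recovers the transcript, barcode, and UMI that went into it. Real data would also need error-tolerant barcode matching and handling of reads in either orientation, which this sketch leaves out.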

The PacBio single cell Iso-Seq data gave an unprecedented view of the complex and variable mutational landscape of the influenza virus. Of the 150 infected cells, only 49 were wild type at 10 hours post-infection. Even among cells infected with unmutated virus there were differences: in some, viral mRNA reached up to 65% with no immune response, while in others it was only 20–30%, with some expression of the innate-immune signaling molecule interferon (IFN). For cells infected with mutated virus, the picture was even more complex. While the lack of viral polymerase genes such as PB2, PB1, PA or NP was sure to cause low viral transcription, other kinds of mutations could still result in a high (~50%) viral load with IFN detection. Failure to express NS, observed in a number of cells, did not always activate IFN.

Excerpt of Figure 4 from Russell et al. (2019) showing 150 infected cells sequenced using PacBio long reads. Observed mutations on the eight viral genes are indicated for each cell, along with the measured viral load (green box) and IFN response (orange box). For all 150 cells, see the full Figure 4 in the publication.

In short: It’s complicated. Genetic variability is a major contributor, but not the sole contributor, to viral load and immune response. Other factors, such as pre-existing cell heterogeneity and cell-to-cell signaling, are likely to play a role as well.

At this point, Alistair and Jesse wrote up their single cell Iso-Seq findings and put them on bioRxiv as a preprint. Then another researcher, AJ te Velthuis, read the preprint and reached out to them! AJ’s prior work showed that shorter, aberrant viral RNA may have an immunostimulatory effect by triggering RIG-I (te Velthuis et al. 2018). Together, they showed that in the influenza study there are two mutations on the PB1 gene that affect polymerase activity: D27N reduces processivity, whereas T677A increases activity. These mutations may lead to an accumulation of shorter, aberrant RNA products, which may explain their effect on the IFN response. For those interested in more details on the various effects of influenza gene mutations, I encourage you to read the full paper.
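A quick note on the notation: a name like D27N means the aspartate (D) at protein position 27 is replaced by asparagine (N). The minimal sketch below shows how such names could be derived by comparing a reference protein sequence with an observed one. The function name is hypothetical and the sequences are toy strings, not real PB1; the sketch also assumes substitutions only, with no insertions or deletions.

```python
def amino_acid_substitutions(reference, observed):
    """List substitutions between two protein sequences, e.g. 'D2N'.

    Positions are 1-based, following the usual convention.
    Assumes both sequences are aligned and of equal length.
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must have equal length (no indels in this sketch)")
    return [
        f"{ref}{pos}{obs}"
        for pos, (ref, obs) in enumerate(zip(reference, observed), start=1)
        if ref != obs
    ]


# Toy sequences for illustration only (not real influenza proteins).
toy_reference = "MDVNPT"
toy_observed = "MNVNPA"
substitutions = amino_acid_substitutions(toy_reference, toy_observed)
```

For the toy pair above, `substitutions` contains `D2N` and `T6A`. Having the full-length sequence for each cell is what makes this kind of per-cell comparison possible across all eight viral genes.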

When Jesse first showed me his presentation on this work, my mind was blown. This is the first unbiased, non-hypothesis-driven investigation of viral mutation at the single cell level using full-length RNA sequencing. It showed unequivocally that viral mutation varies from cell to cell and is an important contributor to differences in viral burden and immune response. What remains to be seen, and this paper has already started down that path, are the mechanisms that link genetic mutation to functional outcomes. The bigger question from a macroscopic point of view, however, is how natural influenza infections progress and interact with the host immune system.

A hospital ward in Fort Riley, Kansas during the 1918 Spanish flu epidemic.

In writing this blog, I also interviewed Alistair and Jesse to get their thoughts on the project. Below are my questions and their answers.

Q: Can you give me some background on how this single cell project (both using RNA-seq and Iso-Seq) came about?

(Alistair) It’s been known since the 1960s that there is high heterogeneity in viral infections. Infections are often caused by only a few virions (McCrone et al. 2018) and most infections fail. Our eLife paper (Russell et al. 2018) was the first study that used single cell transcriptome measurements to explain this heterogeneity. But with the short reads we could only do 3’ gene tagging and we were missing all the genetic mutations, and that’s why we followed up with long reads.

Q: To enrich for viral transcripts for PacBio sequencing, you tried several PCR methods. What lessons did you learn from all these attempts?

(Alistair) We tried different amplification methods as we went along and we did not want to throw away any of the data. We learned that emulsion PCR was likely unnecessary and would not affect the chimera rate as long as the amplification stayed below saturation. For this, it was important to have viral tags at both ends of the molecule, which allowed us to see that the emulsion PCR was not worth the extra pain if you control the PCR cycles.

(Jesse) The biggest thing we learned from the PCR efforts was that amplifying each of the eight genes individually, instead of all at once, was a better approach. Also, the 10X single cell sequencing approach requires a polyA tail — not all viruses have polyadenylation, so that’s one limitation.

Q: What’s next?

(Alistair) This viral heterogeneity is not flu-specific. However, the extent of genetic variability may vary across different flu strains. We would like to explore that and also search for mechanistic explanations.

(Jesse) What we call “single cell RNA-seq” using short reads is not really sequencing…it’s counting! This study really demonstrates that, for studying viruses with a high mutation rate and cell-to-cell variability, counting alone is inadequate. Moving forward, we would also want to combine other single cell techniques, such as single cell proteomics, to look beyond transcription.

All things RNA. Bioinformatics. Opinions are my own.