Absolute quantitation of individual SARS-CoV-2 RNA molecules provides a new paradigm for infection dynamics and variant differences

  1. Department of Biochemistry, The University of Oxford
    OxfordUnited Kingdom
  2. Nuffield Department of Medicine, The University of Oxford
    OxfordUnited Kingdom
  3. Chinese Academy of Medical Sciences (CAMS) Oxford Institute (COI), The University of Oxford
    OxfordUnited Kingdom
  4. MRC-University of Glasgow Centre for Virus Research, The University of Glasgow
    GlasgowUnited Kingdom
  5. Ludwig Institute for Cancer Research, The University of Oxford
    OxfordUnited Kingdom
  6. Sir William Dunn School of Pathology, The University of Oxford
    OxfordUnited Kingdom
  7. UK Health Security Agency, UKHSA-Porton Down
    SalisburyUnited Kingdom
  8. Department of Infectious Diseases, University Hospitals Sussex NHS Foundation Trust
    BrightonUnited Kingdom
  9. James & Lillian Martin Centre, Sir William Dunn School of Pathology, The University of Oxford
    OxfordUnited Kingdom

Abstract

Despite an unprecedented global research effort on SARS-CoV-2, early replication events remain poorly understood. Given the clinical importance of emergent viral variants with increased transmission, there is an urgent need to understand the early stages of viral replication and transcription. We used single-molecule fluorescence in situ hybridisation (smFISH) to quantify positive sense RNA genomes with 95% detection efficiency, while simultaneously visualising negative sense genomes, subgenomic RNAs, and viral proteins. Our absolute quantification of viral RNAs and replication factories revealed that SARS-CoV-2 genomic RNA is long-lived after entry, suggesting that it avoids degradation by cellular nucleases. Moreover, we observed that SARS-CoV-2 replication is highly variable between cells, with only a small cell population displaying high burden of viral RNA. Unexpectedly, the B.1.1.7 variant, first identified in the UK, exhibits significantly slower replication kinetics than the Victoria strain, suggesting a novel mechanism contributing to its higher transmissibility with important clinical implications.

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the COVID-19 pandemic. The viral genome consists of a single positive strand genomic RNA (+gRNA) approximately 30 kb in length that encodes a plethora of viral proteins 44Kim et al.202094Zhao et al.2021. SARS-CoV-2 primarily targets the respiratory tract and infection is mediated by Spike protein binding to human angiotensin-converting enzyme (ACE2), where the transmembrane protease serine 2 (TMPRSS2) triggers fusion of the viral and cell membranes 37Hoffmann et al.202086Wan et al.2020. Following virus entry and capsid trafficking to the endoplasmic reticulum, the first step in the replicative life cycle is the translation of the gRNA to synthesise the replicase complex. This complex synthesises the negative sense genomic strand, enabling the production of additional positive gRNA copies. In addition, a series of shorter subgenomic RNAs (sgRNAs) are synthesised that encode the structural matrix, Spike, nucleocapsid and envelope proteins, as well as a series of non-structural proteins 44Kim et al.202077Sola et al.2015. The intracellular localisations of these early events were described using electron microscopy 50Laue et al.2021 and by antibody-based imaging of viral double-stranded (ds)RNA 52Lean et al.2020. However, the J2 dsRNA antibody lacks sensitivity and specificity at early times post infection as the low abundance of SARS-CoV-2 dsRNA is indistinguishable from host dsRNAs 22Dhir et al.2018. Our current knowledge of these early steps in the SARS-CoV-2 replicative life cycle is poorly understood despite their essential role in the establishment of productive infection.

Since the initial outbreak in the Wuhan province of China in 2019, several geographically distinct variants of concern (VOCs) with altered transmission have arisen 15Chen et al.202056Lythgoe et al.2021. Emerging VOCs such as the recently named Alpha strain (previously known and referred to herein as B.1.1.7), first detected in Kent in the UK, possess a fitness advantage in terms of their ability to transmit compared to the Victoria (VIC) isolate, an early strain of SARS-CoV-2 first detected in Wuhan in China 11Caly et al.202020Davies et al.202143Kidd et al.202184Volz et al.2021. Many of the VOCs encode mutations in the Spike (S) protein 67Rees-Spear et al.2021 and, consequently, the effects of these amino acid substitutions on viral entry and immuno-evasion are under intense study 46Kissler et al.202188Washington et al.2021. However, some of the mutations map to non-structural proteins, so could impact viral replication dynamics. To date, the early replication events of SARS-CoV-2 variants have not been characterised as the current techniques for quantifying SARS-CoV-2 genomes and replication rates rely on bulk approaches or have limited sensitivity.

The use of single-molecule and single-cell analyses in biology offers unprecedented insights into the behaviour of individual cells and the stochastic nature of gene expression that are often masked by population-based studies 30Fraser and Kaern200964Raj and Oudenaarden2009. These approaches have revealed how cells vary in their ability to support viral growth and how stochastic forces can inform our understanding of the infection process 6Billman et al.20177Boersma et al.202016Chou and Lionnet201874Shulla and Randall201576Singer et al.2021. Fluorescence in situ hybridisation (FISH) was previously used to detect RNAs in hepatitis C virus and Sindbis virus-infected cells with high sensitivity 32Garcia-Moreno et al.201965Ramanan et al.201676Singer et al.2021. This approach has been applied to SARS-CoV-2 in a limited capacity 9Burke et al.202168Rensen et al.2021 with most studies utilising amplification-based signal detection methods to visualise viral RNA 5Best Rocha et al.202014Carossino et al.202034Guerini-Rocco et al.202042Jiao et al.202048Kusmartseva et al.202052Lean et al.202054Liu et al.2020. These experiments used either chromogenic histochemical detection using bright field microscopy or detection of fluorescent dyes, which both lack the sensitivity to detect individual RNA molecules. Consequently, the kinetics of SARS-CoV-2 RNA replication and transcription during the early phase of infection are not well understood and lack quantitative, spatial and temporal information on the genesis of gRNA and sgRNAs. To address this gap, we developed a single-molecule (sm)FISH method based on earlier published protocols 28Femino et al.199863Raj et al.200876Singer et al.202183Titlow et al.2018 to visualise SARS-CoV-2 RNAs with high sensitivity and spatial precision, providing a powerful new approach to track infection through the detection and quantification of viral replication factories. Our results uncover a previously unrecognised heterogeneity among cells in supporting SARS-CoV-2 replication and a surprisingly slower replication rate of the B.1.1.7 variant when compared to the early lineage VIC strain.

Results

SARS-CoV-2 genomic RNA at single-molecule resolution

To explore the spatial and temporal aspects of SARS-CoV-2 replication at single-molecule and cell levels, we carried out smFISH experiments with fluorescently labelled probes directed against the 30 kb viral gRNA. 48 short antisense DNA oligonucleotide probes were designed to target the viral ORF1a and labelled with a single fluorescent dye to detect the positive sense gRNA, as described previously (33Gaspar et al.2017; Figure 1A). The probe set detected single molecules of gRNA within SARS-CoV-2-infected Vero E6 cells, visible as well-resolved diffraction-limited single spots with a consistent fluorescence intensity and shape (Figure 1B). Treatment of the infected cells with RNase or the viral polymerase inhibitor remdesivir (RDV) ablated the probe signal, confirming specificity (Figure 1—figure supplement 1A). To assess the efficiency and specificity of detection of the +ORF1a probe set, we divided the probes into two groups of 24 alternating oligonucleotides (‘ODD’ and ‘EVEN’) that were labelled with different fluorochromes. Interlacing the probes minimised chromatic aberration between spots detected by the two colours (Figure 1C). Analysis of the SARS-CoV-2 gRNA with these probes showed a mean distance of <250 nm between the two fluorescent spots, indicating near-perfect colour registration and a lack of chromatic aberration. 95% of the diffraction-limited spots within infected cells were dual labelled, demonstrating efficient detection of single SARS-CoV-2 gRNA molecules (Figure 1C). To assess whether virion-encapsulated RNA is accessible to the probes, we immobilised SARS-CoV-2 particles from our viral stocks on glass and incubated them with the +ORF1a probes. We observed a large number of spots in the immobilised virus preparation that was compatible with single RNA molecules (Figure 1—figure supplement 1B), suggesting that detection of RNA within viral particles was achieved.

Sensitive single-molecule detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic RNA in infected cells.

(A) Schematic illustration of single-molecule fluorescence in situ hybridisation (smFISH) for detecting SARS-CoV-2 positive strand genomic RNA (+gRNA) within infected cells. (B) Reference spatial profile of a diffraction-limited +ORF1a smFISH spot. The calibration bar represents relative fluorescence intensity (top). Frequency distribution of smFISH spot intensities, exhibiting a unimodal distribution (bottom). (C) Assessment of smFISH detection sensitivity by a dual-colour co-detection method. Maximum intensity projected images and corresponding FISH-quant spot detection views of ODD and EVEN probe sets are shown. Scale bar = 5 µm. Density histogram of nearest-neighbour distance from one spectral channel to another (top). Vertical line indicates 300 nm distance. Percentage overlap between spots detected by ODD and EVEN split probes, calculated bidirectionally (bottom). (D) Heatmap of probe sequence alignment against various Coronaviridae and host transcriptomes. Each column represents individual 20 nt + ORF1a probe sequences. The minimum edit distance represents mismatch scores, where ‘0’ indicates a perfect match. Melting temperatures of each probe at the smFISH hybridisation condition are shown. (E) smFISH against +ORF1a in SARS-CoV-2-infected formalin-fixed paraffin-embedded (FFPE) lung tissue from Golden Syrian hamster at 4 days post infection. Hamsters were infected intranasally with 5 × 104 plaque-forming unit (PFU) of SARS-CoV-2 BVICO1. At necropsy, lung samples were fixed in 10% buffered formalin and embedded in paraffin wax. Red arrows in magnified panels indicate single-molecule RNA spots. Scale bars = 1000, 10, or 2 µm. (F) Experimental design for visualising SARS-CoV-2 gRNA with smFISH at different timepoints after infection of Vero E6 cells. Cells were seeded on cover-glass and 24 hr later inoculated with SARS-CoV-2 (Victoria [VIC] strain at multiplicity of infection [MOI] 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion and cells fixed at the timepoints shown. Representative 4 µm maximum intensity projection confocal images are shown. The calibration bar labelled with the symbol ‘#’ is used to display wider dynamic contrast range. Magnified view of insets in the upper panels is shown in lower panels. Scale bars = 10 µm or 2 µm.

library(tidyverse)
#library(hrbrthemes)
library(scales)
library(patchwork)
library(rstatix)
library(ggridges)
library(janitor)
library(ggbeeswarm)
all_spots <- 
  read_tsv("./Data/Figure1/_FISH-QUANT__all_spots_201125.txt", skip = 13) %>%
  filter(TH_fit == 1) %>%
  select(c(INT_filt, INT_raw)) %>%
  mutate(strand = "(+)ve strand smFISH")

all_spots %>%
  ggplot(aes(x = INT_filt)) + 
  geom_density(adjust = 1.25, size = 1,colour = "goldenrod1") +
  scale_x_continuous(limits = c(5, 65)) +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  scale_colour_manual(values = c("magenta", "green")) + 
  labs(x = "Single spot fluorescence intensity (A.U.)",
       y = "% Frequency") +
  theme_classic(base_size = 9) +
  theme(axis.line = element_blank(), 
        panel.border = element_rect(colour = "black", fill = NA, size = 0.25),
        legend.title = element_blank(),
        legend.position = "bottom") +
  guides(colour = guide_legend(reverse = TRUE))

Frequency distribution of smFISH spot intensities, exhibiting a unimodal distribution

nn.spots <- read_csv("./Data/Figure1/nearest_neighbour_spots.csv")
nn.summary <- read_csv("./Data/Figure1/nearest_neighbour_summary.csv")

nn.spots %>%
  ggplot(aes(x = nn.dist, y = ..scaled.., colour = category, fill = category)) +
  geom_density(aes(size = category), adjust = 2, alpha = 0.1) +
  scale_size_manual(values = c(0.5, 0.25)) +
  geom_vline(xintercept = 300, alpha = 0.3, size = 0.5) +
  labs(x = "nearest neighbour distance (nm)", y = "Scaled density") +
  scale_x_continuous(limits = c(0, 1000)) +
  scale_colour_manual(values = c("#75b8d1", "#d18975")) + 
  scale_fill_manual(values = c("#75b8d1", "#d18975")) + 
  theme_classic(base_size = 7) +
  theme(axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.title = element_blank(),
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.3, 'cm'),
        legend.position = c(0.8, 0.8),
        legend.background = element_blank()) +
  guides(colour = guide_legend(reverse = TRUE),
         fill = guide_legend(reverse = TRUE),
         size = FALSE) -> p1

nn.summary %>%
  ggplot(aes(x = category, y = percentage.overlap)) +
  geom_jitter(position = position_jitter(0.05), size = 1, colour = 'gray') +
  geom_pointrange(stat = "summary", fun.data = "mean_sdl", fun.args = list(mult = 1), 
                  colour = 'red', size = 0.15) +
  scale_y_continuous(limits = c(80, 100)) +
  labs(y = "% overlap in 3D (< 300 nm)") +
  theme_classic(base_size = 7) +
  theme(axis.title.x = element_blank(),
        axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.title = element_blank()) -> p2

p1/p2

Density histogram of nearest-neighbour distance from one spectral channel to another (top), and percentage overlap between spots detected by ODD and EVEN split probes, calculated bidirectionally (bottom).

Vertical line indicates 300 nm distance.

#' @width 22
#' @height 14

# IMPORT

specificity <- read_tsv("./Data/Figure1/gRNA_pos_transcriptomic_matches.tsv") %>%
  mutate(name = str_remove(name, "ORF1a_"))

# MAKE PLOTTING DF

## Viral df

plot_df_viral <- specificity %>%
  select(
    name,
    ends_with('genome_edit_distance'),
    -monkey_txome_edit_distance,
    -human_genome_edit_distance
  ) %>%
  dplyr::select(all_of("name"),
                contains("_edit_distance")) %>%
  pivot_longer(
    contains("_edit_distance"),
    names_to = "species",
    values_to = "value"
  ) %>%
  mutate(species = str_extract(!!sym("species"), "^[:alnum:]*(?=_)")) %>%
  mutate(species = case_when(
    species == "SARS2" ~ "SARS-CoV-2",
    species == "SARS1" ~ "SARS-CoV-1",
    species == "MERS" ~ "MERS-CoV",
    species == "E229" ~ "229E",
    TRUE ~ species
  ))

## Host df

plot_df_host <- specificity %>%
  select(
    name,
    monkey_txome_edit_distance,
    human_txome_edit_distance
  ) %>%
  dplyr::select(all_of("name"),
                contains("_edit_distance")) %>%
  pivot_longer(
    contains("_edit_distance"),
    names_to = "species",
    values_to = "value"
  ) %>%
  mutate(species = str_extract(!!sym("species"), "^[:alnum:]*(?=_)")) %>%
  mutate(species = case_when(
    species == "human" ~ "Human",
    species == "monkey" ~ "Green monkey",
  ))


## Tm df

plot_df_tm <- specificity %>% dplyr::select(name, gene, Tm)


# PLOTTING

## Define plotting variables

limit <- 6
tile_ratio <- 1
guide_title <-  "Minimum\nedit distance"
base_size <- 9

viral_species_order <- c("SARS-CoV-1", "SARS-CoV-2", "MERS-CoV", "229E", "NL63", "OC43" , "HKU1")
host_species_order <- c("Human", "Green monkey") 

viral_colours       <- c("white", RColorBrewer::brewer.pal(limit+1,"Blues"))
host_colours <- c("white", RColorBrewer::brewer.pal(limit+1,"Greens"))

label_order   <- c(0:limit, stringr::str_c(limit+1, " <"))

myPalette   <- colorRampPalette(RColorBrewer::brewer.pal(9, "YlOrRd"))
guide_title_tm <- "Tm"

min_val <- min(plot_df_tm[,"Tm"])
max_val <- max(plot_df_tm[,"Tm"])

tm_colours <- myPalette(max_val)

## Viral plot

plot_df_viral %>%
  mutate(count = ifelse(
    !!sym("value") > limit | is.na(!!sym("value")),
    stringr::str_c(limit+1, " <"),
    as.character(!!sym("value"))
  )) %>%
  ggplot2::ggplot(
    aes(
      y = factor(!!sym("species"), levels = rev(viral_species_order)),
      x = factor(name, levels = unique(name)),
      fill = factor(count, levels = label_order) )) +
  ggplot2::geom_tile(colour="lightgray", size=0.2) +
  ggplot2::coord_fixed(ratio = tile_ratio) +
  ggplot2::scale_fill_manual(values = viral_colours, drop = FALSE) +
  ggplot2::guides(
    fill = ggplot2::guide_legend(
      title = guide_title, 
      title.position = "top",
      label.position = "right",
      ncol = 1
    )
  ) +
  ggplot2::labs(x = "", y = "", subtitle = "Coronaviridae transcriptome") +
  ggplot2::theme_minimal(base_size = base_size) +
  ggplot2::theme(axis.text.x = ggplot2::element_blank(), 
                 legend.position = 'bottom') -> Virus

## Host plot

plot_df_host %>%
  mutate(count = ifelse(
    !!sym("value") > limit | is.na(!!sym("value")),
    stringr::str_c(limit+1, " <"),
    as.character(!!sym("value"))
  )) %>%
  ggplot2::ggplot(
    aes(
      y = factor(!!sym("species"), levels = rev(host_species_order)),
      x = factor(name, levels = unique(name)),
      fill = factor(count, levels = label_order) )) +
  ggplot2::geom_tile(colour="lightgray", size=0.2) +
  ggplot2::coord_fixed(ratio = tile_ratio) +
  ggplot2::scale_fill_manual(values = host_colours, drop = FALSE) +
  ggplot2::guides(
    fill = ggplot2::guide_legend(
      title = guide_title, 
      title.position = "top",
      label.position = "right",
      ncol = 1
    )
  ) +
  ggplot2::labs(x = "", y = "", subtitle = "Host transcriptome") +
  ggplot2::theme_minimal(base_size = base_size) +
  ggplot2::theme(axis.text.x = ggplot2::element_blank(), 
                 legend.position = 'bottom') -> Host

## Tm plot 

plot_df_tm %>%
  ggplot2::ggplot(
    aes(
      y    = !!sym("gene"),
      x    = factor(!!sym("name"), levels = unique(name)),
      fill = !!sym("Tm")
    )
  ) +
  ggplot2::geom_tile(colour="lightgray", size=0.2) +
  ggplot2::coord_fixed(ratio = tile_ratio) +
  ggplot2::scale_fill_gradientn(
    colours = tm_colours,
    limits  = c(min_val, max_val)
  ) + 
  ggplot2::guides(
    fill = ggplot2::guide_legend(
      title = guide_title_tm, 
      title.position = "top",
      label.position = "right",
      ncol = 1,
      reverse = TRUE
    )
  ) +
  labs(x = "", y = "", subtitle = "Melting temperature") +
  ggplot2::theme_minimal(base_size = base_size) +
  ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5), legend.position = 'bottom',
                 axis.text.y = element_blank()) -> Tm

## Patchwork

(Virus / Host / Tm) + 
  plot_layout(guides = "collect") & 
  theme(legend.position = 'right',
        legend.key.size = unit(0.2, "cm")) -> specificity_plot

specificity_plot

Heatmap of probe sequence alignment against various Coronaviridae and host transcriptomes.

Each column represents individual 20 nt + ORF1a probe sequences. The minimum edit distance represents mismatch scores, where ‘0’ indicates a perfect match. Melting temperatures of each probe at the smFISH hybridisation condition are shown.

virus_spots <- 
  read_tsv("./Data/Figure1/_FISH-QUANT__threshold_spots_210509.txt", skip = 13) %>%
  filter(TH_fit == 1) %>%
  select(c(INT_filt, INT_raw)) %>%
  mutate(strand = "(+)ve strand smFISH")

virus_spots %>%
  ggplot(aes(x = INT_filt)) + 
  geom_density(adjust = 0.8, size = 0.5, colour = "goldenrod1") +
  # scale_y_continuous(labels = percent_format(accuracy = 1)) +
  labs(x = "Single spot fluorescence intensity (A.U.)",
       y = "Density") +
  theme_classic(base_size = 6) +
  theme(axis.line = element_blank(), 
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.title = element_blank(),
        legend.position = "bottom") +
  guides(colour = guide_legend(reverse = TRUE))

Density distribution of smFISH spot intensities (right panel), exhibiting a unimodal distribution (n = 1664 spots).

Specific detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA using single-molecule fluorescence in situ hybridisation (smFISH).

(A) Specificity of the +ORF1a smFISH probe for SARS-CoV-2 RNA. Vero E6 cells were infected with SARS-CoV-2 (Victoria [VIC], multiplicity of infection [MOI] = 1), fixed at 8 hr post infection (hpi), and hybridised with +ORF1a smFISH probe. In the remdesivir (RDV) condition, the drug was added to the cells at 10 µM during virus inoculation and maintained for the infection period. For the RNase digestion, permeabilised cells were treated with a cocktail of RNaseT1 and RNaseIII in the presence of MgCl2 to digest RNA prior to probe hybridisation. Representative full z-projection (8 µm) confocal images are shown. Scale bar = 10 µm. (B) Visualisation of encapsidated SARS-CoV-2 RNA with smFISH (left panels). Virus was immobilised onto poly-L-lysine-coated coverslips and visualised via the +ORF1a probe. Mock (negative control) condition was prepared by incubating coated coverslips in PBS without the virus. 1 µm maximum z-projected confocal images are shown. Scale bar = 20 µm or 5 µm. Density distribution of smFISH spot intensities (right panel), exhibiting a unimodal distribution (n = 1664 spots). (C) Calu-3 (upper panels) and Huh-7.5 (lower panels) cells were infected with SARS-CoV-2 (VIC) and HCoV-229E (MOI = 1), respectively, fixed at 24 hpi and hybridised with the SARS-CoV-2-specific +ORF1a probe. In addition, cells were stained with anti-dsRNA (J2) to identify heavily infected cells. Representative single slice confocal images are shown. Scale bar = 10 µm.

To verify the specificity of the +ORF1a probes for SARS-CoV-2, we aligned their sequences against other coronaviruses and the transcriptomes of both human and African green monkeys. Many of the oligonucleotides showed mismatches with SARS-CoV-1, MERS, and other coronaviruses along with human and green monkey RNAs (Figure 1D). The specificity of the +ORF1a probes highlights the applicability of our probes to detect SARS-CoV-2 in different mammalian hosts. The high level of mismatches with other coronaviruses predicts that the +ORF1a probes are unlikely to hybridise with RNAs of other coronaviruses. To evaluate this, we assessed the ability of the +ORF1a probe set to hybridise RNA from the common cold coronavirus HCoV-229E. Although the antibody against dsRNA (J2) detected dsRNA foci in the HCoV-229E-infected Huh7.5 hepatoma cells, no signal was detected with the SARS-CoV-2 +ORF1a probe set (Figure 1—figure supplement 1C). In contrast, the +ORF1a probe set bound with intense signals in J2-positive SARS-CoV-2-infected Calu-3 cells.

Next, we tested whether smFISH could be used to detect SARS-CoV-2 RNAs in formalin-fixed and paraffin-embedded (FFPE) lung tissue from experimentally infected Golden Syrian hamsters. The animals were inoculated with SARS-CoV-2 BVIC01 (5 × 104 plaque-forming unit [PFU]) by intranasal delivery and infection assessed by qPCR measurement of viral RNA (mean 6.5 × 106 copies/ml) and titration of infectious virus (mean 5.3 × 103 PFU/ml) in throat swabs at 4 days post infection. Animals were assessed daily for the onset of clinical symptoms with the most severe being the presence of laboured breathing that was noted in all infected animals from day 3 post infection onwards 26Dowall et al.2021. Infected animals lost weight from day 1 post infection and by day 4 a loss of 10% total body mass was recorded; however, no significant change was noted in body temperature. The +ORF1a probes detected SARS-CoV-2 gRNA in a representative section of the infected lung tissue, showing variable intracellular RNA levels within the lung tissue (Figure 1E). Our results demonstrate the specificity of the +ORF1a probe set to detect single molecules of SARS-CoV-2 gRNA within infected tissue samples fixed and treated in a manner similar to clinically derived tissues. We conclude that smFISH is likely to work well in clinical studies on material derived from infected patients and provide a highly sensitive method to visualise viral RNA in these samples.

Having established smFISH for the detection of SARS-CoV-2 gRNA, we used this technique to assess both the quantity and distribution of gRNA during infection. Vero E6 cells were inoculated with virus at a multiplicity of infection (MOI) of 1 for 2 hr and non-internalised viruses were removed by trypsin digestion to synchronise the infection. At 2 hr post infection (hpi), most fluorescent spots correspond to single gRNAs along with a small number of foci harbouring several gRNA copies (Figure 1F), consistent with early RNA replication events. By 8 hpi, we noted an expansion in the number of bright multi-gRNA foci, and at 24 hpi there was a further increase in the number of multi-RNA foci that localised to the perinuclear region (Figure 1F); consistent with the reported association of viral replication factories with membranous structures derived from the endoplasmic reticulum 85V’kovski et al.2021. Interestingly, our observation of individual gRNA molecules at the periphery of cells (Figure 1F) is also consistent with individual viral particles observed at the same location by electron microscopy 19Cortese et al.202047Klein et al.2020. We conclude that detection of SARS-CoV-2 + gRNA by smFISH identifies changes in viral RNA abundance and cellular distribution during early replication. Our method detected +gRNA molecules in all the expected subcellular locations, namely in virions, free in the cytoplasm and in clusters at the periphery of the nucleus, reflecting different steps of the viral life cycle 19Cortese et al.202035Hackstadt et al.202158Mendonça et al.2021.

Quantification of SARS-CoV-2 genomic and subgenomic RNAs

SARS-CoV-2 produces both gRNA and subgenomic (sg)RNAs which are both critical to express the full scope of viral proteins in the right time and stoichiometry. However, quantitation of sgRNAs is challenging due to their sequence overlap with the 3′ end of the gRNA. To estimate the abundance of sgRNAs, we designed two additional probe sets labelled with different fluorochromes; a +ORFN set that hybridises to all canonical positive sense viral RNAs, and a +ORFS set that detects only sgRNA encoding S (S-sgRNA) and gRNA (Figure 2A; 44Kim et al.2020). Therefore, spots showing fluorescence only for +ORFN or +ORFS probe sets will represent sgRNAs, whereas spots positive for both +ORFN or+ORFS and +ORF1a will correspond to gRNA molecules. We applied this approach to visualise SARS-CoV-2 RNAs in infected Vero E6 cells (6 hpi) and observed a high abundance of sgRNAs compared to gRNAs (Figure 2B), in agreement with RNA sequencing studies 1Alexandersen et al.202044Kim et al.2020. Further analysis revealed that the +ORFN single-labelled sgRNAs were more uniformly dispersed throughout the cytoplasm than dual-labelled gRNA, consistent with their predominant role as mRNAs to direct protein synthesis (Figure 2—figure supplement 1). However, gRNAs were enriched near the periphery of the nucleus in a clustered fashion. Association of gRNA with nucleocapsid (N) is essential for the assembly of coronavirus particles 13Carlson et al.202024Dinesh et al.202041Iserman et al.2020. To monitor this process in SARS-CoV-2, we combined smFISH using the +ORF1a and +ORFN probe sets with immunofluorescence detection of the viral nucleocapsid (N). Our findings show that N protein primarily co-localises with gRNA, while displaying a limited overlap with sgRNAs (Figure 2C, Figure 2—figure supplement 2). Together, these data demonstrate the specificity of our probes to accurately discriminate between the gRNA and sgRNAs.

#' @width 11.1
#' @height 4.4

line_profile_data <- read_csv("./Data/Figure2/N-protein_gRNA_line-profiles.csv")

ggplot(line_profile_data) +
  geom_line(aes(x = Distance, y = normalised, colour = signal),
            size = 0.5) +
  scale_colour_manual(values = c("#8fd175", "#d18975")) +
  labs(title = "Fluorescence profile",
       x = "Distance (\u03BCm)",
       y = "Normalised \nfluorescence") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        legend.position = "bottom",
        axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.title = element_blank()) -> plot1

# Import data
N_overlap <- read_csv("./Data/Figure2/p_NProtein-overlap_random.csv") %>%
  filter(str_detect(data_type, "Observed")) %>%
  mutate(data_type = str_remove(data_type, " Observed"))

# Get mean labels
N_overlap_anno <- N_overlap %>%
  group_by(data_type) %>%
  summarise(mean = mean(p_NCed_stringent)) %>%
  ungroup() %>%
  mutate(label = paste0(round(mean, 1), "%"))

# Statistics
N_overlap %>% group_by(data_type) %>% shapiro_test(p_NCed_stringent) # normal
star <- N_overlap %>% t_test(p_NCed_stringent ~ data_type) %>% add_significance() %>% pull(p.signif)
pval <- N_overlap %>% t_test(p_NCed_stringent ~ data_type) %>% add_significance() %>% pull(p) 
stat_anno <- paste0(star, "\np=", pval)

# Plot 
N_overlap %>%
  mutate(data_type = fct_rev(data_type)) %>%
  ggplot(aes(x = data_type, y = p_NCed_stringent, fill = data_type)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#d175b8", "#8fd175")) +
  geom_label(data = N_overlap_anno,
             aes(x = data_type, y = mean + 13, label = label), 
             size = 1.5, label.padding = unit(0.15, "lines"), label.size = 0.1,
             inherit.aes = FALSE) + 
  labs(title = "N protein colocalisation",
       x = "", 
       y = "% of molecules") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  annotate(geom = "text", x = 1.5, y = 90, label = stat_anno, size = 1.5, colour = "gray10") +
  coord_flip() -> plot2

# patchwork
plot1 + plot2

Fluorescence profiles of N immunostaining and gRNA smFISH intensity across a 2 µm linear distance (left).

Percentage of co-localised gRNA or sgRNA molecules with N protein at 6 hpi. Co-localisation was assessed by N fluorescence density within point-spread function ellipsoids of RNA spots over random coordinates. sgRNA were defined as single-coloured spots with +ORFN probe signal only (n = 7) (right).

#' @width 8
#' @height 8.25

# Import data 
dsRNA_infection <- read_csv("./Data/Figure2/p-infection_J2_positive.csv") %>%
  mutate(Time = case_when(condition == "MOCK" ~ "Mock",
                          TRUE ~ Time))

smFISH_infection <- read_csv("./Data/Figure2/p-infection_smFISH_positive_cells.csv") %>%
  mutate(Time = case_when(condition == "MOCK" ~ "Mock",
                          TRUE ~ Time))


dsRNA_intensity <- read_csv("./Data/Figure2/Quantification_J2_intensity.csv") %>%
  mutate(time.adj = as.character(time.adj)) %>%
  mutate(hours_post_infection = case_when(condition == "INF" ~ paste0(time.adj, "hpi"),
                                          condition == "MOCK" ~ "Mock"))

smFISH_count <- read_csv("./Data/Figure2/Quantification_smFISH_count.csv") %>%
  mutate(time.adj = as.character(time.adj)) %>%
  mutate(hours_post_infection = case_when(condition == "INF" ~ paste0(time.adj, "hpi"),
                                          condition == "MOCK" ~ "Mock"))

# * * * * Plots
dsRNA_infection %>%
  mutate(Time = as_factor(Time)) %>%
  mutate(Time = fct_relevel(Time, c("Mock", "2hpi", "6hpi"))) %>%
  ggplot(aes(x = Time, y = p.infected, fill = Time)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#d175b8", "#9c4c85","#6a2656")) +
  labs(title = "dsRNA (J2)",
       x = "Hours post infection", 
       y = "% cells above 95% percentile \n Mock fluorescence") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        axis.title.x = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p1

smFISH_infection %>%
  mutate(Time = as_factor(Time)) %>%
  mutate(Time = fct_relevel(Time, c("Mock", "2hpi", "6hpi"))) %>%
  ggplot(aes(x = Time, y = p.infected, fill = Time)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#8fd175", "#659e58","#204121")) +
  labs(title = "smFISH gRNA",
       x = "Hours post infection", 
       y = "% fluorescence positive cells") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        axis.title.x = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p2

dsRNA_intensity %>%
  ggplot(aes(x = factor(hours_post_infection, levels = c("Mock", "2hpi", "6hpi")), 
             y = dsRNA_density, colour = hours_post_infection)) +
  geom_quasirandom(width = 0.4, size = 0.25, alpha = 0.5) +
  geom_hline(yintercept = 1, alpha = 0.1, size = 0.25) +
  scale_y_continuous(limits = c(0, 7)) +
  # scale_y_log10(limits = c(0.4, 1e2),
  #               labels = trans_format("log10", math_format(10^.x))) +
  scale_colour_manual(values = c("#9c4c85","#6a2656", "#d175b8")) +
  labs(title = "J2 intensity", 
       x = "Hours post infection",
       y = "Normalised fluorescence density") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.title.x = element_blank(),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5)) -> p3

smFISH_count %>%
  mutate(total_count = total_count + 1) %>%
  ggplot(aes(x = factor(hours_post_infection, levels = c("Mock", "2hpi", "6hpi")), 
             y = total_count, colour = hours_post_infection)) +
  geom_quasirandom(width = 0.4, size = 0.25, alpha = 0.25) +
  geom_hline(yintercept = 1, alpha = 0.1, size = 0.25) +
  scale_y_log10(limits = c(1, 1e5),
                labels = trans_format("log10", math_format(10^.x))) +
  scale_colour_manual(values = c("#659e58","#204121", "#8fd175")) +
  labs(title = "smFISH gRNA count", 
       x = "Hours post infection",
       y = "RNA count + 1 (log10)") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.title.x = element_blank(),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5)) -> p4

# Patchwork
(p1 + p2) / (p3 + p4)

Detection of both positive and negative genomic RNA by denaturing viral double-stranded RNA (dsRNA) with DMSO and heat treatment at 80°C (upper panels).

3 µm z-projected images of infected Vero E6 cells at 8 hpi are shown. Scale bar = 10 µm. Schematic illustration of +ORF1a and -ORF1b probe targeting regions (lower panel). -ORF1b probe target region does not overlap with +ORF1a target sequences to prevent probe duplex formation.

Dissecting severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) gene expression using single-molecule fluorescence in situ hybridisation (smFISH).

(A) Schematic illustration of transcript-specific targeting of SARS-CoV-2 genomic RNA (gRNA) and subgenomic RNA (sgRNA) using smFISH. (B) Transcript-specific visualisation of gRNA and sgRNA in infected (Victoria [VIC] strain) Vero E6 cells. Cells were infected with SARS-CoV-2 (VIC strain) and hybridised with probes against +ORF1a and +ORFN probe at 6 hr post infection (hpi) (upper panels) or +ORF1a and +ORFS probe at 8 hpi (lower panels). Representative 3 µm maximum intensity projected confocal images are shown. Orange circular regions of interest (ROIs) indicate S-sgRNA encoding Spike, whereas dual-colour spots (teal-coloured ROIs) represent gRNA. Scale bar = 5, 10, or 20 µm. (C) Co-detection of viral nucleocapsid (N) with gRNA and sgRNA. Monoclonal anti-N (Ey2A clone) was used for N protein immunofluorescence. Representative 3 µm z-projected confocal images are shown. The inset shows a magnified view of co-localised N and gRNA. Scale bar = 10 µm. Fluorescence profiles of N immunostaining and gRNA smFISH intensity across a 2 µm linear distance are shown in the image inset (lower left). Percentage of co-localised gRNA or sgRNA molecules with N protein at 6 hpi. Co-localisation was assessed by N fluorescence density within point-spread function ellipsoids of RNA spots over random coordinates. sgRNA were defined as single-coloured spots with +ORFN probe signal only (n = 7) (lower right). Student’s t-test. ****p<0.0001. (D) Detection of both positive and negative genomic RNA by denaturing viral double-stranded RNA (dsRNA) with DMSO and heat treatment at 80°C (upper panels). 3 µm z-projected images of infected Vero E6 cells at 8 hpi are shown. Scale bar = 10 µm. Schematic illustration of +ORF1a and -ORF1b probe targeting regions (lower panel). -ORF1b probe target region does not overlap with +ORF1a target sequences to prevent probe duplex formation. (E) Comparison of anti-dsRNA (J2) and gRNA smFISH. Full z-projected images of infected Vero E6 cells co-stained with J2 and smFISH are shown. Scale bar = 10 µm. (F) Percentage of infected cells detected by J2 or smFISH (upper panels). For J2-based quantification, we defined the threshold as 95th percentile fluorescent signal of uninfected cells (Mock) due to the presence of endogenous host-derived signals. Fluorescent positive signals were used for smFISH-based quantification. Data are presented as mean ± SD. Comparison of quantification results between J2 stain and smFISH (lower panels). Each symbol represents one cell. J2 signal was quantified by fluorescence density over 3D cell volume, which was normalised to the average signal of uninfected control cells (horizontal dotted line). gRNA count represents sum of single-molecule spots and decomposed spots within viral factories. The symbol denoted with ‘#’ is the infected cell shown in Figure 2E (J2 stain, n = 3 independent repeats; smFISH, n = 4). One-way ANOVA and Tukey post-hoc test. n.s., not significant; *p<0.05; **p<0.01; ****p<0.0001.

#' @width 12
#' @height 8
# * * * * * Import dataframe
RDI_raw <- read_csv("./Data/Figure2/RNA-dispersion-index_Vero_6hpi_RAW_DATA.csv") 

RDI_tidy <- RDI_raw %>%
  clean_names() %>%
  dplyr::select(contains("periph"), contains("polarization"), contains("dispersion")) %>%
  pivot_longer(cols = everything(),
               names_to = "data_type",
               values_to = "index") %>%
  mutate(species = if_else(str_detect(data_type, "localized"), "ORFN", "gRNA")) %>%
  mutate(RDI_type = case_when(
    str_detect(data_type, "periph") ~ "Peripheral Distribution Index",
    str_detect(data_type, "polarization") ~ "Polarisation Index",
    str_detect(data_type, "dispersion") ~ "Dispersion Index"
  ))

# * * * * * Stats

## Normality test
RDI_tidy %>%
  group_by(RDI_type, species) %>%
  shapiro_test(index)

## Wilcox test 
RDI_tidy %>%
  group_by(RDI_type) %>%
  wilcox_test(index ~ species) %>%
  add_significance() %>%
  mutate(label = paste0("p=", signif(p, 2))) -> stat_df

# * * * * * Plot
colours <- c("#8fd175", "#d175b8")

RDI_tidy %>%
  filter(RDI_type != "Polarisation Index") %>%
  ggplot(aes(x = species, y = index, colour = species)) +
  geom_violin() + 
  geom_quasirandom(bandwidth = 0.2, size = 1.75, alpha = 0.8) +
  geom_text(data = subset(stat_df, stat_df$RDI_type != "Polarisation Index"),
            aes(x = 1.5, y = 1.25, label = label),
            inherit.aes = FALSE,
            size = 3) + 
  labs(title = "SARS-CoV-2 subcellular RNA dispersion metrics",
       x = "",
       y = "Index value",
       colour = "") +
  coord_cartesian(ylim = c(0, 1.3)) + 
  scale_colour_manual(values = colours) + 
  facet_wrap(~RDI_type, scales = "free") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7, face = "bold"),
        axis.line = element_blank(),
        legend.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        axis.text = element_text(size = 7),
        strip.text = element_text(size = 7, face = "bold")) -> RNA_dispersion_index

RNA_dispersion_index

Subcellular RNA dispersion of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic RNA (gRNA) and subgenomic RNAs (sgRNAs).

Vero E6 cells were infected with SARS-CoV-2 (Victoria [VIC] strain, multiplicity of infection [MOI] 1) and hybridised with +ORF1a and +ORFN single-molecule fluorescence in situ hybridisation (smFISH) probes at 6 hr post infection (hpi). RNA dispersion and peripheral distribution indices were calculated using the RNA distribution index (RDI) calculator on smFISH channels (see Materials and methods). Schematic diagrams of subcellular RNA distribution of select index values are shown next to the corresponding plots. Mann–Whitney U test. ****p<0.0001 (n = 32 cells).

Preferential co-localisation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) nucleocapsid protein with genomic RNA (gRNA).

Density histogram of SARS-CoV-2 nucleocapsid (N) protein voxel intensities co-localised to gRNA or subgenomic RNA (sgRNA) spots that relate to Figure 2C. Schematic diagram of co-staining for N protein and viral RNA is shown. In addition to the ‘observed’ gRNA and sgRNA spots, we performed a ‘random’ simulation to place the same density of RNA spots within infected cells and calculated the voxel intensities that correspond to chance co-localisation. The vertical dotted line corresponds to the threshold of co-detection defined as 2 standard deviation value from the random distribution analysis (n = 7 field of views).

Denaturation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) double-stranded RNA (dsRNA) for negative strand detection.

(A) Comparison of dsRNA denaturation efficiency assessed by the reduction of anti-dsRNA (J2) immunofluorescence. Vero E6 cells infected with SARS-CoV-2 (Victoria [VIC], multiplicity of infection [MOI] = 1) were fixed at 8 hr post infection (hpi) and treated with DMSO, formamide, or NaOH prior to immunostaining (see Materials and methods). DMSO and formamide treatment was performed at 80°C. Representative low-magnification single-slice confocal images are shown. Formamide treatment with heat resulted in cell detachment from coverslips. Scale bar = 200 µm. (B) Sensitivity of +ORF1a single-molecule fluorescence in situ hybridisation (smFISH) and J2 immunofluorescence signal to RNase digestion. Fixed infected cells (24 hpi) were treated with RNaseT1 and/or RNaseIII to digest single-stranded RNA and/or dsRNA, respectively, hybridised with +ORF1a probe and stained with J2. Representative full z-projected confocal images are shown, which are single-molecule contrast matched. Scale bar = 20 µm. (C) RNaseT1 digestion of denatured dsRNA. Fixed infected cells (8 hpi) were treated as follows: (i) DMSO/heat only (left); (ii) RNaseT1 then DMSO/heat (centre); or (iii) in the reverse order of DMSO/heat and then RNaseT1 (right). Treated cells were hybridised with +ORF1a and -ORF1b probes (see Figure 2D for schematics) and stained with J2. RNaseT1 digestion followed by DMSO treatment shows that viral dsRNAs are resistant to RNaseT1 activity, but DMSO treatment preceding RNaseT1 suggests that the denatured dsRNA can be targeted by RNaseT1. Full z-projected confocal images are shown. Scale bar = 10 µm.

Negative sense gRNA and sgRNAs are the templates for the synthesis of positive sense RNAs and are expected to localise to viral replication factories. However, their detection by RT-qPCR or sequencing is hampered by cDNA library protocols that employ oligo(dT) selection and by primer binding to dsRNA structures 65Ramanan et al.201673Sethna et al.1991. To detect negative sense viral RNAs, we denatured dsRNA complexes through either formamide, DMSO, or sodium hydroxide treatment 76Singer et al.202190Wilcox et al.2019. The combination of DMSO with heat treatment resulted in a loss of anti-dsRNA J2 signal, while maintaining cell integrity, suggesting a disruption of dsRNA hybrids (Figure 2—figure supplement 3A). We designed an smFISH probe set specific for the ORF1b antisense sequence that targets the negative sense gRNA (-gRNA) and resulted in intense diffraction-limited spots in DMSO and heat-treated cells (Figure 2D). The -gRNA spots were detected at a significantly lower level than their +gRNA counterparts, with substantial overlap observed between the two strands at multi-RNA spots, consistent with these foci representing active sites of viral replication. To determine if these multi-RNA foci contain dsRNA, the permeabilised infected cells were treated with RNaseT1 or RNaseIII, which are nucleases specific for single-stranded RNA (ssRNA) and dsRNA, respectively (Figure 2—figure supplement 3B). RNaseT1 digestion diminished the +ORF1a probe signal, while RNaseIII treatment abolished the anti-dsRNA J2 signal. A cocktail of RNaseT1 and RNaseIII ablated both +ORF1a probe binding and anti-dsRNA J2 signals, demonstrating that the +ORF1a probe set hybridises to both single and duplex RNA under our experimental conditions. Furthermore, treating cells with DMSO prior to RNaseT1 fully ablated the smFISH signal (Figure 2—figure supplement 3C), demonstrating that denaturation makes dsRNA accessible for RNaseT1 degradation. In summary, our data show that probe binding to negative strand gRNA requires chemical denaturation, suggesting that this replication intermediate is rich in dsRNA structures.

Anti-dsRNA antibodies underestimate SARS-CoV-2 replication

The establishment of replication factories is a critical phase of the virus life cycle. Previous reports have identified these viral factories using the J2 dsRNA antibody 8Burgess and Mohr201519Cortese et al.202080Targett-Adams et al.200889Weber et al.2006. However, this approach depends on high levels of viral dsRNA as cells naturally express endogenous low levels of dsRNA (22Dhir et al.2018; 45Kimura et al.2018; Figure 2E). To evaluate the ability of J2 antibody to quantify SARS-CoV-2 replication sites, we co-stained infected cells at 2 and 6 hpi with both J2 and +ORF1a smFISH probes. No viral-specific J2 signal was detected at 2 hpi, and only 10% of infected cells stained positive at 6 hpi, in agreement with previous observations (19Cortese et al.2020; 27Eymieux et al.2021; Figure 2E). In contrast, more than 85% of the cells showed diffraction-limited smFISH signals at both timepoints (Figure 2F). Furthermore, the average J2 signal detected in the SARS-CoV-2-infected cells at both timepoints was comparable to uninfected cells (Figure 2F). These data clearly show that the J2 antibody, although broadly used, underestimates the frequency of SARS-CoV-2 infection. In contrast, smFISH detected gRNA as early as 2 hpi, with a significant increase in copy number by 6 hpi, highlighting its utility to detect and quantify viral replication factories.

SARS-CoV-2 replication at single-molecule resolution

The efficiency and sensitivity of smFISH to detect single molecules of SARS-CoV-2 RNA allowed us to investigate the dynamics of viral replication in Vero E6 cells during the first 10 hr of infection (Figure 3A). At 2 hpi, the +ORF1a probe set detected predominantly single molecules of +gRNA with a median value of ~30 molecules per cell (Figure 3B and C). Interestingly, at 2 hpi RDV treatment did not affect the number of gRNA copies per cell, suggesting that these RNAs derive from incoming viral particles (Figure 3C). In contrast, the increase in gRNA copies per cell at 4 and 6 hpi was inhibited by RDV, indicating active viral replication. The infected cell population showed varying gRNA levels that we classified into three groups; (i) ‘partially resistant’ cells with <102 gRNA copies that showed no increase in gRNA burden between 2 and 8 hpi (60% of the population); (ii) ‘permissive’ cells with ~102–105 copies per cell showing a modest increase over time (~30%); and (iii) ‘super-permissive’ cells with >105 copies per cell showing a sharp increase in gRNA copies (~10%). Given the high gRNA density in super-permissive cells, RNA counts were estimated by correlating the integrated fluorescence intensity of reference single molecules to the total fluorescence of the 3D cell volume (see Materials and methods), which follows a linear relationship (Figure 3—figure supplement 1). Analysing the total cellular gRNA content showed that ‘super-permissive’ cells are the dominant source of gRNA across the culture (Figure 3D). This suggests that bulk RNA analyses such as RT-qPCR are biased towards this high gRNA burden group. Importantly, we found that cellular heterogeneity persists beyond the initial hours of infection. Even at 24 hpi, 40% of the cells did not reach the super-permissive state, and they formed a distinct population with approximately 10-fold less gRNA (Figure 3C and E). Similar heterogeneous cell populations were observed between 24 and 48 hpi, although the overall levels of gRNA started to decline after 32 hpi, reflecting cytotoxic effects and virus egress (Figure 3—figure supplement 2A–C). Therefore, these results highlight a wide variation in cell susceptibility to SARS-CoV-2 replication, which persisted throughout the infection. Notably, the high level of gRNA content in super-permissive cells (~107 counts/cell) was similar throughout the time course, suggesting the existence of an upper limit of gRNA copies in Vero E6 cells (Figure 3C).

Profiling severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) replication kinetics at single-molecule resolution.

(A) Experimental design to profile SARS-CoV-2 replication kinetics using single-molecule fluorescence in situ hybridisation (smFISH). Vero E6 cells were seeded on cover-glass and 24 hr later inoculated with SARS-CoV-2 (Victoria [VIC] strain, multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion and cells fixed at the timepoints shown for hybridisation with +ORF1a and +ORFN probes. In the remdesivir (RDV) condition, the drug was added to cells at 10 µM during virus inoculation and maintained for the infection period. (B) Maximum z-projected confocal images of infected cells. Intensity calibration bars are shown for the +ORF1a and +ORFN channels. Scale bar = 10 µm. (D) Relative contribution of viral gRNA within the infected cell population. The infected cells were classified into three groups based on gRNA counts: (i) ‘partially resistant’ – gRNA <100; ‘permissive’ – 100 < gRNA < 105; ‘super-permissive’ – gRNA >105. The total gRNA within the infected wells was obtained by summing gRNA counts in population, and the figure shows the relative fraction from each classification. Representative max-projected images of cells in each category are shown (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2).

# * * * * *  Prepare dataframes

# Full dataframe
RNA_df <- read_csv("./Data/Figure3/Kinetics_RNA_quant_df5.csv")

# Ratio dataframe
RNA_df_ratio <- RNA_df %>%
  pivot_wider(id_cols = c(Label, time, condition, repeat_exp),
              names_from = channel,
              values_from = total_vRNAs) %>%
  mutate(sgRNA = ORFN - gRNA) %>%
  mutate(ratio = sgRNA/gRNA) %>%
  filter(sgRNA >= 0) %>% 
  filter(ratio != Inf) 

RNA_df_ratio %>%
  filter(time == 10) %>%
  filter(condition == "Untreated") %>%
  mutate(category = case_when(
    gRNA <= 100 ~ "Low",
    gRNA > 100 & gRNA < 100000 ~ "Mid",
    gRNA >= 100000 ~ "High")) %>%
  group_by(category) %>%
  summarise(n = n())

# Some function for plotting
median_se <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }
  mean_x <- stats::median(x, na.rm = TRUE)
  sd_x <- WRS2::msmedse(x, sewarn = FALSE)
  data.frame("y" = mean_x,
             "ymin" = mean_x - sd_x,
             "ymax" = mean_x + sd_x)
}

# * * * * * Plotting
# gRNA kinetics 
RNA_df %>%
  filter(channel == "gRNA") %>%
  filter(total_vRNAs > 0) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = as.factor(time), y = total_vRNAs, colour = condition)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.5, dodge.width = 0.75),
            size = 0.5, alpha = 0.25, stroke = 0.1) +
  geom_hline(yintercept = 100000, linetype = "dashed", size = 0.3, alpha = 0.25) + 
  geom_hline(yintercept = 100, linetype = "dashed", size = 0.3, alpha = 0.25) + 
  scale_y_log10(limits = c(10^0, 10^8),
                breaks = trans_breaks('log10', function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "gRNA kinetics",
       x = "Hours post infection",
       y = "Total RNA count per cell (Log)") +
  scale_colour_manual(values = c("#406e3c", "#8fd175")) + 
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = c(0.2, 0.9),
        legend.background = element_blank(),
        legend.key.size = unit(0.3, "cm"),
        legend.title = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  guides(color = guide_legend(override.aes = list(size = 1.5, alpha = 1))) -> gRNA_kinetics

gRNA_kinetics

# sgRNA kinetics 
RNA_df_ratio %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = as.factor(time), y = sgRNA, colour = condition)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.5, dodge.width = 0.75),
            size = 0.5, alpha = 0.25, stroke = 0.1) +
  geom_hline(yintercept = 100000, linetype = "dashed", size = 0.3, alpha = 0.25) + 
  geom_hline(yintercept = 100, linetype = "dashed", size = 0.3, alpha = 0.25) + 
  scale_y_log10(limits = c(10^0, 10^8),
                breaks = trans_breaks('log10', function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "sgRNA kinetics",
       x = "Hours post infection",
       y = "Total RNA count per cell (Log)") +
  scale_colour_manual(values = c("#83396d", "#d175b8")) + 
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = c(0.2, 0.9),
        legend.background = element_blank(),
        legend.key.size = unit(0.3, "cm"),
        legend.title = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  guides(color = guide_legend(override.aes = list(size = 1.5, alpha = 1))) -> sgRNA_kinetics

sgRNA_kinetics

Bigfish quantification of +gRNA or +sgRNA counts per cell.

sgRNA counts were calculated by subtracting +ORF1a counts from +ORFN counts per cell. Horizontal dashed lines indicate 102 or 105 molecules of RNA. 24 hr post infection (hpi) samples and the cells harbouring >107 RNA counts were quantified by extrapolating single-molecule intensity. Quantified cells from all replicates are plotted (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2). Number of cells analysed (untreated/RDV): 2 hpi, 373/273; 4 hpi, 798/516; 6 hpi, 370/487; 8 hpi, 1442/1022; 10 hpi, 1175/1102; 24 hpi, 542/249.

#' @width 10
#' @height 6
# * * * * * Prepare dataframe 
# Get count sum
count_sum <- RNA_df_ratio %>% 
  mutate(category = case_when(
    gRNA <= 100 ~ "Low",
    gRNA > 100 & gRNA < 100000 ~ "Mid",
    gRNA >= 100000 ~ "High")) %>%
  group_by(time, condition, category) %>%
  summarise(ORFN_sum = sum(ORFN),
            gRNA_sum = sum(gRNA),
            sgRNA_sum = sum(sgRNA)) %>%
  ungroup() %>%
  group_by(time, condition) %>%
  summarise(ORFN_total = sum(ORFN_sum),
            gRNA_total = sum(gRNA_sum),
            sgRNA_total = sum(sgRNA_sum))

# Get dataframe for relative contiribution of viral RNA from each cell cateogry
rel_count <- RNA_df_ratio %>% 
  mutate(category = case_when(
    gRNA <= 100 ~ "Low",
    gRNA > 100 & gRNA < 100000 ~ "Mid",
    gRNA >= 100000 ~ "High")) %>%
  group_by(time, condition, category) %>%
  summarise(ORFN_sum = sum(ORFN),
            gRNA_sum = sum(gRNA),
            sgRNA_sum = sum(sgRNA),
            cell_count = n()) %>%
  ungroup() %>%
  left_join(count_sum, by = c("time", "condition")) %>%
  mutate(ORFN_p = ORFN_sum/ORFN_total * 100,
         gRNA_p = gRNA_sum/gRNA_total * 100,
         sgRNA_p = sgRNA_sum/sgRNA_total * 100)

# * * * * * Plotting
# gRNA relative contribution 
ordered_category <- c("High", "Mid", "Low")
rel_count %>%
  mutate(condition = fct_rev(condition)) %>%
  mutate(category = fct_relevel(category, ordered_category)) %>%
  mutate(category = fct_recode(category, 
                               "High: >1e5" = "High",
                               "Mid: 100~1e5 " = "Mid",
                               "Low: <100" = "Low")) %>%
  ggplot(aes(x = as.factor(time), y = gRNA_p, fill = category)) +
  geom_col(width = 0.8, alpha = 0.8) +
  scale_fill_manual(values = c("#2D708EFF", "#29AF7FFF", "#B8DE29FF")) +
  labs(title = "Relative contribution to total vRNA",
       x = "Hours post infection",
       y = "Percentage",
       fill = "Cellular gRNA burden") +
  facet_wrap(~ condition) +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "right",
        legend.key.size = unit(0.3 ,"cm"),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> rel_vRNA

rel_vRNA

Relative contribution of viral gRNA within the infected cell population.

The infected cells were classified into three groups based on gRNA counts: (i) ‘partially resistant’ – gRNA <100; ‘permissive’ – 100 < gRNA < 105; ‘super-permissive’ – gRNA >105. The total gRNA within the infected wells was obtained by summing gRNA counts in population, and the figure shows the relative fraction from each classification.

# Import data
RDV_psup <- read_csv("./Data/Figure3/RDV_p-superinfected.csv") %>%
  group_by(File, time, condition) %>%
  summarise(p_sup = sum(superinfected)/n()*100) %>% ungroup() %>%
  mutate(time = fct_rev(time),
         condition = fct_rev(condition))

# Statistics
RDV_psup %>% group_by(time, condition) %>% shapiro_test(p_sup) # Not normal
RDV_psup %>% group_by(time) %>% dunn_test(p_sup ~ condition, p.adjust.method = "holm")

# Annotation dataframe
RDV_psup_anno <- RDV_psup %>% group_by(time) %>% 
  dunn_test(p_sup ~ condition, p.adjust.method = "holm") %>% ungroup() %>%
  mutate(p.adj = formatC(p.adj, format = "e", digits = 1)) %>%
  mutate(label = paste0(p.adj.signif,"\np=",p.adj)) %>%
  mutate(time = fct_rev(time)) %>%
  mutate(y_pos = case_when(
    time == "8 hpi" ~ 30,
    time == "24 hpi" ~ 79
  ))

RDV_psup_meanlabel <- RDV_psup %>% group_by(time, condition) %>% 
  summarise(mean = mean(p_sup)) %>% ungroup() %>%
  mutate(mean = round(mean, 1)) %>%
  mutate(label = paste0(mean, "%")) %>%
  mutate(time = fct_rev(time))


# Plot
RDV_psup %>%
  ggplot(aes(x = time, y = p_sup, fill = condition)) +
  geom_bar(stat = "summary", width = 0.65, position = position_dodge(width = 0.75), alpha = 0.8) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4,
                 position = position_dodge(width = 0.75)) +
  scale_y_continuous(breaks = seq(0, 80, by = 20),
                     limits = c(0, 80)) + 
  scale_fill_manual(values = c("#26547c", "#75b8d1")) +
  geom_label(data = RDV_psup_meanlabel,
             aes(x = time, y = mean + 9, label = label, group = condition),
             size = 1.5, label.padding = unit(0.05, "lines"), label.size = 0.075,
             inherit.aes = FALSE,
             position = position_dodge(width = 0.75)) +
  labs(title = "% Super-permissive",
       x = "", 
       y = "% of cells") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = c(0.25, 0.9),
        legend.direction = "vertical",
        legend.background = element_blank(),
        legend.title = element_blank(),
        legend.key.size = unit(0.1 ,"cm"),
        legend.text = element_text(size = 5),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  geom_text(data = RDV_psup_anno, aes(x = time, y = y_pos, label = label), 
            size = 1.5, alpha = 0.7, inherit.aes = FALSE) -> p5

p5

Percentage of super-permissive cells in untreated and RDV-treated conditions at 8 and 24 hpi.

Labels represent average values. Data are represented as mean ± SD (n = 3, ~ 2000 cells were scanned from each replicate well). Student’s t-test. ***p<0.001; ****p<0.0001.

sgRNA_exp_cells <- read_csv("./Data/Figure3/sgRNA-expressing_cells_detection_normalised.csv")

sgRNA_exp_cells %>%
  filter(!(time == 2 & condition == "RDV" & repeat_exp == "R2")) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = as.factor(time), y = p_permissive, colour = condition)) + 
  geom_point(stat = "summary", size = 1.75) + 
  geom_line(aes(group = condition), stat = "summary") +
  geom_linerange(stat = "summary", size = 0.25) +
  scale_colour_manual(values = c("#83396d", "#d175b8")) +
  scale_y_continuous(limits = c(30, 110)) +
  labs(title = "Viral factory kinetics",
       x = "Hours post infection",
       y = "Number of factories per cell") +
  labs(title = "% sgRNA expressing cells",
       x = "Hours post infection",
       y = "Percentage") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = c(0.8, 0.3),
        legend.background = element_blank(),
        legend.key.size = unit(0.3, "cm"),
        legend.title = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p_sgRNA

p_sgRNA

Percentage of infected cells expressing sgRNA. sgRNA-expressing cells were identified by those having a (ORF-N – ORF1a) probe count more than 1.

Data are represented as mean ± SEM (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2)

## Ratio by time
RNA_df_ratio %>%
  filter(ratio != Inf) %>%
  mutate(condition = fct_rev(condition))  %>%
  ggplot(aes(x = as.factor(time), y = ratio, colour = condition)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.5, dodge.width = 0.75),
            size = 0.5, alpha = 0.2, stroke = 0.1) +
  geom_point(aes(x = as.factor(time), y = ratio, group = condition),
                  stat = "summary", fun = median, position = position_dodge(width = 0.75), 
                  size = 1, colour = "gray50",
                  inherit.aes = FALSE, show.legend = FALSE) +
  geom_line(aes(group = condition), stat = "summary", 
            alpha = 0.75) +
  geom_hline(yintercept = 1, linetype = "dashed",
             size = 0.5, colour = "gray75") + 
  scale_y_continuous(limits = c(0, 30)) +
  labs(title = "sgRNA/gRNA ratio by hpi",
       x = "Hours post infection",
       y = "sgRNA/gRNA ratio") +
  scale_colour_manual(values = c("#26547c", "#75b8d1")) + 
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.title = element_blank(),
        legend.position = c(0.20, 0.85),
        legend.background = element_blank(),
        legend.direction = "vertical",
        legend.key.size = unit(0.3, "cm"),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  guides(color = guide_legend(override.aes = list(size = 2, alpha = 1))) -> ratio_time

ratio_time

Per cell ratio of sgRNA/gRNA counts across the time series.

Grey symbols represent cell-to-cell median values, whereas the line plot represents ratio calculated from population sum of gRNA and sgRNA. The number of cells analysed is the same as in (Figure 3C), with the exception of cells having equal ORF1a and ORF-N probe counts. Horizontal dashed line represents value of 1 (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2). Data are represented as median ± SEM. Horizontal dashed line represents value of 1 (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2).

## Ratio by category
ratio_anno_category <- RNA_df_ratio %>%
  filter(gRNA > 10) %>%
  mutate(category = case_when(
    gRNA <= 100 ~ "Low",
    gRNA > 100 & gRNA < 100000 ~ "Mid",
    gRNA >= 100000 ~ "High")) %>%
  group_by(category, time, condition) %>%
  summarise(ratio_bulk = mean(ratio)) %>% ungroup() %>%
  mutate(condition = fct_rev(condition))

RNA_df_ratio %>%
  filter(gRNA > 10) %>%
  mutate(category = case_when(
    gRNA <= 100 ~ "Low",
    gRNA > 100 & gRNA < 100000 ~ "Mid",
    gRNA >= 100000 ~ "High")) %>%
  mutate(condition = fct_rev(condition)) %>%
  mutate(category = fct_relevel(category, c("Low", "Mid", "High"))) %>%
  ggplot(aes(x = as.factor(category), y = ratio, colour = condition)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.5, dodge.width = 0.75),
            size = 0.5, alpha = 0.2, stroke = 0.1) +
  geom_pointrange(data = ratio_anno_category,
                  aes(x = as.factor(category), y = ratio_bulk, colour = condition),
                  stat = "summary",
                  size = 0.4,
                  position = position_dodge(width = 0.75), 
                  inherit.aes = FALSE, show.legend = FALSE) +
  geom_hline(yintercept = 1, linetype = "dashed",
             size = 0.5, colour = "gray75") +
  labs(title = "sgRNA/gRNA ratio by gRNA burden",
       x = "gRNA burden classification",
       y = "sgRNA/gRNA ratio") +
  scale_colour_manual(values = c("#26547c", "#75b8d1")) + 
  coord_cartesian(ylim = c(0, 30)) +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.title = element_blank(),
        legend.position = c(0.8, 0.85),
        legend.background = element_blank(),
        legend.direction = "vertical",
        legend.key.size = unit(0.3, "cm"),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  guides(color = guide_legend(override.aes = list(size = 2, alpha = 1))) -> ratio_category

ratio_category

Per cell ratio of sgRNA/gRNA counts grouped by gRNA burden classification as defined in Figure 3D.

Data are represented as median ± SEM. Horizontal dashed line represents value of 1 (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2).

RNA_df %>%
  mutate(condition = fct_rev(condition)) %>%
  filter(time != 24) %>%
  filter(channel == "gRNA") %>%
  ggplot(aes(x = as.factor(time), y = repSites, colour = condition)) +
  geom_line(aes(group = condition), stat = "summary", alpha = 0.4) +
  geom_linerange(stat = "summary", size = 0.25) +
  geom_point(stat = "summary", size = 1.75) +
  scale_colour_manual(values = c("#26547c", "#75b8d1")) +
  labs(title = "Viral factory kinetics",
       x = "Hours post infection",
       y = "Number of factories per cell") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = c(0.2, 0.8),
        legend.background = element_blank(),
        legend.key.size = unit(0.3, "cm"),
        legend.title = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> vfactory_plot

vfactory_plot

The number of viral factories per cell increase over time as assessed by smFISH cluster detection.

Cells harbouring >107 copies of RNA, less than 10 molecules of RNA, cells with no viral factories, and cells from 24 hpi timepoints were excluded from this analysis. Data are represented as mean ± SEM. Number of cells analysed (untreated/RDV): 2 hpi, 494/240; 4 hpi, 758/494; 6 hpi, 315/417; 8 hpi, 933/877; 10 hpi, 726/885 (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2).

factory_df <- read_csv("./Data/Figure3/Kinetics_vFactory.csv")

factory_df %>%
  filter(channel == "gRNA") %>%
  mutate(time = fct_rev(as.factor(time))) %>%
  ggplot(aes(x = as.factor(time), y = repSite_content, colour = condition)) +
  geom_point(position = position_jitterdodge(jitter.width = 0.2, dodge.width = 0.75),
            size = 0.3, alpha = 0.5, stroke = 0.1) +
  geom_violin(position = position_dodge(width = 0.75),
              alpha = 0.4, size = 0.2) +
  # geom_boxplot(position = position_dodge(width = 0.75),
  #              width = 0.2) +
  scale_y_log10() +
  scale_colour_manual(values = c("#8fd175", "#406e3c")) +
  labs(title = "Number of gRNA per factory",
       x = "Hours post infection",
       y = "# Molecules per factory (log)") +
  coord_flip(ylim = c(4, 20000)) +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size =7),
        axis.line = element_blank(),
        legend.position = c(0.8, 0.8),
        legend.background = element_blank(),
        legend.key.size = unit(0.3, "cm"),
        legend.title = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  guides(colour = guide_legend(reverse=TRUE)) -> vf_content

vf_content

The kinetics of gRNA copies within viral factories.

Spatially extended viral factories were resolved by cluster decomposition to obtain single-molecule counts. The type and number of cells analysed are the same as in Figure 3I (2–8 hpi, n ≥ 3; 10 and 24 hpi, n = 2). gRNA, genomic RNA; sgRNA, subgenomic RNA.

# Import data
calibration_df <- read_csv("./Data/Figure3/intensity_calibration.csv")

# Plot
calibration_df %>%
  ggplot(aes(x = total_vRNAs_intensity, y = total_vRNAs_smFISH)) +
  geom_point(size = 0.4, alpha = 0.80, colour = "gray50", stroke = 0) +
  geom_smooth(method = "lm", colour = "#3f2d54", size = 0.3) +
  scale_x_log10(labels = trans_format("log10", math_format(10^.x))) +
  scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "smFISH intensity calibration",
       x = "RNA count: smFISH",
       y = "RNA count: Integrated intensity") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 8),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> intensity_calibration

intensity_calibration

Correlation between single-molecule fluorescence in situ hybridisation (smFISH) RNA counts and smFISH fluorescence intensity.

Correlation of smFISH RNA counts and cellular smFISH fluorescence intensity per cell. Non-‘super-permissive’ cells infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (multiplicity of infection [MOI], 1, 8–10 hr post infection [hpi]) were hybridised with +ORF1a probes and RNA counts quantified using Bigfish (absolute RNA count) or via estimation using integrated intensity. RNA count by Integrated intensity method was quantified using reference single-molecule intensity and relating it to the total 3D fluorescence of the cells. A linear correlation was fitted using geom_smooth() function of ‘ggplot2’ R package (n = 383 cells).

The dynamics and heterogeneity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA replication.

(A) Experimental design to profile SARS-CoV-2 replication kinetics in the late-stage infection. Vero E6 cells were seeded on cover-glass and 24 hr later inoculated with SARS-CoV-2 (Victoria [VIC] strain, multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion, and cells were fixed at timepoints shown for hybridisation with +ORF1a probe. In remdesivir (RDV) condition, the drug (10 µM) was added to cells at 24 hpi and maintained for the times shown. (B) Representative full z-projected confocal images of infected cells from the time series. Viral genomic RNA (gRNA) was visualised with +ORF1a probes. Images were contrasted to equivalent single-molecule intensity. Scale bar = 20 µm.

# Import data
decay_df_untreated <- read_csv("./Data/Figure3/CF06_RNA-quantification_df6.csv") %>%
  filter(condition == "Untreated")
decay_df_RDV <- read_csv("./Data/Figure3/CF06_RNA-quantification_df6.csv") %>%
  filter(condition == "RDV")

# Decay functions fitted from jupyter file
gRNA_RDV_fit <- function(x){
  5208835.4894508235 * exp(-0.09940804642469374 * (x+24)) -23790.892368007248
}

gRNA_INF_fit <- function(x){
  3500853.795573997 * exp(-0.08492950876764101 * (x+24)) -16784.180473896864
}

decay_line_gRNA_RDV <- tibble(time = seq(0, 35, 0.01)) %>%
  mutate(fit = gRNA_RDV_fit(time), condition = "RDV") %>%
  mutate(time = time + 24)

decay_line_gRNA_INF <- tibble(time = seq(0, 35, 0.01)) %>%
  mutate(fit = gRNA_INF_fit(time), condition = "Untreated") %>%
  mutate(time = time + 24)

decay_anno <- tibble(
  condition = c("Untreated", "RDV"),
  halflife = c(8.2, 7.0)) %>%
  mutate(x_pos = halflife + 4) %>%
  mutate(label = paste0("t1/2 = ", halflife, " hrs")) 

decay_anno_INF <- decay_anno %>% filter(condition == "Untreated")

decay_anno_RDV <- decay_anno %>% filter(condition == "RDV")

# Define median_se function
median_se <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }
  mean_x <- stats::median(x, na.rm = TRUE)
  sd_x <- WRS2::msmedse(x, sewarn = FALSE)
  data.frame("y" = mean_x,
             "ymin" = mean_x - sd_x,
             "ymax" = mean_x + sd_x)
}

# Plot
decay_df_untreated %>%
  mutate(condition = fct_rev(condition)) %>%
  filter(Channel == "gRNA") %>%
  ggplot(aes(x = time, y = RNA_count)) +
  geom_point(position = position_jitter(width = 0.7),
             alpha = 0.1, stroke = 0.1,size = 0.6, colour = "#26547c") + 
  geom_pointrange(stat = "summary", fun.data = median_se,
                  size = 0.3, alpha = 1, colour = "#26547c") +
  geom_line(data = decay_line_gRNA_INF, aes(x = time, y = fit, group = condition),
            linetype = "dashed", colour = "#d1ab75", size = 0.5,
            inherit.aes = FALSE) +
  geom_vline(data = decay_anno_INF, aes(xintercept = x_pos+24, group = condition),
             linetype = "dotted", colour = "gray30", size = 0.4) +
  geom_text(data = decay_anno_INF, aes(x = x_pos + 26, y = 1500000, label = label, group = condition),
            size = 2, hjust = 0,
            inherit.aes = FALSE) +
  scale_x_continuous(breaks = c(24, 26, 28, 32, 48),
                     limits = c(22, 50)) +
  scale_y_continuous(labels = label_number(scale = 1/1e6, accuracy = 0.1)) +
  coord_cartesian(ylim = c(0, 5*10^6)) +
  labs(title = "gRNA count (Untreated)",
       x = "Hours post infection",
       y = expression(RNA~count~(x10^6)~linear~scale)) + 
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> decay_UT

decay_df_RDV %>%
  mutate(condition = fct_rev(condition)) %>%
  filter(Channel == "gRNA") %>%
  ggplot(aes(x = time, y = RNA_count)) +
  geom_point(position = position_jitter(width = 0.7),
             alpha = 0.2, stroke = 0.1, size = 0.6, colour = "#75b8d1") + 
  geom_pointrange(stat = "summary", fun.data = median_se,
                  size = 0.3, alpha = 1, colour = "#26547c") +
  geom_line(data = decay_line_gRNA_RDV, aes(x = time, y = fit, group = condition),
            linetype = "dashed", colour = "#d1ab75", size = 0.5,
            inherit.aes = FALSE) +
  geom_vline(data = decay_anno_RDV, aes(xintercept = x_pos+24, group = condition),
             linetype = "dotted", colour = "gray30", size = 0.4) +
  geom_text(data = decay_anno_RDV, aes(x = x_pos + 26, y = 1500000, label = label, group = condition),
            size = 2, hjust = 0,
            inherit.aes = FALSE) +
  scale_x_continuous(breaks = c(24, 26, 28, 32, 48),
                     limits = c(22, 50)) +
  scale_y_continuous(labels = label_number(scale = 1/1e6, accuracy = 0.1)) +
  coord_cartesian(ylim = c(0, 5*10^6)) +
  labs(title = "gRNA count (RDV treatment at 24 hpi)",
       x = "Hours post infection",
       y = expression(RNA~count~(x10^6)~linear~scale)) + 
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> decay_RDV
  
decay_UT + decay_RDV

Quantification of viral gRNA counts in untreated and RDV-treated cells.

Reference single RNA molecule intensity was acquired using Bigfish in signal-sparse region of the images. Viral gRNA counts were quantified by extrapolating single-molecule intensity to 3D integrated intensity per cell. Decay curve was fitted with the median RNA count values using the last three timepoints. Data are represented as median ± SEM (n = 2).

#' @width 13
#' @height 5.5
# * * * * * Import data 

decay_df_super <- read_csv("./Data/Figure3/CF06_RNA-quantification_df6.csv") %>%
  filter(Channel == "gRNA") %>%
  mutate(time = paste0(time, "h")) %>%
  mutate(cell_id = str_remove(cell_id, "[:digit:]$")) %>%
  mutate(cell_id = str_remove(cell_id, "[:digit:]$"))

# * * * * * Calculated super-permissive cell proportion

decay_df_super_plot <- decay_df_super %>%
  mutate(is_super = if_else(
    RNA_count > 100000,
    TRUE,
    FALSE
  )) %>%
  group_by(time, condition, cell_id) %>%
  summarise(n_cell = n(),
            n_super = sum(is_super),
            p_super = sum(is_super)/n() * 100) %>%
  ungroup() %>%
  mutate(condition = fct_rev(condition))

# * * * * * Stats

decay_df_super_plot %>%
  group_by(time) %>%
  t_test(p_super ~ condition) %>%
  add_significance() -> decay_df_super_plot_stat
  

# * * * * * Plot

decay_df_super_plot %>%
  ggplot(aes(x = time, y = p_super, fill = condition)) +
  geom_bar(stat = "summary", width = 0.6,
           position = position_dodge(width = 0.75)) +
  geom_linerange(size = 0.5, alpha = 1,
                 stat = "summary", fun.data = "mean_se", fun.args = list(mult = 1),
                 position = position_dodge(width = 0.75)) +
  geom_text(data = decay_df_super_plot_stat,
            aes(x = time, y = 85, label = p.signif),
            size = 2.5,
            inherit.aes = FALSE) +
  labs(title = "Percentage super-permissive cells",
       x = "Hours post infection",
       y = "% of cells (per FOV)") +
  scale_fill_manual(values = c("#26547c", "#75b8d1")) +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        axis.text.x = element_text(hjust = 0.5, vjust = 0.5),
        legend.position = "right",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> decay_super_plot

# * * * * * Import data

decay_cellcount <- read_csv("./Data/Figure3/CF06_viable_cell_count.csv")

# * * * * *  Cell count plot

mock_avg <- decay_cellcount %>%
  filter(condition == "Mock") %>%
  group_by(time) %>%
  summarise(avg_count = mean(n_cell)) %>% ungroup()

decay_cellcount_df <- decay_cellcount %>%
  left_join(mock_avg, by = "time") %>%
  mutate(norm_count = n_cell/avg_count)

## Statistics 
decay_cellcount_anno <- decay_cellcount_df %>% 
  filter(condition != "Mock") %>%
  group_by(time) %>%
  t_test(norm_count ~ condition) %>% add_significance()

## Plot
decay_cellcount_df %>%
  filter(condition != "Mock") %>%
  mutate(condition = fct_relevel(condition, c("Untreated", "RDV"))) %>%
  ggplot(aes(x = time, y = norm_count, fill = condition)) +
  geom_bar(stat = "summary", width = 0.6,
           position = position_dodge(width = 0.75)) +
  geom_linerange(aes(x = time, y = norm_count, group = condition), size = 0.5, alpha = 1,
                 stat = "summary", fun.data = "mean_se", fun.args = list(mult = 1),
                 position = position_dodge(width = 0.75),
                 inherit.aes = FALSE) +
  geom_text(data = decay_cellcount_anno,
            aes(x = time, y = 0.9, label = p.signif),
            size = 2.5,
            inherit.aes = FALSE) + 
  scale_fill_manual(values = c("#26547c", "#75b8d1")) +
  scale_y_continuous(limits = c(0, 1)) +
  labs(title = "Proportion of viable cells",
       x = "Hours post infection",
       y = "Normalised cell count (per FOV)") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        axis.text.x = element_text(hjust = 0.5, vjust = 0.5),
        legend.position = "right",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> decay_cellcount

decay_super_plot + decay_cellcount + plot_layout(guides = "collect")

The proportion of super-permissive and viable cells per field of view (FOV) across the time series.

Cells containing >105 gRNA counts were considered super-permissive. Viable cells were quantified by counting non-condensed nuclei in randomly sampled FOVs. Non-condensed nuclei counts were normalised to the average count values from uninfected ‘Mock’ condition. Student’s t-test. ***p<0.001; ****p<0.0001. Data are represented as mean ± SEM (n = 2).

# Import data
dose <- read_csv("./Data/Figure3/RDV-IC50-tidy.csv") %>%
  mutate(RDV_log = log10(RDV))

# Fit non-linear model ([Inhibitor] vs. normalised response model)
fit <- nls(SARS ~ bot + (top - bot)/(1 + 10^(RDV_log - log10(IC50))),
           data = dose,
           start = list(bot = 10, top = 100, IC50 = 0.001)) 

# Create annotation parameters 
Hill <- -1
IC50 <- tidy(fit) %>% filter(term == "IC50") %>% pull(estimate)
IC50_anno <- paste0("IC50 = ", round(IC50, 2), " \u03BCM")
IC90 <- ((100 - 90)/90)^(1/Hill) * IC50
IC90_anno <- paste0("IC90 = ", round(IC90, 2), " \u03BCM")

# Plot 
dose %>%
  ggplot(aes(x = RDV, y = SARS)) +
  geom_smooth(method = "nls",
              formula = y ~ bot + (top - bot)/(1 + 10^(x - log10(IC50))),
              method.args = list(start = c(bot = 0, top = 100, IC50 = 0.001)),
              se = FALSE,
              colour = "#758bd1", size = 0.5, alpha = 0.75) +
  geom_pointrange(stat = "summary", fun.data = "mean_se",
                  fun.args = list(mult = 1),
                  colour = "#3f2d54", size = 0.25) +
  scale_x_log10(limits = c(1e-3, 1e2),
                labels = trans_format("log10", math_format(10^.x)),
                breaks = c(10^-3, 10^-2, 10^-1, 10^0, 10^1, 10^2)) +
  labs(title = "RDV dose response curve",
       x = "[Remdesivir (Log)] (\u03BCM)",
       y = "SARS-CoV-2 RNA\n(% of Untreated)") + 
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 8),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  annotate("text", x = 10^-0.5, y = 180, 
           label = IC50_anno, colour = "#3f2d54", size = 2, hjust = 0) +
  annotate("text", x = 10^-0.5, y = 150, 
           label = IC90_anno, colour = "#3f2d54", size = 2, hjust = 0) -> RDV_IC50

RDV_IC50

Remdemsivir (RDV) dose-response of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA replication.

Dose-response curve of RDV versus SARS-CoV-2 RNA replication. Viral RNA was quantified using RT-qPCR in infected Vero E6 cells at 24 hr post infection (hpi), targeting the ORF-N region. The concentrations of drug required for 50% inhibition (inhibitory concentration 50; IC50) and 90% inhibition (IC90) were estimated by fitting a non-linear (weighted) least-squares model on the data using the nls() function in base R (n = 3).

Viral RNA stability is an important determinant for the virus’ ability to initiate and maintain a productive infection. In an effort to determine the stability of SARS-CoV-2 RNA, we added RDV simultaneously to SARS-CoV-2 infection and followed gRNA persistence in a time course. Notably, the average number of gRNA copies per cell was stable in RDV-treated cells (Figure 3C), suggesting that the incoming gRNA is long-lived. To assess if gRNA is also stable at later times post infection, we treated cells with RDV at 24 hpi and measured gRNA abundance at different times post treatment (Figure 3—figure supplement 2A–C). As expected, RDV treatment led to a reduced proportion of super-permissive cells and non-viable cells at 48 hpi, indicating an inhibition of viral replication and, consequently, viral-induced cell death (Figure 3—figure supplement 2D). However, considerable levels of gRNA persisted in these cells even after 24 hr of RDV treatment, suggesting that gRNA is also relatively stable at late times post infection. To estimate the half-life of gRNA in late infection, we fitted a decay curve and calculated the half-life of gRNA within a range of 6–8 hr (Figure 3—figure supplement 2C). This half-life might be underestimated as gRNA loss is not only due to decay but also to virus egress. Moreover, while we used an RDV dose that exceeded the 90% inhibitory concentration (IC90) (Figure 3—figure supplement 3), we cannot rule out that incomplete inhibition by RDV could affect our half-life estimates (Figure 3C).

Simultaneous analysis of +ORF1a and +ORFN revealed similar expression kinetics for sgRNA, with 11 copies/cell of sgRNA detected in 63% of infected cells at 2 hpi (Figure 3C and F). Since +sgRNA requires -sgRNA template for its production, our results imply that multiple rounds of transcription occur rapidly following virus internalisation that are RDV insensitive. By 6 hpi, most cells contain sgRNA (Figure 3F), with the super-permissive cells supporting high levels of sgRNA transcription. We examined the vRNA replication dynamics and found the ratio of sgRNA/gRNA ranged from 0.5 to 8 over time (Figure 3G), consistent with a recent report in diagnostic samples 1Alexandersen et al.2020. Notably, the sgRNA/gRNA ratio increased between 2 and 10 hpi, followed by a decline at 24 hpi, indicating a shift in preference to produce gRNA over sgRNA in later stages of infection. A similar trend was observed in RDV-treated cells, with a reduced sgRNA/gRNA peak at 8–10 hpi. We estimated the sgRNA/gRNA ratio for individual cells and found that sgRNA synthesis is favoured in the ‘partially resistant’ and ‘permissive’ cells, whereas the ‘super-permissive’ cells had a reduced ratio of sgRNA/gRNA (Figure 3H). In summary, these results indicate that gRNA synthesis is favoured in the late phase of infection, which may reflect the requirement of gRNA to assemble new viral particles.

Positive sense RNA viruses, including coronaviruses, utilise host membranes to generate viral factories, which are sites of active replication and/or virus assembly 92Wolff et al.2020. Our current knowledge on the genesis and dynamics of these factories in SARS-CoV-2 infection is limited. We exploited the spatial resolution of smFISH to study these structures, which we define as spatially extended foci containing multiple gRNA molecule clusters. These clusters are compatible in size with the double membrane vesicles (DMVs) employed by SARS-CoV-2 to replicate and assemble new virions, as previously identified by EM (see Materials and methods; 19Cortese et al.2020; 58Mendonça et al.2021). We refer to these gRNA clusters as ‘factories’. We observed 1–2 factories per cell at 2 hpi, which increased to ~30 factories/cell by 10 hpi (Figure 3I). In addition, the average number of gRNA molecules within these factories, although variable, increased over time (Figure 3J). RDV treatment reduced both the number of viral factories per cell and their RNA content. Together these data show the capability of smFISH to localise and quantify active sites of SARS-CoV-2 replication and to measure changes in gRNA and sgRNA at a single-cell level over the course of the infection.

Super-permissive cells are randomly distributed

Our earlier kinetic analysis of infected Vero E6 cells identified a minor population of ‘super-permissive’ cells containing high gRNA copies at 8 hpi. A random selection of ~300 cells allowed us to further characterise the infected cell population (Figure 4A and B). To extend these observations, we examined the vRNAs in two human lung epithelial cell lines, A549-ACE2 and Calu-3, that are widely used to study SARS-CoV-2 infection 17Chu et al.202037Hoffmann et al.2020. In agreement with our earlier observations with Vero E6, 3–5% of A549-ACE2 and Calu-3 cells showed a ‘super-permissive’ phenotype (Figure 4C and D). An important question is how these ‘super-permissive’ cells are distributed in the population as the pattern could highlight potential drivers for susceptibility 36Healy et al.2020. Infection can induce innate signalling that can lead to the expression and secretion of soluble factors such as interferons that induce an antiviral state in the local cellular environment 4Belkowski and Sen198772Schoggins and Rice2011. Regulation can be widespread through paracrine signalling or affect only proximal cells. We considered three scenarios where ‘super-permissive’ cells are randomly distributed, evenly separated or clustered together. We compared the average nearest-neighbour distance between ‘super-permissive’ cells and simulated points that were distributed either randomly, evenly, or in clusters (Figure 4—figure supplement 1). In summary, our results show conclusively that the ‘super-permissive’ infected Vero E6, A549-ACE2, and Calu-3 cells were randomly distributed (Figure 4E and F, Figure 4—figure supplement 1). We interpret these data as being consistent with an intrinsic property of the cell that defines susceptibility to virus infection. The data also argue against cell-to-cell signalling mechanisms that would either lead to clustering (if increasing susceptibility) or to an even distribution (if inhibiting) of infected cells.

Heterogeneous severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA replication.

(A) Representative ×60 magnified field of view (FOV) of SARS-CoV-2-infected Vero E6 cells at 8 hr post infection (hpi) (Victoria [VIC] strain, multiplicity of infection [MOI] = 1). Single-molecule fluorescence in situ hybridisation (smFISH) against ORF1a was used to visualise cellular heterogeneity in viral RNA counts. Magnified panels show (i) a ‘super-permissive’ cell and (ii) a cell with discrete viral RNA copies. Scale bar = 10 or 50 µm. (B) Discrete separation of genomic RNA (gRNA) count distribution among infected cells randomly sampled at 8 hpi, where each symbol represents a cell. Statistics for the percentage of infected cells and frequency of ‘super-permissive’ cells at 8 hpi. Quantification was performed per FOV, and the number labels represent average values. Cells with >105 gRNA copies were considered to be ‘super-permissive’ (gRNA quantification: n = 4, 148 uninfected and 316 infected cells; percentage infection: n = 3). (C, D) Heterogeneous SARS-CoV-2 replication in lung epithelial A549-ACE2 and Calu-3 cells. The percentage of infected and super-permissive cells was quantified as with Vero E6 cells above. Scale bar = 50 µm (A549-ACE2, n = 2; Calu-3, n = 3). (E, F) Spatial distribution analysis of super-permissive Vero E6 and A549-ACE2 cells at 8 hpi. Low-magnification smFISH overview of infected cells (top left). 2D mask of super-permissive cells (top right). An example of randomly simulated points within the DAPI mask (bottom left). Same number of random points as super-permissive cells were simulated 10 times per FOV. Histogram of nearest-neighbour distances calculated from super-permissive cells (Observed) and randomly simulated points (Random) (bottom right). Further modes of spatial analyses are presented in Figure 4—figure supplement 1 with infected Calu-3 cells. All confocal images are presented as maximum full z-projection. Data are represented as mean ± SD. Student’s t-test; p-values are shown on the presented visual (Vero E6, n = 3; A549-ACE2, n = 2).

# Quantification data
Vero_8hpi_count <- read_csv("./Data/Figure4/Heterogenity_example_count.csv") %>%
  mutate(time.adj = as.character(time.adj)) %>%
  mutate(hours_post_infection = case_when(condition == "INF" ~ paste0(time.adj, "hpi"),
                                          condition == "MOCK" ~ "Mock")) %>%
  mutate(total_count = total_count + 1) %>%
  mutate(hours_post_infection = fct_rev(hours_post_infection))

# Percentage infection data
Vero_8hpi_infection <- read_csv("./Data/Figure4/Heterogenity_example_p-infected.csv") %>%
  mutate(Time = case_when(Time == "MOCK" ~ "Mock",
                          TRUE ~ Time)) %>%
  mutate(Time = as_factor(Time)) %>%
  mutate(Time = fct_relevel(Time, c("Mock", "8hpi", "sup"))) %>%
  filter(!is.na(Time))

Vero_8hpi_infection_summary <- Vero_8hpi_infection %>%
  group_by(Time) %>%
  summarise(mean = mean(p.infected)) %>%
  ungroup() %>%
  mutate(mean = round(mean, 2)) %>%
  mutate(label = paste0(mean, "%"))



# * * * * * Plots
Vero_8hpi_count %>%
  # filter(condition == "INF") %>%
  ggplot(aes(x = hours_post_infection, y = total_count, colour = condition)) +
  geom_violin(trim = TRUE, adjust = 1.5, size = 0.25) +
  geom_quasirandom(width = 0.3, size = 0.1, alpha = 0.25) +
  scale_colour_manual(values = c("#d1ab75", "#c9d175")) +
  scale_y_log10(limits = c(1, 1e7),
                labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "gRNA count",
       x = "", y = "Total RNA count + 1 (log10)") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5)) -> p1

Vero_8hpi_infection %>%
  ggplot(aes(x = Time, y = p.infected, fill = Time)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#c9d175", "#d1ab75", "#d18975")) +
  geom_label(data = Vero_8hpi_infection_summary,
             aes(x = Time, y = mean + 7, label = label), 
             size = 1.5, label.padding = unit(0.15, "lines"), label.size = 0.1,
             inherit.aes = FALSE) + 
  labs(title = "Infection state",
       x = "", 
       y = "% of cells per field of view") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p2

p1 + p2

Discrete separation of genomic RNA (gRNA) count distribution among infected cells randomly sampled at 8 hpi, where each symbol represents a cell.

Statistics for the percentage of infected cells and frequency of ‘super-permissive’ cells at 8 hpi. Quantification was performed per FOV, and the number labels represent average values. Cells with >105 gRNA copies were considered to be ‘super-permissive’ (gRNA quantification: n = 4, 148 uninfected and 316 infected cells; percentage infection: n = 3).

#' @width 6.6
#' @height 10.2
# Percentage infection data
A549_8hpi_infection <- read_csv("./Data/Figure4/Heterogenity_A549_p-infected.csv") %>%
  mutate(Time = as_factor(Time)) %>%
  mutate(Time = fct_relevel(Time, c("Mock", "8hpi", "sup"))) %>%
  filter(!is.na(Time))

A549_8hpi_infection_summary <- A549_8hpi_infection %>%
  group_by(Time) %>%
  summarise(mean = mean(p_infected)) %>%
  ungroup() %>%
  mutate(mean = round(mean, 2)) %>%
  mutate(label = paste0(mean, "%"))

# * * * * * Plots 
A549_8hpi_infection %>%
  ggplot(aes(x = Time, y = p_infected, fill = Time)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#c9d175", "#d1ab75", "#d18975")) +
  geom_label(data = A549_8hpi_infection_summary,
             aes(x = Time, y = mean + 7, label = label), 
             size = 1.5, label.padding = unit(0.15, "lines"), label.size = 0.1,
             inherit.aes = FALSE) + 
  labs(title = "A549-ACE2 Infection",
       x = "", 
       y = "% of cells per field of view") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p3
p3

Heterogeneous SARS-CoV-2 replication in lung epithelial A549-ACE2 cells (n = 2).

#' @width 6.6
#' @height 10.2
# Percentage infection data
Calu_8hpi_infection <- read_csv("./Data/Figure4/Heterogenity_Calu_p-infected.csv") %>%
  mutate(Time = as_factor(Time)) %>%
  mutate(Time = fct_relevel(Time, c("Mock", "8hpi", "sup"))) %>%
  filter(!is.na(Time))

Calu_8hpi_infection_summary <- Calu_8hpi_infection %>%
  group_by(Time) %>%
  summarise(mean = mean(p_infected)) %>%
  ungroup() %>%
  mutate(mean = round(mean, 2)) %>%
  mutate(label = paste0(mean, "%"))

# * * * * * Plots 
Calu_8hpi_infection %>%
  ggplot(aes(x = Time, y = p_infected, fill = Time)) +
  geom_bar(stat = "summary", width = 0.75) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.4) +
  scale_y_continuous(breaks = seq(0, 100, by = 20),
                     limits = c(0, 100)) + 
  scale_fill_manual(values = c("#c9d175", "#d1ab75", "#d18975")) +
  geom_label(data = Calu_8hpi_infection_summary,
             aes(x = Time, y = mean + 7, label = label), 
             size = 1.5, label.padding = unit(0.15, "lines"), label.size = 0.1,
             inherit.aes = FALSE) + 
  labs(title = "Calu-3 Infection",
       x = "", 
       y = "% of cells per field of view") +
  theme_classic(base_size = 7) + 
  theme(plot.title = element_text(hjust = 0.5, size = 7),
        axis.line = element_blank(),
        legend.position = "none",
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> p4
p4

Heterogeneous SARS-CoV-2 replication in lung epithelial Calu-3 cells (n = 3).

# Import data
nndist_calu <- read_csv("./Data/Figure4/Spatial-randomness_Calu_nn_distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.33)

# Stats and parameters 
p_value <- nndist_calu %>% t_test(nn_distance ~ data_type) %>% add_significance() %>% pull(p)
p_value_label = paste0("p = ", p_value)

observed_mean <- nndist_calu %>% filter(data_type == "Observed") %>%
  pull(nn_distance) %>% mean() %>% round(1)
observed_mean_label = paste0("mean = ", observed_mean, " \u03BCm")

random_mean <- nndist_calu %>% filter(data_type == "Random") %>%
  pull(nn_distance) %>% mean() %>% round(1)
random_mean_label = paste0("mean = ", random_mean, " \u03BCm")


# Plot
ggplot() +
  geom_histogram(data = subset(nndist_calu, data_type == "Observed"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 colour = "white", bins = 40, alpha = 1, size = 0.1) +
  geom_histogram(data = subset(nndist_calu, data_type == "Random"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 bins = 80, alpha = 0.5) +
  scale_fill_manual(values = c("gray50", "#75b8d1")) + 
  scale_y_continuous(limits = c(0, 1),
                     breaks = seq(0, 1, by = 0.2)) +
  labs(x = "Nearest neighbour \n distance (\u03BCm)",
       y = "Normalised count") + 
  theme_classic(base_size = 7) +
  theme(legend.title = element_blank(),
        axis.line = element_blank(),
        legend.position = c(0.7, 0.85),
        legend.key.size = unit(0.1, "cm"),
        legend.text = element_text(hjust = 0),
        legend.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        axis.text = element_text(size = 5)) +
  annotate("text", x = 325, y = 0.75, 
           label = observed_mean_label, colour = "gray30", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.68, 
           label = random_mean_label, colour = "#75b8d1", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.61, 
           label = p_value_label, colour = "gray30", size = 2, hjust = 0) -> spatial_calu_plot

spatial_calu_plot

Histogram of nearest-neighbour distances calculated from super-permissive cells (Observed) and randomly simulated points (Random)

# Import data
nndist_A549 <- read_csv("./Data/Figure4/Spatial-randomness_A549_nn_distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.55)

# Stats and parameters 
p_value <- nndist_A549 %>% t_test(nn_distance ~ data_type) %>% add_significance() %>% pull(p)
p_value_label = paste0("p = ", p_value)

observed_mean <- nndist_A549 %>% filter(data_type == "Observed") %>%
  pull(nn_distance) %>% mean() %>% round(0)
observed_mean_label = paste0("mean = ", observed_mean, " \u03BCm")

random_mean <- nndist_A549 %>% filter(data_type == "Random") %>%
  pull(nn_distance) %>% mean() %>% round(0)
random_mean_label = paste0("mean = ", random_mean, " \u03BCm")


# Plot
ggplot() +
  geom_histogram(data = subset(nndist_A549, data_type == "Observed"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 colour = "white", bins = 40, alpha = 1, size = 0.1) +
  geom_histogram(data = subset(nndist_A549, data_type == "Random"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 bins = 80, alpha = 0.5) +
  scale_fill_manual(values = c("gray50", "#75b8d1")) + 
  scale_y_continuous(limits = c(0, 1),
                     breaks = seq(0, 1, by = 0.2)) +
  scale_x_continuous(limits = c(0, 1000)) + 
  labs(x = "Nearest neighbour \n distance (\u03BCm)",
       y = "Normalised count") + 
  theme_classic(base_size = 7) +
  theme(legend.title = element_blank(),
        axis.line = element_blank(),
        legend.position = c(0.7, 0.85),
        legend.key.size = unit(0.1, "cm"),
        legend.text = element_text(hjust = 0),
        legend.background = element_blank(),
        axis.text = element_text(size = 5),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5)) +
  annotate("text", x = 325, y = 0.75, 
           label = observed_mean_label, colour = "gray30", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.68, 
           label = random_mean_label, colour = "#75b8d1", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.61, 
           label = p_value_label, colour = "gray30", size = 2, hjust = 0) -> spatial_A549_plot

spatial_A549_plot

Histogram of nearest-neighbour distances calculated from super-permissive cells (Observed) and randomly simulated points (Random).

Spatial distribution of super-permissive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-infected cells.

Spatial distribution analysis of ‘super-permissive’ SARS-CoV-2 (Victoria [VIC])-infected Calu-3 cells. Cells were infected with SARS-CoV-2 (VIC strain) at a multiplicity of infection [MOI] of 1, fixed at 8 hr post infection (hpi) and hybridised with +ORF1a single-molecule fluorescence in situ hybridisation (smFISH) probe to visualise viral RNA. (A) Low-magnification image (full z-projection) of infected Calu-3 cells showing a population of minority ‘super-permissive’ cells visible with smFISH. Scale bar = 250 µm. (B) 2D mask generated from spatial coordinates of super-permissive cells. (C) Observed distribution of super-permissive cells (Observed) and the example coordinates of the three modes of spatial distributions: (i) evenly spaced, (ii) clustered, and (iii) random. The simulations were confined to DAPI-positive areas. (D) Density plot of nearest-neighbour distances obtained from the spatial distribution simulation. Each mode of distribution was iterated 10 times per image. (E) Beeswarm plot of nearest-neighbour distances obtained from the spatial distribution analysis. One-way ANOVA with post-hoc Tukey test (n = 3).

Differential replication kinetics of the B.1.1.7 and VIC strains

The recent emergence of SARS-CoV-2 VOCs, which display differential transmission, pathogenesis, and infectivity, has changed the course of the COVID-19 pandemic. Recent studies have focused on mutations in the Spike protein and whether these alter particle uptake into cells and resistance to vaccine or naturally acquired antibodies 18Collier et al.202123Dicken et al.202162Planas et al.2021. The B.1.1.7 variant is associated with higher transmission 20Davies et al.202131Galloway et al.202184Volz et al.2021 and has 17 coding changes mapping to both non-structural (ORF1a/b, ORF3a, ORF8) and structural (Spike and N) proteins. Mutations within the non-structural genes could affect virus replication, independent of Spike-mediated entry, thus we used smFISH to compare the replication kinetics of the B.1.1.7 and VIC strains (Figure 5A). We discovered that the number of gRNA molecules at 2 hpi was similar for both viruses, reflecting similar cell uptake of viral particles (Figure 5B–E). However, the quantities of intracellular gRNA and sgRNA were lower in B.1.1.7-infected cells compared to VIC at 6 and 8 hpi (Figure 5E). We also found that while the amount of gRNA per cell was reduced in the B.1.1.7 variant, there were an equal number of +ORF1a and +ORFN-positive cells (Figure 5D), suggesting that the reduced B.1.1.7 RNA burden is due to a differential replication efficiency rather than infection rate. The B.1.1.7 variant also showed a reduced number of replication factories per cell (Figure 5F), with each focus containing on average a lower number of gRNA molecules compared to the VIC strain (Figure 5G). RDV treatment ablated the differences between the viral strains, demonstrating that the observed phenotype is replication-dependent (Figure 5B,E-I). Nevertheless, the lower level of individual gRNA that we detected in RDV-treated cells persisted for at least 8 hpi in both the VIC and B.1.1.7 strains. We conclude that individual gRNA molecules of both the strains are highly stable in the cytoplasm of infected cells.

Delayed replication kinetics of B.1.1.7 variant.

(A) Experimental design to compare the replication kinetics of Victoria (VIC) and B.1.1.7 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains. Vero E6 cells were seeded on cover-glass and 24 hr later inoculated with VIC or B.1.1.7 strain (multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion, and cells were fixed at designated timepoints for hybridisation with +ORF1a and +ORFN probes. In remdesivir (RDV) condition, the drug was added to cells at 10 µM during virus inoculation and maintained for the infection period. (B) Maximum z-projected confocal images of Vero E6 cells infected with VIC or B.1.1.7 strains. Representative super-permissive cells from the time series are shown. Scale bar = 10 µm. (C) Comparing viral genomic RNA (gRNA) counts at 2 hr post infection (hpi) between VIC and B.1.1.7. Each symbol represents a cell. Different hue of colours represents readings taken from individual repeat experiments, and the labels represent average values (n = 3; VIC, 424 cells; B.1.1.7, 519 cells). Mann–Whitney U test. (D) Comparing percentage of infected cells between the two viral strains at 2 hpi. Infected cells were determined by +ORF1a single-molecule fluorescence in situ hybridisation (smFISH) fluorescence. Data are represented as mean ± SD (n = 3). Student’s t-test. (E) Bigfish quantification of gRNA and subgenomic RNA (sgRNA) smFISH counts per cell. Quantification was performed as in (Figure 3C). Due to bimodality of the data, statistical significance was determined using two-sample Kolmogorov-Smirnov test to compare cumulative distribution of+ ORF1 a counts between the two strains. (n = 3). (VIC, 2 : 6 : 8 hpi = 460 : 343 : 407 cells; B.1.1.7, 2 : 6 : 8 hpi = 396 : 487 : 429 cells). (F) Comparing the number of viral factories per cell between the two viral strains across the time series. Cells harbouring >107 copies of vRNA were excluded from analysis. Viral factories were identified using Bigfish cluster detection as with (Figure 3I). Data are represented as mean ± SEM (n = 3). Mann–Whitney U test. (G) Density ridge plot showing the number of gRNA copies within viral factories for VIC and B.1.1.7 variants. The density distribution represents the number of molecules per viral factories per cell. Vertical segment symbol represents a cell (n = 3). Mann–Whitney U test. (H) Per cell ratio of sgRNA/gRNA counts across the time series. Grey symbols represent cell-to-cell mean ± SE which are connected by line plots. Horizontal dashed line represents value of 1 (n = 3). Mann–Whitney U test. (I) Comparison of the percentage of super-permissive cells between the two strains assessed from low-magnification high-throughput smFISH assay (see Figure 5—figure supplement 1 for details). Data are represented as mean ± SD (n = 3). Student’s t-test. n.s., not significant; *p<0.05; **p<0.01; ***p<0.001; ***p<0.0001.

# * * * * * % infection
# Prepare data
RNA_df <- read_csv("./Data/Figure5/Data_summary_threshold_150.csv") %>%
  dplyr::select(c("File" = file_name, time, "Virus" = strain, "condition" = treatment, everything())) %>%
  mutate(Virus = case_when(
    Virus == "B117" ~ "B.1.1.7",
    Virus == "Vic" ~ "Victoria",
    Virus == "None" ~ "Mock")) %>%
  mutate(condition = case_when(
    condition == "INF" ~ "Untreated",
    TRUE ~ condition)) %>%
  mutate(time = case_when(
    time == 2 ~ "2 hpi",
    time == 6 ~ "6 hpi",
    time == 8 ~ "8 hpi",
    time == 24 ~ "24 hpi")) %>%
  mutate(repeat_exp = case_when(
    str_detect(File, "R1") ~ "R1",
    str_detect(File, "R2") ~ "R2",
    str_detect(File, "R3") ~ "R3"
  ))

p_infection <- RNA_df %>% 
  filter(Virus != "Mock" & condition == "Untreated" & time == "2 hpi") %>%
  mutate(File = str_replace(File, "cell_[:alnum:]", "")) %>%
  group_by(File, Virus, repeat_exp) %>%
  summarise(n_cell = n(),
            n_inf = sum(ch4_total_vRNAs > 4),
            p_inf = sum(ch4_total_vRNAs > 4)/n()*100) %>%
  ungroup() %>%
  mutate(Virus = fct_rev(Virus)) %>%
  group_by(Virus, repeat_exp) %>% summarise(mean_p_inf = mean(p_inf)) %>% 
  ungroup()

# Statistics
p_infection %>% group_by(Virus) %>% shapiro_test(mean_p_inf)
p_inf <- p_infection %>% t_test(mean_p_inf ~ Virus) %>% pull(p)
p_anno_inf <- paste0("p=", p_inf)

# * * * * * Viral entry
# Prepare data 
RNA_df_2hpi <- RNA_df %>% filter(time == "2 hpi") %>% filter(Virus != "Mock") %>%
  filter(ch4_total_vRNAs > 1) %>%
  mutate(Virus = fct_rev(Virus)) %>%
  mutate(colour_factor = paste0(Virus, "-", repeat_exp))

# Statistics 
RNA_df_2hpi %>% group_by(Virus, repeat_exp) %>% summarise(mean = mean(ch4_total_vRNAs)) %>%
  shapiro_test(mean)
RNA_df_2hpi %>% group_by(Virus, repeat_exp) %>% summarise(mean = mean(ch4_total_vRNAs)) %>%
  ungroup() %>% t_test(mean ~ Virus) %>% pull(p) -> p_entry

# Annotation
RNA_df_2hpi_anno <- RNA_df_2hpi %>% group_by(Virus) %>% 
  summarise(mean = mean(ch4_total_vRNAs)) %>% 
  ungroup() %>%
  mutate(mean = round(mean, 1))
p_anno_entry <- paste0("p=", p_entry)

# Viral entry plot
RNA_df_2hpi %>%
  ggplot(aes(x = Virus, y = ch4_total_vRNAs, colour = colour_factor)) +
  geom_beeswarm(cex = 0.8, alpha = 0.15, size = 0.75, stroke = 0,
                priority = "random") +
  geom_point(stat = "summary", position = position_dodge(width = 0.1), 
             size = 1.5, alpha = 1) +
  geom_label(data = RNA_df_2hpi_anno, 
             aes(x = Virus, y = 150, label = mean), 
             inherit.aes = FALSE,label.padding = unit(0.1, "lines"), size = 2) +
  scale_y_continuous(limits = c(0, 200)) +
  scale_colour_manual(
    values = c("#5672be", "#2f416f", "#0d152a", "#f9ab38", "#c39050", "#8b7660")) +
  labs(title = "Viral entry (2hpi)",
       x = "",
       y = "(+)ve gRNA total count") +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = "none",
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  annotate(geom = "text", label = p_anno_entry, size = 2, alpha = 0.75,
           x = 1.5, y = 190)

Comparing viral genomic RNA (gRNA) counts at 2 hr post infection (hpi) between VIC and B.1.1.7.

Each symbol represents a cell. Different hue of colours represents readings taken from individual repeat experiments, and the labels represent average values (n = 3; VIC, 424 cells; B.1.1.7, 519 cells). Mann–Whitney U test.

#' @width 6
#' @height 6
# % infection plot
p_infection %>%
  ggplot(aes(x = Virus, y = mean_p_inf, fill = Virus)) +
  geom_bar(stat = "summary",
           width = 0.5) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", fun.args = list(mult = 1),
                 size = 0.5) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "% infected cells",
       x = "",
       y = "% cells per image") + 
  theme_classic(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5),
        legend.title     = element_blank(),
        legend.key.size  = unit(0.2, "cm"),
        legend.position  = "none", 
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  annotate(geom = "text", label = p_anno_inf, size = 2, alpha = 0.75,
           x = 1.5, y = 100)

Comparing percentage of infected cells between the two viral strains at 2 hpi.

Infected cells were determined by +ORF1a single-molecule fluorescence in situ hybridisation (smFISH) fluorescence. Data are represented as mean ± SD (n = 3). Student’s t-test.

# * * * * * Data

RNA_df_total <- read_csv("./Data/Figure5/CF10_JL_20210629.csv") %>%
  mutate(gRNA_log = log10(total_vRNAs_gRNA))

RNA_df_total %>%
  group_by(time, strain) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  arrange(strain)

# * * * * * Statistics 

## 2 hpi
RNA_df_total %>% group_by(time, condition, strain, repeat_exp) %>%
  filter(time == "2hpi") %>%
  filter(total_vRNAs_gRNA > 0) %>%
  summarise(gRNA_mean = mean(gRNA_log, na.rm = TRUE)) %>% ungroup() %>%
  group_by(time, condition, strain) %>%
  shapiro_test(gRNA_mean) # normal

RNA_df_total %>% group_by(time, condition, strain, repeat_exp) %>%
  filter(time == "2hpi") %>%
  filter(total_vRNAs_gRNA > 0) %>%
  summarise(gRNA_mean = mean(gRNA_log, na.rm = TRUE)) %>% ungroup() %>%
  group_by(time, condition) %>%
  t_test(gRNA_mean ~ strain)

## 6-8 hpi: KS statistics test function
do_ks_test <- function(time, condition, test){
  vic <- RNA_df_total[RNA_df_total$time == time & RNA_df_total$condition == condition, ] %>% filter(strain == "Victoria") %>%
    pull(test)
  b17 <- RNA_df_total[RNA_df_total$time == time & RNA_df_total$condition == condition, ] %>% filter(strain == "B.1.1.7") %>%
    pull(test)
  ks.test(vic, b17, alternative = "two.sided") %>% tidy() %>% add_significance()
}

## Do tests
do_ks_test(time = "6hpi", condition = "Untreated", test = "total_vRNAs_gRNA")
do_ks_test(time = "8hpi", condition = "Untreated", test = "total_vRNAs_gRNA")
do_ks_test(time = "6hpi", condition = "RDV", test = "total_vRNAs_gRNA")
do_ks_test(time = "8hpi", condition = "RDV", test = "total_vRNAs_gRNA")

# * * * * * Plot
set.seed(13)
rows <- sample(nrow(RNA_df_total))
RNA_df_total_plot <- RNA_df_total[rows, ]

RNA_df_total_plot %>%
  mutate(condition = fct_rev(condition)) %>%
  mutate(strain = fct_rev(strain)) %>%
  ggplot(aes(x = total_vRNAs_sgRNA, y = total_vRNAs_gRNA, colour = strain, alpha = strain)) +
  geom_point(size = 1, stroke = 0) +
  geom_vline(xintercept = 100000, linetype = "dotted", alpha = 0.5, size = 0.5) +
  geom_hline(yintercept = 100000, linetype = "dotted", alpha = 0.5, size = 0.5) +
  scale_colour_manual(values = c("#f9ab38", "#5672be")) +
  scale_alpha_manual(values = c(0.8, 0.5)) +
  scale_x_log10(limits = c(1e0, 1e7),
                labels = trans_format("log10", math_format(10^.x))) +
  scale_y_log10(limits = c(1e0, 1e7),
                labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "Total viral RNA count per cell",
       x = "sgRNA count (Log)",
       y = "gRNA count (Log)") + 
  facet_grid(condition ~ time) + 
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5),
        legend.position  = c(0.07, 0.9),
        legend.background = element_blank(),
        legend.direction = "vertical",
        legend.title = element_blank(),
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5))

Bigfish quantification of gRNA and subgenomic RNA (sgRNA) smFISH counts per cell.

Quantification was performed as in (Figure 3C). Due to bimodality of the data, statistical significance was determined using two-sample Kolmogorov-Smirnov test to compare cumulative distribution of+ ORF1 a counts between the two strains. (n = 3). (VIC, 2 : 6 : 8 hpi = 460 : 343 : 407 cells; B.1.1.7, 2 : 6 : 8 hpi = 396 : 487 : 429 cells).

#' @width 12
#' @height 7
## Stat
RNA_df_total %>%
  group_by(condition, time, strain) %>%
  shapiro_test(repSites_gRNA)

RNA_df_total %>%
  group_by(condition, time) %>%
  wilcox_test(repSites_gRNA ~ strain) %>% add_significance()


## Plot
RNA_df_total %>%
  mutate(colour_arg = paste0(strain, " + " , condition)) %>%
  mutate(colour_arg = fct_rev(colour_arg)) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = time, y = repSites_gRNA, colour = colour_arg)) +
  geom_pointrange(stat = "summary", fun.data = "mean_se", fun.args = list(mult = 1),
                  size = 0.5) +
  geom_line(aes(group = strain), stat = "summary",
            size = 0.5, alpha = 0.5) +
  scale_colour_manual(values = c("#f9ab38", "#f9d275", "#5672be", "#b4dded")) + 
  labs(title = "Number of viral factories per cell",
       x = "",
       y = "Count") +
  theme_classic(base_size = 7) + 
  facet_wrap(~condition) + 
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = c(0.75, 0.8),
        legend.background = element_blank(),
        legend.key.size = unit(0.3 ,"cm"),  
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5))

Comparing the number of viral factories per cell between the two viral strains across the time series.

Cells harbouring >107 copies of vRNA were excluded from analysis. Viral factories were identified using Bigfish cluster detection as with (Figure 3I). Data are represented as mean ± SEM (n = 3). Mann–Whitney U test.

#' @width 16
#' @height 9
## Stat
RNA_df_total %>%
  group_by(time, condition) %>%
  wilcox_test(content ~ strain) %>% add_significance() %>%
  arrange(desc(condition), time)

## Plot
RNA_df_total %>%
  mutate(condition = fct_rev(condition)) %>%
  mutate(strain = fct_rev(strain)) %>%
  mutate(time = fct_rev(time)) %>%
  ggplot(aes(x = content, y = time, colour = strain, fill = strain)) +
  geom_density_ridges(aes(height = stat(ndensity)), 
                      alpha = 0.4, size = 0.3, scale = 1.3, bandwidth = 0.16,
                      jittered_points = TRUE, point_shape = "|", point_size = 2.5,
                      position = position_points_jitter(height = 0)) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  scale_colour_manual(values = c("#f9ab38", "#5672be")) +
  scale_x_log10(labels = trans_format("log10", math_format(10^.x))) +
  labs(title = "Viral factory content",
       x = "Number of gRNA in viral factories per cell (Log)",
       y = "Density (at hours post infection)") +
  facet_wrap(~ condition) +
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.background = element_blank(),
        legend.position = c(0.25, 0.07),
        legend.direction = "horizontal",
        legend.key.size = unit(0.3 ,"cm"),  
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5))

Density ridge plot showing the number of gRNA copies within viral factories for VIC and B.1.1.7 variants.

The density distribution represents the number of molecules per viral factories per cell. Vertical segment symbol represents a cell (n = 3). Mann–Whitney U test.

# ## Stat
# RNA_df_total %>%
#   filter(ratio != Inf) %>%
#   group_by(time, condition) %>%
#   wilcox_test(ratio ~ strain) %>% add_significance() %>%
#   arrange(condition, time)
# 
# ## Plot
# RNA_df_total %>%
#   mutate(strain = fct_rev(strain))  %>%
#   mutate(condition = fct_rev(condition)) %>%
#   ggplot(aes(x = as.factor(time), y = ratio, colour = strain)) +
#   geom_point(position = position_jitterdodge(jitter.width = 0.4, dodge.width = 0.75),
#             size = 0.65, alpha = 0.3, stroke = 0.1) +
#   geom_pointrange(aes(x = as.factor(time), y = ratio, group = strain),
#                   stat = "summary", position = position_dodge(width = 0.75), 
#                   size = 0.3, colour = "gray50",
#                   inherit.aes = FALSE, show.legend = FALSE) +
#   geom_line(aes(group = strain), stat = "summary", size = 0.6, 
#             alpha = 1) +
#   geom_hline(yintercept = 1, linetype = "dashed",
#              size = 0.5, colour = "gray75") + 
#   coord_cartesian(ylim = c(0, 30)) + 
#   labs(title = "sgRNA/gRNA ratio (by hpi)",
#        x = "Hours post infection",
#        y = "sgRNA/gRNA ratio") +
#   scale_colour_manual(values = c("#f9ab38", "#5672be")) + 
#   facet_wrap(~ condition) +
#   theme_classic(base_size = 7) +
#   theme(plot.title = element_text(hjust = 0.5, size = 7),
#         axis.line = element_blank(),
#         legend.title = element_blank(),
#         legend.position = c(0.8, 0.8),
#         legend.background = element_blank(),
#         legend.direction = "horizontal",
#         legend.key.size = unit(0.3, "cm"),
#         panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
#         strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
#   guides(color = guide_legend(override.aes = list(size = 2, alpha = 1)))

Per cell ratio of sgRNA/gRNA counts across the time series.

Grey symbols represent cell-to-cell mean ± SE which are connected by line plots. Horizontal dashed line represents value of 1 (n = 3). Mann–Whitney U test.

#' @width 12
#' @height 7
# Import data
FACS_df <- read_csv("./Data/Figure5/CoV-FISH-10_Vero-overivew-FACS.csv") %>%
  mutate(time = case_when(
    time == "02h" ~ "2 hpi",
    time == "06h" ~ "6 hpi",
    time == "08h" ~ "8 hpi",
    time == "24h" ~ "24 hpi"
  ))

# Create dataframe
FACS_bar <- FACS_df %>%
  filter(time == "6 hpi" | time == "8 hpi") %>%
  group_by(File, time, Virus, condition) %>%
  summarise(p.superinfected = sum(superinfected)/n() * 100) %>%
  ungroup()

# Statistics
FACS_bar %>% group_by(time, Virus, condition) %>% shapiro_test(p.superinfected) # Failed normality

FACS_bar %>% group_by(time, condition) %>% 
  wilcox_test(p.superinfected ~ Virus, p.adjust.method = "bonferroni") %>% add_significance()

# Plot
FACS_bar %>% 
  mutate(Virus = fct_rev(Virus)) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = time, y = p.superinfected, fill = Virus)) +
  geom_bar(stat = "summary", fun.data = "mean_se",
           width = 0.65, position = position_dodge(0.8), alpha = 0.8) +
  geom_linerange(stat = "summary", fun.data = "mean_se", 
                 fun.args = list(mult = 1), size = 0.25,
                 position = position_dodge(0.8)) +
  scale_y_continuous(limits = c(0, 15)) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "% Super-permissive",
       x = "",
       y = "% cells per field of view") +
  facet_wrap(~ condition) +
  theme_classic(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5),
        legend.title     = element_blank(),
        legend.position  = c(0.75, 0.85),
        legend.direction = "horizontal",
        axis.line        = element_blank(),
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.key.size  = unit(0.2, "cm"))

Comparison of the percentage of super-permissive cells between the two strains assessed from low-magnification high-throughput smFISH assay (see Figure 5—figure supplement 1 for details).

Data are represented as mean ± SD (n = 3). Student’s t-test. n.s., not significant; *p<0.05; **p<0.01; ***p<0.001; ***p<0.0001.

Delayed replication kinetics of B.1.1.7 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variant.

Vero E6 cells were seeded on cover-glass and 24 hr later inoculated with Victoria (VIC) or B.1.1.7 strain (multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion, and cells were fixed at designated timepoints. In remdesivir (RDV) condition, the drug was added to cells at 10 µM during virus inoculation and maintained for the infection period. (A) Per cell ratio of subgenomic RNA (sgRNA)/genomic RNA (gRNA) counts grouped by gRNA burden classification as in Figure 3D. Single-molecule fluorescence in situ hybridisation (smFISH) quantification was performed as with Figure 5H. Grey symbols represent mean ± SEM. The horizontal dashed line represents ratio of 1 (3). Mann–Whitney U test. n.s., not significant; *p<0.05; **p<0.01; ***p<0.001; ***p<0.0001. (B) Low-magnification (×20) z-projected (6 µm) images of +ORF1a and +ORFN smFISH in infected Vero E6 cells. Scale bar = 250 µm.

# * * * * * Import data
FACS_df <- read_csv("./Data/Figure5/CoV-FISH-10_Vero-overivew-FACS.csv") %>%
  mutate(time = case_when(
    time == "02h" ~ "2 hpi",
    time == "06h" ~ "6 hpi",
    time == "08h" ~ "8 hpi",
    time == "24h" ~ "24 hpi"
  ))

# * * * * * FACS plot
# Make annotation df
FACS_df_summary <- FACS_df %>% 
  group_by(time, Virus, condition) %>%
  summarise(Ch3.pool        = mean(Ch3.mean),
            Ch4.pool        = mean(Ch4.mean),
            p.superinfected = sum(superinfected)/n() * 100,
            cell.count      = n()) %>% 
  ungroup()

Vero_anno <- FACS_df_summary %>%
  mutate(y.pos = case_when(
    Virus == "Victoria" ~ 800,
    Virus == "B1.1.7" ~ 630,
    Virus == "Mock" ~ 500)) %>%
  mutate(p.superinfected = as.character(round(p.superinfected, 2))) %>%
  mutate(p.superinfected = paste0(p.superinfected, "%")) %>%
  mutate(Virus = fct_relevel(Virus, c("Victoria", "B1.1.7", "Mock"))) %>%
  mutate(condition = fct_rev(condition))

# Plot
FACS_df %>%
  filter(time != "24 hpi") %>% 
  mutate(Virus = fct_relevel(Virus, c("Victoria", "B1.1.7", "Mock"))) %>%
  arrange(Virus) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = Ch3.mean, y = Ch4.mean, colour = Virus)) +
  geom_point(shape = 21, size = 0.6, alpha = 0.25, stroke = 0.25) +
  geom_vline(xintercept = 140, size = 0.1, alpha = 0.5) +
  labs(title    = "smFISH intensity per cell",
       x        = "ORF-N smFISH intensity",
       y        = "gRNA smFISH intensity") + 
  scale_x_log10(limits = c(1e2, 1e4),
                labels = trans_format("log10", math_format(10^.x))) +
  scale_y_log10(limits = c(1e2, 1e3),
                labels = trans_format("log10", math_format(10^.x)),
                breaks = c(10^2, 10^2.5, 10^3)
                ) +
  scale_colour_manual(values = c("#f9ab38", "#5672be", "#d18975")) +
  facet_grid(condition ~ time) +
  theme_classic(base_size = 7) +
  guides(colour = guide_legend(override.aes = list(alpha = 1))) + 
  theme(plot.title = element_text(hjust = 0.5, size = 8),
        legend.position  = "bottom",
        legend.title = element_blank(),
        axis.line        = element_blank(), 
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) +
  geom_text(data = (Vero_anno %>% 
                      filter(time != "24 hpi")),
            aes(x = 200, y = y.pos, label = p.superinfected), 
            hjust = 0, show.legend = FALSE, size = 1.75) +
  guides(colour = guide_legend(override.aes = list(size = 2, alpha = 0.75, stroke = 0.6)))

Scatter plot showing high-throughput smFISH intensity quantification of +ORF1a and +ORFN probes in VIC and B.1.1.7-infected cells.

Each symbol represents a cell. Fluorescence density was measured from stitched ×20 overview images, covering approximately ~60% of the culture well area. At this magnification, smFISH fluorescence is only detectable in ‘super-permissive’ cells (>105 vRNA). The percentage of super-permissive cells was calculated based on a gate which was set with +ORFN signal using uninfected (Mock) condition signal as a threshold (vertical line) (n = 3).

# Import data
FACS_df <- read_csv("./Data/Figure5/CoV-FISH-10_Vero-overivew-FACS.csv") %>%
  mutate(time = case_when(
    time == "02h" ~ "2 hpi",
    time == "06h" ~ "6 hpi",
    time == "08h" ~ "8 hpi",
    time == "24h" ~ "24 hpi"
  ))

# Create dataframe
FACS_bar <- FACS_df %>%
  filter(time == "6 hpi" | time == "8 hpi") %>%
  group_by(File, time, Virus, condition) %>%
  summarise(p.superinfected = sum(superinfected)/n() * 100) %>%
  ungroup()

# Statistics
FACS_bar %>% group_by(time, Virus, condition) %>% shapiro_test(p.superinfected) # Failed normality

FACS_bar %>% group_by(time, condition) %>% 
  wilcox_test(p.superinfected ~ Virus, p.adjust.method = "bonferroni") %>% add_significance()

# Plot
FACS_bar %>% 
  mutate(Virus = fct_rev(Virus)) %>%
  mutate(condition = fct_rev(condition)) %>%
  ggplot(aes(x = time, y = p.superinfected, fill = Virus)) +
  # geom_point(position = position_jitterdodge(jitter.width = 0.1, dodge.width = 0.5)) +
  geom_bar(stat = "summary", fun.data = "mean_se",
           width = 0.65, position = position_dodge(0.8), alpha = 0.8) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.25,
                 position = position_dodge(0.8)) +
  scale_y_continuous(limits = c(0, 15)) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "% Highly permissive",
       x = "",
       y = "% cells per field of view") +
  facet_wrap(~ condition, nrow = 2) +
  theme_classic(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5, size = 8),
        legend.title     = element_blank(),
        legend.position  = "bottom",
        axis.line        = element_blank(),
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.key.size  = unit(0.2, "cm"))

Spatial distribution of Victoria (VIC) or B.1.1.7 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) super-permissive cells.

Spatial distribution analysis of super-permissive A549-ACE2 cells infected with VIC or B.1.1.7 strains at 8 hr post infection (hpi) and 24 hpi. Low-magnification overview of infected cells hybridised with +ORF1a probes is shown. To compare the ‘observed’ distribution of super-permissive cells and ‘random’ distributions, the same number of random points as super-permissive cells was simulated 10 times per field of view (FOV). Nearest-neighbour distances calculated from super-permissive cells (Observed) and simulated points (Random) are presented as histogram plots. Nearest-neighbour values (Rn) were calculated for each condition to assess the distribution of cell clusters (see Materials and methods). Rn value below 1 indicates a degree of clustering, and Rn value >1 indicates uniform placements. Student’s t-test. p-Values are denoted on the figure (n = 2 entire chamber wells).

# * * * * * Import data

## 24hpi
nndist_VIC_24h <- read_csv("./Data/Figure5/A549_Vic_24h_spatial_nn-distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.45)

nndist_B117_24h <- read_csv("./Data/Figure5/A549_B117_24h_spatial_nn-distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.45)

## 8hpi
nndist_VIC_8h <- read_csv("./Data/Figure4/Spatial-randomness_A549_nn_distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.45)

nndist_B117_8h <- read_csv("./Data/Figure5/A549_B117_8h_spatial_nn-distances.csv") %>%
  dplyr::select(c("Observed" = nn_distances_data, "Random" = nn_distances_sim)) %>%
  pivot_longer(everything(), names_to = "data_type", values_to = "nn_distance") %>%
  filter(!is.na(nn_distance)) %>%
  mutate(nn_distance = nn_distance * 0.65)

# * * * * * Define functions for stats/averaging/plotting

## Stats/Averaging
get_pvalue <- function(x){
  pvalue <- x %>% t_test(nn_distance ~ data_type) %>% add_significance() %>% pull(p)
  paste0("p = ", pvalue) %>% return()
}

get_mean_label <- function(x, y){
  observed_mean <- x %>% filter(data_type == y) %>%
  pull(nn_distance) %>% mean() %>% round(0)
  paste0("mean = ", observed_mean, " \u03BCm") %>% return()
}

get_rn <- function(df){
  variable_name <- deparse(substitute(df))
  total_area <- scan_area %>% filter(data == variable_name) %>% pull(area)
  n_images <- scan_area %>% filter(data == variable_name) %>% pull(n_images)
  nn_distance_vector <- df %>% filter(data_type == "Observed") %>% pull(nn_distance)
  Rn <- 2*(nn_distance_vector %>% mean()) / (sqrt(total_area*n_images*2/(length(nn_distance_vector))))
  Rn <- round(Rn, digits = 3)
  paste0("Rn = ", Rn) %>% return()
}

scan_area <- tibble(data = c("nndist_VIC_8h", "nndist_B117_8h", "nndist_VIC_24h", "nndist_B117_24h"),
                    area = c(1540000, 1210000, 15330000, 39430000),
                    n_images = c(8, 8, 2, 2))

# get_rn(nndist_VIC_8h)
# get_rn(nndist_B117_8h)
# get_rn(nndist_VIC_24h)
# get_rn(nndist_B117_24h)


## Plotting function
plot_nndistance <- function(df, ObsBin, RandBin){
  variable_name <- deparse(substitute(df))
  virus <- ifelse(str_detect(variable_name, "VIC"), "VIC", "B.1.1.7")
  hpi <- ifelse(str_detect(variable_name, "24h"), "24hpi", "8hpi")
  title <- paste0(virus, " ", hpi)
  colour <- ifelse(virus == "VIC", "#f9ab38", "#5672be")
  
  ggplot() +
  geom_histogram(data = subset(df, data_type == "Observed"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 colour = "white", bins = ObsBin, alpha = 1, size = 0.1) +
  geom_histogram(data = subset(df, data_type == "Random"),
                 aes(x = nn_distance, y = stat(ndensity), fill = data_type),
                 bins = RandBin, alpha = 0.4) +
  scale_fill_manual(values = c("gray50", colour)) + 
  scale_y_continuous(limits = c(0, 1),
                     breaks = seq(0, 1, by = 0.2)) +
  labs(title = title,
       x = "Nearest neighbour \n distance (\u03BCm)",
       y = "Normalised count") + 
  coord_cartesian(xlim = c(0, 800)) + 
  theme_classic(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5, size = 7, face = "bold"),
        legend.title = element_blank(),
        axis.line = element_blank(),
        legend.position = c(0.7, 0.85),
        legend.key.size = unit(0.1, "cm"),
        legend.text = element_text(hjust = 0),
        legend.background = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        axis.text = element_text(size = 5)) +
  annotate("text", x = 325, y = 0.90, 
           label = get_mean_label(df, "Observed"), colour = "gray30", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.83, 
           label = get_mean_label(df, "Random"), colour = colour, size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.76, 
           label = get_pvalue(df), colour = "gray30", size = 2, hjust = 0) +
  annotate("text", x = 325, y = 0.69, 
           label = get_rn(df), colour = "gray30", size = 2, hjust = 0)
}

# * * * * * Plot

(plot_nndistance(nndist_VIC_8h, 50, 100) + plot_nndistance(nndist_VIC_24h, 100, 500)) / 
  (plot_nndistance(nndist_B117_8h, 20, 100) + plot_nndistance(nndist_B117_24h, 85, 100)) + 
  plot_layout(guides = "collect")

Consistent with the delay in replication, we observed a shallower growth of the sgRNA/gRNA ratio in B.1.1.7-infected cells between 2 and 8 hpi compared to the VIC strain (Figure 5H). These differences between the strains were apparent in all three classifications of cells from our earlier gRNA burden criteria. We noted that B.1.1.7-infected ‘partially resistant’ and ‘permissive’ cells show lower sgRNA/gRNA ratio while ‘super-permissive’ cells displayed 1.5-fold higher ratio compared to VIC (Figure 5—figure supplement 1A). The frequency of super-permissive cells was lower for B.1.1.7 at 6 and 8 hpi (Figure 5I, Figure 5—figure supplement 1B and C). In agreement with our results with VIC (Figure 4E and F), the distribution of super-permissive cells with B.1.1.7 was random at 8 hpi; however, this changed to a non-random pattern at 24 hpi. In contrast, the distribution of VIC super-permissive cells remained random at all timepoints (Figure 5—figure supplement 2). We interpret these results as demonstrating differences in the infection kinetics of the variants, with B.1.1.7 displaying a potentially higher capacity to spread locally between adjacent cells than VIC.

To test whether our findings using B.1.1.7 are applicable to other cell types, we assessed the replication of both variants in A549-ACE2 cells that were recently reported to be immunocompetent 53Li et al.2021. Both VIC and B.1.1.7 infections resulted in comparable numbers of infected cells and similar numbers of gRNA molecules per cell at 2 hpi, demonstrating a similar degree of viral particle entry into cells (Figure 6—figure supplement 1A). However, infection with the B.1.1.7 variant led to a reduced gRNA and sgRNA burden at 8 and 24 hpi (Figure 6A and B, Figure 6—figure supplement 1B and C). Moreover, fewer ‘super-permissive’ cells were detected at these timepoints (Figure 6C). To evaluate whether the slower replication kinetics of B.1.1.7 was attributable to a reduction in the secretion of new particles, we measured the level of infectious virus (Figure 6—figure supplement 1D). We found a modest but significant reduction in the infectious titre of B.1.1.7 compared to VIC at 8 and 24 hpi, consistent with the reduced cellular RNA burden of B.1.1.7. Considering these results together, we conclude that the replication and secretion rates of B.1.1.7 are slower than VIC in contrast to its more rapid spread in the human population.

Transcriptomic landscape of B.1.1.7 and Victoria (VIC) severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains.

(A) Experimental design to compare replication kinetics and transcriptomic landscapes of VIC and B.1.1.7 strains. A549-ACE2 cells were seeded and 24 hr later inoculated with VIC or B.1.1.7 strain (multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion, and cells were fixed at designated timepoints for single-molecule fluorescence in situ hybridisation (smFISH) or harvested for RNA-seq library preparation. (B) Maximum z-projected confocal images of A549-ACE2 cells infected with VIC or B.1.1.7. Representative super-permissive cells from the time series are shown. Numbers at the bottom-left corner indicate dynamic contrast range used to display the image. Scale bar = 10 µm. (D) Read coverage along SARS-CoV-2 genome (positive strand) for the two variants in the three timepoints. Counts are normalised to total read count to show the increased proportion of reads from the virus in addition to the accumulation of subgenomic RNA and averaged across replicates. (n = 3). (E) Percentage of reads mapping to SARS-CoV-2 genome of total mapped reads, shown separately for the two strains. Each symbol represents an experimental replicate (n = 3). (F) Violin plots showing fold-changes in the host transcriptome and viral RNA genome comparing B.1.1.7 and VIC strains at the three timepoints. Fold-changes for SARS-CoV-2-positive and -negative strands are indicated as separate points and coloured according to the statistical significance of the change (red – higher in VIC, blue – higher in B.1.1.7, grey – no change). p-adjusted <0.01, log2 fold-change cut-off = 0 (n = 3). (G) Percent of reads mapping to SARS-CoV-2-negative (antisense) strand relative to all SARS-CoV-2 reads, shown separately for the two strains. Each symbol represents an experimental replicate (n = 3). (I) Expression of S, N, ORF9b, and N* viral subgenomic RNAs in each strain and different timepoints. Expression of each subgenomic RNA is determined from split reads indicative of transcriptional skipping landing within 100 nt upstream of annotated ORF start site, or until upstream ORF start codon if nearer. Percentage of all skip events is shown (n = 3).

# Prepare dataframe
FACS_hperm_summary <- read_csv("./Data/Figure6/A549-ACE2_smFISH_overview_highly_permissive.csv") %>%
  mutate(time = fct_rev(time)) %>%
  mutate(Virus = case_when(Virus == "B1.1.7" ~ "B.1.1.7",
                           TRUE ~ Virus))

# Statistics
FACS_hperm_summary %>% shapiro_test(p.superinfected) # Normal 
FACS_hperm_summary %>% group_by(time) %>% t_test(p.superinfected ~ Virus)

FACS_hperm_anno <- FACS_hperm_summary %>% 
  group_by(time) %>% t_test(p.superinfected ~ Virus) %>% add_significance() %>% ungroup() %>%
  mutate(label = paste0(p.signif, "\np=", p)) %>%
  mutate(y_pos = case_when(
    time == "8 hpi" ~ 8,
    time == "24 hpi" ~ 24))

# Plot 
FACS_hperm_summary %>% 
  mutate(Virus = fct_rev(Virus)) %>%
  ggplot() +
  geom_bar(aes(x = time, y = p.superinfected, fill = Virus),
           stat = "summary", fun.data = "mean_se",
           width = 0.65, position = position_dodge(0.8), alpha = 0.8) +
  geom_linerange(aes(x = time, y = p.superinfected, fill = Virus),
                 stat = "summary", fun.data = "mean_sdl", 
                 fun.args = list(mult = 1), size = 0.25,
                 position = position_dodge(0.8)) +
  geom_text(data = FACS_hperm_anno,
            aes(x = time, y = y_pos, label = label), 
            size = 2, alpha = 0.75) +
  scale_y_continuous(limits = c(0, 25)) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "% Super-permissive cells",
       x = "",
       y = "% cells per field of view") +
  theme_minimal(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5, size = 10),
        legend.title     = element_blank(),
        legend.position  = "bottom",
        legend.key.size  = unit(0.4, "cm"))

Comparison of the percentage of super-permissive cells between the two strains.

Super-permissive cells were identified from low-magnification high-throughput smFISH assay (see Figure 6—figure supplement 1B-C). Data are represented as mean ± SD (8 hr post infection [hpi] n = 2; 24 hpi n = 3). *p<0.5: **p<0.01.

ratio <- read_tsv("./Data/Figure6/subgenomic-to-genomic-ratio.tsv") %>%
  rename("time" = hpi) %>%
  mutate(time = case_when(
    time == "2h" ~ "2hpi",
    time == "8h" ~ "8hpi",
    time == "24h" ~ "24hpi"
  )) %>%
  mutate(strain = case_when(
    strain == "B117" ~ "B.1.1.7",
    strain == "victoria" ~ "Victoria"
  ))

## Statistics
ratio %>%
  group_by(time) %>%
  shapiro_test(sub_to_genomic_ratio)

ratio %>%
  group_by(time) %>%
  t_test(sub_to_genomic_ratio ~ strain) %>%
  add_significance() %>%
  mutate(label = paste0(p.signif, "\n", "p=", signif(p, 2))) -> ratio_anno

ratio %>%
  filter(strain == "B.1.1.7") %>%
  tukey_hsd(sub_to_genomic_ratio ~ time)

## Plot
ratio %>%
  mutate(time = fct_relevel(time, c("2hpi", "8hpi", "24hpi"))) %>%
  mutate(strain = fct_rev(strain)) %>%
  ggplot(aes(x = time, y = sub_to_genomic_ratio, colour = strain)) +
  geom_point(position = position_jitterdodge()) +
  geom_text(data = ratio_anno, 
            aes(x = time, y = 10, label = label), 
            size = 1.5, inherit.aes = FALSE) +
  scale_colour_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "sgRNA/gRNA ratio",
       x = "",
       y = "Ratio") +
  theme_classic(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5),
        legend.title     = element_blank(),
        legend.position  = "bottom",
        axis.line        = element_blank(),
        panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        strip.background = element_rect(colour = "black", fill = NA, size = 0.5)) -> sgRNA_ratio

ratio %>%
  mutate(time = fct_relevel(time, c("2hpi", "8hpi", "24hpi"))) %>%
  mutate(strain = fct_rev(strain)) %>%
  ggplot(aes(x = time, y = sub_to_genomic_ratio, colour = strain)) +
  geom_quasirandom(width = 0.2, size = 1.5, alpha = 0.4) +
  geom_line(aes(group = strain),
            stat = "summary", alpha = 0.4) +
  geom_text(data = ratio_anno, 
            aes(x = time, y = 10, label = label), 
            size = 2, inherit.aes = FALSE) +
  scale_colour_manual(values = c("#00BFC4", "#F8766D")) + 
  labs(title = "sgRNA/gRNA ratio",
       x = "",
       y = "Ratio") +
  theme_minimal(base_size = 7) +
  theme(
        legend.title     = element_blank(),
        legend.position  = "none",
        # axis.line        = element_blank(),
        # panel.border     = element_rect(colour = "black", fill = NA, size = 0.5),
        # strip.background = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.key.size  = unit(0.2, "cm"))

Estimated ratio of SARS-CoV-2 subgenomic to genomic RNA for the two virus variants at the three timepoints.

Student’s t-test. n.s., not significant; *p<0.05; ****p<0.001 (n = 3).

# * * * * * % infection
# Prepare data
p_infection <- read_csv("./Data/Figure6/A549-ACE2_variant_pinfection.csv") %>%
  mutate(Virus = case_when(Virus == "B1.1.7" ~ "B.1.1.7",
                           TRUE ~ Virus)) %>%
  mutate(Virus = fct_rev(Virus))
  

# Statistics
p_infection %>% group_by(Virus) %>% shapiro_test(p_infection) # Not normal
p_infection %>% wilcox_test(p_infection ~ Virus) %>% pull(p) -> pinf_pval
pinf_pval_anno <- paste0("p=", pinf_pval)

# * * * * * Viral entry
# Prepare data 
entry <- read_csv("./Data/Figure6/A549-ACE2_smFISH_2h.csv") %>%
  mutate(Virus = case_when(Virus == "B1.1.7" ~ "B.1.1.7",
                           TRUE ~ Virus)) %>%
  mutate(Virus = fct_rev(Virus))

# Statistics 
entry %>% group_by(Virus) %>% shapiro_test(total_vRNAs) # Not normal
entry %>% wilcox_test(total_vRNAs ~ Virus) %>% pull(p) -> entry_pval
entry_pval_anno <- paste0("p=", entry_pval)

# Annotation
entry_anno <- entry %>% group_by(Virus) %>% 
  summarise(mean = mean(total_vRNAs)) %>% 
  ungroup() %>%
  mutate(mean = round(mean, 1))

p_infection %>%
  ggplot(aes(x = Virus, y = p_infection, fill = Virus)) +
  geom_bar(stat = "summary",
           width = 0.6, alpha = 0.8) +
  geom_linerange(stat = "summary", fun.data = "mean_sdl", fun.args = list(mult = 1),
                 size = 0.5) +
  scale_fill_manual(values = c("#f9ab38", "#5672be")) + 
  labs(title = "% infection",
       x = "",
       y = "% of cells (2 hpi)") + 
  theme_minimal(base_size = 7) +
  theme(plot.title       = element_text(hjust = 0.5),
        legend.title     = element_blank(),
        legend.key.size  = unit(0.2, "cm"),
        legend.position  = "none") +
  annotate(geom = "text", label = pinf_pval_anno, size = 2, alpha = 0.75,
           x = 1.5, y = 100) -> p1_2hpi

entry %>%
  ggplot(aes(x = Virus, y = total_vRNAs, colour = Virus)) +
  geom_beeswarm(cex = 1.5, alpha = 0.15, size = 1, stroke = 0,
                priority = "random") +
  geom_pointrange(stat = "summary", fun.data = "mean_sdl", fun.args = list(mult = 0.5),
                  size = 0.3, alpha = 1) +
  geom_label(data = entry_anno, 
             aes(x = Virus, y = 125, label = mean), 
             inherit.aes = FALSE,label.padding = unit(0.1, "lines"), size = 2) +
  #scale_y_continuous(limits = c(0, 200)) +
  scale_colour_manual(values = c("#f9ab38", "#5672be")) +
  labs(title = "Viral entry",
       x = "",
       y = "gRNA total count (2 hpi)") +
  theme_minimal(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5),
        legend.position = "none",
        legend.title = element_blank(),
        legend.key.size = unit(0.3, "cm")) +
  annotate(geom = "text", label = entry_pval_anno, size = 2, alpha = 0.75,
           x = 1.5, y = 160) -> p2_2hpi

# Patchwork
p1_2hpi + p2_2hpi

Comparing percentage of infected A549-ACE2 cells between VIC and B.1.1.7 at 2 hr post infection (hpi) (left).

Infected cells were determined by +ORF1a smFISH spot detection. Data are represented as mean ± SD. Comparison of viral genomic RNA (gRNA) counts at 2 hpi between the two strains (right). Each dot represents a cell. Data are represented as mean ± SEM, and the labels represent average values.

Low-magnification z-projected (9 µm) images of SARS-CoV-2-infected A549-ACE2 cells.

Cells were hybridised with +ORF1a and +ORFN probes to visualise super-permissive cells. Scale bar = 500 µm.

#' @width 12
#' @height 7
# * * * * * Import data
FACS_df <- read_csv("./Data/Figure6/A549-ACE2_smFISH_overview_intensity_full.csv") %>%
  filter(condition != "RDV") %>%
  mutate(time = case_when(
    time == "02h" ~ "2 hpi",
    time == "08h" ~ "8 hpi",
    time == "24h" ~ "24 hpi"
  )) %>%
  mutate(Virus = case_when(Virus == "B1.1.7" ~ "B.1.1.7",
                           TRUE ~ Virus)) %>%
  mutate(Virus = fct_relevel(Virus, c("Victoria", "B.1.1.7", "Mock"))) %>%
  mutate(time = fct_relevel(time, c("2 hpi", "8 hpi", "24 hpi"))) %>%
  arrange(Virus)

# * * * * * FACS plot
# Make annotation df
FACS_df_summary <- FACS_df %>% 
  group_by(time, Virus, condition) %>%
  summarise(Ch3.pool        = mean(Ch3.mean),
            Ch4.pool        = mean(Ch4.mean),
            p.superinfected = sum(superinfected)/n() * 100,
            cell.count      = n()) %>% 
  ungroup() %>%
  mutate(time = fct_relevel(time, c("2 hpi", "8 hpi", "24 hpi")))

A549_anno <- FACS_df_summary %>%
  mutate(y.pos = case_when(
    Virus == "Victoria" ~ 8000,
    Virus == "B.1.1.7" ~ 5000,
    Virus == "Mock" ~ 3000)) %>%
  mutate(p.superinfected = as.character(round(p.superinfected, 2))) %>%
  mutate(p.superinfected = paste0(p.superinfected, "%")) %>%
  mutate(Virus = fct_relevel(Virus, c("Victoria", "B.1.1.7", "Mock"))) %>%
  mutate(condition = fct_rev(condition))

# Plot
FACS_df %>%
  ggplot(aes(x = Ch3.mean, y = Ch4.mean, colour = Virus, alpha = Virus)) +
  geom_point(shape = 21, size = 0.6, stroke = 0.2) +
  geom_vline(xintercept = 280, size = 0.5, alpha = 0.5, linetype = "dashed", colour = "gray50") +
  labs(title    = "smFISH intensity per cell",
       x        = "ORF-N smFISH intensity (log)",
       y        = "ORF1a smFISH intensity (log)") + 
  scale_alpha_manual(values = c(0.4, 0.4, 0.05)) +
  scale_x_log10(limits = c(1e2, 1e4),
                labels = trans_format("log10", math_format(10^.x))) +
  scale_y_log10(limits = c(1e2, 1e4),
                labels = trans_format("log10", math_format(10^.x))) +
  scale_colour_manual(values = c("#f9ab38", "#5672be", "#d18975")) +
  facet_wrap(~ time) +
  theme_minimal(base_size = 7) +
  guides(colour = guide_legend(override.aes = list(alpha = 1))) + 
  theme(plot.title = element_text(hjust = 0.5, size = 8),
        legend.position  = "bottom",
        legend.title = element_blank()) +
  geom_text(data = (A549_anno),
            aes(x = 350, y = y.pos, label = p.superinfected), 
            hjust = 0, show.legend = FALSE, size = 1.75, alpha = 1) +
  guides(colour = guide_legend(override.aes = list(size = 2, alpha = 0.75, stroke = 0.6)))

Scatter plot showing high-throughput smFISH intensity quantification of +ORF1a and +ORFN probes in infected cells.

Each dot is a cell. Fluorescence signal density was measured from low-magnification overview image of entire culture wells. A gate was set with +ORFN signal using uninfected (Mock) condition (dotted line) (2–4 hpi, n = 2; 24 hpi, n = 3).

## Set factor levels 
time_order <- c("2hpi", "8hpi", "24hpi")
virus_order <- c("VIC", "B.1.1.7")

## Read data and add psuedocount
ifp <- read_csv("./Data/Figure6/A549-ACE2_infectious_particle_release.csv") %>%
  mutate(PFU_per_ml = if_else(time == "2hpi", 10, PFU_per_ml)) %>%
  mutate(time = fct_relevel(time, time_order)) %>%
  mutate(virus = fct_relevel(virus, virus_order))

# * * * * * Statistics

## t-test
ifp %>%
  mutate(PFU_per_ml_log = log10(PFU_per_ml)) %>%
  filter(time != "2hpi") %>%
  group_by(time) %>%
  t_test(PFU_per_ml_log ~ virus) %>%
  add_significance() %>%
  dplyr::select(time, p, p.signif) %>%
  add_row(time = "2hpi", p = 1, p.signif = "ns") %>%
  mutate(p_label = paste0(p.signif, "\np=", p)) %>%
  mutate(time = fct_relevel(time, time_order)) %>%
  arrange(time) %>%
  add_column(y_pos = c(50, 8000, 60000)) -> ifp_stat
  
# * * * * * Plot

## Plot
ifp %>%
  ggplot(aes(x = time, y = PFU_per_ml, fill = virus)) + 
  geom_bar(stat = "summary", width = 0.6, position = position_dodge(width = 0.8), alpha = 0.8) +
  geom_linerange(stat = "summary", position = position_dodge(width = 0.8)) + 
  geom_text(data = ifp_stat,
            aes(x = time, y = y_pos, label = p_label), 
            inherit.aes = FALSE, cex = 2) +
  geom_hline(yintercept = 10, linetype = "dotted", size = 0.5, colour = "gray20") +
  labs(title = "Infectious Virus Release",
       x = "Time post infection (hpi)",
       y = "PFU/ml (log)",
       fill = "") + 
  scale_fill_manual(values = c("#f9ab38", "#5672be")) +
  scale_y_log10(labels = trans_format("log10", math_format(10^.x))) +
  theme_minimal(base_size = 7) +
  theme(plot.title = element_text(hjust = 0.5),
        legend.position  = "right",
        strip.text = element_text(size = 7 ,face = "bold"),
        legend.key.size = unit(0.1, 'cm')) +
  annotate(geom = "text", x = 0.65, y = 15, label = "LLOD", cex = 2)

Comparison of VIC and B.1.1.7 infection and secretion of infectious virus.

Extracellular media from infected A549-ACE2 cells was collected at the indicated timepoints and infectious virus quantified using plaque forming assay. Data are represented as mean ± SEM. No apparent plaques were found at 2 hpi, and pseudocounts of 10 PFU/ml were added for plotting purposes (LLOD, lower limit of detection). Student’s t-test (n = 3). *p<0.05.

Transcriptomic landscapes of B.1.1.7 and Victoria (VIC) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains.

A549-ACE2 cells were seeded and 24 hr later inoculated with VIC or B.1.1.7 strain (multiplicity of infection [MOI] = 1) for 2 hr. Non-internalised viruses were removed by trypsin digestion, and the cells were harvested for RNA-seq library preparation. (A) First two components of a principal component analysis (PCA) performed on the 500 host genes showing the highest variance in RNA-seq. The infection timepoints (coloured) and control (grey) samples group into separate clusters, but the samples of the two strains remain close to one another (n = 3). (B) Violin plots showing fold-changes in the host transcriptome and viral RNA genome comparing consecutive timepoints separately for each of the two strains. Fold-changes for SARS-CoV-2-positive and -negative strands are indicated as separate points and coloured according to the statistical significance of the change (red – upregulated relative to earlier timepoint, blue – downregulated relative to earlier timepoint, grey – no change). p-adjusted <0.01, log2 fold-change cut-off = 0 (n = 3). (C) Expression of viral subgenomic RNAs (sgRNAs) in each strain and different timepoints. Expression of each sgRNA is determined from split reads indicative of transcriptional skipping landing within 100 nt upstream of annotated ORF start site, or until upstream ORF start codon if nearer. Percentage of all skip events is shown (n = 3). (D) Comparison of transcriptional skip site usage between the two virus strains at 24 hr post infection (hpi). Assignment to viral genes is as in panel (C).Average across replicates is shown (n = 3).

To evaluate our observation on B.1.1.7 replication kinetics with an independent method, we sequenced ribo-depleted total RNA libraries of A549-ACE2 cells infected with B.1.1.7 or VIC for 2, 8, and 24 hr (Figure 6A, Figure 6—figure supplement 2A). As expected, the number of reads mapping to SARS-CoV-2 genome increased over time, reflecting active replication and transcription (Figure 6D). Reads mapping to the 3′ end of the genome increased relative to the 5′ end, reflecting the synthesis of sgRNAs. In agreement with our smFISH analysis, we detected similar levels of vRNA at 2 hpi within B.1.1.7 or VIC-infected cells, consistent with similar internalisation rates in A549-ACE2 cells (Figure 6E). However, the abundance of vRNAs in B.1.1.7-infected cells at 8 and 24 hpi was notably lower than with VIC-infected cells (Figure 6E). Furthermore, the level of B.1.1.7 RNA was almost unaltered between 2 and 8 hpi, and then increased dramatically at 24 hpi (Figure 6E, Figure 6—figure supplement 2B), contrasting with VIC-infected cells, which showed a continuous increase in vRNA over time. Together, these RNA sequencing data confirm that the B.1.1.7 variant exhibits delayed replication kinetics complementing our smFISH results.

Transcriptomic changes in B.1.1.7 and VIC-infected cells

To further explore the differences in gene expression between the B.1.1.7 and VIC strains, we assessed the abundance of the different vRNAs in infected A549-ACE2 cells. Negative sense viral RNAs represent a small fraction of the vRNA present in the cell, as assayed by smFISH (Figure 6G). These negative sense transcripts are detectable as early as 2 hpi, adding further support to our earlier conclusion that primary viral replication events can occur rapidly post-infection, particularly in ‘super-permissive’ cells (Figures 3C and 6G). The ratio between negative and positive sense vRNAs increased throughout the infection for the VIC strain, but for B.1.1.7 we observed a modest reduction in the ratio at 24 hpi (Figure 6G). To assess the expression of sgRNAs, we quantified the reads mapping to the split junctions derived from RNA-dependent RNA polymerase discontinuous replication (Figure 6—figure supplement 2C; 44Kim et al.2020; 85V’kovski et al.2021). In agreement with smFISH data, sgRNAs were detected in low quantities at 2 hpi (Figure 6D and G). For VIC, the sgRNA/gRNA ratio peaks at 8 hpi, followed by a significant drop at 24 hpi (Figure 6G). For B.1.1.7, we observed a significantly lower sgRNA/gRNA ratio at 8 hpi when compared to VIC (Figure 5H). However, the sgRNA/gRNA ratio of B.1.1.7 remained stable between 8 and 24 hpi, surpassing VIC (Figure 6H). These results suggest that both VIC and B.1.1.7 have a different kinetics of gRNA and sgRNA expression, complementing our earlier observations with smFISH (Figure 3C and G).

Next, we assessed the relative abundance of each individual sgRNA. We found that S-sgRNA was the dominant species at 2 hpi, while the sgRNA encoding N (N-sgRNA) become prevalent at 8–24 hpi (Figure 6—figure supplement 2D). We interpret this early S-to-N sgRNA switch upon infection as indicating a transition to the assembly of viral particles requiring large numbers of N molecules. At 2 hpi, B.1.1.7 produced more S-sgRNA but less N-sgRNA than VIC, which is consistent with a delayed B.1.1.7 replication kinetic and S-to-N transition. Furthermore, we found upregulation of sgRNAs encoding ORF9b (~0.13%) and N* (~1%) in B.1.1.7-infected cells (Figure 6I), in agreement with recent studies reporting altered sgRNA landscapes for B.1.1.7 60Parker et al.202182Thorne et al.2021. Upregulation of these transcripts is likely to result from advantageous mutations that create novel transcriptional regulatory sequences (TRS-B) in B.1.1.7 60Parker et al.202187Wang et al.2021. When we scanned for the TRS motifs in other VOCs and variants of interests (VOIs), we found mutations in TRS-B near N* were also found in P.1 (Gamma) and P.2 (Zeta) variants while mutations in TRS-B near ORF9b were unique to B.1.1.7 (see Supplementary file 1). However, multiple sequence alignment of VOCs and VOIs revealed that mutations accumulate frequently at or near the TRS motif sequences, suggesting that SARS-CoV-2 utilises these regulatory motif and surrounding sequences as evolutionary hotspots to modulate sgRNA expression and viral fitness. These transcriptomic results reveal that B.1.1.7 does not only exhibit delayed replication kinetics, but also produces a differential pool of sgRNAs likely due to mutations within the TRS. Altogether, the novel combination of smFISH and ‘in-bulk’ RNA sequencing that we have described provides a powerful and holistic way to characterise the replication dynamics of SARS-CoV-2. Our pipeline can now be expanded to other VOC, viruses, functional analyses, and characterisation of antivirals.

Discussion

Our spatial quantitation of SARS-CoV-2 replication dynamics at the single-molecule and single-cell level provides important new insights into the early rate-limiting steps of infection. Typically, analyses of viral replication are carried out using ‘in-bulk’ approaches such as RT-qPCR and conventional RNA-seq. While very informative, these approaches lack spatial information and do not allow single-cell analyses. Although single-cell RNA-seq analyses can overcome some of these issues 29Fiege et al.202166Ravindra et al.2021, their low coverage and lack of information regarding the spatial location of cells remain a significant limitation. In this study, we show that smFISH is a sensitive approach that allows the absolute quantification of SARS-CoV-2 RNAs at single-molecule resolution. Our experiments show the detection of individual gRNA molecules within the first 2 hr of infection, which most likely reflect incoming viral particles. However, we also observed small numbers of foci comprising several gRNAs sensitive to RDV treatment, demonstrating early replication events. We believe that these foci represent ‘replication factories’ as they co-stain with FISH probes specific for negative sense viral RNA and sgRNA. These data provide the first evidence that SARS-CoV-2 replication occurs within the first 2 hr of infection and increases over time. This contrasts to our observations with the J2 anti-dsRNA antibody where viral-dependent signals were apparent at 6 hpi 19Cortese et al.202027Eymieux et al.2021. We noted that co-staining SARS-CoV-2-infected cells with J2 antibody and +ORF1a with an smFISH probe set showed a partial overlap, suggesting that infection may induce changes in cellular dsRNA. These findings highlight the utility of smFISH to uncover new aspects of SARS-CoV-2 replication that are worthy of further study.

We found that SARS-CoV-2 gRNA persisted in the presence of RDV, suggesting a long half-life that may reflect the high secondary structure of the RNA genome that could render it refractory to the action of nucleases 39Huston et al.202175Simmonds et al.2021. smFISH revealed complex dynamics of gRNA and sgRNA expression that resulted in a rapid expansion of sgRNA (peaking at 8 hpi), followed by a shift towards the production of gRNA (24 hpi), results that were confirmed by RNA-seq. Since a viral particle is composed of thousands of proteins and a single RNA molecule, we interpret the high synthesis of sgRNAs as aiming to fulfil the high demand for structural proteins in the viral particles. Once the structural proteins are available in sufficient quantities, the late shift towards gRNA synthesis may ensure the presence of sufficient gRNA to generate the viral progeny.

Our study shows that cells vary in their susceptibility to SARS-CoV-2 infection, where most cells had low vRNA levels (<102 copies/cell), but a minor population (4–10% depending on the cell line) had much higher vRNA burden (>105 copies/cell) at 10 hpi. In contrast, the number of intracellular vRNA copies at 2 hpi was similar across the culture, suggesting that this phenotype is not explained by differences in virus entry. These ‘super-permissive’ cells account for the majority of vRNA within the culture and mask the dominant cell population. Similar results were obtained with Vero E6, Calu-3, and A549-ACE2, suggesting that this is a common feature of SARS-CoV-2 infection. As both Calu-3 and A549-ACE2 have intact innate sensing pathways 12Cao et al.202153Li et al.2021 unlike Vero cells 21Desmyter et al.1968, this variable susceptibility is unlikely to reflect differential immune cell signalling and is consistent with their random distribution within the culture. The reason for the differential infection fitness may rely on the intrinsic properties of each cell, including the stage of the cell cycle, the expression of individual antiviral sensors or the metabolic state. Recent single-cell RNA sequencing studies of SARS-CoV-2-infected bronchial cultures identified ciliated cells as the primary target. However, only a minority of these cells contained vRNA that may either reflect low sequencing depth or cell-to-cell variation in susceptibility 29Fiege et al.202166Ravindra et al.2021, and indeed, our smFISH results on experimentally infected hamster lungs showed variable RNA levels across the tissue (Figure 1E). The human respiratory tract encompasses the nasal passage, large and small airways and bronchioles, and our knowledge on how specific cell types and SARS-CoV-2 RNA burden relate is still limited. Applying smFISH in concert with cell-type-specific labels to clinical biopsies and experimentally infected animal samples 71Salguero et al.2021 will allow us to address this important question.

Given the current status of the pandemic, there has been a global effort to understand the biology of emergent VOC with high transmission rates and possible resistance to neutralising antibodies. Most studies have focused on mutations mapping to the Spike glycoprotein as they can alter virus attachment, entry, and sensitivity to vaccine-induced or naturally acquired neutralising antibodies. However, many of the mutations map to other viral proteins, including components of the RNA-dependent RNA polymerase complex that could impact RNA replication, and non-coding regulatory regions as the TRSs, which can affect sgRNA expression. Our smFISH analysis revealed that the B.1.1.7 variant shows slower replication kinetics compared to the VIC strain, resulting in lower gRNA and sgRNA copies per cell, fewer viral replication factories and a reduced frequency of ‘super-permissive’ cells. This delay in B.1.1.7 replication was observed in Vero and A549-ACE2 cells and was confirmed by RNA-seq as an orthogonal method.

Emerging VOCs, such as B.1.1.7, have been reported to have a fitness advantage in terms of their ability to transmit compared to the VIC isolate 11Caly et al.202020Davies et al.202143Kidd et al.202184Volz et al.2021. However, the mechanisms underlying increased transmission are not well understood. Interestingly, a recent study reported that B.1.1.7 leads to higher levels of intracellular vRNA and N protein than VIC at 24 and 48 hpi using ‘in-bulk’ RT-qPCR and immunofluorescence, respectively 82Thorne et al.2021. We observed that while B.1.1.7 still produces lower level of vRNA than VIC at 24 hpi (Figure 6E), it exhibits a clear recovery compared to 8 hpi. It is thus plausible that both variants yield similar total amounts of viral RNA and proteins but within a different time frame. The potential differences in replication dynamics between the two variants are also reflected in distinct sgRNA/gRNA ratios throughout the infection (Figure 6H). That said, ‘in-bulk’ RT-qPCR analysis does not provide absolute quantification and individual cell assessment and, therefore, should likely be biased towards the super-susceptible cells that account for most of the RNA burden. Thorne and colleagues also reported an elevated expression of the sgRNA encoding the innate agonist ORF9b 82Thorne et al.2021, which is also supported by our results. We noticed that the increase of ORF9b sgRNA expression may be due to mutations in non-coding regulatory sequences involved in discontinuous replication (TRS), and that such mutations are common across VOCs possibly mediating differential sgRNA expression. Enhanced ORF9b expression, together with the lower intracellular vRNA levels present in B.1.1.7-infected cells, may grant this variant with an advantage to evade the antiviral response. This advantage, combined with mutations in the Spike that are proposed to improve cell entry, could provide the B.1.1.7 with a replicative advantage over the early lineage VIC strain enabling its rapid dissemination across the human population 11Caly et al.202020Davies et al.202143Kidd et al.202184Volz et al.2021. A recent longitudinal study of nasopharyngeal swabs showed that the B.1.1.7 variant was associated with longer infection times and yet showed similar peak viral loads to non-B.1.1.7 variants 46Kissler et al.2021. The authors conclude that this extended duration of virus shedding may contribute to increased transmissibility and is consistent with our data showing reduced replication of B.1.1.7 at the single-cell level. An independent study reported a significantly longer duration of SARS-CoV-2 RNA in nasopharyngeal swabs from persons infected with B.1.1.7 (16 days) compared to those infected with other lineages (14 days) 10Calistri et al.2021. Replication fitness will be defined by the relationship of the virus with its host cell, and aggressive replication is expected to trigger cellular antiviral sensors. In contrast, lower replication may allow the virus to replicate and persist for longer periods before host antiviral sensors are triggered. Such differences, and their impact on host antiviral responses, are likely to be of key importance for our understanding of the success of viral variants to spread through the population.

Materials and methods

Reagent type (species) or resource Designation Source or reference Identifiers Additional information
Gene (SARS-CoV-2) SARS-CoV-2 RefSeq reference genome NCBI NC_045512.2
Gene (Homo sapiens) human genome Ensembl GRCh38.99
Strain, strain background (SARS-CoV-2) Victoria 01/20 (BVIC01) 11Caly et al.2020 Provided by PHE Porton Down after supply from the Doherty Centre Melbourne, Australia
Strain, strain background (SARS-CoV-2) B.1.1.7 (20I/501Y.V1.HMPP1) 81Tegally et al.2020 Provided by PHE Porton Down
Strain, strain background (SARS-CoV-2) HCoV-229E Provided by Professor Andrew Davidson and Professor Peter Simmonds
Cell line (African green monkey) Vero E6 Kind gift from Professor William James
Cell line (H. sapiens) A549-ACE2 Kind gift from Professor Ralf Bartenschlager
Cell line (H. sapiens) Huh-7.5 Kind gift from Professor Peter Simmonds
Cell line (H. sapiens) Calu-3 Kind gift from Professor Nicole Zitzmann
Biological sample (Gold Syrian hamster) hamster lung tissue This study Infected tissue provided by the Biological Investigations Group, Public Health England, Porton Down
Antibody J2 primary antibody (mouse, monoclonal) Scicons Cat# 10010200;RRID:AB_2651015 1:500 IF
Antibody Anti-SARS-CoV-2 N antibody (human, monoclonal) 38Huang et al.2020 Ey2A clone 1:2000 IF
Commercial assay or kit SARS primer probe IDT Cat# 100006770
Commercial assay or kit B2M primer probe Applied Biosystems Cat# 4325797
Commercial assay or kit QIAGEN RNeasy kit QIAGEN Cat# 74004
Commercial assay or kit Illumina Total RNA Prep with Ribo-Zero Plus library kit Illumina Cat# 20040525
Commercial assay or kit Beckman Coulter RNAClean XP beads Beckman Coulter Cat# A63987
Commercial assay or kit Takyon Dry Probe MasterMix Eurogentec Cat# UFD-NPMT-C0101
Chemical compound, drug ATTO633 NHS ester Atto-Tec Cat# AD633
Chemical compound, drug ATTO565 NHS ester Atto-Tec Cat# AD565
Chemical compound, drug Cy3 NHS ester Lumiprobe Cat# 11320
Chemical compound, drug ATTO488 NHS ester Atto-Tec Cat# AD488
Chemical compound, drug Phalloidin-Alexa Fluor 488 conjugate Thermo Fisher Cat# A12379
Chemical compound, drug CellMask Green Thermo Fisher Cat# C37608
Chemical compound, drug RNaseT1 Thermo Fisher Cat# EN0541
Chemical compound, drug RNaseIII NEB Cat# M0245S
Software, algorithm Stellaris Probe Designer ver 4.2 Biosearch technologies
Software, algorithm ‘bowtie2’ 49Langmead and Salzberg2012 v2.4.4
Software, algorithm OMERO server 2Allan et al.2012
Software, algorithm cellSens Dimension Olympus
Software, algorithm ImageJ National Institute of Health
Software, algorithm FISH-quant 59Mueller et al.2013 v3
Software, algorithm RNA Distribution Index (RDI) calculator 79Stueland et al.2019
Software, algorithm STAR aligner 25Dobin et al.2013 v2.7.3a
Software, algorithm Cellpose 78Stringer et al.2021 v0.6.1
Software, algorithm Bigfish 40Imbert et al.2021 v0.4.0
Software, algorithm R package DESeq2 55Love et al.2014 v1.28.1
Software, algorithm Affinity Designer Serif

Cell culture

Vero E6, A549-ACE2 (kind gift from the Bartenschlager lab) 47Klein et al.2020, and Huh-7.5 cells were maintained in standard DMEM, Calu-3 cells in Advanced DMEM both supplemented with 10% foetal bovine serum, 2 mM L-glutamine, 100 U/ml penicillin, and 10 μg/ml streptomycin and non-essential amino acids. All cell lines tested free of mycoplasma were maintained at 37°C and 5% CO2 in a standard culture incubator.

Virus propagation and infection of cell culture models

SARS-CoV-2 strains: VIC 01/20 (BVIC01) 11Caly et al.2020 (provided by PHE Porton Down after supply from the Doherty Centre Melbourne, Australia) and B.1.1.7 81Tegally et al.2020 (20I/501Y.V1.HMPP1) (provided by PHE Porton Down). Viral strains were propagated in Vero E6 cells as described 91Wing et al.2021. Briefly, naïve Vero E6 cells were infected with SARS-CoV-2 at an MOI of 0.003 and incubated for 48–72 hr until visible cytopathic effect was observed. At this point, cultures were harvested, clarified by centrifugation to remove residual cell debris and stored at –80°C. To determine the viral titre, fresh Vero E6 cells were inoculated with serial dilutions of SARS-CoV-2 viral stocks for 2 hr followed by addition of a semi-solid overlay consisting of 1.5% carboxymethyl cellulose (Sigma). Cells were incubated for 72 hr and visible plaques enumerated by fixing cells using amido black stain to calculate PFU/ml. Similarly, HCoV-229E (Andrew Davidson lab [Bristol] and Peter Simmmonds lab [Oxford]) virus was propagated in Vero E6 cells and TCID50 was performed in Huh-7.5 cells.

For smFISH experiments with the SARS-CoV-2 stains, cells were infected at an MOI of 1 for 2 hr followed by extensive washing in PBS. Residual cell surface-associated virus was removed by trypsin treatment of the cell monolayer for 2 min followed by neutralisation of the trypsin using serum-containing media. Infected cells were then maintained for defined periods up to 24 hr. For the HCoV-229E, cells were infected at an MOI of 1 and were maintained for 24 and 48 hr.

Hamster infection and tissues preparation

Golden Syrian hamsters (Mesocricetus auratus) (males and females) aged 7 weeks old, weighing 96–116 g, were obtained from Envigo, London, UK. Hamsters were housed in individual cages with access to food and water ad libitum. Hamsters were briefly anaesthetised with 5% isoflurane (Zoetis, Leatherhead, UK) and 4 l/m O2 and inoculated by the intranasal route with 5 × 104 PFU/animal of SARS-CoV-2 BVIC01 delivered in 100 μl per nostril (200 μl in total). Hamsters were monitored post-infection for weight, clinical signs and temperature (via implanted temperature chip). On day 4, the hamsters were euthanised by overdose (sodium pentobarbitone [Dolelethal, Vetquinol UK Ltd]) via the intraperitoneal route. At necropsy, lung samples were fixed in 10% buffered formalin at room temperature and embedded in paraffin wax. 4 µm tissue sections were cut.

RT-qPCR

Infected cells were harvested in RLT buffer and RNA extracted using the QIAGEN RNeasy kit. SARS-CoV-2 RNA was quantified using a one-step reverse transcriptase qPCR (RT-qPCR) kit (Takyon) in a multiplexed reaction containing primer probes directed against the SARS-CoV-2 N gene (FAM) and ß–2-microglobulin (VIC) as an internal control. All qPCR reactions were carried out using a Roche 96 Light cycler (Roche) (SARS primer probe IDT CAT:100006770, B2M primer probe Applied Biosystems 4325797).

Single-molecule fluorescence in situ hybridisation (smFISH)

smFISH was carried out as previously reported 83Titlow et al.201893Yang et al.2017 with minor modifications. Briefly, cells were grown on #1.5 round-glass coverslips in 24-well plate or in µ-slides 8-well glass bottom (IBIDI) and fixed in 4% paraformaldehyde (Thermo Fisher) for 30 min at room temperature. Coverslips were cleaned in 80% ethanol with lint-free tissue and kept in 100% ethanol to maintain sterility and cleanliness. Cells were permeabilised in PBS/0.1% Triton X-100 for 10 min at room temperature followed by washes in PBS and 2× SSC. Cells were pre-hybridised in pre-warmed (37°C) wash solution (2× SSC, 10% formamide) twice for 20 min each at 37°C. Hybridisation was carried out in hybridisation solution (2× SSC, 10% formamide, 10% dextran sulphate) containing 500 nM smFISH probes overnight at 37°C. For infection timepoints beyond 24 hr, smFISH probes were added at 1 µM. After the overnight hybridisation, cells were washed for 20 min in pre-warmed wash solution at 37°C followed by counterstain with DAPI (1 µg/ml), phalloidin-Alexa Fluor 488 conjugate (264 nM) and/or CellMask Green (1:1,000,000) diluted in wash solution. Cells were then washed once with wash solution for 20 min at 37°C and twice with 2× SSC for 10 min each at room temperature. Cells were mounted using Vectashield, IBIDI mounting media or 2× SSC.

For RNase digestion experiments, RNaseT1 (Thermo Fisher, EN0541, 100 U/ml) or RNaseIII (M0245S, NEB, 20 U/ml) was used to degrade ssRNA and dsRNA, respectively. Permeabilised cells were treated with RNases in PBS supplemented with 5 mM MgCl2 and incubated at 37°C for 1 hr and washed three times with PBS.

In the experiment to detect viral negative strands, dsRNA was denatured using DMSO, formamide, or NaOH 76Singer et al.202190Wilcox et al.2019. After the permeabilisation step, cells were rinsed in distilled water and were treated with 50mM NaOH for 30s at room temperature, 70% formamide at 70°C for 1hr, or 90% DMSO at 70°C for 1hr. Following the treatments, cells were quickly cooled on ice, washed in ice-cold PBS and subjected to standard smFISH protocol. The smFISH experiments in Figures 3 and !number(5) were performed with DMSO and heat denaturation.

For smFISH on FFPE hamster lungs, the tissue sections were pre-treated as described in 3Annaratone et al.2017 and the probes were hybridised based on the protocol described in 70Rouhanifard et al.2018 with minor modifications. Briefly, tissues were fixed in 10% neutral buffered formalin and sectioned to 5µm slices. Tissue sections were deparaffinised in xylene (2 × 10min), washed in 100% ethanol (2 × 5min) and post-fixed in methanol-acetic acid (3:1v/v) for 5min. Tissues were re-hydrated in an ethanol gradient for 3min each (100%, 85%, 70%, nuclease-free water), heated at 80°C for 1hr in antigen retrieval solution (10mM sodium citrate, pH 6 supplemented with 1:50 RVC), permeabilised in 70% ethanol overnight at 4°C. Then, sections were incubated in 100% ethanol for 5min, air-dried for 5min and tissue-cleared with 8% SDS made up in 2× SSC. Afterwards, standard smFISH procedures were followed.

smFISH probe design and specificity analysis

Candidate smFISH probe sequences were acquired using Stellaris Probe Designer version 4.2 (https://www.biosearchtech.com/stellaris-designer) with the following parameters: organism, human; masking level, 5; oligo length, 20 nt; minimum spacing length, 3 nt. Appropriate region of the SARS-CoV-2 Wuhan-Hu-1 (NC_045512.2) reference sequence was used as target sequence. We BLAST screened candidate probe sequences against custom human transcriptome and intron database to score number of off-target basepair matches, then 35–48 sequences with the least match scores were chosen per probe set. Oligonucleotides were singly labelled with ATTO633, ATTO565, Cy3, or ATTO488 at 3′ ends according to a published protocol 33Gaspar et al.2017 and were concentration normalised to 25µM. All probe sets used in this study had degree of labelling>0.94.

We developed a bespoke pipeline to analysed the sequence specificity of oligonucleotide probe sequences against ORF-1a and ORF-N by alignment against SARS-CoV-1 (NC_004718), SARS-CoV-2 (NC_045512), MERS-CoV (NC_019843), HCoV-229E (NC_002645), HCoV-NL63 (NC_005831), HCoV-OC43 (NC_006213), HCoV-HKU1 (NC_006577), human (GCF_000001405.39), and African green monkey (GCF_015252025.1) RefSeq genome or transcriptome assembly using ‘bowtie2’ (2.4.4) 49Langmead and Salzberg2012. Following bowtie2 arguments were used to find minimum edit distance of oligonucleotide sequences to target genome/transcriptome: --end-to-end --no-unal --align-seed-mm 0, --align-seed-length 5, --align-seed-interval 1–1.15, --effort-extend 15, --effort-repeat 2. Melting temperatures were obtained using ‘rmelting’ (1.8.0) R package at 300mM Na concentration (2× SSC). smFISH probe sequences used in this study are available in Supplementary file 2.

Immunofluorescence

After permeabilisation, cells were blocked in blocking solution (50% LI-COR Odyssey blocking solution, pretreated with RNASecure for 30 min and supplemented with 2 mM ribonucleoside vanadyl complex and 0.1% Tween-20) for 30 min at room temperature. Then, cells were incubated with J2 primary antibody (Scicons 10010200) at 2 µg/ml or human anti-N primary antibody (Ey2A clone 1:2000) 38Huang et al.2020 for 2 hr at room temperature. Cells were washed three times in PBS/0.1% Tween-20 (PBSTw) for 10 min each at room temperature and incubated with fluorescent secondary antibodies (1:500) diluted in blocking solution for 1 hr at room temperature. After further three washes in PBSTw, cells were mounted using Vectashield or IBIDI mounting media. For combined smFISH and immunofluorescence, antibody staining was carried out sequentially after the smFISH protocol.

Microscopy and image handling

Cells were imaged on an Olympus SpinSR10 spinning disk confocal system equipped with Prime BSI and Prime 95B sCMOS cameras. Objectives used were ×20 dry (0.8 NA, UPLXAPO20X), ×60 silicone oil (1.3 NA, UPLSAPO60XS2), ×60 oil (1.5 NA, UPLAPOHR60X), or ×100 oil (1.45 NA, UPLXAPO100XO). Image voxel sizes were 0.55 × 0.55 × 2 µm (x:y:z) with the ×20 objective and 0.11 × 0.11 × 0.2 µm (x:y:z) with the ×60 and ×100 objectives. Automatic and manual image acquisition and image stitching were performed with Olympus cellSens Dimension software. Images were uploaded and stored in the University of Oxford OMERO server 2Allan et al.2012, and OMERO.figure (3.2.0) was used to generate presented image visualisations.

Image analysis

Cell segmentation and counting

Cell segmentation was performed either manually in ImageJ (National Institute of Health) or automatically with Cellpose (0.6.1) 78Stringer et al.2021 using 2D maximum intensity projected images of phalloidin or CellMask stains. Cellpose parameters for ×60 and ×100 magnification images were model_type = cyto, diameter = 375, flow_threshold = 0.9, cellprob_threshold=-3. For 20× stitched images, CellMask channel was deconvolved with constrained iterative module using cellSens (five iterations, default spinning disk PSF, Olympus), then the following Cellpose parameters were used: model_type = cyto, diameter = 55, flow_threshold = 0, cellprob_threshold=-6. Total number of cells per image was counted using a custom ImageJ macro script or from the Cellpose segmentation output on DAPI channel images (model_type = nuclei, diameter = 20, default threshold). Infected cells were counted using ImageJ ‘3D object counter’ or manually.

Quantification of smFISH images

Single-molecule-level quantification of smFISH images was performed either with FISH-quant 59Mueller et al.2013 or Bigfish 40Imbert et al.2021. For FISH-quant, ImageJ region of interest (ROI) files were converted to FQ outline file using a custom Python script. Then, smFISH channels were Laplacian of Gaussian filtered (sigma = 7, 3 px) and pre-detected using local maximum mode with ‘allow smaller z region for analysis’ option enabled. Pre-detected diffraction limited spots were fitted with 3D Gaussian and thresholded in batch mode based on filtered intensity, amplitude, and σz. Thresholds were defined by uninfected ‘Mock’ condition samples. The filtering also removed non-specific autofluorescence and rare dust particles because these contaminants usually show lower fluorescence intensity and are highly variable in shapes.

Large smFISH datasets were processed with a custom Python pipeline using Bigfish, skimage, and numpy libraries (available in the GitHub repository). Tif files were converted to a numpy array, and individual cells were segmented from the image using the Cellpose library as described above. Images where cells were labelled with the CellMask stain were pre-processed with a median filter, radius = 50. Background signal in the smFISH channel was subtracted with the skimage.white_tophat algorithm (radius = 5, individual z frames were processed in 2D due to memory constraints, results were indistinguishable from 3D-processed images). Threshold setting for smFISH spot detection was set specifically for each set of images collected in each session.

Cells with high viral RNA (>105-6 RNA counts) were quantified by integrating smFISH channel intensities within entire cellular volumes and comparing to the reference integrated intensity of single molecules derived from cells with lower infection density. Reference single-molecule images were obtained using ‘Average spots’ in TxSite mode of FISH-quant or ‘build_reference_spot()’ function in Bigfish.

Viral factories were defined as gRNA smFISH signals with spatially extended foci that exceed the point-spread function of the microscope and intensity of the reference single molecules. In FISH-quant, the foci were quantified using the TxSite quantification mode (xy:z = 500:1200 nm crop per factory) with normal-sampled averaged single-molecule image (xy:z = 15:12 px) from the batch mode output. Then, ‘Integrated intensity in 3D’ method was used to compare integrated intensity of the viral factory to that of averaged single-molecule RNA. In Bigfish, the factories were resolved using ‘decompose_cluster()’ function to find a reference single-molecule spot in a less signal-dense region of the image, which was used to simulate fitting of reference single-molecule spots into viral factories until the local signal intensities are matched. The candidate factories were filtered based on the previously reported radii of DMVs measured by electron microscopy (150 nm pre-8 hpi and 200 nm post-8 hpi) 19Cortese et al.2020. In addition, we applied a threshold of 3–7 RNA molecules per factory as a technical cut-off to prevent overestimation or over-cluster of viral factories at later infection timepoints.

Dual-colour smFISH spot detection analysis

The same viral RNA target was detected using two smFISH probes labelled with alternating (ODD and EVEN) red and far-red fluorochromes. Resulting images were processed in FISH-quant to obtain 3D coordinates of each spots. Percentage co-localisation analysis was performed with a custom script using an R package ‘FNN’ (1.1.3). Briefly, we calculated 3D distance of nearest neighbour for each spot in the red channel to the closest detected spot in the other channel and repeated the analysis starting from the far-red channel. We then used a value of 300 nm to define co-localised spots corresponding to the same viral RNA molecule. The presented visuals report percentage co-localisations calculated from the red channel to the far-red channel and vice versa. The analysis was performed per field of view.

Quantification of fluorescence intensity and signal co-localisation

Immunofluorescence images were background subtracted using rolling ball subtraction method (radius = 150 px) in ImageJ. Anti-dsRNA (J2) stain was quantified by integrating fluorescence signal across the z-stacks of cellular ROI divided by the cell volume to obtain signal density. Signal density was normalised to the average signal density of uninfected ‘Mock’ condition cells. Fluorescence intensity profiles were obtained using ImageJ ‘plot profile’ tool across 3 µm region on 1 µm maximum intensity projected images. To assess co-localisation of N protein with SARS-CoV-2 RNA, ellipsoid mask centred around centroid xyz coordinates of smFISH spots was generated with the size of the point-spread function (xy radius = 65 nm, z radius = 150 nm) using ImageJ 3D suite. Integrated density of N protein channel (background subtracted, radius = 5 px) fluorescence within the ellipsoid mask was measured and compared to the equivalent signal in the uninfected condition or randomly distributed ellipsoids.

Calculation of RNA spatial dispersion index

Subcellular spatial distribution metrics of SARS-CoV-2 RNA species were quantified using the RNA Distribution Index (RDI) calculator 79Stueland et al.2019. Nuclei and cell boundaries were pre-segmented in ImageJ using the ‘Auto Threshold’ function on DAPI (‘method = Huang’) or CellMask green (‘method = MaxEntropy’) channels. Resulting images were maximum intensity projected and passed into the RDI calculator MATLAB script. Standard RDI calculator graphical user interface was used without background intensity removals.

Simulation of super-permissive cell distribution

Simulations were performed to determine if the appearance of SARS-CoV-2 super-permissive cells follows a random distribution. The general strategy was to test the complete spatial randomness hypothesis by comparing the average nearest-neighbour distance of superinfected cells to an equal number of randomly selected coordinates 69Ripley1979. 2D spatial coordinates of superinfected cells were obtained from the 3D object counter (ImageJ) as described above. Cell nuclei were segmented with the DAPI channel, and placement of random coordinates was confined to pixels that fell within the DAPI segmentation mask. Nearest-neighbour distances were calculated using the KDtree algorithm 57Maneewongvatana and Mount1999 implemented in Python (scipy.spatial.KDTree). Pseudo-random distributions were simulated by randomly placing the first coordinate, then constraining the placement of subsequent coordinates to within a defined number of pixels. Rn nearest-neighbour statics 61Pinder and Witherick1972 were calculated according to the following equation, where D(obs) is the average nearest-neighbour distance (µm), a is the total imaged area (µm), and n is the number of super-permissive cells. Rn value of 1 suggests a random distribution, whereas Rn < 1 indicates clusteredness and Rn > 1 indicates regular distributions.

RNA-sequencing library preparation

RNA from infected cells were extracted as described above. Sequencing libraries were prepared using the Illumina Total RNA Prep with Ribo-Zero Plus library kit (Cat# 20040525) according to the manufacturer’s guidelines. Briefly, 100 ng of total RNA was first depleted of the abundant ribosomal RNA present in the samples by rRNA-targeted DNA probe capture followed by enzymatic digestion. Samples were then purified by Beckman Coulter RNAClean XP beads (Cat# A63987). Obtained rRNA-depleted RNA was fragmented, reverse transcribed, converted to dsDNA, end repaired, and A-tailed. The A-tailed DNA fragments were ligated to anchors allowing for PCR amplification with Illumina unique dual indexing primers (Cat# 20040553). Libraries were pooled in equimolar concentrations and sequenced on Illumina NextSeq 500 and NextSeq 550 sequencers using high-output cartridges (Cat# 20024907), generating single 150-nt-long reads.

RNA-sequencing analysis

Genomes

We downloaded the human genome primary assembly and annotation from Ensembl (GRCh38.99) and the SARS-CoV-2 RefSeq reference genome from NCBI (NC_045512.2). We combined the human and viral genome and annotation files into one composite genome and annotation file for downstream analyses.

Alignment and gene counts

We performed a splice-site-aware mapping of the sequencing reads to the combined human and SARS-CoV-2 genome and annotation using STAR aligner (2.7.3a) 25Dobin et al.2013. We also used STAR to assign uniquely mapping reads in strand-specific fashion to the Ensembl human gene annotation and the two SARS-CoV-2 strains.

Principal component analysis

To assess if SARS2 infection is the main driver of differences in the RNA-seq samples, we performed a principal component analysis (PCA). First, we performed library size correction and variance stabilisation with regularised-logarithm transformation implemented in DESeq2 (1.28.1) 55Love et al.2014. This corrects for the fact that in RNA-seq data variance grows with the mean, and therefore, without suitable correction, only the most highly expressed genes drive the clustering. We then used the 500 genes showing the highest variance to perform PCA using the prcomp function implemented in the base R package stats (4.0.2).

Differential expression analysis

We performed differential expression analysis using the R package DESeq2 (1.28.1) 55Love et al.2014. DESeq2 estimates variance-mean dependence in count data from high-throughput sequencing data and tests for differential expression based on a model using the negative binomial distribution. Full output of DESeq2 analysis is available in Supplementary file 3.

SARS-CoV-2 subgenomic RNA expression

To assess relative levels of viral subgenomic and genomic RNA expression, we tallied the alignments (using GenomicRanges and GenomicAlignments R packages; 51Lawrence et al.2013) mapping to the region unique to the genomic RNA and the shared region and normalised for their respective lengths. Unique contribution of sgRNA region was then estimated by subtracting the contribution of the genomic RNA from the shared region. In order to assess expression of individual SARS-CoV-2 subgenomic RNAs, we extracted split (junction) reads mapping to the viral genome with the GenomicAlignments R package (1.24.0) 51Lawrence et al.2013. The subgenomic transcripts fully overlap the full genomic RNA and partially with each other. While the molecular process generating these subgenomic RNAs is distinct from RNA splicing, from the point of view of short read mapping they are equivalent. We determined the relative expression level of each sgRNA generated by transcriptional skipping by calculating the number of reads supporting skipping into a region upstream of each annotated viral ORF. To avoid spurious mappings, we filtered for skip sites that were present in all three replicates and constituted at least 0.1% of all skipped viral reads.

Statistics, data wrangling, and visualisation

Statistical analyses were performed in R (3.6.3) and RStudio (1.4) environment using an R package ‘rstatix’ (0.7.0). p-Values were adjusted using the Bonferroni method for multiple comparisons. The ‘tidyverse suite’ (1.3.0) was used in R, and ‘Numpy’ and ‘Pandas’ Python packages were used in Jupyter notebook for data wrangling. The following R packages were used to create the presented visualisation: ‘ggplot2’ (3.3.2), ‘ggbeeswarm’ (0.6.0), ‘hrbrthemes’ (0.8.0), ‘scales’ (1.1.1), and ‘patchwork’ (1.1.1). Further visual annotations were made in the Affinity Designer (Serif).

References

    SARS-CoV-2 genomic and subgenomic RNAs in diagnostic samples are not an indicator of active replication11Nature Communications
    OMERO: flexible, model-driven data management for experimental biology9Nature Methods245253
    Quantification of HER2 and estrogen receptor heterogeneity in breast cancer by single-molecule RNA fluorescence in situ hybridization8Oncotarget1868018698
    Inhibition of vesicular stomatitis viral mRNA synthesis by interferons61Journal of Virology653660
    Detection of SARS-CoV-2 in formalin-fixed paraffin-embedded tissue sections using commercially available reagents100Laboratory Investigation14851489
    Single-cell heterogeneity and cell-cycle-related viral gene bursts in the human leukaemia virus HTLV-12Wellcome Open Research
    Translation and Replication Dynamics of Single RNA Viruses183Cell19301945
    Cellular 5’-3’ mRNA exonuclease Xrn1 controls double-stranded RNA accumulation and anti-viral responses17Cell Host & Microbe332344
    Rapid Decay of Host Basal MRNAs during SARS-CoV-2 Infection Perturbs Host Antiviral MRNA Biogenesis and ExportbioRxiv
    Infection sustained by lineage B.1.1.7 of SARS-CoV-2 is characterised by longer persistence and higher viral RNA loads in nasopharyngeal swabs105International Journal of Infectious Diseases753755
    Isolation and rapid sharing of the 2019 novel coronavirus (SARS-CoV-2) from the first patient diagnosed with COVID-19 in Australia212The Medical Journal of Australia459462
    Comprehensive Comparison of RNA-Seq Data of SARS-CoV-2, SARS-CoV and MERS-CoV Infections: Alternative Entry Routes and Innate Immune Responses12Frontiers in Immunology
    Phosphoregulation of Phase Separation by the SARS-CoV-2 N Protein Suggests a Biophysical Basis for its Dual Functions80Molecular Cell10921103
    Detection of SARS-CoV-2 by RNAscope in situ hybridization and immunohistochemistry techniques165Archives of Virology23732377
    Mutations Strengthened SARS-CoV-2 Infectivity432Journal of Molecular Biology52125226
    Single-Molecule Sensitivity RNA FISH Analysis of Influenza Virus Genome Trafficking1836Methods in Molecular Biology195211
    Comparative tropism, replication kinetics, and cell damage profiling of SARS-CoV-2 and SARS-CoV with implications for clinical manifestations, transmissibility, and laboratory studies of COVID-19: an observational study1The Lancet. Microbe
    Sensitivity of SARS-CoV-2 B.1.1.7 to mRNA vaccine-elicited antibodies593Nature136141
    Integrative Imaging Reveals SARS-CoV-2-Induced Reshaping of Subcellular Morphologies28Cell Host & Microbe853866
    Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England372Science
    Defectiveness of interferon production and of rubella virus interference in a line of African green monkey kidney cells (Vero)2Journal of Virology955961
    Mitochondrial double-stranded RNA triggers antiviral signalling in humans560Nature238242
    Characterisation of B.1.1.7 and Pangolin Coronavirus Spike Provides Insights on the Evolutionary Trajectory of SARS-CoV-2bioRxiv
    Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein16PLOS Pathogens
    STAR: ultrafast universal RNA-seq aligner29Bioinformatics1521
    Development of a Hamster Natural Transmission Model of SARS-CoV-2 Infection13Viruses
    Ultrastructural modifications induced by SARS-CoV-2 in Vero cells: a kinetic analysis of viral factory formation, viral particle morphogenesis and virion release78Cellular and Molecular Life Sciences35653576
    Visualization of single RNA transcripts in situ280Science585590
    Single cell resolution of SARS-CoV-2 tropism, antiviral responses, and susceptibility to therapies in primary human airway epithelium17PLOS Pathogens
    A chance at survival: gene expression noise and phenotypic diversification strategies71Molecular Microbiology13331340
    Emergence of SARS-CoV-2 B.1.1.7 Lineage - United States70MMWR. Morbidity and Mortality Weekly Report9599
    System-wide Profiling of RNA-Binding Proteins Uncovers Key Regulators of Virus Infection74Molecular Cell196211
    Enzymatic production of single-molecule FISH and RNA capture probes23RNA15821591
    SARS-CoV-2 detection in formalin-fixed paraffin-embedded tissue specimens from surgical resection of tongue squamous cell carcinoma73Journal of Clinical Pathology754757
    Disruption of the Golgi Apparatus and Contribution of the Endoplasmic Reticulum to the SARS-CoV-2 Replication Complex13Viruses
    Rolling Signal-Based Ripley’s K: A New Algorithm to Identify Spatial Patterns in Histological SpecimensbioRxiv
    SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor181Cell271280
    Breadth and Function of Antibody Response to Acute SARS-CoV-2 Infection in HumansbioRxiv
    Comprehensive in vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms81Molecular Cell584598
    https://github.com/fish-quant/big-fish
    Genomic RNA Elements Drive Phase Separation of the SARS-CoV-2 Nucleocapsid80Molecular Cell10781091
    DNA nanoscaffold-based SARS-CoV-2 detection for COVID-19 diagnosis167Biosensors & Bioelectronics
    S-Variant SARS-CoV-2 Lineage B1.1.7 Is Associated With Significantly Higher Viral Load in Samples Tested by TaqPath Polymerase Chain Reaction223The Journal of Infectious Diseases16661670
    The Architecture of SARS-CoV-2 Transcriptome181Cell914921
    The Essential Role of Double-Stranded RNA-Dependent Antiviral Signaling in the Degradation of Nonself Single-Stranded RNA in Nonimmune Cells201Journal of Immunology10441052
    Densely Sampled Viral Trajectories Suggest Longer Duration of Acute Infection with B.1.1.7 Variant Relative to Non-B.1.1.7 SARS-CoV-2medRxiv
    SARS-CoV-2 structure and replication characterized by in situ cryo-electron tomography11Nature Communications
    Expression of SARS-CoV-2 Entry Factors in the Pancreas of Normal Organ Donors and Individuals with COVID-1932Cell Metabolism10411051
    Fast gapped-read alignment with Bowtie 29Nature Methods357359
    Morphometry of SARS-CoV and SARS-CoV-2 particles in ultrathin plastic sections of infected Vero cell cultures11Scientific Reports
    Software for computing and annotating genomic ranges9PLOS Computational Biology
    Development of immunohistochemistry and in situ hybridisation for the detection of SARS-CoV and SARS-CoV-2 in formalin-fixed paraffin-embedded specimens10Scientific Reports
    SARS-CoV-2 induces double-stranded RNA-mediated innate immune responses in respiratory epithelial-derived cells and cardiomyocytes118PNAS
    Molecular detection of SARS-CoV-2 in formalin-fixed, paraffin-embedded specimens5JCI Insight
    Moderated estimation of fold change and dispersion for RNA-seq data with DESeq215Genome Biology
    SARS-CoV-2 within-host diversity and transmission372Science
    Data Structures, Near Neighbor Searches, and Methodology105123
    Correlative multi-scale cryo-imaging unveils SARS-CoV-2 assembly and egress12Nature Communications
    FISH-quant: automatic counting of transcripts in 3D FISH images10Nature Methods277278
    Altered Subgenomic RNA Expression in SARS-CoV-2 B.1.1.7 InfectionsbioRxiv
    The Principles, Practice and Pitfalls of Nearest-neighbour Analysis57Geography277288
    Sensitivity of infectious SARS-CoV-2 B.1.1.7 and B.1.351 variants to neutralizing antibodies27Nature Medicine917924
    Imaging individual mRNA molecules using multiple singly labeled probes5Nature Methods877879
    Single-molecule approaches to stochastic gene expression38Annual Review of Biophysics255270
    Viral genome imaging of hepatitis C virus to probe heterogeneous viral infection and responses to antiviral therapies494Virology236247
    Single-cell longitudinal analysis of SARS-CoV-2 infection in human airway epithelium identifies target cells, alterations in gene expression, and cell state changes19PLOS Biology
    The effect of spike mutations on SARS-CoV-2 neutralization34Cell Reports
    Sensitive Visualization of SARS-CoV-2 RNA with CoronaFISHbioRxiv
    Tests of “Randomness” for Spatial Point Patterns41Journal of the Royal Statistical Society368374
    ClampFISH detects individual nucleic acid molecules using click chemistry-based amplification37Nature Biotechnology8489
    Comparison of rhesus and cynomolgus macaques as an infection model for COVID-1912Nature Communications
    Interferon-stimulated genes and their antiviral effector functions1Current Opinion in Virology519525
    Minus-strand copies of replicating coronavirus mRNAs contain antileaders65Journal of Virology320325
    Spatiotemporal analysis of hepatitis C virus infection11PLOS Pathogens
    Understanding the outcomes of COVID-19 - does the current model of an acute respiratory infection really fit?102The Journal of General Virology
    Quantitative measurements of early alphaviral replication dynamics in single cells reveals the basis for superinfection exclusion12Cell Systems210219
    Continuous and Discontinuous RNA Synthesis in Coronaviruses2Annual Review of Virology265288
    Cellpose: a generalist algorithm for cellular segmentation18Nature Methods100106
    RDI Calculator: An Analysis Tool to Assess RNA Distributions in Cells9Scientific Reports
    Visualization of double-stranded RNA in cells supporting hepatitis C virus RNA replication82Journal of Virology21822195
    Emergence and Rapid Spread of a New Severe Acute Respiratory Syndrome-Related Coronavirus 2 (SARS-CoV-2) Lineage with Multiple Spike Mutations in South AfricamedRxiv
    Evolution of Enhanced Innate Immune Evasion by the SARS-CoV-2 B.1.1.7 UK VariantbioRxiv
    Super-Resolution Single Molecule FISH at the Drosophila Neuromuscular Junction1649Methods in Molecular Biology163175
    Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England593Nature266269
    Coronavirus biology and replication: implications for SARS-CoV-219Nature Reviews. Microbiology155170
    Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus94Journal of Virology
    The SARS-CoV-2 subgenome landscape and its novel regulatory features81Molecular Cell21352147
    Emergence and rapid transmission of SARS-CoV-2 B.1.1.7 in the United States184Cell25872594
    Double-stranded RNA is produced by positive-strand RNA viruses and DNA viruses but not in detectable amounts by negative-strand RNA viruses80Journal of Virology50595064
    Next-generation sequencing of dsRNA is greatly improved by treatment with the inexpensive denaturing reagent DMSO5Microbial Genomics
    Hypoxic and pharmacological activation of HIF inhibits SARS-CoV-2 infection of lung epithelial cells35Cell Reports
    Double-Membrane Vesicles as Platforms for Viral Replication28Trends in Microbiology10221033
    Single molecule fluorescence in situ hybridisation for quantitating post-transcriptional regulation in Drosophila brains126Methods166176
    GCG inhibits SARS-CoV-2 replication by disrupting the liquid phase condensation of its nucleocapsid protein12Nature Communications