PAVS enriches for specific viral species from a heterogeneous sample
PCR-Activated Virus Sorting (PAVS) allows specific viruses in a mixed population to be detected and recovered by sorting. This is accomplished by encapsulating the viruses in double emulsion droplets using microfluidic technology and performing PCR in each droplet to probe for sequences of interest. Because the viruses are encapsulated at an average of 0.1 virions per droplet, most droplets are empty or contain a single virion, in accordance with Poisson statistics (Fig. 1). If the target virus is present in a droplet, PCR amplification occurs, generating a fluorescent signal that allows the droplet to be detected and recovered by FACS. Because microfluidics can encapsulate individual virions in droplets rapidly (>1 kHz), millions of single virus particles can be sorted in a few hours.
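The occupancy statistics implied by this loading can be checked with a short calculation. The following is a minimal sketch, assuming ideal Poisson encapsulation at a mean of 0.1 virions per droplet; the function and variable names are illustrative and not part of the published method.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k virions at mean loading lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 0.1  # mean virions per droplet, as stated in the text

p_empty = poisson_pmf(0, lam)
p_single = poisson_pmf(1, lam)
p_multi = 1.0 - p_empty - p_single  # two or more virions in one droplet

# Among droplets that contain any virus, the share holding exactly one virion
single_given_occupied = p_single / (1.0 - p_empty)

print(f"empty: {p_empty:.4f}  single: {p_single:.4f}  multiple: {p_multi:.4f}")
print(f"single among occupied droplets: {single_given_occupied:.4f}")
```

Under this assumption about 90.5% of droplets are empty, about 9.0% hold one virion, and about 0.5% hold more than one, so roughly 95% of occupied droplets contain a single virion.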
Specific detection and quantification of viral genomes
To identify a virus in a droplet, PAVS uses a PCR assay interrogating for sequences that exist within the target species. In this way, the PCR primers are analogous to antibodies when sorting cells with FACS, providing a detectable fluorescence signal only when the target species is present. However, whereas generating high affinity antibodies against a target virus can be challenging, especially if it is uncultivable, designing specific PCR primers is straightforward. This makes PAVS general, allowing it to recover any virus of interest for which PCR primers can be designed. To illustrate this, we perform digital TaqMan PCR on separate samples containing T4, ФX174, or lambda virus, using probes specific for only bacteriophage T4 (Additional file 1: Table S1). The droplets are visualized using epifluorescence imaging after thermal cycling. As expected, we observe TaqMan-positive droplets in the T4 sample, demonstrating successful amplification when this virus is present (Fig. 2a). By contrast, TaqMan fluorescence is absent in the ФX174 and lambda negative controls (Fig. 2a), confirming that the reaction is specific. This shows that our TaqMan PCR assay can be used to differentiate between single virus particles of these species.
In addition to enabling the detection of specific viruses in a sample, PAVS can count individual virus particles. To demonstrate this, we analyze a dilution series of T4 bacteriophage, reading out the results with fluorescence microscopy and image analysis. We find that, as expected, the fraction of positive droplets is directly proportional to T4 concentration (Fig. 2b). This is due to the viruses being loaded at limiting dilution, such that most droplets are empty but a small fraction contain virus particles. Under such conditions, the viruses are encapsulated individually and the number of droplets containing a virus is approximately equal to the number of viruses in the sample, in accordance with random Poisson encapsulation. As with qPCR, the minimum number of viruses necessary for PAVS depends on the specificity of the TaqMan assay. A strength of TaqMan assays is that they are highly specific, allowing confident detection of rare virus species. For example, in initial tests of this approach, we found that the rate of non-specific amplification in a droplet is 1 in ~100,000, so that viruses present at a higher frequency than this can be confidently detected. Since we can routinely sort >2 million droplets in a single FACS run, this allows us to detect as few as ~20 virus particles in a sample.
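The counting argument and the detection floor can be sketched numerically. The code below is an illustration, assuming ideal Poisson loading and taking the false-positive rate (1 in 100,000) and run size (2 million droplets) from the text; the function names are hypothetical.

```python
import math


def mean_virions_per_droplet(positive_fraction: float) -> float:
    """Estimate mean loading by inverting P(positive) = 1 - exp(-lam)."""
    if not 0.0 <= positive_fraction < 1.0:
        raise ValueError("positive_fraction must be in [0, 1)")
    return -math.log1p(-positive_fraction)


def expected_false_positives(n_droplets: int, false_positive_rate: float) -> float:
    """Expected number of droplets that light up without containing the target."""
    return n_droplets * false_positive_rate


# At limiting dilution the correction is small: 1% positive droplets
# corresponds to a mean loading only slightly above 0.01 virions per droplet.
lam_estimate = mean_virions_per_droplet(0.01)

# Values quoted in the text: >2 million droplets per run, ~1/100,000 false positives
noise_floor = expected_false_positives(2_000_000, 1e-5)

print(f"estimated loading: {lam_estimate:.5f} virions per droplet")
print(f"expected false positives per run: {noise_floor:.0f}")
```

The near-identity between positive fraction and mean loading at low occupancy is why the positive fraction scales linearly with concentration, and the roughly 20 expected false positives per run correspond to the ~20 particle detection limit quoted above.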
Multiplexed digital PCR can detect full-length virus genomes
A unique and valuable property of PAVS is that it can differentiate between viruses that contain just one target sequence and others that contain multiple. This is possible because TaqMan PCR can be multiplexed using probes targeting different sequences labeled with fluorescent dyes of different colors. Hence, viruses containing one sequence will be positive only at one color, whereas those with two sequences will be positive for two colors. These populations can then be separated by gating the fluorescence measurements to recover single- or double-positive droplets. To demonstrate the ability to multiplex the reaction, we synthesize primers and a Cy5 TaqMan probe targeting a genomic region near the 5' end of the lambda genome, and primers and FAM TaqMan probes targeting regions at increasing distances from the 5' end. The primer and probe sequences are listed in Additional file 1: Table S1, and a graphical representation of the probe locations, with the Cy5 TaqMan probe in red and the FAM TaqMan probes in green, is provided in Fig. 3a. Lambda DNA is combined with the PCR reagents and the sample is emulsified using the microfluidic device. After thermal cycling, the droplets are imaged using fluorescence microscopy (Fig. 3b) and analyzed to measure their intensity on the Cy5 and FAM channels (Fig. 3c). The droplets are characterized as positive for both targets (FAM+Cy5+), positive for one target (FAM+Cy5−, FAM−Cy5+), or negative for both (FAM−Cy5−). Each multiplexed PCR is performed in triplicate, containing 5000–8000 droplets.
We observe less multiplexing when probe pairs are far apart, indicating that the probability that two target sequences exist within a given genome decreases for sequences that are more separated (Fig. 3d, blue curve); this implies that the genomes might be partially fragmented. To investigate this further, we perform a negative control in which we digest the genome with a restriction endonuclease cleaving at position 10,086 base pairs (bp), which is between the first and second FAM probes. If fragmentation is the source of lowered multiplexing signal, then the fraction of double-positives should fall precipitously beyond the cleavage point; indeed, this is what we observe, as shown by the red curve in Fig. 3d. As an additional negative control, we digest the lambda genome using a non-specific endonuclease (Fragmentase) producing ~500 bp products, and observe that double-positives are rare for all probe pairs (green curve). This demonstrates that PAVS can characterize the length distributions of viral genomes in a solution and, more generally, the presence of multiple genetic loci in a target virus; this should be useful for studying correlations between loci in single viruses that are on the same linear molecule or on entirely different molecules, such as in segmented virus genomes. PAVS can also be used to characterize the integrity of viral genomes.
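The two-color gating described in this section can be sketched as a simple threshold classification. This is a minimal illustration only: the gate values, the toy intensities, and the choice to normalize double-positives by all droplets positive on at least one channel are assumptions, not details taken from the study.

```python
from collections import Counter


def classify_droplet(fam: float, cy5: float, fam_gate: float, cy5_gate: float) -> str:
    """Assign a droplet to one of four populations by thresholding each channel."""
    fam_label = "FAM+" if fam >= fam_gate else "FAM-"
    cy5_label = "Cy5+" if cy5 >= cy5_gate else "Cy5-"
    return fam_label + cy5_label


def double_positive_fraction(droplets, fam_gate: float, cy5_gate: float) -> float:
    """Fraction of non-empty droplets that are positive on both channels."""
    counts = Counter(classify_droplet(f, c, fam_gate, cy5_gate) for f, c in droplets)
    positives = sum(n for label, n in counts.items() if label != "FAM-Cy5-")
    return counts["FAM+Cy5+"] / positives if positives else 0.0


# Toy (FAM, Cy5) intensities in arbitrary units; hypothetical, not measured data
toy_droplets = [(850, 900), (40, 920), (880, 30), (35, 25), (870, 910)]

fraction = double_positive_fraction(toy_droplets, fam_gate=500, cy5_gate=500)
print(f"double-positive fraction: {fraction:.2f}")
```

Repeating this calculation for each FAM probe position would give a double-positive fraction as a function of distance from the Cy5 target, which is the kind of relationship plotted in Fig. 3d.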
PAVS allows target virus to be sorted out of a mixed population
The PAVS workflow consists of two steps: a first in which target viruses are detected using single virus PCR in droplets, and a second in which the droplets are sorted to recover the target viruses. To demonstrate this, we construct a mixed sample of two bacteriophages, T4 and ФX174, at a ratio of 1:999, respectively. This 0.1% T4 spike-in is encapsulated at limiting dilution in droplets with PCR primers specific for T4 phage, thermally cycled, and stained with SYBR Green. If a particular droplet contains T4, the nucleic acids targeted by the PCR primers are amplified and the SYBR stain produces a fluorescent signal that fills the droplet. The fluorescence signal is detected with FACS and the positive droplets are sorted into a 1.5 mL microcentrifuge tube.
To validate that the PAVS workflow enriches for T4 over ФX174, we quantify virus concentrations in the sorted and unsorted pools using qPCR. The sorted droplets are ruptured and the viral genomes amplified by ddMDA. Equal concentrations of T4 and ФX174 DNA from the unsorted and sorted emulsions are subjected to qPCR (Fig. 4). The primers used to detect T4 target a different locus than the ones for PAVS sorting (Additional file 1: Table S1). The qPCR curve for T4 shifts to lower cycles post-sorting, demonstrating that T4 has been enriched. By contrast, the curve shifts to higher cycle numbers for ФX174, indicating that this virus has been de-enriched by sorting, as expected. To quantify the degree of enrichment, we compute an enrichment factor e defined as,
$$ e = \frac{\left(n + 1\right)\left(\frac{1}{2^{\Delta C_t^{\mathrm{T4}}}}\right)}{\left(\frac{1}{2^{\Delta C_t^{\mathrm{T4}}}}\right) + n\left(\frac{1}{2^{\Delta C_t^{\Phi\mathrm{X174}}}}\right)}, $$
where n is the initial ratio of ФX174 to T4, and $\Delta C_t^{\mathrm{T4}}$ and $\Delta C_t^{\Phi\mathrm{X174}}$ are the differences in threshold cycle ($C_t$) values between the unsorted and sorted samples for T4 and ФX174, respectively. For this experiment, n = 999, $\Delta C_t^{\mathrm{T4}}$ is 2.16, and $\Delta C_t^{\Phi\mathrm{X174}}$ is 5.45, yielding e = 9.69, indicating that the final sample is enriched by about tenfold for T4 from an initial concentration of 0.1%. The degree of enrichment is tunable over a large range, as the rarer the target is in the droplets before sorting, the more it is enriched thereafter. Conversely, if the target is abundant, then many droplets will be positive and only a minor enrichment is possible. To increase enrichment, the sample can thus be diluted prior to partitioning in droplets, which reduces the rate of co-encapsulation of different viruses and false-positive recovery of off-target species. The enrichment possible is also limited by the false-positive rate of droplet detection, which sets an upper bound on how much the sample can be diluted. The false-positive rate for our TaqMan assay is ~1/100,000 droplets, setting a theoretical upper enrichment limit of ~100,000×; however, the maximum enrichment achieved in practice can also be limited by other considerations, such as the specificity of the assays or the number of positive viruses that must be recovered for downstream characterization.
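As a check on the arithmetic, the enrichment factor can be evaluated directly from the expression for e and the values reported for this experiment. This is a minimal sketch; the function and argument names are illustrative.

```python
def enrichment_factor(n: float, dct_target: float, dct_background: float) -> float:
    """Evaluate e = (n + 1) * 2^-dCt_target / (2^-dCt_target + n * 2^-dCt_background)."""
    target_term = 2.0 ** (-dct_target)
    background_term = 2.0 ** (-dct_background)
    return (n + 1.0) * target_term / (target_term + n * background_term)


# Values reported in the text: n = 999, dCt(T4) = 2.16, dCt(PhiX174) = 5.45
e = enrichment_factor(n=999, dct_target=2.16, dct_background=5.45)
print(f"enrichment factor e = {e:.2f}")
```

With the rounded ΔC_t values given in the text, this evaluates to approximately 9.7, in line with the reported e = 9.69. When both ΔC_t values are equal, the expression returns 1, meaning no enrichment.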
PAVS recovery of single virions from a mixed sample
Common FACS instruments can pool all positive droplets into one well or dispense controlled numbers into different wells, including down to single droplets. This is commonly used to isolate cells for single cell analysis (Fig. 5a). When combined with PAVS, this allows a heterogeneous mixture of viruses to be sorted, to isolate specific virions in the sample, which can then be subjected to additional analyses, such as qPCR. To illustrate this, we sort a sample of lambda virus with PAVS and dispense the positive droplets into wells in controlled numbers (Additional file 1: Table S1, Lambda FWD 2, Lambda REV 2, and Lambda probe 2). We load 1, 10, or 50 positive droplets into each well and analyze the recovered material with qPCR for primers targeting a different portion of the lambda genome than was amplified in the PAVS detection (Additional file 1: Table S1). The $C_t$ values decrease as the number of viruses dispensed increases, indicating that the viruses are present at higher numbers (Fig. 5b). When fewer than 50 viruses are sorted, it is difficult to reliably detect them in the sorted wells; wells with 10 viruses show amplification at $C_t$ values of 33, while single viruses do not amplify above the negative controls.
To confirm that the sorting is specific, we generate qPCR curves for wells containing 50 positive droplets and wells containing 50 unsorted droplets. The qPCR curve shifts left by an average of 4.24 $C_t$ values, demonstrating that the lambda virus is more abundant in the sorted sample (Fig. 5c). While our results show that single viruses provide too little DNA for detection with standard qPCR, other post-sorting amplification methods could be implemented to improve sensitivity, such as higher efficiency PCR reagents, nested PCR [31], or non-specific ddMDA followed by qPCR [28]. Because DNA can fragment under flow through narrow channels, the ability to perform multiplexed TaqMan assays in the droplets can be used to identify and dispense only intact viral genomes into the wells.
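A threshold-cycle shift can be converted into an approximate fold difference in template abundance. The sketch below assumes ideal amplification, with the product doubling every cycle; this efficiency was not reported for the assay, and the function name is illustrative.

```python
def fold_change_from_ct_shift(delta_ct: float, efficiency: float = 1.0) -> float:
    """Template ratio implied by a Ct shift.

    efficiency = 1.0 is the assumed ideal case of perfect doubling per cycle;
    lower values model less efficient amplification.
    """
    return (1.0 + efficiency) ** delta_ct


# Average leftward shift reported for sorted versus unsorted wells
fold = fold_change_from_ct_shift(4.24)
print(f"approximate fold difference: {fold:.1f}")
```

Under the perfect-doubling assumption, a shift of 4.24 cycles corresponds to roughly a 19-fold higher abundance of lambda template in the sorted wells; a lower amplification efficiency would give a smaller estimate.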