Potential for La Crosse virus segment reassortment in nature

The evolutionary success of La Crosse virus (LACV, family Bunyaviridae) is due to its ability to adapt to changing conditions through intramolecular genetic changes and segment reassortment. Vertical transmission of LACV in mosquitoes increases the potential for segment reassortment. Studies were conducted to determine if segment reassortment was occurring in naturally infected Aedes triseriatus from Wisconsin and Minnesota in 2000, 2004, 2006 and 2007. Mosquito eggs were collected from various sites in Wisconsin and Minnesota. They were reared in the laboratory and adults were tested for LACV antigen by immunofluorescence assay. RNA was isolated from the abdomen of infected mosquitoes and portions of the small (S), medium (M) and large (L) viral genome segments were amplified by RT-PCR and sequenced. Overall, the viral sequences from 40 infected mosquitoes and 5 virus isolates were analyzed. Phylogenetic and linkage disequilibrium analyses revealed that approximately 25% of infected mosquitoes and viruses contained reassorted genome segments, suggesting that LACV segment reassortment is frequent in nature.


Background
In the 1970s, La Crosse virus (LACV, family Bunyaviridae, genus Orthobunyavirus) emerged as a significant human pathogen in the upper Midwestern United States, and it is now the most common cause of pediatric arboviral encephalitis in the U.S. [1]. LACV is maintained primarily in cycles between Aedes triseriatus and small mammals (usually chipmunks and tree squirrels). Aedes triseriatus develop a life-long infection, and infected females can transovarially transmit (TOT) the virus to their progeny [2,3]. TOT is perhaps the most important mechanism for maintenance and amplification of LACV in nature [4,5].
LACV has a tripartite, negative-sense RNA genome with the three segments designated large (L), medium (M), and small (S). The L segment encodes the RNA-dependent RNA polymerase [6], the M segment encodes a precursor polypeptide that is post-translationally cleaved to generate the G1 and G2 glycoproteins and the nonstructural protein NSm [7][8][9][10], and the S segment encodes the nucleocapsid protein and the small nonstructural protein NSs in overlapping reading frames [8].
LACV exhibits considerable evolutionary potential in nature. There are distinct geographic genotypes of the virus in different areas of the United States [11][12][13][14], and there is evidence that disease severity may be conditioned by certain LACV genotypes [13,15]. The evolutionary success of the LACV and other viruses in the family Bunyaviridae is attributed in part to their ability to adapt to varying conditions through genetic drift (intramolecular genetic changes) and genetic shift (segment reassortment).
Genetic drift occurs during genome replication and can result in viral diversity and altered fitness [16]. RNA virus replication yields multiple genetic variants, or quasispecies, which occur due to poor fidelity of the RNA polymerases and the lack of proofreading enzymes. The error-prone polymerase can provide an array of mutations, which allows constant adaptation to and selection by changes in the vector and vertebrate host.
Laboratory studies have demonstrated the occurrence of genetic shift (segment reassortment) in mosquitoes that have become dually infected by ingesting viruses of two different LACV genotypes, either simultaneously or within two days of each other [17]. LACV reassortant viruses can be isolated from up to 25% of dually infected Ae. triseriatus and the newly generated viruses can be transmitted. The potential for segment reassortment increases when a transovarially-infected mosquito takes a blood meal from a viremic host [18]. These mosquitoes can be orally superinfected, and can transmit the new reassortant viruses. The new reassortants might exhibit new characteristics such as altered host and vector ranges, new tropisms or virulence, and thus may be epidemiologically significant [5]. Segment reassortment is apparently restricted to closely related bunyaviruses, typically in the same serogroup [19][20][21][22].
Evidence has also been presented for reassortment between LACV genotypes in nature. For example, the genomes of 23 isolates of LACV were analyzed by oligonucleotide fingerprinting and categorized in terms of the degree of their RNA sequence relatedness [14]. One genotype (denoted type A) was isolated from mosquitoes from Wisconsin, Minnesota, Indiana, and Ohio and a second genotype (denoted type B) was isolated from mosquitoes from Minnesota, Wisconsin, and Illinois. A reassortant LACV isolated in Rochester, Minnesota contained the S segment of the B genotype, and the M and L segments of the A genotype.
Genome segment reassortment has also been demonstrated among other Orthobunyaviruses and in other Bunyaviridae genera. Ngari virus is a newly emerged reassortant virus associated with severe disease epidemics in Africa [23]. Sequence analysis of the three genomic RNA segments revealed that the S and L segments were derived from Bunyamwera virus, but the M segment was derived …

Although genome reassortment appears to occur frequently in the Bunyaviridae family, the epidemiologic consequences of these evolutionary events are poorly understood. In this study molecular epidemiological techniques were used to investigate the evolutionary and reassortment potential of LACV in field-infected mosquitoes from the upper Midwest of the United States.

LACV infected mosquitoes and isolates analyzed
A total of 6,791 mosquitoes collected as eggs at 151 study sites in Wisconsin, Minnesota, and Iowa (Figure 1) were reared and tested for LACV antigen by immunofluorescence assay (IFA). Of these, 309 (4.6%) were positive. Viral RNA was amplified by RT-PCR from one to three mosquitoes from the selected sites listed in Table 1. Four LACV isolates from 1960, 1978, 2006 and 2007 were also examined in this study. The viruses from 2006 and 2007 were isolated from mosquitoes collected in the field. L, M, and S viral RNA (see Amplicon Cloning and Sequencing) was also amplified from the two virus isolates as well as directly from the two infected mosquitoes. The L, M, and S sequences from the viruses and the RNA amplified directly from the mosquitoes were identical (data not shown).

Rates and patterns of molecular evolution
The numbers of sequences analyzed and the number of segregating sites in each segment are shown in Table 2. The greatest nucleotide diversity (π) was seen in the M segment, twice that in the S segment and thrice that in the L segment. The distributions of these polymorphisms are shown in Figure 2. What is most noteworthy is that all three segments had more replacement than synonymous substitutions. In the L segment the diversity among replacement substitutions (πa) was actually 3.24 times larger than the diversity among synonymous substitutions (πs). The location and amino acid replacements are listed in Table 3. These trends suggest that some form of positive selection is operating on amino acid substitutions in all three segments.
The program TipDate [28] estimated the molecular evolutionary rate (substitutions/site), the absolute molecular evolution rate (substitutions/site/year) of each segment and the age of the dataset (the time in years since the sequences evolved from a common ancestral sequence) (Table 4). The absolute evolution rate was most rapid in the S segment, 480 times greater than the rate in the L segment and 4.8 times greater than the rate in the M segment. Both the M and S segments appear to be of similar ages, while the L segment appears to predate both by ~400,000 years.

Haplotype determination
The haplotype grouping system was determined through a conservative phylogenetic analysis. The system identified three S haplotypes based on seven polymorphic sites, five of which were nonsynonymous mutations. The three haplotypes identified in the M segment were based on twelve polymorphic sites, seven of which were nonsynonymous. For the L segment, two haplotypes were identified based on thirteen polymorphic sites, twelve of which were nonsynonymous substitutions (Figure 3).

Phylogenetic analysis
Maximum parsimony phylogenetic trees were established using amplified sequences from each of the three segments. Comparison of the clades on the three maximum parsimony trees provides evidence for the potential for transmission of reassortant viruses by the infected Ae. triseriatus (Figures 4, 5, 6). If there were no reassortants, the three genome segments from each infected mosquito would have appeared in the same clade. A number of mosquitoes contained viral genome segments that clustered into different clades in each of the trees. For example, the S segment from the sample MCBB/La Crosse/2004 was in haplotype #2 (red), the M segment in haplotype #2 (predominantly red) and the L segment in haplotype #1 (mixture of red and blue). Another example is the LACV RNA from the mosquito collected in NFCS/Winona/2004. The S segment was in haplotype #3 (purple), the M segment in haplotype #2 (predominantly red), and the L segment in haplotype #1 (mixture of red and blue). These examples suggest that segment reassortment had occurred. The distribution of the sequences in the phylogenetic trees for all three segments would be identical if reassortment had not occurred; however, the phylogenetic trees are highly variable when the S, M and L segment tree topologies are compared.

Linkage disequilibrium analysis
A linkage disequilibrium analysis was performed within and among the S, M, and L segments. Figure 7 is a heat diagram in which low disequilibrium coefficients are represented by light yellow squares and high disequilibrium coefficients are represented by red squares. The matrix is read according to the nucleotide position of segregating sites displayed along the diagonal. For example, in Figure 7, the lowest square connects sites S22 (segregating site 22 from the S segment) and S86, and it is red. This corresponds to an r² of 1.00, and these sites are in complete linkage disequilibrium. In contrast, squares linking site S359 with all other sites are light yellow, indicating that all sites in S are in equilibrium with S359. The triangles along the diagonal in Figure 7 contain many red squares, indicating that many sites within a segment are in disequilibrium. Thus our coverage of each of the segments appears adequate.
The squares in Figure 7 indicate patterns of disequilibrium among segments. In contrast to the large amounts of disequilibrium found within each of the segments, there is very little disequilibrium among segments. Between S and L, the sites in disequilibrium are S232 and L427. All other possible interactions between S and L are in equilibrium, indicating reassortment between these segments. Between M and L there are again 256 possible interactions but only two of these are in disequilibrium: M179 with L312 and L314. All other possible interactions between M and L are in equilibrium, indicating reassortment between these segments.
An independent heterogeneity χ² analysis (Table 5) was performed to test this pattern. There were 3 S clades, 3 M clades and 2 L clades; thus there were 18 possible segment combinations corresponding to each row in Table 5. The observed column is the number of times that a segment combination occurred in the 45 samples. Eight of the combinations were in disequilibrium but 10 were in equilibrium (in bold) supporting an inference of frequent reassortment. In total, eleven of the 45 (24.4%) samples were in linkage equilibrium.
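The arithmetic behind the 18 segment combinations and the 24.4% figure can be checked with a short sketch. The haplotype labels (`S1`, `M1`, `L1`, and so on) are hypothetical placeholders; the text gives only the number of clades per segment and the sample counts.

```python
from itertools import product

# Hypothetical labels: the text reports 3 S, 3 M and 2 L clades, not their names.
s_haplotypes = ["S1", "S2", "S3"]
m_haplotypes = ["M1", "M2", "M3"]
l_haplotypes = ["L1", "L2"]

# Every possible S/M/L combination, one per row of a table like Table 5.
segment_combinations = list(product(s_haplotypes, m_haplotypes, l_haplotypes))
n_combinations = len(segment_combinations)

# Counts reported in the text.
samples_total = 45
samples_in_equilibrium = 11
percent_in_equilibrium = 100 * samples_in_equilibrium / samples_total

print(n_combinations, round(percent_in_equilibrium, 1))
```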

LACV segment reassortment in nature
Both phylogenetic and linkage disequilibrium analyses revealed that LACV RNA genome segments had undergone reassortment in 24% of mosquitoes and isolates analyzed. This is remarkable and illustrates the exceptional evolutionary potential and genetic diversity of Bunyaviridae viruses in nature. One possible reason for this could be the ability of Ae. triseriatus to become dually infected. When mosquitoes ingest two different LACV isolates simultaneously or sequentially within four hours, 100% become dually infected [17]. Even at 48 hours post-initial bloodmeal, 27% of mosquitoes that ingest a second virus become dually infected before a barrier to superinfection develops. In addition, when transovarially infected mosquitoes ingested a bloodmeal containing a heterologous LACV, 19% became dually infected [18]. These experiments suggest that dual infection can occur frequently through both oral and transovarial infection and therefore increase the possibility of segment reassortment in vectors. The newly evolved viruses are also efficiently transmitted [17]. These experiments were performed in a controlled laboratory setting, but they demonstrate the potential for segment reassortment to occur frequently in nature.
Although the analyses demonstrate the potential for reassortment, most of the sequences used were from RNA amplified directly from the infected mosquitoes and not from virus isolates. The reassortment frequency detected in this study could have resulted from analysis of RNA quasispecies sequences in the mosquito. However, the L, M, and S sequences obtained from the virus isolates as well as those directly amplified from the infected mosquitoes in 2006 and 2007 were identical. This suggests that 1) the genome sequence obtained by direct amplification of the viral RNA from the mosquito is the dominant viral sequence in the mosquito as well as in infectious virus and 2) the estimation of reassortment frequency was not confounded by potential RNA quasispecies in the mosquitoes. Estimating the frequency of reassortment of LACV in nature would be improved by analysis of plaque-purified viruses isolated from the mosquitoes, preferably from their saliva or ovaries, which are the epidemiologically significant organs of transmission.

*Fifty mosquitoes were tested for LACV antigen from most sites. There were 11 sites with fewer than 50 adult mosquitoes.

Conclusion
There are important public health implications of reassortment in LACV-infected Ae. triseriatus in the field. LACV reassortants could be more virulent and could have altered vector species and vertebrate host ranges. New viruses could create new arbovirus cycles with potentially significant epidemiological consequences [5]. For example, the geographic distribution of LACV is currently determined by the distribution of Ae. triseriatus and chipmunks and tree squirrels. If a new virus established a transmission cycle that involved a mosquito species that fed more aggressively on humans, increased human infections could occur. If a new reassortant virus was more virulent or exhibited different tissue tropisms, infections could become clinically significant in both adults and children. For example, a new reassortant virus could replicate more efficiently in humans, resulting in greater viremia titers and more efficient infection of the central nervous system. Determination of the evolutionary potential of LACV through genetic shift may permit prediction of the epidemiologic consequences of these events.
These studies illustrate the significant evolutionary and epidemic potential of viruses in the family Bunyaviridae. Viruses in this family have contributed inordinately to the list of newly emerged viruses [29], and they will likely continue to do so in the future.

Egg collection
Aedes triseriatus eggs were collected from five oviposition traps in each of 151 sites in Minnesota (n = 37), Wisconsin (n = 108) and Iowa (n = 6). Sites were established in areas where LACV encephalitis cases occurred or areas that contained clusters of people judged by the La Crosse County

Figure 2. Nucleotide diversity (π) of the LACV S, M and L segment sequences amplified from field-infected mosquitoes.

Immunofluorescence assay (IFA)
To determine if mosquitoes were infected, mosquito heads were severed, squashed onto acid-washed microscope slides, and fixed in acetone. Heads were assayed for LACV antigen by direct IFA using LACV-specific polyclonal antiserum [30].

LACV-positive mosquitoes
Viral RNA from 40 mosquitoes was analyzed, including 34 field collected mosquitoes from 2004 and six field collected mosquitoes from 2000.

LACV strains
Previously isolated LACV strains were also used in the analysis. The 1960 LACV isolate was isolated originally from the brain of a child who died from LACV encephalitis in La Crosse, WI and it was passed five times in suckling mouse brains (SMB). A 1978 LACV (78V-8853) was isolated from an Ae. triseriatus mosquito from Rochester, MN and passed once in Vero cells and twice in SMB. LACV was isolated from mosquitoes collected in the field in WI and MN in 2006 and 2007.
Cell monolayers of Vero cells were grown in six-well plates at 37°C in an atmosphere of 5% CO₂. Supernatant from the centrifuged mosquito homogenate (0.2 ml) was added to one well in a six-well plate and incubated at 37°C for one hour. Following the incubation, 5 ml of medium were added to each well.

Plaque purification
The virus isolates from 2006 and 2007 were plaque purified using monolayers of Vero cells in six-well plates [31]. Virus isolates were serially diluted 10⁻¹ to 10⁻⁶ and 200 μl of each virus dilution was added to individual wells and incubated at 37°C for 1 hour. The virus inoculum was removed and 5 ml of overlay was added to the well. After six days of incubation at 37°C in 5% CO₂, 200 μl of the detection solution, methylthiazolyldiphenyl-tetrazolium bromide (MTT) (5 mg/ml in PBS), was added to each well. The plates were incubated overnight and visible plaques were picked and placed in 1 ml of MEM with 0.2% FBS for 1 hr at 37°C. An aliquot of the medium from the wells was added to Vero cells, and the presence of virus confirmed by detection of cytopathic effect.

Table 3: Nonsynonymous mutations found in sequences of LACV RNA that was RT-PCR amplified from field collected mosquitoes

RNA purification from mosquitoes
The posterior half of each mosquito abdomen was individually homogenized in 500 μl of Trizol (Invitrogen, Carlsbad, CA), using a pellet pestle (Fisher Scientific, Pittsburgh, PA), and then total RNA was extracted according to manufacturer's instructions.

RNA purification from virus isolates
The medium and cells from wells with plaque purified virus were removed and placed in a 15 ml conical tube and centrifuged at 3000 rpm for 10 minutes. The supernatant was removed and the cell pellet was resuspended in 500 μl of Trizol (Invitrogen, Carlsbad, CA). Total RNA was extracted according to manufacturer's instructions.
RNA from the 1960 and 1978 LACV isolates was prepared by infection of C6/36 cell cultures at a multiplicity of infection of 0.01. Three days post-infection, cells were scraped into the medium, centrifuged and cell pellets were resuspended in 500 μl of Trizol for RNA extraction.

Phylogenetic analyses yielded three haplotypes for the S segment, three haplotypes for the M segment, and two haplotypes for the L segment. The genome position is provided above the genetic sequence.

containing ampicillin (50 μg/ml) and kanamycin (50 μg/ml). Colonies were screened for inserts by PCR amplification using the original primers and positive products were purified using QIAquick spin columns (Qiagen, Valencia, CA). Three to five cDNA clones per segment from each mosquito were sequenced in both directions using the ABI PRISM dye terminator cycle sequencing kit (Applied Biosystems, Foster City, CA) and the ABI 310 DNA auto-

M segment (nucleotides 1637-1994) phylogenetic tree

Haplotype determination
Genetic haplotypes were established for each of the three segments through maximum parsimony analysis, sequence identity matrix, neighbor joining distance matrix, and ratio of synonymous to nonsynonymous substitutions.

L segment (nucleotides 179-625) phylogenetic tree

Statistical analyses 1. DNA polymorphism and nucleotide substitution rates
For each segment, the computer program DnaSP 4.5 [33] estimated π, the average number of nucleotide differences among all pairwise comparisons of sequences ([34], equation 10.5). π was also estimated separately for synonymous (πs) and replacement (πa) substitutions. The rate of molecular evolution (substitutions/site/year) was estimated using the program TipDate [28]. TipDate analyzes sequences of RNA viruses that have been obtained at different dates to provide a maximum likelihood estimate of the absolute rate of molecular evolution. The program assumes a molecular clock to estimate the date of the most recent common ancestor.
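As a rough illustration of what π measures, the following minimal sketch computes the average number of pairwise differences per site for a small aligned set of sequences. It is not DnaSP's implementation; the function name `nucleotide_diversity`, the per-site normalization, and the toy sequences are assumptions made for this example, and gaps or ambiguous bases are not handled.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of nucleotide differences per site over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched positions in every pairwise comparison.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical, not LACV data): 3 sequences of 8 sites.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

Restricting the same calculation to synonymous or to replacement sites would give quantities analogous to πs and πa, although classifying sites requires the reading frame and codon table, which this sketch omits.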

Linkage Disequilibrium Analysis
Linkage disequilibrium is a measure of the degree to which substitutions in a segment occur independently of one another. Substitutions that occur together in a segment at a rate predicted by their independent frequencies are in linkage equilibrium. Substitutions that occur more or less often than expected by random chance are considered to be in linkage disequilibrium. Linkage disequilibrium also tests whether sampling a portion of a genome segment is representative of the whole segment. Linkage equilibrium is detected when different parts of a segment are evolving independently and sequencing a portion of the segment may not provide a representative sample of the whole.
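To make the equilibrium/disequilibrium distinction concrete, here is a minimal sketch of the squared correlation coefficient r² between two polymorphic sites in haploid sequences. It assumes each site is biallelic (the first sample's allele is treated as the reference and everything else as the alternative); the function name `r_squared` and the toy data are hypothetical and are not taken from the LDheatmap package.

```python
def r_squared(site_a, site_b):
    """Squared correlation (r^2) between two biallelic sites in haploid samples."""
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must have the same non-zero number of samples")
    a_ref, b_ref = site_a[0], site_b[0]
    p_a = sum(x == a_ref for x in site_a) / n
    p_b = sum(y == b_ref for y in site_b) / n
    # Observed frequency of samples carrying both reference alleles.
    p_ab = sum(x == a_ref and y == b_ref for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # departure from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Alleles always co-occur: complete linkage disequilibrium.
linked = r_squared("AAGG", "CCTT")
# Alleles combine at the rate predicted by their frequencies: equilibrium.
independent = r_squared("AAGG", "CTCT")
print(linked, independent)
```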
A linkage disequilibrium analysis was also performed to determine if entire segments assort randomly, thereby suggesting segment reassortment. Segments are in disequilibrium when some combinations of segments occur together in a mosquito more or less often than would be predicted by their independent frequencies. The first step is to determine the number of times segments $S_i$, $M_j$, and $L_k$ occur together in a mosquito.
$T_{ijk}$ = the number of times haplotypes i, j, and k occur in a mosquito.
$E_{ijk}$ is the number of times haplotypes i, j, and k are expected to occur in a mosquito.
$E_{ijk} = p_i \, p_j \, p_k \, N$
where $p_i$ is the frequency of $S_i$ in the mosquito population ($p_j$ and $p_k$ are the corresponding frequencies of $M_j$ and $L_k$) and N is the number of mosquitoes. Linkage disequilibrium ($D_{ijk}$) was then estimated.
A Hill and Robertson correlation coefficient $R_{ijk}$ was determined [35].
$R_{ijk} = D_{ijk} / \left( p_i (1 - p_i) \, p_j (1 - p_j) \, p_k (1 - p_k) \right)^{1/2}$ (4)
The squared correlation coefficient ($R_{ijk}^2$) was used as a metric of disequilibrium because it ranges from zero (linkage equilibrium) to one (linkage disequilibrium). Linkage disequilibrium patterns among all polymorphic sites were plotted on a heat map using the LDheatmap program in R [36]. A chi-square statistic ($\chi^2_{Link}$) and the corresponding level of significance were calculated for each combination of haplotypes to test the hypothesis that the individual haplotype combinations are in linkage equilibrium.

A heat map of linkage disequilibrium within and among the LACV S, M, and L segments
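The observed and expected counts for each haplotype combination can be tabulated as in the sketch below. Because the equations for the expected count and for the disequilibrium term are not legible in this section, the code assumes that the expected count is the product of the three haplotype frequencies and N, and that disequilibrium is (observed − expected)/N; both are assumptions made for this illustration, as are the function name `haplotype_disequilibrium` and the toy samples. The correlation coefficient and the chi-square test are deliberately left out.

```python
from collections import Counter
from itertools import product


def haplotype_disequilibrium(samples):
    """Tabulate observed (T), expected (E) and D for each S/M/L haplotype combination.

    samples: list of (S, M, L) haplotype tuples, one tuple per mosquito.
    Assumed here: E = p_i * p_j * p_k * N and D = (T - E) / N.
    """
    n = len(samples)
    s_counts = Counter(s for s, _, _ in samples)
    m_counts = Counter(m for _, m, _ in samples)
    l_counts = Counter(l for _, _, l in samples)
    observed = Counter(samples)
    table = {}
    for s, m, l in product(sorted(s_counts), sorted(m_counts), sorted(l_counts)):
        p_i, p_j, p_k = s_counts[s] / n, m_counts[m] / n, l_counts[l] / n
        t = observed[(s, m, l)]       # times the combination was seen
        e = p_i * p_j * p_k * n       # expected if segments assort independently
        table[(s, m, l)] = {"T": t, "E": e, "D": (t - e) / n}
    return table


# Toy data (hypothetical): two parental genotypes and no reassortants.
toy_samples = [(1, 1, 1)] * 4 + [(2, 2, 2)] * 4
table = haplotype_disequilibrium(toy_samples)
print(table[(1, 1, 1)], table[(1, 2, 1)])
```

In this toy set, the two parental combinations are seen more often than expected and every mixed combination less often, which is the pattern expected when segments do not reassort.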

Maximum Parsimony analysis
Maximum parsimony phylogenetic analysis was performed using the Phylogenetic Analysis Using Parsimony (PAUP) 4.0b10 package [37]. The phylogenetic trees indicate the branches that appeared in the majority of the 100 bootstrap pseudoreplicates and the frequency with which these appear among replicates. A maximum parsimony phylogenetic tree was created for each of the three genome segments.
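The resampling step behind bootstrap support values can be sketched as follows. This shows only how pseudoreplicate alignments are generated by sampling alignment columns with replacement; it does not build trees and is not PAUP's implementation. The function names, the fixed seed, and the toy alignment are assumptions made for this example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to form one pseudoreplicate."""
    length = len(alignment[0])
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


def bootstrap_replicates(alignment, n_reps=100, seed=0):
    """Generate n_reps pseudoreplicate alignments (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_reps)]


# Toy alignment (hypothetical, not LACV data).
toy_alignment = ["ACGTAC", "ACGTAA", "ACCTAA"]
replicates = bootstrap_replicates(toy_alignment)
print(len(replicates), replicates[0])
```

A tree would then be inferred from each pseudoreplicate, and the support for a branch is the fraction of those trees in which it appears.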