Sequence diversity on four ORFs of citrus tristeza virus correlates with pathogenicity

The molecular characterization of isolates of citrus tristeza virus (CTV) from eight locations in Mexico was undertaken by analyzing five regions located at the opposite ends of the virus genome. Two regions have been previously used to study CTV variability (coat protein and p23), while the other three correspond to other genomic segments (p349-B, p349-C and p13). Our comparative nucleotide analyses included CTV sequences from different geographical origins already deposited in the GenBank databases. The largest nucleotide differences were located in two fragments located at the 5' end of the genome (p349-B and p349-C). Phylogenetic analyses on those five regions showed that the degree of nucleotide divergence among strains tended to correlate with their pathogenicity. Two main groups were defined: mild, with almost no noticeable effects on the indicator plants and severe, with drastic symptoms. Mild isolates clustered together in every analyzed ORF sharing a genetic distance below 0.022, in contrast with the severe isolates, which showed a more disperse distribution and a genetic distance of 0.276. Analyses of the p349-B and p349-C regions evidenced two lineages within the severe group: severe common subgroup (most of severe isolates) and severe divergent subgroup (T36-like isolates). This study represents the first attempt to analyze the genetic variability of CTV in Mexico by constructing phylogenetic trees based on new genomic regions that use group-specific nucleotide and amino acid sequences. These results may be useful to implement specific assays for strain discrimination. Moreover, it would be an excellent reference for the CTV situation in México to face the recent arrival of brown citrus aphid.


Background
Citrus tristeza virus (CTV) is a destructive pathogen that causes the most important disease of citrus. CTV is phloem-limited and naturally transmitted by several species of aphids in a semipersistent mode and by the use of infected propagative budwood [1].
CTV infects almost all citrus species, including hybrids and relatives, causing different range of symptoms depending on the host and virus strain. The most common symptoms are: quick decline (QD) or death of most citrus species propagated on the sour orange (Citrus aurantium L.) rootstock; stem pitting (SP) of the scion that reduces the yield and fruit quality of some citrus varieties regardless of the rootstock; and seedling yellows which is seen on sour orange, lemon (Citrus lemon (L.) Burn. f.) and grapefruit (Citrus paradisi Macf.). Strains that not elicit symptoms on commercial citrus hosts have been named as mild isolates [2].
Citriculture represents a very important activity in Mexico where 517,000 has. are grown in several agroecological regions yielding 7.1 million tons of fruit [3]. CTV was first detected in the Mexican State of Tamaulipas in 1983. Extensive survey in the major citrus growing regions indicated (up to 2006) that CTV incidence was low (< 1%) in other 17 States of Mexico. Because of the random pattern of the infected trees, CTV infection was probably propagated in Mexico by the use of infected budwood, rather than by aphid vectors [4]. In general, the trees remain asymptomatic, without declining suggesting a possible prevalence of mild strains. However, Mexican citrus industry is now facing a real and immediate threat due to: i) the recent introduction to Mexico of the most efficient vector of CTV, commonly called brown citrus aphid (BCA), Toxoptera citricida and ii) the predominance of sour orange as rootstock (the most susceptible to CTV) in almost the totality of the citrus groves. Despite the importance of CTV, there is a lack of knowledge about the genetic diversity of CTV in Mexico. Therefore, the objective of this study was to determine the molecular variability of CTV isolates collected in eight of the major citrus growing areas in Mexico, before the spreading of the BCA.

Biological characterization of the CTV isolates
Symptoms of CTV isolates varied in their severity and ranged from a severe SP observed in the isolates Mx-Tam and Mx-BC (Mexican lime on C. macrophylla) to those as Mx-Yuc and Mx-QR, which showed a mild vein clearing in Mexican lime. Incipient SP and vein clearing were observed for Mx-Mich, Mx-NL, Mx-Col, Mx-Ver and Mx-QR. When sour orange was used as rootstock, a gradual declining occurred for the Mx-Tam isolate associated to pinholing or honeycombing below the bud union.

Phylogeny and genetic variability in CTV isolates
The nucleotide sequences of five regions of the CTV genome were determined. They are situated at the oppo-site ends covering around the 20% of the CTV genome. These DNA fragments include two fragments of the p349 ORF located at the 5'end of the genome: p349-B (nucleotides 1460 to 2350) and p349-C (nucleotides 6870 to 7900) and the ORFs CP, p23 and p13 (located at the 3' end).
Alignments of these genomic regions indicated that variations at nucleotide level were simple nucleotide changes. Overall similarities for the five regions among CTV Mexican isolates ranged from 71.2 to 100% at the nucleotide level; however, the sequences of these strains clustered into two groups that were highly homologous (92.1-100% identical nucleotides and 92.9-100% identical amino acid) such as Mx-Mich, Mx-Ver, Mx-QR, Mx-NL, Mx-Yuc and Mx-Col and less homologous sequences (71.2-96.7% identical nucleotides and 71.6-96.4% identical amino acid) such as Mx-Tam and Mx-BC (data not shown).
The neighbour-joined method of Clustal × program was used to generate phylogenetic trees for each ORF including the following full length CTV sequences: T36 [5] and T30 from Florida [6], VT from Israel [7], T385 from Spain [8], SY568 from California [9] and NUagA from Japan [10]. In order to compare our five amplified regions as individual ORFs, sequences from regions p349-B and p349-C were combined and named as p349-B/C. The phylogenetic trees were topologically similar for all the ORFs. A very consistent and defined cluster was observed with a bootstrapping over 990, which included the highly homologous Mexican isolates (Mex-QR, Mex-Mich, Mex-NL, Mex-Col, Mex-Ver and Mex-Yuc) as well as isolates T30 and T385 (figure 1). Despite the distant geographical origin of the isolates, they all induced mild symptoms on indicator plants and only mild stem pitting on the Mexican lime.
On the other hand, CTV Isolates Mex-Tam, Mex-BC, NUagA, VT, SY568 and T36, which induced severe symptoms, showed a more disperse phylogenetic distribution (figure 1). The comparative analysis revealed that both severe Mexican isolates Mx-Tam and Mx-BC, tended to be clustered away from the group formed by NUagA, VT and SY568, although not enough to be considered a separate cluster. Similar clustering was obtained using the predicted amino acid sequences (data not shown).
The calculated sequence diversity for the mild cluster was almost the same for the five analyzed regions ranging from 0.002 to 0.022. On the contrary, the severe cluster showed an increased diversity especially toward the 5'end of the genome (CP < p23 < p13 < p349-B < p349-C) (table  1).  Interestingly, two observations could be emphasized for the p349 gene. First, the genomic regions p349-B and p349-C showed the greatest diversity (0.143 and 0.169, respectively), twice as much as the one for CP, p13 and p23. Second, the T36 isolate remained separated in the phylogenetic tree for p349-B/C region. When the genetic distances were calculated for the severe group excluding the T36 isolate, it showed little variation for CP, p13 and p23, but it diminished two-fold p349-B (0.233 to 0.143) and for p349-C (0.276 to 0.120) (data not shown). The marked increase in the genetic diversity given by the T36 isolate suggests the presence of two lineages within the severe group.
To further analyze this possibility, a 280-nucleotide fragment from p349-B region (nucleotides 3333 to 4000) was compared for 44 CTV isolates which nucleotide sequences and biological behaviors had been previously reported [11]. Topology of the phylogenetic trees constructed from sequences alignments (for nucleotide as well as for amino acid residues) was similar to those observed for the p349-B/C. In this case, the T36 isolate considerably diverged forming an additional cluster with other two severe CTV isolates (T734 and T346) (data not shown). Another phylogenetic tree confirmed the divergence of the selected CTV isolates into three groups according to the severity level: one contained the 30 isolates (mild group), a second contained the previously reported five severe isolates and other ten more (severe common group), and a third one contained severe isolates which were located in the same branch along with T36 isolate (severe divergent group). Within-and between-groups nucleotide diversity analyses indicated that all mild isolates had a very low average intra-group genetic diversity (0.013) while the severe groups showed values up to seven times higher than the mild isolates. The inter-group distances were always higher than within groups. The highest variability (0.633 and 0.584) was observed among the severe divergent group and the other two groups (mild and severe common) respectively, thus confirming that the severe divergent group represents a different CTV genotype (table 2).
Additionally, clones obtained from five Mexican isolates belonged to more than one group: mild and severe common groups. This was the case of severe isolates Mx-BC for p349-B and Mx-Tam for p349-C, and mild isolates Mx-Col, Mx-Mich and Mx-NL, illustrating the presence of different sequence variants in their gRNA populations. However, several rounds of amplification, cloning and sequencing were repeated for these isolates and only the consensus nucleotide sequences of the major variants were considered in the analyses.
Similarly, the CP and p23 sequences from 24 and 28 CTV isolates from the GenBank databases were compared, respectively. As a result, the spatial distributions in the phylogenetic trees for CP and p23 were very similar to the observed with only 14 isolates shown in the figure 1. The mild isolates were grouped in a single branch with a high intra-group homology greater than 98% while for the severe isolates this rank was smaller, between 80-85% (data not shown).

Nucleotide and amino acid substitutions of the CTV variants
A further visual examination of the sequences alignments for all the isolates revealed group-specific features that can be used to classify isolates. Group-specific nucleotides could be identified for each ORF and many of them were located on regions p349-C (37.3%), p349-B (32.0%), followed by p23 (20.0%), CP (13.3%) and p13 (10.7%) (table 3). The majority of the sequence polymorphisms corresponded to single changes, and only three putative stretches of nucleotides were observed for the regions p349-C (positions 310-311 and 355-357) and p23 (positions 86-87).
Sequence alignments revealed that 85% of differential nucleotides could be classified in severe isolates (including T36 isolate) and the mild isolates. On the other hand, the 25% of the remainder polymorphisms distinguished the isolate T36 from the others, whether severe common or mild isolates. Most of these T36 typical changes were observed throughout the p349-B and p349-C genomic regions and only two of these types of changes were   (table 3). Finally, there were three nucleotide insertions at positions 153, 369 and 792 in the sequence of T36 for p349-C region (data not shown). Many of these nucleotide substitutions yielding amino acid changes were conserved within each group. From the 34 group-specific amino acid substitutions that were identified, 14 were located in the p349-B (table 4).

Discussion
The genetic diversity of different regions on the genomic RNA of CTV isolates from Mexico was analyzed for the first time. The phylogenetic relationships of CTV have been analized utilizing genomic regions such as CP, p20, p23, p27, p349 (genomic regions A and F), 5'and 3' UTR.
To better understand the diversity among different CTV isolates from Mexico, genes CP, p23, p13, p349-B and p349-C were sequenced. This approach allowed the comparison of the Mexican isolates with those previously reported and also look for additional regions that may be useful to further characterize CTV strains. The regions analyzed here are located at opposite sides of the CTV genome.
To achieve homogeneity in the sequence alignments, the Mexican isolates were compared with six geographical and biologically distinct isolates and whose complete genome sequences are known: a severe quick decline isolate T36 from Florida-USA (Karasev et al., 1995), grapefruit stem pitting and a decline isolate VT from Israel Our results showed that clustering of isolates highly correlates with their symptom severity. The overall branching pattern of the phylogenetic trees based on all regions were highly similar for all the mild isolates, including six of our Mexican isolates, forming a single cluster. The genetic distance analyses based on these segments indicated that the mild isolates were almost identical with a very low genetic diversity intra-group (lower than 0.022) independently from any analyzed ORFs. Such high similarity may be sur-prising since their considerable differences in geographical origin and time of sampling. This high similarity was observed for the first time dissecting the complete consensus sequences of the mild isolates T30 from Florida and T385 from Spain, which were geographically isolated for at least 24 years [12]. However, a different situation was observed for the severe isolates, where the genetic distance had a large range of variation (from to 0.066 to 0.276) and a unique branch could not be identified for them. In this way, the severe isolates were located more dispersed in the trees for all ORFs as opposed to the mild isolates.
The diversity among the severe isolates obtained was not random and the ORF p349, located in the 5'region of the viral genome was twice more variable than those located in the 3' end. The analyses of other two areas located into the ORF p349, region A (nucleotide 2021 to 2548) and region F (nucleotide 3561 to 3998), for 30 CTV isolates from Spain and California, showed appreciable differences because the former showed the greatest diversity (0.136). The values of genetic distance determined by us for the other two regions of p349:p349-B (nucleotides 1460 to 2350) and p349-C (nucleotides 6870 to 7900) were 0.143 and 0.169 respectively and coincide with those reported for the region A [11].
These observations were confirmed by comparing a sequence that covers the final 300 nucleotides of the p349-B region from 44 additional CTV isolates. This nucleotide stretch of DNA was found to be highly conserved in isolates from the same group (mean of 0.076), but highly variable among isolates of different groups (mean of 0.451), which supported the differentiation of CTV isolates for this region into three major genotypes here designed as: mild, severe common (all severe isolates excluding T36 group) and severe divergent isolates (T36like isolates).
Sequences obtained from five Mexican isolates belonged to more than one group. It is possible that when a mixture of mild and severe clones are present in the host at different ratios, disease development can be restricted by a large excess of mild viral genome, even though the diseasecausing variant remains at low levels in the viral populations. Previous reports have extensively demonstrated that most CTV isolates are composed by a population of genetically related variants (haplotypes) that could have origi- a Nucleotide (nuc.) differences in five CTV genomic region among mild isolates (M), severe common isolates (SC) and T36 isolate. Nucleotide positions which resulted in amino acid residue changes are underlined. Only differences between T36 and SC are indicated for T36. S: C or G; M: A or C; B: C or G or T; W: A or T; K: G or T; R: A or G; V: A or C or G; Y: C or T. nated from mixed infection of two or more CTV isolates with diverged sequence variants [13]. Also, the presence of mixed infection should be expected in woody plants that may live many years and are exposed to multiple aphid inoculations [14].
We observed that the nucleotide and amino acid variation were higher for 5' termini than for 3' termini indicating that the former target appears to be better suited to molecular clarification of relationship among CTV strains. In turn, the conserved 3' terminal nucleotide sequences would be insufficient to establish possible sub-groups inside of severe isolates to which the virus belongs.
Our results are in agreement with several reports that suggest a considerable degree of sequence variation in the CTV isolates mostly toward 5' termini of the genome. The sequencing of the 5' untranslated genomic region from eight CTV isolates from Spain and Japan, allowed their classification into three groups, I (severe-like T36), II (severe-like VT) and III (mild) represented by sequences of T36, VT and T317-8 (similar to T385), respectively, with a very low intergroup nucleotide identity ranged between 44 and 66%. This grouping was confirmed by sequencing an additional 15 virus isolates from several countries since all sequences could be assigned to one of the three types previously established [15].
Another report compared the predominant sequence variants of p23 gene from 18 CTV isolates of different geographic origin and pathogenicity . As a result of the phylogenetic analyses, the CTV isolates were separated in three groups: first including the mild isolates, second including the most of severe isolates and third, an atypical group which showed variable pathogenicity. This last group was loosely related and included asymptomatic as well as seedling yellows isolates, maybe as result of recombination events . In that report the intra-group genetic diversity for mild and severe isolates were similar of what we found, confirming that p23 is more heterogeneous that some others ORFs located toward 3' termini as p13 and CP. When we compared the eight Mexican isolates with those 18 reported for p23 [13], the divergence among the CTV isolates was supported for only two clearly distinct groups: mild and severe.
Similar mild genotypes have been found in several countries and led to speculate that this group is well adapted to its host and could have evolved from the same original sequence. As hypothesized, the divergences of the CTV strains could initiated in four possible different progenitors species in South East Asia prior to citrus cultivation [12]. According with this rationale, they spread to different citrus growing areas via infected budwood within the last 200 years. Then, mild strains could be distributed by growers due to the absence of symptoms on the commercial varieties and to the lack of certification programs.
In contrast, severe isolates induce stem-pitting, which affect the plant physiology and therefore reduce vigor and yield and cause eventually the death of the tree [2]. So, the visibly affected trees had been continuously eliminated or not propagated. Probably, like other viruses, the flow and evolution could have been interrupted several times creating successive bottlenecks leading to variability.

Conclusion
This study represents the first attempt to analyze the genetic variability of CTV strains in Mexico. The analysis of eight CTV isolates, originally obtained from Mexican citrus fields, confirms that the CTV has been present in many citrus-producing of Mexico as a mixture of variants with different biological and genetic properties for long time. The divergence found between the isolates Mx-Tam and Mx-BC and also among these two severe isolates and the other six mild isolates plus the relative distant geographical separation of isolates, may suggest that it is unlikely that one strain evolved from the other after arriving to Mexico. It is likely, that the introduction of CTV to Mexico was under independent events an in several occasions to different citrus areas occurred via the use of infected propagative budwood sources.
Taking into account in the recent arrival of the aphid T. citricida to Mexico, the most important vector of CTV, and the predominance of sour orange as a rootstock in almost all the country, this fact should be of great concern for the future of the Mexican citrus industry.
The group-specific genome features identified here would provide a valuable tool for the generation of diagnostic tools for an early discrimination of virus strains based in at least five viral genome regions. Thus, the success of the citrus management strategies could depend in part, of the elucidation of the diversity and evolutionary relationships of the CTV isolates present in the different citrus areas of Mexico.

Virus isolates
Samples from CTV-infected plants were collected by technicians of the Dirección General de Sanidad Vegetal (DGSV), in eight citrus-growing areas of Mexico between 1983 and 2000. CTV isolates were graft-propagated and maintained in insect-proof greenhouses at DGSV headquarters at Mexico City. These isolates were biologically characterized on graft-inoculated indicator plants using different scion/rootstock combinations. Scions included Mexican lime, grapefruit, and sweet orange, whereas Citrus macrophylla and sour orange were used as rootstocks.
Geographic origin and symptoms induced by these CTV isolates are summarized in table 5.

RNA isolation, reverse transcription and polymerase chain reaction (RT-PCR)
Total RNA was extracted using fresh bark tissue from healthy/CTV-infected plants in the presence of Trizol (Invitrogen, Carlsbad, CA) following the manufacturer's protocol. The final RNA preparation was dissolved in 30 μl of RNase-free distilled water.
For the reverse transcription (RT), 300 ng of total RNA and the reverse primer were incubated at 70°C for 5 min and chilled immediately on ice for 3 min. The mixture was added to provide a final concentration of 50 mM Tris-HCl, pH 8.3, 75 mM KCl, 10 mM DTT, 2.5 mM MgCl 2 , 500 μM of each dNTPs, 25 pmol of antisense primer, 20 U RNasin and 200 U MLV-RT (Promega, Madison, WI). Five different sets of primers were designed in our lab to direct the amplification of several genomic regions (table  6). These regions were selected as conserved regions after aligning the published CTV sequences from different geographical regions [8,12].

Cloning and nucleotide sequencing
The reaction mixture was fractioned by agarose gel electrophoresis and the PCR products were purified using the GeneClean glass bead method (Bio101, Vista, CA) and ligated into pGEM-T Easy Vector System II (Promega, Madison, WI). The ligation mixture was used to transform Escherichia coli DH5α cells. Plasmids containing the expected sizes were chosen for sequencing. At least, three clones of each isolates were sequenced in both directions to obtain a consensus sequences by the Dye Cycle Sequencing kit (Applied Biosystem, Foster City, CA) in an ABI PRISM 377 sequencer

Sequence alignment and phylogenetic analyses
Multiple sequences of different CTV isolates were aligned using the software MegAlign 4.0 of the Lasergene package (DNASTAR Inc. Madison). The unrooted and neighbourjoined phylogenetic trees were prepared using Clustal × [16] and drawn with TreeView 1.5 (Page, 1996). The robustness of individual internal phylogenetic tree nodes was estimated using 1,000 bootstrap reiterations. Clustal W algorithm was used for the estimation of genetic distance by the Kimura two-parameter method [17]. A genetic distance was also calculated using MEGA 2.1 with the output used to calculate both individual pairwise diversity and mean pairwise diversity using the Kimura-2parameter approach.
The nucleotide sequences of the Mexican isolates obtained in this study were deposited in the GenBank database under the accession numbers: AF342890 to AF342895, AY649491 to AY649492, and AF652892 to AF652923.
JCOS collected viral samples and organized information and material data. RRB and JPMS conceived of the study, and participated in its design and coordination. All authors read and approved the final manuscript.