The nucleotide sequence and genome organization of Plasmopara halstedii virus

Background Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking.
Methods Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis.
Results The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) excluding its 3' poly(A) tract and contained a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt excluding its 3' poly(A) tract and contained a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2, suggesting proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of the host's pathotype, geographical origin and sensitivity towards the fungicide metalaxyl.
Conclusions The results showed the presence of a single and new virus type in different P. halstedii isolates. Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.


Background
Only very few species within the Oomycetes are known to host virus-like elements such as virus-like particles (VLPs), double-stranded RNA (dsRNA) or single-stranded RNA (ssRNA) (for details see [1]). So far, only the virions of Sclerophthora macrospora and Plasmopara halstedii have been studied in more detail.
Sclerophthora macrospora virus A (SmV A) and virus B (SmV B) are the only virions of Oomycetes whose genomes have been fully characterized so far [2,3]. They were isolated from Japanese isolates of S. macrospora, the downy mildew pathogen of Oryza sativa and other species within the Poaceae family. Both SmV A and SmV B were isometric and used ssRNA to encode their viral genomes [4,5]. SmV B featured one coat protein (CP) of 41 kDa and one ssRNA segment of 5533 nucleotides (nt) encoding two large open-reading frames (ORF). SmV A was found to comprise two CP (43 kDa and 39 kDa) and three segments of ssRNA. RNA1 consisted of 2928 nt and two ORF (ORF1a and ORF1b). ORF1a contained the motifs of the RdRp. The latter showed some similarity in the amino acid sequence to the RdRp of Nodaviridae. RNA2 consisted of 1981 nt and a single ORF (ORF2) which encoded the CP. The CP of SmV A showed similarities to CP of viruses within the Tombusviridae family. RNA3 consisted of 977 nt but contained no ORF, suggesting that it is a satellite RNA [3-5].
P. halstedii is a pathogen distributed worldwide with a broad spectrum of pathotypes (physiological races) [6], causing sunflower downy mildew infections. Almost 20 years ago, isometric virions were found in a single North American pathotype of P. halstedii [7]. The CP was determined to consist of a 37.5 kDa polypeptide and RNA was determined to encode the viral genome [8,9]. A more recent screening of P. halstedii isolates from different countries showed the occurrence of morphologically and biochemically indistinguishable virions in all samples, independent of their host's origin, pathotype or fungicide tolerance [1].
The virions were isometric and measured approximately 37 nm in diameter. One polypeptide of ca. 36 kDa and two segments of ssRNA (3.0 and 1.6 kb) were detected. Comparison of a partial nucleotide sequence confirmed the uniformity of the virions found in P. halstedii isolates. In addition, the deduced amino acid sequence of this RNA fragment indicated similarities to a part of the CP of SmV A [1]. However, final analysis of the relationship of the Plasmopara halstedii virus (PhV) to other viruses was limited by the lack of full genomic data. Here we report on the complete nucleotide sequence and genome organization of PhV and its relationship to other viruses. Additionally, we report on the extremely low genetic variation of PhV within several P. halstedii isolates of different origin and pathogenicity.

Methods
Origin and culturing of the used fungal isolates
P. halstedii pathotypes were cultured, characterized, harvested and intermediately stored as described earlier [1]. Isolates of P. halstedii used in this study are listed in Table 1.

Virus extraction and purification
Virus extraction was based on former experiments [10] and was empirically adjusted to precipitate adequate amounts of PhV.
At least 180 g of frozen sunflower tissue (primary foliage leaves, cotyledons and stems) infected with different isolates of P. halstedii was used to purify virions. Per 1 g of infected sunflower tissue, 2 ml of 0.05 M sodium phosphate buffer, pH 7.0, were added. Additionally, 0.3% (w/v) of disodium sulfite was added and dissolved by stirring before the sunflower tissue was finely homogenized with an electrical blender. The homogenate was squeezed through four layers of cheesecloth and then centrifuged (5000 g, 4°C, 15 min). To the supernatant, 0.5 M sodium chloride was added and dissolved by stirring at room temperature. PEG 6000 was gradually added to a final ratio of 12% (w/v) and the suspension was stirred for an hour before the precipitation was conducted at 4°C for 60 hours. The suspension was centrifuged (7000 g, 4°C, 30 min) and the precipitate was re-suspended in 1/100 of the original volume of 0.05 M sodium phosphate buffer, pH 7.0, with additional 0.3% (w/v) disodium sulfite. To remove contaminants like ribosomes [11], 20% (v/v) chloroform was added and mixed with the re-suspension. A low-speed centrifugation step (1000 g, 4°C, 5 min) was performed to separate the virus-containing aqueous phase from the organic phase. The chloroform extraction step was repeated once. Success of the virus extraction and purification procedure was verified by negative staining for transmission electron microscopy as described earlier [1].
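The reagent amounts implied by this protocol can be worked out with a short calculation. The following Python sketch is only an illustration: the function name `extraction_amounts` is hypothetical, the supernatant volume is assumed to equal the buffer volume (in practice some liquid is lost during filtration and centrifugation), and "original volume" is assumed to mean the buffer volume.

```python
NACL_G_PER_MOL = 58.44  # molar mass of sodium chloride


def extraction_amounts(tissue_g: float) -> dict:
    """Scale the reagent amounts of the extraction protocol to a tissue mass.

    Simplifying assumption (not from the protocol): the supernatant volume
    equals the volume of buffer added.
    """
    buffer_ml = 2.0 * tissue_g  # 2 ml buffer per 1 g tissue
    return {
        "buffer_ml": buffer_ml,
        # 0.3% (w/v) means 0.3 g per 100 ml
        "disodium_sulfite_g": 0.3 / 100.0 * buffer_ml,
        # 0.5 mol/l sodium chloride
        "sodium_chloride_g": 0.5 * NACL_G_PER_MOL * buffer_ml / 1000.0,
        # 12% (w/v) PEG 6000
        "peg6000_g": 12.0 / 100.0 * buffer_ml,
        # precipitate re-suspended in 1/100 of the original volume
        "resuspension_buffer_ml": buffer_ml / 100.0,
    }


amounts = extraction_amounts(180.0)
for name, value in amounts.items():
    print(f"{name}: {value:.2f}")
```

For the minimum of 180 g tissue, this corresponds to 360 ml buffer and, under the stated assumptions, roughly 10.5 g sodium chloride and 43.2 g PEG 6000.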
The virus suspension was divided into aliquots and stored at -70°C. Under these conditions, virus suspensions were suitable for RNA extraction, purification and reverse transcription PCR (RT-PCR) experiments for more than three years.

Protein extraction and mass spectrometry
Extraction of CP of PhV was carried out as described earlier [1]. Samples for nano-LC-ESI-MS analysis were further purified on 10% polyacrylamide gels [12] on a Mini-Protean System (Bio-Rad Laboratories, München, Germany) and stained with Coomassie brilliant blue (Roti-Blue, Carl Roth, Karlsruhe, Germany).
Gel bands of the CP were in-gel-digested using trypsin (Roche, Penzberg, Germany) [13]. After tryptic digestion, the supernatant was recovered and the gel pieces were extracted with 50% acetonitrile (ACN)/50% 0.1% formic acid (FA) (v/v). The pooled supernatants were then dried in a vacuum centrifuge and stored intermediately at -20°C.
Nano-LC-ESI-MS/MS experiments were performed on an ACQUITY nano-UPLC system (Waters, Milford, USA) directly coupled to an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Fisher, Bremen, Germany). Tryptic digests of the PhV CP were concentrated and desalted on a precolumn (2 cm × 180 μm, Symmetry C18, 5 μm particle size; Waters, Milford, CT) and separated on a 20 cm × 75 μm BEH 130 C18 reversed phase column (1.7 μm particle size; Waters, Milford, CT) using a linear gradient of 1 to 50% ACN in 0.1% FA within 1 h. The LTQ-Orbitrap was operated under the control of XCalibur 2.0.7 software. Survey spectra (m/z = 250-1800) at a resolution of 60,000 at m/z = 400 were detected in the Orbitrap using lock-mass ions from ambient air for internal calibration [14]. Data-dependent tandem mass spectra were generated for the five most abundant peptide precursors in the linear ion trap. Mascot 2.2 software (Matrix Science, London, UK) was used for protein identification. Spectra were searched against the NCBI protein sequence database downloaded as FASTA-formatted sequences from ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz and supplemented with the deduced ORF2 amino acid sequence of the PhV CP. Search parameters specified trypsin or "no enzyme" as cleaving enzyme allowing two missed cleavages, a 3 ppm mass tolerance for peptide precursors and 0.6 Da tolerance for fragment ions.

Characterization of the N-terminus of the CP
Mass spectra of purified PhV CP, in-source decay (ISD) and MS/MS spectra were acquired on an AutoflexIII MALDI-TOF-TOF mass spectrometer (Bruker Daltonics, Bremen, Germany). The instrument was operated in the positive ion mode and externally calibrated using protein mass or peptide calibration standards (Bruker Daltonics, Bremen, Germany), respectively. PhV CP samples were desalted and concentrated on C4 ZipTips (Millipore, Schwalbach, Germany) following the manufacturer's protocols. Proteins were eluted directly onto a stainless steel target using 1 μl of a 2,5-dihydroxybenzoic acid matrix solution (30 mg/mL in 50% ACN/50% 0.1% TFA, v/v). Molecular masses of intact CP were obtained in the linear mode using an accelerating voltage of 20 kV with 1000 laser shots per sample to ensure a good S/N ratio.
ISD-spectra and MS/MS-spectra (T3-Sequencing = terminus-specific pseudo-MS3 TOF-TOF analysis) of PhV CP were acquired in the reflector mode using an accelerating voltage of 20 kV. In order to achieve a good S/N ratio, 3000 and 2000-3000 laser shots were recorded for reflector-ISD and MS/MS spectra, respectively. Data were analyzed using Flex Analysis 3.0 and Bio-Tools 3.0 software (Bruker Daltonics, Bremen, Germany) taking into account the deduced amino acid sequence from ORF2 of PhV.

Nucleic acid extraction
Viral ssRNA was extracted from purified virions, P. halstedii sporangia or herbarium specimens of P. halstedii-infected sunflower using the Aurum Total RNA Mini Kit (Bio-Rad Laboratories, Hercules, CA) according to the manufacturer's protocols. RNA sample purity was checked spectroscopically (A260/A280 ratio).
Viral dsRNA was extracted and purified from P. halstedii-infected sunflower tissue [15].

cDNA synthesis, primers, PCR and sequencing
cDNA synthesis was performed using either the First Strand cDNA Synthesis Kit or RevertAid First Strand cDNA Synthesis Kit (both kits: Fermentas, Glen Burnie, MD). A first fragment of RNA2 was obtained as stated earlier [1].
PhV-specific primers used for sequence comparison are summarized in Table 2.
The PCR standard protocol was 94°C for 2 min; then 35 cycles of 94°C for 45 s, 54°C for 45 s, 72°C for 1 min 15 s; and a final elongation step at 72°C for 10 min. Occasionally, a gradient thermal cycler set at different annealing temperatures was used to optimize the yield of PCR products.
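As a worked example, the programmed hold time of the standard PCR protocol can be summed up as follows. This is a minimal Python sketch; the variable and function names are illustrative, and temperature ramping between steps is deliberately ignored because it depends on the instrument.

```python
# Step durations of the standard protocol, in seconds
INITIAL_DENATURATION_S = 2 * 60          # 94 C for 2 min
CYCLES = 35
CYCLE_STEPS_S = {
    "denaturation (94 C)": 45,
    "annealing (54 C)": 45,
    "elongation (72 C)": 75,             # 1 min 15 s
}
FINAL_ELONGATION_S = 10 * 60             # 72 C for 10 min


def programmed_hold_time_s() -> int:
    """Sum of all programmed hold times, excluding ramping between steps."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_ELONGATION_S


total_s = programmed_hold_time_s()
print(f"{total_s} s = {total_s / 60:.2f} min")
```

The hold times add up to 6495 s, i.e. a little under 1 h 50 min before ramping is taken into account.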
PCR products were analyzed on 1% agarose gels (1× TBE buffer, ethidium bromide staining) and then purified (Qiaquick PCR Purification Kit; Qiagen, Valencia, CA). The purified PCR products were either directly sequenced with virus-specific primers or cloned in E. coli (StrataClone PCR U/A Cloning Kit, Stratagene, La Jolla, CA), isolated (GeneJet Plasmid Miniprep Kit; Fermentas, Glen Burnie, MD) and sequenced with standard primers. All sequences were verified at least once with another primer.
Using a gradient thermal cycler, a primer set for RNA2 (2-r7: 5'-AAG CGC GGC GTT TGT -3' and 2-f9: 5'-CAA AGC GTC TCC CAT TGG -3') resulted in two additional DNA fragments (0.8 kb and 0.7 kb, respectively) at an annealing temperature of 48°C. These two additional amplicons were directly sequenced. Their deduced amino acid sequences showed similarity to the deduced amino acid sequence of the RdRp of SmV A. Based upon this similarity, the sequence data of the putative PhV RNA1 were aligned and virus-specific primers were designed to close the gap between these two partial nucleotide sequences.
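Basic properties of the two primers named above (length, GC content and a rough melting temperature) can be derived directly from their sequences. The sketch below uses the Wallace rule, Tm = 2 × (A+T) + 4 × (G+C), a common rule of thumb for short oligonucleotides; it is brought in here for illustration only and is not stated to be the method the authors used for primer design. The helper name `primer_stats` is hypothetical.

```python
PRIMERS = {
    "2-r7": "AAG CGC GGC GTT TGT",
    "2-f9": "CAA AGC GTC TCC CAT TGG",
}


def primer_stats(spaced_sequence: str) -> dict:
    """Length, GC fraction and Wallace-rule Tm estimate for a short primer."""
    seq = spaced_sequence.replace(" ", "").upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return {
        "length_nt": len(seq),
        "gc_fraction": gc / len(seq),
        # Wallace rule: rough estimate, intended for short oligonucleotides
        "tm_wallace_c": 2 * at + 4 * gc,
    }


stats = {name: primer_stats(seq) for name, seq in PRIMERS.items()}
for name, values in stats.items():
    print(name, values)
```

Such estimates are only a starting point; the empirical annealing optimum is best found with a gradient thermal cycler, as was done here.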

3' Rapid amplification of cDNA ends (3' RACE)
The 3' ends of RNA1 and RNA2 were determined by transcribing viral RNA into cDNA with an adaptor (5'-ATG ACT CGA GTC GAC ATC GAT TTT TTT TTT TTT TTT TTT TTT TTT TTV N -3') which included a poly(T) tract (adaptor and adaptor-specific primer modified after Sambrook and Russell, 2001). After its synthesis, cDNA was amplified in PCR with virus-specific forward primers and an adaptor-specific reverse primer (5'-ATG ACT CGA GTC GAC ATC GA -3'). PCR products were cloned in E. coli and sequenced with virus-specific forward primers. Control experiments with E. coli poly(A) polymerase (New England Biolabs, Ipswich, MA) indicated the lack of internal annealing sites for poly(T), thus confirming that both viral RNA segments featured a poly(A) tract at their 3' ends.
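The structure of the 3' RACE adaptor can be made explicit by splitting it into its parts. The following Python sketch (with the hypothetical helper `split_adaptor`) checks that the adaptor-specific primer is the 5' portion of the adaptor and separates the poly(T) tract from the two-base anchor. In IUPAC notation, V stands for A, C or G and N for any base, so the anchor positions the adaptor at the start of the poly(A) tract.

```python
ADAPTOR = (
    "ATG ACT CGA GTC GAC ATC GAT TTT TTT TTT TTT TTT TTT TTT TTT TTV N"
).replace(" ", "")
ADAPTOR_PRIMER = "ATG ACT CGA GTC GAC ATC GA".replace(" ", "")


def split_adaptor(adaptor: str, primer: str) -> tuple:
    """Split the adaptor into primer-binding part, poly(T) tract and anchor."""
    if not adaptor.startswith(primer):
        raise ValueError("primer is not the 5' portion of the adaptor")
    tail = adaptor[len(primer):]
    poly_t = tail.rstrip("VN")        # strip the degenerate 3' anchor bases
    anchor = tail[len(poly_t):]
    if set(poly_t) != {"T"}:
        raise ValueError("expected a pure poly(T) tract")
    return primer, poly_t, anchor


primer_part, poly_t, anchor = split_adaptor(ADAPTOR, ADAPTOR_PRIMER)
print(len(primer_part), len(poly_t), anchor)
```

With the sequences given above, the adaptor resolves into a 20 nt primer-binding part, a tract of 27 T residues and the anchor VN.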

5' Rapid amplification of cDNA ends (5' RACE)
The 5' ends were determined using the SMART RACE cDNA Amplification Kit (Clontech Laboratories, Carlsbad, CA).
A second 5' RACE technique based on purified viral dsRNA was carried out [16] in order to ascertain the 5' ends of RNA1 and RNA2. The results of the second 5' RACE experiments confirmed the 5' sequence of the RNA1 segment. For the RNA2 segment, the 5' end determined with the SMART RACE kit was extended by an additional 44 nt.

Comparative sequence analysis
Viral RNA of different P. halstedii isolates was transcribed into cDNA using random hexamer primers. PCR was then conducted using primer sets for RNA1 and RNA2, respectively. After submitting the PCR product to agarose gel electrophoresis, a single DNA fragment resulted for each primer pair. These fragments were purified and directly sequenced using one primer of the corresponding primer pair for sequencing. Both strands were sequenced at least twice.

Sequence analysis
Sequence data were subjected to BLAST analysis and aligned using the software BioEdit (version 7.0.5.3; Hall, 1999). The GenBank accession numbers for the different PhV isolates are given in Table 1. The GenBank accession numbers for SmV A are AB083060 (RNA1), AB083061 (RNA2) and AB083062 (RNA3).

Genome organization
Based on the recent comparison of a partial nucleotide sequence of PhV in various samples of P. halstedii [1], the viral isolate Ph8-99 was selected for complete sequencing of the two ssRNA segments and genome analysis.
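The segment and ORF lengths reported in the abstract can be cross-checked for internal consistency: for each segment, 5' UTR, ORF and 3' UTR should add up to the total length (excluding the poly(A) tract), and each ORF length should be a multiple of three. The Python sketch below illustrates this check. The class name `Segment` is hypothetical, and the residue count assumes that the stated ORF length includes the stop codon, which is consistent with the 375 amino acids reported for ORF2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    utr5_nt: int
    orf_nt: int
    utr3_nt: int
    total_nt: int  # excluding the 3' poly(A) tract

    def is_consistent(self) -> bool:
        """UTRs and ORF add up to the total; ORF is a whole number of codons."""
        parts = self.utr5_nt + self.orf_nt + self.utr3_nt
        return parts == self.total_nt and self.orf_nt % 3 == 0

    def residues(self) -> int:
        """Encoded amino acids, assuming the ORF length includes the stop codon."""
        return self.orf_nt // 3 - 1


rna1 = Segment("RNA1", utr5_nt=18, orf_nt=2745, utr3_nt=30, total_nt=2793)
rna2 = Segment("RNA2", utr5_nt=164, orf_nt=1128, utr3_nt=234, total_nt=1526)

for segment in (rna1, rna2):
    print(segment.name, segment.is_consistent(), segment.residues())
```

Both segments pass the check; under the stated assumption, ORF1 would encode 914 amino acids and ORF2 the 375 amino acids given in the text.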

Coat protein (CP) characterization
A molecular mass of the CP of ca. 37.5 kDa [8] to 36 kDa [1] was estimated previously by SDS-PAGE analysis. This was at odds with the molecular mass of 40 kDa calculated for the 375 amino acids deduced from the nucleotide sequence of ORF2. Therefore, the CP band was excised from an SDS-PAGE gel, digested with trypsin and analyzed by mass spectrometry. Nano-LC-ESI-MS/MS analysis showed sequence coverage of 62% of the amino acid sequence deduced from ORF2. The first 23 amino acids of the N-terminal sequence of CP were not covered by tryptic peptides, and the peptide most proximal to the N-terminus of CP, comprising the amino acids 24-35 (DYTVQSNSIVQR), had a non-tryptic cleavage site at its N-terminus (data not shown). This suggested that the N-terminus of the CP might be proteolytically processed in vivo, leading to the lower molecular mass observed in SDS-PAGE experiments. In order to verify this, the N-terminal sequence of the CP was analyzed by Top-Down MALDI-TOF analysis [17]. Two major in-source decay (ISD) fragments of 1479.74 Da and 1765.93 Da were observed by Top-Down MALDI-TOF analysis (Figure 1A). Fragmentation of the smaller ISD fragment revealed the peptide sequence DYTVQSNSIVQR (Figure 1B), whereas the larger ISD fragment covered the sequence DYTVQSNSIVQRSLR (data not shown). Therefore, the Top-Down MALDI-TOF sequence analysis determined the start of the N-terminal sequence of CP at amino acid 24 and confirmed the result from the nano-LC-ESI-MS/MS experiment.
These results suggest that the CP of PhV is processed in vivo at its N-terminus by an as yet unknown protease to a final size of 352 amino acids and a mass of 38.0 kDa, which is consistent with the results obtained by SDS-PAGE analysis. Similar processing of the N-terminus was assumed for CP2 of SmV A [3].
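The residue numbering behind these statements follows from simple arithmetic, sketched below in Python. The helper name `residue_span` is illustrative; the only inputs are the figures given in the text (375 deduced amino acids, mature N-terminus at residue 24, and the two peptide sequences).

```python
FULL_LENGTH_AA = 375        # amino acids deduced from ORF2
N_TERMINUS_START = 24       # first residue of the mature CP


def residue_span(start: int, peptide: str) -> tuple:
    """First and last residue number (1-based, inclusive) of a peptide."""
    return start, start + len(peptide) - 1


short_span = residue_span(N_TERMINUS_START, "DYTVQSNSIVQR")
long_span = residue_span(N_TERMINUS_START, "DYTVQSNSIVQRSLR")

# Removing residues 1-23 leaves the processed coat protein
processed_length = FULL_LENGTH_AA - (N_TERMINUS_START - 1)

print(short_span, long_span, processed_length)
```

The shorter peptide maps to residues 24-35, the longer one to residues 24-38, and removal of the first 23 residues leaves the 352 amino acids stated above.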

Viral nucleotide sequence comparison among different isolates of P. halstedii
Variation in the nucleotide sequences of PhV from P. halstedii isolates of different pathogenicity and origin was assessed using virus-specific primers for large fragments of both RNA segments.
In terms of RNA1, the two French isolates (Ph8-99 and Ph1-00) were identical. The samples from Germany (Ph9-98), Hungary (Ph19-01), and the USA (Ph4-93) each differed from isolate Ph8-99 in single nucleotides at different positions. Only the sequence variation in the German sample led to an amino acid exchange in the deduced amino acid sequence.
In terms of RNA2, Ph8-99 differed in a single nucleotide from all other samples, causing an amino acid exchange from alanine to valine. The Hungarian isolate Ph19-01 differed from isolate Ph8-99 in one nucleotide, which led to an amino acid exchange from isoleucine to methionine. The French isolate Ph1-00 differed from Ph8-99 in two nucleotides, causing a single amino acid exchange from glutamine to serine. The German isolate Ph5-05 showed one nucleotide insertion in the 3'-UTR and additionally one nucleotide exchange. The US isolate Ph4-93 showed only the nucleotide insertion found in the German isolate Ph5-05 but not the additional nucleotide exchange. Since the insertion took place in the 3'-UTR, it did not affect the reading frame and was without consequences for the structure or function of the CP.
This suggests that alterations in these highly conserved sequences of RNA1 and RNA2 may be detrimental to the self-assembly process during virus propagation.
Additionally, five infected herbarium specimens of sunflower were tested to assess the applicability of PhV sequence population studies to a broader set of samples. The herbarium specimens had been stored in the herbarium of the University of Hohenheim for up to nine years. Small fragments between 410 and 650 nt of RNA2 were amplified and sequenced. Again, the sequence variation of the virus in these herbarium specimens was insignificant (data not shown).
The results of this study suggested the presence of a single virus in the tested P. halstedii isolates.
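To put the reported variation of 0.3% or less into perspective, the calculation behind such a figure can be sketched in Python. The function names and the two short example sequences are hypothetical and serve only to illustrate the computation on pre-aligned sequences of equal length; they are not data from this study.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of differing positions between two aligned sequences.

    Gap characters count as differences, so a single-nucleotide insertion
    contributes one differing position.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * differences / len(seq_a)


def max_differences(length_nt: int, per_mille: int = 3) -> int:
    """Largest whole number of differing positions within a per-mille limit."""
    return (per_mille * length_nt) // 1000


# Hypothetical toy alignment with a single substitution in 10 positions
reference = "ACGTACGTAC"
variant = "ACGTACGTAT"
print(percent_difference(reference, variant))

# Differences compatible with 0.3% in fragments of ca. 1.9 kb and ca. 1.4 kb
print(max_differences(1900), max_differences(1400))
```

A limit of 0.3% corresponds to at most five differing positions in a 1.9 kb fragment and four in a 1.4 kb fragment, which matches the scale of the one or two nucleotide differences per isolate described above.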

Comparison of PhV with other viruses
A preliminary study revealed that PhV and SmV A shared several morphological, biochemical and molecular characteristics. Both viruses were isometric and showed a granular surface. PhV measured ca. 37 nm in diameter, whereas SmV A was slightly larger (ca. 40 nm in diameter). Both viral genomes are encoded in ssRNA. In SmV A, three RNA segments were detected whereas PhV only contained two segments [1,3-5]. The genome organization was now studied and compared to other viruses.
SmV A RNA1 with 2928 nt coded for the RdRp on ORF1a and another protein of unknown function on ORF1b. In PhV, RNA1 with its 2793 nt (excluding the 3' poly(A) tract) coded analogously for the RdRp. An ORF corresponding to ORF1b of SmV A was not detected in PhV RNA1 (Table 3).
The deduced amino acid sequence of PhV ORF1 (RdRp) showed similarity of ca. 47% to the deduced amino acid sequence of SmV A ORF1a (RdRp). Moreover, an amino acid sequence similarity of ca. 40% was observed between PhV RdRp and RdRp of viruses within the Nodaviridae family.
RNA2 of SmV A with a length of 1981 nt coded for the two CP. In PhV, RNA2 of 1526 nt (excluding the 3' poly(A) tract) was found to code for the single CP (Table 3). Between ORF2 of PhV and SmV A ORF2, amino acid sequence similarity of ca. 56% was observed. Within the first ten amino acids of the N-terminal sequences, PhV CP (DYTVQSNSIV) and CP2 of SmV A (DYKVSQNSLV) featured six identical amino acid residues. Two further amino acids, QS (PhV) and SQ (SmV A), respectively, were interchanged. Amino acid sequence similarity of ca. 37% was observed between the CP of PhV and the CP of viruses within the Tombusviridae (e.g. Pelargonium leaf curl virus, Tomato bushy stunt virus).
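The comparison of the two N-terminal decapeptides can be reproduced with a position-by-position check. The following Python sketch is a simple illustration (the function name `compare_positions` is hypothetical); it counts identical residues and reports the 1-based positions that differ.

```python
PHV_CP_NTERM = "DYTVQSNSIV"    # PhV CP, first ten residues of the mature protein
SMVA_CP2_NTERM = "DYKVSQNSLV"  # CP2 of SmV A


def compare_positions(seq_a: str, seq_b: str) -> tuple:
    """Number of identical residues and 1-based positions that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    mismatches = [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
    return len(seq_a) - len(mismatches), mismatches


identical, mismatches = compare_positions(PHV_CP_NTERM, SMVA_CP2_NTERM)

# Positions 5-6 carry the same two residues in reversed order (QS vs. SQ)
swapped = PHV_CP_NTERM[4:6] == SMVA_CP2_NTERM[4:6][::-1]

print(identical, mismatches, swapped)
```

Six of the ten positions are identical; of the four differing positions (3, 5, 6 and 9), positions 5 and 6 hold the interchanged pair.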
In PhV, RNA1 and RNA2 both had poly(A) tracts at their 3' termini whereas in SmV A poly(A) tracts were lacking at the 3'-termini of all three viral RNA (Table 3).

Conclusions
The complete nucleotide sequence of PhV was established. PhV showed similarities to SmV A and viruses within the Tombusviridae family as well as Nodaviridae family.
The sequence data of several viral isolates suggested that there was a single virus type in different P. halstedii isolates. Viral sequence variation did not account for different pathotypes of the oomycete P. halstedii.