Identification and characterization of a maize-associated mastrevirus in China by deep sequencing small RNA populations
© Chen et al. 2015
Received: 4 July 2015
Accepted: 16 September 2015
Published: 5 October 2015
Maize streak Reunion virus (MSRV) is a member of the genus Mastrevirus in the family Geminiviridae. Of the diverse and increasing number of mastrevirus species found so far, only Wheat dwarf virus and Sweetpotato symptomless virus 1 have been discovered in China. Recently, a novel, unbiased approach based on deep sequencing of small interfering RNAs (siRNAs), followed by de novo assembly of the siRNA reads, has opened new opportunities for plant virus identification.
Small RNAs from maize leaf samples were deep sequenced for virus identification. Subsequently, PCR, rolling circle amplification and Southern blot assays were used to confirm the presence of a mastrevirus.
Maize streak Reunion virus Yunnan isolate (MSRV-[China:Yunnan 06:2014], abbreviated to MSRV-YN) was identified from maize collected from Yunnan Province, China, by small RNA deep sequencing. The complete genome of this virus was determined to be 2,880 nucleotides long by conventional sequencing. A phylogenetic analysis showed it shared 96.3 % nucleotide sequence identity with the isolate of Maize streak Reunion virus from La Reunion Island. To our knowledge, this is the first identification of MSRV in China. Analysis of the virus-derived small interfering RNA (vsiRNA) profile showed that the most abundant MSRV-YN vsiRNAs were 21, 22 and 24 nt long and biased for A and G at their 5’-terminal residue. There was a slightly higher representation of MSRV-YN vsiRNAs derived from the virion-sense strand than from the complementary-sense strand. Moreover, MSRV-YN vsiRNAs were not uniformly distributed along the genome, and hotspots were detected in the movement protein- and coat protein-coding regions.
A mastrevirus, MSRV-YN, collected in Yunnan Province, China, was identified by small RNA deep sequencing. The vsiRNA profile of MSRV-YN was characterized, which may provide insight into the host RNA silencing defense induced by MSRV-YN and guide the design of RNAi-based antiviral strategies against MSRV-YN.
Viruses in the family Geminiviridae are taxonomically classified into seven genera (Begomovirus, Curtovirus, Mastrevirus, Topocuvirus, Becurtovirus, Turncurtovirus, Eragrovirus) based on insect vector, host range and genome organization. According to their genome components, geminiviruses can be divided into two groups: bipartite geminiviruses with two similar-sized single-stranded (ss) DNA components and monopartite geminiviruses with one ssDNA component. The genus Mastrevirus contains species with a monopartite genome of approximately 2.7 kb encapsidated in geminate virions. The genome encodes four proteins, and the coding regions are separated by two intergenic regions (large intergenic region [LIR] and small intergenic region [SIR]). The two proteins encoded on the virion-sense strand are the movement protein (MP), functioning in cell-to-cell movement, and the coat protein (CP), encapsidating the virion-sense ssDNA and acting as the nuclear shuttle protein (NSP) for viral DNA. The complementary-sense strand encodes the replication-associated proteins Rep and RepA. Rep is expressed from a spliced transcript spanning the C1 and C2 ORFs, while RepA is expressed from a transcript spanning the C1 ORF.
Members of the genus Mastrevirus are mainly transmitted by leafhoppers and can infect a wide variety of monocotyledonous and dicotyledonous plants. Currently, there are 29 recognized species in the genus Mastrevirus based on ICTV classification. Maize streak Reunion virus (MSRV) is an emerging member of the genus Mastrevirus that has been reported only in La Reunion Island and Nigeria and shares less than 57 % genome-wide identity with all other known mastreviruses [4, 5]. Apart from Maize streak virus (MSV), MSRV is the only mastrevirus species detected in maize. Of the diverse and increasing number of Mastrevirus species found so far, only Wheat dwarf virus (WDV) and Sweetpotato symptomless virus 1 (SPSMV-1) have been discovered in China [6, 7].
Recently, a novel, unbiased approach for plant virus identification has been developed based on deep sequencing and assembly of virus-derived small interfering (si) RNAs. Upon virus infection, the host can initiate an efficient defense against invading nucleic acids by cleaving viral double-stranded RNA or imperfectly folded, self-complementary single-stranded viral RNA with Dicer-like proteins (DCLs), which generates different classes of small RNAs (sRNAs). These viral siRNAs overlap in sequence and can be assembled into long contiguous fragments (contigs) that map to the invading viral genome, which provides the theoretical basis for identifying viruses by small RNA deep sequencing. Unlike traditional virus diagnostic methods, which rely mainly on serological or molecular characterization, the main advantage of this approach is that it requires no a priori knowledge of the pathogen.
In this paper, we report the first identification, using high-throughput sequencing of small RNA populations, of a DNA virus infecting maize in China. Furthermore, the genome organization and a phylogenetic analysis suggest that the virus is a member of the genus Mastrevirus (family Geminiviridae) and is most closely related to the MSRV isolate from La Reunion Island.
Results and discussion
De novo assembly of small RNAs and identification of the Maize streak Reunion virus YN isolate from maize
Table: Deep sequencing data and assembly of small RNAs
Row labels: reads after removing adaptor (clean reads); 18–28 nt reads; total contigs after assembling small RNA; contigs matching Maize streak Reunion virus; reads matching Maize streak Reunion virus-YN genome (0 mismatch)
Primer pair F/R1 was designed based on the assembled viral contigs, and a nucleotide fragment of 720 bp was obtained. A BLASTn search of the NCBI database with this fragment showed that it shares 98 % identity with the MSRV isolate (GenBank: JQ624880) at a query coverage of 100 % (data not shown), further confirming the presence of an MSRV isolate in the sample. Thus, we tentatively named the isolate MSRV-[China:Yunnan 06:2014], abbreviated to MSRV-YN.
PCR with primer pair F/R1 was used to investigate the incidence of MSRV-YN among the 22 maize samples collected from Yuanmou County, Yunnan Province. Amplicons of the expected size (about 700 bp) were generated in 10 of the 22 tested samples, corresponding to an infection rate for isolate MSRV-YN of 45.5 % (data not shown). To our knowledge, this is the first report of an isolate of MSRV in China.
Full-length cloning, RCA and Southern blot detection of MSRV-YN
Back-to-back primers F/R designed based on the assembled contigs were used to amplify the full-length genomic sequences of MSRV-YN from a crude extract of total DNA from the maize sample, yielding a fragment of about 3.0 kb. This fragment was then cloned and sequenced by conventional Sanger sequencing and assembled by SeqMan (Lasergene package, Version 7.1.0), generating a complete sequence 2880 bp long (GenBank: KT717933). A BLASTn search using the completed genome of MSRV-YN in NCBI revealed significant similarity (98 %) to the isolate of MSRV from La Reunion Island (GenBank: JQ624880).
Genomic organization and sequence analysis of MSRV-YN
Table: Identities (%) between MSRV-YN and other mastreviruses based on nucleotide and amino acid alignment
Columns: genome size (nt); nucleotide identity (%); amino acid similarity (%)
Viruses compared: Wheat dwarf virus; Oat dwarf virus; Eragrostis minor streak virus; Chickpea yellows mastrevirus; Chickpea chlorosis virus; Tobacco yellow dwarf virus; Paspalum dilatatum striate mosaic virus; Paspalum striate mosaic virus; Bromus catharticus striate mosaic virus; Wheat dwarf India virus; Maize streak Reunion virus
Characterization of viral derived siRNAs from MSRV-YN
Previous studies have indicated that the recruitment of sRNAs into specific AGO complexes depends mainly on the identity of the 5’-terminal nucleotide of the sRNA [16, 17]. To deduce the potential interactions between MSRV-YN-derived siRNAs and distinct AGO complexes, the 5’-nucleotide specificity of 20–25 nt vsiRNAs was analyzed. The results revealed a prevalence of adenosine (A) and guanosine (G) over cytosine (C) and uridine (U) at the 5’-terminal position of vsiRNAs (Fig. 4b). In A. thaliana, AGO2 and AGO4 predominantly favor small RNAs with a 5’-terminal A, AGO1 preferentially recruits sRNAs with a 5’-terminal U, and AGO5 predominantly binds small RNAs with a 5’-terminal C [16, 17]. It is still unknown which of the remaining AGOs (if any) preferentially bind sRNAs with a 5’-terminal G. Our results suggest that MSRV-YN-derived sRNAs may be preferentially associated with AGO2 and AGO4, whereas their association with AGO1 is relatively lower. Notably, reports on the 5’-terminal residue bias of geminivirus-derived siRNAs are very diverse: biases for U and A were observed in CabLCV and TYLCCNV, whereas a 5’-terminal C was preferred in TYLCSV [14, 15, 18]. In our case, a biased accumulation of 5’-terminal G was observed (Fig. 4b), which is not very common in plants. In many reported cases, G was the least preferred base [14, 18, 19]. Although the biological meaning of these associations is not yet clear, this finding might suggest that diverse AGOs are involved in vsiRNA sorting in different plant species.
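The 5’-terminal nucleotide analysis described above can be illustrated with a minimal Python sketch. It assumes the vsiRNA reads are available as plain sequence strings. The function name `five_prime_composition` and its parameters are hypothetical and are not taken from the pipeline used in this study.

```python
from collections import Counter


def five_prime_composition(reads, min_len=20, max_len=25):
    """Return the percentage of A, C, G and U at the 5' end of size-selected reads.

    Hypothetical helper for illustration only; the study's own scripts are not shown.
    """
    counts = Counter()
    for read in reads:
        # Sequencing output is usually written with DNA letters, so report T as U.
        seq = read.upper().replace("T", "U")
        if min_len <= len(seq) <= max_len and seq[0] in "ACGU":
            counts[seq[0]] += 1
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGU"}
    return {base: 100.0 * counts[base] / total for base in "ACGU"}
```

Applied to the 20–25 nt reads that map to the viral genome, the returned percentages correspond to the kind of summary plotted in Fig. 4b.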
When vsiRNA strand polarity was evaluated using the 20–25 nt vsiRNAs, a slightly higher representation of MSRV-YN vsiRNAs derived from the virion-sense strand (positive vsiRNAs) was unexpectedly observed, accounting for 63.46 % of the total vsiRNAs irrespective of length (Fig. 4c), suggesting that the strand bias does not differ among the DCLs that generate the different size classes. Strand preference can arise when DCLs cleave dsRNAs formed during replication of RNA viruses, dsRNAs formed by bidirectional transcription of a circular viral DNA, or highly structured single-stranded viral RNAs [15, 20, 21]. We speculate that more frequent cleavage of virion-sense transcripts might be one reason for the asymmetric strand distribution of MSRV-YN vsiRNAs.
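As an illustration of how such a strand-polarity percentage can be computed, the following Python sketch counts reads that match the virion-sense sequence or its reverse complement exactly. It assumes a circular genome supplied as the virion-sense sequence, and the names `reverse_complement` and `strand_polarity` are hypothetical. A read that matches both strands is counted as virion-sense here, which is a simplification.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def strand_polarity(reads, genome, min_len=20, max_len=25):
    """Count exact-match reads on each strand of a circular genome.

    Hypothetical sketch: `genome` is assumed to be the virion-sense sequence.
    """
    genome = genome.upper()
    rc_genome = reverse_complement(genome)
    # Append the first bases again so reads spanning the origin still match.
    sense_ref = genome + genome[: max_len - 1]
    antisense_ref = rc_genome + rc_genome[: max_len - 1]
    sense = antisense = 0
    for read in reads:
        seq = read.upper()
        if not min_len <= len(seq) <= max_len:
            continue
        if seq in sense_ref:
            sense += 1
        elif seq in antisense_ref:
            antisense += 1
    mapped = sense + antisense
    percent_sense = 100.0 * sense / mapped if mapped else 0.0
    return {"sense": sense, "antisense": antisense, "percent_sense": percent_sense}
```

Running the function separately for each read length would show whether the proportion of virion-sense reads changes with size class.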
Mastreviruses are receiving increasing attention because they have been associated with many serious crop diseases. Of the 29 currently recognized mastrevirus species, 23 include viruses that can infect monocotyledonous hosts and 6 include viruses that infect dicotyledonous hosts. To date, other than MSV, MSRV is the only mastrevirus species that has been sampled from maize with maize streak disease symptoms. Interestingly, MSRV was also detected in wild grasses such as Setaria barbata and Rottboellia sp. in Nigeria, suggesting expanded host and geographical ranges for this virus. This first report of an MSRV isolate in China indicates that this virus is likely to have a far greater diversity and distribution than has been appreciated. Because 10 of 22 samples from Yunnan Province, China, were infected with MSRV-YN, an infection rate of 45.5 %, further work on the epidemiology of MSRV-YN in China is needed.
Recent advances in deep sequencing have greatly improved the accuracy and accelerated the rate of virus discovery. Our study, the first to use deep sequencing to characterize siRNAs derived from MSRV-YN, should provide guidelines for designing RNAi-based antiviral strategies against MSRV-YN.
Materials and methods
Sample collection and total RNA extraction
In May 2014, 22 maize (Zea mays) leaf samples were collected from five maize fields in Yuanmou County, Yunnan Province, China; the distances between the fields ranged from 100 m to more than 2000 m. These samples, which displayed a range of disease symptoms such as yellowing and chlorotic streaks along leaf veins (data not shown), were stored at −80 °C. Total RNA was extracted from 0.1 g of symptomatic maize leaf according to the manufacturer’s instructions (Invitrogen, Carlsbad, USA). The purified RNA preparations were evaluated using a spectrophotometer (Nanodrop, Thermo Fisher Scientific, Waltham, MA, USA) and agarose gel electrophoresis. The RNA sample of better quality, designated YM, was shipped on dry ice to the Beijing Genomics Institute (BGI) for small RNA library construction and sequencing.
Small RNA deep sequencing and data processing
Sequencing was performed according to the manufacturer’s protocol using the high-throughput Illumina HiSeq-2000 sequencing platform. Data processing was conducted using a custom bioinformatics pipeline. Raw data were analyzed using an in-house Perl script, and trimmed sRNA sequences of 18–28 nt were collected for subsequent analysis. The Velvet program was used for de novo assembly, with 17 nucleotides set as the minimal overlapping length (k-mer) required for joining two sRNAs into a contig. The assembled contigs were then aligned against the nonredundant (nr) nucleotide sequences in the National Center for Biotechnology Information (NCBI) database using BLASTn (nucleotide BLAST) with standard parameters.
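To make the idea of size selection followed by overlap-based joining concrete, the following Python sketch filters reads to 18–28 nt and greedily merges any two sequences that share a suffix-prefix overlap of at least 17 nt. It is a toy illustration only: Velvet builds de Bruijn graphs, whereas this sketch substitutes a much simpler greedy merge, and the function names are hypothetical.

```python
def overlap_length(left, right, min_overlap):
    """Longest suffix of `left` that equals a prefix of `right` (0 if below min_overlap)."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=17, min_len=18, max_len=28):
    """Toy greedy assembler; NOT the de Bruijn graph method implemented in Velvet."""
    # Size-select the reads and collapse duplicates.
    contigs = sorted({r.upper() for r in reads if min_len <= len(r) <= max_len})
    # Drop reads that are fully contained in another read.
    contigs = [c for c in contigs if not any(c != o and c in o for o in contigs)]
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs
        # Join the best-overlapping pair and continue with one sequence fewer.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
```

The quadratic pair search is acceptable for a handful of sequences but would not scale to a full small RNA library, which is one reason graph-based assemblers are used in practice.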
Total DNA extraction and cloning of full-length viral genome
A CTAB-based extraction method was used to obtain total DNA from 0.1 g of maize leaf sample. The full-length viral genome was obtained by PCR amplification using a pair of adjacent primers, F/R (Additional file 1: Table S1), which were designed based on the assembled contigs. The resulting PCR products were then cloned using the ZeroBack Fast Ligation Kit (TIANGEN BIOTECH CO.) and sequenced by conventional Sanger dideoxy sequencing. The DNAStar Lasergene package (Version 7.1.0) was used to assemble the full-length nucleotide sequence of the viral genome.
Detection of MSRV-YN by rolling circle amplification (RCA) and Southern blot
Rolling circle amplification of circular DNA was conducted using the TempliPhi 500 Amplification Kit (GE Healthcare) according to the manufacturer’s protocol. Total DNA (10 to 100 ng) was dissolved in 5 μl sample buffer, denatured at 95 °C for 3 min and then immediately cooled on ice for 10 min. Subsequently, 5 μl reaction buffer and 0.1 μl enzyme mix were added, and the reaction was run at 30 °C for 18 h. The reaction was then stopped by incubation at 65 °C for 5 min. The RCA products were digested with the single restriction enzyme BamHI, and the fragments were separated on 1 % agarose gels, excised and recovered using a Gel Extraction Kit (Axygen Scientific Inc). For Southern blot analysis, 20 μg of total DNA and 20 ng of recovered RCA products were hybridized with a virus-specific probe according to standard Southern blot protocols. The probe was synthesized using the primer pair F1/R1 (Additional file 1: Table S1) and digoxigenin-labelled (Roche, DIG-High Prime DNA Labeling and Detection Starter Kit) for chromogenic detection with alkaline phosphatase (AP) after hybridization according to the manufacturer’s instructions.
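The logic of digesting RCA products with an enzyme that cuts the circular genome once can be sketched in Python. The sketch assumes, for illustration only, that the genome carries exactly one BamHI recognition site (GGATCC, cut after the first G). In that case every repeat of the RCA concatemer is released as a linear fragment of unit genome length. The function names are hypothetical.

```python
def circular_sites(genome, site="GGATCC"):
    """Start positions (0-based) of `site` in a circular sequence."""
    genome = genome.upper()
    # Extend by site length - 1 so a site spanning the origin is found.
    extended = genome + genome[: len(site) - 1]
    positions = []
    start = extended.find(site)
    while start != -1:
        positions.append(start)
        start = extended.find(site, start + 1)
    return positions


def linearize_at_single_site(genome, site="GGATCC", cut_offset=1):
    """Return the unit-length linear fragment produced by a single cut.

    cut_offset=1 reflects BamHI cutting between the first G and GATCC.
    Raises ValueError when the circle does not carry exactly one site.
    """
    genome = genome.upper()
    positions = circular_sites(genome, site)
    if len(positions) != 1:
        raise ValueError(f"expected exactly one site, found {len(positions)}")
    cut = (positions[0] + cut_offset) % len(genome)
    return genome[cut:] + genome[:cut]
```

For a 2,880 nt genome with one such site, the expected product is a single band of about 2.9 kb on the gel.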
Nucleotide sequence analysis and phylogenetic analysis
Open reading frames (ORFs) were predicted using the ORF finder function of the Snap Gene software. Nucleotide and amino acid identities were computed using the MegAlign software (Version 7.1.0). Phylogenetic trees were built with the neighbor-joining (NJ) algorithm within the Molecular Evolutionary Genetics Analysis program package (MEGA 6). Multiple alignments of nucleotide sequences were conducted using Clustal W. Amino acid sequences were aligned using the MUSCLE algorithm. The reliability of each branch was evaluated with a bootstrap of 1,000 replicates. Other parameters were set to the default value. Selected sequences used in the phylogenetic analysis are listed in supplementary information (Additional file 1: Table S3).
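For readers unfamiliar with pairwise identity values, the following Python sketch shows one common way to compute percent identity from two sequences that have already been aligned. It assumes that gaps are written as "-" and that a gap opposite a residue counts as a mismatch. The exact formula used by MegAlign may differ, and the function name is hypothetical.

```python
def pairwise_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are ignored,
    and a gap opposite a residue is scored as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Different gap conventions can shift the reported value, so identities computed with different tools are not always directly comparable.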
Viral derived siRNA analysis
Small RNA mapping was conducted with Bowtie 1.0 with zero mismatches, and the results were exported to Microsoft Excel for further analysis. The program MISIS was used to analyze maps of small RNAs derived from viruses and genomic loci generating multiple small RNAs. Structural analysis of vsiRNA hotspots in the viral genome was carried out using the RNAfold software. Secondary structures of RNAs were predicted using thermodynamic prediction of the minimal free energy (MFE).
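Once reads have been mapped, hotspots can be located from per-position coverage. The Python sketch below assumes the mapping results are available as (start, length) pairs with 0-based start positions on a circular genome. The window size and fold threshold are arbitrary illustrative choices and do not reproduce the MISIS analysis used here.

```python
def circular_coverage(hits, genome_length):
    """Per-position read depth on a circular genome from (start, length) hits."""
    depth = [0] * genome_length
    for start, length in hits:
        for offset in range(length):
            # Wrap around the origin for reads that span it.
            depth[(start + offset) % genome_length] += 1
    return depth


def hotspot_windows(depth, window=50, fold=2.0):
    """Non-overlapping windows whose mean depth is at least `fold` times the overall mean.

    Returns a list of (start, end, mean_depth) tuples; `window` and `fold`
    are hypothetical parameters chosen for illustration.
    """
    overall = sum(depth) / len(depth)
    spots = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        mean = sum(chunk) / len(chunk)
        if overall > 0 and mean >= fold * overall:
            spots.append((start, start + len(chunk), mean))
    return spots
```

Comparing the reported windows with the ORF coordinates would indicate which coding regions, such as those for MP and CP, accumulate the most vsiRNAs.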
This research was supported by the General Administration of Quality Supervision, Inspection and Quarantine of China (grant 201310068) and the Ministry of Education of China (313052).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Varsani A, Navas-Castillo J, Moriones E, Hernández-Zepeda C, Idris A, et al. Establishment of three new genera in the family Geminiviridae: Becurtovirus, Eragrovirus and Turncurtovirus. Arch Virol. 2014;159:2193–203.
2. King AM, Adams MJ, Carstens EB, Lefkowitz EJ (Eds.). Virus taxonomy: ninth report of the International Committee on Taxonomy of Viruses (Vol. 9). New York: Elsevier; 2012;352–355.
3. Wright EA, Heckel T, Groenendijk J, Davies JW, Boulton MI. Splicing features in maize streak virus virion- and complementary-sense gene expression. Plant J. 1997;12:1285–97.
4. Pande D, Kraberger S, Lefeuvre P, Lett JM, Shepherd DN, Varsani A, et al. A novel maize-infecting mastrevirus from La Reunion Island. Arch Virol. 2012;157:1617–21.
5. Oluwafemi S, Kraberger S, Shepherd DN, Martin DP, Varsani A. A high degree of African streak virus diversity within Nigerian maize fields includes a new mastrevirus from Axonopus compressus. Arch Virol. 2014;159:2765–70.
6. Wang YJ, Zhang DS, Zhang ZC, Wang S, Qiao Q, Qin Y, et al. First Report on Sweetpotato Symptomless Virus 1 (genus Mastrevirus family Geminiviridae) in sweetpotato in China. Plant Dis. 2015;99:1042.
7. Xie J, Wang X, Liu Y, Zhou G. First report of the occurrence of Wheat dwarf virus in wheat in China. Plant Dis. 2007;91:111.
8. Kreuze JF, Perez A, Untiveros M, Quispe D, Fuentes S, Barker I, et al. Complete viral genome sequence and discovery of novel viruses by deep sequencing of small RNAs: a generic method for diagnosis, discovery and sequencing of viruses. Virology. 2009;388:1–7.
9. Wu Q, Luo Y, Lu R, Lau N, Lai EC, Li WX, et al. Virus discovery by deep sequencing and assembly of virus-derived small silencing RNAs. Proc Natl Acad Sci. 2010;107:1606–11.
10. Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
11. Kumar J, Singh SP, Kumar J, Tuli R. A novel mastrevirus infecting wheat in India. Arch Virol. 2012;157:2031–4.
12. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–80.
13. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.
14. Yang X, Wang Y, Guo W, Xie Y, Xie Q, Fan L, et al. Characterization of small interfering RNAs derived from the geminivirus/betasatellite complex using deep sequencing. PLoS One. 2011;6:e16928.
15. Aregger M, Borah BK, Seguin J, Rajeswaran R, Gubaeva EG, Zvereva AS, et al. Primary and secondary siRNAs in geminivirus-induced gene silencing. PLoS Pathog. 2012;8:e1002941.
16. Mi S, Cai T, Hu Y, Chen Y, Hodges E, Ni F, et al. Sorting of small RNAs into Arabidopsis argonaute complexes is directed by the 5’ terminal nucleotide. Cell. 2008;133:116–27.
17. Takeda A, Iwasaki S, Watanabe T, Utsumi M, Watanabe Y. The mechanism selecting the guide strand from small RNA duplexes is different among argonaute proteins. Plant Cell Physiol. 2008;49:493–500.
18. Miozzi L, Pantaleo V, Burgyán J, Accotto GP, Noris E. Analysis of small RNAs derived from tomato yellow leaf curl Sardinia virus reveals a cross reaction between the major viral hotspot and the plant host genome. Virus Res. 2013;178:287–96.
19. Xu Y, Huang L, Fu S, Wu J, Zhou X. Population diversity of rice stripe virus-derived siRNAs in three different hosts and RNAi-based antiviral immunity in Laodelphax striatellus. PLoS One. 2012;7:e46238.
20. Csorba T, Pantaleo V, Burgyán J. RNA silencing: an antiviral mechanism. Adv Virus Res. 2009;75:35–230.
21. Wang XB, Wu Q, Ito T, Cillo F, Li WX, Chen X, et al. RNAi-mediated viral immunity requires amplification of virus-derived siRNAs in Arabidopsis thaliana. Proc Natl Acad Sci. 2010;107:484–9.
22. Donaire L, Wang Y, Gonzalez-Ibeas D, Mayer KF, Aranda MA, Llave C. Deep-sequencing of plant viral small RNAs reveals effective and widespread targeting of viral genomes. Virology. 2009;392:203–14.
23. Donaire L, Barajas D, Martínez-García B, Martínez-Priego L, Pagán I, Llave C. Structural and genetic requirements for the biogenesis of tobacco rattle virus-derived small interfering RNAs. J Virol. 2008;82:5167–77.
24. Molnár A, Csorba T, Lakatos L, Várallyay É, Lacomme C, Burgyán J. Plant virus-derived small interfering RNAs originate predominantly from highly structured single-stranded viral RNAs. J Virol. 2005;79:7812–8.
25. Ho T, Wang H, Pallett D, Dalmay T. Evidence for targeting common siRNA hotspots and GC preference by plant Dicer-like proteins. FEBS Lett. 2007;581:3267–72.
26. Du QS, Duan CG, Zhang ZH, Fang YY, Fang RX, Xie Q, et al. DCL4 targets Cucumber mosaic virus satellite RNA at novel secondary structures. J Virol. 2007;81:9142–51.
27. Gruber AR, Lorenz R, Bernhart SH, Neuböck R, Hofacker IL. The Vienna RNA Websuite. Nucleic Acids Res. 2008;36:W70–4.
28. Muhire B, Martin DP, Brown JK, Navas-Castillo J, Moriones E, Zerbini FM, et al. A genome-wide pairwise-identity-based proposal for the classification of viruses in the genus Mastrevirus (family Geminiviridae). Arch Virol. 2013;158:1411–24.
29. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.
30. Murray MG, Thompson WF. Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 1980;8:4321–6.
31. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.
32. Seguin J, Otten P, Baerlocher L, Farinelli L, Pooggin MM. MISIS: A bioinformatics tool to view and analyze maps of small RNAs derived from viruses and genomic loci generating multiple small RNAs. J Virol Methods. 2014;195:120–2.
33. Simmonds P, Karakasiliotis I, Bailey D, Chaudhry Y, Evans DJ, Goodfellow IG. Bioinformatic and functional analysis of RNA secondary structure elements among different genera of human and animal caliciviruses. Nucleic Acids Res. 2008;36:2530–46.