The complete genome sequence, occurrence and host range of Tomato mottle mosaic virus Chinese isolate

Background Tomato mottle mosaic virus (ToMMV) is a recently identified species in the genus Tobamovirus and was first reported from a greenhouse tomato sample collected in Mexico in 2013. In August 2013, ToMMV was detected on peppers (Capsicum spp.) in China. However, little is known about the molecular and biological characteristics of ToMMV.
Methods Reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE) were carried out to obtain the complete genomic sequences of ToMMV. Sap transmission was used to test the host range and pathogenicity of ToMMV.
Results The full-length genomes of two ToMMV isolates infecting peppers in Yunnan Province and Tibet Autonomous Region of China were determined and analyzed. The complete genomic sequences of both ToMMV isolates consisted of 6399 nucleotides and contained four open reading frames (ORFs) encoding 126, 183, 30 and 18 kDa proteins from the 5' to 3' end, respectively. Overall similarities of the ToMMV genome sequence to those of the other tobamoviruses available in GenBank ranged from 49.6% to 84.3%. Phylogenetic analyses of the full-genome nucleotide sequence and the amino acid sequences of its four proteins confirmed that ToMMV was most closely related to Tomato mosaic virus (ToMV). According to the genetic structure, host of origin and phylogenetic relationships, the 32 available tobamoviruses could be divided into at least eight subgroups based on the host plant family they infect: Solanaceae-, Brassicaceae-, Cactaceae-, Apocynaceae-, Cucurbitaceae-, Malvaceae-, Leguminosae-, and Passifloraceae-infecting subgroups. Testing of some solanaceous, cucurbitaceous, brassicaceous and leguminous plants in Yunnan Province and a few other parts of China showed that ToMMV has so far occurred only on peppers. However, the host range test showed that ToMMV could infect most of the tested solanaceous and cruciferous plants, and had a high affinity for the solanaceous plants.
Conclusions The complete nucleotide sequences of two Chinese ToMMV isolates from naturally infected peppers were determined. The tobamoviruses were divided into at least eight subgroups, with ToMMV belonging to the subgroup that infects plants in the Solanaceae. In China, ToMMV has so far occurred only on peppers in the field. ToMMV could infect plants in the families Solanaceae and Brassicaceae (Cruciferae) by sap transmission.

Tobamovirus is the largest of six genera in the family Virgaviridae; it consists of 25 species and 6 tentative species, with TMV as the type species [11,12]. According to the current online taxonomy released by the International Committee on Taxonomy of Viruses (ICTV), however, there are now 35 members in the genus Tobamovirus (http://www.ictvonline.org/virusTaxonomy.asp). The known tobamovirus genomes contain four open reading frames (ORFs) encoding four proteins. The two larger polypeptides of 124-132 kDa and 181-189 kDa are involved in virus replication: the 124-132 kDa protein is terminated by an amber (UAG) stop codon, and the 181-189 kDa protein is produced by readthrough of this stop codon [13]. Two other ORFs encode the 28-31 kDa movement protein (MP) and the 17-18 kDa coat protein (CP). Tobamoviruses were previously divided into three subgroups based on genomic structure and host range [13]. Min and co-workers [14] proposed that tobamoviruses could be divided into at least five subgroups according to the amino acid composition and primary structure of their CPs, and the hosts from which the viruses were originally isolated. Song et al. [15] suggested that tobamoviruses should be divided into six subgroups, adding a Passifloraceae-infecting subgroup to the genus Tobamovirus, based on phylogenetic analysis of the four tobamovirus proteins.
Until now, there has been no detailed molecular information or biological characterization of Chinese ToMMV isolates, and no report of ToMMV infecting plants other than pepper and tomato. The genome sequence and genetic diversity analysis of ToMMV will help to characterize this emerging virus and to develop appropriate detection methods. Better knowledge of the host range of ToMMV will provide a theoretical basis for monitoring, prediction, and effective prevention and control of the disease caused by ToMMV. This paper presents the complete genome sequences of two ToMMV isolates infecting peppers in China. The host range of ToMMV and the phylogeny of 32 available tobamoviruses were also analyzed.

Host range test
To test the potential hosts of ToMMV, sap from ToMMV-positive pepper plants was ground in 0.1 M PBS buffer (pH 7.2) and mechanically inoculated onto Nicotiana tabacum var. Samsun at the 4- to 5-leaf stage. The inoculated tobacco plants showed systemic mosaic symptoms, but no local lesions were found. The inoculated plants were then tested by RT-PCR with specific primers for ToMMV, TMV, ToMV, PMMoV, TMGMV, CMV, TSWV, ToCV and BBWV2, and with degenerate primers for tobamoviruses, potyviruses, poleroviruses and begomoviruses; only ToMMV was detected in the inoculated plants. So

Amplification of the complete virus genome
The complete genome sequence of each ToMMV isolate was divided into six fragments and amplified by RT-PCR using specifically designed primers. Each fragment was 1000 to 1500 bp long except the third fragment, which was about 883 bp in length. Adjacent fragments overlapped (Table 1). The six pairs of primers were designed according to the complete sequence of ToMMV (KF477193) available in GenBank. Reverse transcription using Reverse Transcriptase M-MLV (RNase H−) (TaKaRa Biotech, Dalian, China) and PCR amplification using TaKaRa Ex Taq™ (TaKaRa Biotech, Dalian, China) were conducted following the manufacturer's instructions.
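The overlap requirement for adjacent amplicons can be checked with a few lines of Python. The following is a minimal sketch only: the fragment coordinates are hypothetical placeholders chosen to match the stated size ranges (they are not the primer positions in Table 1), and the function name `adjacent_overlaps` is introduced here purely for illustration.

```python
# Hypothetical amplicon coordinates (1-based, inclusive). These are NOT the
# primer positions from Table 1; they only mimic the stated size ranges
# (1000-1500 bp, with a third fragment of 883 bp).
HYPOTHETICAL_FRAGMENTS = [
    (1, 1200),
    (1100, 2400),
    (2300, 3182),   # 883 bp, like the third fragment described in the text
    (3100, 4400),
    (4300, 5500),
    (5400, 6399),
]


def adjacent_overlaps(fragments):
    """Return the overlap length (nt) between each pair of adjacent fragments.

    Raises ValueError if two neighbouring fragments do not overlap, because
    the assembled sequence would then contain a gap.
    """
    ordered = sorted(fragments)
    overlaps = []
    for (start1, end1), (start2, end2) in zip(ordered, ordered[1:]):
        overlap = end1 - start2 + 1
        if overlap <= 0:
            raise ValueError(f"gap between fragments ending at {end1} and starting at {start2}")
        overlaps.append(overlap)
    return overlaps


fragment_lengths = [end - start + 1 for start, end in HYPOTHETICAL_FRAGMENTS]
overlaps = adjacent_overlaps(HYPOTHETICAL_FRAGMENTS)
```

With real primer coordinates substituted for the placeholders, a non-empty list of positive overlaps indicates that the six fragments can be assembled into one contiguous sequence.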

Cloning and sequencing of viral genomic fragments
All amplified products were purified with a Universal DNA Purification Kit (TIANGEN Biotech, Beijing, China) and subsequently cloned into a pMD 19-T or pBackZero-T Vector (TaKaRa Biotech, Dalian, China) based on the manufacturers' instructions. At least three independent clones from each ligation were sequenced on both strands (BGI Tech. Solutions, Shenzhen, China).

Sequence assembly and analysis
The sequence data were assembled and analyzed with DNAStar 6.0 (DNAStar Inc., Madison, USA). The complete nucleotide sequences of ToMMV-YYMLJ and ToMMV-TiLhaLJ, as well as the deduced amino acid sequences of

Genome organization of ToMMV
The complete genome sequences of the two ToMMV isolates (YYMLJ, Acc. No. KR824950; TiLhaLJ, Acc. No. KR824951) from peppers in China were identical in length to each other and had the same genomic structure as three other ToMMV tomato isolates: MX5 (KF477193) from Mexico [6], and 10-100 (KP202857) [7] and NY-13 (KT810183) from the United States. The genomes of YYMLJ, TiLhaLJ and 10-100 consisted of 6399 nucleotides (nt), with an additional "A" at nt 6212 compared with MX5 and NY-13, and encoded four ORFs. The 5' untranslated region (UTR) was 75 nt long with a so-called "Ω fragment" that had no G residue except the m7G cap at the 5' terminal nucleotide [18,19]. The 3' UTR was 201 nt long with a conserved region, 6271-TCCCTCCACTTAAATCGAAGGGTT-6294, and ended with the sequence CCCA typical of other tobamoviruses [20]. ORF1 started at nt 76 and encoded a putative protein of 126 kDa; the 183 kDa readthrough protein resulted from readthrough of a leaky UAG stop codon at nt 3426 and terminated at nt 4926, with a 54 kDa polypeptide in the readthrough region. The third ORF encoded a 30 kDa MP from nt 4910 to nt 5716, and ORF4 extended from nt 5719 to nt 6198, with an intergenic region of 2 nt between the MP and CP ORFs, and encoded an 18 kDa CP (Fig. 1).
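The coordinates quoted above can be cross-checked arithmetically. The short Python sketch below recomputes the UTR lengths, the intergenic spacer and the ORF sizes from the stated positions. It assumes that nt 3426 is the last base of the leaky UAG codon (an interpretation, not something the text states explicitly); the variable and function names are illustrative.

```python
# Coordinates are 1-based and inclusive, taken from the text for a 6399-nt genome.
GENOME_LENGTH = 6399

ORFS = {
    # Assumption: nt 3426 is the final base of the leaky UAG stop codon.
    "126 kDa": (76, 3426),
    "183 kDa": (76, 4926),
    "MP": (4910, 5716),
    "CP": (5719, 6198),
}


def codon_count(start, end):
    """Number of codons (including the stop codon) in an ORF spanning start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"ORF {start}-{end} is not a multiple of 3 nt long")
    return length // 3


five_prime_utr = ORFS["126 kDa"][0] - 1                     # bases before ORF1
three_prime_utr = GENOME_LENGTH - ORFS["CP"][1]             # bases after the CP ORF
intergenic = ORFS["CP"][0] - ORFS["MP"][1] - 1              # spacer between MP and CP
readthrough_nt = ORFS["183 kDa"][1] - ORFS["126 kDa"][1]    # region encoding the 54 kDa part
mp_overlap_nt = ORFS["183 kDa"][1] - ORFS["MP"][0] + 1      # MP start overlaps the 183 kDa ORF

codons = {name: codon_count(start, end) for name, (start, end) in ORFS.items()}
```

Under these assumptions the recomputed 5' UTR (75 nt), 3' UTR (201 nt) and intergenic region (2 nt) agree with the values given in the text, and every ORF length is a whole number of codons.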

Whole genome alignment and analysis of ToMMV compared with other tobamoviruses
The full-length nucleotide sequences of the available ToMMV isolates were compared with each other and with those of the 31 other tobamoviruses. The genomes of ToMMV-YYMLJ and ToMMV-TiLhaLJ shared 99.9% identity at the nucleotide sequence level, so the genomic sequence of ToMMV-YYMLJ was used in subsequent sequence analyses. Gibbs et al. [21] reported a genus-specific nucleotide motif for tobamoviruses, the "4404-50 motif"; 29 sites of this 47-nt sequence are invariant in all tobamovirus sequences. Eighteen of the sites in the 4404-50 motif varied between species, but at 12 or more of these sites all isolates of each species usually had the same nucleotide [21]. Multiple sequence alignments revealed that the ToMMV genome also contains the distinctive 4404-50 motif, 4411-GGTGATGTTACAACTTTCATAGGAAATACTGTTATTATAGCCGCGTG-4457 (underlined bases differ from those of other tobamoviruses). All of the 18 variable sites are invariant among the 5 ToMMV isolates, while the conserved nucleotides in the ToMMV sequences differ from those in ToMV at 9 of the sites, which clearly distinguishes ToMMV from ToMV. In the alignment, red-shaded sites were the same in all tobamoviruses, yellow-shaded sites were the same in ToMMV and ToMV, and unshaded sites varied among different isolates of ToMMV and ToMV. No more than 9 of the 18 conserved sites of ToMMV were shared with other tobamoviruses. The results of these comparisons strongly confirmed ToMMV as a distinct member of the genus Tobamovirus.
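A site-by-site comparison of this kind is simple to reproduce. The sketch below is illustrative rather than the authors' procedure: it stores the ToMMV motif quoted above, confirms that its length matches the stated coordinates (4411 to 4457), and lists the genome positions at which a second, equal-length motif differs. The comparison sequence in the example is a made-up variant, not the real ToMV motif, and the helper name `differing_sites` is hypothetical.

```python
MOTIF_START, MOTIF_END = 4411, 4457

# ToMMV 4404-50 motif as quoted in the text (the extraction space is removed).
TOMMV_MOTIF = "GGTGATGTTACAACTTTCATAGGAAATACTGTTATTATAGCCGCGTG"


def differing_sites(reference, other, first_position=MOTIF_START):
    """Return the genome positions at which two equal-length motifs differ."""
    if len(reference) != len(other):
        raise ValueError("motifs must be aligned and of equal length")
    return [
        first_position + offset
        for offset, (ref_base, other_base) in enumerate(zip(reference, other))
        if ref_base != other_base
    ]


motif_span = MOTIF_END - MOTIF_START + 1

# Made-up variant for demonstration only: first and last bases changed to A.
example_variant = "A" + TOMMV_MOTIF[1:-1] + "A"
example_differences = differing_sites(TOMMV_MOTIF, example_variant)
```

Replacing `example_variant` with the corresponding 47-nt motif from another tobamovirus would return the positions that separate the two species within the motif.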

Phylogenetic relationship of ToMMV with other tobamoviruses
The phylogenetic trees based on the complete genome and on the 126 kDa, 54 kDa, MP and CP amino acid sequences all showed that ToMMV grouped with ToMV, ToBRFV, TMV and other tobamoviruses infecting solanaceous plants, and that ToMMV was most closely related to ToMV (Figs. 2 and 3). The phylogenetic analyses and the hosts from which the viruses were originally isolated showed that the 32 currently available tobamoviruses could be divided into at least eight subgroups according to their host-plant families: Solanaceae-, Brassicaceae-, Cactaceae-, Apocynaceae-, Cucurbitaceae-, Malvaceae-, Leguminosae-, and Passifloraceae-infecting subgroups. This grouping was based on the complete nucleotide sequences (Fig. 2) and the amino acid sequences of the 126 kDa replicase (Fig. 3a), 54 kDa polymerase (Fig. 3b), MP (Fig. 3c) and CP (Fig. 3d).

Occurrence and the host range of ToMMV
PMMoV, TMGMV, CMV, TSWV, ToCV, BBWV2, and the degenerate primers of tobamovirus, potyvirus, polerovirus and begomovirus, and only ToMMV was detected. The virus caused systemic symptoms including mosaic (Fig. 4a, b, c, d, g, h, i), blistering (Fig. 4a, f, g, h, j) and chlorosis (Fig. 4c, e) on the majority of infected species, with occasionally severe foliar distortion (Fig. 4a, d, f, h, j), leaf narrowing (Fig. 4d, g, h) and necrosis (Fig. 4b, f) in the infected solanaceous plants, while the virus caused mottle symptoms in the infected B. pekinensis (Fig. 4k), B. campestris (Fig. 4l)

Min et al. [14] proposed dividing the tobamoviruses into at least five subgroups according to the amino acid composition and primary structure of their CPs; these viruses were originally isolated from plants in the Solanaceae, Brassicaceae, Cucurbitaceae, Cactaceae and Malvaceae. Song et al. [15] suggested the existence of a sixth subgroup in the genus Tobamovirus, isolated from the Passifloraceae, based on phylogenetic analysis of the four tobamovirus proteins. Here, according to the phylogenetic analyses and the hosts from which the viruses were originally isolated, the 32 presently available tobamoviruses could be divided into at least eight subgroups (Solanaceae-, Brassicaceae-, Cactaceae-, Apocynaceae-, Cucurbitaceae-, Malvaceae-, Leguminosae-, and Passifloraceae-infecting subgroups) based on both the complete nucleotide sequences and the 126 kDa, 54 kDa, MP and CP amino acid sequences. In addition, the tobamoviruses infecting plants in the Solanaceae were most closely related to those infecting plants in the Brassicaceae, which was consistent with the results of the host range test.
Rehmannia mosaic virus (ReMV), which infects the Scrophulariaceae, always clustered with the Solanaceae-infecting subgroup, while Streptocarpus flower break virus (SFBV), which infects the Gesneriaceae, grouped with the Brassicaceae-infecting subgroup at both the nt and the aa sequence levels (Figs. 2 & 3). Odontoglossum ringspot virus (ORSV), which infects the Orchidaceae, was placed in the Solanaceae-infecting subgroup based on the nt and aa sequences of its MP and CP, but clustered with the Brassicaceae-infecting subgroup based on the aa sequences of its 126 kDa replicase and 54 kDa polymerase (Figs. 2 & 3). ToMMV grouped with ToMV, ToBRFV, TMV and other tobamoviruses infecting solanaceous plants, and ToMMV was most closely related to ToMV (Figs. 2 & 3). Our phylogenetic analyses of the complete genome and of the 126 kDa, 54 kDa, MP and CP amino acid sequences all strongly supported the placement of ToMMV in the subgroup of tobamoviruses that infects plants in the Solanaceae.
More surveys are needed to determine the incidence and distribution of ToMMV in the field. Research to identify the genetic mechanisms of pathogenesis and host-plant defense will assist in the development of crop resistance to ToMMV.