Molecular characterization of a Chinese variant of the Flury-LEP strain

The entire genome of rabies virus vaccine strain Flury-LEP-C, a Chinese variant of the rabies virus vaccine strain Flury-LEP, was sequenced. The genome of the Flury-LEP-C strain was 11 924 nucleotides (nt) long, comprising a leader sequence of 58 nt, a nucleoprotein (N) gene of 1353 nt, a phosphoprotein (P) gene of 894 nt, a matrix protein (M) gene of 609 nt, a glycoprotein (G) gene of 1575 nt, an RNA-dependent RNA polymerase (RdRp, L) gene of 6384 nt, and a trailer region of 70 nt. A TGAAAAAAA (TGA7) consensus sequence was present at the end of each gene in the Flury-LEP-C genome, except for the G gene, which had a GAGAAAAAAA sequence at the end of the non-coding G-L region. An AACAYYYCT consensus start signal was located close to each TGA7. The G-L intergenic region of Flury-LEP-C was 310 nt longer than that of HEP-Flury. Residue 333 of the mature G protein, which has been reported to be related to pathogenicity, was Arg. Compared with Flury-LEP, Flury-LEP-C differed at 19 amino acids (AAs) across the five proteins: 15 were identical to the corresponding residues of HEP-Flury, and 4 were identical to the residues of neither Flury-LEP nor HEP-Flury. Phylogenetic trees generated from two protein sequence data sets had similar topologies. HN10, BD06, FJ009, FJ008, D02, D01, F04 and F02 were closely related to CTN-1 and CTN181, and MRV was closely related to Flury-LEP, HEP-Flury and Flury-LEP-C.


Findings
The rabies virus belongs to the genus Lyssavirus of the family Rhabdoviridae. Its genome is a non-segmented, negative-sense, single-stranded RNA of about 12 000 nucleotides (nt). The viral RNA encodes five major proteins: nucleoprotein (N protein), phosphoprotein (P protein), matrix protein (M protein), glycoprotein (G protein) and RNA-dependent RNA polymerase (L protein) [1].
Rabies remains a serious problem in China, especially in rural areas, with about 5537 fatalities per year reported in the 1980s and about 3300 fatalities in 2007 [2][3][4][5]. In recent years, most research on rabies control has concentrated on the development of oral vaccines, including attenuated and live vectored vaccines. However, these virus strains are still pathogenic for laboratory and wild rodents or wildlife species, and several rabies cases caused by such vaccines have been reported [6,7]. Some rabies viruses in China have also been reported to be closely related to several vaccine strains [8]. The main goal of the present study was to obtain the entire genome sequence of vaccine strain Flury-LEP-C, a Chinese variant of the rabies virus vaccine strain Flury-LEP, including the 3'- and 5'-terminal non-coding regions of the genome. The genome sequence was compared with the sequences of other vaccine strains used in China and of Chinese street strains available from GenBank. Data from both vaccine and street strains may lead to a better understanding of rabies and to more effective strategies for controlling its spread.
Here, we obtained the full-length genome of the Flury-LEP-C strain by RT-PCR or RACE, using a method similar to that described by Marston et al. [9]. Using a total of 12 primers (Table 1), the entire genome of the Flury-LEP-C strain was amplified as five overlapping PCR products. The full genome of rabies virus strain Flury-LEP-C consists of 11 924 nt. The full-length sequence was submitted to GenBank (accession number FJ577895).
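The reported genome length can be reconciled with the region lengths given for Flury-LEP-C with a little arithmetic. The sketch below is only an illustration: it assumes that the quoted gene lengths are open reading frame lengths that include one stop codon, which the text does not state explicitly. The function and variable names are hypothetical.

```python
# Region lengths in nucleotides, as reported for Flury-LEP-C in this paper.
GENOME_LENGTH = 11924
REGIONS = {
    "leader": 58,
    "N": 1353,
    "P": 894,
    "M": 609,
    "G": 1575,
    "L": 6384,
    "trailer": 70,
}
SIGNAL_PEPTIDE_AA = 19  # G protein signal peptide length reported in the paper


def protein_length(orf_nt: int) -> int:
    """Residues encoded by an ORF, ASSUMING the length includes one stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError(f"{orf_nt} nt is not a whole number of codons")
    return orf_nt // 3 - 1


# Nucleotides covered by the listed regions, and the remainder that must lie
# in untranslated and intergenic sequence under the assumption above.
annotated_nt = sum(REGIONS.values())
unlisted_nt = GENOME_LENGTH - annotated_nt

# Deduced protein lengths in genome order N-P-M-G-L.
proteins = {gene: protein_length(REGIONS[gene]) for gene in ("N", "P", "M", "G", "L")}
mature_g = proteins["G"] - SIGNAL_PEPTIDE_AA

if __name__ == "__main__":
    print("listed regions:", annotated_nt, "nt; remainder:", unlisted_nt, "nt")
    for gene, length in proteins.items():
        print(gene, length, "aa")
    print("mature G:", mature_g, "aa")
```

Under this assumption the G gene yields a 524-residue product and a 505-residue mature protein, which agrees with the values reported for this strain, and 981 nt of the genome fall outside the listed regions.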
* Correspondence: renlz@jlu.edu.cn
In the full genome sequence of Flury-LEP-C, the leader sequence was 58 nt long and the trailer sequence was 70 nt long. All rabies viruses (RVs) in this study (Table 2) were absolutely conserved over the 12 bases of the genomic 3'-terminus (Fig. 1) and 5'-terminus (Fig. 2). The 3' leader and 5' trailer termini were exactly complementary over the terminal 11 nt in all RVs, except that MRV and DRV had different 3'- and 5'-terminal ends.
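The terminal complementarity described above can be checked mechanically by pairing each base from one end of a genome with the base at the same distance from the other end. The following is a minimal sketch, not the procedure used in this study; the function name is hypothetical and the sequence is a toy placeholder rather than a real rabies virus terminus.

```python
# Watson-Crick pairs in the DNA (cDNA) alphabet.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def terminal_complementarity(genome: str, max_len: int = 20) -> int:
    """Count consecutive terminal positions at which the two ends are complementary.

    Position i from one end is compared with position i from the other end,
    and counting stops at the first pair that is not complementary.
    """
    seq = genome.upper().replace("U", "T")  # accept RNA input as well
    limit = min(max_len, len(seq) // 2)
    run = 0
    for i in range(limit):
        if COMPLEMENT.get(seq[i]) != seq[-1 - i]:
            break
        run += 1
    return run


# Toy placeholder sequence (NOT real data): the outer 4 positions pair up,
# and the fifth pair (A with G) does not.
TOY_GENOME = "ACGTATTTTGACGT"

if __name__ == "__main__":
    print(terminal_complementarity(TOY_GENOME))
```

Applied to a full-length genome sequence, a return value of 11 would correspond to the 11 nt of exact complementarity reported here.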
Between the transcription stop and start signals there was an intergenic sequence (IGS), which was not transcribed into mRNA. The N/P IGS was CT, the P/M IGS was CAGGC, and the M/G IGS was CTATT. The IGS between the non-coding G-L region and the L gene was 21 nt. The G-L intergenic region is a non-coding region that has been reported to be highly susceptible to random mutations, unrestricted by structural and functional requirements or by immunological pressure [10]. In this study, the G-L intergenic region of Flury-LEP-C was 310 nt longer than that of HEP-Flury (Fig. 3), consistent with the non-coding G-L region being more prone to mutation. This observation suggests that the region might be used as an insertion site for a marker gene in the construction of a marker vaccine, but further studies are needed to confirm this hypothesis.
Rabies virus encodes five structural proteins in the order N-P-M-G-L. The lengths of the five genes of the Flury-LEP-C strain were 1353 nt, 894 nt, 609 nt, 1575 nt and 6384 nt, respectively. A TGAAAAAAA (TGA7) consensus sequence was present at the end of each gene in the Flury-LEP-C genome, except for the G gene, which had a GAGAAAAAAA sequence at the end of the non-coding G-L region. An AACAYYYCT consensus start signal was located close to each TGA7. The main difference between Flury-LEP and Flury-LEP-C was that the L gene of the latter was 12 nt longer than that of the former (Table 3). Further studies are necessary to elucidate the role of these mutations in Flury-LEP-C.
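The consensus signals described above use the IUPAC code Y, which stands for C or T. As an illustration of how such signals could be located in a sequence, the sketch below expands an IUPAC consensus into a regular expression and reports every position at which it occurs. The helper names are hypothetical, and the example sequence is a toy placeholder built from the reported signals and the CT intergenic dinucleotide, not an excerpt of the Flury-LEP-C genome.

```python
import re
from typing import List

# Subset of IUPAC nucleotide codes needed here, mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

STOP_SIGNAL = "TGAAAAAAA"      # TGA7, as reported for most genes
G_STOP_VARIANT = "GAGAAAAAAA"  # variant reported at the end of the G-L region
START_SIGNAL = "AACAYYYCT"     # consensus start signal


def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_signal(sequence: str, consensus: str) -> List[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?={iupac_to_regex(consensus)})")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Toy placeholder (NOT real data): stop signal, a CT spacer, then a start signal.
TOY_JUNCTION = "GGTGAAAAAAACTAACACCTCTGG"

if __name__ == "__main__":
    print("stop signals at:", find_signal(TOY_JUNCTION, STOP_SIGNAL))
    print("start signals at:", find_signal(TOY_JUNCTION, START_SIGNAL))
```

The distance between a stop position plus the signal length and the next start position gives the length of the intergenic sequence at that junction.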
The complete deduced amino acid sequence of Flury-LEP-C was aligned with those of 17 complete genome sequences obtained from GenBank (Table 2). Analysis of the deduced amino acid sequences from the open reading frames (ORFs) of the N, P, M, G, and L genes revealed sequence identities of 98.81%, 93.94%, 96.75%, 95.12%, and 97.69%, respectively. Szanto reported that the P gene was the most variable gene [11], and a similar result was obtained for Flury-LEP-C.
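Percentage values of this kind are typically derived from pairwise comparisons of aligned sequences. The sketch below shows one common convention, in which identical residues are counted over the aligned columns where neither sequence has a gap. This convention is an assumption, since the text does not specify how its percentages were calculated, and the sequences shown are toy placeholders.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (NOT real data): 6 matches over 7 ungapped columns.
    print(round(percent_identity("MKSTQT-A", "MKATQTGA"), 2))
```

With this definition, the gene with the lowest value is the most variable one, which is how the P gene result above is read.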
The G gene encodes a product of 524 amino acids, including a 19-amino-acid N-terminal signal peptide that is cleaved to generate the mature product of 505 amino acids. The G protein has been reported to play an important role in viral pathogenicity and protective immunity, especially residue Arg333 [1,12-17]. Jackson et al. reported that a less neurovirulent strain, which contains an attenuating substitution of Arg333 in the rabies virus glycoprotein, was a stronger inducer of neuronal apoptosis, and that there was an inverse relationship between pathogenicity and apoptosis [18]. In this study, the residue at position 333 of the mature G protein was Arg.
The P protein is a structural component of the ribonucleoprotein (RNP). It is also crucially involved in numerous events during the virus life cycle, including proper formation of viral RNPs and virus particles, and viral RNA synthesis [14]. The P protein has been shown to interact with LC8 (cytoplasmic dynein light chain) at residues 138-172 [19,20], specifically through the motif K/RXTQT at residues 145-149 [20]. Mebatsion found that deletions introduced into the LC8 binding site abolished the P-LC8 interaction, blocked LC8 incorporation into virions, and reduced the efficiency of peripheral spread of the virus, but that LC8 is dispensable for the spread of a pathogenic RV from a peripheral site to the CNS [19]. We found that the minimal LC8 binding motif at residues 145-149 of the P protein was KSTQT in all rabies sequences in this study, except SHBRV-18, which had a KATQT motif.
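The K/RXTQT motif can be written as a simple pattern: K or R, then any residue, then TQT. The following sketch scans a protein sequence for this pattern and reports 1-based positions, so that a hit in a full-length P protein could be compared directly with residues 145-149. The function name is hypothetical and the test fragments are toy placeholders, not real P protein sequences.

```python
import re
from typing import List, Tuple

# K/RXTQT: lysine or arginine, any residue, then Thr-Gln-Thr.
# The lookahead allows overlapping occurrences to be reported.
LC8_MOTIF = re.compile(r"(?=([KR][A-Z]TQT))")


def find_lc8_motifs(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched motif) for each occurrence."""
    return [(m.start() + 1, m.group(1)) for m in LC8_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy fragments (NOT real data) carrying the two motif variants named above.
    for fragment in ("MSKSTQTAA", "AAKATQT", "AAKSTQA"):
        print(fragment, find_lc8_motifs(fragment))
```

Both KSTQT and KATQT satisfy the pattern, so the SHBRV-18 variant described above would still be reported as a candidate LC8 binding motif by this kind of scan.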
Compared with Flury-LEP, Flury-LEP-C differed at 19 amino acids (AAs) across the five proteins: 15 were identical to the corresponding residues of HEP-Flury, and 4 were identical to the residues of neither Flury-LEP nor HEP-Flury. Studies are under way to identify differences in phenotypic characteristics between Flury-LEP-C and its parental strain Flury-LEP.
In this study, two protein data sets were used to construct phylogenetic trees. First, the nucleotide sequences of the five viral genes of each strain were translated into protein sequences and joined into one sequence in the original gene order, from which a phylogenetic tree was generated (Fig. 4). Second, the P protein was used on its own to construct a phylogenetic tree (Fig. 5) because of its multifunctional nature, including its ability to interact with host-cell proteins [21]. The topologies of the trees generated by these two methods were similar. HN10, BD06, FJ009, FJ008, D02, D01, F04 and F02 were closely related to CTN-1 and CTN181, indicating that the homology between the CTN strains and the Chinese street strains was much higher than that between the street strains and any other vaccine strain. MRV was closely related to Flury-LEP, HEP-Flury and Flury-LEP-C, whereas DRV formed an outlying clade. CTN (or its derivatives, including CTN-1 and CTN181), PV and PM are the human rabies virus vaccine strains, and Flury-LEP, HEP-Flury, ERA and CTN-1 are the veterinary rabies virus vaccine strains currently used in China. It has been hypothesized that the CTN strain should be the most suitable vaccine strain for use in China [10,22], and the results of our study support this hypothesis.
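The first tree-building input described above, in which the five translated proteins of each strain are joined in genome order, can be prepared with a few lines of code. This is a minimal sketch of that preprocessing step only; it does not reproduce the alignment or tree-building software used in this study. The function names, strain labels and sequences are hypothetical placeholders.

```python
from typing import Dict

GENE_ORDER = ("N", "P", "M", "G", "L")  # original order of genes in the genome


def concatenate_proteins(proteins: Dict[str, str]) -> str:
    """Join the five protein sequences of one strain in N-P-M-G-L order."""
    missing = [gene for gene in GENE_ORDER if gene not in proteins]
    if missing:
        raise KeyError(f"missing protein sequences: {', '.join(missing)}")
    return "".join(proteins[gene] for gene in GENE_ORDER)


def to_fasta(strains: Dict[str, Dict[str, str]]) -> str:
    """Write one concatenated record per strain in FASTA format."""
    lines = []
    for name, proteins in strains.items():
        lines.append(f">{name}")
        lines.append(concatenate_proteins(proteins))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy placeholder records (NOT real data); dictionary order does not matter.
    toy = {
        "strain_A": {"L": "ML", "G": "MV", "N": "MD", "M": "MN", "P": "MS"},
        "strain_B": {"N": "ME", "P": "MS", "M": "MN", "G": "MI", "L": "ML"},
    }
    print(to_fasta(toy), end="")
```

The resulting FASTA file would then be aligned and passed to a phylogenetic program of choice; a P-protein-only data set is obtained the same way by writing out only the P sequences.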