The characteristics of the synonymous codon usage in hepatitis B virus and the effects of host on the virus in codon usage pattern
Virology Journal volume 8, Article number: 544 (2011)
Hepatitis B virus (HBV) infection is one of the main human health problems and causes chronic infection in a large number of patients worldwide. Because the replication of HBV depends on its host cell system, the codon usage pattern of the viral genes might be subject to two main selective forces, namely mutation pressure and translation selection. A deeper investigation of the relationship between HBV evolution and the host adaptive response might therefore assist in controlling this disease.
Relative synonymous codon usage (RSCU) values for the whole HBV coding sequence were studied by principal component analysis (PCA). The characteristics of the synonymous codon usage patterns, the nucleotide contents and the comparison of ENC values for the whole HBV coding sequence indicated that virus mutation pressure and host translation selection interact during HBV evolution. The synonymous codon usage pattern of HBV is a mixture of coincidence with and antagonism to that of the host cell. No difference in the genetic characteristics of HBV was observed among epidemic areas or subtypes, suggesting that the geographic factor has limited influence on the evolution of this virus, whereas the genetic characteristics based on HBV genotypes could be divided into three groups, namely (i) genotypes A and E, (ii) genotype B, (iii) genotypes C, D and G.
Codon usage patterns derived from PCA provide an alternative approach to identifying evolutionary trends in HBV. Furthermore, the combined action of mutation pressure and translation selection on codon usage might shed light on the evolutionary trends of HBV genotypes.
Hepatitis B virus (HBV) disease is one of the main global health problems: two billion people have been infected and 350 million people have chronic infection. HBV is the prototype member of the family Hepadnaviridae and has a compact, circular DNA genome of about 3.2 kb in length, with four overlapping open reading frames: the large S region (PreS/S), PreC/C, X and P [2, 3]. Moreover, the overlapping regions of the genome are helpful for studying the evolution of the virus through its point mutations, because recombination is rare and any point mutation could affect the genetic characteristics of two overlapping genes. The evolution of HBV should therefore be interdependent and constrained by the overlap of genes. In some cases, one protein encoded by overlapping genes may evolve more rapidly as a consequence of negative selection on the other, and the overlapping genes might be subject to different selection pressures. Furthermore, independent adaptive selection for both overlapping genes has been reported. One of the main features of HBV is its genetic heterogeneity. There are four main subtypes, namely ayw, adw, adr and ayr. According to phylogenetic analysis of complete HBV genomic sequences, nine genotypes of HBV, A to I, have been determined and divided into approximately twenty-five subgenotypes [10–14]. HBV genotypes, which differ from each other by more than 8% at the nucleotide level, show distinct geographical distributions [11, 15, 16]. The nucleotide composition of HBV coding sequences with various genetic diversities is selective rather than random, because natural selection from the host acts on the various strains shaped by mutation. In previous reports, translation selection and compositional constraints under mutational pressure were thought to be the major factors accounting for codon usage variation among genomes in microorganisms [17–24].
In some RNA viruses, mutation pressure plays a more important role than natural selection in shaping the synonymous codon usage pattern [25, 26]. Although compositional constraints and translation selection are the more generally accepted mechanisms accounting for codon usage bias [27–30], other selective forces have also been proposed, such as fine-tuning of translation kinetics and escape from cellular antiviral responses [23, 31–34]. Thus, the codon usage pattern may be important in disclosing the molecular mechanisms and evolutionary processes by which HBV avoids the host cell response. To our knowledge, this is the first systematic study to analyze the synonymous codon usage pattern and evolutionary dynamics of HBV as well as the relationship between the codon usage patterns of HBV and its host.
Synonymous codon usage in HBV
The C% and U% were higher than A% and G%, and C3% and U3% were higher than A3% and G3% in HBV (Table 1).
The overall nucleotide composition is mirrored in the nucleotide content at the third codon position of the HBV coding sequence, suggesting that compositional constraints may be one of the factors affecting the codon usage pattern of HBV. Over-represented synonymous codons are rare in the HBV coding sequence and include only UCU for Ser, whereas the under-represented codons are AUA for Ile, CCC for Pro, ACC for Thr, GCC for Ala, and CGU and CGG for Arg (Table 2).
The codon usage bias of HBV suggests that some synonymous codons are not chosen equally and randomly.
Genetic relationship based on synonymous codon usage in HBV
PCA detected a first principal component (f'1) that accounts for 23.65% of the total synonymous codon usage variation and a second principal component (f'2) that accounts for 19.47% of the total variation. When the strains are grouped by the geographical factor that might influence HBV evolution, no obvious geographical clustering appears. For example, the overall codon usage pattern of HBV isolated from the Philippines and South Korea is far from those of China and Indonesia, and the HBV isolated from Germany and Iran has a genetic diversity similar to that isolated from South Africa (Figure 1).
When the strains are grouped by subtype, the plots for subtype adw fall into two groups, while the other three subtypes seem to have similar genetic characteristics (Figure 2).
It is worth noting that the plots for the different HBV genotypes were generally separated from each other. Moreover, genotypes A and B have genetic characteristics that clearly differ from those of the rest, while genotypes C, D and G appear to be evolutionarily related (Figure 3).
These results indicate that geographic distribution has a limited effect on the codon usage of the whole HBV coding sequence, and that the subtypes do not reflect the characteristics of HBV evolution to any great degree. In this case, codon usage variation might be one of the factors driving HBV evolution.
The effect of mutation pressure on codon usage of HBV
To analyze if the evolution of HBV is shaped by mutation pressure from virus itself or by translation selection from host, G+C content at the first and second codon positions (GC12%) was compared with that at synonymous third codon positions (GC3%) (Figure 4).
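As an illustration of the two quantities compared here, the following minimal Python sketch computes GC12% and GC3% for a single coding sequence. The function name `gc12_and_gc3` and the decision to leave AUG, UGG and the stop codons out of the GC3 count (because they have no synonymous alternatives) are assumptions made for this example; the authors' own calculation may differ in detail.

```python
# Codons without a synonymous alternative (Met, Trp) and the three stop codons.
# Excluding them from GC3 is an assumption of this sketch.
NON_SYNONYMOUS = {"AUG", "UGG", "UAA", "UAG", "UGA"}


def gc12_and_gc3(cds):
    """Return (GC12%, GC3%) for an in-frame coding sequence (DNA or RNA letters)."""
    seq = cds.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # G+C at the first and second positions of every codon.
    gc12 = sum(base in "GC" for codon in codons for base in codon[:2])
    # G+C at the third position of codons that have synonymous alternatives.
    synonymous = [codon for codon in codons if codon not in NON_SYNONYMOUS]
    gc3 = sum(codon[2] in "GC" for codon in synonymous)
    return 100.0 * gc12 / (2 * len(codons)), 100.0 * gc3 / len(synonymous)


# Toy example, not HBV data: codons AUG GCU UCC UAA.
print(gc12_and_gc3("ATGGCTTCCTAA"))
```

Applying the function to every strain gives one (GC12%, GC3%) pair per strain, which is the kind of input a scatter plot such as Figure 4 and the reported correlation would be based on.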
A highly significant correlation was observed (r = 0.432, P < 0.01), implying that mutation pressure from base composition of HBV is a main factor in shaping genetic diversity of this virus, since the effects are present at all codon positions. In addition, the ENC values were calculated for each strain and the plot was made by ENC value against GC3% (Figure 5).
Figure 5 shows that the plots for HBV aggregate below the expected curve, suggesting that other selective forces take part in the process of HBV evolution.
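The expected curve in an ENC versus GC3 plot is commonly taken from Wright's relation for codon usage determined by base composition alone, ENC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC3 fraction. The text does not spell out the formula, so its use here is an assumption, and `expected_enc` is a hypothetical helper name.

```python
def expected_enc(gc3_fraction):
    """Expected ENC under compositional constraint alone (Wright's curve).

    gc3_fraction is GC3 expressed as a fraction between 0 and 1.
    """
    s = gc3_fraction
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# Sample the curve at a few GC3 values; the maximum lies at s = 0.5.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, round(expected_enc(s), 2))
```

A strain whose observed ENC lies below `expected_enc` at its own GC3 value shows more codon bias than base composition alone would predict, which is the reading applied to Figure 5.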
Comparative analysis of the RSCU values between HBV and human cell
There is a resemblance in synonymous codon usage between this virus and human cells; for example, the similar pattern includes all synonymous codons for Phe, Ile, Val, Ser, Ala, Tyr, His, Lys, Asp, Cys and Gly (Table 1). This may be explained by HBV adapting its codon usage to its host under translation selection, which could favor the multiplication of progeny virus; the resemblance of codon usage is possibly favorable for HBV replication in human cells. However, CCG for Pro, ACG for Thr, CAA for Gln and CUA for Leu, which are under-represented in human cells, are highly used in HBV (Table 1). This result suggests that these codons could influence the translation rate of the context flanking them, helping the viral product to fold correctly.
The ENC values calculated for HBV indicate that, although the codon usage bias in HBV is low, codon usage is affected mainly by mutation pressure. For some viruses, previous studies reported that the major factor shaping codon usage patterns appears to be mutation pressure rather than natural selection [19, 21, 24, 35]. However, the comparison of synonymous codon usage between HBV and human cells suggests that mutation pressure interacts with translation selection in the process of HBV evolution, even though the ENC values for the whole HBV coding sequence indicate that mutation pressure is one of the factors influencing the codon usage pattern. This characteristic of HBV confers adaptive advantages that result in highly efficient dissemination of the virus through different routes of transmission.
Previous studies have shown that the pattern of codon usage is a genetic characteristic of various organisms [19, 20, 27, 31, 32, 35, 36]. Because C%, U%, C3% and U3% contribute to the formation of the different optimal codons, whatever nucleotide they end with, the codon usage pattern of HBV is likely influenced by compositional constraints. The codon usage pattern of poliovirus (PV) is mostly coincident with that of its host, while the codon usage pattern of hepatitis A virus (HAV) is antagonistic to that of its host [37, 38]. The codon usage pattern of HBV is a mixture of these two types. The coincident portion of the HBV codon usage pattern enables the corresponding amino acids to be translated rapidly, whereas the antagonistic portion likely enables viral proteins to fold properly, although the translation efficiency of the corresponding amino acids is decreased. Latent genes in Epstein-Barr virus deoptimize codon usage in order to evade competition for host protein translation, and PV has been attenuated by introducing rare codon pairs that cause poor translation of viral protein sequences. These results suggest that disfavored codons may not be a deleterious factor for viruses adapting to their host cells.
According to the codon usage data for HBV isolated from different countries, the geographic factor does not influence the formation of the HBV codon usage pattern. With the development of international communication and the highly efficient dissemination of HBV through various routes of transmission, geography appears to place only a weak limit on the distribution of HBV among countries. It is interesting that the four main subtypes of HBV show no significant difference in genetic characteristics, even though they circulate in different human populations. This result might suggest that translation selection from the human host is not the only factor shaping the overall codon usage pattern of this virus and that mutation pressure from HBV itself is a main force driving HBV evolution. Genotyping of HBV is of high interest because there is increasing evidence that HBV genotypes may be associated with HBeAg seroconversion rates, mutations occurring in the precore and core promoter regions, severity of liver disease and treatment response [15, 16, 39, 40]. There is a significant difference in the overall codon usage pattern of HBV between genotypes A, B, E and genotypes C, D, G. HBV genotypes and subgenotypes have been associated with differences in clinical and virological characteristics, showing that they may play a role in the virus-host relationship. It has been shown that genotypes C and D are associated with more serious liver injuries and with a higher incidence of HCC than genotypes A and B [42–44]. In addition, patients infected with genotypes C and D have a much lower rate of response to interferon therapy than those infected with genotypes A or B [40, 45]. Moreover, subtle differences in the frequency and type of lamivudine-resistant variants occur in genotype A and D infections.
An evolutionary approach to HBV infection, based on the principles of natural selection, may offer an explanation for how modes of transmission may favor some genotypes and subgenotypes over others and influence HBV virulence.
The genetic diversity and codon usage patterns described here are helpful for understanding the processes of HBV evolution, especially the roles played by translation selection from the host and mutation pressure from the virus. Additionally, such information might help to clarify the roles of geographic and subtype factors in the process of HBV evolution.
Materials and methods
The 58 complete genome sequences of HBV were downloaded from the National Center for Biotechnology Information (NCBI) http://www.ncbi.nlm.nih.gov/Genbank/ and detailed information about the viruses is listed in Table 3.
The overall nucleotide composition (U%, A%, C% and G%) and the nucleotide composition at the third codon position (U3%, A3%, C3% and G3%) of each HBV coding sequence were calculated with the software DNAStar 7.0 for Windows.
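The authors used DNAStar for this step. As a rough illustration of what the two sets of percentages measure, the sketch below computes them for one in-frame coding sequence in plain Python. The function name `nucleotide_composition` is hypothetical, and counting the third position over all codons (including Met, Trp and stop codons) is an assumption of this example.

```python
from collections import Counter


def nucleotide_composition(cds):
    """Return (overall, third_position) percentage dicts for a coding sequence."""
    seq = cds.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    def percentages(bases):
        counts = Counter(bases)
        return {base: 100.0 * counts[base] / len(bases) for base in "UACG"}

    # seq[2::3] picks every third base, i.e. the third position of each codon.
    return percentages(seq), percentages(seq[2::3])


# Toy example, not HBV data.
overall, third = nucleotide_composition("ATGGCTTCTTAA")
print(overall)  # overall U%, A%, C%, G%
print(third)    # U3%, A3%, C3%, G3%
```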
The calculation of the relative synonymous codon usage (RSCU)
The relative synonymous codon usage (RSCU) values for all 58 HBV coding sequences were calculated as previously described. RSCU values do not depend on amino acid composition or on the size of the coding sequence, because both factors are eliminated in the calculation. An RSCU value of 1.0 means that the codon is chosen equally and randomly. An RSCU value above 1.0 or below 1.0 indicates that the codon is used more frequently or less frequently than expected, respectively. Synonymous codons with RSCU greater than 1.6 were considered over-represented, while synonymous codons with RSCU less than 0.6 were regarded as under-represented.
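The calculation can be sketched as follows, assuming the usual definition in which the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally (family total divided by family size). The names `rscu` and `classify` are hypothetical; the 1.6 and 0.6 thresholds are the ones given in the text.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for the 59 codons that have synonymous alternatives."""
    seq = cds.upper().replace("T", "U")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid not in "*MW":  # skip stop codons, Met (AUG) and Trp (UGG)
            families[amino_acid].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        for codon in family:
            # Undefined (NaN) when the amino acid never occurs in the sequence.
            values[codon] = counts[codon] * len(family) / total if total else float("nan")
    return values


def classify(value):
    """Apply the thresholds used in the text."""
    if value > 1.6:
        return "over-represented"
    if value < 0.6:
        return "under-represented"
    return "neither"


# Toy example, not HBV data: Ala is encoded three times by GCU and once by GCC.
values = rscu("GCTGCTGCTGCC")
print(values["GCU"], classify(values["GCU"]))
```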
Analysis of codon usage bias
The 'effective number of codons' (ENC), a useful estimator of absolute codon usage bias, was used to quantify the codon usage bias of the whole coding sequence of HBV. The ENC value ranges from 20 (when only one synonymous codon is used for each amino acid) to 61 (when all synonymous codons are used equally). In this study, this measure was used to evaluate the degree of codon usage bias of the HBV coding sequences.
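The text cites the ENC measure without giving its formula. A minimal sketch, assuming Wright's standard formulation (ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the mean codon homozygosity of the amino acids in one degeneracy class), is shown below. The function name is hypothetical, and the handling of missing degeneracy classes is a simplification.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def effective_number_of_codons(cds):
    """Estimate ENC for one coding sequence (assumed Wright formulation)."""
    seq = cds.upper().replace("T", "U")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid not in "*MW":
            families[amino_acid].append(codon)

    # Homozygosity F of each amino acid, grouped by degeneracy class (2, 3, 4, 6).
    f_by_class = defaultdict(list)
    for family in families.values():
        n = sum(counts[codon] for codon in family)
        if n > 1:
            sum_p2 = sum((counts[codon] / n) ** 2 for codon in family)
            f = (n * sum_p2 - 1) / (n - 1)
            if f > 0:
                f_by_class[len(family)].append(f)

    def mean(values):
        return sum(values) / len(values)

    for size in (2, 4, 6):
        if not f_by_class[size]:
            raise ValueError("no usable amino acid with %d synonymous codons" % size)
    f2, f4, f6 = (mean(f_by_class[size]) for size in (2, 4, 6))
    # Ile is the only three-fold amino acid; if unusable, average F2 and F4.
    f3 = mean(f_by_class[3]) if f_by_class[3] else (f2 + f4) / 2
    enc = 2 + 9 / f2 + 1 / f3 + 5 / f4 + 3 / f6
    return min(61.0, enc)  # values above 61 are capped at 61
```

The two extremes described in the text follow directly: a sequence that uses a single codon for every amino acid has F = 1 in every class and ENC = 20, while perfectly even usage drives the estimate to the cap of 61.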
Principal component analysis
Principal component analysis (PCA), a commonly used multivariate statistical method, was carried out to analyze the major trends in codon usage pattern among different strains of HBV. PCA involves a mathematical procedure that transforms a set of correlated variables (RSCU values) into a smaller number of uncorrelated variables called principal components. Each strain was represented as a 59-dimensional vector, and each dimension corresponded to the RSCU value of one sense codon; only codons with synonymous alternatives were included, so AUG, UGG and the three stop codons were excluded.
The relationship between the overall nucleotide composition (U%, A%, C% and G%) and the nucleotide composition at the third codon position (U3%, A3%, C3% and G3%) of the HBV coding sequence, and the relationship between U3%, A3%, C3%, G3% and the codon usage pattern of HBV, were evaluated by Pearson's correlation analysis.
All statistical analyses were carried out with the statistical software SPSS 11.5 for Windows.
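The authors ran PCA in SPSS. Purely to illustrate the transformation described above, the following sketch extracts the first principal component of a strains-by-codons RSCU matrix with power iteration on the covariance matrix, using only the Python standard library. The function name, the use of the covariance (rather than correlation) matrix and the iteration count are assumptions of this example.

```python
def first_principal_component(rows, iterations=500):
    """Return (axis, scores, explained_fraction) for the first principal component.

    rows: one list of RSCU values per strain, all of the same length.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix of the variables (codons).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    axis = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        product = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0:
            break
        axis = [x / norm for x in product]
    eigenvalue = sum(
        axis[i] * sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    # Coordinate of each strain along the first principal axis.
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centered]
    return axis, scores, eigenvalue / total_variance


# Toy data (not RSCU values): four "strains" lying on one line in two dimensions.
axis, scores, explained = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print(axis, scores, explained)
```

The returned fraction plays the role of the percentage of total variation attributed to a component, and the scores are the coordinates that would be plotted for each strain. A second component could be obtained by repeating the procedure after removing the first axis from the covariance matrix.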
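For reference, a correlation coefficient of the kind used in this step can be computed as in the sketch below. It implements the ordinary Pearson product-moment coefficient; if a rank-based coefficient was intended, the same function would be applied to the ranks of the values instead. The function name `pearson_r` and the toy numbers are assumptions, not data from the study.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Toy example: hypothetical C% and C3% values for four sequences.
c_percent = [25.1, 26.0, 26.4, 27.2]
c3_percent = [30.2, 31.5, 31.1, 33.0]
print(pearson_r(c_percent, c3_percent))
```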
Experimental Center of Medicine, Lanzhou General Hospital, Lanzhou Military Area Command; Key lab of Stem cells and Gene Drugs of Gansu Province, Lanzhou 730000, China
Kim SM, Lee KS, Park CJ, Lee JY, Kim KH, Park JY, Lee JH, Kim HY, Yoo JY, Jang MK: Prevalence of occult HBV infection among subjects with normal serum ALT levels in Korea. J Infect 2007, 54: 185-191. 10.1016/j.jinf.2006.02.002
Westover KM, Hughes AL: Evolution of cytotoxic T-lymphocyte epitopes in hepatitis B virus. Infect Genet Evol 2007, 7: 254-262. 10.1016/j.meegid.2006.10.004
Zhang D, Chen J, Deng L, Mao Q, Zheng J, Wu J, Zeng C, Li Y: Evolutionary selection associated with the multi-function of overlapping genes in the hepatitis B virus. Infect Genet Evol 2009, 10: 84-88.
Mizokami M, Orito E, Ohba K, Ikeo K, Lau JY, Gojobori T: Constrained evolution with respect to gene overlap of hepatitis B virus. J Mol Evol 1997, 44(Suppl 1): S83-90.
Jordan IK, Sutter BAt, McClure MA: Molecular evolution of the Paramyxoviridae and Rhabdoviridae multiple-protein-encoding P gene. Mol Biol Evol 2000, 17: 75-86.
Pavesi A: Origin and evolution of overlapping genes in the family Microviridae. J Gen Virol 2006, 87: 1013-1017. 10.1099/vir.0.81375-0
Zaaijer HL, van Hemert FJ, Koppelman MH, Lukashov VV: Independent evolution of overlapping polymerase and surface protein genes of hepatitis B virus. J Gen Virol 2007, 88: 2137-2143. 10.1099/vir.0.82906-0
Stanojevic B, Osiowy C, Schaefer S, Bojovic K, Blagojevic J, Nesic M, Yamashita S, Stamenkovic G: Molecular characterization and phylogenetic analysis of full-genome HBV subgenotype D3 sequences from Serbia. Infect Genet Evol 2011, 11: 1475-1480. 10.1016/j.meegid.2011.05.004
Okamoto H, Imai M, Shimozaki M, Hoshi Y, Iizuka H, Gotanda T, Tsuda F, Miyakawa Y, Mayumi M: Nucleotide sequence of a cloned hepatitis B virus genome, subtype ayr: comparison with genomes of the other three subtypes. J Gen Virol 1986, 67(Pt 11): 2305-2314.
Bartholomeusz A, Schaefer S: Hepatitis B virus genotypes: comparison of genotyping methods. Rev Med Virol 2004, 14: 3-16. 10.1002/rmv.400
Norder H, Courouce AM, Coursaget P, Echevarria JM, Lee SD, Mushahwar IK, Robertson BH, Locarnini S, Magnius LO: Genetic diversity of hepatitis B virus strains derived worldwide: genotypes, subgenotypes, and HBsAg subtypes. Intervirology 2004, 47: 289-309. 10.1159/000080872
Schaefer S, Magnius L, Norder H: Under construction: classification of hepatitis B virus genotypes and subgenotypes. Intervirology 2009, 52: 323-325. 10.1159/000242353
Pourkarim MR, Amini-Bavil-Olyaee S, Lemey P, Maes P, Van Ranst M: Are hepatitis B virus «subgenotypes» defined accurately? J Clin Virol 2010, 47: 356-360. 10.1016/j.jcv.2010.01.015
Pourkarim MR, Lemey P, Amini-Bavil-Olyaee S, Maes P, Van Ranst M: Novel hepatitis B virus subgenotype A6 in African-Belgian patients. J Clin Virol 2009, 47: 93-96.
Schaefer S: Hepatitis B virus genotypes in Europe. Hepatol Res 2007, 37: S20-26. 10.1111/j.1872-034X.2007.00099.x
Schaefer S: Hepatitis B virus taxonomy and hepatitis B virus genotypes. World J Gastroenterol 2007, 13: 14-21.
Karlin S, Mrazek J: What drives codon choices in human genes? J Mol Biol 1996, 262: 459-472. 10.1006/jmbi.1996.0528
Lesnik T, Solomovici J, Deana A, Ehrlich R, Reiss C: Ribosome traffic in E. coli and regulation of gene expression. J Theor Biol 2000, 202: 175-185. 10.1006/jtbi.1999.1047
Liu YS, Zhou JH, Chen HT, Ma LN, Ding YZ, Wang M, Zhang J: Analysis of synonymous codon usage in porcine reproductive and respiratory syndrome virus. Infect Genet Evol 2010, 10: 797-803. 10.1016/j.meegid.2010.04.010
Liu YS, Zhou JH, Chen HT, Ma LN, Pejsak Z, Ding YZ, Zhang J: The characteristics of the synonymous codon usage in enterovirus 71 virus and the effects of host on the virus in codon usage pattern. Infect Genet Evol 2011, 11: 1168-1173. 10.1016/j.meegid.2011.02.018
Zhou T, Gu W, Ma J, Sun X, Lu Z: Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses. Biosystems 2005, 81: 77-86. 10.1016/j.biosystems.2005.03.002
Zhou T, Sun X, Lu Z: Synonymous codon usage in environmental chlamydia UWE25 reflects an evolutional divergence from pathogenic chlamydiae. Gene 2006, 368: 117-125.
Zhou JH, Zhang J, Chen HT, Ma LN, Ding YZ, Pejsak Z, Liu YS: The codon usage model of the context flanking each cleavage site in the polyprotein of foot-and-mouth disease virus. Infect Genet Evol 2011, 11: 1815-1819. 10.1016/j.meegid.2011.07.014
Zhou JH, Zhang J, Chen HT, Ma LN, Liu YS: Analysis of synonymous codon usage in foot-and-mouth disease virus. Vet Res Commun 2010, 34: 393-404. 10.1007/s11259-010-9359-4
Jenkins GM, Holmes EC: The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Res 2003, 92: 1-7. 10.1016/S0168-1702(02)00309-X
Levin DB, Whittome B: Codon usage in nucleopolyhedroviruses. J Gen Virol 2000, 81: 2313-2325.
Coleman JR, Papamichail D, Skiena S, Futcher B, Wimmer E, Mueller S: Virus attenuation by genome-scale changes in codon pair bias. Science 2008, 320: 1784-1787. 10.1126/science.1155761
Karlin S, Blaisdell BE, Schachtel GA: Contrasts in codon usage of latent versus productive genes of Epstein-Barr virus: data and hypotheses. J Virol 1990, 64: 4264-4273.
Zhi N, Wan Z, Liu X, Wong S, Kim DJ, Young NS, Kajigaya S: Codon optimization of human parvovirus B19 capsid genes greatly increases their expression in nonpermissive cells. J Virol 2010, 84: 13059-13062. 10.1128/JVI.00912-10
Zhou J, Liu WJ, Peng SW, Sun XY, Frazer I: Papillomavirus capsid protein expression level depends on the match between codon usage and tRNA availability. J Virol 1999, 73: 4972-4982.
Aragones L, Bosch A, Pinto RM: Hepatitis A virus mutant spectra under the selective pressure of monoclonal antibodies: codon usage constraints limit capsid variability. J Virol 2008, 82: 1688-1700. 10.1128/JVI.01842-07
Aragones L, Guix S, Ribes E, Bosch A, Pinto RM: Fine-tuning translation kinetics selection as the driving force of codon usage bias in the hepatitis A virus capsid. PLoS Pathog 2010, 6: e1000797. 10.1371/journal.ppat.1000797
Karlin S, Doerfler W, Cardon LR: Why is CpG suppressed in the genomes of virtually all small eukaryotic viruses but not in those of large eukaryotic viruses? J Virol 1994, 68: 2889-2897.
Sugiyama T, Gursel M, Takeshita F, Coban C, Conover J, Kaisho T, Akira S, Klinman DM, Ishii KJ: CpG RNA: identification of novel single-stranded RNA that stimulates human CD14+CD11c+ monocytes. J Immunol 2005, 174: 2273-2279.
Zhao S, Zhang Q, Liu X, Wang X, Zhang H, Wu Y, Jiang F: Analysis of synonymous codon usage in 11 human bocavirus isolates. Biosystems 2008, 92: 207-214. 10.1016/j.biosystems.2008.01.006
Das S, Paul S, Dutta C: Synonymous codon usage in adenoviruses: influence of mutation, selection and protein hydropathy. Virus Res 2006, 117: 227-236. 10.1016/j.virusres.2005.10.007
Mueller S, Papamichail D, Coleman JR, Skiena S, Wimmer E: Reduction of the rate of poliovirus protein synthesis through large-scale codon deoptimization causes attenuation of viral virulence by lowering specific infectivity. J Virol 2006, 80: 9687-9696. 10.1128/JVI.00738-06
Sanchez G, Bosch A, Pinto RM: Genome variability and capsid structural constraints of hepatitis a virus. J Virol 2003, 77: 452-459. 10.1128/JVI.77.1.452-459.2003
Deterding K, Constantinescu I, Nedelcu FD, Gervain J, Nemecek V, Srtunecky O, Vince A, Grgurevic I, Bielawski KP, Zalewska M, et al.: Prevalence of HBV genotypes in Central and Eastern Europe. J Med Virol 2008, 80: 1707-1711. 10.1002/jmv.21294
Wiegand J, Hasenclever D, Tillmann HL: Should treatment of hepatitis B depend on hepatitis B virus genotypes? A hypothesis generated from an explorative analysis of published evidence. Antivir Ther 2008, 13: 211-220.
Araujo NM, Waizbort R, Kay A: Hepatitis B virus infection from an evolutionary point of view: How viral, host, and environmental factors shape genotypes and subgenotypes. Infect Genet Evol 2011, 11: 1199-1207. 10.1016/j.meegid.2011.04.017
Kramvis A, Kew MC: Relationship of genotypes of hepatitis B virus to mutations, disease progression and response to antiviral therapy. J Viral Hepat 2005, 12: 456-464. 10.1111/j.1365-2893.2005.00624.x
McMahon BJ: The influence of hepatitis B virus genotype and subgenotype on the natural history of chronic hepatitis B. Hepatol Int 2009, 3: 334-342. 10.1007/s12072-008-9112-z
You J, Sriplung H, Chongsuvivatwong V, Geater A, Zhuang L, Huang JH, Chen HY, Yu L, Tang BZ: Profile, spectrum and significance of hepatitis B virus genotypes in chronic HBV-infected patients in Yunnan, China. Hepatobiliary Pancreat Dis Int 2008, 7: 271-279.
Erhardt A, Blondin D, Hauck K, Sagir A, Kohnle T, Heintges T, Haussinger D: Response to interferon alfa is hepatitis B virus genotype dependent: genotype A is more sensitive to interferon than genotype D. Gut 2005, 54: 1009-1013. 10.1136/gut.2004.060327
Sharp PM, Li WH: An evolutionary perspective on synonymous codon usage in unicellular organisms. J Mol Evol 1986, 24: 28-38. 10.1007/BF02099948
Wong EH, Smith DK, Rabadan R, Peiris M, Poon LL: Codon usage bias and the evolution of influenza A viruses. Codon Usage Biases of Influenza Virus. BMC Evol Biol 2010, 10: 253. 10.1186/1471-2148-10-253
Wright F: The 'effective number of codons' used in a gene. Gene 1990, 87: 23-29. 10.1016/0378-1119(90)90491-9
This work was supported by grants from the National Natural Science Foundation of China (No. 81060015) and Provincial Natural Science Foundation of China (1107RJ2A114).
The authors declare that they have no competing interests.
RMM and HL carried out the molecular genetic studies, participated in the sequence alignment and drafted the manuscript. MLW and FXZ participated in the sequence alignment. SDZ, GL and YW participated in the design of the study and performed the statistical analysis. XQH conceived of the study, participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.