Analysis of synonymous codon usage in Hepatitis A virus
- Yiqiang Zhang†1, 2,
- Yongsheng Liu†1,
- Wenqian Liu1,
- Jianhua Zhou1,
- Haotai Chen1,
- Yin Wang2,
- Lina Ma1,
- Yaozhong Ding1 and
- Jie Zhang1
© Zhang et al; licensee BioMed Central Ltd. 2011
- Received: 10 December 2010
- Accepted: 16 April 2011
- Published: 16 April 2011
Hepatitis A virus (HAV) is the causative agent of type A viral hepatitis, which causes occasional acute hepatitis. Nevertheless, little information is available about the synonymous codon usage pattern of the HAV genome over the course of its evolution. In this study, the key genetic determinants of codon usage in HAV were examined.
The overall extent of codon usage bias in HAV is high compared with other members of Picornaviridae. The patterns of synonymous codon usage differ markedly among HAV genomes from different locations. Base composition is closely correlated with codon usage bias. Furthermore, the most important determinant of the high codon bias in HAV is mutation pressure rather than natural selection.
HAV presents a higher codon usage bias than other members of Picornaviridae. Compositional constraint is a significant factor influencing the variation of synonymous codon usage in the HAV genome. In addition, mutation pressure is likely the major factor shaping the hyperendemic codon usage pattern of HAV.
- Codon Usage
- Synonymous Codon
- Codon Usage Bias
- Synonymous Codon Usage
- Relative Synonymous Codon Usage
Hepatitis A virus (HAV), the causative agent of type A viral hepatitis, is an ancient human virus that was first identified in the stools of infected people in 1973. HAV is a non-enveloped, single-stranded, positive-sense RNA virus that belongs to the genus Hepatovirus, family Picornaviridae, order Picornavirales [2–4]. The HAV genome is approximately 7500 nucleotides in length and contains a large open reading frame (ORF) encoding a polyprotein in which the major capsid proteins represent the amino-terminal third, with the remainder of the polyprotein comprising a series of nonstructural proteins required for HAV RNA replication: 2B, 2C, 3A, 3B, 3Cpro and 3Dpol. On the basis of genetic studies, HAV has been proposed to be divided into six different genotypes. However, there is only one known serological group of human HAV [6, 7]. Although HAV causes occasional, dramatic outbreaks of acute hepatitis with fatal outcomes in otherwise healthy adults, as well as isolated severe cases of hepatitis, it has never been associated with chronic liver disease.
The genetic code uses 64 codons to represent the 20 standard amino acids and the stop signals. Alternative codons for the same amino acid are termed synonymous codons. Synonymous codons usually differ at the third base position and can be interchanged without altering the primary sequence of the polypeptide product. Previous reports indicate that synonymous codons are not used equally, either within or between genomes [9–13]. In general, codon usage variation may be the product of natural selection for accurate and efficient translation and/or of mutation pressure in various organisms [14–21]. Codon usage variation is widely regarded as an indicator of the forces shaping genome evolution. In addition, mutation pressure plays a more important role than natural selection in shaping the synonymous codon usage pattern of some RNA viruses [18, 22, 23].
Nevertheless, little information is available about the codon usage pattern of the HAV genome, including the relative synonymous codon usage (RSCU) and codon usage bias (CUB), over the course of its evolution. In this study, the key genetic determinants of codon usage in HAV were examined.
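The degeneracy described above can be made concrete with a short sketch. It builds the standard genetic code (an assumption; the text does not name a translation table) and groups codons into synonymous families. The names `CODON_TABLE` and `synonymous_families` are illustrative only.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_families(table=CODON_TABLE):
    """Group codons by the amino acid (or stop signal) they encode."""
    families = defaultdict(list)
    for codon, aa in table.items():
        families[aa].append(codon)
    return dict(families)


families = synonymous_families()
sense = {aa: codons for aa, codons in families.items() if aa != "*"}
# Codons with at least one synonymous alternative (this excludes Met and Trp).
degenerate = [c for codons in sense.values() if len(codons) > 1 for c in codons]

print(len(CODON_TABLE), len(sense), len(degenerate))  # 64 20 59
```

The 59 codons that remain after excluding Met, Trp and the stop codons are the ones for which synonymous choice is possible.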
Synonymous codon usage in HAV
Identified nucleotide contents in complete coding region (length >250 bps) in hepatitis A virus (21 isolates) genome
Synonymous codon usage of the whole coding sequence in hepatitis A virus
Correspondence analysis (COA)
Compositional properties of HAV genomes
Summary of correlation analysis between the A, U, C, G contents and A3, U3, C3, G3 contents in all selected samples.
r = 0.965** | r = -0.160NS | r = -0.328NS | r = -0.555** | r = -0.679**
r = 0.357NS | r = 0.691** | r = -0.853** | r = -0.164NS | r = -0.825**
r = -0.622** | r = -0.191NS | r = 0.926** | r = -0.140NS | r = 0.662**
r = -0.532* | r = -0.181NS | r = -0.139NS | r = 0.945** | r = 0.580**
r = -0.844** | r = -0.270NS | r = 0.687** | r = 0.462* | r = 0.907**
Analysis of correlation between the first two principal axes and nucleotide contents in samples.
ƒ'1 (26.98%) | ƒ'2 (19.50%)
r = -0.714** | r = -0.573**
r = 0.302NS | r = 0.151NS
r = 0.274NS | r = 0.332NS
r = 0.178NS | r = 0.433*
r = 0.361NS | r = 0.589**
Mutational bias is another main factor leading to codon usage variation
Over time, more and more features have been found to be unique to HAV within the family Picornaviridae, including its tissue tropism, its virion morphogenesis, its genetic distance from other members of the family, important details of the processing of the viral polyprotein, and the interactions of the virus with host cells. After analyzing synonymous codon usage in HAV (Table 2), we found that, compared with other viruses of Picornaviridae such as Coxsackievirus A9 (ENC = 55.6), Enterovirus 71 (ENC = 56.6), Poliovirus type 3 (ENC = 54.2), Rhinovirus type 89 (ENC = 45.9) and Foot-and-Mouth Disease virus (mean ENC = 51.53), the ENC values for HAV are clearly lower (mean ENC = 39.34). Although the ENC values for Coxsackievirus, Enterovirus, Poliovirus and Rhinovirus are single values rather than means, the comparison still suggests that the overall extent of codon usage bias in HAV genomes is high within Picornaviridae. Indeed, Sánchez et al. previously reported that HAV presents a higher codon usage bias than other members of the family, which is reflected in its adaptation in the use of abundant and rare codons. As a result, HAV codon usage has evolved to be complementary to that of human cells, never adopting the codons that are abundant in the host cell, and in some instances even using these abundant codons as rare codons.
Since the variation and evolution of a virus generally appear as changes in the composition of its genome, compositional constraint is assumed to be closely correlated with the synonymous codon usage pattern [18, 19, 27–30]. The content of nucleotide U was the highest, and U3% was much higher than the proportion of any other base at the third codon position (Table 3), which explains why most of the preferentially used codons are U-ended (Table 2). Although U3% was the highest, the major compositional constraint shaping the synonymous codon usage pattern of the HAV genome came from the percentage of nucleotide A at the third codon position (Table 4). Moreover, the two principal axes (ƒ'1 and ƒ'2) are not correlated with the other base compositions except nucleotide A (Table 4). This finding differs from many reports suggesting that C+G compositional constraints are the major factor influencing codon usage bias in virus genomes [18, 29, 30]. We therefore suppose that the compositional constraint comes not only from C+G content but also from A and/or U content. In addition, A3% is strongly correlated with (C+G)% (Table 3). Hence, we infer that A3% may influence the synonymous codon usage pattern through its relationship with (C+G)%. Moreover, each base composition was closely correlated with at least one of the others, and the correlations between different bases were strikingly negative. The (C3+G3)% was correlated with all the base compositions, especially the U and C contents. All these data suggest that complex interrelations exist among these base compositions and regulate the codon usage bias. In brief, compositional constraint can indeed determine the variation of synonymous codon usage in the virus genome.
Mutational pressure and natural selection are generally thought to be the main factors that account for codon usage variation between genes in different organisms [14–21]. We wished to determine which of them is responsible for the extreme codon usage bias in HAV. In the present study, mutational pressure was found to be the more important factor: as shown in Figure 2, codon usage in the HAV genome is influenced by the C+G content, which is usually assumed to result from mutational pressure. Indeed, it has previously been reported that mutation pressure rather than natural selection is the most important determinant of codon bias in human RNA viruses. Since mutation rates in RNA viruses are much higher than those in DNA viruses, it is understandable that mutational pressure is the major factor shaping the codon usage pattern in the 21 HAV strains included in our study. Despite this, HAV does not appear to undergo the rapid accumulation of genetic changes seen in many RNA viruses. Because HAV exploits a very low translation rate and a very low replication rate to promote and ensure its survival [26, 32], it shows a much lower mutation rate than other members of the family Picornaviridae [24, 33].
Since the mutation rate of HAV is much lower than that of other members of the family Picornaviridae, how does it develop a higher codon usage bias than they do? Furthermore, how do distinct trends in codon usage variation arise among different strains (Figure 1) when their nucleotide contents are similar (Table 2)? This could be ascribed to the distinct endemicity of HAV, as suggested by the results of COA. Early comparative studies of the nucleotide sequences of different human HAV strains suggested that sequence relatedness could be correlated with the geographical origin of the viruses [34, 35]. Quasispecies dynamics is characterized by the continuous generation of variant viral genomes, competition among them, and selection of the fittest mutant distributions in any given environment. Like other RNA viruses, HAV exists in vivo as distributions of closely related variants referred to as quasispecies [25, 32]. HAV strains have maintained their low rate of accumulating mutations over a long period of time, so that they have developed specific ecological niches. Through survival in different geographical areas, different human populations and different rounds of replication, the extreme codon usage bias of HAV was established over a long time. Moreover, in the context of a very low mutation rate, this extreme codon usage bias was conserved, so that a distinct endemicity was generated.
HAV presents a higher codon usage bias than other members of Picornaviridae. The most important determinant of the high codon bias in HAV is mutation pressure, which is also the main factor shaping the hyperendemic codon usage pattern of HAV even though the mutation rate of HAV is quite low. In addition, compositional constraint is another factor influencing synonymous codon usage in HAV. Although the present study establishes basic knowledge of the codon usage patterns of HAV and of the factors regulating synonymous codon usage, more comprehensive analysis is needed to reveal the deeper characteristics of synonymous codon usage in the HAV genome.
Information of hepatitis A virus genomes used in this study
Derived from HM-175
Vaccine strain IA
Derived from H2
Derived from H2
Measures of relative synonymous codon usage
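The RSCU value of each codon was calculated as

\[ \mathrm{RSCU}_{ij} = \frac{g_{ij}}{\sum_{j=1}^{n_i} g_{ij}} \, n_i \]

where g_{ij} is the observed number of the j th codon for the i th amino acid, which has n_i kinds of synonymous codons. An RSCU value close to 1.0 means that the codon is chosen equally and randomly.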
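As an illustration of the RSCU definition (the observed count of a codon, divided by the total count for its amino acid, multiplied by the number of synonymous codons for that amino acid), a minimal Python sketch follows. It assumes the standard genetic code and an in-frame coding sequence; the function name `rscu` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks stop codons (assumed table).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (DNA or RNA)."""
    seq = cds.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    values = {}
    for aa in set(AMINO_ACIDS) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU is undefined
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


# Toy sequence (hypothetical): four alanine codons, three GCU and one GCC.
toy = rscu("GCTGCTGCTGCC")
print(toy["GCU"], toy["GCC"], toy["GCA"])  # 3.0 1.0 0.0
```

In this sketch Met and Trp, which have a single codon each, always receive an RSCU of 1.0 when present; an analysis restricted to the 59 synonymous codons would simply drop them.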
The effective number of codons (ENC) was calculated to quantify the codon usage bias of an ORF, and is regarded as the best estimator of absolute synonymous codon usage bias. The greater the extent of codon preference in a gene, the smaller the ENC value. The index GC3s was used to calculate the fraction of G+C nucleotides at the synonymous third codon position (excluding Met, Trp, and the termination codons).
Multivariate statistical analysis can be used to explore the relationships between variables and samples. In this study, correspondence analysis was used to investigate the major trend in codon usage variation among genes. The complete coding region of each gene was represented as a 59-dimensional vector, in which each dimension corresponds to the RSCU value of one sense codon (excluding Met, Trp, and the termination codons).
Correlation analysis was used to identify the relationship between nucleotide composition and synonymous codon usage pattern. This analysis was based on Spearman's rank correlation.
All statistical analyses were carried out with the statistical software SPSS 11.5 for Windows.
This work was supported in part by grants from the National Key Technologies R&D Program (2009ZX08007-006B), the International Science & Technology Cooperation Program of China (No. 2010DFA32640) and the Science and Technology Key Project of Gansu Province (No. 0801NKDA034). This study was also supported by the National Natural Science Foundation of China (No. 30700597 and No. 31072143).
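The text does not restate the formulas behind these two indices. The sketch below assumes Wright's original formulation of the effective number of codons (the work cited for ENC in the reference list) and a plain third-position G+C fraction over synonymous codons for GC3s. The function names, the handling of sparse sequences and the `expected_enc` helper (Wright's expectation when codon choice depends on GC3s alone) are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks stop codons (assumed table).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def codon_counts(cds):
    """Count in-frame codons of a DNA or RNA coding sequence."""
    seq = cds.upper().replace("T", "U")
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))


def enc(cds):
    """Effective number of codons (20 = extreme bias, 61 = no bias)."""
    counts = codon_counts(cds)
    homozygosity = defaultdict(list)  # degeneracy class -> F values
    for aa, family in FAMILIES.items():
        k = len(family)
        n = sum(counts[c] for c in family)
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few codons to estimate F
        sum_p2 = sum((counts[c] / n) ** 2 for c in family)
        homozygosity[k].append((n * sum_p2 - 1) / (n - 1))
    mean_f = {k: sum(v) / len(v) for k, v in homozygosity.items()}
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2  # fallback when Ile is absent
    if any(mean_f.get(k, 0) <= 0 for k in (2, 3, 4, 6)):
        raise ValueError("sequence too short or too sparse to estimate ENC")
    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)


def gc3s(cds):
    """Fraction of G+C at the third position of synonymous codons."""
    counts = codon_counts(cds)
    synonymous = [c for fam in FAMILIES.values() if len(fam) > 1 for c in fam]
    total = sum(counts[c] for c in synonymous)
    if total == 0:
        raise ValueError("no synonymous codons found")
    return sum(counts[c] for c in synonymous if c[2] in "GC") / total


def expected_enc(s):
    """ENC expected when codon choice is constrained only by GC3s = s."""
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)
```

Comparing `enc(cds)` with `expected_enc(gc3s(cds))` for each genome is one common way to judge how far base composition alone accounts for the observed bias.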
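The software used for the correspondence analysis is not described here, so the following is only a sketch of the general technique: it treats a non-negative table (rows as genomes, columns as codons) in the usual way, forming standardised residuals from the row and column masses and extracting the first principal axis by power iteration. The function name and the toy table are hypothetical. Every row and column must have a non-zero sum, and the table must not be exactly independent.

```python
def correspondence_first_axis(table, iterations=200):
    """Return (row coordinates on axis 1, share of total inertia on axis 1)."""
    total = sum(map(sum, table))
    n_rows, n_cols = len(table), len(table[0])
    r = [sum(row) / total for row in table]  # row masses
    c = [sum(table[i][j] for i in range(n_rows)) / total for j in range(n_cols)]  # column masses
    # Standardised residuals: S_ij = (p_ij - r_i * c_j) / sqrt(r_i * c_j)
    S = [[(table[i][j] / total - r[i] * c[j]) / (r[i] * c[j]) ** 0.5
          for j in range(n_cols)] for i in range(n_rows)]
    # M = S S^T; its eigenvalues are the principal inertias of the axes.
    M = [[sum(S[i][k] * S[j][k] for k in range(n_cols)) for j in range(n_rows)]
         for i in range(n_rows)]
    inertia = sum(M[i][i] for i in range(n_rows))
    v = [1.0 + i for i in range(n_rows)]  # arbitrary start vector
    for _ in range(iterations):  # power iteration towards the leading eigenvector
        w = [sum(M[i][j] * v[j] for j in range(n_rows)) for i in range(n_rows)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(M[i][j] * v[j] for j in range(n_rows))
                     for i in range(n_rows))
    coords = [v[i] * eigenvalue ** 0.5 / r[i] ** 0.5 for i in range(n_rows)]
    return coords, eigenvalue / inertia


# Toy table (hypothetical values): three samples described by two variables.
coords, share = correspondence_first_axis([[10, 2], [3, 9], [5, 5]])
print(coords, share)
```

The returned share corresponds to the percentage reported for a principal axis, and the row coordinates are what would be plotted to compare samples along that axis.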
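For reference, Spearman's rank correlation is the Pearson correlation of the ranks of the two variables, with tied values given their average rank. The study used SPSS; the pure-Python sketch below only illustrates the statistic, and the function names and sample values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: overall base content versus third-position content.
overall = [29.1, 29.4, 28.8, 30.2]
third_position = [27.5, 28.0, 26.9, 29.3]
print(spearman(overall, third_position))
```

Because only ranks enter the calculation, the coefficient is insensitive to monotonic transformations of either variable, which suits percentage data such as base composition.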
- Feinstone SM, Kapikian AZ, Purcell RH: Hepatitis A: detection by immune electron microscopy of a virus-like antigen associated with acute illness. Science 1973, 182: 1026-1028. 10.1126/science.182.4116.1026
- Melnick JL: Classification of hepatitis A virus as enterovirus type 72 and of hepatitis B virus as hepadnavirus type I. Intervirology 1982, 18: 105-106. 10.1159/000149313
- Gust ID, Coulepis AG, Feinstone SM, Locarnini SA, Moritsugu Y, Najera R, Siegl G: Taxonomic classification of hepatitis A virus. Intervirology 1983, 20: 1-7. 10.1159/000149367
- Minor PD: Picornaviridae. In Classification and Nomenclature of Viruses: Fifth Report of the International Committee on Taxonomy of Viruses, Archives of Virology, Supplementum. Volume 2. Edited by: Francki RIB, Fauquet CM, Knudson DL, Brown F. Wien: Springer Verlag; 1991:320-326.
- Costa-Mattioli M, Ferre V, Casane D, Perez-Bercoff R, Coste-Burel M, Imbert-Marcille BM, Andre EC, Bressollette-Bodin C, Billaudel S, Cristina J: Evidence of recombination in natural populations of hepatitis A virus. Virology 2003, 311: 51-59. 10.1016/S0042-6822(03)00109-0
- Rakela J, Fay OH, Stevenson K, Gordon I, Mosley JW: Similarities of two hepatitis A virus strains. WHO Bulletin 1976, 54: 561-564.
- Lemon SM, Binn LN: Antigenic relatedness of two strains of hepatitis A virus determined by cross-neutralization. Infection and Immunity 1983, 42: 418-420.
- Martin A, Lemon SM: Hepatitis A virus: from discovery to vaccines. Hepatology 2006, 43: S164-S172. 10.1002/hep.21052
- Grantham R, Gautier C, Gouy M, Mercier R, Pave A: Codon catalog usage and the genome hypothesis. Nucleic Acids Res 1980, 8: 49-62.
- Martin A, Bertranpetit J, Oliver JL: Variation in G+C content and codon choice: differences among synonymous codon groups in vertebrate genes. Nucleic Acids Res 1989, 17: 6181-6189. 10.1093/nar/17.15.6181
- Lloyd AT, Sharp PM: Evolution of codon usage patterns: the extent and nature of divergence between Candida albicans and Saccharomyces cerevisiae. Nucleic Acids Res 1992, 20: 5289-5295. 10.1093/nar/20.20.5289
- Xie T, Ding D, Tao X, Dafu D: The relationship between synonymous codon usage and protein structure. FEBS Lett 1998, 434: 93-96. 10.1016/S0014-5793(98)00955-7
- Dittmar KA, Goodenbour JM, Pan J: Tissue-specific differences in human transfer RNA expression. PLoS Genet 2006, 2: 2107-2115. 10.1371/journal.pgen.0020221
- Sharp PM, Li WH: Codon usage in regulatory genes in Escherichia coli does not reflect selection for 'rare' codon. Nucleic Acids Res 1986, 14: 7737-7749. 10.1093/nar/14.19.7737
- Sharp PM, Tuohy TM, Mosurski KR: Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 1986, 14: 5125-5143. 10.1093/nar/14.13.5125
- Karlin S, Mrázek J: What drives codon choices in human genes? J Mol Biol 1996, 262: 459-472. 10.1006/jmbi.1996.0528
- Lesnik T, Solomovici J, Deana A, Ehrlich R, Reiss C: Ribosome traffic in E.coli and regulation of gene expression. J Theor Biol 2000, 202: 175-185. 10.1006/jtbi.1999.1047
- Gu WJ, Zhou T, Ma JM, Sun X, Lu ZH: Analysis of synonymous codon usage in SARS coronavirus and other viruses in the Nidovirales. Virus Res 2004, 101: 155-161. 10.1016/j.virusres.2004.01.006
- Zhou T, Gu WJ, Ma JM, Sun X, Lu ZH: Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses. Biosystems 2005, 81: 77-86. 10.1016/j.biosystems.2005.03.002
- Zhou T, Sun X, Lu ZH: Synonymous codon usage in environmental Chlamydia UWE25 reflects an evolution divergence from pathogenic chlamydiae. Gene 2006, 368: 117-125. 10.1016/j.gene.2005.10.035
- Zhong JC, Li YM, Zhao S, Liu S, Zhang Z: Mutation pressure shapes codon usage in the GC-rich genome of foot-and-mouth disease virus. Virus Genes 2007, 35: 767-776. 10.1007/s11262-007-0159-z
- Levin DB, Whittome B: Codon usage in nucleopolyhedroviruses. J Gen Virol 2000, 81: 2313-2325.
- Jenkins GM, Holmes EC: The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Res 2003, 92: 1-7. 10.1016/S0168-1702(02)00309-X
- Cristina J, Costa-Mattioli M: Genetic variability and molecular evolution of Hepatitis A virus. Virus Res 2007, 127: 151-157. 10.1016/j.virusres.2007.01.005
- Sánchez G, Bosch A, Pintó RM: Genome variability and capsid structural constraints of hepatitis A virus. J Virol 2003, 77: 452-459.
- Pintó RM, Aragonès L, Costafreda MI, Ribes E, Bosch A: Codon usage and replicative strategies of hepatitis A virus. Virus Res 2007, 127: 158-163.
- Bernardi G, Bernardi G: Codon Usage and Genome Composition. J Mol Evol 1985, 22: 363-365. 10.1007/BF02115693
- Bernardi G, Bernardi G: Compositional Constraints and Genome Evolution. J Mol Evol 1986, 24: 1-11. 10.1007/BF02099946
- Das S, Paul S, Dutta C: Synonymous codon usage in adenoviruses: Influence of mutation, selection and protein hydropathy. Virus Res 2006, 117: 227-236. 10.1016/j.virusres.2005.10.007
- Zhao S, Zhang Q, Liu XL, Wang XM, Zhang HL, Wu Y, Jiang F: Analysis of synonymous codon usage in 11 Human Bocavirus isolates. BioSystems 2008, 92: 207-214. 10.1016/j.biosystems.2008.01.006
- Drake JW, Holland JJ: Mutation rates among RNA viruses. Proc Natl Acad Sci USA 1999, 96: 13910-13913. 10.1073/pnas.96.24.13910
- Costa-Mattioli M, Domingo E, Cristina J: Analysis of sequential hepatitis A virus strains reveals coexistence of distinct viral subpopulations. J Gen Virol 2006, 87: 115-118. 10.1099/vir.0.81286-0
- Robertson BH, Jansen RW, Khanna B, Totsuka A, Nainan OV, Siegl G, Widell A, Margolis HS, Isomura S, Ito K, Ishizu T, Moritsugu Y, Lemon SM: Genetic relatedness of hepatitis A virus strains recovered from different geographical regions. J Gen Virol 1992, 73: 1365-1377. 10.1099/0022-1317-73-6-1365
- Jansen RW, Siegl G, Lemon SM: Molecular epidemiology of human hepatitis A virus defined by an antigen-capture/polymerase chain reaction method. Proc Natl Acad Sci USA 1990, 87: 2867-2871. 10.1073/pnas.87.8.2867
- Robertson BH, Khanna B, Nainan OV, Margolis HS: Epidemiologic patterns of wild-type hepatitis A virus determined by genetic variation. J Infect Dis 1991, 163: 286-292. 10.1093/infdis/163.2.286
- Wright F: The 'effective number of codons' used in a gene. Gene 1990, 87: 23-29. 10.1016/0378-1119(90)90491-9
- Comeron JM, Aguade M: An evaluation of measures of synonymous codon usage bias. J Mol Evol 1998, 47: 268-274. 10.1007/PL00006384
- Mardia KV, Kent JT, Bibby JM: Multivariate analysis. New York, Academic Press; 1979.
- Ewens WJ, Grant GR: Statistical Methods in Bioinformatics. New York, Springer; 2001.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.