Patterns and influencing factor of synonymous codon usage in porcine circovirus
© LIU et al; licensee BioMed Central Ltd. 2012
Received: 13 September 2011
Accepted: 15 March 2012
Published: 15 March 2012
Analysis of codon usage can reveal much about the molecular evolution of viruses. Nevertheless, little information is available about the synonymous codon usage pattern of the porcine circovirus (PCV) genome during its evolution. In this study, the patterns and key determinants of codon usage in PCV were examined to give a new understanding of the evolutionary characteristics of PCV and of the effects of natural selection from its host on the codon usage pattern of the virus.
We carried out a comprehensive analysis of the codon usage pattern in the PCV genome by calculating the relative synonymous codon usage (RSCU), the effective number of codons (ENC), the dinucleotide frequencies and the nucleotide content of the PCV genome.
PCV genomes have relatively low GC content and weak codon preference, indicating that nucleotide constraints have a major impact on synonymous codon usage. The results of the correspondence analysis indicate that codon usage patterns vary greatly among PCV genotypes and subgenotypes, and that codon usage patterns differ significantly among the viruses of Circoviridae. PCV and its host are highly similar in synonymous codon usage, suggesting that natural selection pressure from the host also affects the codon usage patterns of PCV. In particular, the synonymous codon usage of PCV genotype II is more similar to that of the pig than to that of PCV genotype I, which may be one of the most important molecular mechanisms by which PCV genotype II causes disease. The calculated relative abundance of dinucleotides indicates that dinucleotide composition also plays a key role in the variation found in synonymous codon usage in PCV. Furthermore, geographic factors, the general average hydrophobicity and the aromaticity may be related to the formation of codon usage patterns of PCV.
These results suggest that the synonymous codon usage pattern of the PCV genome results from the interaction between mutation pressure and natural selection from its host. The information from this study may not only have theoretical value for understanding the characteristics of synonymous codon usage in PCV genomes, but may also be of significant value for studying the molecular evolution of PCV.
Genetic information is transmitted from mRNA to protein in triplet codons. Each amino acid is encoded by at least one and at most six codons. Codons that encode the same amino acid are called synonymous codons. During protein biosynthesis, synonymous codons are not used with equal probability: some species or genes tend to use one or several particular synonymous codons. These are called preferred codons, and the phenomenon is called codon bias. Codon usage bias has been studied in various species, and synonymous codons have been found not to be used randomly during protein biosynthesis [1–3]. Many studies have indicated that obvious bias exists among different genes from different species or from the same species [4–6]. Codon usage bias is influenced mainly by mutation bias, translational selection, protein secondary structure, replication and selective transcription, protein hydrophobicity and hydrophilicity, and the external environment [7–13].
PCV belongs to the genus Circovirus of the family Circoviridae. It has two genotypes, PCV genotype I and PCV genotype II, and it is the smallest virus discovered so far. Of the two genotypes, PCV genotype II infection and its related diseases have become a major problem for pig farming across the globe and seriously threaten the development of the pig industry. The PCV genome is a single-stranded, negative-sense, circular DNA and is very small: the full length of PCV genotype I is only 1,759 bp, and that of PCV genotype II is 1,767 bp or 1,768 bp. The genome contains 11 open reading frames (ORFs), of which ORF1 encodes the replication-associated proteins (Rep and Rep'), ORF2 encodes the structural proteins (viral capsid proteins, Cap), and ORF3 encodes toxicity-associated proteins that can cause apoptosis [15, 16]. Analysis of whole PCV genome sequences has shown that ORF2 is under weaker selective pressure than ORF1 and carries more mutations. Nucleotide sequences of strains within the same genotype are highly conserved, with homology over 90%, whereas the similarity between nucleotide sequences of strains from the two genotypes is less than 80% [17, 18]. However, codon usage in PCV has not been studied so far. Explaining the codon usage pattern of PCV is relevant to PCV evolution, gene prediction, gene classification, the design of highly expressed genes and viral vectors, and the understanding of the interaction between PCV and its host cells. Therefore, in this study we performed the first comprehensive analysis of the codon usage pattern of the PCV genome and of the factors that affect it. This study should help to explain the evolution of the PCV genome and support further studies.
The characteristics of synonymous codon usage in PCV
The relative synonymous codon usage frequency (RSCU) of PCV and swine
Nucleotide content of 28 PCV genomes
Nucleotide content of all PCV genomes
The correlation analysis between the A, U, C, G contents and the A3, U3, C3, G3 contents in all ORFs of PCV (table columns: A3%, U3%, G3%, C3%)
The correlation analysis between the first two axes in CA and the nucleotide contents of PCV (table columns: A3%, U3%, G3%, C3%; table rows: f'1, f'2)
Genetic relationship based on synonymous codon usage
Relationships between codon usage pattern of PCV and that of the host
Relationship between dinucleotide biases and codon usage in PCV
Relative abundance of dinucleotides in PCV
Relative abundance of the 16 dinucleotides (mean ± SD; values are listed in table order, eight per row, without the dinucleotide column labels)
0.924 ± 0.053, 0.870 ± 0.032, 1.045 ± 0.029, 0.905 ± 0.085, 0.881 ± 0.043, 1.059 ± 0.037, 0.852 ± 0.063, 0.826 ± 0.028
0.887 ± 0.042, 0.927 ± 0.044, 0.890 ± 0.027, 0.887 ± 0.033, 1.105 ± 0.056, 1.046 ± 0.097, 1.051 ± 0.029, 0.622 ± 0.029
Synonymous codon usage in different viruses of Circoviridae is virus specific
Effect of other factors on codon usage
During protein biosynthesis, synonymous codons are not used randomly, and some species or genes prefer one or several particular synonymous codons; this is called codon usage bias. Previous studies reveal that different genes from different species or from the same species show obvious codon usage bias [21, 22]. Codon usage bias is influenced mainly by mutation bias [23, 24], translational selection [25, 26], protein secondary structure [20, 27], replication and transcription selection, mRNA secondary structure, gene length, tRNA abundance, gene function and the external environment. However, most of these studies focus on higher organisms and on microorganisms with large genomes and many genes; few address viruses with small genomes and few genes, or compare a virus with its host. Reports on codon usage are comparatively more common for viruses that cause great harm to humans, such as SARS coronavirus, human immunodeficiency virus, influenza A virus and hepatitis virus. PCV is a primary pathogen of postweaning multisystemic wasting syndrome (PMWS), whose increasing occurrence in recent years has seriously threatened the pig industry and caused great economic losses worldwide. Further study of the codon usage pattern in PCV is therefore important for understanding the mutation pattern and molecular evolution of PCV. However, reports on the codon usage pattern in PCV are rare, and this study is the first report.
Compared with reported DNA viruses such as duck plague virus, duck enteritis virus, iridovirus and herpesvirus [33–36], synonymous codon usage bias in the PCV genome is low overall (the average ENC value is 56.80, and the minimum is 55.). This suggests that low codon bias may help PCV to increase its replication efficiency as it adapts to the replication system of its host.
In this study, the relation between the main indices of the correspondence analysis of PCV codon usage (f'1 and f'2) and the nucleotide composition (see Table 2) indicates that mutation pressure has a significant role in PCV codon usage. Other factors that can influence PCV codon usage were also analysed, and the initial results show that mutation pressure is the main factor influencing the variation in PCV codon usage.
There have been reports that natural selection can influence the synonymous codon usage pattern of viruses, and the same conclusion is reached in this study. Three lines of evidence support it. First, the PCV genome is GC3-poor (average GC3% = 47.08, SD = 2.88), but most of the preferentially used codons are G/T-ended. Meanwhile, the average A3% is higher than the average T3%, yet among the codons that PCV prefers only three end in A while six end in T. Second, high similarity exists between the codon usage of PCV and that of its natural host. Third, CpG and the synonymous codons containing it are suppressed. Together, these three lines of evidence indicate that natural selection is involved in shaping the synonymous codon usage pattern of PCV.
At present, PCV is divided into two genotypes, PCV genotype I and PCV genotype II, according to differences in pathogenicity, antigenicity and nucleotide sequence; PCV genotype II includes various subtypes. The differences in codon usage between genotypes shown in Figure 1 suggest that PCV codon bias may be associated with genotype. In addition, the results of this study indicate that geographic factors may be related to codon usage in PCV. Some reports find a correlation between gene length and codon usage, whereas in some viruses gene length has no effect on codon usage. We examined codon usage bias and gene length in PCV by correlation analysis and found no notable correlation between them in these viral genes (Spearman, r = 0.075, p > 0.1). The results indicate that PCV gene length has no effect on synonymous codon usage. Other factors, including GRAVY and aromaticity, may also significantly influence the codon usage of PCV.
Taken together, the codon usage patterns of PCV possibly result from interactions between natural selection and mutation pressure. These results not only provide an insight into the variation of codon usage pattern among the genomes of PCV, but also may help in understanding the processes governing the evolution of PCV.
Materials and methods
PCV genome sequences included in this study
Swine genes used in this study
Viral sequence of Circoviridae used in this study
Beak and feather disease virus
Chicken anemia virus
Synonymous codon usage measures
To eliminate the influence of amino acid composition on codon usage and to reflect codon usage characteristics directly, this study evaluates synonymous codon usage bias by estimating the relative synonymous codon usage (RSCU). The RSCU value of a codon is the ratio of its observed usage frequency in the gene sample to the frequency expected if all codons in its synonymous family were used equally. If the synonymous codons of an amino acid are used without preference, that is, the observed frequency is close to the expected frequency, the RSCU values are equal to 1; an RSCU value greater than 1 means that the codon is used more often than expected, and a value less than 1 means that it is used less often than expected.
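The RSCU definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the CodonW implementation used in the study: the function name `rscu`, the use of the standard genetic code in DNA letters and the choice to skip amino acids that do not occur in the sequence are assumptions made for the example.

```python
from collections import Counter

# Standard genetic code in DNA letters (T in place of U), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group the sense codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        # Met and Trp have a single codon; unused amino acids give no information.
        if len(codons) == 1 or total == 0:
            continue
        for c in codons:
            # observed count divided by the count expected under equal usage
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Alanine is encoded by GCT, GCC, GCA and GCG; here GCT is used 3 times out of 4.
    print(rscu("GCTGCTGCTGCC")["GCT"])  # 3.0
```

With all 18 degenerate amino acids present, the function returns values for the 59 codons that enter the correspondence analysis.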
The codon bias of a single gene is measured mainly by the effective number of codons (ENC). The ENC value reflects how unequally synonymous codons are used within codon families. It ranges from 20 (each amino acid uses only one codon) to 61 (all synonymous codons are used equally). The closer the ENC value is to 20, the more non-random the codon usage and the stronger the bias. Genes are generally considered to have significant codon bias when ENC ≤ 35. The RSCU and ENC values were obtained with the CodonW program.
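The ENC measure can be sketched from the formula in Wright's paper listed in the references, ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the mean codon homozygosity of the amino acids with k synonymous codons. The code below is an illustrative sketch and not the CodonW implementation: the function name `enc`, the decision to skip amino acids seen fewer than two times, the fallback for a missing isoleucine class and the cap at 61 are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Standard genetic code in DNA letters (T in place of U), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def enc(cds):
    """Effective number of codons for one in-frame coding sequence (sketch)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Collect the homozygosity F of each amino acid by degeneracy class (2, 3, 4, 6).
    by_class = defaultdict(list)
    for codons in families.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met, Trp, or too few observations to estimate F
        f = (n * sum((counts[c] / n) ** 2 for c in codons) - 1) / (n - 1)
        if f > 0:
            by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in by_class.items()}
    # Assumed fallback: if isoleucine is absent, average the 2-fold and 4-fold classes.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    missing = [k for k in (2, 3, 4, 6) if k not in mean_f]
    if missing:
        raise ValueError(f"no usable amino acids in degeneracy classes {missing}")

    value = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(value, 61.0)  # the estimate can exceed 61 for short sequences
```

A sequence that uses a single codon for every amino acid gives the lower bound of 20, and a sequence that uses all 61 sense codons equally is capped at 61.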
A comparison of the actual and expected frequencies of the 16 dinucleotides in the coding regions of PCV genomes was also undertaken using SPSS 17.0.
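One common way to compare actual and expected dinucleotide frequencies is the odds ratio ρ(xy) = f(xy) / (f(x) f(y)), where f(xy) is the observed dinucleotide frequency and f(x) f(y) is the frequency expected from the base composition alone. The text does not state the exact formula used, so this definition, the function name `dinucleotide_odds` and the use of overlapping dinucleotides are assumptions made for this sketch.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds(seq):
    """Return {dinucleotide: observed/expected ratio} for one sequence (sketch)."""
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")

    base_freq = {b: seq.count(b) / n for b in "ACGT"}
    # Overlapping dinucleotides: positions (0,1), (1,2), ..., (n-2, n-1).
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))

    odds = {}
    for x, y in product("ACGT", repeat=2):
        expected = base_freq[x] * base_freq[y]
        if expected > 0:  # skip pairs whose bases never occur
            observed = pair_counts[x + y] / (n - 1)
            odds[x + y] = observed / expected
    return odds
```

A ratio well below 1, such as the suppression of CpG discussed in the text, marks an under-represented dinucleotide, and a ratio well above 1 marks an over-represented one.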
Correspondence analysis (CA)
Correspondence analysis is mainly used to detect variation in the RSCU values of codons among genes. It is an effective multivariate statistical method for studying the internal relations between variables and samples, and it has been applied successfully to the study of codon usage. In correspondence analysis, all genes in the sample are placed in a 59-dimensional vector space (the 59 sense codons, excluding the stop codons and the codons for Met and Trp); each gene is described by 59 variables (f'1, f'2, ..., f'59), and the results can be used to identify the major factors affecting codon usage bias in genes [40, 41]. This was done using the CodonW program.
Correlation analysis was used to identify the relationship between nucleotide composition and the synonymous codon usage pattern of PCV. All statistical analyses were carried out with the statistical software SPSS 17.0 for Windows.
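The study ran its correlations in SPSS. As a rough illustration of the Spearman rank correlation reported in the discussion, the coefficient can be computed as the Pearson correlation of the ranks. The helper names `ranks` and `spearman` are hypothetical, and the sketch returns only the coefficient, not a p-value.

```python
def ranks(values):
    """Rank the values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 values each")
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / (sx * sy)
```

For example, `spearman(gene_lengths, enc_values)` with one entry per gene (both lists are placeholders here) would give a coefficient comparable in kind to the r value quoted for gene length and codon usage bias.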
This study was supported in part by grants from the National Pig Industrial System (CARS-36-06B) and from Research and Demonstration on evaluation technologies of clinical immune responses of serious animal diseases vaccine (201203039).
- Dittmar KA, Goodenbour JM, Pan J: Tissue-specific differences in human transfer RNA expression. PLoS Genet 2006, 2:2107–2115.
- Lloyd AT, Sharp PM: Evolution of codon usage patterns: the extent and nature of divergence between Candida albicans and Saccharomyces cerevisiae. Nucleic Acids Res 1992, 20:5289–5295.
- Xie T, Ding D, Tao X, Dafu D: The relationship between synonymous codon usage and protein structure. FEBS Lett 1998, 434:93–96.
- Chiapello H, Lisacek F, Caboche M: Codon usage and gene function are related in sequences of Arabidopsis thaliana. Gene 1998, 209(1–2):GC1-GC38.
- Adams MJ, Antoniw JF: Codon usage bias amongst plant viruses. Arch Virol 2003, 149(1):113–135.
- Zhou H, Wang H, Huang LF: Heterogeneity in codon usages of sobemovirus genes. Arch Virol 2005, 150(8):1591–1605.
- Levin DB, Whittome B: Codon usage in nucleopolyhedroviruses. J Gen Virol 2000, 81:2313–2325.
- Gupta SK, Ghosh TC: Gene expressivity is the main factor in dictating the codon usage variation among the genes in Pseudomonas aeruginosa. Gene 2001, 273(1):63–70.
- D'Onofrio G, Ghosh TC, Bernardi G: The base composition of the genes is correlated with the secondary structures of the encoded proteins. Gene 2002, 300(1):179–187.
- Gu W, Zhou T, Ma J, Sun X, Lu Z: The relationship between synonymous codon usage and protein structure in Escherichia coli and Homo sapiens. Biosystems 2004, 73(2):89–97.
- Wang Meng, Zhang Jie, Zhou Jian-hua, Chen Hao-tai, Ma Li-na, Ding Yao-zhong, Liu Wen-qian, Liu Yong-sheng: Analysis of codon usage in bovine viral diarrhea virus. Arch Virol 2010, 156(1):153–160.
- Romero H, Zavala A, Musto H: Compositional pressure and translational selection determine codon usage in the extremely GC poor unicellular eukaryote Entamoeba histolytica. Gene 2000, 242(1–2):307–311.
- Van der Linden MG, de Farias ST: Correlation between codon usage and thermostability. Extremophiles 2006, 10(5):479–481.
- Mankertz A, Persson F, Mankertz J, Blaess G, Buhk HJ: Mapping and characterization of the origin of DNA replication of porcine circovirus. J Virol 1997, 71(3):2562–2566.
- Karuppannan AK, Kwang J: ORF3 of porcine circovirus 2 enhances the in vitro and in vivo spread of the of the virus. Virology 2011, 410:248–256.
- Liu J, Chen I, Du Q: The ORF3 protein of porcine circovirus type 2 is involved in viral pathogenesis in vivo. Virol 2006, 80(10):5065–5073.
- Hamel AL, Lin LL, Nayar GPS: Nucleotide sequence of porcine circovirus associated with postweaning multisystemic wasting syndrome in pigs. J Virol 1998, 72:5262–5267.
- Meehan BM, McNeilly F, Todd D, Kennedy S, Jewhurst VA, Ellis JA, Hassard LE, Clark EG, Haines DM, Allan GM: Characterization of novel circovirus DNAs associated with wasting syndromes in pigs. J Gen Virol 1998, 79:2171–2179.
- Zhou T, Wanjun Gu, Ma J, Sun X, Zuhong Lu: Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses. Biosystems 2005, 81:77–86.
- Chiusano Maria Luisa, Alvarez-Valin Fernando, Giulio Massimo Di, D'Onofrio Giuseppe, Ammirato Gaetano, Colonna Giovanni, Bernardi Giorgio: Second codon positions of genes and the secondary structures of proteins. Relationships and implications for the origin of the genetic code. Gene 2000, 261:63–69.
- Sharp PM, Tuohy TM, Mosurski KR: Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 1986, 14:5125–5143.
- Gu WJ, Zhou T, Ma JM, Sun X, Lu ZH: Analysis of synonymous codon usage in SARS Coronavirus and other viruses in the Nidovirales. Virus Res 2004, 101(2):155–161.
- Liu Wen-qian, Zhang Jie, Zhang Yi-qiang, Zhou Jian-hua, Chen Hao-tai, Ma Li-na, Ding Yao-zhong, Liu Yongsheng: Compare the differences of synonymous codon usage between the two species within cardiovirus. Virol J 2011, 8:325.
- Jenkins GM, Pagel M, Gould EA, de A Zanotto PM, Holmes EC: Evolution of base composition and codon usage bias in the genus Flavivirus. J Mol Evol 2001, 52:383–390.
- Peixoto L, Zavala A, Romero H, Musto H: The strength of translational selection for codon usage varies in the three replicons of Sinorhizobium meliloti. Gene 2003, 320(27):109–116.
- Romero H, Zavala A, Musto H, Bernardi G: The influence of translational selection on codon usage in fishes from the family cyprinidae. Gene 2003, 317(1–2):141–147.
- Gupta SK, Majumdar S, Bhattacharya TK, Ghosh TC: Studies on the relationships between the synonymous codon usage and protein secondary structural units. Biochem Biophys Res Commun 2000, 269:692–696.
- McInerney JO: Replicational and transcriptional selection on codon usage in Borrelia burgdorferi. PNAS 1998, 95(18):10698–10703.
- Zama M: Codon usage and secondary structure of mRNA. Nucleic Acids Symp Ser 1990, 22:93–94.
- Stoletzki N, Eyre-Walker A: Synonymous codon usage in Escherichia coli: selection for translational accuracy. Mol Biol Evol 2007, 24(2):374–381.
- Rocha EPC: Codon usage bias from tRNA's point of view: redundancy, specialization, and efficient decoding for translation optimization. Genome Res 2004, 14(11):2279–2281.
- Lynn DJ, GAC: Synonymous codon usage is subject to selection in thermophilic bacteria. Nucleic Acids Res 2002, 30(19):4272–4277.
- Minghui Fu: Codon usage bias in herpesvirus. Arch Virol 2010, 155:391–396.
- Tsai C-T, Lin C-H, Chang C-Y: Analysis of codon usage bias and base compositional constraints in iridovirus genomes. Virus Res 2007, 126:196–206.
- Cai M-S, An-Chun Cheng, Wang M-S, Li-Chan Zhao: Characterization of Synonymous Codon Usage Bias in the Duck Plague Virus UL35 Gene. Intervirology 2009, 52:266–278.
- Jia R, Cheng A, Wang M: Analysis of synonymous codon usage in the UL24 gene of duck enteritis virus. Virus Genes 2009, 38:96–103.
- Wright F: The effective number of codons used in a gene. Gene 1990, 87:23–29.
- Tao P, Dai L, Luo M, Tien Fangqiang Tang Po, Pan Z: Analysis of synonymous codon usage in classical swine fever virus. Virus Genes 2009, 38:104–112.
- Hao S, Zhang Q, Liu X, Wang X, Zhang H, Wu Y, Jiang F: Analysis of synonymous codon usage in 11 Human Bocavirus isolates. Biosystems 2008, 92:207–214.
- Sau K, Gupta SK, Sau S, Mandal SC, Ghosh TC: Factors influencing synonymous codon and amino acid usage biases in Mimivirus. Biosystems 2006, 85:107–113.
- Ewens WJ, Grant GR: Statistical Methods in Bioinformatics. New York: Springer; 2001.
- Drake JW, Holland JJ: Mutation rates among RNA viruses. Proc Natl Acad Sci USA 1999, 96:13910–13913.