- Open Access
Revisiting the taxonomical classification of Porcine Circovirus type 2 (PCV2): still a real challenge
Virology Journalvolume 12, Article number: 131 (2015)
PCV2 has emerged as one of the most devastating viral infections of swine farming, causing a relevant economic impact due to direct losses and control strategies expenses. Epidemiological and experimental studies have evidenced that genetic diversity is potentially affecting the virulence of PVC2. The growing number of PCV2 complete genomes and partial sequences available at GenBank questioned the accepted PCV2 classification.
Nine hundred seventy five PCV2 complete genomes and 1,270 ORF2 sequences available from GenBank were subjected to recombination, PASC and phylogenetic analyses and results were used for comparison with previous classification scheme.
The outcome of these analyses favors the recognition of four genotypes on the basis of ORF2 sequences, namely PCV2a, PCV2b, PCV2c and PCV2d-mPCV2b. To deal with the difficulty of founding an unambiguous classification and accounting the impossibility to define a p-distance cut-off, a set of reference sequences that could be used in further phylogenetic studies for PCV2 genotyping was established. Being aware that extensive phylogenetic analyses are time-consuming and often impracticable during routine diagnostic activity, ORF2 nucleotide positions adequately conserved in the reference sequences were identified and reported to allow a quick genotype differentiation.
Globally, the present work provides an updated scenario of PCV2 genotypes distribution and, based on the limits of the previous classification criteria, proposes new rapid and effective schemes for differentiating the four defined PCV2 genotypes.
Members of the family Circoviridae, genus Circovirus, are icosahedral, non-enveloped viruses with single-stranded ambisense circular genomes. Two members of this genus have been reported to infect mammals; Porcine circovirus type 1 (PCV1) and Porcine circovirus type 2 (PCV2) . PCV1, initially designated as porcine circovirus (PCV), was first discovered in 1974 as a permanent contaminant of continuous cell culture PK15 and is considered non-pathogenic . At the beginning of 1990s a new syndrome, named postweaning multisistemic wasting syndrome (PMWS) and nowadays designated as PCV2-systemic disease (PCV2-SD) , was described by Clark and Harding [4, 5] and the etiological agent, recognized as a Circovirus different from PCV1, was first isolated in 1998 . Since the first reports, PCV2 infection was reported all over the world and PMWS showed epidemical proportion in Europe and South East Asia by the late 1990’s and in the Americas by 2004–05 . Progressively, several clinical manifestations, collectively named porcine circovirus diseases (PCVD), have been associated with PCV2 infection and are responsible of a relevant economic impact to the pig industry due to direct losses and control measures’ costs. PCV2 display a simple ambisense genome ranging from 1766 to 1768 nucleotides. Three major Open Reading Frames (ORF1, ORF2 and ORF3) have been recognized , encoding for proteins involved respectively in viral genome replication (Rep and Rep’), capsid protein (Cap) and possibly in pathogenesis (ORF3) [9–11].
At the intraspecific level, two major PCV2 groups were initially defined (i.e. PCV2a and PCV2b) [12, 13]. Applying PASC analyses [14, 15] and linearized phylogenetic trees  under the assumption of strict molecular clock, two nucleotide diversity cut-offs for ORF2 (3.5 %) and the complete genome (2.0 %) were proposed . Accordingly, PCV2 genotypes a and b were designated under the European project no 513928 of the Sixth Framework Programme (www.pcvd.eu) . A third genotype named PCV2c was reported from Denmark in the 1980s  and has recently been identified in the Brazilian Pantanal . Two additional genotypes were proposed after analyzing several sequences from the People’s Republic of China ; although subsequent analyses did not support the genotype status for those strains . Another study, restricted to Chinese sequences, also proposed a fourth group, which fitted with the existing definition of genotype and was named PCV2d . This late clade, apparently more virulent , was also detected in the USA (designated as mPCV2b) and in other countries [24, 25]. The mPCV2b-PCV2d genotype will be referred in this paper as PCV2d. A genotype shift from the older PCV2a to the new variant PCV2b has been reported during a time window that coincided with an increase in severe outbreaks of PCVD worldwide: Canada [26, 27], China , Denmark , Spain , Sweden , Switzerland  and USA . Interestingly, the oldest PCV2b strains were reported at the beginning of 90’s , contemporaneously to the first report of PCV2-SD. All these elements suggest that the viral genotype plays a major role in the appearance of clinical disease .
A growing number of PCV2 complete genomes and partial sequences are available at GenBank (more than 3,300 in July 2014), most of them updated after 2008 when the PCV2 genotype proposal was published. Since then, several new genotypes have been proposed, but also highly divergent sequences have been reported elsewhere (i.e.  .) Therefore, the aim of this paper was to revisit the intraspecific taxonomy of PCV2 and the genotype definition to check its current validity, to unify nomenclature and to avoid further misconceptions.
After removal of poor quality sequences and exclusion of predicted recombinant sequences detected by RDP3, 595 complete genomes and 954 ORF2 were maintained (Additional file 1). Recombination traces were reported in a substantial proportion of the PCV2 whole genome (37.7 %) and ORF2 (24.3 %) sequences downloaded from GenBank. Recombination was pervasive and affected several genome fragments, intra- and inter-genes, involving strains belonging to both closely and distantly related clades (i.e. different genotypes according to previous classification). The majority of the positions in the alignments was variable in the complete PCV2 genomes (52.8 %) and especially in the ORF2 (76.4 % nucleotide, 87.2 % amino acid).
Phylogenetic analysis and genotype definition
Phylogenetic trees reconstructed from PCV2 ORF2 using NJ (Fig. 1a) and ML (Fig. 1b) methods displayed very similar topologies and four main clades were identified. These four clades substantially corresponded to the previously defined PCV2a, PCV2b, PCV2c and PCV2d genotypes. Very few strains (n = 7) showed contradictory clustering between PCV2a and PCV2d clades in the ORF2 trees (Additional file 1). The same clustering in four major groups was obtained rooting the tree using PCV1 sequences as outgroup (data not shown). Similarly, phylogenetic trees reconstructed using complete genome showed a similar topology even if with a closer relationship between remaining PCV2d and PCV2a strains (named according to ORF2 classification).
PASC analyses based on the PCV2 complete genome and the ORF2 (Fig. 2) displayed a multimodal curve. In both cases the definition of a single cut-off value to define PCV2 genotypes appeared complicated. Applying the previously reported cut-off values to classify complete genomes (0.02) and ORF2 (0.035) to the PASC analyses, 27 and 18 genotypes should be defined in PCV2 considering whole genomes and ORF2, respectively. For complete genomes (Fig. 2a), a threshold between 0.034 and 0.042 would separate PCV2a and PCV2c from PCV2b/PCV2d, but a second threshold of 0.038 would be necessary to differentiate PCV2b and PCV2d. The picture is even more complicated for ORF2 (Fig. 2b). The first value pointed by the PASC analysis (around 0.068) would be meaningless for the genotype definition and a second value around 0.090 would only differentiate PCV2c from PCV2a/PCV2b/PCV2d. Alternatively, a theoretical cut-off around 0.078 would separate PCV2a, PCV2c and PCV2b/PCV2d, but this value was not robust according to the pairwise distribution. Consequently, a clear overlap between the p-distances calculated using sequences belonging to the same genotype and the sequences belonging to different genotypes was reported (Additional file 2). Even the sequences corresponding to the highly divergent PCV2c genotype displayed a p-distance with those belonging to the PCV2a and PCV2b comprised between 0.087 and 0.165, within the range of intra-genotype distances of the PCV2a (0–0.103) and PCV2b (0–0.12).
Rate of substitution estimation
The substitution rates and the 95 % CI for the previously defined PCV2a, PCV2b, PCV2c and PCV2d genotypes based on the ORF2 are shown in Fig. 3. The estimated rate for PCV2a (1.41°10 −3 subs · site −1 · year −1 ) is higher compared with PCV2b (7.8°10 −4 ) and PCV2d (7.7°10 −4 ). The same picture is reported when the whole PCV2 genome is used, with PCV2a sowing higher substitution rates than the others. Actually, when all genotypes were analyzed together, the marginal likelihood estimation indicated that a relaxed molecular clock fit much better in the model than a strict one (data not shown), pointing that the variation among genotypes and among lineages is affecting the goodness of the model.
In view of the confused results reported in the PASC analysis, an alternative methodology for genotyping PCV2 based on marker positions is proposed. Table 1 summarizes the ORF2 marker positions that are consistently (>95 %) different among PCV2 genotypes. Taking as a reference genotype PCV2b, 9 unique positions were identified to define genotype PCV2a, thirty-five for PCV2c and five for PCV2d (Table 1). In addition, a more robust ORF2 reference dataset was built selecting only those sequences unequivocally part of one genotype on the basis of the topology of both ML and NJ trees (Additional file 3).
A unified criterion for PCV2 genotyping is paramount to allow the comparison of molecular epidemiology data worldwide . Historically, the intraspecific classification of PCV2 has been controversial [22, 23, 34, 35]. In 2008, the EU consortium on Porcine Circovirus Diseases proposed a standardized nomenclature for PCV2 genotype definition based on pairwise sequence comparisons . The PASC analyses applied to PCV2 complete and capsid (ORF2) nucleotide sequences defined two distance thresholds at 0.020 and 0.035, respectively . Since then, a huge number of PCV2 sequences has been deposited in the GenBank, and several new genotypes were proposed (i.e. 7) though they were not always validated [21, 22, 35]. A significant proportion of those sequences have a recombinant origin and in some cases they have been circulating with increasing prevalence in several Asian countries and USA [36–38]. Bearing this in mind, the intraspecific taxonomy of PCV2 and the genotype definition have been revised to check their current validity in order to unify the nomenclature and avoid further misconceptions.
Based on the obtained results , one of the main PASC and linearized phylogenetic trees assumptions is unequivocally violated. These techniques assume equal rates of evolution among clades and, according to the BEAST estimations for the four main groups of the phylogenetic trees (Fig. 1), they are significantly different (Fig. 3). These evidences, coupled with the enormous amount of new sequences available, have important implications for PCV2 intraspecific classification. Mostly, the thresholds applied since 2008 to define PCV2 genotypes using PASC and linearized phylogenetic trees are currently not applicable to all PCV2 strains and therefore these methods should be changed. Since the accepted scheme is no longer valid, an alternative method to genotype PCV2 strains in an unambiguous way is proposed. The suggested approach tries to account for different theoretical and practical issues. The first challenge when dealing with PCV2 classification is related to the presence of several recombinant strains that, as a matter of fact, belongs to more than one genotype. Considering the high recombination frequency reported in this study and by other authors [38–41] and given the tendency to display a higher frequency of breakpoints between or at the periphery of the genes , complete genome sequences pose indubitably a greater challenge for PCV2 classification. In fact, despite the attempt to exclude all recombinants sequences from the generated databases using a combination of different approaches with RDP software, current methods are still perfectible and their results are somewhat dependent on database features, initial settings and subjective refinement of the results. Additionally, other evolutionary phenomena may lead to the identification of putative recombination, including lineage-specific rate variation, convergent evolution and natural selection. So, it cannot be excluded that some recombinant sequences were not identified or misclassified, affecting the results. In practice it is infeasible to perform an extensive recombination analysis on routinely basis and results would probably be quite different among different analysis and operators. Moreover, the higher percentage of identity of ORF1 gene provides lower phylogenetic signal, limiting its applicability to phylogenetic inference and recombination detection. Finally, complete genome sequencing is laborious and expensive, so many laboratories currently base their analysis on ORF2 sequences. Taking into account all these factors, a classification approach based on the ORF2 gene was preferred. Aiming at offering an unambiguous classification scheme, several reference sequences, whose classification was clear and concordant for the two phylogenetic reconstruction methods, were selected (Additional files 1 and 3). This allows to define four genotypes using both phylogenetic reconstruction and genotype-specific marker positions. The substantial agreement between the more rapid NJ method and the more accurate ML tree reconstruction method represent a remarkable advantage. In addition to substantiate the robustness of this classification, it suggests the possibility to use a quicker approach still obtaining reliable results. Being aware that extensive phylogenetic analysis are time consuming and often impracticable during routine diagnostic activity, conserved nucleotide marker positions in the ORF2 are proposed to perform a quick genotype differentiation. These markers, consistently (>95 %) present in each of the four accepted genotypes ORF2, are depicted in Table 1 and can be used as a reference to assign a certain sequence to one of the newly proposed genotypes. Noteworthy, classification offered in the present study substantially agrees with that proposed by Olvera et al. , with the remarkable difference that strains previously classified into PCV2b Clade 1C are now part of the PCV2d genotype.
The present study confirms and validates the variability of viral sequences and the high intra- and inter-genotype recombination frequency between PCV2 strains, and highlights the difficulty to pinpoint an unequivocal genotype definition of this virus. Considering that the method based on genetic distance seems to be no longer valid and has generated some misclassification through time, it is suggested to use an approach based on the reference sequences and/or identification of marker positions.
Materials and methods
Nine hundred seventy five PCV2 complete genomes and 1,270 ORF2 sequences available from GenBank (www.ncbi.nlm.nih.gov) were downloaded in January 2014. Complete genome sequences were aligned at nucleotide level. Considering the coding nature of ORF2 sequences, multiple sequence alignment was carried out at amino acid level and was then used to generate the corresponding nucleotide sequences. The MUSCLE algorithm  implemented in MEGA6  was employed in both cases. Alignments were visually inspected and edited to remove poorly aligned sequences or those with modification in the reading frame or premature stop codon, highly suggestive of sequencing errors. The alignments robustness was evaluated using Guidance  assuming MAFFT  as multiple alignment algorithms.
Recombination analysis was performed on both complete genome and ORF2 alignments using RDP3 . The RDP, GENECONV, MaxChi and 3Seq methods were selected as primary scan, while all the methods implemented in RDP3 were used for recombination detection refinement. Settings for each method were adjusted considering the database features following the recommendations of the RDP3 manual. A recombination event was accepted if detected by more than two methods with a significance p-value of p < 0.01 with Bonferroni’s correction. All sequences identified as recombinant where excluded from further analysis.
Phylogenetic trees for both complete genome and ORF2 were reconstructed using the maximum likelihood (ML) method implemented in PhyML . Substitution model was selected according to Bayesian Information Criterion (BIC), calculated using Jmodeltest 2.1.2 . A combination of Nearest Neighbor Interchange (NNI) and Sub-tree Pruning and Regrafting (SPR) was used as tree rearrangement strategy. The phylogenetic tree reliability was evaluated using the Shimodaira–Hasegawa [SH]-aLRT  likelihood-based measures of branch supports implemented in PhyML. Phylogenetic trees were also reconstructed with the Neighbor Joining (NJ) method and MEGA6 using the substitution model with the better BIC score given by the same MEGA software. The confidence of the internal branches was evaluated performing 1000 bootstrap pseudo-replicates of the aligned dataset.
Pairwise p-distances among sequences for every dataset were calculated with MEGA6. Distances were ordered and a histogram of pairwised differences was constructed to perform a PASC analysis using Microsoft Excel 2010.
Rates of substitution
Estimations for the rate of substitution were calculated for the nucleotide sequences of the PCV2 genomes and the ORF2 using a Bayesian Markov chain Monte Carlo (MCMC) approach implemented in BEAST v.1.8.0 package . Three independent runs of MCMC per dataset were performed under a strict and a relaxed molecular clock model, using the General Time Reversible model of sequence evolution and the remaining default parameters in the prior’s panel. To account for different population dynamics through time a Bayesian Skygrid  was chosen as tree prior. The MCMC run was 5 × 10 7 steps long and the posterior probability distribution of the chains was sampled every 1000 steps. Convergence was assessed by visually inspecting the runs’ trace plot and on the basis of an effective sampling size greater than 200 after a 10 % burn-in using Tracer software, version 1.6 . The estimations are the mean values obtained for the three runs, combined using LogCombiner v1.8.0 (part of the BEAST 1.8. package). The mean substitution rate and the 95 % CI were calculated, and the best-fitting models were selected by a Bayes factor using marginal likelihoods estimator tools implemented in Tracer and the Stepping Stone and Path Sampling approaches  .
Delwart E, Li L. Rapidly expanding genetic diversity and host range of the Circoviridae viral family and other Rep encoding small circular ssDNA genomes. Virus Res. 2012;164(1–2):114–21.
Tischer I, Gelderblom H, Vettermann W, Koch MA. A very small porcine virus with circular single-stranded DNA. Nature. 1982;295(5844):64–6.
Segales J. Porcine circovirus type 2 (PCV2) infections: clinical signs, pathology and laboratory diagnosis. Virus Res. 2012;164(1–2):10–9.
Clark EG. Post-weaning multisystemic wasting syndrome. Proc West Can Assoc Swine Pract. 1996;21:19–20.
Harding JC. Postweaning multisystemic wasting syndrome: preliminary epidemiology and clinical findings. Proc West Can Assoc Swine Pract. 1996;21:2.
Ellis J, Hassard L, Clark E, Harding J, Allan G, Willson P, et al. Isolation of circovirus from lesions of pigs with postweaning multisystemic wasting syndrome. Can Vet J. 1998;39(1):44–51.
Segales J, Kekarainen T, Cortey M. The natural history of porcine circovirus type 2: from an inoffensive virus to a devastating swine disease? Vet Microbiol. 2013;165(1–2):13–20.
Guo LJ, Lu YH, Wei YW, Huang LP, Liu CM. Porcine circovirus type 2 (PCV2): genetic variation and newly emerging genotypes in China. Virol J. 2010;7:273.
Liu J, Zhu Y, Chen I, Lau J, He F, Lau A, et al. The ORF3 protein of porcine circovirus type 2 interacts with porcine ubiquitin E3 ligase Pirh2 and facilitates p53 expression in viral infection. J Virol. 2007;81(17):9560–7.
Trible BR, Rowland RR. Genetic variation of porcine circovirus type 2 (PCV2) and its relevance to vaccination, pathogenesis and diagnosis. Virus Res. 2012;164(1–2):68–77.
Karuppannan AK, Kwang J. ORF3 of porcine circovirus 2 enhances the in vitro and in vivo spread of the of the virus. Virology. 2011;410(1):248–56.
Gagnon CA. Emergence of a new type of porcine circovirus in swine (PCV): A type 1 and type 2 PCV recombinant. Vet Microbiol. 2010;144(1–2):18–23.
Olvera A, Cortey M, Segales J. Molecular evolution of porcine circovirus type 2 genomes: phylogeny and clonality. Virology. 2007;357(2):175–85.
Van Regenmortel M, Bishop D, Fauquet C, Mayo M, Maniloff J. Virus Classification by Pairwise Sequence Comparison (PASC). In: van Regenmortel MHV, Brian WJ M, editors. Desk Encyclopedia of General Virology. San Diego: Academic Press; 2010. p. 95–100.
Bao Y, Kapustin Y, Tatusova T. Virus classification by pairwise sequence comparison (PASC). Encyclopedia of virology. 2008;5:342–8.
Takezaki N, Rzhetsky A, Nei M. Phylogenetic test of the molecular clock and linearized trees. Mol Biol Evol. 1995;12(5):823–33.
Grau-Roma L, Crisci E, Sibila M, López-Soria S, Nofrarias M, Cortey M, et al. A proposal on porcine circovirus type 2 (PCV2) genotype definition and their relation with postweaning multisystemic wasting syndrome (PMWS) occurrence. Vet Microbiol. 2008;128(1–2):23–35.
Segales J, Olvera A, Grau-Roma L, Charreyre C, Nauwynck H, Larsen L, et al. PCV-2 genotype definition and nomenclature. Vet Rec. 2008;162(26):867–8.
Dupont K, Nielsen EO, Baekbo P, Larsen LE. Genomic analysis of PCV2 isolates from Danish archives and a current PMWS case–control study supports a shift in genotypes with time. Vet Microbiol. 2008;128(1–2):56–64.
Franzo G, Cortey M, De Castro AM, Alessandra Marnie Martins G, Piovezan U, Szabo MPJ, et al. Genetic characterisation of porcine circovirus type 2 (PCV2) strains from feral pigs in the Brazilian Pantanal: an opportunity to reconstruct the history of PCV2 evolution. Vet Microbiol. 2015;178(1–2):158–62.
Wang F, Guo X, Ge X, Wang Z, Chen Y, Cha Z, et al. Genetic variation analysis of Chinese strains of porcine circovirus type 2. Virus Res. 2009;145(1):151–6.
Cortey M, Olvera A, Grau-Roma L, Segalés J. Further comments on porcine circovirus type 2 (PCV2) genotype definition and nomenclature. Vet Microbiol. 2011;149(3–4):522–3.
Guo L, Fu Y, Wang Y, Lu Y, Wei Y, Tang Q, et al. A porcine circovirus type 2 (PCV2) mutant with 234 amino acids in capsid protein showed more virulence in vivo, compared with classical PCV2a/b strain. PLoS One. 2012;7(7):e41463.
Xiao CT, Halbur PG, Opriessnig T. Complete genome sequence of a novel porcine circovirus type 2b variant present in cases of vaccine failures in the United States. J Virol. 2012;86(22):12469.
Franzo G, Tucciarone CM, Dotto G, Gigli A, Ceglie L, Drigo M. International trades, local spread and viral evolution: The case of porcine circovirus type 2 (PCV2) strains heterogeneity in Italy. Infect Genet Evol. 2015;32:409–15.
Carman S, Cai HY, DeLay J, Youssef SA, McEwen BJ, Gagnon CA, et al. The emergence of a new strain of porcine circovirus-2 in Ontario and Quebec swine and its association with severe porcine circovirus associated disease--2004-2006. Can J Vet Res. 2008;72(3):259–68.
Gagnon CA, Tremblay D, Tijssen P, Venne MH, Houde A, Elahi SM. The emergence of porcine circovirus 2b genotype (PCV-2b) in swine in Canada. Can Vet J. 2007;48(8):811–9.
Cortey M, Pileri E, Sibila M, Pujols J, Balasch M, Plana J, et al. Genotypic shift of porcine circovirus type 2 from PCV-2a to PCV-2b in Spain from 1985 to 2008. Vet J. 2011;187(3):363–8.
Timmusk S, Wallgren P, Brunborg IM, Wikström FH, Allan G, Meehan B, et al. Phylogenetic analysis of porcine circovirus type 2 (PCV2) pre-and post-epizootic postweaning multisystemic wasting syndrome (PMWS). Virus Genes. 2008;36(3):509–20.
Wiederkehr DD, Sydler T, Buergi E, Haessig M, Zimmermann D, Pospischil A, et al. A new emerging genotype subgroup within PCV-2b dominates the PMWS epizooty in Switzerland. Vet Microbiol. 2009;136(1):27–35.
Cheung AK, Lager KM, Kohutyuk OI, Vincent AL, Henry SC, Baker RB, et al. Detection of two porcine circovirus type 2 genotypic groups in United States swine herds. Arch Virol. 2007;152(5):1035–44.
Firth C, Charleston MA, Duffy S, Shapiro B, Holmes EC. Insights into the evolutionary history of an emerging livestock pathogen: porcine circovirus 2. J Virol. 2009;83(24):12813–21.
Novosel D, Tuboly T, Csagola A, Lorincz M, Cubric-Curik V, Jungic A, et al. Origin of porcine circovirus type 2 (PCV2) from swine affected by PCV2-associated diseases in Croatia. Vet Rec. 2014;174(17):431.
Grau-Roma L, Fraile L, Segales J. Recent advances in the epidemiology, diagnosis and control of diseases caused by porcine circovirus type 2. Vet J. 2011;187(1):23–32.
Zhao G, Cheng W, Zhang P, Han Y, Chen D. Novel genotypes of type 2 porcine circovirus (PCV2) in PMWS pigs in China between 2008 and 2009. J Anim Vet Adv. 2010;9:3083–91.
Opriessnig T, Gerber PF, Xiao C, Mogler M, Halbur PG. A commercial vaccine based on PCV2a and an experimental vaccine based on a variant mPCV2b are both effective in protecting pigs against challenge with a 2013 US variant mPCV2b strain. Vaccine. 2014;32(2):230–7.
Cai L, Ni J, Xia Y, Zi Z, Ning K, Qiu P, et al. Identification of an emerging recombinant cluster in porcine circovirus type 2. Virus Res. 2012;165(1):95–102.
Wei C, Zhang M, Chen Y, Xie J, Huang Z, Zhu W, et al. Genetic evolution and phylogenetic analysis of porcine circovirus type 2 infections in southern China from 2011 to 2012. Infect Genet Evol. 2013;17:87–92.
Huang Y, Shao M, Xu X, Zhang X, Du Q, Zhao X, et al. Evidence for different patterns of natural inter-genotype recombination between two PCV2 parental strains in the field. Virus Res. 2013;175(1):78–86.
Mu C, Yang Q, Zhang Y, Zhou Y, Zhang J, Martin DP, et al. Genetic variation and phylogenetic analysis of porcine circovirus type 2 infections in central China. Virus Genes. 2012;45(3):463–73.
Ma CM, Hon CC, Lam TY, Li VY, Wong CK, de Oliveira T, et al. Evidence for recombination in natural populations of porcine circovirus type 2 in Hong Kong and mainland China. J Gen Virol. 2007;88(Pt 6):1733–7.
Lefeuvre P, Lett JM, Varsani A, Martin DP. Widely conserved recombination patterns among single-stranded DNA viruses. J Virol. 2009;83(6):2697–707.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol. 2013;30(12):2725–9.
Penn O, Privman E, Ashkenazy H, Landan G, Graur D, Pupko T. GUIDANCE: A web server for assessing alignment confidence scores. Nucleic Acids Res. 2010;38:W23–8.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
Martin DP, Lemey P, Lott M, Moulton V, Posada D, Lefeuvre P. RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics. 2010;26(19):2462–3.
Guindon S, Dufayard J, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst Biol. 2010;59(3):307–21.
Darriba D, Taboada GL, Doallo R, Posada D. JModelTest 2: More models, new heuristics and parallel computing. Nat Methods. 2012;9(8):772.
Anisimova M, Gil M, Dufayard J, Dessimoz C, Gascuel O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst Biol. 2011;60(5):685–99.
Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012;29(8):1969–73.
Gill MS, Lemey P, Faria NR, Rambaut A, Shapiro B, Suchard MA. Improving Bayesian population dynamics inference: a coalescent-based model for multiple loci. Mol Biol Evol. 2013;30(3):713–24.
Rambaut A, Suchard MA, Xie D & Drummond AJ (2014) Tracer v1.6, Available from http://beast.bio.ed.ac.uk/Tracer
Baele GF, Lemey PF, Bedford TF, Rambaut A, Suchard MA FAU, FAU SM, et al. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 2012;29(9):2157–67.
Alessandra M. M. G. de Castro received scholarship from FAPESP (grant 2007/57115-3).
None of the authors of this paper has a financial or personal relationship with other people or organizations that could inappropriately influence or bias the content of the paper. The authors declare that they have no competing interests.
GF, MC, AO and PB performed the analysis; AC and DN provided part of the sequences used in the present work; GF, MC, JS and MD wrote the manuscript; All authors have revised the manuscript and cooperate for its improvement. All authors read and approved the final manuscript.
Giovanni Franzo Martí Cortey Joaquim Segalés and Michele Drigo contributed equally to this work.
Spreadsheet containing the datasets used in the present study. For ORF2, the clustering according to NJ and ML phylogenetic trees is included. The list of Acc.Num of reference strains is also reported. (XLS 313 kb)
Intra and inter genotype pairwise p-distances obtained from complete ORF2 Database (a) and on reference sequences (b). (PDF 2920 kb)
ORF2 reference dataset built selecting only those sequences unequivocally part of one genotype on the basis of the topology of both ML and NJ trees. (PDF 921 kb)