The genomic DNA sequence of a novel enteric uncultured microphage, ΦCA82, from a turkey gastrointestinal system was determined using metagenomics techniques. The entire circular, single-stranded genome was 5,514 nucleotides long. The ΦCA82 genome is quite different from other microviruses as indicated by comparisons of nucleotide similarity, predicted protein similarity, and functional classifications. Only three genes showed significant similarity to microviral proteins as determined by local alignments using BLAST analysis. ORF1 encoded a predicted phage F capsid protein that was phylogenetically most similar to the major coat protein of the Microviridae member ΦMH2K. The ΦCA82 genome also encoded a predicted minor capsid protein (ORF2) and a putative replication initiation protein (ORF3) most similar to those of the microviral bacteriophage SpV4. The distant evolutionary relationship of ΦCA82 suggests that the divergence of this novel turkey microvirus from other microviruses may reflect unique evolutionary pressures encountered within the turkey gastrointestinal system.
Metagenomics analyses have led to the discovery of a variety of microbial nucleotide sequences from environmental samples. These techniques have also allowed the discovery of uncultured viral nucleotide sequences, commonly from bacteriophages [2–4], and have resulted in the discovery of useful enzymes for molecular biology. There has been a resurgence of interest in bacteriophage biology and in the use of phages or phage gene products as antibacterial agents [6–8]. Bacteriophages are thought to be the most abundant life form as a group, and the importance of phage to bacterial evolution [10, 11], the role of phage- or prophage-encoded virulence factors that contribute to bacterial infectious diseases [12–14], and their contribution to horizontal gene transfer cannot be overstated. Additionally, their contributions to microbial ecology and to agricultural production [17, 18] are also extremely important.
Enteric diseases are an important economic production problem for the poultry industry worldwide. Among the major economically important enteric diseases for the poultry industry are the poult enteritis complex (PEC) and poult enteritis mortality syndrome (PEMS) in turkeys and runting-stunting syndrome (RSS) in broiler chickens. Consequently, studies have been ongoing at our laboratory to identify novel enteric viruses among poultry species. In a recent study, we utilized the Roche/454 Life Sciences GS-FLX platform to compile an RNA virus metagenome from turkey flocks experiencing enteric disease. This approach yielded numerous sequences homologous to viruses in the BLAST nr protein database, many of which have not been described in turkeys.
Additionally, we have successfully applied a random PCR-based method for detection of unknown microorganisms in enteric samples from turkeys, which resulted in the identification of genomic sequences and subsequent determination of the full-length genome of a previously uncultured parvovirus. During these ongoing investigations to further characterize the turkey gut microbiome and identify novel viral pathogens of poultry, bacteriophage genomic sequences have also been identified. Herein we report the complete genomic sequence of a putative novel member of the Microviridae obtained from turkey gastrointestinal DNA samples utilizing metagenomics approaches. The protein sequences of ΦCA82 were most similar to those of Chlamydia phages.
Materials and Methods
Assembly of ΦCA82, a novel member of the Microviridae family
Forty-two complete intestinal tracts (from duodenum/pancreas to cloaca, including cecal tonsils) from a turkey farm in California, U.S.A., with a history of enteric disease problems were received at the Southeast Poultry Research Laboratory (SEPRL). The intestines were processed and pooled into a single sample, as previously described. A sequence-independent polymerase chain reaction (PCR) protocol, described elsewhere in detail, was employed to amplify particle-associated nucleic acid (PAN) present in turkey intestinal homogenates. Using this approach, a total of 576 clones were identified and sequenced with the M13 forward and reverse primers on an AB-3730 automated DNA sequencer. The sequenced clones were used as query sequences to search the GenBank non-redundant nucleotide and protein databases using the blastn and blastx algorithms. In total, the majority of clones with inserts had no hit in the databases using tblastx. However, 46% of the cloned DNA had homology to cellular DNA, bacterial DNA, bacteriophage DNA, and several eukaryotic viral DNA genomes. Twelve DNA clones had sequence similarity to single-stranded DNA microphages, which have also been identified predominantly in microbialites. A contig, CA82, with an average of eightfold coverage and a length of 1962 nt, was assembled from eight of those clones. This contig had no significant nucleotide similarity to database sequences, but the deduced amino acid sequence revealed significant similarity to members of the family Microviridae. This initial contig was used to design PCR primers in the opposite orientation on the circular ssDNA so that the remainder could be amplified and assembled into a contiguous ΦCA82 genome. The PCR amplification resulted in a 3.4 kb product that closed the gap between the CA82 contig and the rest of the circular genome. The final sequence was confirmed by sub-cloning and primer walking with primers yielding ~1 kb fragments with 250 bp overlaps across the genome.
The circular DNA genome was assembled from contigs exhibiting 100% nucleotide identity within the overlapping regions.
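The primer-walking layout described above can be illustrated with a small tiling calculation. This is only a sketch under stated assumptions: the fragment length (1,000 bp) and overlap (250 bp) are the approximate values quoted in the text, the starting coordinate is arbitrary, and the function name `tile_circular_genome` is hypothetical; the actual primer positions are not given in the source.

```python
def tile_circular_genome(genome_len, fragment_len=1000, overlap=250):
    """Return (start, end) tiles covering a circular genome.

    Coordinates are 0-based; `end` is exclusive and is reported modulo
    the genome length, so a tile whose end is smaller than its start
    wraps around the origin of the circle.
    """
    step = fragment_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than fragment_len")
    tiles = []
    start = 0
    while start < genome_len:
        end = (start + fragment_len) % genome_len
        tiles.append((start, end))
        start += step
    return tiles


# Assumed values: 5,514 nt genome, ~1 kb fragments, 250 bp overlaps.
tiles = tile_circular_genome(5514, fragment_len=1000, overlap=250)
for number, (start, end) in enumerate(tiles, 1):
    print(number, start, end)
```

With these assumed values, eight fragments are needed, and the last one wraps past the arbitrary origin to overlap the first, which is how a circular assembly is closed.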
The ΦCA82 genome and ORFs were aligned with selected microvirus sequences using ClustalW. Putative ORFs within the ΦCA82 genome were predicted using the FGENESV Trained Pattern/Markov chain-based viral gene prediction method from the Softberry website. Searches for conserved domains within the ΦCA82 genome were performed with the Conserved Domain Database (CDD) Search Service v2.17 at the National Center for Biotechnology Information (NCBI) website.
Comparative genomics of the Microviridae
The sequence of phage ΦCA82 was compared to 14 other members of the Microviridae (Table 1) obtained from the Integrated Microbial Genomes (IMG) system. To first determine nucleotide-level similarities, tetra-nucleotide comparisons between genomes were performed with JSpecies. Pairwise genome comparisons were based on regressions of normalized tetra-nucleotide frequency counts, and the distributions of the R² values from these comparisons were visualized in R. To compare genomes based on similarity of predicted gene sequences, the program CD-HIT was used.
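The idea behind the pairwise tetra-nucleotide comparison can be sketched in a few lines of Python. This is a simplified illustration, not the JSpecies implementation: it uses plain relative frequencies on a single strand as the normalization (an assumption), and the function names `tetra_freqs` and `r_squared` are hypothetical. For a simple linear regression, R² equals the squared Pearson correlation, which is what the second function computes.

```python
from itertools import product

# All 256 possible tetra-nucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each tetra-nucleotide in one sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skips windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid tetra-nucleotides")
    return [counts[k] / total for k in KMERS]


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant frequency vector")
    return (sxy * sxy) / (sxx * syy)
```

Calling `r_squared(tetra_freqs(genome_a), tetra_freqs(genome_b))` for every pair of genomes would give a matrix of values comparable in spirit to those summarized in the text, although the absolute numbers will differ from a tool that uses a different normalization.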
Table 1. Microviridae sequences used for comparative genomic analyses
NCBI Taxon ID
Genome Size (bp)
Chlamydia phage 1
Chlamydia phage 2
Chlamydia phage 3
Chlamydia phage 4
Chlamydia pneumoniae phage CPAR39
Enterobacteria phage G4
Enterobacteria phage St-1
Enterobacteria phage alpha3
Enterobacteria phage phiX174
Guinea pig Chlamydia phage
Spiroplasma phage 4
Abbreviations are as shown in Figure 4.
Genomic functional comparisons were based on Pfam categories for each predicted gene as classified by the IMG annotation pipeline. A data table of Pfam categories and gene counts for each genome was used to construct a similarity matrix and dendrogram in R. To determine which predicted genes were unique to ΦCA82 and which were shared with other Microviridae members, the Microviridae pangenome was constructed as the union of all predicted genes from the 14 Microviridae genomes and compared to the predicted genes of ΦCA82. The comparison used CD-HIT and our data analysis pipeline, as described above, as well as blastp run with default parameters except for an e-value cutoff of 0.01.
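The pangenome comparison can be sketched as two small steps: taking the union of predicted genes across genomes, and keeping only blastp hits at or below the e-value cutoff of 0.01. The sketch below assumes blastp results in the standard 12-column tabular format (query ID in column 1, e-value in column 11); the function names and all identifiers in the usage example are hypothetical, and this is not the authors' pipeline.

```python
def build_pangenome(genes_by_genome):
    """Union of predicted gene identifiers across all genomes."""
    pangenome = set()
    for genes in genes_by_genome.values():
        pangenome |= set(genes)
    return pangenome


def queries_with_hits(blast_tab_lines, max_evalue=0.01):
    """Query IDs with at least one hit at or below the e-value cutoff.

    Expects tab-separated lines in the 12-column BLAST tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart
    send evalue bitscore.
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


# Invented toy data for illustration only.
pangenome = build_pangenome({
    "genome_A": ["geneA1", "geneA2"],
    "genome_B": ["geneB1", "geneA2"],
})
example_lines = [
    "orf_x\tgeneA1\t30.0\t200\t140\t3\t1\t200\t5\t204\t2e-30\t120",
    "orf_y\tgeneB1\t25.0\t50\t37\t1\t1\t50\t10\t59\t0.5\t28",
]
shared = queries_with_hits(example_lines)
print(sorted(pangenome), sorted(shared))
```

Query ORFs that never appear in the returned set would be the candidates for genes with no detectable counterpart in the pangenome at this cutoff.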
Nucleotide accession number
The nucleotide sequence of ΦCA82 genome was deposited in GenBank under accession number HQ264138.
Results and Discussion
The ΦCA82 genome
The entire circular, single-stranded nucleotide sequence for the uncultured microvirus ΦCA82 genome was determined to be 5,514 nucleotides. The complete genome sequence had a nucleotide composition of A (38.6%), C (19.6%), G (20.1%), and T (21.6%) with an overall G + C content of 39.7%, which is similar to that of the chlamydial phages (37–40%). The ΦCA82 genome was organized in a modular arrangement similar to microviruses [33, 34] and encoded predicted proteins homologous to those of the chlamydial bacteriophage types and of the Bdellovibrio bacteriovorus phage ΦMH2K. The coding capacity of the genome is 91%, as it encodes ten ORFs greater than 99 nucleotides, similar to other chlamydial microvirus genomes. The genome size, number of ORFs, and total coding percentage, as depicted in Figure 1, are larger than those of most of the chlamydial phages, and the genome is closer in size to the ΦX174 genome [33, 34].
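The composition figures above are internally consistent: G (20.1%) plus C (19.6%) gives the reported G + C content of 39.7%. As a minimal sketch of how such values are obtained from a sequence, the following Python functions compute per-base percentages and G + C content; the function names are hypothetical and the input in the usage example is a toy string, not the ΦCA82 sequence.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases."""
    seq = seq.upper()
    counts = Counter(base for base in seq if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq):
    """G + C content as a percentage of the unambiguous bases."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]


# Toy input for illustration only.
print(base_composition("AACGT"))
print(gc_content("AACGT"))
```

Applied to the deposited genome sequence, the same functions should reproduce the percentages given in the text, up to rounding.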
Capsid proteins of ΦCA82
The amino acid identities and homologies between ΦCA82 ORF gene products and predicted proteins from selected phages are presented in Table 2. Ten genes were identified, of which only three gene products could be assigned a known function based upon BLAST analysis. The predicted major capsid protein VP1 encoded by ORF1 belongs to the family of single-stranded bacteriophages and is the major structural component of the virion, which may contain as many as 60 copies of the protein [37, 38]. The closest sequence similarity of the 565 amino acid ΦCA82 VP1 protein was with the Spiroplasma phage 4 (SpV4) capsid protein and the chlamydial phage VP1 proteins [35, 38, 40–42], as well as the major capsid proteins of the Chlamydia prophage CPAR39 and the Bdellovibrio phage ΦMH2K. ORF2 encoded a putative minor capsid protein of 234 amino acids, originally postulated to be an attachment protein, that had similarity to proteins of the chlamydial bacteriophages [35, 40–43] and the Bdellovibrio phage ΦMH2K.
Table 2. Putative ΦCA82 ORFs and amino acid (aa) homologies with members of Microviridae
No. of aa
Homologous protein (GenBank accession #)
% amino acid identity (homology)
major capsid protein
minor capsid protein
a Amino acid identity and homology were determined using ClustalW. Homology was calculated as the sum of identical, strongly similar, and weakly similar residues, expressed as a percentage of the total.
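The identity and homology percentages defined in the table footnote can be derived from the consensus line that ClustalW prints under each alignment block, where `*` marks identical residues, `:` strongly similar residues, and `.` weakly similar residues. The sketch below assumes that "the total" means the number of alignment columns, which the source does not state explicitly; the function name is hypothetical.

```python
def identity_and_homology(consensus):
    """Percent identity and percent homology from a ClustalW consensus line.

    `consensus` must span every alignment column, so unmarked columns
    have to be present as spaces (do not strip trailing whitespace).
    The denominator is the number of alignment columns, which is an
    assumption about what "the total" means in the table footnote.
    """
    total = len(consensus)
    if total == 0:
        raise ValueError("empty consensus line")
    identical = consensus.count("*")
    strong = consensus.count(":")
    weak = consensus.count(".")
    identity = 100.0 * identical / total
    homology = 100.0 * (identical + strong + weak) / total
    return identity, homology


# Toy consensus line covering 10 alignment columns.
print(identity_and_homology("**:. *  .:"))
```

If the percentages were instead taken relative to the length of one of the two proteins, only the denominator would change.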
Recent studies using a comparative metagenomic analysis of viral communities associated with marine and freshwater microbialites indicated that identifiable sequences in these communities were dominated by single-stranded DNA microphages. Partial sequence analysis of the VP1 gene from these microphages showed that the similarity between metagenomic clones and cultured microphage capsid sequences ranged from 47.5 to 61.2% at the nucleic acid level and from 37.2 to 69.3% at the protein level. Interestingly, the VP1 gene of ΦCA82 has a similarly high level of sequence similarity (69.1% at the amino acid level) with the seawater metagenomic phages within the same VP1 region (data not shown). This observation is consistent with an environmental origin of modern poultry phages that have since undergone significant host-specific evolutionary divergence in agricultural settings.
A multiple alignment of major capsid proteins among diverse members of the Microviridae shows similarities within the entire predicted coding region with the exception of the predicted surface-exposed IN5 loop and Ins (Figure 2). Amino acids located within these regions are involved in forming large protrusions at the threefold icosahedral axes of symmetry in the intracellular microvirus phages [36, 41, 43]. The IN5 loop, which forms a globular protrusion on the virus coat and is the most variable region in the VP1 proteins from Chlamydia and Spiroplasma phages, is potentially located at residues 198 through 295 of ΦCA82 VP1, the most highly variable portion of the protein by BLAST. The hydrophobic nature of the cavity at the distal surface of the SpV4 protrusions suggests that this region may function as the receptor-recognition site during host infection. The short variable Ins sequences of ΦCA82 are putatively located at residues 459 through 464 of the VP1 protein.
Chipman et al. predicted that the IN5 trimer structure in VP1 may function as a substitute for the spike proteins of the ΦX174-like viruses, which are not found in SpV4 or the Chlamydia phages, and as such may be responsible for receptor recognition. It has also been suggested that the diverse sequence in this region is associated with the host range of phages [36, 41, 43, 44]. The presence of a large insertion in ΦCA82 further supports its placement closer to the intracellular phage subfamily, and the sequence dissimilarity within this region between ΦCA82 and various other phages suggests that this domain may indeed function as a host range determinant.
Rep protein of ΦCA82
ORF3 encoded a putative replication initiation protein that was most similar to the SpV4-rep and the Bdellovibrio phage ΦMH2K-rep proteins (Table 2). Pairwise alignment of the ΦCA82 VP3 (rep) protein and the SpV4 p1 (rep) protein revealed the presence of two conserved domains (Figure 3), from residues 73 through 176 and 195 through 320 of the ΦCA82 protein. Overall, the two rep proteins had only 22.6% identity, but they shared many of the same sequences throughout the conserved regions that were recognized by BLAST as putative replication initiation protein regions. The rep protein plays an essential role in viral DNA replication: it binds the origin of replication, where it cleaves the dsDNA replicative form I (RFI) and becomes covalently bound to it via a phosphotyrosine bond (see active sites in Figure 3). The conservation of the functional domains between the ΦCA82 phage rep protein and other microviral replication initiation proteins suggests a similar pathway/mechanism for DNA replication and virion packaging.
Full genome comparisons of ΦCA82 with other members of the Microviridae
The ΦCA82 genome is quite different from other members of Microviridae as indicated by comparisons of nucleotide similarity, predicted protein similarity, and functional classifications. Comparisons of ΦCA82 to 14 other Microviridae genomes showed very low correlations of tetra-nucleotide frequencies as a measure of genome similarity. ΦCA82 was most similar to SpV4, but the correlation of tetra-nucleotide frequencies was poor (R² = 0.33; Figure 4A). Only ΦMH2K had lower similarities to other Microviridae (Figure 4A). Clustering of predicted proteins showed that ΦCA82 was most closely related to a clade composed of the chlamydial phages, but as in the nucleotide comparisons, the predicted proteins of ΦCA82 are quite distinct from those of the other microviruses (Figure 4B). Function-based clustering of genomes using Pfam categories showed that ΦCA82 was most similar to SpV4 (Figure 4C), based on shared membership of the ΦCA82 ORF1 in pfam02305, an F superfamily capsid protein. These results were confirmed by comparisons of predicted proteins from the ΦCA82 genome to a Microviridae pangenome. This analysis showed only three genes with significant similarity as determined by local alignments using blastp, with no overlap between ΦCA82 and the Microviridae pangenome based on global alignments at a 40% similarity cutoff (Figure 4D). ΦCA82 is only distantly related to other Microviridae, but is most similar to SpV4 and the chlamydial phages. In summary, the whole genome comparisons of ΦCA82 to other Microviridae members indicate a distant evolutionary relationship, perhaps suggesting that the divergence of ΦCA82 from other microviruses reflects unique evolutionary pressures encountered within the turkey gastrointestinal system.
These investigations were supported by ARS-USDA CRIS Project No. 6612-32000-054-00 "Epidemiology, Pathogenesis and Countermeasures to Prevent and Control Enteric Viruses of Poultry" at SEPRL and Project No. 6612-32000-055-00 "Molecular Characterization and Gastrointestinal Tract Ecology of Commensal Human Food-Borne Bacterial Pathogens in the Chicken" at PMSRU. The authors thank Fenglan Li for excellent technical assistance and the SEPRL sequencing facility for outstanding support.
Southeast Poultry Research Laboratory, Agricultural Research Service, United States Department of Agriculture
Poultry Microbiological Safety Research Unit, Agricultural Research Service, United States Department of Agriculture
Handelsman J: Metagenomics: application of genomics to uncultured microorganisms. Microbiol Mol Biol Rev 2004, 68:669–685.
Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, Rohwer F: Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci USA 2002, 99:14250–14255.
Brüssow H, Desiere F: Comparative phage genomics and the evolution of Siphoviridae: insights from dairy phages. Mol Microbiol 2001, 39:213–222.
Sturino JM, Klaenhammer TR: Bacteriophage defense systems and strategies for lactic acid bacteria. Adv Appl Microbiol 2004, 56:331–378.
Barnes HJ, Guy JS, Barnes HJ, Glisson JR, Fadly AM, McDougald LR, Swayne DE: Poult enteritis mortality syndrome. In Diseases of Poultry. 11th edition. Edited by: Saif Y. Iowa State Press; 2003:1171–1180.
Day JM, Ballard LL, Duke MV, Scheffler BE, Zsak L: Metagenomic analysis of the turkey gut RNA virus community. Virol J 2010, 7:313.
Day JM, Zsak L: Determination and analysis of the full-length chicken parvovirus genome. Virology 2010, 399:59–64.
Zsak L, Strother KO, Kisary J: Partial genome sequence analysis of parvoviruses associated with enteric disease in poultry. Avian Pathol 2008, 37:435–441.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215:403–410.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25:3389–3402.
Desnues C, Rodriguez-Brito B, Rayhawk S, Kelley S, Tran T, Haynes M, Liu H, Furlan M, Wegley L, Chau B, Ruan Y, Hall D, Angly FE, Edwards RA, Li L, Thurber RV, Reid RP, Siefert J, Souza V, Valentine DL, Swan BK, Breitbart M, Rohwer F: Biodiversity and biogeography of phages in modern stromatolites and thrombolites. Nature 2008, 452:340–343.
Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22:4673–4680.
Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, He S, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Liebert CA, Liu C, Lu F, Lu S, Marchler GH, Mullokandov M, Song JS, Tasneem A, Thanki N, Yamashita RA, Zhang D, Zhang N, Bryant SH: CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res 2009, 37:205–210.
Markowitz VM, Korzeniewski F, Palaniappan K, Szeto E, Werner G, Padki A, Zhao X, Dubchak I, Hugenholtz P, Anderson I, Lykidis A, Mavromatis K, Ivanova N, Kyrpides NC: The integrated microbial genomes (IMG) system. Nucleic Acids Res 2006, 34:344–348.
Richter M, Rossello-Mora R: Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci USA 2009, 106:19126–19131.
R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Li W, Godzik A: Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006, 22:1658–1659.
Rokyta DR, Abdo Z, Wichman HA: The genetics of adaptation for eight microvirid bacteriophages. J Mol Evol 2009, 69:229–239.
Rokyta DR, Burch CL, Caudle SB, Wichman HA: Horizontal gene transfer and the evolution of microvirid coliphage genomes. J Bacteriol 2006, 188:1134–1142.
Garner SA, Everson JS, Lambden PR, Fane BA, Clarke IN: Isolation, molecular characterisation and genome sequence of a bacteriophage (Chp3) from Chlamydophila pecorum. Virus Genes 2004, 28:207–214.
Brentlinger KL, Hafenstein S, Novak CR, Fane BA, Borgon R, McKenna R, Agbandje-McKenna M: Microviridae, a family divided: isolation, characterization, and genome sequence of φMH2K, a bacteriophage of the obligate intracellular parasitic bacterium Bdellovibrio bacteriovorus. J Bacteriol 2002, 184:1089–1094.
McKenna R, Bowman BR, Ilag LL, Rossmann MG, Fane BA: Atomic structure of the degraded procapsid particle of the bacteriophage G4: induced structural changes in the presence of calcium ions and functional implications. J Mol Biol 1996, 256:736–750.
Storey CC, Lusher M, Richmond SJ: Analysis of the complete nucleotide sequence of Chp1, a phage which infects avian Chlamydia psittaci. J Gen Virol 1989, 70:3381–3390.
Renaudin J, Pascarel MC, Bove JM: Spiroplasma virus 4: nucleotide sequence of the viral DNA, regulatory signals, and proposed genome organization. J Bacteriol 1987, 169:4950–4961.
Sait M, Livingstone M, Graham R, Inglis NF, Wheelhouse N, Longbottom D: Identification, sequencing and molecular analysis of Chp4, a novel chlamydiaphage of Chlamydophila abortus belonging to the family Microviridae. J Gen Virol 2011, 92:1733–1737.
Liu BL, Everson JS, Fane B, Giannikopoulou P, Vretou E, Lambden PR, Clarke IN: Molecular characterization of a bacteriophage (Chp2) from Chlamydia psittaci. J Virol 2000, 74:3464–3469.
Read TD, Brunham RC, Shen C, Gill SR, Heidelberg JF, White O, Hickey EK, Peterson J, Umayam L, Utterback T, Berry K, Bass S, Linher K, Weidman J, Khouri H, Craven B, Bowman C, Dodson R, Gwinn M, Nelson W, DeBoy R, Kolonay J, McClarty G, Salzberg SL, Eisen J, Fraser CM: Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res 2000, 28:1397–1406.
Chipman PR, Agbandje-McKenna M, Renaudin J, Baker TS, McKenna R: Structural analysis of the Spiroplasma virus, SpV4: implications for evolutionary variation to obtain host diversity among the Microviridae. Structure 1998, 6:135–145.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.