- Short report
- Open Access
Analysis of the nucleotide sequence of the guinea pig cytomegalovirus (GPCMV) genome
Virology Journal volume 5, Article number: 139 (2008)
In this report we describe the genomic sequence of guinea pig cytomegalovirus (GPCMV) assembled from a tissue culture-derived bacterial artificial chromosome clone, plasmid clones of viral restriction fragments, and direct PCR sequencing of viral DNA. The GPCMV genome is 232,678 bp, excluding the terminal repeats, and has a GC content of 55%. A total of 105 open reading frames (ORFs) of > 100 amino acids with sequence and/or positional homology to other CMV ORFs were annotated. Positional and sequence homologs of human cytomegalovirus open reading frames UL23 through UL122 were identified. Homology with other cytomegaloviruses was most prominent in the central ~60% of the genome, with divergence of sequence and lack of conserved homologs at the respective genomic termini. Of interest, the GPCMV genome was found in many cases to bear stronger phylogenetic similarity to primate CMVs than to rodent CMVs. The sequence of GPCMV should facilitate vaccine and pathogenesis studies in this model of congenital CMV infection.
Guinea pig cytomegalovirus (GPCMV) serves as a useful model of congenital infection because the virus can cross the placenta and infect the fetus in utero [1–3]. This model is well suited to vaccine studies for prevention of congenital cytomegalovirus (CMV) infection, a major public health problem and a high-priority area for new vaccine development. However, an impediment to studies in this model has been the lack of detailed DNA sequence data. Although a number of reports have identified specific gene products or clusters of genes [5–11], a full genomic sequence has not previously been available.
We recently reported the construction and preliminary sequence map of a GPCMV bacterial artificial chromosome (BAC) clone maintained in E. coli [12, 13], and this clone was used as the initial template for sequence analysis of the full GPCMV genome. BAC DNA was purified using Clontech's NucleoBond® Plasmid Kits as described previously, and both strands were sequenced on an ABI PRISM® 377 DNA Sequencer, with primers synthesized as needed to 'primer-walk' the nucleotide sequence. In parallel, HindIII- and EcoRI-digested fragments were gel-purified and cloned into pUC- and pBR322-based vectors as previously described. Plasmid sequences were determined from overlapping HindIII and EcoRI fragments using the map coordinates originally described by Gao and Isom. These sequences were compared to the BAC sequence to facilitate assembly of a full-length contiguous sequence. Because cloning of the BAC in E. coli involved insertion of BAC origin sequences into the HindIII "N" region of the viral genome, sequence obtained from this restriction fragment cloned in pBR322 was used for assembly of the final contiguous sequence; analysis of this sequence confirmed that no adventitious deletions had been generated in the HindIII "N" region during the original BAC cloning process. Because a deletion in the HindIII "D" region occurred during cloning of the GPCMV BAC in E. coli, DNA sequence from a plasmid containing the full-length HindIII "D" fragment was similarly obtained and used for assembly of the final contiguous sequence. The GPCMV genomic sequence has been deposited with GenBank (Accession Number FJ355434).
Sequence analysis of GPCMV revealed a genome length of 232,678 bp with a GC content of 55%. This value is in agreement with the value of 54.1% determined previously by CsCl buoyant density centrifugation. A total of 326 open reading frames (ORFs) capable of encoding proteins of ≥ 100 amino acids (aa) were identified. Some predicted ORFs overlapped substantially with adjacent or complementary GPCMV ORFs that appeared to encode gene products highly conserved in other cytomegaloviruses; of these, only ORFs with < 60% overlap with the highly conserved ORFs were analyzed further. ORFs of ≥ 100 aa that were homologous to ORFs encoded by other CMVs, with an e-value of < 0.1, were identified by comparisons using NCBI Blast (blastall program, version 2.2.16). Of the ORFs so identified, 104 had sequence and/or positional homology to one or more ORFs encoded by human (HCMV), murine (MCMV), rat (RCMV), rhesus (RhCMV), or chimpanzee (CCMV) cytomegalovirus, or by tupaia herpesvirus (THV) (Table 1). Of note, homologs of HCMV ORFs UL23 through UL122 were identified. For ease of nomenclature, we have designated these ORFs with an upper-case "GP" prefix (GP23 through GP122). ORFs with homologs in other CMVs that do not correspond to HCMV UL23 through UL122 have been designated with a lower-case "gp" prefix. Homologs of HCMV UL41a (69 aa; gp38.2), UL51 (99 aa; GP51), and UL91 (87 aa; GP91) were annotated in these initial analyses based primarily on positional, rather than sequence, homology to the respective HCMV ORFs. Three ORFs that are homologs of MHC class I genes known to be encoded by multiple other CMVs (gp147–149, Table 1) were also identified. One ORF, gp1 (a homolog of CC chemokines), did not have a positional or sequence homolog in other CMVs, but was included in the annotation because of its previous molecular characterization.
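The GC content and the count of ORFs of ≥ 100 aa reported above are the kind of figures that a simple six-frame scan can reproduce. The following Python sketch is illustrative only and is not the authors' pipeline: the function names are hypothetical, and it assumes that an ORF runs from an ATG codon to the next in-frame stop codon, a definition the report does not specify.

```python
from typing import Iterator, Tuple

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T only)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 100) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (strand, frame, start, end) for ORFs of at least min_aa codons.

    Assumption (not from the report): an ORF starts at the first ATG after
    the previous stop and ends at the next in-frame stop codon.
    Coordinates are 0-based and end-exclusive on the strand being scanned
    (for "-" they refer to the reverse-complemented sequence); the stop
    codon lies inside the span but is not counted in the length.
    """
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_aa:
                        yield strand, frame, start, i + 3
                    start = None
```

Applied to the deposited sequence (GenBank FJ355434), `gc_content` would be expected to return a value near 0.55. The ORF count depends on the start-codon convention, so it need not match the 326 reported here.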
Including ORFs with mapped exons, the total number of ORFs annotated in this preliminary analysis was 105 (Table 1).
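The selection criteria described above (≥ 100 aa, e-value < 0.1, and < 60% overlap with a highly conserved ORF) can be expressed as a small filter. The sketch below is a hedged illustration rather than the authors' code: it assumes BLAST hits are available in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score), and it assumes overlap is measured as a fraction of the candidate ORF's own length, which the report does not state. All names are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column tabular BLAST output; comment and blank lines are skipped."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 11 (index 10) holds the e-value in the 12-column layout.
        hits.append(Hit(fields[0], fields[1], float(fields[10])))
    return hits


def overlap_fraction(orf: Tuple[int, int], conserved: Tuple[int, int]) -> float:
    """Fraction of `orf` covered by `conserved` (both 0-based, end-exclusive)."""
    shared = min(orf[1], conserved[1]) - max(orf[0], conserved[0])
    return max(shared, 0) / (orf[1] - orf[0])


def keep_orf(length_aa: int, best_evalue: float, overlap: float,
             min_aa: int = 100, max_evalue: float = 0.1,
             max_overlap: float = 0.6) -> bool:
    """Apply the three thresholds quoted in the text to one candidate ORF."""
    return (length_aa >= min_aa
            and best_evalue < max_evalue
            and overlap < max_overlap)
```

Keeping the thresholds as keyword arguments makes it easy to rerun the filter with the stricter 1e-5 e-value cutoff used for the alignments and for additional file 2.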
A map of the GPCMV genome illustrating the relative positions of these ORFs is shown in Fig. 1. ORFs that represent homologs of the individual exons of spliced HCMV genes, in particular UL89 (terminase) and UL112/UL113 (replication accessory protein), are annotated separately. The splice junction for the GP89 mRNA was predicted based on comparisons to other CMVs. For the UL112/UL113 region, further studies will be required to map the precise splicing patterns of the putative transcripts encoded by this region of the GPCMV genome. Similarly, the ORF encoding the sequence homolog of the HCMV IE transactivator, UL122, has been annotated without regard to the splicing events previously shown to take place in this region of the genome; further analyses of cDNA from this and other GPCMV genome regions of IE transcription, including those encoded in the HindIII "D" region of the genome, will likely result in annotation of multiple previously unidentified ORFs. A comprehensive table of all ORFs > 25 aa and their homology to other CMV genomes is provided in additional files 1 and 2. As RNA analyses are completed, the total number of annotated GPCMV ORFs is expected to increase.
The schematic representation of GPCMV ORFs in Fig. 1 highlights several gene families of particular interest. Especially important to vaccine studies in the guinea pig model are conserved homologs of the ORFs encoding the major envelope glycoproteins gB, gH/gL/gO, and gM/gN. These glycoproteins are important determinants of humoral immune responses in the setting of CMV infection, and serve as potential subunit vaccine candidates. Of these, the gB homolog has been demonstrated to confer protection against congenital GPCMV infection in subunit vaccine studies [21–23]. Homologs of putative HCMV immune modulation genes, including G-protein coupled receptors and major histocompatibility class I homologs, were also identified. Also of interest was the presence of multiple US22 gene family homologs, heavily clustered near the rightward terminus of the GPCMV genome. These ORFs predict protein products that are analogous to the MCMV dsRNA-binding proteins, M142 and M143, which have been shown to inhibit dsRNA-activated antiviral pathways [25, 26]. Members of this family have also been implicated in macrophage tropism in MCMV. Our sequence analysis also confirmed the findings of Liu and Biegalke that the GPCMV genome does not encode a positional homolog of the antiapoptotic HCMV UL36 gene. However, an ORF with homology to R36, which encodes the presumed RCMV cell death suppressor, was identified (gp29.1, Table 1). Further studies will be required to determine whether this putative gene supplies a UL36-like function.
It was also of interest to note the presence of ORFs with apparent homology to the MCMV M129-133 region. This region has positional homologs in human and primate CMVs [29–31], but is absent in THV. Recently, it was determined that passage of GPCMV in cultured fibroblasts promotes the deletion of a ~1.6-kb locus containing potential positional homologs of this gene cluster. The presence of this 1.6-kb locus was found by Inoue and colleagues to be associated with enhanced pathogenesis of GPCMV in vivo. We independently confirmed the presence of this locus and its sequence in our salivary gland-derived viral stocks, and have included this sequence in our GenBank annotation (Accession Number FJ355434). Further studies will be required to fully annotate the transcripts encoded by this region of the GPCMV genome. Interestingly, the original GPCMV BAC clone that we sequenced was derived using GPCMV viral DNA obtained after long-term tissue culture passage of ATCC 2122 viral stock, and, not surprisingly, this BAC was found to lack the 1.6-kb virulence locus. Subsequently, PCR and preliminary sequencing of a more recently obtained GPCMV BAC clone with an excisable origin of replication revealed that the 1.6-kb sequence was retained in this clone. The apparent modifications of this locus that occur following viral passage on fibroblast cells are reminiscent of the mutations and deletions that occurred during fibroblast passage of HCMV and rhesus CMV. The congruence of these events suggests that the selective pressures that promote mutational inactivation of genes in this region may be similar across viral species. Additional analyses, including sequencing of a full-length GPCMV genome derived from virus replicating in vivo, will be required to determine what other deletions or mutations are present in genomes from tissue culture-passaged viruses.
Since additional ORFs are likely to be identified by these analyses, we have annotated the first ORF identified in the BAC sequence to the right of this 1.6-kb region as gp138 (Fig. 1), to allow for ease of nomenclature as ORFs in this virulence locus are better characterized. Application of other genome sequence analysis methods, including identification of small or overlapping genes and further assessment of mRNA splicing or unconventional translation signals, will likely result in identification of other putative ORFs in future studies.
GPCMV ORFs were also compared with sequences from other CMV genomes. ORF translations were compared with all proteins from the 6 sequenced CMV genomes (HCMV, MCMV, RCMV, RhCMV, THV, and CCMV), and hits with e-values less than 1e-5 were aligned individually for each protein, using both ClustalW (version 1.82) and Muscle (version 3.6). The alignments were then used to generate neighbor-joining trees using JalView. Clustal trees for glycoproteins B (GP55) and N (GP73) are shown in Fig. 2, with distance scores indicated. Overall, comparison of the various glycoproteins (gB, gM, gH, and gO) yielded similar phylogenies, with GPCMV glycoproteins generally appearing closer to primate CMVs than to rodent CMVs, except for the gN homolog, which appears closer to rodent CMVs. ClustalW and Muscle comparisons of GPCMV ORFs with homologous ORFs from the other sequenced CMVs are provided in additional file 3.
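To make the neighbor-joining step concrete, the following Python sketch builds an unrooted tree from a pairwise distance matrix using the Saitou-Nei procedure. It is a minimal illustration, not a reimplementation of JalView: the report does not say which distance measure was used, so the simple p-distance (fraction of differing residues over ungapped columns) is an assumption made here, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names: List[str], dist: Dict[FrozenSet[str], float]) -> str:
    """Return an unrooted Newick string; dist maps frozenset({a, b}) to a distance."""
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    d = dict(dist)
    nodes = list(names)
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimizes the Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))  # branch from i to new node
        lj = dij - li                                    # branch from j to new node
        new = f"({i}:{li:.4f},{j}:{lj:.4f})"
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    # Resolve the final three nodes around a single central node.
    a, b, c = nodes
    dab, dac, dbc = (d[frozenset((a, b))], d[frozenset((a, c))],
                     d[frozenset((b, c))])
    la = 0.5 * (dab + dac - dbc)
    lb = 0.5 * (dab + dbc - dac)
    lc = 0.5 * (dac + dbc - dab)
    return f"({a}:{la:.4f},{b}:{lb:.4f},{c}:{lc:.4f});"
```

In practice the distance dictionary would be filled by calling `p_distance` on each pair of rows from a ClustalW or Muscle alignment of, for example, the gB homologs. Node labels double as dictionary keys in this sketch, so taxon names must be unique.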
In summary, the complete DNA sequence of GPCMV was determined using a combination of sequencing of BAC DNA, viral DNA, and cloned HindIII and EcoRI fragments. These analyses identified conserved ORFs found in all mammalian CMVs, as well as novel genes apparently unique to GPCMV. The similarities to other CMVs underscore the usefulness of the guinea pig model, with positive translational implications for development and testing of CMV intervention strategies in humans. Further characterization of the GPCMV genome should facilitate ongoing vaccine and pathogenesis studies in this uniquely useful small animal model of congenital CMV infection.
Kern ER: Pivotal role of animal models in the development of new therapies for cytomegalovirus infections. Antiviral Res 2006, 71: 164-71.
Schleiss MR: Animal models of congenital cytomegalovirus infection: an overview of progress in the characterization of guinea pig cytomegalovirus (GPCMV). J Clin Virol 2002, 25(Suppl 2): S37-49.
Schleiss MR: Comparison of vaccine strategies against congenital CMV infection in the guinea pig model. J Clin Virol 2008, 41: 224-30.
Schleiss MR: Cytomegalovirus vaccine development. Curr Top Microbiol Immunol 2008, 325: 361-82.
McVoy MA, Nixon DE, Adler SP: Circularization and cleavage of guinea pig cytomegalovirus genomes. J Virol 1997, 71: 4209-17.
Fox DS, Schleiss MR: Sequence and transcriptional analysis of the guinea pig cytomegalovirus UL97 homolog. Virus Genes 1997, 15: 255-64.
Schleiss MR, McGregor A, Jensen NJ, Erdem G, Aktan L: Molecular characterization of the guinea pig cytomegalovirus UL83 (pp65) protein homolog. Virus Genes 1999, 19: 205-221.
Liu Y, Biegalke BJ: Characterization of a cluster of late genes of guinea pig cytomegalovirus. Virus Genes 2001, 23: 247-56.
Haggerty SM, Schleiss MR: A novel CC-chemokine homolog encoded by guinea pig cytomegalovirus. Virus Genes 2002, 25: 271-9.
McGregor A, Liu F, Schleiss MR: Identification of essential and non-essential genes of the guinea pig cytomegalovirus (GPCMV) genome via transposome mutagenesis of an infectious BAC clone. Virus Res 2004, 101: 101-8.
Paglino JC, Brady RC, Schleiss MR: Molecular characterization of the guinea-pig cytomegalovirus glycoprotein L gene. Arch Virol 1999, 144: 447-62.
McGregor A, Schleiss MR: Molecular cloning of the guinea pig cytomegalovirus (GPCMV) genome as an infectious bacterial artificial chromosome (BAC) in Escherichia coli. Mol Genet Metab 2001, 72: 15-26.
Schleiss MR, Lacayo J: The Guinea-Pig Model of Congenital CMV Infection. In Cytomegaloviruses: Molecular Biology and Immunology. Edited by: Reddehase MJ, Lemmermann N. Horizon Scientific Press; 2006:525-50.
McGregor A, Liu F, Schleiss MR: Molecular, biological, and in vivo characterization of the guinea pig cytomegalovirus (CMV) homologs of the human CMV matrix proteins pp71 (UL82) and pp65 (UL83). J Virol 2004, 78: 9872-89.
Schleiss MR: Cloning and characterization of the guinea pig cytomegalovirus glycoprotein B gene. Virology 1994, 202: 173-85.
Gao M, Isom HC: Characterization of the guinea pig cytomegalovirus genome by molecular cloning and physical mapping. J Virol 1984, 52: 436-47.
Cui X, McGregor A, Schleiss MR, McVoy MA: Cloning the complete guinea pig cytomegalovirus genome as an infectious bacterial artificial chromosome with excisable origin of replication. J Virol Methods 2008, 149: 231-9.
Isom HC, Gao M, Wigdahl B: Characterization of guinea pig cytomegalovirus DNA. J Virol 1984, 49: 426-36.
Chee MS, Bankier AT, Beck S, Bohni R, Brown CM, Cerny R, Horsnell T, Hutchison CA 3rd, Kouzarides T, Martignetti JA, et al.: Analysis of the protein-coding content of the sequence of human cytomegalovirus strain AD169. Curr Top Microbiol Immunol 1990, 154: 125-69.
Yin CY, Gao M, Isom HC: Guinea pig cytomegalovirus immediate-early transcription. J Virol 1990, 64: 1537-48.
Bourne N, Schleiss MR, Bravo FJ, Bernstein DI: Preconception immunization with a cytomegalovirus (CMV) glycoprotein vaccine improves pregnancy outcome in a guinea pig model of congenital CMV infection. J Infect Dis 2001, 183: 59-64.
Schleiss MR, Bourne N, Bernstein DI: Preconception vaccination with a glycoprotein B (gB) DNA vaccine protects against cytomegalovirus (CMV) transmission in the guinea pig model of congenital CMV infection. J Infect Dis 2003, 188: 1868-74.
Schleiss MR, Bourne N, Stroup G, Bravo FJ, Jensen NJ, Bernstein DI: Protection against congenital cytomegalovirus infection and disease in guinea pigs, conferred by a purified recombinant glycoprotein B vaccine. J Infect Dis 2004, 189: 1374-81.
Powers C, DeFilippis V, Malouli D, Früh K: Cytomegalovirus immune evasion. Curr Top Microbiol Immunol 2008, 325: 333-59.
Valchanova RS, Picard-Maureau M, Budt M, Brune W: Murine cytomegalovirus m142 and m143 are both required to block protein kinase R-mediated shutdown of protein synthesis. J Virol 2006, 80: 10181-90.
Child SJ, Hanson LK, Brown CE, Janzen DM, Geballe AP: Double-stranded RNA binding by a heterodimeric complex of murine cytomegalovirus m142 and m143 proteins. J Virol 2006, 80: 10173-80.
Ménard C, Wagner M, Ruzsics Z, Holak K, Brune W, Campbell AE, Koszinowski UH: Role of murine cytomegalovirus US22 gene family members in replication in macrophages. J Virol 2003, 77: 5557-70.
Skaletskaya A, Bartle LM, Chittenden T, McCormick AL, Mocarski ES, Goldmacher VS: A cytomegalovirus-encoded inhibitor of apoptosis that suppresses caspase-8 activation. Proc Natl Acad Sci USA 2001, 98: 7829-34.
Lagenaur LA, Manning WC, Vieira J, Martens CL, Mocarski ES: Structure and function of the murine cytomegalovirus sgg1 gene: a determinant of viral growth in salivary gland acinar cells. J Virol 1994, 68: 7717-7727.
Dolan A, Cunningham C, Hector RD, Hassan-Walker AF, Lee L, Addison C, Dargan DJ, McGeoch DJ, Gatherer D, Emery VC, Griffiths PD, Sinzger C, McSharry BP, Wilkinson GW, Davison AJ: Genetic content of wild-type human cytomegalovirus. J Gen Virol 2004, 85: 1301-1312.
Ryckman BJ, Rainish BL, Chase MC, Borton JA, Nelson JA, Jarvis MA, Johnson DC: Characterization of the human cytomegalovirus gH/gL/UL128-131 complex that mediates entry into epithelial and endothelial cells. J Virol 2008, 82: 60-70.
Bahr U, Darai G: Analysis and characterization of the complete genome of tupaia (tree shrew) herpesvirus. J Virol 2001, 75: 4854-70.
Nozawa N, Yamamoto Y, Fukui Y, Katano H, Tsutsui Y, Sato Y, Yamada S, Inami Y, Nakamura K, Yokoi M, Kurane I, Inoue N: Identification of a 1.6 kb genome locus of guinea pig cytomegalovirus required for efficient viral growth in animals but not in cell culture. Virology 2008, 379: 45-54.
Cha TA, Tom E, Kemble GW, Duke GM, Mocarski ES, Spaete RR: Human cytomegalovirus clinical isolates carry at least 19 genes not found in laboratory strains. J Virol 1996, 70: 78-83.
Oxford KL, Eberhardt MK, Yang KW, Strelow L, Kelly S, Zhou SS, Barry PA: Protein coding content of the ULb' region of wild-type rhesus cytomegalovirus. Virology 2008, 373: 181-8.
Brocchieri L, Kledal TN, Karlin S, Mocarski ES: Predicting coding potential from genome sequence: application to betaherpesviruses infecting rats and mice. J Virol 2005, 79: 7570-96.
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Research 2003, 31: 3497-3500.
Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 2004, 32: 1792-1797.
Beuken E, Slobbe R, Bruggeman CA, Vink C: Cloning and sequence analysis of the genes encoding DNA polymerase, glycoprotein B, ICP18.5 and major DNA-binding protein of rat cytomegalovirus. J Gen Virol 1996, 77: 1559-62.
Grant support was provided by NIH HD044864-01 and HD38416-01 (to MRS) and R01AI46668 (to MAM). The authors acknowledge helpful discussions and input from Becket Feierbach (Genentech, Inc.). The authors also acknowledge the technical contributions of Yonggen Song and the gift of the HindIII "D" plasmid from HC Isom, Penn State University.
The authors declare that they have no competing interests. SVD is an employee of Genentech Corporation.
MRS cloned viral fragments, performed sequence analysis, analyzed the data and prepared the communication. AM and XC cloned the GPCMV BACs. AM cloned individual genes for sequence analysis. AM, XC and KYC performed sequence analysis, participated in data analysis, and helped in preparation of the communication. MAM cloned viral DNA fragments, performed sequence analysis, participated in BAC cloning, and aided in preparation of the communication. SVD performed comparative genomic analyses and aided in the preparation of the communication.
Electronic supplementary material
Additional file 1: ORFs of ≥ 25 aa (tab A), 50 aa (tab B), or 100 aa (tab C) with Blast analysis against other sequenced CMV genomes; e-value cutoff of 0.1. (XLS 671 KB)
Additional file 2: ORFs of ≥ 25 aa (tab A), 50 aa (tab B), or 100 aa (tab C) with Blast analysis against other sequenced CMV genomes; e-value cutoff of 1e-5. (XLS 458 KB)
Additional file 3: Phylogenetic trees for glycoproteins gB, gH, gO, gL, gM and gN, IRS 1–3 family, and GP116 (functional homolog of UL119; Fc receptor/immunoglobulin binding domains). Alignments generated using both ClustalW and Muscle, as described in the text. (PDF 149 KB)
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Schleiss, M.R., McGregor, A., Choi, K.Y. et al. Analysis of the nucleotide sequence of the guinea pig cytomegalovirus (GPCMV) genome. Virol J 5, 139 (2008). https://doi.org/10.1186/1743-422X-5-139
- Bacterial Artificial Chromosome
- Bacterial Artificial Chromosome Sequence
- Accession Number FJ355434
- Positional Homolog
- HCMV UL23