Genetic diversity among five T4-like bacteriophages
© Nolan et al; licensee BioMed Central Ltd. 2006
Received: 31 March 2006
Accepted: 23 May 2006
Published: 23 May 2006
Bacteriophages are an important repository of genetic diversity. As one of the major constituents of terrestrial biomass, they exert profound effects on the earth's ecology and microbial evolution by mediating horizontal gene transfer between bacteria and controlling their growth. Only limited genomic sequence data are currently available for phages but even this reveals an overwhelming diversity in their gene sequences and genomes. The contribution of the T4-like phages to this overall phage diversity is difficult to assess, since only a few examples of complete genome sequence exist for these phages. Our analysis of five T4-like genomes represents half of the known T4-like genomes in GenBank.
Here, we have examined in detail the genetic diversity of the genomes of five relatives of bacteriophage T4: the Escherichia coli phages RB43, RB49 and RB69, the Aeromonas salmonicida phage 44RR2.8t (or 44RR) and the Aeromonas hydrophila phage Aeh1. Our data define a core set of conserved genes common to these genomes as well as hundreds of additional open reading frames (ORFs) that are nonconserved. Although some of these ORFs resemble known genes from bacterial hosts or other phages, most show no significant similarity to any known sequence in the databases. The five genomes analyzed here all have similarities in gene regulation to T4. Sequence motifs resembling T4 early and late consensus promoters were observed in all five genomes. In contrast, only two of these genomes, RB69 and 44RR, showed similarities to T4 middle-mode promoter sequences and to the T4 motA gene product required for their recognition. In addition, we observed that each phage differed in the number and assortment of putative genes encoding host-like metabolic enzymes, tRNA species, and homing endonucleases.
Our observations suggest that evolution of the T4-like phages has drawn on a highly diverged pool of genes in the microbial world. The T4-like phages harbour a wealth of genetic material that has not been identified previously. The mechanisms by which these genes may have arisen may differ from those previously proposed for the evolution of other bacteriophage genomes.
The T4-like phages are a diverse group of lytic bacterial myoviruses that share genetic homologies and morphological similarities with the well-studied coliphage T4 [1, 2]. These phages provide an attractive model for the study of comparative genomics and phage evolution for several reasons: They possess relatively large dsDNA genomes that vary widely in size (~160–250 kb) and genetic composition. They contain host-like functions, such as nucleotide metabolism and a DNA replisome (reviewed in ). They experience different evolutionary constraints due to their lytic life cycle than do either their bacterial host or lysogenic bacteriophages. They exist under less stringent genomic size constraints than, for example, the lambdoid phages . T4 has a terminally redundant genome  that replicates by a recombination-primed replication pathway. The efficient and promiscuous T4-encoded recombination machinery  may generate a high degree of evolutionary diversity, via both homologous and non-homologous recombination between this phage genome and that of bacterial hosts or other phages. Thus the characteristics of the T4-like genome, its mechanism of replication, and the interactions with cellular hosts suggest that the T4-like phages constitute a natural crucible for the acquisition, evolution and dispersal of genetic information in the microbial world.
We present here a bioinformatics analysis of the genome sequences of five T4-like bacteriophages. These phages include three coliphages (RB69, RB49 and RB43), and two Aeromonas phages (44RR2.8t and Aeh1). Our results complement and extend those previously reported from the coliphage T4 , the Vibrio phage, KVP40 , and from the marine cyanophages S-PM2 , P-SSM2 and P-SSM4 . Our data identify a conserved core of T4-like genes found in all of these genomes, including some conserved ORFs of unknown function. One of the most striking findings is the presence of large numbers of novel open reading frames (ORFs), most of which have no significant match in GenBank. Both conserved and nonconserved regions of the genomes include sequence motifs resembling T4 promoters. Thus, it appears that both core and novel genes are co-ordinately expressed in a manner similar to that of T4. We compare the possible origins of the novel regions of the T4 genome with those proposed for other phages.
We have analyzed five complete genome sequences of phylogenetically distant T4-like bacteriophages. This analysis is the first part of an ongoing comparative genomics project on T4-like phages. At present this project has generated single contiguous sequences for 12 divergent T4-like genomes. Of these sequences, five genomes were selected for in-depth analysis on the basis of their phylogenetic diversity . Among completed genomes that are not dealt with here are the Aeromonas phages 31 and 25, since they are both close relatives of 44RR2.8t and thus do not add significantly to the sequence diversity of the group. Five other genomes are considered draft quality (coliphages RB16 and phi-1, Vibrio phage nt-1, Acinetobacter phage 133, and Aeromonas phage 65) and are not included in this analysis but are available through the Tulane T4-like Genome Website http://phage.bioc.tulane.edu. The five genomes presented here share between 61 and 67 percent amino acid similarity to each other among ~100 conserved open reading frames. T4 is most closely related to RB69, with which it shares 81% amino acid similarity over 207 ORFs. T4 exhibits about the same level of similarity to the other four genomes as they do to each other.
Summary of T4-like genome sequences determined in comparison with T4
# ORFS (% of genome)
# T4-like ORFs (% of all)
Conserved genes and ORFs
Domain matches for T4 conserved ORFs
Pfam domain name
E value range
Gly_radical formyl transferase
1.40E-45 to 8.8E-15
0.012 to 0.74
AAA ATPase family
0.082 to 0.16
COG3541: nucleotidyl transferase
4.0E-07 to 0.013
2/6 full alignment 4/6 partial alignment
2.0E-20 to 0.04
4/6 full alignment 1/6 partial alignment
Only recently has the conserved ORF uvsW.1 been recognized  in T4. Previously this sequence was believed to encode the C-terminal 76 amino acids of the UvsW protein. For all 5 of the genomes analyzed here, the coding region corresponding to T4 uvsW was divided into 2 ORFs, uvsW and uvsW.1. Concurrent crystallography on the UvsW protein from T4 showed that it too lacked the region similar to uvsW.1, and subsequent resequencing of this region in T4 confirmed the presence of the two distinct ORFs, uvsW.1 and uvsW . Although uvsW.1 is conserved among T4 and all 5 genomes studied here, its function remains unknown.
Each phage genome includes a surprisingly large number of ORFs that have no matches in T4. We term these ORFs "novel ORFs"; their numbers range from 230 in Aeh1 (54% of the genome) to 62 in RB69 (20% of the genome). Similarly, 64 T4 ORFs (15% of the genome) have no apparent ortholog in RB69, its closest relative in this analysis; these 64 ORFs are novel to T4 (see Table 1). Locations of the novel ORFs appear to be non-random, with most clustered in groups between blocks of conserved genes. In a few instances, however, novel ORFs are found singly between conserved genes (see Figure 1). The direction of transcription of the novel ORFs is almost invariably the same as that of the flanking conserved genes. This suggests that the novel ORFs are subject to the same regulatory constraints as the rest of the phage genome, with early expressed genes being transcribed primarily counterclockwise and late genes being transcribed clockwise. Nearly 90% of the novel ORFs are clustered among early and middle gene orthologs, suggesting that these genes are expressed at the beginning of the infectious cycle, along with the flanking conserved genes (see also below). The novel ORFs do not appear to differ significantly in codon bias from conserved genes. They share the same strand bias of the third codon position seen in T4  and do not vary significantly in codon adaptation index  from conserved genes (data not shown). These observations argue that the novel ORFs are not recent acquisitions of host genes.
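The codon adaptation index comparison mentioned above can be illustrated with a minimal sketch. It follows the general definition of the index (the geometric mean of per-codon relative adaptiveness values derived from a reference gene set), but the function names, the toy reference set, and the choice to skip codons absent from the reference are assumptions made for illustration; this is not the authors' pipeline.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight of each codon = its count / count of the most used synonymous codon."""
    counts = Counter(c for s in reference_seqs for c in codons(s) if c in CODON_TABLE)
    best = defaultdict(int)
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TABLE[codon]] for codon, n in counts.items()}


def cai(seq, weights):
    """Geometric mean of codon weights over the informative codons of seq."""
    logs = []
    for c in codons(seq):
        aa = CODON_TABLE.get(c)
        # Skip stops, single-codon amino acids (Met, Trp), and, as a
        # simplification, codons never seen in the reference set.
        if aa in (None, "*", "M", "W") or c not in weights:
            continue
        logs.append(math.log(weights[c]))
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

In use, the weights would be computed once from a set of conserved genes, and the resulting index values for novel and conserved ORFs could then be compared as two distributions.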
Pfam hits for novel ORFs
Pfam Domain name
Prokaryotic N-terminal methylation motif
SPFH domain/Band 7 family
Glutaredoxin-like domain (DUF836)
Ribonucleotide reductase, small chain
Prokaryotic dksA/traR C4-type zinc finger
Domain of unknown function (DUF1732)
Sodium:solute symporter family
Putative metallopeptidase (SprT family)
Carbohydrate binding domain
Carbohydrate binding domain
Prokaryotic N-terminal methylation motif
Putative metallopeptidase (SprT family)
SPFH domain/Band 7 family
Bacterial transferase hexapeptide (3 repeats)
Poly(ADP-ribose) polymerase catalytic domain
von Willebrand factor type A domain
Bacterial regulatory proteins, lacI family
Poly A polymerase family
Phage T4 tail fibre
von Willebrand factor type A domain
C-5 cytosine-specific DNA methylase
SPFH domain/Band 7 family
PhoH-like protein PIN domain
DnaJ central domain (4 repeats)
DnaJ central domain (4 repeats)
Protein of unknown function (DUF1054)
Phage tail fibre adhesin Gp38
DEAD/DEAH box helicase
Prokaryotic N-terminal methylation motif
Methyltransferase small domain Ribosomal RNA adenine dimethylase
Protein of unknown function (DUF723)
Protein of unknown function (DUF1311)
Peptidase family U32
Putative mobile DNA elements
AP2 domain HNH endonuclease
Putative nucleotide salvage enzymes
NUDIX domain Cytidylyltransferase
NUDIX domain Cytidylyltransferase
Several other novel ORFs may be involved in nucleotide modification and synthesis. These include DNA methylase, nucleotidyl transferase, nucleotide triphosphatase and sugar isomerase domain functions identified by Pfam matches. In addition, phylogenetic analyses suggest that phage 44RR appears to have acquired ribonucleotide reductase and thioredoxin genes from a bacterial host, rather than through conservation of the T4-like orthologs . A number of the predicted ORFs likely to be involved in gene regulation were also identified, including DNA binding proteins, polyADP-ribosylases and -hydrolases, DNA helicases, an excision repair endonuclease and homing endonucleases, as indicated in Table 3. Other putative functions identified include membrane proteins, peptidases, ATPases, an exotoxin, and a putative DnaJ-type protein chaperone. Several ORFs that do not match known genes in GenBank do match GenBank environmental sample sequences. It is unclear if these matches are to uncharacterized bacterial hosts, or to unknown bacteriophages.
Mobile DNA elements
The T4 genome encodes a number of mobile DNA elements, including 3 group I introns with integrated ORFs encoding homing endonucleases as well as the freestanding homing endonuclease genes (HEGs), mob and seg . No group I introns were detected among any of the T4-like genomes sequenced here. However, two ORFs bearing similarity to the mob genes of T4 were identified in Aeh1 and RB43. An ORF similar to T4 segD has also been described for KVP40 . Thus, T4 seems to carry many more mobile elements than the genomes analyzed here. Interestingly, both RB49 and RB43 exhibit matches to a recently identified class of HEGs, AP2-HNH mobile DNA elements, which are related to the AP2 DNA transcription factor in plants  (also see ). This class of HEGs has been postulated to have transferred from bacteriophages into plant genomes via the chloroplast genome .
Putative signals for transcriptional regulation
The similarities of genome organization to T4 suggested that T4 transcriptional regulatory circuits might be conserved for many T4-like phages in nature. However, phages 44RR and Aeh1 replicate in different hosts than T4 and coliphage RB43 has a substantially rearranged genome compared to the T4 prototype. The relevance of these differences to gene regulation was analyzed by prediction of transcriptional promoter elements in each genome. Consensus nucleotide sequences have been described for three temporal classes of promoters in T4: genes expressed early, middle and late in the infectious cycle . Each of the five T4-like genomes was searched for matches to these T4 transcriptional regulatory signals.
All putative early promoters resemble the T4 consensus in the -10 region, which is recognized in the host by the σ subunit of RNA polymerase. In general, there is high conservation of T at position -7 and of A at position -11, as seen in T4. However, in the phages analyzed here, conservation of the T at position -12 is variable; T is not rigidly conserved at position -12 in Aeh1, and in RB49 it can be either T or C. There is variable conservation of the GT-rich sequence 5' to position -12 exhibited by T4. 44RR shows a higher degree of conservation of A at -8 than any of the other phages. The genomes of RB69, RB49, and 44RR all show a preference for C residues in the -3 to -1 region. The predicted RB49 early consensus agrees with that previously identified by 5' end mapping of RB49 early transcripts .
In T4, late promoters are recognized by a phage-encoded σ factor, gp55. Contact between T4 gp55 and the DNA is facilitated by the T4 polymerase sliding clamp, gp45. A third T4-encoded gene product, gp33, forms a bridge between gp55 and gp45 . The T4 late promoter consensus sequence is a short but highly conserved motif, TATAAATA, between nucleotide positions -13 and -6 relative to the transcriptional start site . Putative late promoters were found readily for four of the five phage genomes studied, using the strategy employed for early and middle promoter searches (Figure 4B). However, the T at position -13 was poorly conserved for most phages, with either A or T commonly found at this position. A similar observation was made in an earlier description of RB49 late promoters , as well as in KVP40  and S-PM2 .
Since our search strategy failed to detect late promoter sequences for phage Aeh1, an alternative strategy was employed to identify them. Regions upstream of ORFs orthologous to T4 late genes were analyzed with the ELPH program  to identify sequence motifs common to these DNA segments. The selected motifs were used as seed to identify additional late promoter sequences using HMMer. This strategy identified a conserved sequence, CTAAATA, beginning at -12 from the putative initiation site. Once identified, this putative promoter sequence was used as a seed for string search followed by HMM refinement used for late promoters of the other phages. Although the C at position -12 is a strong determinant for detection of Aeh1 late promoters, C is rarely found at this position in the putative late promoters of the other four phage genomes (Figure 4B). It should be noted that the phage Aeh1 gp55 protein, which presumably recognizes the divergent late promoter sequences of Aeh1, is itself substantially diverged from all the other phage gp55 sequences (data not shown). Coordinates of putative late promoters can be found in the supplements (see additional file 5).
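The initial string-search step described above can be sketched as a simple scan of both strands for near matches to a seed motif. This is a minimal illustration only: the function names, the mismatch tolerance, and the coordinate convention (0-based start on the forward strand) are assumptions, and the HMM refinement stage that the text attributes to HMMer is not reproduced here.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (non-ACGT characters kept)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(genome, motif="TATAAATA", max_mismatches=1):
    """Return (start, strand, matched_sequence, mismatches) for each near match.

    start is the 0-based position of the leftmost base of the site on the
    forward strand, regardless of which strand carries the motif.
    """
    genome = genome.upper()
    width = len(motif)
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            mismatches = sum(a != b for a, b in zip(window, motif))
            if mismatches <= max_mismatches:
                start = i if strand == "+" else len(seq) - i - width
                hits.append((start, strand, window, mismatches))
    return sorted(hits)
```

Calling `find_motif(sequence, motif="CTAAATA", max_mismatches=0)` would perform the corresponding exact scan with the Aeh1 seed reported above; candidate sites would still need to be filtered by their position relative to predicted ORF starts.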
Terminators and operons
Putative rho-independent terminator sequences were identified for all 5 genomes, using the TransTerm program . Although the locations of putative terminator sequences vary between phages, several terminators appear at conserved locations (see additional file 6). One striking example is the bidirectional terminator predicted downstream of uvsW.1 that is conserved in T4 and the other 5 genomes. In all cases, the gene downstream of uvsW.1 is transcribed from the opposite strand and a bidirectional terminator is predicted between the converging transcripts. Genes 35 and 36 are transcribed rightward and a predicted terminator is located between them in all 6 genomes. Likewise, gene 23 has a terminator predicted downstream in all 6 genomes. Terminators conserved in 5 out of 6 genomes were identified downstream of gene 32 and upstream of alt.
Comparisons between the positions of predicted terminators and transcription initiation signals allowed the identification of putative operons of gene expression. An example of operon structure from phage RB69 is shown in Figure 3. In some instances, it appears that the upstream promoters of novel genes drive expression of T4-like early genes that lack their own early promoter. In general, T4-like genes are predicted to be in operons with other T4-like genes, while novel ORFs appear to reside in operons with other novel ORFs.
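The grouping logic implied by this comparison can be sketched as follows. The sketch assumes genes are given as (name, start, end, strand) tuples and terminators as single genome coordinates; it splits a run of adjacent genes wherever the strand changes or a predicted terminator falls in the gap between two genes. The data layout and function name are hypothetical, promoter positions are not used, and overlapping genes are not treated specially, so this is a simplification of what the text describes.

```python
def putative_operons(genes, terminators):
    """Group adjacent same-strand genes into putative operons.

    genes: iterable of (name, start, end, strand) with start < end.
    terminators: iterable of genome coordinates of predicted terminators.
    Returns a list of lists of gene names in genome order.
    """
    operons, current = [], []
    for gene in sorted(genes, key=lambda g: g[1]):
        if current:
            _, _, prev_end, prev_strand = current[-1]
            _, start, _, strand = gene
            terminator_between = any(prev_end < t < start for t in terminators)
            if strand != prev_strand or terminator_between:
                operons.append([g[0] for g in current])
                current = []
        current.append(gene)
    if current:
        operons.append([g[0] for g in current])
    return operons
```

A fuller treatment would also require a predicted promoter upstream of the first gene in each group before calling it an operon.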
tRNAs and codon bias
The genome sequences presented here display broad diversity in primary sequence. Orthologous ORFs can be detected for 45 to 85 percent of open reading frames between any pair of these genomes. Orthologous protein sequences are on average 65% similar between genomes. This diversity is comparable to that seen across vertebrate evolution. For example, humans and chickens share 60% orthologous genes at a median amino acid similarity of 75%. Humans and teleost fishes share approximately 55% orthologous genes. The two most closely related phage genomes analyzed here, T4 and RB69, share 80% orthologs of 81% similarity, a distance comparable to that between humans and mice. Despite the diversity of their predicted protein sequences, these five T4-like phage genomes share a highly conserved genome organization. Most orthologs of T4 genes were identified in the same gene order and orientation as the cistrons in T4. RB43 shows the largest number of exceptions to this observation. It appears that several genome rearrangements must have occurred in one or both of these phages since they diverged from their common ancestor.
The possibility of shared genetic regulatory elements among the T4-like phages was investigated by motif searches that identified putative promoter elements resembling T4 early and late promoters in all genomes. Late promoters were found exclusively 5' to conserved orthologs of T4 late genes. Many early promoters were found 5' to T4 early gene orthologs, but others were found 5' to novel ORFs. It thus appears that the early and late transcriptional modes are conserved among the T4-like phages. The novel ORFs appear to be coordinately expressed with early genes in all phages. The middle gene expression pathway appears to be less conserved among the T4-like phages. The middle promoter consensus was detected in RB69, and to a lesser degree in 44RR. The MotA protein product, required for recognition of the middle promoter Mot box, appears to be conserved only in T4, RB69 and 44RR.
The T4 genome is predicted to encode over 120 ORFs of unknown function. Of these, 11 were found to have homologs in all five of the genomes in our study. Given this level of conservation, these ORFs must encode products that are vital to the phage in some hosts or environments. We have identified putative functional domains for 5 of these ORFs based on matches to known Pfam domains. The candidate functions include nucleotide metabolism, host cell lysis, and gene regulation. An aggregate of about 70% of T4 ORFs are conserved in at least one other genome, suggesting that the protein products of these ORFs provide selective advantages to these phages. Conservation of these ORFs does not generally extend to more divergent phages than those analyzed here. Although several of these ORFs are conserved in KVP40, no matches were found in any of the marine phage genomes.
Each of the T4-like genomes we have examined, including T4, harbors a number of ORFs that are unique to that genome. In Aeh1, these novel ORFs comprise over half of the genome and most show no significant similarity to known sequences in GenBank. Functions identified for some novel ORFs suggest physiologically important roles in the phage life cycle, such as nucleotide metabolism, transcription and lateral DNA mobility. However, most novel ORFs have no known function or origin. It is thus unclear where these sequences arose, how they were acquired, and what function they might serve in the phage-infected cell. In many instances, regions containing novel ORFs were observed to be underrepresented in plasmid libraries constructed for shotgun sequencing and were only identified during PCR-based gap closure (data not shown). It would appear, then, that at least some novel ORFs in our study are deleterious to the host cell when expressed from high-copy plasmids. Some of the gene products of these ORFs may function in cell lysis or in commandeering host machinery for phage growth.
The mechanisms of gain and loss of ORFs by T4-like genomes in evolution may differ from those proposed for the genomes of other phages, such as the lambdoid phages . The novel lambdoid ORFs include "morons" – apparent short insertions of DNA consisting of an ORF flanked by transcriptional promoter and terminator signals. Moron DNAs are distinct from other lambdoid genes in %GC content, and thus appear to be recent acquisitions of genes by nonhomologous recombination with host DNA. In contrast, the majority of novel ORFs in T4-like phages do not resemble morons; they have a %GC that is indistinguishable from that of the rest of the phage genome (average %GC in RB69: ORFs, 36.9%; conserved, 37.6%) and thus do not appear to be recent acquisitions from the host. Another class of novel lambdoid ORFs appears to be chimeras of other phage genes . In the few instances where the T4-like novel ORFs have significant matches to other phage or GenBank proteins, the similarities generally extend over the entire length of the coding sequence rather than being restricted to the blocks of similarity found in chimeras. A better understanding of the origins of the novel ORFs in T4-like phages will provide clues to the mechanisms underlying the evolution of protein coding sequences and the biology of host-phage interactions.
The mechanisms by which T4-like phages acquire ORFs may also differ from those of the lambdoid phages. T4-like phages do not undergo lysogeny; thus they cannot acquire genes by imprecise excision from the host genome. They do not generally transduce host DNA as frequently as other phages, such as P22 , perhaps because of their propensity to hydrolyze host DNA. T4-like phages have a recombination-driven replication pathway that is facilitated by redundant DNA sequences at the chromosome ends. During replication, the redundant end sequences synapse with homologous regions of other replicating DNA molecules for further replication into long concatemers .
A variation of this pathway has been postulated as a mechanism for the lateral transfer of novel genes between related phages . However, the ultimate source of these novel genes remains unknown but may include bacterial hosts or bacteriophages encountered in coinfection. The failure to detect significant similarities between many of the novel ORFs described here and known bacterial genomes indicates that either these ORFs arose from bacterial hosts quite diverged from any known bacterium, or that bacterial genomes are not a major source for these ORFs. The latter appears to be more likely, at least in the case of novel ORFs identified in closely related phages, such as T4 and RB69. Unknown phages would seem a more likely source for many of these ORFs. Newly sequenced phage genomes often include numerous ORFs for which there is no known ortholog. Clearly, more phage genomes must be mined to incorporate more of their sequence diversity into the library of known sequence databases.
Our survey of a diverse set of T4-like phage genomes reveals similarities in general genome organization and gene regulation. Although a core of conserved ORFs was identified, the genome sequences exhibited a striking diversity of ORFs novel to each genome. The origins of this diversity have yet to be uncovered.
Bacteriophages and hosts
ORFs were detected primarily by use of the GeneMarkS program [11, 12]. The program was chosen based upon its accuracy in ORF prediction of the T4 genomic sequence by comparison to the GenBank accession (97% of ORFs recognized). When an orthologous gene was detected in a related phage genome, the predicted translational start sites were scrutinized for additional N-terminal protein sequences with significant similarity to orthologs upstream of the predicted translational start site. In these cases, the translational start site was adjusted to maximize the length of predicted amino acid similarity. Although prediction models were not based upon similarity between genomes, generally fewer than 5% of the predicted start sites required adjustment.
GeneMarkS predictions were compared with those obtained using Glimmer . There was general agreement between the predictions obtained with the two programs. Glimmer predicted more ORFs per genome, but in some cases the additional ORFs predicted were inconsistent with the direction of transcription of flanking genes, which is uncommon in T4  and appears unusual for the genomes sequenced here. Thus, the Glimmer predictions were used primarily to adjust GeneMarkS predictions as mentioned above, or in regions where Glimmer predicted an ORF and GeneMarkS predicted an unusually long (> 200 bp) intercistronic region.
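The check for unusually long intercistronic regions can be sketched as below. The input format (a list of (name, start, end) tuples with 1-based inclusive coordinates on a linearized genome) and the function name are assumptions for illustration; only the 200 bp threshold comes from the text. Regions flagged this way would be the ones where an alternative prediction is worth inspecting.

```python
def long_intercistronic_gaps(orfs, min_gap=200):
    """Return (left_orf, right_orf, gap_length) for gaps longer than min_gap.

    orfs: iterable of (name, start, end), 1-based inclusive coordinates.
    Gaps are measured between ORFs that are adjacent when sorted by start;
    nested or overlapping ORFs give a non-positive gap and are never reported.
    """
    ordered = sorted(orfs, key=lambda o: o[1])
    gaps = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        gap = right_start - left_end - 1  # bases strictly between the two ORFs
        if gap > min_gap:
            gaps.append((left, right, gap))
    return gaps
```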
Predicted ORFs were checked for similarity to T4 genes by blastp  mutual similarity. Genes with mutual best hit E-values < 10⁻⁴ to known T4 genes were designated by the T4 gene name. Putative genes without T4 orthologs were designated by their ORF numbers, with conserved gene rIIA designated as ORF001. The strand of each ORF is designated "w" for clockwise (left-to-right) transcribed genes, and "c" for counterclockwise (right-to-left) transcribed genes. In T4, the origin of the genome has been assigned to the rIIB – rIIA intercistronic region; the terminus of the genome is defined as the start of translation of the rIIB gene. The sequence origin of each genome sequenced here is defined as the termination codon of the rIIA gene.
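The mutual best hit criterion can be illustrated with a short sketch. It assumes the blastp results have already been parsed into (query, subject, E-value) tuples for each search direction; the function names and input layout are hypothetical, and only the E-value cutoff of 10⁻⁴ is taken from the text.

```python
def best_hits(hits, max_evalue=1e-4):
    """Map each query to its lowest E-value subject, keeping hits below the cutoff."""
    best = {}
    for query, subject, evalue in hits:
        if evalue < max_evalue and (query not in best or evalue < best[query][1]):
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(phage_vs_t4, t4_vs_phage, max_evalue=1e-4):
    """Return {phage_orf: t4_gene} for pairs that are each other's best hit."""
    forward = best_hits(phage_vs_t4, max_evalue)
    backward = best_hits(t4_vs_phage, max_evalue)
    return {orf: gene for orf, gene in forward.items() if backward.get(gene) == orf}
```

ORFs that appear in the returned mapping would take the T4 gene name; the rest would keep their ORF numbers.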
Genomes were also searched for tRNA genes using tRNAscan-SE . All genomes except that of RB49 had at least one putative tRNA gene.
DNA sequences are available through GenBank [Genbank:NC_005135] (44RR), [Genbank:NC_007023] (RB43), [Genbank:NC_004928] (RB69), [Genbank:NC_005260] (Aeh1), and [Genbank:NC_005066] (RB49). Additional analyses are available through the Tulane T4-like Genome Website http://phage.bioc.tulane.edu. Available data include an interactive genome browser , clustalW  alignments, EMBOSS pepstat statistics, octanol hydropathy plots , and HMMer Pfam matches .
We thank Guy Plunkett and Takashi Kunisawa for identifying putative lysidine-modified tRNA genes. JN thanks Eric Miller for numerous helpful discussions, and Candace Timpte for helpful comments on the manuscript. This work was supported by awards MCB-0138236 and EF-0333130 from the National Science Foundation to JDK.
- Büchen-Osmond C: T4-like viruses. ICTVdB - The Universal Virus Database, version 3 [http://www.ncbi.nlm.nih.gov/ICTVdb/ICTVdB/]
- Ackermann HW, Krisch HM: A catalogue of T4-type bacteriophages. Arch Virol 1997,142(12):2329-2345. 10.1007/s007050050246
- Miller ES, Kutter E, Mosig G, Arisaka F, Kunisawa T, Ruger W: Bacteriophage T4 Genome. Microbiol Mol Biol Rev 2003,67(1):86-156. 10.1128/MMBR.67.1.86-156.2003
- Hendrix RW: Bacteriophage genomics. Curr Opin Microbiol 2003,6(5):506-511. 10.1016/j.mib.2003.09.004
- Grossi GF, Macchiato MF, Gialanella G: Circular permutation analysis of phage T4 DNA by electron microscopy. Z Naturforsch [C] 1983,38(3-4):294-296.
- Mosig G: Homologous recombination. In Molecular Biology of Bacteriophage T4. Edited by: Karam JD, Drake JW, Kreuzer KN, Mosig G, Hall DH, Eiserling FA, Black LW, Spicer EK, Kutter E, Carlson K, Miller ES. Washington, D.C., American Society for Microbiology; 1994:54-82.
- Miller ES, Heidelberg JF, Eisen JA, Nelson WC, Durkin AS, Ciecko A, Feldblyum TV, White O, Paulsen IT, Nierman WC, Lee J, Szczypinski B, Fraser CM: Complete Genome Sequence of the Broad-Host-Range Vibriophage KVP40: Comparative Genomics of a T4-Related Bacteriophage. J Bacteriol 2003,185(17):5220-5233. 10.1128/JB.185.17.5220-5233.2003
- Mann NH, Clokie MR, Millard A, Cook A, Wilson WH, Wheatley PJ, Letarov A, Krisch HM: The genome of S-PM2, a "photosynthetic" T4-type bacteriophage that infects marine Synechococcus strains. J Bacteriol 2005,187(9):3188-3200. 10.1128/JB.187.9.3188-3200.2005
- Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW: Three Prochlorococcus cyanophage genomes: signature features and ecological interpretations. PLoS Biol 2005,3(5):e144. 10.1371/journal.pbio.0030144
- Desplats C, Krisch HM: The diversity and evolution of the T4-type bacteriophages. Res Microbiol 2003,154(4):259-267. 10.1016/S0923-2508(03)00069-X
- Mills R, Rozanov M, Lomsadze A, Tatusova T, Borodovsky M: Improving gene annotation of complete viral genomes. Nucleic Acids Res 2003,31(23):7041-7055. 10.1093/nar/gkg878
- Besemer J, Lomsadze A, Borodovsky M: GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res 2001,29(12):2607-2618. 10.1093/nar/29.12.2607
- Petrov V, Nolan JM, Bertrand C, Chin Levy D, Desplat C, Krisch H, Karam J: Divergence of the DNA replication genes among T4-like phage genomes. J Mol Biol 2006, in press.
- Edgar RS, Lielausis I: Temperature-Sensitive Mutants Of Bacteriophage T4d: Their Isolation And Genetic Characterization. Genetics 1964,49:649-662.
- Kawabata T, Arisaka F, Nishikawa K: Structural/functional assignment of unknown bacteriophage T4 proteins by iterative database searches. Gene 2000,259(1-2):223-233. 10.1016/S0378-1119(00)00442-X
- Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH: CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res 2002,30(1):281-283. 10.1093/nar/30.1.281
- Sickmier EA, Kreuzer KN, White SW: The crystal structure of the UvsW helicase from bacteriophage T4. Structure 2004,12(4):583-592. 10.1016/j.str.2004.02.016
- Kano-Sueoka T, Lobry JR, Sueoka N: Intra-strand biases in bacteriophage T4 genome. Gene 1999,238(1):59-64. 10.1016/S0378-1119(99)00296-6
- Sharp PM, Li WH: The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 1987,15(3):1281-1295.
- Xu W, Gauss P, Shen J, Dunn CA, Bessman MJ: The gene e.1 (nudE.1) of T4 bacteriophage designates a new member of the Nudix hydrolase superfamily active on flavin adenine dinucleotide, adenosine 5'-triphospho-5'-adenosine, and ADP-ribose. J Biol Chem 2002,277(26):23181-23185. 10.1074/jbc.M203325200
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004,340(4):783-795. 10.1016/j.jmb.2004.05.028
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL: Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001,305(3):567-580. 10.1006/jmbi.2000.4315
- Magnani E, Sjolander K, Hake S: From endonucleases to transcription factors: evolution of the AP2 DNA binding domain in plants. Plant Cell 2004,16(9):2265-2277. 10.1105/tpc.104.023135
- Rice P, Longden I, Bleasby A: EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics 2000,16(6):276-277. 10.1016/S0168-9525(00)02024-2
- Tiemann B, Depping R, Gineikiene E, Kaliniene L, Nivinskas R, Ruger W: ModA and ModB, two ADP-ribosyltransferases encoded by bacteriophage T4: catalytic properties and mutation analysis. J Bacteriol 2004,186(21):7262-7272. 10.1128/JB.186.21.7262-7272.2004
- Estrem ST, Ross W, Gaal T, Chen ZW, Niu W, Ebright RH, Gourse RL: Bacterial promoter architecture: subsite structure of UP elements and interactions with the carboxy-terminal domain of the RNA polymerase alpha subunit. Genes Dev 1999,13(16):2134-2147.
- Ross W, Aiyar SE, Salomon J, Gourse RL: Escherichia coli promoters with UP elements of different strengths: modular structure of bacterial promoters. J Bacteriol 1998,180(20):5375-5383.
- Estrem ST, Gaal T, Ross W, Gourse RL: Identification of an UP element consensus sequence for bacterial promoters. Proc Natl Acad Sci U S A 1998,95(17):9761-9766. 10.1073/pnas.95.17.9761
- Desplats C, Dez C, Tetart F, Eleaume H, Krisch HM: Snapshot of the genome of the pseudo-T-even bacteriophage RB49. J Bacteriol 2002,184(10):2789-2804. 10.1128/JB.184.10.2789-2804.2002
- Hinton DM, Pande S, Wais N, Johnson XB, Vuthoori M, Makela A, Hook-Barnard I: Transcriptional takeover by sigma appropriation: remodelling of the sigma70 subunit of Escherichia coli RNA polymerase by the bacteriophage T4 activator MotA and co-activator AsiA. Microbiology 2005,151(Pt 6):1729-1740. 10.1099/mic.0.27972-0
- Li N, Sickmier EA, Zhang R, Joachimiak A, White SW: The MotA transcription factor from bacteriophage T4 contains a novel DNA-binding domain: the 'double wing' motif. Mol Microbiol 2002,43(5):1079-1088. 10.1046/j.1365-2958.2002.02809.xView ArticlePubMedGoogle Scholar
- Adelman K, Brody EN, Buckle M: Stimulation of bacteriophage T4 middle transcription by the T4 proteins MotA and AsiA occurs at two distinct steps in the transcription cycle. Proc Natl Acad Sci U S A 1998,95(26):15247-15252. 10.1073/pnas.95.26.15247PubMed CentralView ArticlePubMedGoogle Scholar
- Truncaite L, Piesiniene L, Kolesinskiene G, Zajanckauskaite A, Driukas A, Klausa V, Nivinskas R: Twelve new MotA-dependent middle promoters of bacteriophage T4: consensus sequence revised. J Mol Biol 2003,327(2):335-346. 10.1016/S0022-2836(03)00125-6View ArticlePubMedGoogle Scholar
- Hinton DM, Vuthoori S: Efficient inhibition of Escherichia coli RNA polymerase by the bacteriophage T4 AsiA protein requires that AsiA binds first to free sigma70. J Mol Biol 2000,304(5):731-739.View ArticlePubMedGoogle Scholar
- Pineda M, Gregory BD, Szczypinski B, Baxter KR, Hochschild A, Miller ES, Hinton DM: A family of anti-sigma70 proteins in T4-type phages and bacteria that are similar to AsiA, a Transcription inhibitor and co-activator of bacteriophage T4. J Mol Biol 2004,344(5):1183-1197. 10.1016/j.jmb.2004.10.003View ArticlePubMedGoogle Scholar
- Williams KP, Kassavetis GA, Herendeen DR, Geiduschek EP: Regulation of late-gene expression. In Molecular Biology of Bacteriophage T4. Edited by: Karam JD, Drake JW, Kreuzer KN, Mosig G, Hall DH, Karam JD, Drake JW, Kreuzer KN, Mosig G, Hall DH, Eiserling FA, Black LW, Spicer EK, Kutter E, Carlson K, Miller ES. Washington, DC , American Society for Microbiology; 1994:161-175.Google Scholar
- Salzberg SL: ELPH, a motif finder that can find ribosome binding sites, exon splicing enhancers, or regulatory sites.[http://www.cbcb.umd.edu/software/ELPH/]
- Ermolaeva MD, Khalak HG, White O, Smith HO, Salzberg SL: Prediction of transcription terminators in bacterial genomes. J Mol Biol 2000,301(1):27-33. 10.1006/jmbi.2000.3836View ArticlePubMedGoogle Scholar
- Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997,25(5):955-964. 10.1093/nar/25.5.955PubMed CentralView ArticlePubMedGoogle Scholar
- Nureki O, Niimi T, Muramatsu T, Kanno H, Kohno T, Florentz C, Giege R, Yokoyama S: Molecular recognition of the identity-determinant set of isoleucine transfer RNA from Escherichia coli. J Mol Biol 1994,236(3):710-724. 10.1006/jmbi.1994.1184View ArticlePubMedGoogle Scholar
- Guthrie C, McClain WH: Rare transfer ribonucleic acid essential for phage growth. Nucleotide sequence comparison of normal and mutant T4 isoleucine-accepting transfer ribonucleic acid. Biochemistry 1979,18(17):3786-3795. 10.1021/bi00584a023View ArticlePubMedGoogle Scholar
- Fukada K, Abelson J: DNA sequence of a T4 transfer RNA gene cluster. J Mol Biol 1980,139(3):377-391. 10.1016/0022-2836(80)90136-9View ArticlePubMedGoogle Scholar
- Muramatsu T, Nishikawa K, Nemoto F, Kuchino Y, Nishimura S, Miyazawa T, Yokoyama S: Codon and amino-acid specificities of a transfer RNA are both converted by a single post-transcriptional modification. Nature 1988,336(6195):179-181. 10.1038/336179a0View ArticlePubMedGoogle Scholar
- Plunkett G, Rose DJ, Durfee TJ, Blattner FR: Sequence of Shiga toxin 2 phage 933W from Escherichia coli O157:H7: Shiga toxin as a phage late-gene product. J Bacteriol 1999,181(6):1767-1778.PubMed CentralPubMedGoogle Scholar
- Pedulla ML, Ford ME, Houtz JM, Karthikeyan T, Wadsworth C, Lewis JA, Jacobs-Sera D, Falbo J, Gross J, Pannunzio NR, Brucker W, Kumar V, Kandasamy J, Keenan L, Bardarov S, Kriakov J, Lawrence JG, Jacobs WRJ, Hendrix RW, Hatfull GF: Origins of highly mosaic mycobacteriophage genomes. Cell 2003,113(2):171-182. 10.1016/S0092-8674(03)00233-2View ArticlePubMedGoogle Scholar
- Hendrix RW, Lawrence JG, Hatfull GF, Casjens S: The origins and ongoing evolution of viruses. Trends Microbiol 2000,8(11):504-508. 10.1016/S0966-842X(00)01863-1View ArticlePubMedGoogle Scholar
- Wu H, Sampson L, Parr R, Casjens S: The DNA site utilized by bacteriophage P22 for initiation of DNA packaging. Mol Microbiol 2002,45(6):1631-1646. 10.1046/j.1365-2958.2002.03114.xView ArticlePubMedGoogle Scholar
- Mosig G, Gewin J, Luder A, Colowick N, Vo D: Two recombination-dependent DNA replication pathways of bacteriophage T4, and their roles in mutagenesis and horizontal gene transfer. Proc Natl Acad Sci U S A 2001,98(15):8306-8311. 10.1073/pnas.131007398PubMed CentralView ArticlePubMedGoogle Scholar
- Delcher AL, Harmon D, Kasif S, White O, Salzberg SL: Improved microbial gene identification with GLIMMER. Nucleic Acids Res 1999,27(23):4636-4641. 10.1093/nar/27.23.4636PubMed CentralView ArticlePubMedGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997,25(17):3389-3402. 10.1093/nar/25.17.3389PubMed CentralView ArticlePubMedGoogle Scholar
- Stein LD, Mungall C, Shu SQ, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A, Lewis S: The Generic Genome Browser: A Building Block for a Model Organism System Database. Genome Res 2002,12(10):1599-1610. 10.1101/gr.403602PubMed CentralView ArticlePubMedGoogle Scholar
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994,22(22):4673-4680.PubMed CentralView ArticlePubMedGoogle Scholar
- Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res 2002,30(1):276-280. 10.1093/nar/30.1.276PubMed CentralView ArticlePubMedGoogle Scholar
- Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: a sequence logo generator. Genome Res 2004,14(6):1188-1190. 10.1101/gr.849004PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.