Transcriptional control in the prereplicative phase of T4 development

Control of transcription is crucial for correct gene expression and orderly development. For many years, bacteriophage T4 has provided a simple model system to investigate mechanisms that regulate this process. Development of T4 requires the transcription of early, middle and late RNAs. Because T4 does not encode its own RNA polymerase, it must redirect the polymerase of its host, E. coli, to the correct class of genes at the correct time. T4 accomplishes this through the action of phage-encoded factors. Here I review recent studies investigating the transcription of T4 prereplicative genes, which are expressed as early and middle transcripts. Early RNAs are generated immediately after infection from T4 promoters that contain excellent recognition sequences for host polymerase. Consequently, the early promoters compete extremely well with host promoters for the available polymerase. T4 early promoter activity is further enhanced by the action of the T4 Alt protein, a component of the phage head that is injected into E. coli along with the phage DNA. Alt modifies Arg265 on one of the two α subunits of RNA polymerase. Although work with host promoters predicts that this modification should decrease promoter activity, transcription from some T4 early promoters increases when RNA polymerase is modified by Alt. Transcription of T4 middle genes begins about 1 minute after infection and proceeds by two pathways: 1) extension of early transcripts into downstream middle genes and 2) activation of T4 middle promoters through a process called sigma appropriation. In this activation, the T4 co-activator AsiA binds to Region 4 of σ70, the specificity subunit of RNA polymerase. This binding dramatically remodels this portion of σ70, which then allows the T4 activator MotA to also interact with σ70. 
In addition, AsiA restructuring of σ70 prevents Region 4 from forming its normal contacts with the -35 region of promoter DNA, which in turn allows MotA to interact with its DNA binding site, a MotA box, centered at the -30 region of middle promoter DNA. T4 sigma appropriation reveals how a specific domain within RNA polymerase can be remolded and then exploited to alter promoter specificity.


Background
Expression of the T4 genome is a highly regulated and elegant process that begins immediately after infection of the host. Major control of this expression occurs at the level of transcription. T4 does not encode its own RNA polymerase (RNAP), but instead encodes multiple factors, which serve to change the specificity of polymerase as infection proceeds. These changes correlate with the temporal regulation of three classes of transcription: early, middle, and late. Early and middle RNA is detected prereplicatively [previously reviewed in [1][2][3][4][5][6]], while late transcription is concurrent with T4 replication and discussed in another chapter. T4 early transcripts are generated from early promoters (Pe), which are active immediately after infection. Early RNA is detected even in the presence of chloramphenicol, an antibiotic that prevents protein synthesis. In contrast, T4 middle transcripts are generated about 1 minute after infection at 37°C and require phage protein synthesis. Middle RNA is synthesized in two ways: 1) activation of middle promoters (Pm) and 2) extension of Pe transcripts from early genes into downstream middle genes.
This review focuses on investigations of T4 early and middle transcription since those detailed in the last T4 book [1,5]. At the time of that publication, early and middle transcripts had been extensively characterized, but the mechanisms underlying their synthesis were just emerging. In particular, in vitro experiments had just demonstrated that activation of middle promoters requires a T4-modified RNAP and the T4 activator MotA [7,8]. Subsequent work has identified the needed RNAP modification as the tight binding of a 10 kDa protein, AsiA, to the σ 70 subunit of RNAP [9][10][11][12][13]. In addition, a wealth of structural and biochemical information about E. coli RNAP [reviewed in [14][15][16]], MotA, and AsiA [reviewed in [2]] has now become available. As detailed below, we now have a much more mechanistic understanding of the process of prereplicative T4 transcription. To understand this process, we first start with a review of the host transcriptional machinery and RNAP.
The E. coli transcriptional machinery
E. coli RNAP holoenzyme, like all bacterial RNAPs, is composed of a core of subunits (β, β', α1, α2, and ω), which contains the active site for RNA synthesis, and a specificity factor, σ, which recognizes promoters within the DNA and sets the start site for transcription. The primary σ, σ70 in E. coli, is used during exponential growth; alternate σ factors direct transcription of genes needed during different growth conditions or times of stress [reviewed in [17][18][19]]. Sequence/function analyses of hundreds of σ factors have identified various regions and subregions of conservation. Most σ factors share similarity in Regions 2-4, the central through C-terminal portion of the protein, while primary σ factors also have a related N-terminal portion, Region 1.
Recent structural information, together with previous and ongoing biochemical and genetic work [reviewed in [14,15,20,21]], has resulted in a biomolecular understanding of RNAP function and the process of transcription. Structures of holoenzyme, core, and portions of the primary σ of thermophilic bacteria with and without DNA [15,16,[22][23][24][25][26][27][28]], and structures of regions of E. coli σ70 alone [29] and in a complex with other proteins [26,30] are now available. This work indicates that the interface between σ70 and core within the RNAP holoenzyme is extensive (Figure 1). It includes contact between a portion of σ Region 2 and a coiled-coil domain composed of β and β', an interaction of σ70 Region 1.1 within the "jaws" in the downstream DNA channel (where DNA downstream of the transcription start site will be located when RNAP binds the promoter), and an interaction between σ70 Region 4 and a portion of the β subunit called the β-flap.
Figure 1 RNAP holoenzyme and the interaction of RNAP with σ70-dependent promoters. Structure-based cartoons (left to right) depict RNAP holoenzyme, RPc (closed complex), RPo (open complex), and EC (elongating complex) with σ70 in yellow, core (β, β', α2, and ω) in turquoise, DNA in magenta, and RNA in purple. In holoenzyme, the positions of σ70 Regions 1.1, 2, 3, and 4, the α-CTDs, the β-flap, and the β,β' jaws are identified. In RPc, contact can be made between RNAP and promoter dsDNA elements: two UP elements with each of the α-CTDs, the -35 element with σ70 Region 4, TGn (positions -15 to -13) with σ70 Region 3, and positions -12/-11 of the -10 element with σ70 Region 2. σ70 Region 1.1 lies in the downstream DNA channel formed by portions of β and β' and the β,β' jaws are open. In RPo, unwinding of the DNA and conformational changes within RNAP result in a sharp bend of the DNA into the active site with the formation of the transcription bubble surrounding the start of transcription, the interaction of σ70 Region 2 with nontemplate ssDNA in the -10 element, movement of Region 1.1 from the downstream DNA channel, and contact between the downstream DNA and the β' clamp. In EC, σ70 and the promoter DNA have been released. The newly synthesized RNA remains annealed to the DNA template in the RNA/DNA hybrid as the previously synthesized RNA is extruded through the RNA exit channel past the β-flap.
For transcription to begin, portions of RNAP must first recognize and bind to double-stranded (ds) DNA recognition elements present within promoter DNA (Figure 1) [reviewed in [20]]. Each of the C-terminal domains of the α subunits (α-CTDs) can interact with an UP element, A/T rich sequences present between positions -40 and -60. Portions of σ70, when present in RNAP, can interact with three different dsDNA elements. A helix-turn-helix, DNA binding motif in σ70 Region 4 can bind to the -35 element, σ70 Region 3 can bind to a -15TGn-13 sequence (TGn), and σ70 subregion 2.4 can bind to positions -12/-11 of a -10 element. Recognition of the -35 element also requires contact between residues in σ70 Region 4 and the β-flap in order to position σ70 correctly for simultaneous contact of the -35 and the downstream elements. Typically, a promoter only needs to contain two of the three σ70-dependent elements for activity; thus, E. coli promoters can be loosely classified as -35/-10 (the major class), TGn/-10 (also called extended -10), or -35/TGn [reviewed in [20]].
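The loose promoter classification described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_promoter`, its boolean inputs, and the label returned when fewer than two elements are present are assumptions made for this sketch, not terminology from the literature.

```python
def classify_promoter(has_minus35: bool, has_tgn: bool, has_minus10: bool) -> str:
    """Label a promoter by which sigma70-dependent elements it contains.

    Hypothetical helper: a promoter is assumed to need at least two of the
    three elements (-35, TGn, -10) to be considered active.
    """
    elements = (("-35", has_minus35), ("TGn", has_tgn), ("-10", has_minus10))
    present = [name for name, found in elements if found]
    if len(present) < 2:
        # Fewer than two recognition elements: not expected to be active alone.
        return "insufficient elements"
    # Join in upstream-to-downstream order, e.g. "-35/-10" or "TGn/-10".
    return "/".join(present)


if __name__ == "__main__":
    print(classify_promoter(True, False, True))   # major class
    print(classify_promoter(False, True, True))   # extended -10
    print(classify_promoter(True, True, False))
```

A promoter carrying all three elements is reported here as "-35/TGn/-10", which is a labeling choice of the sketch rather than a recognized class name.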
The initial binding of RNAP to the dsDNA promoter elements usually results in an unstable, "closed" complex (RPc) (Figure 1). Creation of the stable, "open" complex (RPo) requires bending and unwinding of the DNA [31] and major conformational changes (isomerization) of the polymerase (Figure 1) [[32,33]; reviewed in [20]]. In RPo the unwinding of the DNA creates the transcription bubble from -11 to ~+3, exposing the single-stranded (ss) DNA template for transcription. Addition of ribonucleoside triphosphates (rNTPs) then results in the synthesis of RNA, which remains as a DNA/RNA hybrid for about 8-9 bp. Generation of longer RNA initiates extrusion of the RNA through the RNA exit channel formed by portions of β and β' within core. Since this channel includes the σ70-bound β-flap, it is thought that the passage of the RNA through the channel helps to release σ from core, facilitating promoter clearance. The resulting elongation complex, EC, contains core polymerase, the DNA template, and the synthesized RNA (Figure 1) [reviewed in [34]]. The EC moves rapidly along the DNA at about 50 nt/sec, although the complex can pause, depending on the sequence [35]. Termination of transcription occurs either at an intrinsic termination signal, a stem-loop (hairpin) structure followed by a U-rich sequence, or a Rho-dependent termination signal [reviewed in [36,37]]. The formation of the RNA hairpin by an intrinsic terminator sequence may facilitate termination by destabilizing the RNA/DNA hybrid. Rho-dependent termination is mediated through the interaction of Rho protein with a rut site (Rho utilization sequence), an unstructured, sometimes C-rich sequence that lies upstream of the termination site. After binding to the RNA, Rho uses ATP hydrolysis to translocate along the RNA, catching up with the EC at a pause site. Exactly how Rho dissociates a paused complex is not yet fully understood; the DNA:RNA helicase activity of Rho may provide a force to "push" RNAP off the DNA.
Rho alone is sufficient for termination at some Rho-dependent termination sites. However, at other sites the termination process also needs the auxiliary E. coli proteins NusA and/or NusG [reviewed in [36]].
When present in intergenic regions, rut sites are readily available to interact with Rho. However, when present in protein-coding regions, these sites can be masked by translating ribosomes. In this case, Rho termination is not observed unless the upstream gene is not translated, for example, when a mutation has generated a nonsense codon. In such a case, Rho-dependent termination can prevent transcription from extending into the downstream gene. Thus, in this situation, which is called polarity [38], expression of both the upstream mutated gene and the downstream gene is prevented.

T4 early transcription
Early promoters
T4 only infects exponentially growing E. coli, and transcription of T4 early genes begins immediately after infection. Thus, for an efficient infection, the phage must rapidly redirect the σ70-associated RNAP, which is actively engaged in transcription of the host genome, to the T4 early promoters. This immediate takeover is successful in part because most T4 early promoters contain excellent matches to the σ70-RNAP recognition elements (-35, TGn, and -10 elements) and to the α-CTD UP elements (Figure 2; for lists of T4 early promoter sequences, see [4,5]). However, sequence alignments of T4 early promoters reveal additional regions of consensus, suggesting that they contain other bits of information that can optimize the interaction of host RNAP with the promoter elements. Consequently, unlike most host promoters that belong to the -35/-10, TGn/-10 or -35/TGn class, T4 early promoters can be described as "über" UP/-35/TGn/-10 promoters. Indeed, most T4 early promoters compete extremely well with the host promoters for the available RNAP [39] and are similar to other very strong phage promoters, such as T7 PA1 and λ PL.
Figure 2 (partial caption): [...] [20,150]. Below, similar consensus sequences found in T4 early [4] and middle [91] promoters are in black and differences are in red; the MotA box consensus sequence in T4 middle promoters is in green. Spacer lengths between the TGn elements and the -35 elements (host and T4 early) or the MotA box are indicated. W = A or T; R = A or G; Y = C or T; n = any nucleotide; an uppercase letter represents a more highly conserved base.

The T4 Alt protein
Besides the sheer strength of its early promoters, T4 has another strategy, the Alt protein, to establish transcriptional dominance [ [40][41][42][43], reviewed in [1,4]]. Alt, a mono-ADP-ribosyltransferase, ADP-ribosylates a specific residue, Arg265, on one of the two α subunits of RNAP. In addition, Alt modifies a fraction of other host proteins, including the other RNAP subunits and host proteins involved in translation and cell metabolism. Alt is an internal phage head protein that is injected with the phage DNA. Consequently, Alt modification occurs immediately after infection and does not require phage protein synthesis. Each α subunit is distinct (one α interacts with β while the other interacts with β') and Alt modification is thought to specifically target a particular α, although which particular α is not known.
What is the purpose of Alt modification? The major Alt target, α Arg265, has been shown to be crucial for the interaction of an α-CTD with a promoter UP element [44][45][46] and with some host activators, including cAMP receptor protein (CRP), a global regulator of E. coli [46,47]. Thus, an obvious hypothesis is that Alt simply impairs host promoters that either need these activators or are enhanced by α-CTD/UP element interaction. However, overexpression of Alt from a plasmid does not affect E. coli growth [40], and general transcription of E. coli DNA in vitro is not impaired when using Alt-modified RNAP [48]. Instead, it appears that Alt modification is helpful because it increases the activity of certain T4 early promoters. This 2-fold enhancement of activity has been observed both in vivo [40,49] and in vitro [48]. How Alt modification stimulates particular early promoters is not known, but it is clear that it is not simply due to their general strength. Other strong promoters, such as Ptac, T7 PA1 and PA2, T5 P207, and even some of the T4 early promoters, are unaffected when using Alt-modified RNAP [49]. Alt-mediated stimulation of a promoter is also not dependent on specific σ70-dependent elements (-35, TGn, and -10 elements); some promoters with identical sequences in these regions are stimulated by Alt while others are not [49]. A comprehensive mutational analysis of the T4 early promoter P8.1 and Ptac reveals that no single promoter position is responsible for the Alt effect. This result suggests either that the mechanism of Alt stimulation involves cross-talk between RNAP and more than one promoter region [50] or that ADP-ribosylation of α Arg265 is a secondary, less significant activity of Alt. Additional work on the importance of this injected enzyme is needed.

Continuing early strategies for T4 domination
Because T4 promoters are so efficient at out-competing those of the host, a burst of immediate early transcription occurs within the first minute of infection. From this transcription follows a wave of early products that continue the phage takeover of the host transcriptional machinery. One such product is the T4 Alc protein, a transcription terminator that is specific for dC-containing DNA, that is, DNA that contains unmodified cytosines. Consequently, Alc terminates transcription from host DNA without affecting transcription from T4 DNA, whose cytosines are hydroxymethylated and glucosylated [[51,52]; reviewed in [1,4]]. Alc directs RNAP to terminate at multiple, frequent, and discrete sites along dC-containing DNA. The mechanism of Alc is not known. Unlike other terminating factors, Alc does not appear to interact with either RNA or DNA, and decreasing the rate of RNA synthesis or RNAP pausing near an Alc termination site actually impairs Alc termination [51]. Mutations within an N-terminal region of the β subunit of RNAP, a region that is not essential for E. coli (dispensable region I), prevent Alc-mediated termination, suggesting that an interaction site for Alc may reside in this region [52].
T4 also encodes two other ADP-ribosylating enzymes, ModA and ModB, as early products. Like Alt, ModA modifies Arg265 of RNAP α [[53,48]; reviewed in [1,4]]. However, unlike Alt, ModA almost exclusively targets the RNAP α subunits. In addition, ModA modifies both α subunits so there is no asymmetry to ModA modification. Synthesis of ModA is highly toxic to E. coli. In vitro, ModA-modified RNAP is unable to interact with UP elements or to interact with CRP [cited in [40]] and is less active than unmodified RNAP when using either E. coli or T4 DNA [48]. Thus, it has been suggested that ModA helps to diminish both host and T4 early promoter activity, reprogramming the transcriptional machinery for the coming wave of middle transcription [48]. However, a deletion of the modA gene does not affect the rapid decrease in early transcription or the decrease in the synthesis of early gene products, which begins about 3 minutes post-infection [54]. This result suggests that the phage employs other as yet unknown strategies to stop transcription from early promoters. ModB, the other early ADP-ribosylating enzyme, targets host translation factors, the ribosomal protein S1 and trigger factor, which presumably helps to facilitate T4 translation [43].
Finally, many of the early transcripts include genes of unknown function and come from regions of the T4 genome that are not essential for infection of wild type (wt) E. coli under normal laboratory conditions. Presumably, these genes encode phage factors that are useful under specific growth conditions or in certain strains. Whether any of these gene products aid T4 in its takeover of the host transcriptional machinery is not known.

The switch to middle transcription
Within a minute of infection at 37°C, some of the T4 early products mediate the transition from early to middle gene expression. As detailed below, the MotA activator and AsiA co-activator are important partners in this transition, since they direct RNAP to transcribe from middle promoters. In addition, the ComC-α protein, described later, may also have a role in the extension of early RNAs into downstream middle genes or the stability of such transcripts once they are formed.
As middle transcription begins, certain early RNAs decay rapidly after their initial burst of transcription. This arises from the activity of the early gene product RegB, an endoribonuclease, which specifically targets some T4 early mRNAs. For the mRNAs of MotA and RegB itself, a RegB cleavage site lies within the Shine-Dalgarno sequence; for ComC-α mRNA, the site is within AU-rich sequences upstream and downstream of this sequence [55]. The mechanism by which RegB recognizes and chooses the specific cleavage site is not yet known.
The onset of T4 middle transcription also finishes the process of eliminating host transcription by simply removing the host DNA template for RNAP. T4-encoded nucleases, primarily EndoII encoded by denA and EndoIV encoded by denB, selectively degrade the dC-containing host DNA ([56,57] and references therein). Thus, a few minutes after infection, there is essentially no host DNA to transcribe.

Transcription of middle genes from T4 middle promoters
Middle promoters
Middle genes primarily encode proteins needed for replication, recombination, and nucleotide metabolism; various T4-encoded tRNAs; and transcription factors that program the switch from middle to late promoter activation. Middle RNAs arise by two pathways: extension of early transcription into middle genes (discussed later) and the activation of T4 middle promoters by a process called σ appropriation [2]. To date, nearly 60 middle promoters have been identified (Table 1). Unlike early promoters, T4 middle promoters contain a host element, the σ70-dependent -10 sequence, and a phage element, a MotA box, which is centered at -30 and replaces the σ70-dependent -35 element present in T4 early promoters and most host promoters (Figure 2). In addition, about half of the middle promoters also contain TGn, the extended -10 sequence. Activation of the phage middle promoters requires the concerted effort of two T4 early products, AsiA and MotA.
AsiA, the co-activator of T4 middle transcription
AsiA (Audrey Stevens inhibitor or anti-sigma inhibitor) is a small protein of 90 residues. It was originally identified as a 10 kDa protein that binds very tightly to the σ70 subunit of RNAP [11,58,59] with a ratio of 1:1 [60]. Later work indicated that a monomer of AsiA binds to C-terminal portions of σ70, Regions 4.1 and 4.2 [26,[60][61][62][63][64][65][66][67][68][69][70]]. In solution, AsiA is a homodimer whose self-interaction face is composed of mostly hydrophobic residues within the N-terminal half of the protein [65,71]. A similar face of AsiA interacts with σ70 [26], suggesting that upon binding to σ70, a monomer of AsiA in the homodimer simply replaces its partner for σ70. Curiously, the AsiA structure also contains a helix-turn-helix motif (residues 30 to 59), suggesting the possibility of an interaction between AsiA and DNA [71]. However, as yet, no such interaction has been detected. Multiple contacts make up the interaction between AsiA and σ70 Region 4 (Figure 3A). The NMR structure (Figure 3B, right) reveals that 18 residues present in three α helices within the N-terminal half of AsiA (residues 10 to 42) contact 17 residues of σ70 [26]. Biochemical analyses have confirmed that AsiA residues E10, V14, I17, L18, K20, F21, F36, and I40, which contact σ70 Region 4 in the structure, are indeed important for the AsiA/σ70 interaction and/or for AsiA transcriptional function in vitro [72][73][74]. Of all of these residues, I17 appears to be the most important, and thus, has been termed "the linchpin" of the AsiA/σ70 Region 4 interaction [74]. A mutant AsiA missing the C-terminal 17 residues is as toxic as the full length protein when expressed in vivo [72,75], and even a mutant missing the C-terminal 44 residues is still able to interact with σ70 Region 4 and to co-activate transcription weakly [72]. These results are consistent with the idea that only the N-terminal half of AsiA is absolutely required to form a functional AsiA/σ70 complex.
Together, the structural and biochemical work indicate that there is an extensive interface between the N-terminal half of AsiA and σ 70 Region 4, consistent with the early finding that AsiA copurifies with σ 70 until urea is added to dissociate the complex [76].
The σ 70 face of the AsiA/σ 70 complex includes residues in Regions 4.1 and 4.2 that normally contact the -35 DNA element or the β-flap of core [26] (Figure 3). Mutations within Region 4.1 or Region 4.2, which are at or near the AsiA contact sites in σ 70 , impair or eliminate AsiA function [77][78][79], providing biochemical evidence for these interactions. The structure of the AsiA/ σ 70 Region 4 complex also reveals that AsiA binding dramatically changes the conformation of σ 70 Region 4, converting the DNA binding helix-turn-helix ( Figure 3B, left) into one continuous helix ( Figure 3B, right). Such a conformation would be unable to retain the typical σ 70 contacts with either the -35 DNA or with the β-flap. Thus, the association of AsiA with σ 70 should inhibit the binding of RNAP with promoters that depend on recognition of a -35 element. Indeed, early observations showed that AsiA functions as a transcriptional inhibitor at most promoters in vitro [9,10], blocking RPc formation [60], but TGn/-10 promoters, which are independent of a RNAP/-35 element contact, are immune to AsiA [62,66,80]. However, this result is dependent on the buffer conditions. In the presence of glutamate, a physiologically relevant anion that is known to facilitate protein-protein and protein-DNA interactions [81,82], extended incubations of AsiA-associated RNAP with -10/-35 and -35/TGn promoters eventually result in the formation of transcriptionally competent, open complexes that contain AsiA [72,83]. Under these conditions, AsiA inhibition works by significantly slowing the rate of RPo formation [83]. However, the formation of these complexes still relies on DNA recognition elements other than the -35 element (UP, TGn, and -10 elements), again demonstrating that AsiA specifically targets the interaction of RNAP with the -35 DNA.
Because AsiA strongly inhibits transcription from -35/-10 and -35/TGn promoters, expression of plasmid-encoded AsiA is highly toxic in E. coli. Thus, during infection, AsiA may serve to significantly inhibit host transcription. Although it might be reasonable to suppose that AsiA performs the same role at T4 early promoters, this is not the case. The shut-off of early transcription, which occurs a few minutes after infection, is still observed in a T4 asiA⁻ infection [54], and early promoters are only modestly affected by AsiA in vitro [84]. This immunity to AsiA is probably due to the multiple RNAP recognition elements present in T4 early promoters (Figure 2). Thus, AsiA inhibition does not significantly contribute to the early to middle promoter transition. AsiA also does not help to facilitate the replacement of σ70 by the T4-encoded late σ factor, which is needed for T4 late promoter activity [85], indicating that AsiA is not involved in the middle to late promoter transition.
Although AsiA was originally designated as an "antisigma" factor and is still frequently referred to as such, it is important to note that it behaves quite differently from classic anti-sigma factors. Unlike these factors, its binding to σ 70 does not prevent the σ 70 /core interaction; it does not sequester σ 70 . Instead it functions as a member of the RNAP holoenzyme. Consequently, AsiA is more correctly designated as a co-activator rather than an anti-sigma factor, and its primary role appears to be in activation rather than inhibition.

MotA, the transcriptional activator for middle promoters
The T4 motA (modifier of transcription) gene was first identified from a genetic selection developed to isolate mutations in T4 that increase the synthesis of the early gene product rIIA [86]. In fact, expression of several early genes increases in the T4 motA⁻ infection, presumably because of a delay in the shift from early to middle transcription [87]. MotA is a basic protein of 211 amino acids, which is expressed as an early product [88]. The MotA mRNA is cleaved within its Shine-Dalgarno sequence by the T4 nuclease, RegB. Consequently, the burst of MotA protein synthesis, which occurs within the first couple of minutes of infection [55], must be sufficient for all the subsequent MotA-dependent transcription.
The structure of the N-terminal half of MotA, MotA NTD, shows five α-helices, with helices 1, 3, 4, and 5 packing around the central helix 2 [93]. The C-terminal half, MotA CTD, binds MotA box DNA [97] and consists of a saddle-shaped, 'double wing' motif, three α-helices interspersed with six β-strands [94]. As information about MotA-dependent activation has emerged, it has become apparent that MotA differs from other activators of bacterial RNAP in several important aspects. The unique aspects of MotA are discussed below.
1) MotA tolerates deviations within the MotA box consensus sequence
Early work [[3,99]; reviewed in [1]] identified a highly conserved MotA box sequence of (a/t)(a/t)TGCTT(t/c)a with an invariant center CTT based on more than twenty T4 middle promoters. However, subsequent mutational analyses revealed that most single bp changes within the consensus sequence, even within the center CTT, are well-tolerated for MotA binding and activation in vitro [100]. Furthermore, several active middle promoters have been identified whose MotA boxes deviate significantly from the consensus, confirming that MotA is indeed tolerant of bp changes in vivo [91,[100][101][102]].
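The MotA box consensus (a/t)(a/t)TGCTT(t/c)a and the tolerance of single bp deviations can be illustrated with a short mismatch-counting scan. This is a minimal sketch, not a published search method: the names `MOTA_BOX_CONSENSUS`, `count_mismatches`, and `find_mota_boxes`, the equal weighting of all nine positions, and the test sequences are assumptions made for illustration.

```python
# Allowed bases at each of the nine consensus positions: (a/t)(a/t)TGCTT(t/c)a
MOTA_BOX_CONSENSUS = ["AT", "AT", "T", "G", "C", "T", "T", "TC", "A"]


def count_mismatches(site: str) -> int:
    """Count positions in a 9-nt site that deviate from the consensus."""
    site = site.upper()
    if len(site) != len(MOTA_BOX_CONSENSUS):
        raise ValueError("site must be 9 nucleotides long")
    return sum(base not in allowed for base, allowed in zip(site, MOTA_BOX_CONSENSUS))


def find_mota_boxes(sequence: str, max_mismatches: int = 1):
    """Return (start, site, mismatches) for every window within the mismatch limit."""
    sequence = sequence.upper()
    width = len(MOTA_BOX_CONSENSUS)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        mismatches = count_mismatches(site)
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence with one exact consensus match at index 3.
    print(find_mota_boxes("GGGAATGCTTTAGGG", max_mismatches=0))
```

Treating every position as equally important is a simplification; a position-weighted score would be needed to reflect differences in conservation between positions.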
An examination of the recognized base determinants within the MotA box has revealed that MotA senses minor groove moieties at positions -32 and -33 and major groove determinants at positions -28 and -29 [103]. (For this work, the MotA box was located at positions -35 to -26, its position when it is present 13 bp upstream of the -10 element.) In particular, the 5-Me on -29 T contributes to MotA binding. However, despite its high conservation, there seems to be little base recognition of -31 G:C, -30 C:G at the center of the MotA box. In wt T4 DNA, each cytosine in this sequence is modified by the presence of a hydroxymethylated, glucosylated moiety at cytosine position 5. This modification places a large, bulky group within the major groove, making it highly unlikely that MotA could contact a major groove base determinant at these positions. In addition, MotA binds and activates transcription using unmodified DNA; thus, the modification itself cannot be required for function. However, for two specific sequences, DNA modification does seem to affect MotA activity. One case is the middle promoter upstream of gene 46, P46. The MotA box within P46 contains the unusual center sequence ACTT rather than the consensus GCTT. MotA binds a MotA box with the ACTT sequence poorly, and MotA activation of P46 in vitro using wt T4 DNA is significantly better than that observed with unmodified DNA [100]. These results suggest that DNA modification may be needed for full activity of the ACTT MotA box motif. On the other hand, when using unmodified DNA in vitro, MotA binds a MotA box with a center sequence of GATT nearly as well as one with the consensus GCTT sequence, and a promoter with the GATT motif is fully activated by MotA in vitro. However, several potential T4 middle promoter sequences with a GATT MotA box and an excellent σ 70 -dependent -10 element are present within the T4 genome, but these promoters are not active [100]. 
This result suggests that the cytosine modification opposite the G somehow "silences" GATT middle promoter sequences. 2) MotA is not a strong DNA-binding protein In contrast to many other well-characterized activators of E. coli RNAP, MotA has a high apparent dissociation constant for its binding site (100 -600 nM [92,103,104]), and a large excess of MotA relative to DNA is needed to detect a MotA/DNA complex in a gel retardation assay or to detect protein protection of the DNA in footprinting assays [90]. In contrast, stoichiometric levels of MotA are sufficient for transcription in vitro [90]. These results are inconsistent with the idea that the tight binding of MotA to a middle promoter recruits AsiA-associated RNAP for transcription. In fact, in nuclease protection assays, MotA binding to the MotA box of a middle promoter is much stronger in the presence of AsiA and RNAP than with MotA alone [89,90]. Furthermore, in contrast to the sequence deviations permitted within the MotA box, nearly all middle promoters have a stringent requirement for an excellent match to the σ 70 -dependent -10 element [91,100,101]. This observation suggests that the interaction of σ 70 Region 2.4 with its cognate -10 sequence contributes at least as much as MotA binding to the MotA box in the establishment of a stable RNAP/MotA/AsiA/Pm complex.
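To see why an apparent dissociation constant of 100-600 nM implies that a large excess of MotA is needed to detect a MotA/DNA complex, a simple 1:1 binding isotherm can be worked through. The model is an assumption of this sketch, not an analysis from the cited studies: it takes the DNA concentration to be far below the Kd so that free MotA is approximately total MotA, and the function names are hypothetical.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Fraction of DNA sites occupied under a simple 1:1 binding model.

    Assumes DNA is far below Kd, so free protein ~ total protein.
    """
    return protein_nM / (kd_nM + protein_nM)


def protein_needed(target_fraction: float, kd_nM: float) -> float:
    """Protein concentration (nM) needed to occupy a target fraction of sites."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return kd_nM * target_fraction / (1.0 - target_fraction)


if __name__ == "__main__":
    # Kd values span the apparent range quoted in the text (100-600 nM).
    for kd in (100.0, 600.0):
        needed = protein_needed(0.9, kd)
        print(f"Kd = {kd:.0f} nM: about {needed:.0f} nM MotA for 90% occupancy")
```

Under these assumptions, reaching 90% occupancy takes a protein concentration of about nine times the Kd, which is far above the DNA concentrations typical of gel retardation or footprinting assays.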
Recent structural and biochemical work has indicated that a basic/hydrophobic cleft within MotA NTD contains the molecular face that interacts with σ70 H5 (Figure 3C, right). Mutation of MotA residues K3, K28, or Q76, which lie in this cleft, impairs the ability of MotA to interact with σ70 H5 and to activate transcription, and renders the protein incapable of complementing a T4 motA⁻ phage for growth [104]. Interestingly, substitutions of MotA residues D30, F31, and D67, which lie on another exposed surface outside of this cleft, also have deleterious effects on the interaction with σ70, transcription, and/or phage viability [98,104]. These residues are contained within a hydrophobic, acidic patch, which may also be involved in MotA activation or another unidentified function of MotA.

The process of sigma appropriation
MotA-dependent activation occurs through a novel process called sigma appropriation [reviewed in [2]]. Insight into this process began with the finding that some middle promoters function in vitro with RNAP alone. The middle promoter P uvsX, which is positioned upstream of the T4 recombination gene uvsX, is such a promoter [13]. This promoter is active because it has UP elements and a perfect -10 element to compensate for its weak homology to a σ70 -35 sequence. (It should be noted that significant activity of P uvsX and other middle promoters in the absence of MotA/AsiA is only seen when using unmodified DNA, because the modification present in T4 DNA obscures needed major groove contacts for RNAP.) Using unmodified P uvsX DNA, it has been possible to investigate how the presence of MotA and AsiA, alone and together, affects the interactions between RNAP and a middle promoter [72,89,90,103]. The RPo formed by RNAP and P uvsX exhibits protein/DNA contacts that are similar to those seen using a typical -35/-10 promoter; addition of MotA in the absence of AsiA does not significantly alter these contacts. As expected, addition of AsiA without MotA inhibits the formation of a stable complex. However, in the presence of both MotA and AsiA, a unique RPo is observed. This MotA/AsiA activated complex has the expected interactions between RNAP and the -10 element, but it has unique protein-DNA interactions upstream of the -10 element. In particular, σ70 Region 4 does not make its usual contacts with the -35 element DNA; rather, MotA binds to the MotA box that overlaps the -35 sequence. As expected, when using fully ADP-ribosylated RNAP there is an abrupt loss of footprint protection just upstream of the MotA box in P uvsX, consistent with the loss of UP element interactions when both α-CTDs are modified; when using RNAP that has not been ADP-ribosylated, the UP elements in P uvsX are protected.
Taken together, these biochemical studies argued that within the activated complex, σ70 Region 2.4 binds tightly to the σ70-dependent -10 element, but the MotA/MotA box interaction is somehow able to replace the contact that is normally made between σ70 Region 4 and the -35 DNA (Figure 4) [89,103]. The subsequent AsiA/σ70 Region 4 structure [26] (Figure 3B, right) shows just how this can be done. Through its multiple contacts with σ70 residues in Regions 4.1 and 4.2, AsiA remodels Region 4 of σ70. When the AsiA/σ70 complex then binds to core, σ70 Region 4 is incapable of forming its normal contacts with -35 element DNA (Figure 3B, left). In addition, the restructuring of σ70 Region 4 prevents its interaction with the β-flap, allowing the far C-terminal region H5 of σ70 to remain available for its interaction with MotA. Consequently, in the presence of AsiA-associated RNAP, MotA can interact both with the MotA box and with σ70 H5 [77,97,104].
Recent work has suggested that additional portions of AsiA, MotA and RNAP may be important for σ appropriation. First, the C-terminal region of AsiA (residues 74-90) may contribute to activation at P uvsX by directly interacting both with the β-flap and with MotA NTD. In particular, the AsiA N74D substitution reduces an AsiA/β-flap interaction observed in a 2-hybrid assay and impairs the ability of AsiA to inhibit transcription from a -35/-10 promoter in vitro [116]. This mutation also renders AsiA defective in co-activating transcription from P uvsX in vitro if it is coupled with a σ70 F563Y substitution that weakens the interaction of AsiA with σ70 Region 4 [117]. On the other hand, an AsiA protein with either an M86T or R82E substitution has a reduced capacity to interact with MotA NTD in a 2-hybrid assay and yields reduced levels of MotA/AsiA activated transcription from P uvsX in vitro [118]. The M86 and R82 mutations do not affect the interaction of AsiA with σ70 or with the β-flap, and they do not compromise the ability of AsiA to inhibit transcription [118], suggesting that they specifically affect the interaction with MotA. These results argue that AsiA serves as a bridge, which connects σ70, the β-flap, and MotA. However, in other experiments, MotA/AsiA activation of P uvsX is not affected when using AsiA proteins with deletions of this C-terminal region (Δ79-90 and Δ74-90), and even AsiA Δ47-90 still retains some ability to co-activate transcription [72]. Furthermore, the C-terminal half of the AsiA ortholog of the vibrio phage KVP40 (discussed below) has little or no sequence homology with its T4 counterpart, yet in the presence of T4 MotA and E. coli RNAP, it effectively co-activates transcription from P uvsX in vitro [119]. In addition, NMR analyses indicate that the addition of MotA to the AsiA/σ70 Region 4 complex does not significantly perturb chemical shifts of AsiA residues [104]. 
Thus, further work is needed to clarify the role of the AsiA C-terminal region. Finally, very recent work has shown that the inability of T4 motA mutants to plate on the TabG strain arises from a G1249D substitution within β, thereby implicating a region of β that is distinct from the β-flap in MotA/AsiA activation [120]. This mutation is located immediately adjacent to a hydrophobic pocket, called the Switch 3 loop, which is thought to aid in the separation of the RNA from the DNA-RNA hybrid as RNA enters the RNA exit channel [28]. The presence of the β G1249D mutation specifically impairs transcription from T4 middle promoters in vivo, but whether the substitution directly or indirectly affects protein-protein interactions is not yet known [120]. Taken together, these results suggest that MotA/AsiA activation employs multiple contacts, some of which are essential under all circumstances (AsiA with σ70 Regions 4.1 and 4.2, MotA with σ70 H5) and some of which may provide additional contacts that strengthen the complex, perhaps only under certain circumstances.
Concurrent work with the T4 middle promoter P rIIB2 has yielded somewhat different findings than those observed with P uvsX [121]. P rIIB2 is a TGn/-10 promoter that does not require an interaction between σ70 Region 4 and the -35 element for activity. Thus, the presence of AsiA does not inhibit RPo formation at this promoter. An investigation of the complexes formed at P rIIB2 using surface plasmon resonance revealed that MotA and AsiA together stimulate the initial recognition of the promoter by RNAP. In addition, in vitro transcription experiments indicated that MotA and AsiA together aid in promoter clearance, promoting the formation of the elongating complex. Thus, MotA may activate different steps in initiation, depending on the type of promoter. However, there is no evidence to suggest that the protein/protein and protein/DNA contacts are significantly different with different middle promoters.
Interestingly, AsiA binds rapidly to σ70 when σ70 is free, but binds poorly, if at all, to σ70 that is present in RNAP [122]. The inability of AsiA to bind to σ70 within holoenzyme may be useful for the phage because it ties the activation of middle promoters to the efficiency of early transcription. This stems from the fact that σ70 is usually released from holoenzyme once RNAP has cleared a promoter [[123] and references therein]. Since there is an excess of core relative to σ factors, there is only a brief moment for AsiA to capture σ70. Consequently, the more efficiently the T4 early promoters fire, the more opportunities are created for AsiA to bind to σ70, which then leads to increased MotA/AsiA-dependent middle promoter transcription.

Sigma appropriation in other T4-type phages
Although hundreds of activators of bacterial RNAP are known, the T4 MotA/AsiA system represents the first identified case of sigma appropriation. A search for MotA and AsiA orthologs has revealed several other T4-type phage genomes that contain both motA and asiA genes [[124] and http://phage.bioc.tulane.edu/]. These range from other coliphages (RB51, RB32, and RB69) to more distantly related phages that infect aeromonas (PHG25, PHG31, and 44RR) and acinetobacter (PHG133). In addition, orthologs for asiA have also been found in the genomes of the vibrio phages KVP40 and NT1 and the aeromonas phages PHG65 and Aeh1, even though these genomes do not have a recognizable motA. The KVP40 AsiA protein shares only 27% identity with its T4 counterpart. However, it inhibits transcription by E. coli RNAP alone and co-activates transcription with T4 MotA as effectively as T4 AsiA [119]. Thus, it may be that KVP40 and other phages that lack a MotA sequence homolog do in fact have a functional analog of the MotA protein. Alternatively, the KVP40 AsiA may serve only as an inhibitor of transcription.
No examples of sigma appropriation outside of T4-type phage have been discovered. Although sequence alignments suggested that the E. coli anti-sigma protein Rsd, which also interacts with σ70, may be a distant member of the AsiA family [119], a structure of the Rsd/sigma Region 4 complex is not consistent with this idea [30]. Recent work has identified a protein (CT663) involved in the developmental pathway of the human pathogen Chlamydia trachomatis that shares functional features with AsiA [125]. It binds both to Region 4 of the primary σ (σ66) of C. trachomatis and to the β-flap of core, and it inhibits σ66-dependent transcription. More importantly, like AsiA, it works by remaining bound to the RNAP holoenzyme rather than by sequestering σ66.

Transcription of middle genes by the extension of early transcripts
Even though the expression of middle genes is highly dependent on the activation of middle promoters, isolated mutations within motA and asiA are surprisingly not lethal. Such mutant phage show a DNA delay phenotype, producing tiny plaques on wt E. coli [11,87]. The replication defect reflects the reduced level of T4 replication proteins, whose genes have MotA-dependent middle promoters. In addition, two T4 replication origins are driven by MotA-dependent transcription from the middle promoters, P uvsY and P 34i [126]. However, deletion of either motA [127] or asiA [54] is lethal. Recent work suggests that leakiness of the other nonsense and temperature-sensitive mutations provides enough protein for minimal growth [120].
Besides MotA-dependent promoters, middle RNA is also generated by the extension of early transcripts into middle genes. This is because most, if not all, middle genes are positioned downstream of early gene(s) and early promoters. Production of this extended RNA is time-delayed relative to the RNA from the upstream "immediate early (IE)" gene. Thus, middle RNA generated from this extension was originally designated "delayed early" (DE), since it cannot be synthesized until the elongating RNAP reaches the downstream gene(s). Early work (reviewed in [1]) classified genes as IE, DE, or middle based on when and under what conditions the RNA or the encoded protein was observed. IE RNA represents transcripts that are detected immediately after infection and do not require phage protein synthesis. DE RNA requires phage protein synthesis, but this RNA and DE gene products are still detected in a T4 motA⁻ infection. In contrast, the expression of genes that were classified as "middle" is significantly reduced in a T4 motA⁻ infection. In addition, while both DE and "middle" RNA arise after IE transcription, the peak of the RNA that is substantially dependent on MotA is slightly later and lasts somewhat longer than the DE peak. However, it should be noted that these original designations of genes as DE or middle are now known to be somewhat arbitrary. Many, if not all, of these genes are transcribed from both early and middle promoters. In fact, while a microarray analysis investigating the timing of various prereplicative RNAs [128] was generally consistent with known Pe and Pm promoters [4], there were a number of discrepancies, especially between genes that were originally classified as either "DE" or "middle". Thus, it is now clear that both the extension of early transcripts and the activation of middle promoters are important for the correct level of middle transcription.
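The original operational criteria for the IE, DE, and middle classes can be written as a small decision rule. The sketch below is only an illustration of those historical criteria as described above; the function and parameter names are hypothetical, and, as noted in the text, the DE/middle distinction is now known to be somewhat arbitrary because many genes are transcribed from both early and middle promoters.

```python
def classify_prereplicative_rna(
    requires_phage_protein_synthesis: bool,
    strongly_reduced_in_motA_minus: bool,
) -> str:
    """Assign the historical class (IE, DE, or middle) to a transcript.

    Criteria, as summarized in the text:
      IE     - detected immediately; no phage protein synthesis required
      DE     - requires phage protein synthesis; still detected without MotA
      middle - expression significantly reduced in a motA-deficient infection
    """
    if not requires_phage_protein_synthesis:
        if strongly_reduced_in_motA_minus:
            # MotA is itself a phage-encoded protein, so MotA dependence
            # without a need for phage protein synthesis is contradictory.
            raise ValueError("inconsistent observations for this transcript")
        return "IE"
    if strongly_reduced_in_motA_minus:
        return "middle"
    return "DE"


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    print(classify_prereplicative_rna(False, False))
    print(classify_prereplicative_rna(True, False))
    print(classify_prereplicative_rna(True, True))
```

A transcript that is fed by both an upstream early promoter and a MotA-dependent middle promoter would fall between the last two classes, which is one way to picture the discrepancies seen in the microarray analysis.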
Early experiments [summarized in [1]] offered evidence that DE RNA synthesis might require a T4 system to overcome Rho-dependent termination sites located between IE and DE genes. First, the addition of chloramphenicol at the start of a T4 infection prevents the generation of DE RNAs, indicating a requirement for protein synthesis and suggesting that phage-encoded factor(s) might be needed for the extension of IE RNAs. Second, in a purified in vitro system using RNAP and T4 DNA, both IE and DE RNA are synthesized unless the termination factor Rho is added. Addition of Rho restricts transcription to IE RNA, indicating that Rho-dependent termination sites are located upstream of DE genes. Third, DE RNA from a specific promoter upstream of gene 32 is not observed in a T4 motA⁻ infection, suggesting that MotA itself may be needed to form or stabilize this DE RNA [129]. It is unlikely that a MotA-dependent gene product, rather than MotA, is responsible for this effect, since the DE transcripts are synthesized before or simultaneously with the activation of middle promoters. Finally, wt T4 does not grow in particular rho mutant alleles, called nusD, that produce Rho proteins with altered activity, and the level of certain DE RNAs and DE gene products in T4/nusD infections is depressed. An initial interpretation of this result was that there is more Rho-dependent termination in a nusD allele, which then depresses the level of DE RNA. T4 suppressors that grow in nusD were subsequently isolated and found to contain mutations within the T4 comC-α (also called goF) gene [130,131], which expresses an early product.
Given all of these findings, it was postulated that T4 uses an anti-termination system, perhaps like the N or Q systems of phage λ [reviewed in [132]], to actively prevent Rho-dependent termination and that MotA, ComC-α, or another protein is involved in this process. However, comC-α is not essential, and the addition of amino acid analogs, which would generate nonfunctional proteins, has been shown to be sufficient for the synthesis of at least certain DE RNAs [reviewed in [1]]. These results suggest that at least in some cases, translation is simply needed to prevent polarity; consequently, the process of translation itself, rather than a specific factor(s), is sufficient to inhibit Rho termination. If so, the loss of DE RNA observed in the presence of Rho in vitro would be due to the lack of coupled transcription/translation. Thus, when the upstream gene is being translated in an infection in vivo, Rho RNA binding sites would be occluded by ribosomes and consequently unavailable.
More recent work has suggested that Rho may affect DE RNA in vivo because of its ability to bind RNA rather than its termination activity [133,134]. Sequencing of the rho gene in six nusD alleles has revealed that in five cases, the rho mutation lies within the RNA-binding site of Rho. Furthermore, the addition of such a mutant Rho protein to an in vitro transcription system does not produce more termination but rather results in an altered and complicated pattern of termination. There is actually less termination at legitimate Rho-dependent termination sites, but in some cases, more termination at other sites. Unexpectedly, increasing the amount of the mutant Rho proteins rescues T4 growth in a nusD allele, a result that is not compatible with the mutant Rho promoting more termination. In addition, expression of the Rop protein, an RNA-binding protein encoded by the pBR322 plasmid, also rescues T4 growth in nusD.
Taken together, these results have led to another hypothesis to explain DE RNA. In this model, T4 DE transcripts in vivo are susceptible to nuclease digestion and require a process to limit this degradation. Active translation can prevent this nuclease attack, thus explaining the loss of DE RNA in the presence of chloramphenicol. In addition, a protein that can bind RNA, such as wt Rho, Rop, or perhaps the mutated T4 ComC-α, may also be useful. Thus, the nusD Rho proteins are defective not because they terminate IE transcripts more effectively, but because they have lost the ability of wt Rho to bind and somehow protect the RNA. However, it should be noted that as of yet, there is no evidence identifying a particular nuclease(s) involved in this model. Furthermore, the function of wt comC-α and exactly how Rho or Rop "protect" DE RNA are not known. Recent work has shown that both transcription termination and increased mRNA stability by RNA-binding proteins are involved in the regulation of gene expression in eukaryotes and their viruses [135,136]. A thorough investigation of these processes in the simple T4 system could provide a powerful tool for understanding this mode of gene regulation.

Conclusion
T4 regulates its development and the timed expression of prereplicative genes by a sophisticated process. In the past few years, we have learned how T4 employs several elegant strategies, from encoding factors to alter the host RNAP specificity to simply degrading the host DNA, in order to take over the host transcriptional machinery. Some of these strategies have revealed unexpected and fundamentally significant findings about RNAP. For example, studies with T4 early promoters have challenged previous ideas about how the α-CTDs of RNAP affect transcription. Work with host promoters argued that contact between the α-CTDs of RNAP and promoter UP elements or certain activators increases transcription; in particular, α residue Arg265 was crucial for this interaction. Thus, one would expect that modification of Arg265 would depress transcription. However, the activity of certain T4 early promoters actually increases when Arg265 of one of the two RNAP α subunits is ADP-ribosylated. This finding underscores our limited understanding of α-CTD function and highlights how T4 can provide a tool for investigating this subunit of RNAP.
The T4 system has also revealed a previously unknown method of transcription activation called sigma appropriation. This process is characterized by the binding of a small protein, T4 AsiA, to Region 4 of the σ70 subunit of RNAP, which then remodels this portion of polymerase. The conformation of Region 4 in the AsiA/σ70 Region 4 structure differs dramatically from that seen in other structures of primary σ factors and demonstrates that Region 4 has a previously unknown flexibility. Furthermore, studies with the T4 MotA activator have identified the far C-terminal region of σ70 as a target for activation. Prior to the T4 work, it was thought that this portion of σ70, which is normally embedded within the β-flap "hook" of core, is unavailable. Based on the novel strategy T4 employs to activate its middle promoters, we now know how a domain within RNAP can be remodeled and then exploited to alter promoter specificity. It may be that other examples of this type of RNAP restructuring will be uncovered.
The core subunits of bacterial RNAP are generally conserved throughout biology both in structure and in function [reviewed in [137,138]]. In addition, it is now apparent that eukaryotic RNAP II employs protein complexes that function much like σ factors to recognize different core promoter sequences [[139,140] and references therein]. Thus, the T4 system, which is simple in components yet complex in details, provides an amenable resource for answering basic questions about the complicated process of transcriptional regulation. Using this system, we have been able to uncover at a molecular level many of the protein/protein and protein/DNA interactions that are needed to convert the host RNAP into an RNAP that is dedicated to the phage. This work has given us "snapshots" of the transcriptionally competent protein/DNA complexes generated by the actions of the T4 proteins. The challenge in the future will be to understand at a detailed mechanistic level how these interactions modulate the various "nuts and bolts" of the RNAP machine.