Analysis of an ankyrin-like region in Epstein Barr Virus encoded (EBV) BZLF-1 (ZEBRA) protein: implications for interactions with NF-κB and p53

Background The carboxyl terminal of Epstein-Barr virus (EBV) ZEBRA protein (also termed BZLF-1 encoded replication protein Zta or ZEBRA) binds to both NF-κB and p53. The authors have previously suggested that this interaction results from an ankyrin-like region of the ZEBRA protein since ankyrin proteins such as IκB interact with NF-κB and p53 proteins. These interactions may play a role in immunopathology and viral carcinogenesis in B lymphocytes as well as other cell types transiently infected by EBV such as T lymphocytes, macrophages and epithelial cells. Methods Randomization of the ZEBRA terminal amino acid sequence followed by statistical analysis suggest that the ZEBRA carboxyl terminus is most closely related to ankyrins of the invertebrate cactus IκB-like protein. This observation is consistent with an ancient origin of ZEBRA resulting from a recombination event between an ankyrin regulatory protein and a fos/jun DNA binding factor. In silico modeling of the partially solved ZEBRA carboxyl terminus structure using PyMOL software demonstrate that the carboxyl terminus region of ZEBRA can form a polymorphic structure termed ZANK (ZEBRA ANKyrin-like region) similar to two adjacent IκB ankyrin domains. Conclusions Viral capture of an ankyrin-like domain provides a mechanism for ZEBRA binding to proteins in the NF-κB and p53 transcription factor families, and also provides support for a process termed "Ping-Pong Evolution" in which DNA viruses such as EBV are formed by exchange of information with the host genome. An amino acid polymorphism in the ZANK region is identified in ZEBRA from tumor cell lines including Akata that could alter binding of Akata ZEBRA to the p53 tumor suppressor and other ankyrin binding protein, and a novel model of antagonistic binding interactions between ZANK and the DNA binding regions of ZEBRA is suggested that may be explored in further biochemical and molecular biological models of viral replication.


Background
Analysis of viruses, mobile DNA sequences and host genomes reveals evidence that genetic information can be shared resulting in generation of novel hybrid genetic elements. For example, the ubiquitous human pathogen Epstein Barr Virus (EBV), genome structure contains a region co-linear with the vertebrate variable immunoglobulin gene segments including regulatory regions for the myc oncogene, signals for immunoglobulin switch recombination, somatic mutation and coding regions [1,2]. Similarly, the EBV viral termini contain sequences similar to somatic V(D)J immunoglobulin and T cell receptor recombination sites and the virus encodes a recombinase co-regulated with the host RAG V(D)J recombinase [3][4][5].
These observations suggest that EBV and other DNA viruses evolve through sharing of genetic information with the host genome termed "Ping Pong Evolution" [1,2]. In this work, a remarkable example of Ping Pong Evolution between EBV and the host immune system is characterized. The EBV viral protein ZEBRA, Zta encoded by BZLF-1is proposed to result from capture and fusion of exons from two different host immune response genes, the fos/jun transcription factor and IB immune regulatory proteins. Analysis of host IB proteins is utilized to present an empirically testable model of ZEBRA binding to NF-B immune response proteins in which the unstructured carboxyl region of ZEBRA can assume two different structural conformations when bound to NF-B. In addition, an amino acid polymorphism in this region of ZEBRA is identified that could potentially alter the functional properties of the ZEBRA protein interactions with both NF-B and the distantly related tumor suppressor p53 [6].
Studies of ZEBRA, the lytic switch protein, have demonstrated that the two amino terminal exons of ZEBRA are structurally related to the fos/jun transcription factors and that ZEBRA binding to DNA sites, known as ZRE, is sufficient to activate the viral lytic cycle [7,8]. The carboxyl region of ZEBRA protein can bind to components of the NF-B transcription family in vitro and alter NF-B transcription in vivo [9,10]. Inactivation of NF-B transcription is apparently independent of ZEBRA's ability to function as a transcription factor [11][12][13]. It is not currently known which of these multiple effects of ZEBRA contribute to viral carcinogenesis in a humanized mouse model [14].
One potential outcome of these interactions is that inactivation of NF-B transcription during viral lytic replication confers a selective advantage upon the virus [15,16]. NF-B transcription factors and regulatory IB proteins are central mediators of both the innate and acquired immune responses [17,18]. Inactivation of NF-B could block the innate immune response in B lymphocytes which are the viral host cell. Interaction between ZEBRA and NF-B can trigger apoptosis of cells expressing ZEBRA [19][20][21]. Inactivation NF-B transcription by ZEBRA in host B lymphocytes may also oppose the effects of latency proteins that activate NF-B transcription, delaying apoptosis and contributing to viral maturation and release from pre-apoptotic host cells.
Other bystander cells such as T-lymphocytes, epithelial cells and macrophages may also be transiently infected with EBV. T lymphocytes in particular express both the EBV CD21 receptor as well as other EBV receptors and express ZEBRA protein [3,21,22]. Studies of the effects of transient and stable expression of ZEBRA in T-lymphoblastoid cell lines have confirmed an IB-like inactivation NF-B signaling by ZEBRA and increased T lymphocyte apoptosis [21]. Thus, an additional selective advantage of interactions between ZEBRA and NF-B may be destruction of immune responder cells through transient infection and apoptosis.
Another human gamma herpesvirus, human herpesvirus-8, first identified as the cause of Kaposi's sarcoma, encodes a lytic replication protein which lacks DNA binding but has extensive amino acid similarity to ZEBRA including the carboxyl region of ZEBRA [23][24][25]. This region of ZEBRA is highly conserved between different viral strains [26]. EBV infected lymphocytes also express a trans-spliced ZEBRA like protein termed RAZ sharing the carboxyl terminal NF-B binding region of the ZEBRA protein, but lacking the ability to bind DNA or ZRE [15,27]. RAZ is also expressed in EBV infected T lymphocytes [3,22].
Remarkably, the carboxyl region of ZEBRA required for interactions with NF-B transcription and ZEBRA dimerization also binds to p53 tumor suppressor in vitro and alters p53 transcription in vitro [28][29][30]. The effects of ZEBRA on p53 transcription are stimulatory in some cases such as T-lymphocytes and epithelial cells, but inhibitory in B-lymphocytes [29,30]. Interactions between ZEBRA and p53 also involve other p53 binding proteins and are more complex than interactions in vivo between ZEBRA and NF-B [31]. The ability of ZEBRA to interact with both NF-B and p53 has previously been suggested to result from a cryptic IB-like region in the carboxyl terminus of the ZEBRA protein [6,21]. NF-B and p53 proteins share a common IB binding region because they are descendents of a common ancestral transcription factor previously termed "proto p53/NF-B [6]". Since this hypothesis was proposed, the crystal structure of ZEBRA protein has been partially solved including a portion of the carboxyl terminus of the protein interacting with NF-B and p53 [8,32]. In addition, the interactions between p53 binding ankyrin proteins and NF-B proteins have been characterized providing additional evidence that regulatory proteins in the ankyrin family, including NF-B inhibitor proteins related to IB, bind to both NF-B and p53 proteins [33].
The recently available partial structure of the ZEBRA carboxyl terminus, and the structures of IBα and apoptosis-stimulating protein of p53 (ASPP2, previously known as p53BP2) ankyrin proteins are analyzed in this work, building on previous similarities noted between these regulatory proteins [6]. First, it is demonstrated that primary amino acid similarities between the alpha helix regions of ZEBRA and various other ankyrin proteins are unlikely to have arisen by chance or independent parallel evolution but instead appear to represent capture and homologous descent of a terminal exon encoding an IB domain by ZEBRA. Second, it is shown that the partial crystal structure of ZEBRA is consistent with dimerization of the terminal region of the protein to form an IB-like stem and loop structure. Finally, individual ankyrin domains of IBα are shown to have structural similarities to the ankyrins of ASPP2, and thus able to bind to the same region targeted by ZEBRA [6].

Methods
Definition of the ankyrin-like region of ZEBRA protein termed "ZANK" We previously noted that a region of ZEBRA (Figures 1,  2) has primary amino sequence similarity to IB prior to the determination of the crystal structure of ZEBRA [6]. We first used the program PyMOL (Delano Scientific Inc., CA) to define and visualize a region of ZEBRA whose primary amino acid sequence is similar to ankyrin proteins [6]. ZEBRA structure was partially solved by x-ray crystallography in 2006 and structural coordinates are available as a protein data-base in the public domain (2C9N.pdb). This region of ZEBRA, termed in this work ZANK "ZEBRA ANKryn-like region" has an alpha-helical stem and carboxyl unstructured region (Figure 1). In the course of this analysis, The ZANK region of ZEBRA was noted to be encoded by a single exon in all examined EBV strains coinciding exactly with the exon-intron junction of the ZEBRA protein third exon (Figure 2, 3). Thus, the p53 and NF-B binding regions of ZEBRA, termed ZANK, were found to be separate both structurally and genetically from the DNA binding and trans-activation regions of ZEBRA ( Figure 4).

Alignment of ZANK with defined ankyrin proteins
ZEBRA protein and nucleotide sequences are as determined previously [26]. Individual ankyrin domains from published IB and related sequences obtained from genbank were identified using PyMOL and first aligned with ZANK and with each other using standard algorithms such as BLAST as illustrated in the text ( Figure  5) Color scheme for Figure 5 can be found at http:// ekhidna.biocenter.helsinki.fi/pfam2/clustal_colours). Previously, some primary amino acid similarity between p53 binding ankyrins, IB ankyrins, invertebrate ankyrins and the ZEBRA carboxyl terminus have been illustrated and suggested to provide a basis for interactions between ZEBRA and ankyrin binding proteins [6,21]. The carboxyl terminus of dimeric ZEBRA protein bound to a ZRE oligonucleotide (orange) is shown. This image was generated from previously determined ZEBRA molecular coordinates using PyMOL. The ZANK (ZEBRA ANKyrin-like region) carboxyl terminus of ZEBRA is highlighted in blue. ZANK is composed of two regions, a highly structured alpha helix stem highlighted in darker blue (the ZANK stem region) and a partially solved unstructured region continuing to the end of the protein highlighted in light blue (the ZANK loop region).
Individual ankyrin domains in ASPP2 are denoted in this work based on previously reported optimized alignments between the p53BP2 fragment of ASPP2 and IB. Thus, ank3 of p53BP2 corresponds to IB ankyrin 3, p53BP2 ank4 corresponds to IB ankyrin 4, and p53BP2 ank5 corresponds to IB ankyrin 5 proceeding from the amino to carboxyl termini of both proteins respectively [6]. In a more recent naming convention, p53BP2 which was found to be a fragment of a larger protein known as ASPP2 (ankyrin repeat, SH3, and proline rich domaincontaining protein number 2 also previously called apoptosis stimulating p53 binding protein 2) to indicate that it is a member of a multi-protein family. Using this nomenclature, p53BP2 ank3 corresponds to ASPP2 ankyrin 1, p53BP2 ank4 corresponds to ASPP2 ankyrin 2, and p53BP2 ank5 corresponds to ASPP2 ankyrin 3 [34].

Randomization analysis of ZANK and other defined ankyrin domains
A randomization shuffling strategy was developed to estimate the probability that observed amino acid similarities between ZANK, IB and other related ankyrins such as invertebrate cactus and unc-22 were not explained by the non-random amino acid composition of ankyrin proteins. The amino acid sequence of each ankyrin was randomized or shuffled and compared to every other possible sequence with the number of equal or better scores to non shuffled sequence shown. Repeated IB and p53BP2 units of ankyrin are composed of a rigid alpha-helical stem and a less structured more variable loop region that may be under different evolutionary constraints. The borders of each ankyrin stem and loop were identified using PyMOL ( Figure 5). Because of this bi-partite structure, analysis of the stem and loop regions was conducted both together and independently ( Figure 6). Randomization of 10,000 of each stem and loop region was used in this analysis with correction for multiple comparisons. In this analysis, alignment of less than 500/10,000 randomized sequences is comparable to a P value of .05, shown in yellow. Alignment of less than 10/10,000 sequences comparable to a P value of .001 is shown in red.
Based on these studies the similarity between putative ZEBRA ankyrin stem and the cactus ankyrin stems 4 is  Figure  1) is shown in more detail including the addition of terminal amino acids 237-245 not present in the solved structure. This region was deleted in order to permit crystallization of the ZEBRA protein. A region with secondary structure denoted "ZANK structure" is evident in the carboxyl terminus of ZEBRA as discussed in more detail in the text. ZANK is encoded by a discrete exon corresponding precisely with the stem and loop regions shown in blue in Figure 1.  one order of magnitude greater than other ankyrin stems, and this relationship was maintained over a variety of Price Waterman gap values. Also highly unlikely to occur by chance are the alignments between the putative ZEBRA ankyrin stem and the cactus stem 2 and Bcl3 ankyrin 4 stem. In many cases individual ankyrin stems from different ankyrin proteins such as invertebrate cactus, IB and p53 binding protein were more closely related than adjacent ankyrins from a single protein, for example cactus ankyrin stem 4 and bcl3 ankyrin stem 4 are more similar than cactus ankyrin stem 4 and other cactus ankyrins (BCL3 data not shown). Based upon randomized sequence comparisons shown in the ZEBRA stem-like region is most similar to the stem region 4 of a Cactus ankyrin from the Drosophila fruit fly. Interestingly, the loop region of ZEBRA is most similar to a different loop region from the cactus protein, ankyrin 5 of cactus. However, because of the short length o the ZEBRA stem region and other confounding factors such as apparent functional constraints on stem sequence evolution it was not possible to construct a unique rooted phylogeny of individual ankyrin stems using currently available alignment software (data not shown).
Visual inspection of individual ankyrin stem and loop regions ( Figure 5) also suggests that the loop regions of all of the cellular ankyrins were less significantly related either to ZEBRA or to each other except in the stem regions. In addition, IB and cactus ankyrins follow a pattern where alternating ankyrin loops are similar between odd and even numbered repeats, for example ankyrin 3 and 5 loops have more similarity in both amino acid composition and length than to ankyrins 2 and 4 loops from both IB and cactus. Relationships between individual ankyrin stems and loops thus appear to represent a complicated mixture of functional constraints operating independently of evolutionary descent and shared sequences conserved through descent from a common precursor as will be analyzed in more detail elsewhere.

In silico modeling of structural similarities between ZANK and other defined ankyrin domains
Protein data based files of ZEBRA (2C9N.pdb), IB bound to NF-B protein p65 (1lKN.pdb), and the p53BP2 fragment of ASPP2 (1YCS.pdb) were used to generate PyMOL based alignments. Experimentally determined structures were aligned using the "align" feature of PyMOL without any operator adjustments and using molecular coordinates as deposited in respective pdb files without alteration of molecular coordinates. ZANK was superimposed upon adjacent ankyrin domains of the IBα protein initially solved as a complex with NF-B p65 and the p53BP2 fragment of ASPP2 p53 binding protein solved as a complex with p53 (p53 structure not shown). A mechanism is proposed consistent with the partial structure of ZANK suggesting that the observed structure of ZANK could act as a dimorphic dimer and interact with adjacent dimorphic regions of NF-B, and also related ankyrin binding regions of p53. IBα and IBα and p53 binding ankyrin proteins, although not identical in primary amino acid sequence, have remarkably similar interactions with NF-B protein. This analysis is also used to suggest a testable hypothesis where an amino acid polymorphism previously identified between ZEBRA protein from Akata tumor cells and the ZEBRA protein from non-tumorigenic B-958 could modulate the binding interactions between ZEBRA and p53 leading to a more cancer-promoting viral phenotype in the case of the Akata derived ZEBRA.

Results
Identification of a structural and functional divergence between ZEBRA carboxyl terminus and fos/jun transcription factors Epstein Barr Virus (EBV) encoded ZEBRA protein is capable of switching the viral state from latency to lytic growth through its effects on specific DNA binding sites termed ZRE (ZEBRA Response Elements) [7]. The crystal structure of the protein bound to a ZRE oligonucleotide reveals similarity to Fos/Jun and other DNA binding proteins of the BZIP transcription family as expected from similarities between ZRE and AP-1 binding sites [8]. The structure of the carboxyl terminal region of the ZEBRA protein has not been determined in the published crystal structure as inclusion of this region prevented protein crystallization suggesting a disordered structure ( Figure 1) [32]. This region of ZEBRA is also dispensable for DNA binding [11].
Efforts to discover a specific mechanism for proteinprotein interactions of the carboxyl terminus region of ZEBRA based upon the partially solved structure have been unsuccessful, although some similarities are evident between this region and C/EBP-α, a CAAT-binding transcription factor whose dysregulation is implicated in acute myeloid leukemia [13]. Notably, the conserved carboxyl terminus of ZEBRA ( Figure 2) diverges from DNA binding proteins of the BZIP family precisely at the intron border of exon 3 ( Figure 3). Like the ZEBRA terminal exon, IB binds to both p53 and NF-B [6,21]. Exon 1 and 2 of ZEBRA thus encode a Fos/Jun homologue that activates DNA transcription to initiate the viral lytic program, while exon 3 (shared with trans-spliced RAZ protein) provides both protein self-dimerization and binding to other master regulatory transcription factors such as p53 and NF-B. These observations are analyzed in more detail in the remainder of this work.

ZANK, the carboxyl terminal Zebra ANKyrin-like region
Interactions between ZEBRA and NF-B, p53 as well as other proteins can occur independently of ZEBRA DNA binding, since experiments in vivo have shown that in T-lymphocytes ZEBRA is not bound to DNA but localized to the cytoplasm where it blocks NF-B translocation to the nucleus [6,21]. The full sequence of the carboxyl terminus exon (exon 3) of ZEBRA is shown in Figure 2, including both regions of the protein present in the crystal structure (blue highlights in Figure 1) and extreme carboxyl residues not present in the crystal structure [26]. We previously suggested that the carboxyl terminus of ZEBRA encodes an ankyrin-like structure responsible for the ability of ZEBRA to bind ankyrin binding proteins NF-B and p53 [6,21]. For purposes of discussion, the carboxyl terminus of ZEBRA Remarkably, the crystal structure of ZEBRA protein ( Figure 1) shows divergence from BZIP Fos/Jun DNA binding proteins precisely at the intron/exon borders of ZEBRA exon 3 (Figure 2, 3).
Comparison of ZANK with defined vertebrate and invertebrate ankyrin domains ZANK spans part of the alpha helix of ZEBRA and unstructured regions at the carboxyl terminus of the protein (Figure 3, 4). The alpha-helical stem-like regions contribute most of the similarity between ZANK and individual ankyrins ( Figure 5). This is evident in Figure  5 because most of the colored coded similar regions occur to the left of the figure in the stem regions (denoted s) rather than the loop regions (denoted l). This was confirmed by a randomization analysis ( Figure  6). Randomizations demonstrate that the full ZANK sequence or loop region results in less significant alignments or no significant alignments for both ZANK and other ankyrins compared to randomization of the stem regions ( Figure 6). The alpha helical regions of ankyrin binding proteins are highly structured to provide a rigid framework, while the loop regions are less structured allowing for binding to different substrates. Results are consistent with the hypothesis that ZANK encodes amino acids derived from a common ancestral ankyrin domain shared with IB and p53 binding ankyrins and are most closely related to the invertebrate Drosophila ankyrin regulatory protein Cactus, rather than vertebrate ankyrins.
Statistical significance as highlighted in yellow in Figure 6 is defined as a given alignment found in less than 5% of shuffled sequences (less than 500 out of 10,000), corresponding to P < 0.05 that the alignment would occur by chance. A more stringent definition of a nonrandom alignment (highlighted in red in Figure 6) is defined as a given alignment found in less than 0.1% of shuffled sequences (less than 10 out of 10,000). In all cases analyzed, including ZANK, the stem or alpha helix regions of ankyrins are more conserved across ankyrin protein families than the loop regions. For example, not all IBα loops in host proteins are similar to each other, but all the IBα stem regions are similar based upon 10,000 random shuffling of the respective sequences.
The diagonal of each randomization experiment shown is the identity of the sequence with itself shown in red, which as expected, occurs once per 10,000 shuffling experiments. Because each cell represents a characterization of an independent shuffling "experiment" the number of aligned sequences are slightly different even for identical comparisons as can be seen by comparing identical comparisons on either side of the diagonal identity but because of the large number of shuffling per "experiment" results are usually similar for a given cell on either side of the diagonal.
ZANK is more similar to an invertebrate ankyrin domain from Cactus than vertebrate ankyrins As shown in Figure 6, the ZANK stem was most similar to the ankyrin 4 stem region of invertebrate Drosophila Cactus regulatory protein at the highest level of significance (less than 10/10,000 of shuffled sequences). ZANK loop was most similar to invertebrate Drosophila Cactus regulatory protein ankyrin 5 loop region at the highest level of significance (less than 10/10,000 of shuffled sequences). Similarity was also evident between ZANK and the Cactus regulatory protein ankyrin 2 stem region and the IBα ankyrin 3 and 5 stems as well as the whole protein. Similarity to both stem and loop regions of Cactus in addition to elements of IBα is a surprising result since EBV is a vertebrate pathogen, while Cactus is an invertebrate host gene, suggesting that the divergence of ZANK from other ankyrins was ancient, possibly preceding divergence of p53 and NF-B ankyrin binding proteins such as IBα. Alternatively, a vertebrate virus encoding ZANK could have resulted from recombination with an invertebrate Drosophila virus since in some cases insect viruses can infect vertebrate cells. Existing phylogeny programs do not appear able to resolve these two possibilities as will be discussed in more detail elsewhere.

In silico Modeling of ZANK
In the remainder of this work we used PyMOL generated structures to model experimentally observed binding between ZEBRA protein and both NF-B and p53 in vitro and in vivo. A limitation of this analysis is that the existing structure of ZANK is both partial and also solved in the absence of the partner ligands p53 or NF-B. However, even with those limitations this analysis is sufficient to suggest a model of ZANK interactions with its partner ligands in which the ZANK loop adopts two different conformations. We have previously suggested that binding between ZEBRA carboxyl terminus and other proteins can be explained by the common descent of both NF-B and p53 from an ancestral protein termed proto-p53/NF-B. If this hypothesis is correct then it should be possible to identify a common structural signature or code between ZANK, NF-B binding ankyrins and p53 binding ankyrins that ZANK has evolved to fit into. To use an analogy, if ankyrins resemble a key that can bind to their substrates resembling a lock, then the ZANK key should fit into both the NF-B and p53 locks [6,21].
To explore the possibility of this "lock and key" mechanism, IBα ankyrin structures were extracted and superimposed as shown, without any other alterations in the solved crystal structure coordinates. Since ZEBRA protein exists as a dimer of identical subunits (Figure 1), two adjacent ZANK loop domains in the dimerized ZEBRA protein could interact with adjacent IBα ankyrin binding regions of NF-B. As shown, Ank 3 and 4 of IBα play a critical role in IBα interactions with NF-B (Figure 7). In support of this hypothesis, residues such as Histamine 239 in ZANK (Figure 3), not present in the crystal structure of ZEBRA due to instability in the absence of either co-ligand NF-B and tumor suppressor p53, could align with a corresponding conserved stem histidine in IBα Ank 3 and related ankyrin proteins ( Figure 5). Binding to proteins such as NF-B and p53 could stabilize ZANK and permit a crystal or structure to confirm this model.

NF-B binding ankyrins have an alternating key or code of short and long ankyrin loops suggesting a polymorphic ZANK loop domain
Ankyrin loop regions are variable in length while the stems are conserved in length ( Figure 5). There is an alternating pattern of short loops (approximately 15 amino acids between adjacent stems) and long loops (approximately 20 amino acids between adjacent stems) in all of the NF-B binding ankyrins including vertebrate IBα and invertebrate cactus. Thus both IBα and cactus ankyrin loops 2, 4, and 6 are short and loops 1, 3, and 5 are long. To permit binding interactions with NF-B, the ZANK loop region of ZANK would need to be dimorphic or sufficiently unstructured or flexible to conform to adjacent non-identical short and long ankyrin binding pockets in NF-B, for example the regions of NF-B binding to Ankyrins 3 and 4 of IBα as shown (Figure 7, 8, 9). Because the ankyrin loops of IBα ank5 are very similar in both primary and secondary structure to IBα ank3, IBα has an alternating pattern of Ank3 and Ank4-like subunits, and this alternating structure could permit multiple ZANK binding sites in a single NF-kb ligand ( Figure 10).
Also resulting from this alternating short loop/long loop pattern, using PyMOL generated structures it is evident that IBα ankyrin 3 has additional amino acids in its loop region forming a bulge not present in ankyrin 4, and also has hydrophobic amino acids in its loop not present in ankyrin 4 ( Figure 9). If ZEBRA were to bind to both IBα ankyrin 3 and 4 binding regions in NF-B, ZANK would need to adapt 2 different conformations to fit into the two different sites, for example by having free amino acids outside of the binding region at the Ankyrin 4 site, while utilizing additional amino acids in the extreme carboxyl region of the ZEBRA protein in the deeper Ankyrin 3 site. The bulge region present both ankyrin 3 and 5 loops is not present in ankyrin 4 loop ( Figure 10), but corresponds to a structured region of the ZANK loop (Figure 9). In the IBα structure, these structured regions of are proposed to form a α helix like region stabilizing the IBα loops.
Shared structural elements between p53 and NF-kB binding ankyrins identified as a potential binding mechanism of ZANK We suggested previously that that the ability of ZEBRA to bind both NF-B and p53 results from an Helices. IBα Ank3 (green) and Ank4 (yellow) domains are extracted from the NF-B p65/p50 dimer bound structure and are shown superimposed to illustrate the highly conserved structure of the stem and the less conserved loop region. Ank3 loop has a bulge not present in Ank4. Ank3 also has more bulky hydrophobic residues than Ank4 in the region of the loop extending most deeply into the NF-B p65/p50 structure. Arrows illustrate key features of Ank3 and Ank4 including a highly conserved His residue shared by most ankyrin stems (Ank3, His 149; Ank4, His 188). Also shown are bulky residues in Ank3 (Tyr 181 and His184) present in Ank4. The structure of ZANK loop (light blue) is superimposed on IBα Ank3 (green) and Ank4 (yellow) structures. The structured region present in ZANK loop overlaps the bulge of Ank3, not present in Ank4. Like Ank4, ZANK has no significant hydrophobic regions in the solved structure, but ZANK extends further due to unstructured residues not present in the solved structure ( Figure 2). evolutionarily conserved binding pocket shared between the two proteins [6,21]. The solved structure of p53bp2 (more recently denoted ASPP2 (1YCS.pdb)) bound to p53 demonstrates that p53bp2/ASPP2 contacts p53 in only the extreme carboxyl terminus, ankyrins 4 and 5, of its ankyrin regions, although other binding modes could exist in vivo. Pymol generated alignment between IBα ankyrin 3 and p53bp2/ASPP2 ankyrin 5 is shown ( Figure 11). p53bp2/ASPP2 ankyrin 5 has a structure with features similar to IBα ankyrin 3a highly conserved tyrosine is shared between ankyrin 3 and ASPP2 ankyrin 5, and this residue in IBα is known to form a specific contact with a tyrosine in NF-B p50/p65 dimer. Interestingly, the p53bp2/ASPP2 ankyrin loops do not have a short long loop pattern ( Figure 5) but instead have a length corresponding to the short IBα ankyrin loops (approximately 15 aa), and a conserved tyrosine typical of the long loops. These observations are consistent with a shared binding signature between some NF-B and p53 binding proteins that could be targeted by a polymorphic ZANK loop as a master key.

Identification of a cancer associated sequence variant in ZANK potentially altering p53binding interactions
An advantage of PyMOL generated in silico structural analysis of ankyrin domains in NF-B proteins and p53 binding proteins is that it may now possible to predict the behavior of polymorphisms in ZEBRA that increase Figure 10 Structurally similar large ankyrin loops alternate with small ankyrin loops in IBa. IBα Ank3 (green) is shown superimposed on IBα Ank5 (light green) to illustrate corresponding structural features. Alternating small and large ankyrin loops seem to be a conserved feature of vertebrate and invertebrate NF-B binding ankyrins, suggesting a "lock and key" mechanism. ZEBRA is proposed to mimic both large ankyrin loop structures and small ankyrin loop structures in order to fit into the NF-B binding code. The bulge region of Ank3 and Ank5, although encoded by different primary amino acids both can form a structure corresponding to the helix structure found in ZANK (Figure 9). IBα Ank3, Ank5, and potentially ZANK, also share conserved orientation of aromatic and bulky amino acids in the loop regions not present in IBα Ank4.
or decrease the relative strength of binding to NF-B versus p53 proteins ( Figure 12). Alterations in ZANK amino sequences could alter the ability of ZEBRA to transiently block p53 function during viral replication and thus alter viral growth properties, accounting in part for the recently described association of ZEBRA with increased mutagenic and carcinogenic properties in a murine model [14]. For example, the limited sequence polymorphism data available for the ZANK region of ZEBRA currently suggests that a mutation in tumor associated ZEBRA from Akata cells is different from canonical B-958 ZEBRA at a single amino acid in the ZANK region (ZEBRA exon 3). Remarkably, primary sequence comparison between ZANK, IB and related proteins and ASPP2 ankyrins reveals that the substitution of a serine for an alanine at amino acid 205 ( Figure  3) is not random but rather converts ZANK from an IB-like AAA motif to a p53BP2/ASPP2-like ASA motif, also shared with a Cactus ankyrin ( Figure 5). Unlike alanine, a serine residue may be covalently modified in vivo by phosphorylation, further extending the possible functional consequences of this polymorphism [35]. These observations suggest further study of ZANK polymorphisms in vivo and their effects on NF-B and p53 binding Figure 11 Alignment of IBa Ank3 and p53BP2/ASPP2 Ank5 structures are consistent with a shared binding mechanism between NF-B binding ankyrins and p53 binding ankyrins that could be targeted by ZANK. IBα Ank3 (green) is shown superimposed on p53bp2/ ASPP2 Ank5 (Salmon). P53bp2/ASPP2 Ank5 is similar in length to NF-B binding short ankyrin loops but also shares structural features spatially conserved in location and orientation to long ankyrin loops such as IBα Ank3. p53BP2/ASPP2 Ank5 aa Tyr 424 aligns structurally a with IBα Ank3Tyr 181 (filled blue arrow). The terminal residue of ZANK (Tyr 245) could substitute for highly conserved aromatic tyrosine binding interaction between IBα Ank3 tyr 181 and NF-B, as well as coordinating corresponding interactions between ZANK and p53. Both short and long ankyrins and P53bp2/ASPP2 Ank5 share a highly conserved histidine corresponding to IBα Ank3 aa His 149 (dark blue arrow). A corresponding residue is present in ZANK (Val 203). P53bp2/ASPP2 Ank5 lacks a residue corresponding to IBα Ank3 aa His 184 (narrow stem blue arrow), but a corresponding residue (His 239) is present in ZANK.
in vitro could identify associations between certain EBV strain polymorphisms and cancer phenotypes.
A model of antagonistic interactions between ZANK NF-B and p53 protein binding and ZEBRA fos/jun DNA binding As outlined in this manuscript, the extreme carboxyl terminal exon of ZEBRA encodes an amino acid sequence termed ZANK that is functionally and evolutionarily distinct from DNA binding regions of ZEBRA protein. The distinct origin and structure of ZANK suggest that ZANK domain could have a regulatory role on the DNA binding regions of the ZEBRA protein in addition to cytoplasmic effects on NF-B and p53 proteins ( Figure 1). In particular, ZANK could modulate ZEBRA DNA binding through conformational changes in the Fos/Jun like region of ZEBRA (Figure 4) triggered by binding between ZANK and proteins of the NF-B and p53 family and vice versa, modulating the transcriptional activity of the cellular transcription machinery ( Figure  13).
Binding of ZANK to proteins of the NF-B and p53 families could stimulate, inhibit or have no effect of ZEBRA binding to DNA. As shown in Figure 13, structural considerations based on the partial structure of ZEBRA and ZANK support an antagonistic role between ZEBRA DNA binding and ZANK interactions with NF-B and p53 proteins. Based upon the crystal structure of ZEBRA bound to a ZRE/AP1 DNA binding oligonucleotide (Figure 1), it is apparent that DNA bound to ZEBRA causes the two DNA binding alpha helices to move away from each other. The movement of the ZANK regions at the termini of the helices in a region not bound to DNA is analogous to the blades of a scissor in the open position. A parallel orientation of the ZANK stems, similar to the closed blades of a scissor is required to form an ankyrin-like dimer capable of binding adjacent regions of NF-B and p53 proteins ( Figure  7).
Thus, ZANK could function as a NF-B and p53 protein dependent switch in the nucleus of EBV infected Blymphocytes, turning ZEBRA DNA binding off in the presence of high levels of nuclear NF-B and p53, and conversely activating ZEBRA DNA binding in the absence of nuclear NF-B and p53 protein ( Figure 13). In physiologic terms, these observations are consistent with high levels of active NF-B and p53 proteins present in the nucleus of latent B-lymphocytes during viral latency, which during viral latency would tend to block ZEBRA DNA binding and prevent activation of viral lytic growth. Conversely, transiently decreased levels of nuclear NF-B proteins in latently infected B- Later in the viral lytic replication cycle, the NF-B and p53 protein binding function of ZANK could further trigger cellular apoptosis and facilitate infectious viral release directly through cytoplasmic effects on NF-B and p53 protein. In T lymphocytes and possibly other non B-lymphocyte cells infected by EBV but incapable of supporting stable viral latency, ZEBRA is almost entirely present in the cell cytoplasm. In these cells, ZEBRA could initially inhibit NF-B and cause activation of p53 protein and cellular apoptosis as a mechanism of viral immunosuppression.
Transpliced RAZ, containing ZANK but not capable of ZEBRA DNA binding would then serve as an amplifier of the NF-B and p53 protein functions of ZEBRA, and a corresponding RAZ-like function might explain the non-DNA binding functions of a ZEBRA homologue in HSV-8, the etiologic agent of Kaposi's sarcoma and other human cancers. These possibilities could in principle be tested in vitro and in vivo through ZEBRA with ZANK domains engineered for either increased or reduced effects on NF-B and/or p53 protein through mutation of conserved ZANK amino acids interacting with NF-B and p53 protein identified with the structural models and analysis presented in this work.

Conclusions
Characterization of B cell lympho-proliferative disorders haa demonstrated that EBV viral latency proteins can transform and immortalize B-lymphocytes in vitro and in vivo. A causal role of the virus in carcinogenesis is suggested in some tumors because the viral genome sometimes remains in the malignant cell. However, the mechanism of viral carcinogenesis cannot simply be a function of cellular immortalization by the virus since although the vast majority of humans harbor EBV immortalized B lymphocytes, EBV-genome associated cancers are rare [36]. EBV infection may thus also be associated with carcinogenesis not only through cellular immortalization but also through transient mechanisms including destabilization of host genomes [37][38][39][40].
A novel role of ZEBRA and ZANK in viral carcinogenesis is suggested in this work. ZEBRA plays a critical role in the viral latent to lytic switch, possibly in part through response to NF-B and p53 activities by ZANK ( Figure 13). Alternating cycles of partial lytic growth triggered by ZEBRA activation and ZEBRA DNA binding followed by ZANK interactions with NF-B and p53 could synergize with genomic instability contributed by cellular or viral recombinase activation in EBV infected cells. Partial cycles of viral activation and inactivation without completion of the viral life cycle could be particularly relevant in the "hit and run" scenario, proposed for viral carcinogenesis and immune dysregulation as proposed in epithelial breast cancers [41][42][43][44][45][46][47].
Notably, it is possible that viral inactivation of p53 originates indirectly or as an evolutionary "spandrel." A spandrel refers to a new phenotype generated by alteration of a related developmental pathway. Because of the evolutionary relationship between NF-B and p53 through descent from a common ancestral transcription factor, an unavoidable consequence of the inactivation of NF-B by an ankyrin-like region of ZEBRA is a spandrel-like transient inactivation of p53. Restated, the common origins of NF-B and p53 conserve a similar ankyrin binding region in both proteins, so any viral protein that targets this site on NF-B proteins could, in parallel, affect p53 generating new p53 dependent phenotypes. This may in part contribute to the selective loss of the IBα locus in tumors, a conserved feature of several EBV associated tumors [48,49]. The ZANK spandrel effects on p53 are maintained because of a selective benefit to the virus from related transient inactivation of the NF-B transcription family and the innate immune system, as well as effects of ZANK on regulation of ZEBRA DNA binding. Right: When ZEBRA is not bound to DNA either because of its cytoplasmic location or due to its binding to NF-B and/or p53 proteins, the two alpha helices of the ZEBRA dimer can adopt a parallel or closed conformation similar to the parallel helices of ankyrin bound to NF-B and/or p53 protein.
Mutant p53 proteins were first characterized as oncogenes because their presence was associated with tumor phenotypes including tumor growth and resistance to apoptosis [50][51][52]. Surprisingly, wild type p53 was later shown to be a tumor suppressor through the effects of the protein as a DNA-binding transcription factor, rather than a tumor promoter. These contradictory effects could result from opposing tumor promoting spandrel effects between numerous viral proteins such as ZEBRA and mutant p53 proteins, versus tumor suppressing DNA binding gene activation by wild type p53 protein. Thus ZANK might augment the effects of mutant p53, not directly related to p53 DNA binding, but rather due to effects on structurally similar regions of NF-B proteins required for cellular proliferation.
From an evolutionary perspective, it is likely that EBV encoded ZEBRA protein and a related protein in Human Herpes Virus 8 diverged from other BZIP DNA binding proteins such as Fos/Jun through viral capture of a terminal exon encoding a single ankyrin-like stem and loop. This domain was capable of dimerization into a structure resembling adjacent IB ankyrins, and was subject to positive selection through interactions with NF-B proteins. However, because of the shared ankyrin binding domain between NF-B and p53 proteins an unfortunate consequence of NF-B inhibition by ZANK would also unavoidably be inactivation of p53. This could be particularly important as a pathogenic mechanism of EBV-related genomic instability in epithelial cells and T lymphocytes that do not support the establishment of viral latency or replication but can sustain a limited form a viral lytic growth that includes high levels of expression of a trans-spliced protein "RAZ" containing ZANK, but without the DNA-binding transcriptional activity on ZRE. In this context, these cells are "dead end hosts" in their inability to support viral replication but can nevertheless suffer the pathophysiological and molecular consequences of viral infection. Viral pre-conditioning by common viral pathogens has been indirectly implicated in the clustering of human cancers both geographically and temporally based upon indirect evidence in human cancers that defective or abortive replication of common viral pathogens can account for clustering of cancers [53]. In the "viral conditioning" scenario, abortive cycles of viral replication can promote genomic instability. A model is presented here for the specific case of EBV. In the case of EBV and, potentially, other common viruses that interact with both NF-B and p53, transient lytic gene expression in both B lymphocytes and non-B cells could provide a mechanism for this viral pre-conditioning where defective or aborted viral replication contributes to malignancy as supported by indirect epidemiologic evidence [54][55][56]. These effects of ZANK could be of particular importance not only in B lymphocytes but in epithelial and T cell infection in which the virus expresses lytic gene products but in most cases does not establish latency [57]. In these cell types transient inactivation of the tumor suppressor p53 would occur in the context of activation of endogenous somatic recombination pathways, potentially resulting in re-arrangement, instability and mutation of endogenous oncogenes.
In summary, we have demonstrated that ZEBRA is a hybrid protein resulting from viral capture of both a Fos/Jun like transcription factor and a polymorphic IBlike ankyrin carboxyl terminus. Primary sequence and structural similarities are evident between the carboxyl terminus of ZEBRA protein and the conserved stem regions of IBα and p53BP2/ASPP2 ankyrins. It is suggested that a shared binding pocket in NF-B is in turn bound by all of these ankyrin proteins and that ZANK is analogous to a "master key" capable of fitting into both NF-B and p53 protein "locks". In the case of ZEBRA, protein binding to NF-B provides a selective advantage to the virus through inactivation of the immune response against the virus during viral lytic replication, simultaneously and unavoidably altering the activation of p53.