Sequence analysis of the Epstein-Barr virus (EBV) BRLF1 gene in nasopharyngeal and gastric carcinomas

Background Epstein-Barr virus (EBV) has a biphasic infection cycle consisting of a latent and a lytic replicative phase. The product of the immediate-early gene BRLF1, Rta, is able to disrupt latency in epithelial cells and certain B-cell lines. Rta is also a frequent target of the EBV-induced cytotoxic T cell response. Although the protein is well characterized, little is known about polymorphism of the BRLF1 gene. Results The BRLF1 gene was successfully amplified from 34 EBV-associated gastric carcinomas (EBVaGCs), 57 nasopharyngeal carcinomas (NPCs) and 28 throat washing (TW) samples from healthy donors, and the products were analyzed by PCR-direct sequencing. Fourteen loci were affected by amino acid changes and 17 loci by silent nucleotide changes. According to the phylogenetic tree, 5 distinct subtypes of BRLF1 were identified; 2 subtypes, BR1-A and BR1-C, were detected in 42.9% (51/119) and 42.0% (50/119) of samples, respectively. The distribution of these 2 subtypes among the 3 types of specimens was significantly different: BR1-A was found preferentially in healthy donors, while BR1-C was more frequent in NPC biopsies. A silent A/G mutation was detected in all the isolates. Among the 3 functional domains, the dimerization domain of Rta was stably conserved, while the DNA binding and transactivation domains carried multiple mutations. Three of 16 CTL epitopes, NAA, QKE and ERP, were affected by amino acid changes. Epitope ERP was relatively conserved, whereas epitopes NAA and QKE harbored more mutations. Conclusions This first detailed investigation of sequence variations in the BRLF1 gene identified 5 distinct subtypes. Two subtypes, BR1-A and BR1-C, are the dominant genotypes of BRLF1. Subtype BR1-C is more frequent in NPCs, while BR1-A is preferentially present in healthy donors. BR1-C may be associated with the tumorigenesis of NPC.


Background
Epstein-Barr virus (EBV) is a ubiquitous human herpesvirus that infects over 90% of the world population. In addition to being the causal agent of infectious mononucleosis, EBV is tightly associated with various malignancies, including Hodgkin's disease, Burkitt's lymphoma (BL), nasopharyngeal carcinoma (NPC), and B and T cell lymphomas in immunocompromised individuals such as AIDS patients and organ transplant recipients [1,2]. It is also responsible for some gastric carcinomas (GC). EBV infection is found in 80-100% of gastric lymphoepithelioma-like carcinoma cases and 2-16% of common types of gastric adenocarcinoma [3][4][5][6]. In Northern China this rate is about 7.0% according to our previous study [7].
After primary infection, EBV establishes a lifelong, asymptomatic state in B cells. However, EBV can periodically reactivate and replicate in a lytic manner [8].
Understanding how viral latency is disrupted is a central focus in herpesvirus biology. Induction of the switch from latency to the lytic cycle is associated with expression of the immediate-early (IE) protein Rta (R transactivator), the product of the BRLF1 gene [9]. Rta is a 605-amino acid (AA) protein with no known cellular homologues. The N-terminus of Rta contains overlapping DNA binding (AA 1 to 320) and dimerization (AA 1 to 232) domains that do not correspond to any previously described DNA binding motif [10]. The transcriptional activation domain is found in the C-terminal region of the protein. An obligatory acidic activation domain (AA 520 to 605) contains highly conserved hydrophobic residues that are predicted to form alpha helices [10]. A weaker accessory activating domain contains two proline-rich subregions (AA 352 to 410 and 450 to 500).
ZEBRA, the product of the EBV BZLF1 gene, had been thought to be the only viral protein capable of initiating the lytic cycle [11][12][13][14]. In recent years Rta has been found to disrupt latency by activating Zp, the promoter of BZLF1, leading to expression of ZEBRA and thereby to stimulation of early lytic genes, DNA replication, and late gene expression [9,15,16]. There are a number of interesting differences between the induction of lytic EBV infection by BZLF1 and that by BRLF1. Zalani et al. [17] reported that Rta can disrupt viral latency in an epithelial cell-specific manner (in contrast to the ability of ZEBRA to disrupt latency in B cells), and that the mechanisms leading to disruption of EBV latency appear to be cell-type specific. It has also been demonstrated that BRLF1, but not BZLF1, requires activation of the p38 and c-Jun stress MAP kinase pathways for induction of lytic EBV infection [18], and also requires PI3 kinase activation [19].
Several components of the immune system contribute to the highly efficient control of virus replication and of the proliferation of immortalized, EBV-infected cells in healthy individuals; probably the most important are HLA-restricted specific cytotoxic T lymphocytes (CTLs). Because EBV can switch directly from the latent state into the lytic cycle without expressing any further latent proteins [20], CTLs directed against latent proteins might not be able to prevent ongoing viral replication. A CTL response directed against immediate-early (IE) proteins is therefore pivotal for controlling lytic activation of the virus. Rta has been demonstrated to have multiple epitopes recognized by EBV-specific CTLs [21,22]. Delineating the sequence variations of these CTL epitopes may help in developing effective control of EBV replication and cell proliferation.
A notable feature of EBV-associated malignancies is variation in incidence and the proportion of EBV-positive tumors in different geographic regions [23,24]. The disparity is poorly understood. To explore the potential association of the EBV-associated malignancies with integrated EBV sequence variations, as well as the possibility of a CTL-based control of EBV replication and cell proliferation, we analyzed the sequence variation of EBV BRLF1 gene in EBVaGCs, NPCs and healthy donors.

Sequence variation of BRLF1 gene
The sequence of the BRLF1 gene, which encodes 605 AAs, was successfully amplified in 34 EBVaGCs, 57 NPCs, and 28 TWs. All the sequences were compared with the prototype B95-8 sequence. Nucleotide changes were detected in 31 loci, 14 of which resulted in AA changes. Among the 17 loci with silent nucleotide changes, one (at 103654) showed an A/G interchange in all the specimens tested. The AA mutations translated from the sequence variations are summarized in Figure 1. According to the phylogenetic tree (Figure 2), 5 distinct subtypes of BRLF1 were identified among the 119 specimens, namely subtypes BR1-A, BR1-B, BR1-C, BR1-D, and BR1-E. Two subtypes, BR1-A and BR1-C, were dominant in the total specimens.
The subtype BR1-A, which was represented by NPC87, was detected in 42.9% (51/119) of samples. Forty specimens in this subtype had 2 common coding changes, 377 (Ala→Glu) and 542 (Ser→Asn), while 6 specimens had only the change at residue 377 and 5 specimens had only the change at residue 542. Interestingly, residue 489 showed different AA changes among different specimens, namely Gln→Arg in 28 specimens and Gln→Lys in 21 specimens. Besides these 3 residues, 3 isolates (TW165, TW121, and NPC6) showed a Val→Ile interchange at residue 479. Silent changes in this pattern were detected in 3 residues: 486 (CCG→CCA), 510 (GAA→GAG), and 572 (CCC→CCA) (data not shown). The prevalent AA mutations at residues 377, 489 and 542 of this subtype were identical to those of the GD1 strain, which is a representative EBV strain isolated from NPC patients in Guangdong, China [25]. Also, the BRLF1 gene in the C666-1 cell line, which was established from an undifferentiated NPC biopsy in Southern China [26], harbored only these three mutations.
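The three silent changes listed above can be checked against the standard genetic code. The following is a minimal sketch rather than part of the original analysis: the helper name `is_silent` and the table construction are illustrative, and it assumes the codon pairs are written in the coding orientation.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_silent(ref_codon: str, alt_codon: str) -> bool:
    """Return True when both codons encode the same amino acid."""
    return CODON_TABLE[ref_codon.upper()] == CODON_TABLE[alt_codon.upper()]


# Residue number -> (reference codon, variant codon), as listed in the text.
SILENT_CHANGES = {
    486: ("CCG", "CCA"),
    510: ("GAA", "GAG"),
    572: ("CCC", "CCA"),
}

for residue, (ref, alt) in SILENT_CHANGES.items():
    print(residue, ref, alt, CODON_TABLE[ref], CODON_TABLE[alt], is_silent(ref, alt))
```

Each pair differs only at the third codon position and encodes the same residue (Pro, Glu and Pro, respectively), which is what makes the changes silent.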
The remaining 3 subtypes were detected in only small numbers of specimens. The subtype BR1-B shared 2 common AA substitutions: residues 273 (Arg→Met) and 316 (Lys→Glu). An additional Gln/Arg interchange at residue 489 was detected in 6 specimens. The subtype BR1-D was detected in 5 specimens, which had 5 common residues: 273 (Arg→Met), 316 (Lys→Glu), 403 (Pro→Ser), 406 (Leu→Phe), and 414 (Gly→Val). The last subtype, BR1-E, which is represented by the strain GC4, had the same AA sequence as the prototype B95-8, except for a silent mutation at 103654. This subtype was found in only 3 EBVaGCs. The BRLF1 gene in the EBV-positive GC cell line SNU-719, which was established from a Korean GC patient [27], also showed a conserved sequence.
Figure 1 (caption, partially recovered): Isolates showing identical sequences are listed by a representative isolate in the second column. The following numbers, separated by "/", denote the number of identical sequences from EBVaGC, NPC and TW, respectively. Only sequences different from B95-8 are indicated. Lowercase letters denote nucleotides and capital letters denote amino acids. The GD1 sequence was taken from EBV genome AY961628 [25]. C666-1 and SNU-719 are two EBV-positive cell lines [26,27] whose sequences were obtained by PCR-direct sequencing.

Distribution of BRLF1 subtypes in EBVaGCs, NPCs, and TWs
The frequency of BRLF1 subtypes in EBVaGCs, NPCs, and TWs of healthy donors is summarized in Table 1.

Variation analysis in BRLF1 functional domains
As an important transactivator, BRLF1 gene harbors 3 vital functional domains: dimerization, DNA binding, and transactivation domains [10]. The AA mutations in BRLF1 functional domains were summarized in Table 2.
In this study, the domain of dimerization (AA 1 to 232) was found to be stably conserved, where no AA mutations were detected (data not shown). Five residues (273, 284, 288, 292 and 316) were detected to have AA

Variation analysis of CTL epitope sequences among EBV isolates
Sixteen CTL epitopes in Rta were identified in previous studies [22,28], 3 of which showed variations in the isolates examined. Variations of CTL epitopes are summarized in Table 3. The QKE epitope was affected by an S→N change at position 14 of the epitope, which was present in the majority of the specimens (

Discussion
In this study we analyzed the sequence variations of the BRLF1 gene in 34 EBVaGCs, 57 NPCs and 28 TWs from healthy donors. To our knowledge, this is the first report on the polymorphism of the BRLF1 gene in multiple tissues. Based on the phylogenetic tree, we identified 5 distinct subtypes of the BRLF1 gene in specimens from Northern China. Two subtypes, BR1-A and BR1-C, were dominant in the specimens observed. Subtype BR1-C was seen more often in NPC biopsies. It can be speculated that an EBV substrain carrying this subtype infects NPC more frequently, and that this subtype may be more closely associated with the tumorigenesis of NPC in Northern China. Feng et al. [29] demonstrated that BRLF1 is specifically expressed in NPC tumor cells. Further studies of BRLF1 polymorphism in wider areas, together with functional studies of subtype BR1-C, will help clarify the association between specific BRLF1 gene subtypes and EBV-associated malignancies. Unlike subtype BR1-C, BR1-A was significantly more frequent in healthy donors (67.8%) than in the EBVaGC (38.2%) or NPC (33.3%) group, suggesting that it is the dominant subtype of BRLF1 in the healthy population of the area studied. The prevalent mutations of this subtype were identical to those of the GD1 strain [25] and of the EBV strain in the NPC cell line C666-1, both of which originated from Southern China. Unfortunately, we were unable to compare the prevalence in our samples with that in Southern China, because distribution data for BRLF1 subtypes are not available for the populations there. Interestingly, a silent A/G mutation at 103654 was found in all the wild isolates, suggesting that this interchange may be a specific marker of the local EBV strains and that these strains may stem from a common ancestral virus different from the other BRLF1 groups.
The AA mutations in the functional domains and CTL epitopes from this study suggest that not only BRLF1 gene subtypes, but also mutations in functional domains and CTL epitopes exhibit specific distribution among 3 sample groups.
As a transcriptional activator, Rta plays an important role in the switch from latency to productive infection. Three domains, for dimerization, DNA binding, and transactivation, contribute to this function. In this study, we found that the dimerization domain was highly conserved without any AA mutations, suggesting that it has a critical function in lytic activation. The DNA binding domain was mainly affected by the mutations R→M at residue 273 and K→E at residue 316, and these changes were significantly more frequent in NPC. R→M at residue 273 may be of great importance because the interchange from a hydrophilic to a hydrophobic amino acid may alter the affinity of the protein for DNA, while the mutation at position 316 may contribute less to any change in DNA binding capacity, according to the results of Manet and colleagues [10]. Although multiple AA mutations were detected in the transactivation domain, only the mutation at residue 542, which lies within the 90 C-terminal AAs that are absolutely essential for the protein's transcriptional activation, may have a significant impact on transcriptional activity [10]. This may be of significance because it has been reported that variations in EBV-interacting molecules might alter DNA binding and transcriptional activity and thus may contribute to the tumorigenesis of EBV-associated malignancies [30]. Interestingly, these 3 dominant mutations affecting functional domains (R→M at residue 273, K→E at residue 316 and S→N at residue 542) were all included in the subtype BR1-C, but the two mutations at residues 273 and 316 were not detected in any isolate of subtype BR1-A (0/119). The subtype BR1-A is preferentially present in healthy donors, while BR1-C is more frequent in NPCs. Moreover, the two mutations at residues 273 and 316 in BR1-C were seen more often in NPC. These observations suggest that these two AA mutations may be of great importance in the carcinogenesis of NPC.
These two mutations can also be potentially used as a gene marker to distinguish subtype BR1-A from other subtypes in the area observed.
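One way to make the hydrophilic-to-hydrophobic argument concrete is to compare side-chain hydropathy before and after each substitution. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption made here for illustration (the study does not name a scale), and the function name `hydropathy_shift` is hypothetical. A change in hydropathy index does not by itself predict DNA binding affinity.

```python
# Kyte-Doolittle hydropathy indices for the residues involved
# (more positive = more hydrophobic). The choice of scale is an assumption.
KYTE_DOOLITTLE = {
    "R": -4.5,  # Arg
    "K": -3.9,  # Lys
    "E": -3.5,  # Glu
    "N": -3.5,  # Asn
    "S": -0.8,  # Ser
    "M": 1.9,   # Met
}


def hydropathy_shift(ref: str, alt: str) -> float:
    """Hydropathy of the variant residue minus that of the reference residue."""
    return round(KYTE_DOOLITTLE[alt] - KYTE_DOOLITTLE[ref], 1)


# Residue number -> (reference AA, variant AA) for the three dominant mutations.
DOMINANT_MUTATIONS = {
    273: ("R", "M"),
    316: ("K", "E"),
    542: ("S", "N"),
}

for residue, (ref, alt) in DOMINANT_MUTATIONS.items():
    print(f"{ref}{residue}{alt}: shift = {hydropathy_shift(ref, alt):+.1f}")
```

On this scale the change at residue 273 is by far the largest shift toward hydrophobicity, whereas the change at residue 316 barely alters hydropathy but does reverse the side-chain charge from positive (Lys) to negative (Glu).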
Studies have shown that EBV can elicit strong CTL responses, which are directed against a limited number of viral proteins [31][32][33]. Attention has focused on mutations of CTL epitopes in the latently expressed EBV proteins because of their important roles in the associated malignancies, while little is known about the proteins expressed in the lytic phase. In the present study, we found that 3 of the 16 identified CTL epitopes of Rta were affected by AA mutations (Table 3). Epitope ERP was affected in a few isolates, while epitopes NAA and QKE were frequently affected by mutations, with relatively lower mutation rates in the malignant groups (EBVaGC or NPC) than in healthy donors. This is contrary to the general belief that the viral strains associated with malignancies can evade immune surveillance by altering amino acids within CTL epitopes [34]. A CTL response directed against immediate-early (IE) proteins is pivotal for controlling lytic activation of the virus. The sequence analysis of all known Rta CTL epitopes provides valuable information for choosing target epitopes for the control of EBV lytic activation.

Conclusions
In conclusion, we have identified 5 distinct subtypes of BRLF1 in Northern Chinese EBV isolates from multiple types of clinical specimens. The subtype BR1-C is more frequent in NPCs, while BR1-A is preferentially present in healthy donors. Mutation analysis of the functional domains and CTL epitopes revealed a specific distribution of mutations among the 3 specimen groups. The impact of these alterations on the functions of Rta and on immunological recognition of EBV is potentially interesting and requires further functional studies. Further investigation in extended areas and in other EBV-associated diseases will enhance our understanding of BRLF1 gene polymorphisms and their association with tumors.

Specimens, cells and DNA extraction
Thirty-four EBVaGCs, 57 NPCs, 28 TWs and 2 EBV-positive cell lines (GC cell line SNU-719 and NPC cell line C666-1) were used in this study. Tumor tissues of GCs and NPCs were collected from major hospitals of Shandong Province in Northern China, a non-endemic area for NPC. EBV infection in GC and NPC tissues was determined by EBV-encoded small RNA (EBER) 1 in situ hybridization, as described previously [35]. TWs were collected from healthy donors in the same geographic region. EBV-positive TWs were identified by positive signals for the BamHI W fragment, using PCR with a BamHI W-specific primer pair [36]. The EBV-positive cell lines SNU-719 and C666-1 were maintained in RPMI 1640 supplemented with 10% fetal bovine serum (FBS) [26,27]. The B95-8 cell line was used as a source of the prototype EBV genome. All the carcinoma patients as well as the healthy individuals gave informed consent for the study, and the study was approved by the Medical Ethics Committee at the Medical College of Qingdao University, China.
DNAs used in this study were extracted from fresh specimens and cell lines by using the standard method with proteinase K digestion and phenol-chloroform purification. QIAamp DNA FFPE Tissue kit (QIAGEN GmbH, Hilden, Germany) was used to extract the DNA from paraffin-embedded tumor tissues.

Amplification of DNA
Specific oligonucleotide primers flanking the BRLF1 gene were designed for nested PCR (Table 4). In each set of PCR, DNA from the EBV-positive B95-8 cell line was used as the positive control, and nuclease-free distilled water served as the negative control. The first-round polymerase chain reaction (PCR) was performed in a total volume of 25 μl containing 1 × PCR reaction buffer, 100 ng of genomic DNA, 0.5 μM of each primer, 200 μM of each deoxyribonucleotide triphosphate, and 1 U Pfu Taq polymerase (TaKaRa Biotechnology Co., Ltd., Kyoto, Japan). PCR amplification was performed with an initial denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 1 min, and a final elongation step at 72°C for 10 min. BRLF1-A1 combined with BRLF1-A2 (splice1), BRLF1-B1 with BRLF1-B2 (splice2) and BRLF1-C1 with BRLF1-C2 (splice3) were used as the outer primers. When necessary, 2 μl of the PCR product was taken for a second round of PCR, using the internal primers BRLF1-A2 combined with BRLF1-A3 (splice1), BRLF1-B3 with BRLF1-B4 (splice2), and BRLF1-C3 with BRLF1-C4 (splice3). To prevent contamination, several measures were taken, such as frequently changing gloves and cleaning the equipment, using aerosol-resistant pipette tips for PCR, and performing different procedures in separate areas. The PCR products were analyzed by electrophoresis through a 1.2% agarose gel.
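The first-round cycling conditions can be written down as a small data structure, which makes the programmed hold time easy to check. This is a sketch only: the names `Step` and `total_hold_seconds` are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time at that temperature


# Values taken from the cycling conditions described in the text.
INITIAL_DENATURATION = Step("initial denaturation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 55, 30),
    Step("extension", 72, 60),
)
N_CYCLES = 35
FINAL_ELONGATION = Step("final elongation", 72, 10 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_ELONGATION.seconds


print(f"Programmed hold time: {total_hold_seconds() / 60:.0f} min")
```

Under these assumptions the program holds for 85 minutes in total per round; the actual run time would be longer once ramping is included.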
Sequencing analysis of PCR products
PCR products were purified using a gel extraction kit (QIAEX II; QIAGEN GmbH, Hilden, Germany), under the conditions specified by the manufacturer. PCR-amplified fragments were sequenced by means of a Prism ready reaction Dyedeoxy terminator cycle sequencing kit (Applied Biosystems, Foster City, CA, USA).

Data analysis
The sequence data were checked for any homology in BLAST (National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov/) and were compared with the B95-8 prototype strain. Alignments between sequences were analyzed using DNA Star software (DNASTAR, Inc, version 5.0). The sequences from representative samples were used to draw a phylogenetic tree. Either χ2 test or Fisher's exact test (2-sided) was performed to determine the distribution difference of the EBV variations among the EBVaGCs, NPCs, and the TWs from the healthy adults. Significance was set at P value < 0.05.
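As an illustration of the χ2 comparison described above, the sketch below tests whether the frequency of subtype BR1-A differs among the three specimen groups. The counts are an assumption: they are back-calculated from the percentages reported in the Discussion (67.8%, 38.2% and 33.3% of 28 TWs, 34 EBVaGCs and 57 NPCs) and from the total of 51 BR1-A isolates, not taken from Table 1. The function name is hypothetical, and the closed-form P value used here holds only for 2 degrees of freedom.

```python
import math

# Rows: specimen group; columns: (BR1-A, all other subtypes).
# Counts are reconstructed from the reported percentages (an assumption).
OBSERVED = {
    "TW": (19, 9),
    "EBVaGC": (13, 21),
    "NPC": (19, 38),
}


def chi_square_2dof(table):
    """Pearson chi-square test of independence for a table with 2 degrees of freedom."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)

    dof = (len(rows) - 1) * (len(col_totals) - 1)
    if dof != 2:
        raise ValueError("closed-form P value below is valid only for 2 degrees of freedom")

    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            chi2 += (observed - expected) ** 2 / expected

    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


chi2, p_value = chi_square_2dof(OBSERVED)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

With these reconstructed counts the statistic is roughly 9.55 (P of about 0.008), which falls below the 0.05 threshold and is consistent with the significant difference reported for BR1-A. For tables with small expected counts, Fisher's exact test would be the more appropriate choice, as noted in the text.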