Complete genome analysis of a novel E3-partial-deleted human adenovirus type 7 strain isolated in Southern China

Human adenovirus (HAdV) is a causative agent of acute respiratory disease and is prevalent throughout the world. Recent reports found that the HAdV-3 and HAdV-5 genomes were very stable across 50 years of time and space. However, a growing number of recombinant genomes have been identified in emergent HAdV pathogens, and recombination is a pathway for the molecular evolution of types. Here we describe a HAdV-7 strain, GZ07, isolated from a child with acute respiratory disease, whose genome carries a partial deletion of the E3 region. The whole genome is 32442 bp long, with 2864 bp deleted in the E3 region, and was annotated in detail (GenBank: HQ659699). Its growth characteristics were the same as those of another wild-type HAdV-7 strain with no gene deletion. Comparison with the E3 regions of other HAdV-B strains showed that only the two left-end proteins remained: the 12.1 kDa glycoprotein and the 16.1 kDa protein. The E3 MHC class I antigen-binding glycoprotein, hypothetical 20.6 kDa protein, 20.6 kDa protein, 7.7 kDa protein, 10.3 kDa protein, 14.9 kDa protein and E3 14.7 kDa protein were all missing. This is the first report of an E3 deletion in a human adenovirus, and it suggests that the E3 region is also a possible recombination region in adenovirus molecular evolution.


Introduction
Human adenoviruses (HAdVs) are implicated in a wide range of human diseases, including respiratory, ocular, metabolic, renal and gastrointestinal diseases. They are responsible for 5-10% of lower respiratory tract infections in infants and children throughout the world. HAdV-7, a member of subspecies B1, causes acute respiratory disease (ARD). This pathogen is identified in epidemics, is highly virulent and is associated with clinical manifestations of considerable severity, including residual lung damage and fatal outcomes [1]. Previous reports suggested that the HAdV-3 and HAdV-5 genomes are very stable across 50 years of time and space [2,3], which is common in DNA viruses. HAdVs in general, however, are known to undergo recombination. Earlier studies demonstrated recombination in vitro. More recently, a growing number of isolates from adenovirus epidemics have shown recombination between adenovirus types, which has led to new "intermediates" or subtypes [4]. This evidence supports the hypothesis that genome recombination drives the molecular evolution of HAdV types. In this study, we describe a HAdV-7 strain isolated from a child with acute respiratory disease in which a large portion of the E3 region is deleted. The whole genome was annotated (GenBank: HQ659699). The finding suggests that the E3 region is also important in adenovirus recombination and molecular evolution.

Cells, virus and preparation of viral DNA
The virus strain in this study (designated HAdV-7 GZ07) was isolated from nasal aspirates of a child with ARD in southern China in 2007. The nasal aspirate specimen was inoculated onto HEp-2 cells for isolation; the cells were maintained in minimal essential medium supplemented with 100 IU/ml penicillin, 100 μg/ml streptomycin and 2% (v/v) fetal bovine serum. The cells were observed for cytopathic effect (CPE) for 1-2 weeks, and virus in the supernatant was identified by a neutralization assay with type-specific reference antisera raised in rabbits by conventional procedures. Type-specific primers designed to the hypervariable regions (HVRs) of the HAdV hexon gene were also used to confirm the serotype. Viral DNA was extracted by a previously described method [5,6].

DNA restriction analysis
Restriction analysis was performed with six restriction endonucleases (BamHI, EcoRI, EcoRV, HindIII, SalI and SmaI). The restriction profiles were compared with those of the prototype and of other genome types described in the literature, following the genome-type denomination system [7].
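The comparison of restriction profiles can be illustrated in silico. The following is a minimal sketch, not the procedure used in this study (which relied on digestion of viral DNA): it locates the recognition sites of the six enzymes in a linear sequence and returns the expected fragment lengths. The recognition sequences and cut positions are the standard ones for these enzymes; the function names and the toy sequence are hypothetical and chosen only for illustration.

```python
# Recognition sequence and top-strand cut offset for each enzyme in the text.
# All six sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "EcoRV": ("GATATC", 3),    # GAT^ATC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "SalI": ("GTCGAC", 1),     # G^TCGAC
    "SmaI": ("CCCGGG", 3),     # CCC^GGG
}


def cut_positions(seq, enzyme):
    """Return 0-based cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzyme):
    """Return fragment lengths of a complete digest, largest first."""
    cuts = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return sorted((b - a for a, b in zip(cuts, cuts[1:])), reverse=True)


# Toy sequence (made up, not adenovirus data) with two BamHI sites.
toy = "AAAA" + "GGATCC" + "TTTTTTTT" + "GGATCC" + "CC"
bamhi_fragments = fragment_lengths(toy, "BamHI")
ecori_fragments = fragment_lengths(toy, "EcoRI")
```

Two genomes with the same list of fragment lengths for an enzyme would be expected to give the same banding pattern for that enzyme, whereas a gained or lost site, or a deletion between two sites, changes the list.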

DNA sequencing and analysis
Based on the published sequences of HAdV-7 and other types, PCR primer pairs were designed to amplify fragments of HAdV-7 GZ07 from the isolated viral DNA. These fragments were either cloned and then sequenced or sequenced directly from the amplicon. The genome was sequenced by primer walking with overlapping sequencing reactions. To confirm the exact ends of the ITR sequence, the method described by Zhang [5] was followed. All reported sequences are the result of at least three sequencing reactions. The sequencing reactions were carried out with an ABI Prism BigDye Terminator v3.1 Cycle Sequencing Ready Reaction kit with AmpliTaq DNA polymerase on an ABI 3730 DNA sequencer (Applied Biosystems). Unresolved and ambiguous sequences were resequenced with primers close to the regions in question.
Sequence assembly was carried out with the program SeqMan 5.00 from the DNASTAR software package. The genome sequence of HAdV-7 GZ07 was first searched against GenBank using the megablast program, then annotated by parsing the 32442 bases into 1-kb nonoverlapping segments, which were queried systematically against the nonredundant NCBI database using the BLASTX program [8]. Default parameters of word size = 3 and expectation = 10, with the BLOSUM62 substitution matrix and gap penalties of 11 (existence) and 1 (extension), were applied to these analyses. Low-complexity sequences were filtered out of the queries, as per the BLAST algorithm. Genome annotation and analysis of noncoding DNA motifs and functional protein motifs were performed with the web-based gene prediction software GENEMARK [9], and putative proteins were identified with blastp from NCBI (http://www.ncbi.nlm.nih.gov/BLAST/).
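The segmentation step can be sketched as follows. This is an illustrative example only: it assumes that "1-kb" means 1000 bases and that the final, shorter segment is kept, neither of which is stated explicitly above. The function name is hypothetical, and a placeholder string of the same length stands in for the actual genome sequence.

```python
def split_into_segments(seq, size=1000):
    """Split a sequence into nonoverlapping segments.

    Returns (start, end, subsequence) tuples with 1-based inclusive
    coordinates; the last segment may be shorter than `size`.
    """
    return [
        (i + 1, min(i + size, len(seq)), seq[i:i + size])
        for i in range(0, len(seq), size)
    ]


# Placeholder of the reported genome length (32442 bases), not real sequence.
placeholder_genome = "N" * 32442
segments = split_into_segments(placeholder_genome)

n_segments = len(segments)                    # 33 segments
last_start, last_end, last_seq = segments[-1]  # 32001..32442, 442 bases
```

Under these assumptions the 32442-base genome yields 33 query segments, 32 of 1000 bases and a final one of 442 bases, each of which would be submitted as a separate BLASTX query.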
Whole-genome alignment and comparison of the HAdV sequences were performed with the dot-plot software Advanced PipMaker (http://pipmaker.bx.psu.edu/cgi-bin/pipmaker?advanced), which aligns long genomic DNA sequences quickly and with good sensitivity [10].
The E3 region of the HAdV-7 GZ07 strain was analyzed and compared with those of the other HAdV-B strains.
CLUSTALX was used to perform multiple-sequence alignments of adenovirus E3 sequences. Phylogenetic analysis was performed with the MEGA software package (version 4.1). The phylogenetic trees were constructed with the neighbor-joining method. Bootstrap analysis was performed with 1,000 pseudoreplicates.
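To make the bootstrap step concrete, the sketch below shows how pseudoreplicate alignments are generated by resampling alignment columns with replacement, together with a simple pairwise distance. It is an illustration under stated assumptions, not a reproduction of the MEGA analysis: the distance model used in MEGA is not specified above, so the p-distance here is an assumption, tree construction itself is not shown, and the function names and toy alignment are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to equal-length aligned sequences.
    The same sampled columns are used for every sequence.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (made up, not adenovirus data).
toy_alignment = {
    "strain_A": "ACGTACGT",
    "strain_B": "ACGTACGA",
    "strain_C": "ACCTAC-A",
}

rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
d_ab = p_distance(toy_alignment["strain_A"], toy_alignment["strain_B"])
```

In a full analysis, a distance matrix and a neighbor-joining tree would be computed for each of the 1,000 pseudoreplicates, and the bootstrap value of a branch is the percentage of replicate trees that contain it.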

Confirmation of serotype and genome type
Typical CPE was observed in cells inoculated with the HAdV-7 GZ07 strain, and the virus was specifically neutralized by mouse serum against HAdV-7. The type-specific PCR assay also indicated that this strain was serotype 7. Genome typing by restriction profiles revealed differences between this strain and the Gomen strain (Figure 1).

General properties of the HAdV7-GZ07 genome sequence
The complete genome of HAdV7-GZ07 is 32442 bp in length, with a base composition of 26.1% G, 26.0% C, 23.1% A and 24.8% T. The G+C content is 52.1%, which is similar to that of other members of HAdV-B (50-52%) (Shenk, 2001). We identified 41 coding regions that are homologous to previously described gene products of other human adenoviruses. The annotation of the predicted coding regions is listed in Additional file 1: Table S1 (GenBank: HQ659699).
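The base composition and G+C content reported above can be computed from a sequence as in the following minimal sketch. The function names and the toy sequence are hypothetical; the reported percentages are taken from the text and are used only to check that they are internally consistent (G + C = 52.1%, and the four bases sum to 100%).

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq):
    """Return the G+C percentage of a sequence."""
    composition = base_composition(seq)
    return composition["G"] + composition["C"]


# Toy sequence (made up): 3 G, 3 C, 2 A, 2 T.
toy = "GGGCCCAATT"
toy_composition = base_composition(toy)
toy_gc = gc_content(toy)

# Consistency check on the percentages reported in the text.
reported = {"G": 26.1, "C": 26.0, "A": 23.1, "T": 24.8}
reported_gc = round(reported["G"] + reported["C"], 1)
reported_total = round(sum(reported.values()), 1)
```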

Whole genome comparison and E3 region analysis
The HAdV-7 Gomen strain genome was chosen as the reference for whole-genome comparison [11]. PipMaker analysis indicated an obvious genome deletion in the HAdV7-GZ07 strain (Figure 2). On closer inspection, 2864 bp were deleted in the E3 region, between nt 28365 and nt 31228; only the two left-end proteins remained, 12

E3 region phylogenetic analysis
A phylogenetic tree was constructed from the multiple alignment of the E3 region sequences using MEGA 4.1 with the neighbor-joining method (Figure 3). The tree shows the phylogenetic relationships among the selected adenovirus isolates. The HAdV-7-GZ07 strain is very closely related to the Gomen strain.

Discussion
Although E3 is non-essential for viral replication in vitro, experiments with both mice and cotton rats have shown that it plays an important role in pathogenesis [12,13]. The size and composition of the E3 transcription unit vary considerably among Ad species. The E3 region within adenovirus genomes encodes proteins that modulate the host immune response to infection and are not essential for viral growth in vitro [14]. The HAdV-7 E3 region was found to encode the 12.1-, 16.1-, 19.3-, 7.7-, 10.3-, 14.9- and 14.7-kDa proteins. Additionally, two different 20.6-kDa proteins were contained within this transcript [11]. The 12.1-kDa protein has significant identity to an immunomodulating E3 protein in the HAdV-7 Gomen strain. A glycoprotein of 16.1 kDa has homologs in other HAdV species. The 19.3-kDa protein is a major histocompatibility class I antigen-binding glycoprotein that prevents the lysis of adenovirus-infected host cells by cytotoxic T lymphocytes [15]. Both 20.6-kDa proteins are similar to the CR1 (conserved region 1)-containing proteins in the E3 region of other HAdVs and SAdVs. CR1 alpha and beta were described as gene products specific to species HAdV-A [16]. Prediction of transmembrane domains suggested that both gene products are type Ia transmembrane proteins. The 7.7-kDa protein is reported to insert itself into the host cell membrane; its function is yet to be determined. The E3 7.7 K ORF appears to be another area of the Ad genome in which genetic diversity may be generated by illegitimate recombination [17]. A HAdV E3 transmembrane protein has identity to the HAdV-7 10.3-kDa protein, which may have a role in downregulating the epidermal growth factor (EGF) receptor [15]. The known RID (receptor internalization and degradation) alpha and beta proteins are present in the E3 transcription units of all HAdVs.
Both proteins are noncovalently associated integral membrane proteins. YxxΦ motifs function as signals for transport and internalization into lysosomes/endosomes, and RID alpha is a hydrophobic protein that appears in two isoforms [16]. The last two ORFs of the E3 region encode the 14.9- and 14.7-kDa proteins, which are present in all HAdV species [18]. The 14.7-kDa protein has been shown to be located in the cytosol and nucleolus, functioning as an inhibitor of TNF-mediated cell lysis [19].
Homologous recombination has been recognized as an important mechanism in the evolution of adenovirus genomes [20]. In some types, e.g. HAdV-3 and HAdV-5, the genomes are very stable [2,3,21]. However, a growing number of reports describe recombination between adenovirus subtypes, which has led to new types [22]. Illegitimate recombination has previously been proposed to contribute to Ad evolution by driving hexon sequence variation and serotype differentiation [23,24]. Hypervariability in the hexon gene among Ad serotypes can be explained as a response to host immune pressure [25]. The detailed mechanism of adenovirus recombination is not known. Some species or types may be amenable to recombination because of sequence features, e.g., hotspots, and biology, e.g., cell tropism and coinfection [21]. Human recombinase proteins may also have a propensity to bind certain sequences in adenovirus genomes.

Additional material
Additional file 1: Table S1. HAdV7-GZ07 strain genome-sequence annotation. DNA sequence motifs and forty-three coding regions are identified and located on the HAdV7-GZ07 genome sequence. Hypothetical and predicted proteins are marked as 'Hypo.'. Nucleotide positions of the start/stop codons and of the applicable splice sites are noted in the 5' to 3' direction. Features encoded on the complementary strand are designated by 'c'.