Mimivirus reveals Mre11/Rad50 fusion proteins with a sporadic distribution in eukaryotes, bacteria, viruses and plasmids
© Yoshida et al; licensee BioMed Central Ltd. 2011
Received: 8 July 2011
Accepted: 7 September 2011
Published: 7 September 2011
The Mre11/Rad50 complex and the homologous SbcD/SbcC complex in bacteria play crucial roles in the metabolism of DNA double-strand breaks, including DNA repair, genome replication, homologous recombination and non-homologous end-joining in cellular life forms and viruses. Here we investigated the amino acid sequence of the Mimivirus R555 gene product, originally annotated as a Rad50 homolog, and later shown to have close homologs in marine microbial metagenomes.
Our bioinformatics analysis revealed that the R555 protein sequence consists of an N-terminal Mre11-like domain fused to a C-terminal Rad50-like domain. A systematic database search revealed twelve additional cases of Mre11/Rad50 (or SbcD/SbcC) fusions in a wide variety of unrelated organisms, including unicellular and multicellular eukaryotes, the megaplasmid of a bacterium associated with deep-sea hydrothermal vents (Deferribacter desulfuricans) and the plasmid of Clostridium kluyveri. We also showed that R555 homologs are abundant in the metagenomes from different aquatic environments and that they most likely belong to aquatic viruses. The observed phyletic distribution of these fusion proteins suggests their recurrent creation and lateral gene transfers across organisms.
The existence of the fused version of protein sequences is consistent with known functional interactions between Mre11 and Rad50, and the gene fusion probably enhanced the opportunity for lateral transfer. The abundance of the Mre11/Rad50 fusion genes in viral metagenomes and their sporadic phyletic distribution in cellular organisms suggest that viruses, plasmids and transposons played a crucial role in the formation of the fusion proteins and their propagation into cellular genomes.
Keywords: Fusion genes; Viruses; Mimivirus; Viral gene pool; DNA repair; Replication; SbcD/SbcC
DNA double-strand breaks (DSBs) are a major cause of genomic instability and can lead to chromosomal aberration and cancer [1, 2]. DSBs occur in intermediate steps during normal DNA metabolic processes such as genome replication, meiotic recombination and programmed DNA rearrangement, but are also caused by DNA-damaging agents, including ionizing radiation and genotoxic chemicals. All cellular organisms possess a set of conserved proteins to cope with this dangerous form of DNA damage; the Mre11/Rad50 complex in eukaryotes/archaea and its bacterial homologs (the SbcD/SbcC system) are key players in DSB metabolism, generating a recombinogenic 3' overhang. Due to their ubiquitous presence in cellular organisms, it has been suggested that the last universal common ancestor (LUCA) already possessed this system. Mre11 and SbcD are nucleases belonging to the calcineurin-like phosphoesterase family. Rad50 and SbcC are ABC ATPases belonging to the Structural Maintenance of Chromosomes (SMC) superfamily, and exhibit a long coiled-coil domain (~500 Å when fully stretched) used to bridge two DNA molecules. The Escherichia coli SbcD/SbcC complex has an affinity for DNA hairpins and is known to generate DSBs at these sites [2, 5]. Homologous enzymes are also found in viruses; for instance, T4 phage encodes a Rad50 homolog (gp46) and an Mre11 homolog (gp47). These proteins are involved in recombination-dependent DNA replication, an elegant solution to the end replication problem of linear viral genomes [6, 7]. In bacteria, archaea and T4, the nuclease and the ATPase are encoded in an operon [2, 8].
Mimivirus (Acanthamoeba polyphaga mimivirus; APMV), a giant dsDNA virus infecting Acanthamoeba spp., is the prototype of the Mimiviridae family, the latest addition to the nucleocytoplasmic large DNA virus (NCLDV) group. Mimivirus is the largest among known viruses in both particle size (~750 nm) and genome length (a 1.2-Mb genome encoding 1018 genes) [9–11]. The Mimivirus genome encodes at least eight putative DNA repair enzymes capable of correcting mismatches or errors induced by oxidation, UV irradiation and alkylating agents, including the recently analyzed putative mismatch repair enzyme MutS7 (L359), which exhibits a distinctive domain organization shared by other giant viruses. Mimivirus R555, originally annotated as a Rad50 homolog, is part of this uniquely complete Mimivirus DNA repair tool box. The R555 gene product specifically attracted our attention following a study that pointed out the existence of numerous homologs closely related to R555 in a marine metagenomic data set. Monier et al. analyzed the sequences gathered by the Global Ocean Sampling (GOS) Expedition and identified 5,293 homologs for 229 Mimivirus proteins in the metagenomic data set (mostly 0.1-0.8 μm size fractions). The number of such homologs per Mimivirus protein varied from 1 to 249. R555 homologs were found to be the most abundant with 249 GOS scaffold matches, followed by 189 matches for R382 (mRNA capping enzyme) and 185 matches for R322 (B-type DNA polymerase). Here, we analyzed the sequence of R555 using database searches and phylogenetic reconstruction, and assessed the abundance of its close homologs in another large metagenomic data set generated by the Metagenomic Profiling of Nine Biomes (BIOME), which consists of 42 viral (<0.22 μm size fractions with a concentration of viral DNAs) and 45 microbial metagenomes (typically >0.22 μm size fractions).
R555 encodes a fusion protein with Mre11-like and Rad50-like domains
Identification of similar fusion proteins in viruses, plasmids, bacteria and eukaryotes
To search for additional instances of Mre11/Rad50 (or SbcD/SbcC) fusion proteins, we performed PSI-BLAST searches against the UniProt database using position specific scoring profiles corresponding to COG0419 (SbcC) and COG0420 (SbcD) as queries. We found twelve sequences showing significant similarities to both profiles from bacteria, eukaryotes and a phage (Figure 1). Five bacteria (four firmicutes and one thermophilic bacterium of the phylum Deferribacteres) were found to possess an SbcDC fusion: the anaerobic soil bacteria Clostridium kluyveri DSM555 (A5F9P1_CLOK5) and C. kluyveri NBRC 12016 (B9E6H1_CLOK1), the colon bacterium Anaerostipes caccae DSM 14662 (B0MG68_9FIRM), the rumen-associated Ruminococcus sp. 5_1_39BFAA (C6JB59_9FIRM), and Deferribacter desulfuricans SSM1 (D3PEM5_9BACT), a thermophilic sulfur-reducing bacterium isolated from a deep-sea hydrothermal vent. The eukaryotes exhibiting a fused version of the protein sequence were the moss Physcomitrella patens subsp. patens (A9RK34_PHYPA), the slime molds Dictyostelium discoideum (Q8T663_DICDI) and Polysphondylium pallidum PN500 (D3AVR3_POLPA), the marine pennate diatom Phaeodactylum tricornutum CCAP 1055/1 (B7FXD6_PHATR), and the photosynthetic Prasinophyceae Micromonas pusilla CCMP1545 (C1N0B1_9CHLO) and Micromonas sp. RCC299 (C1FJU6_9CHLO), found throughout diverse marine environments. Finally, a similar fusion protein was also found in the Bacillus phage 0305phi8-36 (A7KV18_9CAUD). In both strains of C. kluyveri, the fusion proteins were encoded in their plasmids. In Deferribacter, the gene for the fusion protein was flanked by two sets of transposases (IS200/IS605) on the 308-kb megaplasmid. In the cases of Anaerostipes and the Bacillus phage, one of the P-loop NTPase segments normally part of SbcC was not observed. Mimivirus and the Bacillus phage encode only the fused version of the homologs. However, in all of the cellular organisms exhibiting a fused form of the gene except for Deferribacter, a normal set of genes separately encoding Mre11/Rad50 or SbcD/SbcC homologs was identified (Figure 1), suggesting that the role of the fused version of the proteins in these cellular organisms may differ from that of the regular Mre11/Rad50 (SbcD/SbcC) complexes.
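The selection criterion described above (a sequence must show significant similarity to both the SbcC and the SbcD profile) amounts to intersecting two hit lists. The following is a minimal sketch of that step only, not the authors' pipeline. The function name, the sequence identifiers, the E-values and the 1e-3 cutoff are all hypothetical placeholders; the source does not state the significance threshold that was used.

```python
def fusion_candidates(sbcc_hits, sbcd_hits, max_evalue=1e-3):
    """Return IDs of sequences that match BOTH profiles significantly.

    sbcc_hits / sbcd_hits map a sequence ID to the E-value of its best
    match against the COG0419 (SbcC) or COG0420 (SbcD) profile.
    max_evalue is an assumed cutoff, not a value taken from the paper.
    """
    significant_c = {sid for sid, e in sbcc_hits.items() if e <= max_evalue}
    significant_d = {sid for sid, e in sbcd_hits.items() if e <= max_evalue}
    # A fusion candidate must appear in both significant hit sets.
    return sorted(significant_c & significant_d)


# Illustrative placeholder data (not real search results).
sbcc = {"seqA": 1e-30, "seqB": 1e-20, "seqC": 0.5}
sbcd = {"seqA": 1e-25, "seqC": 1e-10, "seqD": 1e-8}
print(fusion_candidates(sbcc, sbcd))
```

A hit to both profiles is only a necessary condition: the two matches would still need to be checked for covering distinct, correctly ordered regions of the same sequence before calling it a fusion.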
The fusion form is compatible with known interactions observed in the Mre11/Rad50 complex
Structural analysis of the Thermotoga maritima Mre11/Rad50 complex revealed that Mre11 comprises a phosphodiesterase domain, an accessory DNA-binding "capping" domain and a C-terminal helix-loop-helix (HLH) domain, and that the HLH domain is involved in the direct interaction with the root of the Rad50 coiled-coil domain (additional file 1). Our sequence analysis shows that most of the newly identified fusion proteins exhibit a sequence region that is similar in length to the sequences of the Mre11 capping and HLH domains, except for the fusion protein of the Bacillus phage, which lacks this part of the sequence. Furthermore, three residues in the HLH domain previously suggested to be responsible for the interaction with Rad50 were found to be relatively well conserved in the fusion protein sequences (additional file 1). Finally, the fusion of Mre11 and Rad50 in this order from the N-terminus appears compatible with the structural organization of the T. maritima Mre11/Rad50 complex, as the C-terminal end of Mre11 is located in close proximity (~28 Å) to the N-terminus of Rad50. Such a distance would correspond to a linker of nine to twelve residues in an approximately extended conformation. The sequences of R555 and most of the newly identified fusion proteins appear to possess such extra residues between the Mre11-like and Rad50-like regions. This result suggests that the inter-domain interactions within the protein encoded by the fusion gene might mimic those observed in the T. maritima Mre11/Rad50 complex.
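The linker estimate above is a simple division of the gap to be spanned by the distance covered per residue. The sketch below reproduces a nine-to-twelve-residue range under an assumed per-residue rise of 2.4 to 3.2 Å. These rise values are an assumption chosen for illustration of an "approximately extended" chain; the source does not state which values were used.

```python
import math


def linker_residue_range(distance_angstrom, rise_min, rise_max):
    """Estimate how many residues are needed to span a distance.

    rise_min / rise_max are the assumed shortest and longest distance
    (in angstroms) advanced per residue along the linker.
    Returns (fewest, most) residues, rounding up to whole residues.
    """
    fewest = math.ceil(distance_angstrom / rise_max)  # most extended chain
    most = math.ceil(distance_angstrom / rise_min)    # least extended chain
    return fewest, most


# ~28 angstroms between the Mre11 C-terminus and the Rad50 N-terminus;
# the 2.4-3.2 angstrom rise per residue is an illustrative assumption.
print(linker_residue_range(28.0, rise_min=2.4, rise_max=3.2))  # (9, 12)
```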
Fused versions of protein sequences are distinct from the canonical Mre11/Rad50 or SbcD/SbcC sequences
A more comprehensive evolutionary picture emerged when regular (i.e., "non-fused") versions of sequences were included in our phylogenetic analysis (Figure 2(b)). In this phylogenetic tree reconstruction, regular protein sequences were concatenated to obtain a multiple alignment with the fused protein sequences. In the resulting tree, the classical regular sequence versions showed groupings corresponding to the three domains of life: a bacterial clade represented by E. coli SbcD/SbcC, an archaeal Mre11/Rad50 clade, and a eukaryotic clade represented by the yeast and human Mre11/Rad50 proteins. These regular sequences forming three clades congruent with the three domains are hereafter referred to as the "canonical" Mre11/Rad50 (or SbcD/SbcC) homologs.
Remarkably, the thirteen fused versions of protein sequences were placed outside the three canonical clades, indicating that they are only distantly related to the experimentally characterized classical versions of Mre11/Rad50 homologs that are widespread within the cellular organisms. Furthermore, the fused sequences did not form a monophyletic group of their own, showing no specific affinity with each other. Non-fused and non-canonical versions of sequences from several bacteria (Thermoanaerobacterium, Dehalococcoides, Chloroflexus) and phages were found intercalated among the branches leading to the fused protein sequences. As previously described, we identified non-fused versions of Mre11/Rad50 or SbcD/SbcC homologs in ten cellular organisms that also harbor a fused version of the gene (Figure 1). These non-fused versions of proteins were clearly placed within the canonical clades, suggesting that the fused versions of genes in these bacteria and eukaryotes are not derived from the regular counterparts in the same cellular genome.
R555 homologs from the GOS and BIOME metagenomes originate in viruses
While investigating the DNA repair functions encoded by the Mimivirus genome, we discovered that R555, initially annotated as a Rad50 homolog, is in fact a fusion between Mre11 and Rad50, two proteins known to be involved in the repair of DNA double-strand breaks. Using the R555 sequence as a template, we then found that similar fusion proteins are present in a wide variety of unrelated organisms: phage, bacteria, unicellular and multicellular eukaryotes, albeit with a highly sporadic distribution. To our knowledge, this is the first description of the existence of fusion genes encoding both Mre11-like and Rad50-like domains.
Interestingly, some of the fusion genes were identified in plasmids (Clostridium kluyveri and Deferribacter). In the case of Deferribacter, the plasmid-encoded sbcDC fusion gene was found flanked by insertion sequences (i.e., transposons). Our analysis of various metagenomic data sets revealed that close homologs of R555 are abundant in different aquatic environments and that they are most likely associated with viruses. Finally, our phylogenetic analysis indicated that the Mre11/Rad50 (or SbcDC) fusion proteins are only distantly related to the canonical versions of homologs and do not form a monophyletic group. No simple evolutionary scenario explains these observations. A possibility is that the cellular fusion genes were vertically derived from an ancestral gene present as a paralog of the canonical genes in an ancestral cellular organism such as LUCA. These cellular non-canonical genes were then lost from most cellular lineages but were recruited by viruses including the ancestor of Mimivirus. However, we consider this "cell-central" hypothesis unlikely as this scenario postulates numerous independent gene loss events in cellular organisms. A more likely explanation is to assume the presence of non-canonical genes in ancestral viruses and the occurrence of multiple gene fusion events.
In general, gene fusions are expected to facilitate or simplify the co-expression and assembly of protein domains initially encoded in separate genes. Such a physical link of two associated functions at the genomic level also enhances the probability of successful lateral gene transfers. Known examples of fused proteins in viruses include the primase and helicase domain fusions in large DNA viruses. More recently, the mismatch repair protein MutS of Mimivirus was suggested to combine functions normally encoded in separate proteins, thanks to the fusion of the classical mismatch recognition domains with a nicking endonuclease domain. The fusion proteins of this type, now classified in the MutS7 subfamily, are abundant in large DNA viruses as well as in environmental metagenomes, but are also present in a few distantly related cellular organisms (i.e., the mitochondria of octocorals and Epsilonproteobacteria). Our observations on the Mre11/Rad50 fusion proteins show an intriguing resemblance to those made on the MutS7 subfamily.
The apparent contradiction between their abundance in metagenomes and their sporadic distribution in unrelated (but mostly marine) cellular organisms suggests that the true niche of these protein variants is in viruses. Viruses are abundant in aquatic environments, are known to hold a huge genetic diversity that is still underrepresented in current databases, and have been suggested to be a site of new gene creation. We propose that viruses, plasmids and/or transposons might have played a key role in the emergence of these Mre11/Rad50-like fusion proteins as well as in their subsequent propagation into different cellular organisms. The non-fused and non-canonical versions of Mre11/Rad50 homologs from viruses (such as gp46/gp47 of T4 and other marine T4-like viruses) were found outside the canonical clades corresponding to the three domains of life. We confirmed that almost all T4-like viruses with complete genomes possess homologs of these genes (data not shown). This phylogenetic feature is consistent with our hypothesis that viruses are the evolutionary origin of the Mre11/Rad50 fusion proteins found in bacteria and eukaryotes.
Gene fusions probably occurred several times in different viral lineages, using an operon structure as an evolutionary template towards fused genes. Lateral transfers then possibly spread the fused genes into different viruses and cellular organisms. For instance, the presence of close homologs in mosses (Streptophyta), diatoms (stramenopiles) and green algae (Chlorophyta) suggests gene transfers among these very distantly related eukaryotes via unidentified intermediates. Forterre hypothesized that the cellular DNA informational proteins have been recruited independently in the three domains of life from different viruses, which shared a few common DNA processing enzymes such as the canonical Rad50/Mre11. In our phylogenetic tree (Figure 2(b)), cellular non-canonical versions (in either fused or non-fused forms) were found intercalated among the branches leading to non-canonical versions of viral proteins. It is possible to see this branching pattern as another (but more recent) case of the lateral flow of DNA processing genes from the virus gene pool to cellular genomes. Future efforts in generating much longer and deeper metagenomic reads are needed to better understand the evolution of these new types of proteins.
Another plausible role of R555 in Mimivirus may be associated with the replication of DNA hairpins. In E. coli, the SbcD/SbcC complex has an affinity for DNA hairpin structures (>200 bp stem) and is known to generate DSBs at these sites, which are then repaired by homologous recombination [2, 5]. The 3'-UTRs of most Mimivirus genes exhibit a palindromic sequence that serves as a polyadenylation site on the transcribed mRNA and tRNA molecules [11, 31]. These sequences have the potential to form hairpin structures (>12 bp stem). Out of the 581 Mimivirus genes for which the 3' transcript ends were mapped, 473 (81.4%) showed these potential hairpin structures. Given their large number in the Mimivirus genome, these hairpins, albeit short, may occasionally inhibit DNA replication. R555 might thus be involved in the process ensuring the correct replication of functionally important palindromic sequences.
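To make the notion of a "potential hairpin structure" concrete, the sketch below scans a DNA sequence for the longest perfectly complementary inverted repeat separated by a short loop. It is an illustrative toy, not the procedure used to map Mimivirus polyadenylation sites: the function name, the loop-length bounds and the example sequence are all hypothetical, and mismatches or G-T wobble pairs are not considered.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def longest_hairpin_stem(seq, min_loop=3, max_loop=10):
    """Length (bp) of the longest perfect hairpin stem found in seq.

    For every possible end of a 5' arm and every allowed loop length
    (assumed bounds), extend outward while the bases on the two arms
    are Watson-Crick complementary.
    """
    seq = seq.upper()
    n = len(seq)
    best = 0
    for arm_end in range(n):  # index of the last base of the 5' arm
        for loop in range(min_loop, max_loop + 1):
            i, j = arm_end, arm_end + loop + 1
            stem = 0
            while i >= 0 and j < n and COMPLEMENT.get(seq[i]) == seq[j]:
                stem += 1
                i -= 1
                j += 1
            best = max(best, stem)
    return best


# Hypothetical example: a 12-nt arm, a 4-nt loop, and the reverse
# complement of the arm.
example = "GGGAAACCCGGG" + "TTTT" + "CCCGGGTTTCCC"
print(longest_hairpin_stem(example))  # 12
```

A sequence would then be flagged as hairpin-forming when the returned stem length meets whatever threshold is chosen for the analysis.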
Homologs of the Mre11/Rad50 complex play crucial roles in DSB repair metabolism in cellular organisms. In this study, we showed that the Mimivirus R555 gene product corresponds to a fusion of Mre11-like and Rad50-like domains and that its close homologs are specifically abundant in aquatic viruses. Through a systematic database search, we also identified, for the first time, twelve additional cases of similar fusion protein sequences in unrelated cellular organisms as well as in another virus. The abundance of the Mre11/Rad50-like fusion genes in viral metagenomes and their sporadic phylogenetic distribution can be explained by recurrent creations of new variants of genes in viruses and their subsequent transfers to different cellular organisms, possibly relayed by plasmids or transposons.
Mre11/Rad50 and SbcDC fusion proteins were identified by PSI-BLAST using position specific scoring matrices for COG0419 (SbcC) and COG0420 (SbcD) as queries. Prediction of coiled-coil domains was performed using the Coiled-Coil Prediction Server (http://npsa-pbil.ibcp.fr/). Multiple sequence alignments were constructed using MAFFT ver. 6 with the E-INS-i option. The alignments were examined and columns with more than 50% gaps were trimmed prior to phylogenetic reconstructions. Maximum likelihood phylogenetic analysis was performed using PhyML ver. 3 with the JTT substitution model and 100 bootstrap replicates. We used MEGA5 (http://www.megasoftware.net/) for tree drawing.
The BIOME data set was downloaded from the CAMERA web site. We identified close homologs for the Mimivirus ORFs using the following procedure. First, all the Mimivirus ORF sequences were compared to the BIOME data set using TBLASTX (E-value <0.1). This initial search yielded 13,305 metagenomic reads matching Mimivirus ORFs. These 13,305 sequences were then searched against the UniProt database using BLASTX. This search yielded 869 metagenomic reads exhibiting their best match to Mimivirus ORFs (E-value <0.1). For each of these 869 sequences, we extracted the segment of the Mimivirus sequence that was aligned with the read. Next, this partial Mimivirus sequence was searched against the UniProt database (excluding Mimivirus entries in the database) using BLASTP. If the best score obtained from this BLASTP search was lower than the BLASTX score obtained for the alignment of the metagenomic read and the Mimivirus sequence, the metagenomic read was kept as a close homolog of the Mimivirus ORF.
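Two of the rules stated above lend themselves to a short illustration: trimming alignment columns with more than 50% gaps, and the final score comparison that decides whether a metagenomic read is kept as a close homolog. The sketch below is a hedged reconstruction of those two rules only. The function names, the in-memory data layout and the example values are hypothetical, and the BLAST searches themselves are not performed here.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop columns whose gap fraction exceeds max_gap_fraction.

    alignment is a list of equal-length aligned sequences using '-'
    for gaps. A column with exactly 50% gaps is kept, since only
    columns with MORE than 50% gaps are trimmed.
    """
    n_rows = len(alignment)
    kept = [
        column
        for column in zip(*alignment)
        if column.count("-") / n_rows <= max_gap_fraction
    ]
    if not kept:
        return ["" for _ in alignment]
    # Transpose the kept columns back into rows.
    return ["".join(row) for row in zip(*kept)]


def is_close_homolog(blastx_score_read_vs_mimivirus, best_blastp_score_other):
    """Keep a read when the aligned Mimivirus segment scores lower against
    every non-Mimivirus UniProt entry than the read scores against Mimivirus.
    """
    return best_blastp_score_other < blastx_score_read_vs_mimivirus


# Illustrative placeholder alignment and scores (not real data).
print(trim_gappy_columns(["ACG-", "A-G-", "TCG-", "T--A"]))
print(is_close_homolog(120.0, 85.0))
```

The comparison in `is_close_homolog` encodes the idea that the read is closer to the Mimivirus ORF than any other known protein is.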
List of abbreviations
DSB: DNA double-strand break
SMC: structural maintenance of chromosomes
APMV: Acanthamoeba polyphaga mimivirus
NCLDV: nucleocytoplasmic large DNA virus
GOS: global ocean sampling
BIOME: metagenomic profiling of nine biomes
recombination-dependent DNA replication
We thank Dr. Youri Timsit for critical reading of the manuscript, Dr. Chantal Abergel for advice on the structural analysis presented in this study, and two anonymous referees for their valuable comments on the initial manuscript. The IGS laboratory is supported, in part, by CNRS and the French National Research Agency (Grant # ANR-09-PCS-GENM-218, ANR-08-BDVA-003). TY was supported by JSPS Excellent Young Researchers Overseas Visit Program (21-7339).
- Lammens K, Bemeleit DJ, Mockel C, Clausing E, Schele A, Hartung S, Schiller CB, Lucas M, Angermuller C, Soding J, et al.: The Mre11:Rad50 Structure Shows an ATP-Dependent Molecular Clamp in DNA Double-Strand Break Repair. Cell 2011, 145: 54-66. doi:10.1016/j.cell.2011.02.038
- Storvik KA, Foster PL: The SMC-like protein complex SbcCD enhances DNA polymerase IV-dependent spontaneous mutation in Escherichia coli. J Bacteriol 2011, 193: 660-669. doi:10.1128/JB.01166-10
- Cromie GA, Connelly JC, Leach DR: Recombination at double-strand breaks and DNA ends: conserved mechanisms from phage to humans. Mol Cell 2001, 8: 1163-1174. doi:10.1016/S1097-2765(01)00419-1
- de Souza RF, Iyer LM, Aravind L: Diversity and evolution of chromatin proteins encoded by DNA viruses. Biochim Biophys Acta 2010, 1799: 302-318.
- Eykelenboom JK, Blackwood JK, Okely E, Leach DR: SbcCD causes a double-strand break at a DNA palindrome in the Escherichia coli chromosome. Mol Cell 2008, 29: 644-651. doi:10.1016/j.molcel.2007.12.020
- Kreuzer KN: Recombination-dependent DNA replication in phage T4. Trends Biochem Sci 2000, 25: 165-173. doi:10.1016/S0968-0004(00)01559-0
- Long DT, Kreuzer KN: Regression supports two mechanisms of fork processing in phage T4. Proc Natl Acad Sci USA 2008, 105: 6852-6857. doi:10.1073/pnas.0711999105
- Constantinesco F, Forterre P, Elie C: NurA, a novel 5'-3' nuclease gene linked to rad50 and mre11 homologs of thermophilic Archaea. EMBO Rep 2002, 3: 537-542. doi:10.1093/embo-reports/kvf112
- La Scola B, Audic S, Robert C, Jungang L, de Lamballerie X, Drancourt M, Birtles R, Claverie JM, Raoult D: A giant virus in amoebae. Science 2003, 299: 2033. doi:10.1126/science.1081867
- Raoult D, Audic S, Robert C, Abergel C, Renesto P, Ogata H, La Scola B, Suzan M, Claverie JM: The 1.2-megabase genome sequence of Mimivirus. Science 2004, 306: 1344-1350. doi:10.1126/science.1101485
- Legendre M, Audic S, Poirot O, Hingamp P, Seltzer V, Byrne D, Lartigue A, Lescot M, Bernadac A, Poulain J, et al.: mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus. Genome Res 2010, 20: 664-674. doi:10.1101/gr.102582.109
- Ogata H, Ray J, Toyoda K, Sandaa RA, Nagasaki K, Bratbak G, Claverie JM: Two new subfamilies of DNA mismatch repair proteins (MutS) specifically abundant in the marine environment. ISME J 2011.
- Monier A, Larsen JB, Sandaa RA, Bratbak G, Claverie JM, Ogata H: Marine mimivirus relatives are probably large algal viruses. Virol J 2008, 5: 12. doi:10.1186/1743-422X-5-12
- Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S, Yooseph S, Wu D, Eisen JA, Hoffman JM, Remington K, et al.: The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol 2007, 5: e77. doi:10.1371/journal.pbio.0050077
- Dinsdale EA, Edwards RA, Hall D, Angly F, Breitbart M, Brulc JM, Furlan M, Desnues C, Haynes M, Li L, et al.: Functional metagenomic profiling of nine biomes. Nature 2008, 452: 629-632. doi:10.1038/nature06810
- Hopfner KP, Karcher A, Shin DS, Craig L, Arthur LM, Carney JP, Tainer JA: Structural biology of Rad50 ATPase: ATP-driven conformational control in DNA double-strand break repair and the ABC-ATPase superfamily. Cell 2000, 101: 789-800. doi:10.1016/S0092-8674(00)80890-9
- Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, et al.: CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res 2009, 37: D205-210. doi:10.1093/nar/gkn845
- Aravind L, Koonin EV: Phosphoesterase domains associated with DNA polymerases of diverse origins. Nucleic Acids Res 1998, 26: 3746-3752. doi:10.1093/nar/26.16.3746
- Naom IS, Morton SJ, Leach DR, Lloyd RG: Molecular organization of sbcC, a gene that affects genetic recombination and the viability of DNA palindromes in Escherichia coli K-12. Nucleic Acids Res 1989, 17: 8033-8045. doi:10.1093/nar/17.20.8033
- Connelly JC, Kirkham LA, Leach DR: The SbcCD nuclease of Escherichia coli is a structural maintenance of chromosomes (SMC) family protein that cleaves hairpin DNA. Proc Natl Acad Sci USA 1998, 95: 7969-7974. doi:10.1073/pnas.95.14.7969
- Das D, Moiani D, Axelrod HL, Miller MD, McMullan D, Jin KK, Abdubek P, Astakhova T, Burra P, Carlton D, et al.: Crystal structure of the first eubacterial Mre11 nuclease reveals novel features that may discriminate substrates during DNA repair. J Mol Biol 2010, 397: 647-663. doi:10.1016/j.jmb.2010.01.049
- Hopfner KP, Karcher A, Craig L, Woo TT, Carney JP, Tainer JA: Structural biochemistry and interaction architecture of the DNA double-strand break repair Mre11 nuclease and Rad50-ATPase. Cell 2001, 105: 473-485. doi:10.1016/S0092-8674(01)00335-X
- Iyer LM, Koonin EV, Leipe DD, Aravind L: Origin and evolution of the archaeo-eukaryotic primase superfamily and related palm-domain proteins: structural insights and new members. Nucleic Acids Res 2005, 33: 3875-3896. doi:10.1093/nar/gki702
- Ogata H, Claverie JM: Unique genes in giant viruses: regular substitution pattern and anomalously short size. Genome Res 2007, 17: 1353-1361. doi:10.1101/gr.6358607
- Forterre P: Three RNA cells for ribosomal lineages and three DNA viruses to replicate their genomes: a hypothesis for the origin of cellular domain. Proc Natl Acad Sci USA 2006, 103: 3669-3674. doi:10.1073/pnas.0510333103
- Wang JC: Cellular roles of DNA topoisomerases: a molecular perspective. Nat Rev Mol Cell Biol 2002, 3: 430-440. doi:10.1038/nrm831
- Senkevich TG, Koonin EV, Moss B: Predicted poxvirus FEN1-like nuclease required for homologous recombination, double-strand break repair and full-size genome formation. Proc Natl Acad Sci USA 2009, 106: 17921-17926. doi:10.1073/pnas.0909529106
- Claverie JM, Abergel C, Ogata H: Mimivirus. Curr Top Microbiol Immunol 2009, 328: 89-121. doi:10.1007/978-3-540-68618-7_3
- Willer DO, Mann MJ, Zhang W, Evans DH: Vaccinia virus DNA polymerase promotes DNA pairing and strand-transfer reactions. Virology 1999, 257: 511-523. doi:10.1006/viro.1999.9705
- Hamilton MD, Nuara AA, Gammon DB, Buller RM, Evans DH: Duplex strand joining reactions catalyzed by vaccinia virus DNA polymerase. Nucleic Acids Res 2007, 35: 143-151.
- Byrne D, Grzela R, Lartigue A, Audic S, Chenivesse S, Encinas S, Claverie JM, Abergel C: The polyadenylation site of Mimivirus transcripts obeys a stringent 'hairpin rule'. Genome Res 2009, 19: 1233-1242. doi:10.1101/gr.091561.109
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389-3402. doi:10.1093/nar/25.17.3389
- Lupas A, Van Dyke M, Stock J: Predicting coiled coils from protein sequences. Science 1991, 252: 1162-1164. doi:10.1126/science.252.5009.1162
- Katoh K, Misawa K, Kuma K, Miyata T: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 2002, 30: 3059-3066. doi:10.1093/nar/gkf436
- Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52: 696-704. doi:10.1080/10635150390235520
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Mol Biol Evol 2011.
- Seshadri R, Kravitz SA, Smarr L, Gilna P, Frazier M: CAMERA: a community resource for metagenomics. PLoS Biol 2007, 5: e75. doi:10.1371/journal.pbio.0050075
- The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 2010, 38: D142-148.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.