Marine mimivirus relatives are probably large algal viruses
© Monier et al. 2008
Received: 09 November 2007
Accepted: 23 January 2008
Published: 23 January 2008
Acanthamoeba polyphaga mimivirus is the largest known ds-DNA virus and its 1.2 Mb-genome sequence has revealed many unique features. Mimivirus occupies an independent lineage among eukaryotic viruses and its known hosts include only species from the Acanthamoeba genus. The existence of mimivirus relatives was first suggested by the analysis of the Sargasso Sea metagenomic data.
We now further demonstrate the presence of numerous "mimivirus-like" sequences using a larger marine metagenomic data set. We also show that the DNA polymerase sequences from three algal viruses (CeV01, PpV01, PoV01) infecting different marine algal species (Chrysochromulina ericina, Phaeocystis pouchetii, Pyramimonas orientalis) are very closely related to their homolog in mimivirus.
Our results suggest that the numerous mimivirus-related sequences identified in marine environments are likely to originate from diverse large DNA viruses infecting phytoplankton. Micro-algae thus constitute a new category of potential hosts in which to look for new species of Mimiviridae.
The discovery of Acanthamoeba polyphaga mimivirus was a significant breakthrough in the recent history of virology. Both the mimivirus particle size (~750 nm) and its genetic repertoire (a 1.2 Mb genome encoding 911 protein-coding genes) are comparable to those of many parasitic cellular organisms [1, 2]. This giant virus exhibits several genes for translation system components, and its particle contains both DNA and RNA molecules. These features both quantitatively and qualitatively challenge the boundary between viruses and cells, and have reignited a smoldering debate about the origin of viruses and their role in the emergence of eukaryotes [4–9].
Mimivirus belongs to the nucleocytoplasmic large DNA viruses (NCLDVs). From its basal position in the phylogenetic trees based on conserved NCLDV core genes [1, 2], the new "Mimiviridae" family was proposed for mimivirus. NCLDVs now include Mimiviridae, Phycodnaviridae, Iridoviridae, Asfarviridae and Poxviridae. Mimivirus is the sole member of the Mimiviridae family. The lack of known close relatives of mimivirus makes it difficult to reconstruct the evolutionary history of its surprising features. Is mimivirus one of many eccentric creatures in nature such as Rafflesia, a parasitic plant in southeastern Asia known for its gigantic flower? Are the extraordinary characteristics of mimivirus linked to the origin of eukaryotes? Clearly, appraising the actual biological significance of this exceptional virus requires the isolation and characterization of additional members of the Mimiviridae family.
Mimivirus was initially isolated in amoebae sampled from the water of a cooling tower. Following the circumstances of its discovery, mimivirus was suspected to be a causative agent of pneumonia. The presence of antibodies recognizing mimivirus in the sera of patients with community or hospital-acquired pneumonia was reported [14, 15]. However, no serological evidence of mimivirus infection was found in hospitalized children in Austria and mimivirus has never been isolated from an infected patient despite numerous attempts. In the laboratory, mimivirus appears to infect only species of Acanthamoeba. Acanthamoeba are ubiquitous in nature and they have been isolated from diverse environments including freshwater lakes, river waters, salt water lakes, sea waters, soils and the atmosphere [18, 19]. Mimivirus relatives might thus exist everywhere.
Ghedin and Claverie identified sequences similar to mimivirus genes in the environmental sequence library from the Sargasso Sea. This strongly suggested the existence of mimivirus relatives in the sea. More recently, we found numerous additional "mimivirus-like" sequences in the much larger metagenomic data set generated by the Global Ocean Sampling Expedition (hereafter referred to as GOS data) (Monier et al., manuscript in preparation). However, the analysis of metagenomic data (i.e. short sequences from unknown and mixed organisms) provides no insight into the hosts likely to harbor the putative new species of Mimiviridae corresponding to these sequences.
While continually monitoring new occurrences of mimivirus-like sequences in public databases, we recently noticed that the type B DNA polymerase (hereafter referred to as PolB) sequences of three lytic viruses from Norwegian coastal waters were very similar to the PolB sequence of mimivirus. The three viruses [CeV01 (GenBank accession: ABU23716), PpV01 (ABU23718), PoV01 (ABU23717)] were isolated from diverse marine unicellular algae: Chrysochromulina ericina, Phaeocystis pouchetii and Pyramimonas orientalis, respectively [22, 23]. C. ericina and P. pouchetii are both haptophytes but are phylogenetically distant and classified in different orders, i.e. Prymnesiales and Phaeocystales. P. pouchetii forms dense and almost monospecific spring blooms, while C. ericina thrives in mixed flagellate communities at cell densities that usually do not attain bloom levels [24, 25]. P. orientalis is a prasinophyte belonging to the green algae. It has a worldwide distribution, but its abundance is most often low, with no significant contribution to the overall phytoplankton biomass [26, 27]. The three algal viruses infecting these phytoplankters have all been classified as phycodnaviruses.
In this report, we first analyzed the distribution of mimivirus-like sequences found in the GOS data and mapped them on the mimivirus genome. We then performed phylogenetic analyses which indicated a very close relationship between the PolB sequences of mimivirus and the three algal viruses (CeV01, PpV01, PoV01), as well as with their homologs from the metagenomic data set.
A selected list of mimivirus genes with closely related sequences in the GOS data.
Number of "mimivirus-like" sequences in the GOS data
NCLDV class I core genes
Helicase III/VV D5-type ATPase (C-term)
Helicase III/VV D5-type ATPase (N-term)
DNA polymerase (B family)
putative transcription termination factor, VV D6R helicase
VV A18 helicase
S/T protein kinase
Major capsid protein
VV A32 virion packaging ATPase
A1L transcription factor
Thiol oxidoreductase E10R
NCLDV class II core genes
TFII-like transcription factor
Proliferating Cell Nuclear Antigen
NCLDV class III core genes
SW1/SNF2 helicase (MSV224)
mRNA Capping Enzyme
PBCV1-A494R-like, 9 paralogs
Translation initiation factor 4E, (mRNA cap binding)
GTP binding elongation factor eF-Tu
Hydrolysis of DNA containing ring-opened N7 methylguanine
DNA mismatch repair ATPase MutS
Alkylated DNA repair
Endonuclease for the repair of UV-irradiated DNA
Other genes with more than 100 matches
putative transcription initiation factor IIB
Lon domain protease
NAD-dependent DNA ligase
Metal-dependent hydrolase (Chilo iridescent virus 136R)
putative NTPase I
TATA-box binding protein (TBP)
putative DNA repair protein
Contains helicase conserved C-terminal domain (PFAM)
CeV01, PpV01 and PoV01 were initially isolated from Norwegian coastal waters. An electron cryomicroscopic analysis revealed the icosahedral capsid of PpV01 particles with a maximum diameter of 220 nm. Icosahedral morphology was also suggested for CeV01 (160 nm) and PoV01 (220 × 180 nm) from the observations by transmission electron microscopy. The genomes of these viruses are composed of double-stranded DNA, with estimated sizes being 510-kb for CeV01, 485-kb for PpV01 and 560-kb for PoV01 [22, 30]. The genome sizes are substantially larger than the currently sequenced largest phycodnavirus genome (i.e. 407-kb for EhV-86). Electron microscopy observations of infected cells indicate that viral assembly takes place in the cytoplasm of all three host cells [22, 32]. Given these features, these three lytic algal viruses are tentatively classified as phycodnaviruses.
Previous studies have indicated a relatively close phylogenetic relationship and a similarity in gene composition between phycodnaviruses and mimivirus. Several phycodnaviruses exhibit the largest genome sizes (>300-kb) after mimivirus. Claverie et al. have hypothesized that Phycodnaviridae is a promising source of giant viruses. In this study, we present phylogenetic evidence for a close relationship between the PolB sequences of three algal viruses (CeV01, PpV01, PoV01) and mimivirus, and for the segregation of these from homologs of other known viruses. PolB is one of the NCLDV core genes, and serves as a phylogenetic marker for the classification of large DNA viruses [33, 34]. There now seems to be a continuum between the giant mimivirus and some algal viruses at least with respect to the sequence of this essential viral enzyme. The large genome sizes of CeV01, PpV01, and PoV01 might be another indication of their close evolutionary relationship with mimivirus. Phylogenetic classification of phycodnaviruses and mimiviruses (including the split of Phycodnaviridae or merging of Mimiviridae and Phycodnaviridae) may have to be revisited based on sequence information from other genetic markers such as major capsid proteins (Larsen et al. manuscript in preparation) and other NCLDV core genes.
Our discovery of the close relationships among PolB sequences of mimivirus and the three algal viruses as well as their homologs from metagenomic data now sheds new light on the nature of the mimivirus relatives in the sea. The mimivirus-like sequences in the metagenomic data are likely to originate from large DNA viruses closely related to mimivirus, CeV01, PpV01 and PoV01. There is probably substantial genetic variation among these putative viruses. The fact that the host algae of CeV01, PpV01 and PoV01 have worldwide distributions suggests that these putative viruses might not necessarily be associated with marine amoebae, but rather with algal species closely related to C. ericina, P. pouchetii or P. orientalis.
Mimivirus was proposed to be a human pathogen causing pneumonia. However, the close relationship of mimivirus with viruses infecting phytoplankton does not favor this hypothesis, as eukaryotic large DNA virus groups (e.g. at the level of genus) usually correspond to a relatively narrow host range. Given the strong cytopathic effect of mimivirus on its amoebal host and its phylogenetic affinity with certain algal viruses, we now begin to suspect that the natural reservoir of mimivirus might be some algae. Indeed, algae are frequently found together with Acanthamoeba in anthropogenic ecosystems such as air-conditioning units.
If horizontal transfer of viral PolB genes does occur, it would become difficult to interpret the PolB phylogeny as representing the true relationships between viruses. However, to the best of our knowledge, no instance of lateral transfer of PolB genes between distantly related eukaryotic large DNA viruses has been documented. Determining the whole genome sequences of CeV01, PpV01 and PoV01 would definitely help clarify their evolutionary relationship with mimivirus.
Three algal viruses (CeV01, PpV01 and PoV01) possess DNA polymerase genes that are closely related to the DNA polymerase from the giant mimivirus. This suggests that the numerous "mimivirus-like" sequences detected in marine metagenomic data might originate from viruses infecting phytoplankton species related to C. ericina, P. pouchetii or P. orientalis, rather than marine amoebae. These results imply new approaches in attempting the isolation of additional, and eventually closer, relatives of mimivirus.
The scaffold sequences for the combined assembly of the GOS metagenomic data were downloaded from the CAMERA web site. We extracted 21,406,171 ORFs (≥ aa) from the scaffolds using the EMBOSS/getorf program.
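As a rough illustration of what ORF extraction from scaffolds involves, the following Python sketch scans all six reading frames of a nucleotide sequence and reports stretches free of stop codons. It is not the EMBOSS/getorf implementation, and the options passed to getorf are not stated here. The sketch assumes that ORFs are defined as stop-to-stop stretches and that the standard genetic code applies. The function names are hypothetical, and the minimum length is a required parameter because no threshold value is given in the text above.

```python
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def stop_to_stop_orfs(dna: str, min_codons: int) -> Iterator[Tuple[str, int, str]]:
    """Yield (strand, frame, orf_nucleotides) for each stop-free stretch.

    A stretch is reported when it spans at least `min_codons` codons.
    Stretches that run to the end of the sequence are reported as well.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            codons = []
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if codon in STOP_CODONS:
                    if len(codons) >= min_codons:
                        yield strand, frame, "".join(codons)
                    codons = []  # start a new stretch after the stop
                else:
                    codons.append(codon)
            if len(codons) >= min_codons:
                yield strand, frame, "".join(codons)


# Toy usage on a short, made-up sequence.
for strand, frame, orf in stop_to_stop_orfs("ATGAAATTTTAGCCC", min_codons=3):
    print(strand, frame, orf)
```

In practice each reported stretch would then be translated to amino acids before the BLASTP searches; that step is omitted here to keep the sketch short.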
We defined "mimivirus-like ORFs" based on the following two-way BLASTP searches. First, the amino acid sequences of the ORFs were searched against the UniProt sequence database release 11.3 (as of July 2007) using BLASTP (E-value < 0.001). This search resulted in 6,212 ORFs whose best hit was a mimivirus protein in the database. For each of the 6,212 ORFs, we extracted the segment of the mimivirus sequence that was aligned with the ORF by BLASTP. Next, this partial mimivirus sequence was searched against the UniProt database (excluding mimivirus entries in the database). If the best score obtained by this second BLASTP search was lower than the score obtained by the first BLASTP search, we kept the ORF as "mimivirus-like". Accordingly, we obtained 5,293 mimivirus-like ORFs. The UniProt database does not contain the three entries used for the phylogenetic study (i.e. ABU23716, ABU23717, ABU23718).
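The selection logic of this two-way search can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the record type `FirstHit`, the function `select_mimivirus_like` and the input structures are hypothetical names, and the BLASTP searches are assumed to have been run and parsed beforehand. Keeping an ORF when the second search returns no hit at all is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FirstHit:
    """Best hit of one ORF in the first BLASTP search (hypothetical record)."""
    orf_id: str
    best_hit_is_mimivirus: bool
    score: float  # BLASTP score of the ORF against its best hit


def select_mimivirus_like(
    first_hits: List[FirstHit],
    second_best_scores: Dict[str, float],
) -> List[str]:
    """Return the IDs of ORFs that pass the two-way filter.

    `second_best_scores` maps an ORF ID to the best score obtained when the
    aligned mimivirus segment is searched against the database with
    mimivirus entries excluded. A missing key means that no hit was found,
    which is treated here as passing the filter (assumption).
    """
    kept = []
    for hit in first_hits:
        if not hit.best_hit_is_mimivirus:
            continue  # first search: the best hit must be a mimivirus protein
        second = second_best_scores.get(hit.orf_id)
        if second is None or second < hit.score:
            kept.append(hit.orf_id)  # second search: no stronger non-mimivirus match
    return kept


# Toy data with made-up scores.
example_hits = [
    FirstHit("orf_a", True, 120.0),
    FirstHit("orf_b", True, 80.0),
    FirstHit("orf_c", False, 200.0),
    FirstHit("orf_d", True, 55.0),
]
example_second = {"orf_a": 90.0, "orf_b": 95.0}
print(select_mimivirus_like(example_hits, example_second))  # ['orf_a', 'orf_d']
```

Here `orf_b` is dropped because the mimivirus segment it aligned with matches a non-mimivirus protein more strongly than it matches the ORF itself, which is the situation the second search is designed to catch.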
Mimivirus ORFans were defined by the lack of detectable homologs in the UniProt database using BLASTP with an E-value threshold of 0.001.
Multiple sequence alignment was constructed using MUSCLE. All the gap-containing sites in the alignment were excluded from the phylogenetic analysis. We used only the polymerase domain sequences, and removed exonuclease domain sequences. The delineation of the polymerase domains was performed using the Pfam entry PF00136. Intein sequences were also removed from the mimivirus, HaV and CeV01 PolB sequences. Maximum likelihood phylogenetic analysis was performed using PhyML with the JTT substitution model and 100 bootstrap replicates. Neighbor joining analysis was performed using BIONJ. The above methods are available from the Phylogeny.fr server. Maximum parsimony analysis was performed using PHYLIP/PROTPARS.
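The exclusion of gap-containing sites can be illustrated with a short Python sketch. It assumes that the alignment is already loaded as a mapping from sequence name to aligned string and that gaps are written as "-"; the function name `strip_gap_columns` is hypothetical and the toy sequences are made up.

```python
from typing import Dict


def strip_gap_columns(alignment: Dict[str, str], gap_chars: str = "-") -> Dict[str, str]:
    """Drop every alignment column in which at least one sequence has a gap."""
    rows = list(alignment.values())
    if not rows:
        return {}
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the columns where no sequence carries a gap character.
    keep = [i for i in range(length) if all(row[i] not in gap_chars for row in rows)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


toy = {
    "seq1": "MK-LVA",
    "seq2": "MKTLV-",
    "seq3": "MRTLVA",
}
print(strip_gap_columns(toy))  # {'seq1': 'MKLV', 'seq2': 'MKLV', 'seq3': 'MRLV'}
```

Removing whole columns in this way shortens the alignment but ensures that every remaining site is represented in all sequences.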
CeV: Chrysochromulina ericina virus
PpV: Phaeocystis pouchetii virus
PoV: Pyramimonas orientalis virus
NCLDV: Nucleocytoplasmic large DNA virus
GOS: Global Ocean Sampling Expedition
PolB: type B DNA polymerase
ORF: open reading frame
AM is partially supported by the EuroPathoGenomics European network of excellence. This work was partially supported by Marseille-Nice Genopole and the French National Network (RNG).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.