Comparative genomic analysis of the family Iridoviridae: re-annotating and defining the core set of iridovirus genes
Virology Journal volume 4, Article number: 11 (2007)
Members of the family Iridoviridae can cause severe diseases resulting in significant economic and environmental losses. Very little is known about how iridoviruses cause disease in their host. In the present study, we describe the re-analysis of the Iridoviridae family of complex DNA viruses using a variety of comparative genomic tools to yield a greater consensus among the annotated sequences of its members.
A series of genomic sequence comparisons were made within and between the Ranavirus and Megalocytivirus genera in order to identify novel conserved ORFs. Of these two genera, the Megalocytivirus genomes required the greatest number of altered annotations. Although the Megalocytivirus species orange-spotted grouper iridovirus and rock bream iridovirus share 99% sequence identity, prior to our re-analysis only 82 out of 118 potential ORFs were annotated as shared between them; in contrast, we predict that these species share an identical complement of genes. These annotation changes allowed the redefinition of the group of core genes shared by all iridoviruses. Seven new core genes were identified, bringing the total number to 26.
Our re-analysis of genomes within the Iridoviridae family provides a unifying framework to understand the biology of these viruses. Further re-defining the core set of iridovirus genes will continue to lead us to a better understanding of the phylogenetic relationships between individual iridoviruses as well as giving us a much deeper understanding of iridovirus replication. In addition, this analysis will provide a better framework for characterizing and annotating currently unclassified iridoviruses.
Iridoviruses are large DNA viruses (~120–200 nm in diameter) that replicate in the cytoplasm of infected cells. Iridovirus genomes are circularly permuted and terminally redundant, and range in size from 105 to 212 kbp [1, 2]. The family Iridoviridae is currently subdivided into five genera: Chloriridovirus, Iridovirus, Lymphocystivirus, Megalocytivirus, and Ranavirus.
Iridoviruses have been found to infect invertebrates and poikilothermic vertebrates, including amphibians, reptiles, and fish. Iridovirus infections produce symptoms that range from subclinical to very severe, which may also result in significant mortality [5–9]. The high pathogenicity associated with some members of the iridovirus family has had a significant impact on modern aquaculture, fish farming, and wildlife conservation. For example, systemic iridovirus infections have been found in economically important freshwater and marine fish species worldwide. In addition, iridovirus infections have been implicated in amphibian population declines, representing a set of emerging infectious diseases whose spread has been accelerated by human activities [10–14].
Despite the economic and ecological significance of iridoviruses, very little is currently known about their molecular biology. One approach towards gaining a deeper understanding of iridoviral pathogenesis is to investigate the core set of essential genes conserved among all members of the family. The genomes of twelve iridoviruses, including at least one from each genus, have been completely sequenced (Table 1). According to the previously published annotations, these genomes contained only 19 core genes associated with a variety of viral activities: transcriptional regulation, DNA metabolism, protein modification, and viral structure. Definition of this core set of genes also highlights those genes that are conserved across some, but not all, genera, and unique genes found within a single species. These non-core genes may be involved in specific virus-host interactions, enhancement of virus replication, and augmented pathogenesis in certain species.
Despite the growing number of sequenced iridovirus genomes, no systematic comparative genomic analysis of the family has yet been performed. Thus, annotation of these genomes has been performed without standardization and has so far been guided primarily by the position of start/stop codons rather than the presence of homologous sequences. As a result, some long overlapping potential ORFs have been automatically designated as coding sequences, and smaller homologous ORFs overlooked. In this paper, we have taken a comparative genomics approach to re-examine the annotation of all twelve iridovirus genomes, using the Viral Orthologous Clusters (VOCs) and Viral Genome Organizer (VGO) software. These re-annotated genomes were then analysed further, both to define the core set of iridovirus genes more accurately and to provide a deeper understanding of the phylogenetic relationships between individual iridovirus species.
Results & discussion
Re-annotation of Iridovirus genomes
One objective of this project was to demonstrate the application of comparative genomics to annotating viral genomes, particularly those that have been poorly characterized experimentally. In an earlier study, we utilized comparative genomics to identify previously unannotated small viral ORFs in the Poxviridae. Here, we focused our analysis on the Iridoviridae family, which represents a challenge in genome annotation since there is little experimental evidence available to confirm gene expression. Another problem is that iridovirus promoter elements have not been well characterized, and thus cannot be used as a reliable criterion for assigning ORFs. These combined factors made previous iridovirus gene annotation a somewhat arbitrary process, resulting in closely related iridovirus species with dramatic differences in their genomic annotations. Therefore, we decided to analyse all members of this family using a standardized comparative genomics approach, relying on the fact that ORFs that are conserved in more than one divergent species are likely to be functional genes.
Analysis was begun with the Megalocytivirus genus, which contains three sequenced genomes: infectious spleen and kidney necrosis virus (ISKNV), rock bream iridovirus (RBIV), and orange-spotted grouper iridovirus (OSGIV). These three viruses display a co-linear arrangement of genes with an overall DNA sequence identity of greater than 90%. In the analysis of this genus, differences in gene content were examined in detail. Dotplots were used to determine the presence of orthologous DNA; a variety of BLAST searches, together with the VGO genome visualization software, were used to determine the reason (for example, frameshifts or extra stop codons) behind the apparent absence of some ORFs.
Using this approach, a substantial number of ORFs were either added to or deleted from members of the Megalocytivirus genus (Table 2). OSGIV and RBIV share 99% DNA sequence identity, and thus are probably different strains of the same virus; however, previous annotation described only 82 out of 118 total annotated ORFs shared by the two genomes [18, 19]. After our re-analysis, the RBIV and OSGIV genomes had an identical complement of annotated genes. Furthermore, the re-annotated ISKNV genome contained 110 ORFs orthologous with both RBIV and OSGIV, compared to 71 in the old annotation (Table 2) [18, 20].
In the process of re-examining these genomes, we annotated a number of genes containing apparent frameshift mutations between species. In RBIV we annotated ten genes with potential frameshift mutations, while OSGIV had four such genes (Table 2). All of the genes containing potential frameshift mutations had orthologs in the other two members of the Megalocytivirus genus (Table 2). In some cases, these mutations may be the result of natural mutations within the viruses; however, it is also possible that these apparent frameshift mutations are actually sequencing errors. For both RBIV and OSGIV, PCR primers based on the ISKNV sequence were used to amplify genomic fragments, which were subsequently sequenced [18, 19]. It is possible that errors were introduced during the PCR process, leading to apparent frameshifts in the reported sequence. It is interesting to note that the genomic sequence of ISKNV (sequenced using subcloned fragments rather than PCR products) had significantly fewer annotation changes made during our re-analysis. Though we have not experimentally proven that the frameshift mutations in OSGIV and RBIV are the result of sequencing errors, it would be useful to focus future sequencing efforts on these regions, to determine if the reported sequences are indeed correct.
After re-annotating the Megalocytivirus genus, we applied the same comparative genomic analysis to the Ranavirus genus. The genus contains five sequenced members divided into two groups, each with a high degree of sequence conservation and a co-linear arrangement of genes. The first group comprises frog virus 3 (FV3), tiger frog virus (TFV), and Ambystoma tigrinum virus (ATV). The second group contains Singapore grouper iridovirus (SGIV) and grouper iridovirus (GIV).
The first step in the re-annotation of the Ranavirus genus was a comparative genomic analysis of FV3, TFV, and ATV. This resulted in an increase in the number of conserved annotated genes from 76 to 87 (Table 3). Subsequent re-analysis of the second Ranavirus group, containing SGIV and GIV, resulted in an increase from 131 to 138 conserved annotated ORFs (Table 4). It should be noted that two of the newly annotated ORFs, SGIV 0.5L and GIV 120.5L, appear to "wrap around", beginning at one end of the genome with the remainder of the ORF located at the opposite end [21, 22]. These apparent "split ORFs" are actually the result of the circularly permuted iridovirus genome being represented as a linear genomic sequence, when the arbitrarily chosen start point happens to fall in the middle of an ORF.
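The "split ORF" effect can be reproduced with a small sketch. The Python below is illustrative only and is not how VOCs or VGO work; the function name `find_forward_orfs_circular`, the toy sequences, and the minimum length are assumptions made for the example. It scans only the forward strand and appends a copy of the sequence to itself, so that an ORF crossing the arbitrary linear start point can still be read through.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_forward_orfs_circular(genome, min_codons=3):
    """Return (start, length_nt, wraps) for ATG..stop ORFs on the forward strand.

    `start` is 0-based in the linear representation; `wraps` is True when the
    ORF runs past the linear end and continues at the beginning.
    Hypothetical helper for illustration; every in-frame ATG is reported.
    """
    n = len(genome)
    doubled = genome + genome  # lets a reading frame cross the linear end
    orfs = []
    for start in range(n):
        if doubled[start:start + 3] != "ATG":
            continue
        # Read codon by codon, never covering more than one full circle.
        for pos in range(start + 3, start + n - 2, 3):
            if doubled[pos:pos + 3] in STOP_CODONS:
                length_nt = pos + 3 - start
                if length_nt // 3 >= min_codons:
                    orfs.append((start, length_nt, pos + 3 > n))
                break
    return orfs

# The same toy circular sequence, linearized at two different start points.
intact = find_forward_orfs_circular("ATGAAACCCTAAGG")
split = find_forward_orfs_circular("CCCTAAGGATGAAA")
```

In the second call the only ORF begins at position 8 and finishes at the opposite end of the linear sequence, which is the situation described for SGIV 0.5L and GIV 120.5L.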
As seen above, our comparative genomic approach was able to identify previously unannotated ORFs, homologous ORFs with potential frameshifts, and ORFs split between the two ends of the linear representation of a circularly permuted genome. Although this approach proved extremely successful for the Ranavirus and Megalocytivirus genera, we were unable to use it for the Chloriridovirus, Iridovirus, and Lymphocystivirus genera. This is due to the lack of co-linearity and the highly divergent sets of genes that exist between the members of these genera, as well as the low number of available genome sequences. However, we did modify the annotations of lymphocystis disease virus-China (LCDV-China) and invertebrate iridescent virus-6 (IIV-6). The previous annotations of both genomes contained a large number of overlapping ORFs [2, 24], which we decided to exclude on several grounds. First, LCDV-China and IIV-6 are the only iridoviruses, out of the twelve so far sequenced, in which overlapping ORFs have been annotated. In addition, the original sequencing paper for IIV-6 and a follow-up paper by the same group did not include a number of the overlapping ORFs reported in the database sequence, presumably due to their small size and lack of similarity with other viral and cellular genes. Finally, there is no experimental or bioinformatics evidence to suggest that any of these ORFs encode proteins. Therefore, to improve the overall consistency of the Iridoviridae family annotations, we removed the small overlapping ORF annotations from the LCDV-China and IIV-6 genomic sequences (Table 5, Additional Files 1 & 2).
Defining the conserved genes in Iridoviruses
As a result of this re-annotation of the Iridoviridae family, species within each genus now have a much greater consensus among their annotated ORFs. Prior to re-annotation, only 19 ORFs appeared to be conserved across all iridovirus species (Table 6). Although a previous report has suggested that 27 core genes exist within the Iridoviridae family, the core genes reported there are found in most, but not all, published iridovirus species. In light of these results, we re-examined this core set of genes using the VOCs software. We identified seven novel core genes (Table 7), increasing the total number to 26 (Tables 6 & 7). This increase in the number of core genes was primarily due to the five new genes annotated during the re-analysis of RBIV (Table 7, genes highlighted in bold). As expected, most of the core genes are predicted to have essential functions required for transcription, replication, and virus formation. Interestingly, three core genes, the orthologs of FV3 12L, 41R, and 94L, have no predicted functions. As previously stated, Delhon et al. identified 27 core genes, one more than we identified after our re-analysis. Delhon et al. report that the orthologs of FV3 20L represent a core gene. However, our analysis shows that orthologs of FV3 20L exist in all genera except Megalocytivirus (Figure 1), suggesting that FV3 20L is not a core gene. Future research to determine the functions of the core genes that currently have no predicted function, which are also likely to be essential, will provide important data for understanding the replication cycle of iridoviruses.
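The definition of a core gene used here, an ortholog family present in every sequenced genome, amounts to a set intersection. The following Python is a minimal sketch under that reading; the function names, genome labels, and family labels are hypothetical and the toy data do not reproduce Tables 6 and 7.

```python
def core_families(families_by_genome):
    """Return the ortholog families present in every genome."""
    family_sets = [set(fams) for fams in families_by_genome.values()]
    if not family_sets:
        return set()
    core = family_sets[0]
    for fams in family_sets[1:]:
        core = core & fams  # keep only families seen in all genomes so far
    return core

def genomes_lacking(family, families_by_genome):
    """List the genomes in which a family is absent (why it is not core)."""
    return sorted(g for g, fams in families_by_genome.items()
                  if family not in fams)

# Invented presence/absence data for three hypothetical genomes.
toy = {
    "genome_1": {"famA", "famB", "famC"},
    "genome_2": {"famA", "famB", "famC"},
    "genome_3": {"famA", "famB"},
}
core = core_families(toy)
absent = genomes_lacking("famC", toy)
```

A family such as `famC`, missing from a single genome, is excluded from the core set in the same way that an ortholog absent from one genus would be.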
Identifying genes conserved between some, but not all, iridovirus species can give us important information when investigating evolutionary relationships within the family. A number of past phylogenetic analyses of Iridoviridae have used phylogenetic trees constructed from aligned protein sequences [1, 18–20, 22, 24, 27]. However, there are potential problems with phylogenetic analysis based on comparisons of single genes. This type of analysis is rarely consistent due to horizontal gene transfer and variable rates of evolution. Therefore, we decided to undertake a whole-genome comparative phylogenetic analysis to understand the relationship between iridoviruses. Our approach was to identify all the genes conserved between different genera to gain a better understanding of the relationships within the iridovirus family. This approach yields an indication of how similar in gene content two genomes are. Our whole-genome comparative analysis grouped orthologous genes between genera (Figures 1 & 2 and Additional File 3) and was consistent with phylogenetic trees constructed from single protein sequences. Based on gene conservation, the Ranavirus and Lymphocystivirus genera appear to be most closely related to one another (Figure 2). In addition, the Iridovirus and Chloriridovirus genera are also closely related to one another based on the presence of orthologous genes (Figure 2). In contrast, the Megalocytivirus genus and the Iridovirus/Chloriridovirus genera are equally divergent from each other as well as from all other Iridoviridae family members (Figure 2).
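A gene-content comparison of this kind can be sketched as a count of shared ortholog families for every pair of taxa. The Python below is an illustration, not the procedure used to produce Figure 2; the function name and the toy data are assumptions, and the Jaccard ratio is included only as one possible normalization that the text does not specify.

```python
from itertools import combinations

def shared_gene_content(families_by_taxon):
    """Map each pair of taxa to (shared family count, Jaccard similarity)."""
    result = {}
    for a, b in combinations(sorted(families_by_taxon), 2):
        fams_a = set(families_by_taxon[a])
        fams_b = set(families_by_taxon[b])
        shared = len(fams_a & fams_b)
        union = len(fams_a | fams_b)
        result[(a, b)] = (shared, shared / union if union else 0.0)
    return result

# Invented family identifiers for three hypothetical taxa.
toy = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 6}}
pairs = shared_gene_content(toy)
```

Pairs with more shared families relative to their combined gene content would be read as more closely related in gene content.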
As the list of sequenced iridovirus genomes grows, the non-co-linearity between many of these genomes becomes more apparent. The Megalocytivirus and Ranavirus, but not the Chloriridovirus, Iridovirus, and Lymphocystivirus genera, show a co-linear arrangement of genes within each genus. However, comparisons of genomic sequences from different genera suggest no co-linearity. This trend may be the result of the high recombination rates seen in some iridovirus members. For example, within the Ranavirus genus, ATV has two inversions relative to the FV3 and TFV sequences, reducing the co-linearity of these genomes to some degree. Figure 3A shows how two recombination events could convert FV3 to the ATV arrangement of genes. In contrast, a comparison between the more distantly related members within the Ranavirus genus (such as FV3 and GIV) demonstrates a much more dramatic loss of co-linearity. No long stretches of co-linear genes exist between these sequences, although small sections of co-linearity remain, as seen through a dotplot analysis between FV3 and GIV (Figure 3B). The dotplot shows small regions of co-linearity scattered throughout the genomes of FV3 and GIV, visible as short diagonal lines (Figure 3B). A schematic representation of the co-linearity between FV3 and GIV demonstrates that co-linearity occurs in small clusters of genes, often only 2–4 genes in length (Figure 3C).
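Clusters of co-linear genes such as those drawn in Figure 3C can be found from two gene orders alone. The sketch below is a simplified illustration, not the method used for the figure: it assumes each genome is given as a list of ortholog-family labels in genomic order, that each label occurs once per genome, and it ignores strand. The function name and gene labels are hypothetical.

```python
def colinear_blocks(order_a, order_b, min_len=2):
    """Return maximal runs of genes that are adjacent in both genomes.

    A run may be in the same order in both genomes or consistently reversed
    (an inversion); it ends when adjacency in genome B is lost.
    """
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, run, step = [], [], 0
    for gene in order_a:
        extends = False
        if run and gene in pos_b:
            delta = pos_b[gene] - pos_b[run[-1]]
            # +1 keeps the same direction, -1 an inverted one; no mixing.
            extends = delta in (1, -1) and step in (0, delta)
        if extends:
            run.append(gene)
            step = delta
            continue
        if len(run) >= min_len:
            blocks.append(run)
        run = [gene] if gene in pos_b else []
        step = 0
    if len(run) >= min_len:
        blocks.append(run)
    return blocks

# Toy gene orders: g1-g3 are inverted in B, g5-g6 form a small cluster.
order_a = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
order_b = ["g3", "g2", "g1", "g9", "g5", "g6", "g4", "g7"]
blocks = colinear_blocks(order_a, order_b)
```

With these toy orders the inverted segment is still reported as one block, while the remaining genes fall into a short cluster or stand alone.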
The Iridoviridae family can cause severe diseases resulting in significant economic and environmental losses. Very little is known about how iridoviruses cause disease in their host. Our re-analysis of genomes within the Iridoviridae family provides a unifying framework to understand the biology of these viruses. For example, the re-analysis of the Iridoviridae family has increased the consistency of annotated sequences from viruses within the same genus. In addition, the re-analysis has helped create a much greater consensus among Iridoviridae family members and enhanced our understanding of this virus family as a whole. The updated annotations that we have produced for the iridovirus sequences can be found in the additional files to this paper; in addition, the databases and tools to analyse Iridoviridae genomes are available to all researchers. This database will contain genomes from the original GenBank files and also the edited genomes described in this paper. Further re-defining the core set of iridovirus genes will continue to lead us to a better understanding of the phylogenetic relationships between individual iridoviruses as well as giving us a much deeper understanding of iridovirus replication. In addition, this analysis will provide a better framework for characterizing and annotating currently unclassified iridoviruses.
Re-annotation of the Iridoviridae
Annotated sequences for the twelve completely sequenced iridovirus genomes (Table 1) were obtained from GenBank files and imported into the Viral Orthologous Clusters (VOCs) database. Species from the same genus were examined using VOCs to identify all of the orthologous genes. The analysis then focused on the differences found between genomes within the same genus. For those genomes that contained co-linear arrangements of genes (those in the Ranavirus and Megalocytivirus genera), we compared those regions containing annotated ORFs. If more than two sequenced genomes were available for a given genus, and the ORF was present in at least two of the genomes, then we set out to determine if that ORF was also present in the remainder of the genomes. By this method, we were able to re-annotate small segments of each genome without needing to re-analyse the entire genome. The Viral Genome Organizer (VGO) software was used to visualize the annotated ORFs, as well as the start and stop codons found within each genome.
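The selection rule described above (an ORF annotated in at least two genomes of a genus, but not in all of them, is re-examined in the remaining genomes) can be sketched as follows. This Python is an illustration of the rule only; the function name is hypothetical, and the family labels assigned to ISKNV, RBIV, and OSGIV are invented and do not reflect Table 2.

```python
def orfs_to_recheck(families_by_genome, min_present=2):
    """Map each candidate family to the genomes where it is not annotated.

    A family is a candidate when it is annotated in at least `min_present`
    genomes but not in every genome of the group.
    """
    all_genomes = set(families_by_genome)
    present_in = {}
    for genome, families in families_by_genome.items():
        for family in families:
            present_in.setdefault(family, set()).add(genome)
    return {
        family: sorted(all_genomes - genomes)
        for family, genomes in present_in.items()
        if min_present <= len(genomes) < len(all_genomes)
    }

# Invented annotation sets for one genus.
toy = {
    "ISKNV": {"f1", "f2", "f3"},
    "RBIV": {"f1", "f2"},
    "OSGIV": {"f1", "f3", "f4"},
}
todo = orfs_to_recheck(toy)
```

Each entry in the result names the genome regions that would then be inspected for a missed ORF, a frameshift, or an extra stop codon.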
Analysis of orthologous genes
We used a combination of BLAST searches and queries using the VOCs software to define orthologous genes between Iridoviridae genera. VOCs is a Java client-server application that accesses a Structured Query Language (SQL) database containing iridovirus genomes. This SQL database permits complex queries to be assembled in an easy-to-use graphical user interface. VOCs initially groups orthologous genes into families based on BLASTP scores; these groupings can be manually checked and altered if necessary.
Dotplots of FV3 and GIV were generated using JDotter. JDotter provides an interactive input window that links JDotter to the VOCs database. The sequences for FV3 and GIV were obtained through the VOCs database.
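How VOCs turns BLASTP scores into families is not described here, so the following Python is only one plausible sketch: single-linkage clustering with a union-find structure, in which any two proteins connected by a hit at or above a score threshold end up in the same family. The function name, the threshold of 100, and the protein identifiers are all assumptions made for the example.

```python
def group_into_families(hits, min_score=100.0):
    """Cluster proteins into families from (query, subject, score) hits."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, score in hits:
        root_q, root_s = find(query), find(subject)
        if score >= min_score and root_q != root_s:
            parent[root_s] = root_q  # merge the two families

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in families.values())

# Invented hits between hypothetical protein identifiers.
hits = [
    ("FV3_x", "TFV_x", 250.0),
    ("TFV_x", "ATV_x", 230.0),
    ("FV3_y", "TFV_y", 40.0),  # below threshold, stays unlinked
]
families = group_into_families(hits)
```

Single linkage can chain unrelated proteins together through a shared domain, which is one reason a manual check of the resulting families remains useful.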
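The idea behind a dotplot can be shown with a few lines of Python. This is a simplified sketch that marks exact matches of fixed-length words; it is not the algorithm used by JDotter, and the function name, word size, and sequences are assumptions for illustration.

```python
from collections import defaultdict

def dotplot_points(seq_a, seq_b, word=4):
    """Return (i, j) pairs where the word at seq_a[i] equals the word at seq_b[j]."""
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)  # positions of every word in seq_b
    return [
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in index.get(seq_a[i:i + word], [])
    ]

# Toy sequences sharing the segment ACGTTG at different offsets.
points = dotplot_points("ACGTTGCA", "GGACGTTG")
```

Consecutive points whose two coordinates both increase by one form a diagonal, which is how a conserved, co-linear region appears when the points are plotted.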
He JG, Lu L, Deng M, He HH, Weng SP, Wang XH, Zhou SY, Long QX, Wang XZ, Chan SM: Sequence analysis of the complete genome of an iridovirus isolated from the tiger frog. Virology 2002, 292: 185-197. 10.1006/viro.2001.1245
Jakob NJ, Muller K, Bahr U, Darai G: Analysis of the first complete DNA sequence of an invertebrate iridovirus: coding strategy of the genome of Chilo iridescent virus. Virology 2001, 286: 182-196. 10.1006/viro.2001.0963
Williams T, Chinchar VG, Darai G, Hyatt A, Kalmakoff J, Seligy V: Iridoviridae. In Virus Taxonomy: The Classification and Nomenclature of Viruses. Edited by: van Regenmortel MHV, Bishop DHL, Carstens EB, Estes MK, Lemon SM, Maniloff J, Mayo MA, McGeoch DJ, Pringle CR, Wickner RB. Seventh report of the International Committee on the Taxonomy of Viruses; 2000:167-182.
Williams T: The iridoviruses. Adv Virus Res 1996, 46: 345-412.
Ahne W, Schlotfeldt HJ, Thomsen I: Fish viruses: isolation of an icosahedral cytoplasmic deoxyribovirus from sheatfish (Silurus glanis). J Vet Med 1989, 36: 333-336.
Hedrick RP, McDowell TS: Properties of iridoviruses from ornamental fish. Vet Res 1995, 26: 423-427.
Hedrick RP, McDowell TS, Ahne W, Torhy C, De Kinkelin P: Properties of three iridovirus-like agents associated with systemic infections of fish. Dis Aquat Org 1992, 13: 203-209.
Langdon JS, Humphrey JD, Williams LM, Hyatt AD, Westbury HA: First virus isolation from Australian fish: an iridovirus-like pathogen from redfin perch Perca fluviatilis. J Fish Dis 1986, 9: 263-268. 10.1111/j.1365-2761.1986.tb01011.x
Pozet F, Morand M, Moussa A, Torhy C, De Kinkelin P: Isolation and preliminary characterization of a pathogenic icosahedral deoxyribovirus from the catfish Ictalurus melas. Dis Aquat Org 1992, 14: 35-42.
Bollinger TK, Mao J, Schock D, Brigham RM, Chinchar VG: Pathology, isolation, and preliminary molecular characterization of a novel iridovirus from tiger salamanders in Saskatchewan. J Wildlife Disease 1999, 35: 413-429.
Jancovich JK, Davidson EW, Seiler A, Jacobs BL, Collins JP: Transmission of the Ambystoma tigrinum virus to alternate hosts. Dis Aquat Organ 2001, 46: 159-163.
Collins JP, Brunner JL, Miera V, Parris MJ, Schock DM, Storfer A: Ecology and evolution of infectious disease. In Amphibian Conservation. Edited by: Semlitsch R. Washington, Smithsonian Institution Press; 2003:137-151.
Daszak P, Cunningham AA, Hyatt AD: Infectious disease and amphibian population declines. Divers Distrib 2003, 9: 141-150. 10.1046/j.1472-4642.2003.00016.x
Jancovich JK, Davidson EW, Parameswaran N, Mao J, Chinchar VG, Collins JP, Jacobs BL, Storfer A: Evidence for emergence of an amphibian iridoviral disease because of human-enhanced spread. Molecular Ecology 2005, 14: 213-224. 10.1111/j.1365-294X.2004.02387.x
Ehlers A, Osborne J, Slack S, Roper RL, Upton C: Poxvirus orthologous clusters (POCs). Bioinformatics 2002, 18: 1544-1545. 10.1093/bioinformatics/18.11.1544
Upton C, Hogg D, Perrin D, Boone M, Harris NL: Viral genome organizer: a system for analyzing complete viral genomes. Virus Research 2000, 70: 55-64. 10.1016/S0168-1702(00)00210-0
Brunetti CR, Amano H, Ueda Y, Qin J, Miyamura T, Suzuki T, Li X, Barrett JW, McFadden G: Complete genomic sequence and comparative analysis of the tumorigenic poxvirus Yaba monkey tumor virus. Journal of Virology 2003, 77: 13335-13347. 10.1128/JVI.77.24.13335-13347.2003
Lu L, Zhou SY, Chen C, Weng SP, Chan SM, He JG: Complete genome sequence analysis of an iridovirus isolated from the orange-spotted grouper, Epinephelus coicodes. Virology 2005, 339: 81-100. 10.1016/j.virol.2005.05.021
Do JW, Moon CH, Kim HJ, Ko MS, Kim SB, Son JH, Kim JS, An EJ, Kim MK, Lee SK, Han MS, Cha SJ, Park MS, Park MA, Kim YC, Kim JW, Park JW: Complete genomic DNA sequence of rock bream iridovirus. Virology 2004, 325: 351-363. 10.1016/j.virol.2004.05.008
He JG, Deng M, Weng SP, Li Z, Zhou SY, Long QX, Wang XZ, Chan SM: Complete genome analysis of Mandarin fish infectious spleen and kidney necrosis iridovirus. Virology 2001, 291: 126-139. 10.1006/viro.2001.1208
Tsai CT, Ting JW, Wu MH, Wu MF, Guo IC, Chang CY: Complete genomic sequence of the grouper iridovirus and comparison of genomic organization with those of other iridoviruses. Journal of Virology 2005, 79: 2010-2023. 10.1128/JVI.79.4.2010-2023.2005
Song WJ, Qin QW, Qiu J, Huang CH, Wang F, Hew CL: Functional genomics analysis of Singapore grouper iridovirus: complete sequence determination and proteomic analysis. Journal of Virology 2004, 78: 12576-12590. 10.1128/JVI.78.22.12576-12590.2004
Goorha R, Murti KG: The genome of frog virus 3, an animal DNA virus, is circularly permuted and terminally redundant. Proc Natl Acad Sci USA 1982, 79: 248-252. 10.1073/pnas.79.2.248
Zhang QY, Xiao F, Xie J, Li ZQ, Gui JF: Complete genome sequence of lymphocystis disease virus isolated from China. Journal of Virology 2004, 78: 6982-6994. 10.1128/JVI.78.13.6982-6994.2004
Jakob NJ, Kleespies RG, Tidona CA, Muller K, Gelderblom HR, Darai G: Comparative analysis of the genome and host range characteristics of two insect iridoviruses: Chilo iridescent virus and a cricket iridovirus isolate. Journal of General Virology 2002, 83: 463-470.
Delhon G, Tulman ER, Afonso CL, Lu Z, Becnel JJ, Moser BA, Kutish GF, Rock DL: Genome of invertebrate iridescence virus type 3 (Mosquito iridescent virus). Journal of Virology 2006, 80: 8439-8449. 10.1128/JVI.00464-06
Tan WGH, Barkman TJ, Chinchar VG, Essani K: Comparative genomic analyses of frog virus 3, type species of the genus Ranavirus (family Iridoviridae). Virology 2004, 323: 70-84. 10.1016/j.virol.2004.02.019
Doolittle WF, Logsdon JM: Archaeal genomics: Do Archaea have a mixed heritage? Current Biology 1998, 8: R209-R211. 10.1016/S0960-9822(98)70127-7
Huynen MA, Bork P: Measuring genome evolution. Proc Natl Acad Sci USA 1998, 95: 5849-5856. 10.1073/pnas.95.11.5849
Jancovich JK, Mao J, Chinchar VG, Wyatt C, Case ST, Kumar S, Valente G, Subramanian S, Davidson EW, Collins JP, Jacobs BL: Genomic sequence of a ranavirus (family Iridoviridae) associated with salamander mortalities in North America. Virology 2003, 316: 90-103. 10.1016/j.virol.2003.08.001
Chinchar VG, Granoff A: Temperature-sensitive mutants of frog virus 3: biochemical and genetic characterization. Journal of Virology 1986, 58: 192-202.
Viral Bioinformatics Resource Center [www.virology.ca]
Brodie R, Roper RL, Upton C: JDotter: a Java interface to multiple dotplots generated by dotter. Bioinformatics 2004, 20: 279-281. 10.1093/bioinformatics/btg406
Tidona CA, Darai G: The complete DNA sequence of lymphocystis disease virus. Virology 1997, 230: 207-216. 10.1006/viro.1997.8456
We would like to thank Daniel Rock for sharing information about mosquito iridescent virus prior to publication and Cristalle Watson for critically reviewing the manuscript. This work was supported by Discovery Grants (Natural Science and Engineering Research Council (NSERC) of Canada) to C.R.B. and C.U. H.E.E. is the recipient of an NSERC postgraduate scholarship.
The author(s) declare that there are no competing interests.
HEE, JM, EP, and CRB carried out the analysis of the Iridoviridae family and generated the tables and figures. VTJ and CU generated the databases and tools to carry out the analysis done in the manuscript. CRB and CU conceived of the study, and participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.