Analysis of the codon usage pattern of the RdRP gene of mycovirus infecting Aspergillus spp.

Background: Mycoviruses generally have little effect on their fungal hosts or, in some cases, reduce fungal virulence. However, recent studies have shown that polymycovirus-1, a mycovirus that infects Aspergillus species known to cause disease in humans, is associated with increased virulence of the fungus.
Methods: The RdRP gene codon usage patterns of Aspergillus fumigatus polymycovirus-1 (AfuPmV-1) and other mycoviruses known to infect Aspergillus spp. were compared to examine the genetic characteristics of AfuPmV-1. In addition, codon usage analysis was performed to determine whether the nucleotide composition and codon usage characteristics of AfuPmV-1 were also present in other polymycoviruses and hypervirulence-related mycoviruses. Phylogenetic analysis was also performed to investigate their evolutionary relationship.
Results: Analysis of nucleotide composition indicated that AfuPmV-1 had the highest GC content among the analyzed mycoviruses. Relative synonymous codon usage analysis indicated that all of the codons preferred by AfuPmV-1 ended with C or G, while no preferred codons ended with A or U. Moreover, the effective number of codons, the codon adaptation index, and correspondence analysis showed that AfuPmV-1 had a stronger codon preference than other mycoviruses and relatively high adaptability to humans and fungi. These results were generally similar among polymycoviruses.
Conclusions: The codon usage pattern of AfuPmV-1 differs from that of other mycoviruses that infect Aspergillus spp. This difference may be related to the hypervirulence effect of AfuPmV-1. Analysis of AfuPmV-1 codon usage patterns could contribute to the identification and prediction of virulence effects of mycoviruses with similar genetic characteristics.


Background
Mycoviruses are viruses that infect fungi, and they are known to infect most fungal species. While mycoviruses have little or no influence on the fungal host in most cases, some have been shown to alter the pathogenicity of the host by increasing or decreasing its virulence [1,2]. To date, research on mycoviruses has mainly focused on plant pathogenic fungi. The hypovirulence effect of mycoviruses in fungi has led to the development of biological control agents to reduce the virulence of fungi on important crops and plant resources [2,3]. There have been relatively few studies on the hypervirulence effect of mycoviruses, although some results have been reported recently. In 2015, the A78 virus, which infects the human pathogenic fungus Aspergillus fumigatus, was reported to have significant, albeit mild, hypervirulence effects in the moth species Galleria mellonella [4], and similar virulence effects have been reported for Aspergillus fumigatus tetramycovirus-1 (AfuTmV-1), found in the same fungus [5]. AfuTmV-1 was renamed AfuPmV-1, and its virus family name was changed from Tetramycoviridae to Polymycoviridae as more viruses showing similar characteristics to AfuTmV-1 were discovered, including BbPmV-1, which infects the insect pathogenic fungus Beauveria bassiana [6] (Table 1).
Aspergillus is a fungal genus in the phylum Ascomycota and is associated with various human diseases, including pulmonary aspergillosis, asthma, and allergies. Moreover, it is among the deadliest human pathogenic fungi, causing opportunistic infections with a high mortality rate in patients with low immunity [7,8].
Although studies have focused on polymycovirus-1 with regard to the increase in fungal pathogenicity in A. fumigatus, it is also important to examine other mycoviruses, such as partitivirus, chrysovirus, and victorivirus, which are infectious to other species in the genus Aspergillus, including A. fumigatus, A. foetidus, A. ochraceus, and A. niger. This study was performed to explore other mycoviruses that may increase the pathogenicity of the genus Aspergillus by comparison with polymycovirus-1, and to investigate the genetic characteristics of mycoviruses that enhance the toxicity of human pathogenic fungi.

Sequence data collection
The AfuPmV-1 genome consists of four capped double-stranded RNAs (dsRNAs), of which the largest segment, dsRNA1, encodes the RNA-dependent RNA polymerase (RdRP) [5,6]. The complete nucleotide sequences of the RdRP coding region of mycoviruses that infect Aspergillus species were downloaded from the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank). Eight sets of data published between 2006 and 2015 were included in the analysis. Data for other polymycoviruses and hypervirulence-related mycoviruses were downloaded from NCBI for comparison (Table 2).

Codon usage analysis
The programs CodonW (https://sourceforge.net/projects/codonw/) and CAIcal (http://genomes.urv.es/CAIcal/) were used to analyze nucleotide composition, overall/local G + C content, relative synonymous codon usage (RSCU), effective number of codons (ENC), codon adaptation index (CAI), and correspondence analysis (COA) for each of the selected sequences [9,10]. Nucleotide composition analysis was based on the overall frequency of each nucleotide (A%, C%, U%, and G%), total AU%/GC%, the frequency of each nucleotide at the third position of synonymous codons (A3s, C3s, U3s, and G3s), and the GC3s values. The RSCU is the ratio of the observed usage of a codon to the usage expected if all synonymous codons for an amino acid were used equally. An RSCU value of 1.0 indicates that all codons were used randomly or equally. The value is greater (less) than 1.0 if a codon is used more (less) frequently than expected [11,12]. The ENC quantifies the magnitude of codon usage bias in a gene, with values < 35 indicating strong bias [15,16]. The CAI is the geometric mean of the relative adaptiveness values of the codons in a gene and quantifies how closely its codon usage matches that of a reference set of highly expressed genes. The CAI value ranges between 0 and 1, with values closer to 1 indicating a high degree of similarity between the reference data and the codon usage pattern and expression level, whereas smaller values indicate lower similarity in the codon usage pattern and expression level [17]. The reference codon usage data for humans (Homo sapiens) and fungus (Aspergillus fumigatus) were acquired from the codon usage database (http://www.kazusa.or.jp/codon/) for comparison [18]. Furthermore, the RSCU values were examined by the COA method with the XLSTAT 2016 program to visualize the CodonW results. COA is a widely used multivariate analysis method that represents the data as vectors consisting of rows and columns [19].
Each RdRP coding sequence was represented as a vector with 59 dimensions, one for each of the 59 codons that remain after excluding methionine (AUG) and tryptophan (UGG), which lack synonymous codons, and the termination codons. A comparative analysis was performed including three groups: mycoviruses infecting Aspergillus (Group 1), polymycoviruses with a fungal host other than Aspergillus (Group 2), and newly reported mycoviruses that enhance host virulence (Group 3).
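The RSCU definition above can be illustrated with a minimal Python sketch. It is not the CodonW implementation; the function name rscu and the toy sequence are hypothetical, and the sketch assumes the standard genetic code and an in-frame coding sequence.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks termination codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU values for the synonymous codons of an in-frame sequence."""
    seq = cds.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one family per amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # skip stop codons, Met (AUG), and Trp (UGG)
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # Observed count divided by the count expected under equal usage.
            values[codon] = counts[codon] * len(family) / total
    return values


# Hypothetical toy sequence: Leu is encoded by CUC three times and CUG once.
toy = rscu("CTCCTCCTCCTGGCC")
print(toy["CUC"], toy["CUG"])
```

With a full-length coding sequence in which every amino acid occurs, the returned dictionary has 59 entries, matching the 59-dimensional vectors used for the COA.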

Phylogenetic analysis
The phylogenetic relationships among the mycoviruses were analyzed using the MEGA7 program (http://www.megasoftware.net) to infer the influence of evolutionary processes on codon usage patterns. RdRP coding sequences were aligned with the MUSCLE algorithm, and the phylogenetic tree was constructed by applying the Maximum Likelihood method and the Kimura 2-parameter substitution model. The robustness of the tree was assessed with 1000 bootstrap replicates [20].
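To make the Kimura 2-parameter model more concrete, the sketch below computes the pairwise distance it implies for two aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. This only illustrates the substitution model; it does not reproduce the maximum likelihood tree search or bootstrap procedure of MEGA7, and the function name k2p_distance is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    a_seq = seq1.upper().replace("T", "U")
    b_seq = seq2.upper().replace("T", "U")

    transitions = transversions = compared = 0
    for a, b in zip(a_seq, b_seq):
        if a not in "ACGU" or b not in "ACGU":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # (saturated) for the distance to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A<->G) in ten aligned sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```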

Nucleotide composition features
Basic nucleotide composition analysis was conducted for the RdRP gene of mycoviruses infecting Aspergillus spp. (Table 3, Fig. 1). The GC content of AfuPmV-1 was reported to be approximately 63% in a previous study of the entire or partial genome sequences of Polymycoviridae viruses [6]. In the comparison, Group 2 showed a similar pattern to that of AfuPmV-1: the GC content was between 59.97 and 61.98%, and at the third position of synonymous codons, C3s and G3s were higher than A3s and U3s. Among these, BbPmV-1 has been reported experimentally to be possibly related to mild hypervirulence. The Group 3 results differed from those of Groups 1 and 2 (Table 3).
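The composition indices discussed here can be approximated with a short Python sketch. It assumes a simplified definition of GC3s (the share of G or C at the third position of all codons other than AUG, UGG, and the termination codons); CodonW's exact definitions of A3s, C3s, U3s, and G3s are more involved and are not reproduced. The function name composition and the toy sequence are hypothetical.

```python
from collections import Counter

# Codons without synonymous alternatives, plus the termination codons.
EXCLUDED = {"AUG", "UGG", "UAA", "UAG", "UGA"}


def composition(cds: str) -> dict[str, float]:
    """Overall nucleotide percentages, GC%, and a simplified GC3s%."""
    seq = cds.upper().replace("T", "U")
    bases = Counter(b for b in seq if b in "ACGU")
    total = sum(bases.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/U bases")

    result = {f"{b}%": 100 * bases[b] / total for b in "ACUG"}
    result["GC%"] = result["G%"] + result["C%"]

    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    third = Counter(
        c[2] for c in codons if c not in EXCLUDED and set(c) <= set("ACGU")
    )
    n_third = sum(third.values())
    if n_third:
        # Assumed simplification: G or C at the third codon position.
        result["GC3s%"] = 100 * (third["G"] + third["C"]) / n_third
    return result


# Hypothetical toy sequence: AUG GCC GCG AAA UUU
print(composition("ATGGCCGCGAAATTT"))
```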

RSCU value and codon usage preference
The RSCU values of AfuPmV-1, AfuCV, and AfuPV-1 of Group 1 were compared to inspect the codon usage bias according to virus species (Table 4, Fig. 2). Among the codons for the 18 amino acids with synonymous codons, 18 codons were preferred in AfuPmV-1, of which 11 showed RSCU values ≥1.6. AfuCV showed 17 preferred codons, three of which had RSCU values ≥1.6. AfuPV-1 had 20 preferred codons, three of which showed RSCU values ≥1.6. AfuPmV-1 showed similarities to AfuCV in codons CAC (His) and CAG (Gln) and to AfuPV-1 in codons CUC (Leu) and AUC (Ile). The codons preferred by each virus showed similar preferences in all three viruses. The RSCU values and end nucleotide composition indicated that, in the RdRP coding region, C-ended codons were strongly preferred in AfuPmV-1 (15 of 18), G-ended codons were preferred in AfuCV (7 of 17), and U-ended codons were preferred in AfuPV-1 (8 of 20). Interestingly, there were no A/U-ended codons among the preferred codons of AfuPmV-1, indicating that AfuPmV-1 has a codon bias toward C- and G-ended codons. Group 2 showed a similar codon usage pattern to that of AfuPmV-1, which indicates a preference for C/G-ended codons. Group 3 also preferred C- or G-ended codons, but the four nucleotides were distributed more evenly.

Fig. 4 Correspondence analysis results using RSCU values (COA-RSCU). Axis1 and Axis2 of the COA-RSCU explained 44.31 and 18.81%, respectively, of the total variation. a COA result for A/C/U/G-ended codons; G- and C-ended codons formed one group and A- and U-ended codons formed one group. b COA result for over-represented codons (RSCU > 1.6); the codons strongly preferred individually by the three viruses formed separate groups
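The tallies of preferred codons by ending nucleotide reported in this section could be produced along the following lines. This is a sketch under stated assumptions: the source does not spell out the criterion for a "preferred" codon, so RSCU > 1.0 is assumed here, with RSCU ≥ 1.6 marking over-represented codons. The function name and the RSCU values in the example are hypothetical and are not taken from Table 4.

```python
from collections import Counter


def tally_codon_endings(
    rscu_values: dict[str, float],
    preferred_above: float = 1.0,       # assumed criterion for "preferred"
    over_represented_from: float = 1.6  # threshold used in the text
) -> tuple[dict[str, int], dict[str, int]]:
    """Count preferred and over-represented codons by third nucleotide."""
    preferred: Counter[str] = Counter()
    over_represented: Counter[str] = Counter()
    for codon, value in rscu_values.items():
        ending = codon[-1]
        if value > preferred_above:
            preferred[ending] += 1
        if value >= over_represented_from:
            over_represented[ending] += 1
    return dict(preferred), dict(over_represented)


# Hypothetical RSCU values for illustration only.
example = {"CUC": 2.4, "CUG": 1.7, "CUU": 0.3, "CAC": 1.5, "CAU": 0.5}
preferred, over_represented = tally_codon_endings(example)
print(preferred, over_represented)
```

A result in which the preferred tally contains only C and G keys would correspond to the pattern described for AfuPmV-1.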

General codon usage pattern
The ENC value was calculated to quantitatively measure the magnitude of RdRP gene codon usage bias of the eight mycoviruses infecting Aspergillus spp. An ENC value < 35 indicates a strong codon usage bias. The ENC value was lowest for AfuPmV-1 (40.67), whereas the other viruses showed ENC values > 50 (Table 3, Fig. 3). Given that previous studies on RNA viruses reported ENC values of 57.23, 55.56, 52.62, and 53.81 for Zaire Ebola virus, Chikungunya virus, Hepatitis C virus, and West Nile virus, respectively [21], AfuPmV-1 appeared to have stronger codon usage bias relative to other viruses. The CAI was calculated to compare how well the synonymous codon usage of each virus is adapted to different reference organisms; the closer the CAI value is to 1, the more similar the codon usage pattern is to that of the reference organism. The CAI values calculated against the H. sapiens and A. fumigatus reference data ranged over 0.699-0.762 and 0.676-0.843, respectively, with mean ± SD values of 0.72 ± 0.02 and 0.74 ± 0.05. Remarkably, AfuPmV-1 showed the highest values for both H. sapiens and A. fumigatus, at 0.762 and 0.843, respectively. These results indicated that AfuPmV-1 has the greatest similarity to the reference data in codon usage pattern and expression level, and that it has higher adaptability to human and fungal hosts compared with other mycoviruses. In Group 2, CAI and ENC values were similar to those of AfuPmV-1 (Table 3).
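The CAI calculation can be sketched as follows: each codon receives a relative adaptiveness weight (its reference frequency divided by that of the most frequent synonymous codon), and the CAI is the geometric mean of the weights over the codons of the gene. This is not the CAIcal implementation. The function names and the reference counts are hypothetical, and replacing zero reference counts with 0.5 is an assumption made here to avoid taking the logarithm of zero.

```python
import math


def relative_adaptiveness(
    reference: dict[str, dict[str, float]]
) -> dict[str, float]:
    """Weights w = count / max count within each synonymous family.

    `reference` maps an amino acid to {codon: count} for the reference set.
    """
    weights: dict[str, float] = {}
    for family in reference.values():
        if len(family) < 2:
            continue  # Met and Trp carry no synonymous-usage information
        # Assumption: zero counts are replaced by 0.5 to avoid log(0).
        adjusted = {codon: max(count, 0.5) for codon, count in family.items()}
        best = max(adjusted.values())
        for codon, count in adjusted.items():
            weights[codon] = count / best
    return weights


def cai(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of the weights of the codons in an in-frame sequence."""
    seq = cds.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codons with a reference weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference counts for two amino acids (not Kazusa data).
reference = {
    "Leu": {"CUC": 40, "CUG": 20, "CUU": 10},
    "His": {"CAC": 30, "CAU": 10},
}
w = relative_adaptiveness(reference)
print(round(cai("CTCCTGCAC", w), 4))
```

In the toy example the weights of CUC, CUG, and CAC are 1.0, 0.5, and 1.0, so the CAI is the cube root of 0.5.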

General trend of codon usage variation
To inspect the trends related to codon usage patterns of the Aspergillus-infecting viruses AfuPmV-1, AfuCV, and AfuPV-1, COA was performed with the RSCU values. Axis1, Axis2, Axis3, and Axis4 of the COA-RSCU explained 44.31, 18.81, 17.33, and 9.78% of the total variation, respectively. The results were based on 59 codons, excluding the three termination codons and methionine (AUG)/tryptophan (UGG) that do not have synonymous codons. Although there were exceptions in the COA results according to the third nucleotide in the codon, G- and C-ended codons formed one group and A- and U-ended codons formed another group (Fig. 4(a)). Moreover, the COA results for over-represented codons with RSCU values ≥1.6 showed that the codons strongly preferred individually by the three viruses formed separate groups (Fig. 4(b)). The observations verified that there were differences in the codon pattern preferences among the Aspergillus-infecting viruses.

Evolutionary relationship between mycoviruses
A phylogenetic tree was constructed to examine the phylogenetic relationships among the 14 mycoviruses, including AfuPmV-1. The polymycoviruses clustered with AfuPmV-1 and showed a relatively close relationship with alternaviruses (Fig. 5). To provide more information, the RdRP family of each sequence was identified using Pfam. AfuPmV-1 and all other polymycoviruses with similar codon patterns were assigned to the same family, RdRP_1 (pfam00680). Other mycoviruses infecting Aspergillus were classified as RdRP_1 or RdRP_4, although they were found in the same fungus (Table 5). This result suggests that the hypervirulent effects of AfuPmV-1 may be more affected by viral genome characteristics than by the effect of Aspergillus as a host.

Discussion
The mechanism underlying the hypervirulence effect of polymycoviruses has yet to be determined. However, experimental studies have demonstrated the existence of mycoviruses with mild hypervirulence effects, and other mycoviruses with similar sequences are continuously being discovered. Infection with mycoviruses that show hypervirulence effects may increase the harm that pathogenic fungi cause to their hosts. Therefore, it is necessary to determine the genetic characteristics of mycoviruses with hypervirulence effects. The results of the present study showed that AfuPmV-1 has a high GC content, and all of the strongly preferred codons (RSCU value ≥1.6) ended with either a C or G nucleotide. The distinctive codon usage pattern of AfuPmV-1 compared to other mycoviruses that infect Aspergillus spp. may be related to its hypervirulence effect. These characteristics did not appear in all mycoviruses with hypervirulent effects, but were shared by polymycoviruses. Nucleotide composition and codon usage patterns of polymycoviruses may be useful in predicting hypervirulent effects of unidentified mycoviruses.

Conclusions
Aspergillus spp. are pathogenic fungi that cause various symptoms in humans. The hypervirulence effect of mycoviruses can increase the toxicity of Aspergillus spp. in human hosts, and thus increase the severity of symptoms. Here, AfuPmV-1 was shown to have distinct patterns in some codon usage indexes compared to other mycoviruses that infect Aspergillus spp. The distinctive codon usage pattern of AfuPmV-1 demonstrated in the present study indicated the need for monitoring of mycoviruses with similar characteristics. Research on mycoviruses has generally focused on their hypovirulence effects on fungi that infect plants. With the discovery of polymycoviruses, further research on the hypervirulence effects of mycoviruses is needed, particularly with regard to mechanisms of virulence control in mycoviruses, such as AfuPmV-1, which infects human pathogenic fungi.