Molecular epidemiology of salmonid alphavirus (SAV) subtype 3 in Norway

Background Pancreas disease (PD) is a viral fish disease which in recent years has significantly affected Norwegian salmonid aquaculture. In Norway, the aetiological agent, salmonid alphavirus (SAV), has been found to be represented by subtype 3 only. In previous analyses, SAV subtype 3 has shown a lower genetic divergence than the subtypes found to cause PD in Ireland and Scotland. The aim of this study was to evaluate the nucleotide (nt) and amino acid divergence and the phylogenetic relationship of 33 recent SAV subtype 3 sequences. The samples from which the sequences were obtained originated from both PD endemic and non-endemic regions in an attempt to investigate agent origin/spread. Multiple samples throughout the seawater production phase from several salmonid populations were included to investigate genetic variation during an outbreak. The analyses were mainly based on partial sequences from the E2 gene. For some samples, additional partial 6 K and nsP3 gene sequences were available.
Results The nucleotide divergence for all gene fragments ranged from total identity (0.0% divergence) to 0.45% (1103 nt fragment of E2), 1.11% (451 nt fragment of E2), 0.94% (6 K) and 0.28% (nsP3). This low nucleotide divergence corresponded well to previous reports on SAV subtype 3 sequences; however, the observed divergence for the short E2 fragment was higher than that previously reported. When compared to SAVH20/03 (AY604235), amino acid substitutions were detected in all assessed gene fragments; however, no conclusion could be drawn on their in vivo significance for, for example, disease outbreak mortality. The phylogenetic tree based on the 451 nt E2 fragment showed that the sequences divided into two clusters with low genetic divergence, representing only a single SAV subtype.
Conclusions The analysed sequences represented two clusters of a single SAV subtype; however some of the observed sequence divergence was higher than that previously reported by other researchers. Larger scale, full length sequence analyses should be instigated to allow further phylogenetic and molecular epidemiology investigations of SAV subtype 3.


Background
The fish disease known as pancreas disease (PD) has a significant impact on Norwegian salmonid aquaculture, affecting both Atlantic salmon (Salmo salar L.) and rainbow trout (Oncorhynchus mykiss) seawater production [1][2][3]. In addition, Scottish and Irish Atlantic salmon production has been severely affected since the emergence of PD in Scotland in 1976 [4,5]. High proportions of the salmonid aquaculture sites have been continually affected by PD in both Ireland and Scotland, with Irish figures estimating that 95% of examined farms were affected by PD between 1985 and 1989 [6], 62% in 2003 and 86% in 2004 [7]. PD emerged in Norwegian aquaculture in the 1980s [8], followed by a gradual increase in the number of cases diagnosed within two western counties (Hordaland and Sogn & Fjordane) initially constituting the endemic region. A gradual expansion of this endemic region southwards (Rogaland, 2004) and northwards (Møre & Romsdal, 2006) resulted in almost the entire south-western coast constituting an endemic region by the end of 2006. The first cases outside this region were detected in 2003 in the two northernmost counties (Finnmark and Troms), with Troms also affected in 2009. An area within the northernmost county (Finnmark) may be considered to constitute a separate endemic area, having had one or two cases diagnosed each year between 2005 and 2008. A third, northern county has been affected (Nordland, 2004 and, as well as one county in mid-Norway (Sør Trøndelag, 2009). The number of Norwegian seawater sites with diagnosed or suspected PD peaked at 109 in 2008 and declined to 75 in 2009 following industry and government efforts to reduce the impact of the disease. Although PD has a serious impact on Norwegian salmonid aquaculture, the proportion of affected sites in Norway remains lower than that seen in the Scottish and Irish industries.
PD-affected fish generally show anorexia and lethargy, and develop SAV-associated lesions, particularly in the exocrine pancreas and in heart and skeletal muscle [3]. PD-associated mortality levels vary greatly, with a range between 0.7 and 26.9% seen in recently studied Norwegian sites [9].
The aetiological agent was first isolated in Ireland [10] and was later identified as an alphavirus in the family Togaviridae [11,12]. The species name salmonid alphavirus (SAV) was suggested [12] and has been adopted by researchers despite not being accepted by the International Committee on Taxonomy of Viruses. The SAV nomenclature will be used throughout this paper. Six SAV subtypes have been classified so far. In Ireland, SAV subtypes 1, 4, and 6 have been isolated from fish affected by PD, while Scottish outbreaks have been caused by SAV subtypes 1, 2, 4, and 5 [13,14]. From Norwegian PD outbreaks, only SAV subtype 3 has been detected [2][13][14][15], with a very low level of genetic variance between isolates [13,15]. Although SAV subtype 2 has now been isolated from Atlantic salmon in the seawater phase [13], the majority of outbreaks due to this subtype occur in freshwater farms stocking rainbow trout, where the resultant disease has become known as sleeping disease (SD) [16]. As with other alphaviruses, SAV has a positive sense, single stranded RNA genome [17] of approximately 12 kb [12]. The non-structural proteins (nsP1 to nsP4) are encoded by the 5' end and the structural proteins (capsid, envelope glycoproteins (E1 to E3) and 6 K) by the 3' end [17]. The alphavirus structural protein E2 has been found to be the site of most neutralising epitopes [18]. Salmonid alphaviruses have been found to be genetically distinct from the other alphaviruses, many of which use arthropod vectors in their transmission [18]. No vectors have been found to be involved in SAV transmission, and horizontal transmission pathways appear to be most important for the spread of SAV and PD between seawater populations [2][7][8][9][13][14][15][19][20][21][22][23][24][25].
The aim of this study was to evaluate the nucleotide (nt) and amino acid divergence as well as the phylogenetic relationship of 33 recently obtained SAV subtype 3 sequences originating from both PD endemic and non-endemic regions of Norway. Based on the results, the possibility of gaining information on agent origin/spread was to be investigated. Multiple samples throughout the seawater production phase from several salmonid populations were included to investigate the presence of genetic changes during an outbreak. Analyses were to be based mainly on the partial E2 gene, with additional partial 6 K and nsP3 gene sequences available from some samples.

Sample selection
Samples originated from SAV-positive Atlantic salmon in the seawater production phase. A total of 33 SAV-positive samples from 12 seawater sites were selected for partial sequence analysis (Table 1). Multiple samples, originating from one to three sampling points, were included from nine sites. The sampling point(s) at each site varied in time, ranging from two months post seawater transfer to slaughter. As a result, almost the entire seawater production cycle was represented, giving a wide range in fish age and weight at time of sampling. Samples from six sites located within the endemic region were selected from participants in a cohort study [9]. Of these, four sites (sites 1, 3, 4 and 6, Table 1) were included as they were found SAV-positive earlier in the seawater phase than the majority of the studied sites. A further four diagnostic samples from site 3, from an outbreak investigation on the fish generation put to sea after the slaughter of the cohort study generation, were included. Samples from two additional cohort study sites were included as they represented the minimum (site 5, Table 1) and maximum (site 2, Table 1) recorded PD-associated mortality. Finally, diagnostic samples submitted from six sites (sites 7 to 12, Table 1) in the non-endemic region or the endemic area of Finnmark in December 2003 or between November 2007 and October 2009 were included. At all selected sites, PD was diagnosed by detection of SAV by real-time RT-PCR (Rt RT-PCR) combined with histopathological changes consistent with PD (as described in [9]).

RNA extraction and Rt RT-PCR
RNA was extracted from a mixture of heart and mid-kidney tissue according to the protocol previously described [9]. A 1762 base pair (bp) region within the nsP3 gene and a 1871 bp region within the Capsid-E3-E2-6 K genes, corresponding to positions 4206-5968 and 8411-10282 of the Norwegian SAVSF21/03 (AY604238) respectively, were amplified as partially overlapping fragments. Three primer pairs were used for the Capsid-E3-E2-6 K genes and two primer pairs for the nsP3 gene (Table 2). The primer sequence of F1600, R2357,

Sequence analysis
Consensus sequences were generated using Sequencher (Gene Codes Corporation) or ChromasPro (Technelysium Pty Ltd). All sequences were edited so that the longest shared and least conserved genetic region was included in the analysis. All sequence analyses and editing were carried out with the aid of the MEGA4 software [26]. The sequences were aligned using both Muscle [27] and Clustal [28]. Pairwise nucleotide percentage similarity and divergence were calculated using the program LALIGN (available at http://www.ch.embnet.org/index.html). Phylogenetic trees were generated from the multiple alignments using the maximum parsimony (MP), unweighted pair group method using arithmetic average (UPGMA) and neighbor joining (NJ) methods, in both the MEGA4 and Seaview (version 4) software packages [29]. Sequence data from eight SAV subtype 3 isolates were obtained from GenBank and included in the phylogenetic analyses (SAVH20/03 (AY604235), SAVH10/02 (AY604236), PD97-N03 (AY604237), SAVSF21/03 (AY604238), SAVF29/03 (DQ122127), SAVT28/03 (DQ122128), SAVN32/04 (DQ122129), SAVSF22/03 (DQ122131)). Additionally, the Irish SAV 1 reference strain F93-125 (AJ316244) and the French SAV 2 reference strain S49p (AJ316246) were included. The phylogenetic tree shown in this paper was based on the NJ method and bootstrapped 1000 times. The 33 study sequences are available from GenBank, with accession numbers as shown in Table 1.
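The pairwise divergence measure used above can be illustrated with a minimal sketch. It assumes a plain proportion of mismatched positions between two already aligned sequences, skipping gaps and ambiguous bases; it is not the algorithm of the alignment program used in the study, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Percentage of comparable aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip alignment gaps and ambiguous bases (an assumption of this sketch).
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def pairwise_divergence(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Divergence for every unordered pair of sequences in an alignment."""
    return {
        (name_a, name_b): percent_divergence(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Toy alignment of 20 nt (hypothetical sequences, not study data).
toy = {
    "seq1": "ATGCATGCATGCATGCATGC",
    "seq2": "ATGCATGCATGCATGCATGC",
    "seq3": "ATGCATGAATGCATGCATGC",
}
result = pairwise_divergence(toy)
for pair, value in result.items():
    print(pair, f"{value:.2f}%")
```

In this toy case a single mismatch in 20 nt gives 5% divergence, which shows how strongly the percentage depends on fragment length.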

E2 and 6 K fragment
Amongst the 16 sequences covering the 1103 nt E2 fragment, the nucleotide divergence ranged from 0.0% to 0.45%. When compared to SAVH20/03 (AY604235), amino acid substitutions were detected in two sequences (Table 3). In the 106 nt 6 K fragment, a nucleotide divergence between 0.0% and 0.94% was found, with five sequences showing an amino acid substitution (Table 3). The 17 sequences covering the shorter, 451 nt E2 fragment had a nucleotide divergence between 0.0% and 1.11%. Amino acid substitutions were observed in 12 sequences (Table 3).

nsP3 fragment
In the six sequences covering the 716 nt partial nsP3 fragment, the nucleotide divergence ranged from 0.0% to 0.28%. Amino acid substitution(s) were detected in two sequences (Table 3), one of which also showed an amino acid substitution in the E2 fragment.
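To put the reported maxima into perspective, the following sketch converts each percentage into an approximate number of differing nucleotides. It assumes that each maximum refers to a pairwise comparison over the full, ungapped fragment length stated above; the rounded counts are derived here and are not values reported in the study.

```python
# Fragment length (nt) and maximum reported pairwise divergence (%).
fragments = {
    "E2 (1103 nt)": (1103, 0.45),
    "E2 (451 nt)": (451, 1.11),
    "6 K (106 nt)": (106, 0.94),
    "nsP3 (716 nt)": (716, 0.28),
}


def implied_differences(length_nt: int, divergence_pct: float) -> int:
    """Nearest whole number of mismatches implied by a percentage divergence."""
    return round(length_nt * divergence_pct / 100.0)


def step_size_pct(length_nt: int) -> float:
    """Divergence (%) contributed by one single mismatch in the fragment."""
    return 100.0 / length_nt


summary = {
    name: (implied_differences(length, pct), round(step_size_pct(length), 2))
    for name, (length, pct) in fragments.items()
}
for name, (count, step) in summary.items():
    print(f"{name}: about {count} nt differences, {step}% per substitution")
```

Under these assumptions the maxima correspond to roughly five differences in each of the two E2 fragments, two in the nsP3 fragment and a single difference in the 6 K fragment, where one substitution alone accounts for 0.94%.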

Phylogenetic analyses
Both sequence-alignment programs and all three tree-generation methods produced identical results. Three phylogenetic trees were generated based on the nucleotide sequences of the obtained isolates: E2-6 K sequences, short E2 sequences and nsP3 sequences. All three trees showed similar topology. The tree covering the largest number of sequences, 33 sequences covering the 451 nt (short) E2 fragment, has been included in this paper (Figure 1). The Irish SAV 1 reference strain

Discussion
The observed nucleotide divergence amongst our study sequences was generally low, with the short E2 sequences showing the highest divergence (up to 1.11%), followed closely by the 6 K sequences (up to 0.94%).
The longer sequences showed a lower divergence, with the long E2 sequences showing a slightly higher divergence (up to 0.45%) than the nsP3 sequences (up to 0.28%). The low divergence amongst our Norwegian sequences corresponded well with that reported from previous analyses of SAV subtype 3 sequences; however, the divergence amongst our short E2 sequences was higher than that previously reported [13,15]. The divergence seen amongst the short E2 and 6 K sequences may be artificially inflated to some degree because these fragments cover only a relatively small portion of the respective genes, which may represent the most variable region within them. On the other hand, it may be that this within-subtype variance is a true representation of the current SAV subtype 3 affecting Norwegian aquaculture. The sequences included in this study, with one exception, covered the period from August 2006 to October 2009, and originated from affected populations both inside and outside the endemic region. Our analysis, covering a total of 33 Norwegian SAV subtype 3 sequences, is the largest reported analysis of Norwegian sequences and covers more recent sequences than those previously published. SAV subtypes originating in Ireland and Scotland have been reported to show higher nucleotide divergence than SAV subtype 3 (SAV subtypes 1, 2, 4, and 5: E2 fragment divergence 1.2%, 4.8%, 3.4% and 1.7%; nsP3 fragment divergence 0.8%, 6.6%, 3.7% and 4.2%) [13]. RNA viruses are generally rapidly evolving; however, alphaviruses, including SAV, appear to be comparatively highly conserved with slower rates of evolution [30][31][32]. It is possible that the observed difference in within-subtype nucleotide divergence between SAV subtype 3 and the other SAV subtypes is related to the differences in the proportion of susceptible populations (sites) affected in Norway compared to Ireland and Scotland. Based on the published reports, PD also appears to have been present in Scottish aquaculture for a longer time period.
In Norway the proportion of affected populations remains well below that seen in Ireland and Scotland, where the majority of susceptible populations are affected. This difference, together with the historical differences in emergence of PD, may have resulted in differing evolutionary pressure on the respective SAV subtypes. It is possible that a continued high impact on Norwegian aquaculture, with or without a further expansion in geographical distribution, may result in a gradual increase in the sequence divergence towards that of other SAV subtypes. Our results support the theory that there has been only a single introduction of SAV subtype 3 into Norwegian aquaculture, from which it has dispersed to reach its current distribution. The observed amino acid substitutions were partially the same as those previously reported in SAV subtype 3. Substitutions similar to those reported at E2 positions 204 (R to K) and 206 (S to P) [15] were seen in 11 of our sequences originating from four sites. In vitro studies have reported this serine to proline substitution at position 206 to be associated with the appearance of a cytopathic effect [15]. The in vivo significance of this substitution remains unclear. It was only possible to obtain reliable data on the PD-associated mortality for one of the sites where this substitution was seen (site 3, Table 1: 12.2%). Although this mortality was higher than the average observed in recently studied Norwegian Atlantic salmon sites affected by PD [9], the two sequences obtained from the study site with the highest mortality (site 2, Table 1: 26.9%) did not show this substitution. It cannot be determined from this study whether any particular amino acid substitutions have had an effect on the disease progression or the mortality of the affected sites; however, this should be investigated further in future SAV subtype 3 sequence analyses.
Figure 1 Phylogenetic tree based on 33 SAV subtype 3 study sequences and eight GenBank obtained sequences. The phylogenetic tree (NJ method, bootstrapped 1000 times) was based on a 451 nt E2 sequence. Bootstrap-values above 60 have been displayed.
The phylogenetic analyses revealed the presence of two clusters in the phylogenetic tree (Figure 1). Due to the low divergence between the sequences in the upper and lower clusters of the phylogenetic tree, the use of the term branch has been avoided. When comparing the sequences from the upper and lower clusters, a maximum of six nucleotide substitutions and four amino acid substitutions were detected. The upper cluster consists of 11 study sequences and two GenBank obtained sequences (previously found to form a separate cluster to other analysed sequences [15]), which all show the serine to proline substitution at E2 position 206. This group consists of sequences from Finnmark (sites 11 and 12, Table 1) and Troms (site 10, Table 1) together with six sequences from one site in Hordaland (site 3, Table 1 Table 1) grouped together with the remaining sequences in the lower cluster. This lower cluster also contained sequences originating from both the endemic and the non-endemic regions. One sequence from site 2 (SAVH07-2(2), Table 1) within the lower cluster separates to a certain degree from the remaining sequences. This sequence represents the site showing the highest recorded site mortality level in a recent cohort study, although no conclusion on the significance of this can be made. Sequences obtained from each site generally clustered close together. The exception to this was sequences from site 3 (Table 1), where sequences from both outbreaks (2007 and 2009) clustered in both the upper and the lower clusters. Any epidemiological interpretation of, for example, site-specific agent origin has proven difficult due to the high degree of similarity seen amongst the studied SAV subtype 3 sequences.

Conclusions
It can be concluded that the analysed sequences represented only a single subtype; however, some of the observed sequence divergence was higher than that previously reported by other researchers. The phylogenetic analyses confirmed that Norwegian SAV sequences can be separated into two clusters, although the differences between the two clusters were limited to at most six nucleotides and four amino acids. In the future, larger scale, full length sequence analyses would be desirable in order to enable complete sequence divergence analyses, together with investigations into the effect of particular amino acid substitutions in field outbreaks and epidemiological investigations on agent origin and spread.