Molecular characterization of highly pathogenic H5N1 avian influenza viruses isolated in Sweden in 2006

Background The analysis of the nonstructural (NS) gene of the highly pathogenic (HP) H5N1 avian influenza viruses (AIV) isolated in Sweden in early 2006 indicated the co-circulation of two sub-lineages of these viruses at that time. In order to complete the information on their genetic features and their relation to other HP H5N1 AIVs, the seven additional genes of twelve Swedish isolates were amplified in full length, sequenced, and characterized. Results The presence of two sub-lineages of HP H5N1 AIVs in Sweden in 2006 was further confirmed by the phylogenetic analysis of approximately 95% of the genome of twelve isolates that were selected on the basis of differences in geographic location, timing and animal species of origin. Ten of the analyzed viruses belonged to sub-clade 2.2.2. and grouped together with German and Danish isolates, while two 2.2.1. sub-clade viruses formed a cluster with isolates of Egyptian, Italian, Slovenian, and Nigerian origin. The amino acid differences revealed between the two sub-groups of Swedish viruses affected the predicted antigenicity of the surface glycoproteins, haemagglutinin and neuraminidase, rather than that of the nucleoprotein, polymerase basic protein 2, and polymerase acidic protein, the main targets of the cellular immune responses. The distinctive characteristics of the members of the two subgroups were identified and described. Conclusion The comprehensive genetic characterization of HP H5N1 AIVs isolated in Sweden during the spring of 2006 is reported. Our data support previous findings on the coincidental spread of multiple sub-lineages of H5N1 HPAIVs via migrating aquatic birds over large distances from their origin. The detection of 2.2.1. sub-clade viruses in Sweden adds further data regarding their spread in the North of Europe in 2006. The close genetic relationship of the Swedish sub-clade 2.2.2. isolates to the contemporary German and Danish isolates supports the proposition of the introduction and spread of a single variant of 2.2.2. sub-clade H5N1 avian influenza viruses in the Baltic region. The presented findings underline the importance of whole genome analysis.


Background
The first reports of outbreaks caused by highly pathogenic avian influenza viruses (HPAIV) of H5N1 subtype in 1996 originated from southern China [1]. Systematic influenza surveillance has shown that distinct genetic sub-lineages of H5N1 HPAIVs, reflecting their geographic origin, have been established since then among domestic poultry and have been transmitted over long distances by migratory waterfowl [2,3]. Europe experienced a peak of outbreaks of H5N1 HPAI in domestic poultry and wild birds in March 2006, supposedly as a consequence of an unusual westward movement of waterfowl from the Black Sea area [4][5][6]. The recent avian influenza virus strains of European-Middle Eastern-African (EMA) origin were assigned to three clades (EMA-1-3) based on the phylogeny of the complete genomes of the isolates [7], which are referred to as sub-clades 2.2.1.-2.2.3. according to the more recent nomenclature [8]. Further, clade 2.2. was classified into three sub-clades: Clade 2.2.1. appeared in Egypt, southern Germany, Italy, Mongolia, and some regions in sub-Saharan Africa. Clade 2.2.2. viruses were detected in northern Germany, Denmark, Sweden, Scotland, and Nigeria, while clade 2.2.3. viruses were demonstrated in India, Afghanistan, Italy, and Iran [9]. Simultaneous transmission of different strains was reported in several European countries such as Sweden [10], Germany [9], France and Italy [11]. Characterization of the Swedish H5N1 HPAIV isolates based on the nonstructural (NS) gene nucleotide sequences demonstrated that all belonged to clade 2.2. Although the majority of them clustered together with clade 2.2.2., viruses belonging to clade 2.2.1. were also introduced into Sweden [10].
The aim of this study was to further investigate the Swedish H5N1 HPAI viruses by sequencing twelve selected isolates representing four east-coast provinces of the area affected by the epidemic during March-April 2006. The sequence information was used to study the evolution and epidemiology of the outbreak of H5N1 in Europe during 2006. Further, an H5N1 strain isolated from a mink was investigated to reveal any possible adaptation towards mammals.
All twelve Swedish H5N1 isolates in this study belonged to the 2.2. clade and the phylogenetic trees of all eight genes had similar topologies. Representative trees of the HA and PB2 genes are shown (Figures 1 and 2). These data along with those generated from the other genes confirmed the close genetic relationship of H5N1 HPAIVs isolated in the northern region of Germany The separation of the Swedish H5N1 HPAIVs into two subgroups was already demonstrated on the basis of NS gene sequences [10] and this finding was consistent for all eight genes of the isolates (herein summarized in Additional file 1). No reassortant variant was found among the sequenced twelve Swedish isolates.
Figure 1 Evolutionary relationships of HA genes of Swedish HP H5N1 AIVs compared to genetically closely related H5N1 viruses isolated in Europe. The phylogenetic trees were generated by maximum parsimony analysis (neighbor-joining revealed similar tree topologies). Bootstrap values of 1000 resamplings in per cent are indicated at key nodes. The Swedish viruses are highlighted by bold letters.

Molecular characterization
Characteristic findings regarding the preservation/substitutions at particular amino acid positions along with potential distinctive molecular markers for the Swedish H5N1 viruses are summarized in Additional file 1.

Polymerase genes
A single amino acid substitution, from glutamic acid (E) to lysine (K) at position 627 in PB2, is a determinant of mammalian host range [13,14]. Most avian isolates have E at this position. The substitution to K at this position converts a nonlethal H5N1 influenza A virus isolated from a human to a lethal virus in mice [13]. PB1-F2 has been identified as a proapoptotic mitochondrial protein expressed from a second open reading frame of the PB1 gene [16] and it has been shown to contribute to viral pathogenesis in mice [17]. Asparagine at position 66 in PB1-F2 has been demonstrated to play a key role in the pathogenicity of H5N1 viruses [18] [11]. The identified amino acid markers of H5N1 influenza viruses isolated at Qinghai and Poyang Lakes from migratory birds (HA-I99, HA-N268, and NA-R110) were present in all Swedish isolates as well [11]. No "sub-clade"-specific amino acid changes were identified in the HA between the two subgroups of Swedish isolates. All the Swedish isolates had 238Q and 240G (numbered from the H5 start codon), which indicates preferred receptor specificity for the avian α-(2,3) linkage to galactose [19,20]. All HA sequences contained six potential N-linked glycosylation sites, as analysed with the NetNGlyc server (threshold: 0.5), at the following positions: 27, 39, 181, 302, 500, 559; none of them is adjacent to the cleavage site. Furthermore, the substitutions S145L and A172T, which are associated with viral adaptation to poultry [21], were not found in the Swedish H5N1 viruses.
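The glycosylation-site screen mentioned above can be approximated, as a rough first pass, by scanning for the N-X-S/T sequon (X being any residue except proline). The sketch below is only a motif scan and is not the NetNGlyc method, which scores each candidate with a neural network; the function name and the toy peptide are hypothetical and are not taken from the Swedish HA sequences.

```python
import re

def find_sequons(protein):
    """Return 1-based positions of the N in every N-X-[S/T] motif (X != P).

    Illustrative motif scan only; it does not reproduce NetNGlyc scoring.
    """
    # A lookahead is used so that overlapping motifs are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein.upper())]

# Hypothetical toy peptide, not a real haemagglutinin sequence.
toy_peptide = "MNGTAANPSKNASNNTT"
sequon_positions = find_sequons(toy_peptide)
print(sequon_positions)
```

In the toy peptide the asparagine followed by proline is skipped, which mirrors the usual sequon rule; a real analysis would run the full-length HA sequences through the prediction server and apply its score threshold.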
The amino acid substitutions R178I and I248V in HA that were found in the Danish isolates from domestic birds [22] were not present in any of the Swedish viruses, nor was the V73I substitution that was found in the Danish swan isolates. However, the D387N substitution found in the German and most of the Danish isolates was also present in the Swedish isolates.
The H5N1 virus isolated from a mink (A/Sweden/mink/2006/V907) was examined in order to reveal any possible adaptation towards mammals. A unique E513G substitution was found in the HA gene, but no substitutions that could be regarded as host-related were found. This is consistent with previous findings that a single passage in mammals is not necessarily associated with changes in receptor-binding sites [9].
As in the other 2.2. viruses, NA-R110 was present in the Swedish isolates, and a 20 amino acid deletion was also found at positions 49-68, as in the majority of the recent H5N1 strains [22]. The N228S substitution was present only in the A/Herring gull/1116/06 Swedish 2.2.1. virus (as in several other members of the sub-clade) and not in the A/Tufted duck/Sweden/599/06 isolate. These two isolates differed further at amino acid residues 414 (N/K) and 434 (S/G) in A/Herring gull/1116/06 and A/Tufted duck/Sweden/599/06, respectively. Interestingly, while the Danish and German isolates shared unique amino acids in the NA (G336D), PB1 (K531R) and NS2 (G63E) proteins, the Swedish isolates were not homogeneous in this regard: although NA-G336D was a characteristic of the Swedish viruses too, two isolates retained PB1-531K and NS2-63G. Reported substitutions in NA that induce oseltamivir resistance [9] were not found in the Swedish isolates.

The NP and M genes
The NP-10Y amino acid residue, which may affect the pathogenicity of AIVs [15], was present in all of the Swedish isolates. Concerning the M2 gene, all Swedish viruses contained the L26-V27-A30-S31-G34 amino acid pattern; thus, no adamantane-resistant variant was revealed [9]. Substitutions S64A and E66A that were present in the M2 genes of H5N1 AIV isolates from Hong Kong [11] did not appear in the Swedish viruses.
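Marker screening of this kind amounts to looking up the residue at fixed 1-based positions of a translated protein. The following is a minimal sketch, assuming ungapped protein sequences numbered from the start codon; the helper name, the placeholder sequence, and the S31N example mutant are illustrative assumptions and do not represent the Swedish isolates.

```python
def check_markers(protein, expected):
    """Compare residues at 1-based positions with the expected residues.

    Returns {position: (observed_residue_or_None, matches_expected)}.
    """
    report = {}
    for pos, residue in sorted(expected.items()):
        observed = protein[pos - 1] if 0 < pos <= len(protein) else None
        report[pos] = (observed, observed == residue)
    return report

# M2 residue pattern named in the text as the drug-sensitive configuration.
M2_ADAMANTANE_SENSITIVE = {26: "L", 27: "V", 30: "A", 31: "S", 34: "G"}

# Hypothetical placeholder sequence: 'X' everywhere except the marker positions.
residues = ["X"] * 40
for pos, aa in M2_ADAMANTANE_SENSITIVE.items():
    residues[pos - 1] = aa
toy_m2 = "".join(residues)

sensitive = all(ok for _, ok in check_markers(toy_m2, M2_ADAMANTANE_SENSITIVE).values())

# Introduce an S31N change into the placeholder to show how a deviation is flagged.
mutant = toy_m2[:30] + "N" + toy_m2[31:]
mutant_report = check_markers(mutant, M2_ADAMANTANE_SENSITIVE)
print(sensitive, mutant_report[31])
```

The same lookup could be reused for other single-position markers discussed here (for example PB2 position 627) by passing a different dictionary of positions and residues.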
The complete characterization of the NS genes from these isolates was described by Zohari et al. [10] and is not further discussed here.
The effect of substitutions on the predicted antigenicity was investigated among the Swedish isolates for the surface glycoproteins (HA and NA) and for those proteins primarily targeted by the host's cellular immune response (PB2, PA, and NP [23]) (Table 1). The observed amino acid alterations affected the predicted antigenic epitopes in few cases. Regarding the HA, in all but one case 22 epitopes were predicted by the Kolaskar-Tongaonkar approach [24]; the exception was strain A/herring gull/Sweden/1116/06, bearing a V201M substitution, which resulted in splitting the corresponding GKLCDLDGVKPLILRDCSVAGW predicted epitope (between amino acid residues 55-76) into two smaller ones: GKLCDLD (aa residues 55-61) and PLILRDCSVAGW (aa residues 65-76  [17][18]. However, in this case no splitting of epitope(s) was predicted due to a change in the amino acid sequence, but rather, the substitutions could be associated with the appearance of newer epitopes (data not shown). No change in the number of predicted epitopes was found for PB2 and PA. In general, the Swedish viruses coded for 15 epitopes on the NP, with the only exception of the sub-lineage 2.2.2. virus A/eagle owl/Sweden/V618/06, which had an additional epitope of seven amino acids between residues 22-28. In summary, the detected amino acid changes among the Swedish viruses appeared to have a greater effect on the proteins targeted by the humoral immune response, in particular the NA, than on those targeted by the cellular immune responses.

Conclusion
The incursion of H5N1 HPAIV strains falling into three sub-clades into Europe throughout late 2005 and 2007 has been demonstrated earlier [7]. Further reports and the analysis of the corresponding published sequences revealed the introduction of multiple variants of H5N1 HPAIV into several European countries, such as sub-clade 2.2.1. and 2.2.2. viruses into Germany, France, and Sweden [6,9,11,25], and sub-clade 2.2.1. and 2.2.3. viruses into Italy [7]. The Swedish 2.2.1. sub-clade viruses were closely related to A/Cygnus olor/Italy/808/2006 and A/mallard/Italy/835/2006 and shared several common nucleotide and amino acid motifs, among them, importantly, the PB2-627E, suggesting that they derived from an The number and composition of the immune reactive peptides predicted by computing indicated that the surface glycoprotein genes were more affected than the nucleoprotein, polymerase basic protein 2, and polymerase acidic protein, the main targets of the cellular immune responses.
The above observations, like those of studies with similar objectives, highlight the importance of whole genome sequencing of HPAIV isolates in order to improve surveillance of and preparedness against highly pathogenic avian influenza.

Viral isolates
The isolates involved in this study are shown in Table 2. They were collected during the HPAI outbreak in Northern Europe in spring 2006 [10].

RT-PCR and nucleotide sequencing
The collection of specimens, RNA extraction, and RT-PCR amplification of the NS1 sequences were described earlier, and the same RNA batches that served as targets in the previous investigation were used for this study [10]. In order to obtain, as far as possible, the full-length nucleotide sequences of the coding regions of the influenza virus isolates, several approaches were combined, comprising published protocols/primers [22,26,27], those developed and used by the Influenza Genome Sequencing Project ([28]; the primer sequences were kindly provided by David Spiro, The J. Craig Venter Institute, Rockville, Maryland, USA), and those designed by ourselves. The primers and PCR protocols for sequencing are available from the authors upon request.

Phylogenetic analysis
For the phylogenetic analyses, a set of H5N1 AIVs that were isolated in Europe, Asia and Africa in 2005-2006 was selected and used for all genes. The sequences were collected from the Influenza Virus Resource at NCBI [29].
Sequence assembly, multiple alignment and alignment trimming were performed with the CLC Combined Workbench 3.0.2. bioinformatics software (CLC bio A/S, Aarhus, Denmark). Distance-based neighbor-joining and character-based maximum parsimony phylogenetic trees were generated using the Molecular Evolutionary Genetics Analysis (MEGA) software v.4.0. [30] with 1000 bootstrap replicates. For the neighbor-joining trees, the Kimura 2-parameter substitution model was used. Other models were also tested and showed similar topologies. The evolutionary divergence between the sub-clades was investigated with the MEGA software by pairwise analyses over all sequence pairs between the groups. The occurrence and distribution of synonymous and nonsynonymous substitutions were investigated with the DNA Sequence Polymorphism software (version 4.50.3) [31]. Computational analysis of the antigenic sites was carried out using the Kolaskar-Tongaonkar method [24].
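For readers unfamiliar with the distance measure, the Kimura 2-parameter distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below illustrates this formula and a between-group mean over all sequence pairs; it is not the MEGA implementation, the function names and toy sequences are hypothetical, and sites with gaps or ambiguous bases are simply skipped per pair, which is an assumption rather than the setting used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # assumption: skip gaps and ambiguous bases for this pair
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # Raises a math domain error if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def mean_between_group_distance(group_a, group_b):
    """Average K2P distance over all sequence pairs taken between two groups."""
    distances = [k2p_distance(x, y) for x in group_a for y in group_b]
    return sum(distances) / len(distances)

# Hypothetical toy alignment: one transition in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))
```

Applied to two lists of aligned gene sequences, one per sub-clade, `mean_between_group_distance` gives a single summary of their divergence; bootstrap support and tree building are outside the scope of this sketch.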