Phylodynamics of avian influenza clade 2.2.1 H5N1 viruses in Egypt

Background Highly pathogenic avian influenza (HPAI) viruses of the H5N1 subtype are widely distributed within poultry populations in Egypt and have caused multiple human infections. Linking the epidemiological and sequence data is important to understand the transmission, persistence and evolution of the virus. This work describes the phylogenetic dynamics of H5N1 based on molecular characterization of the hemagglutinin (HA) gene of isolates collected from February 2006 to May 2014. Methods Full-length HA sequences of 368 H5N1 viruses were generated and were genetically analysed to study their genetic evolution. They were collected from different poultry species, production sectors, and geographic locations in Egypt. The Bayesian Markov Chain Monte Carlo (BMCMC) method was applied to estimate the evolutionary rates among different virus clusters; additionally, an analysis of selection pressures in the HA gene was performed using the Single Likelihood Ancestor Counting (SLAC) method. Results The phylogenetic analysis of the H5 gene from 2006–14 indicated the presence of one virus introduction of the classic clade (2.2.1) from which two main subgroups were originated, the variant subgroup which was further subdivided into 2 sub-divisions (2.2.1.1 and 2.2.1.1a) and the endemic subgroup (2.2.1.2). The clade 2.2.1.2 showed a high evolution rate over a period of 6 years (6.9 × 10−3 sub/site/year) in comparison to the 2.2.1.1a variant cluster (7.2 × 10−3 over a period of 4 years). Those two clusters are under positive selection as they possess 5 distinct positively selected sites in the HA gene. The mutations at 120, 154, and 162 HA antigenic sites and the other two mutations (129∆, I151T) that occurred from 2009–14 were found to be stable in the 2.2.1.2 clade. Additionally, 13 groups of H5N1 HPAI viruses were identified based on their amino acid sequences at the cleavage site and “EKRRKKR” became the dominant pattern beginning in 2013. Conclusions Continuous evolution of H5N1 HPAI viruses in Egypt has been observed in all poultry farming and production systems in almost all regions of the country. The wide circulation of the 2.2.1.2 clade carrying triple mutations (120, 129∆, I151T) associated with increased binding affinity to human receptors is an alarming finding of public health importance. Electronic supplementary material The online version of this article (doi:10.1186/s12985-016-0477-7) contains supplementary material, which is available to authorized users.


Background
Influenza-A viruses contain eight segments of singlestrand RNA (ssRNA) and they are continuously evolving overtime. Point mutations can introduce small changes known as genetic drift which mainly occurs because the virus polymerase lacks the proofreading property. These changes are thought to be selected by pressures that force the virus to mutate. Highly pathogenic avian influenza viruses of the H5N1 subtype caused severe outbreaks in 1996/97 in southern China and Hong Kong [1]. In recent years, the H5N1 viruses spread from Asia to Europe and then to Africa, becoming endemic in poultry in parts of Asia and Egypt with frequent transmission to humans. In Egypt, the H5N1 HPAI virus (clade 2.2) was first reported in poultry in February 2006. Since then, the virus has spread rapidly among commercial and backyard flocks in most of the governorates [2]. Human infection rates were still rising as of June 2, 2014, reaching 175 infections with 63 deaths. By May 2015, infections reached 342 with 114 deaths due to the emergence of a new cluster originated from 2.2.1.2 [3]. The H5N1 viruses were isolated from ducks, chickens, and humans in Egyptian households and clustered into a distinct genetic group designated as 2.2.1. The majority of viruses derived from vaccinated poultry in commercial farms belonged to the 2.2.1.1 clade of variant viruses [4][5][6].
Genotypic characterization of avian influenza H5N1 viruses with the study of the evolutionary dynamics of circulating viruses will promote understanding of the virus evolution in a particular place. In a situation like Egypt, the genetic diversity of viruses leads to the production of heterogeneous genotypes [5]. However, the mechanisms associated with the genotype diversity of H5N1 viruses have still not been investigated [7]. Genetic phylogeny is currently considered the gold standard in characterizing viral genomics, transmission, and molecular evolution. Attempts to analytically trace the migration of viruses through evolutionary history have been done to infer migratory events [8]. In order to enhance the understanding of H5N1 HPAI virus epidemiology and the disease dynamics, particularly in endemic countries, regular linking of epidemiological data from individual outbreaks with the respective sequence information is of paramount importance to decide the activity of efficient disease control.
The aim of this work was to study the evolution of Egyptian H5N1 viruses from 2006 to 2014 using longitudinal epidemiological and virological data. The phylogenetic analysis of the HA gene linked with spatial data analysis can help us to understand the geographic spread of those viruses. This work aimed to describe the cluster dynamics of circulating viruses and evaluate the persistence of H5N1 strains in annual epidemics. Through that analysis, it was possible to determine the evolution rates of the HA gene and to characterise different H5N1genotypes in poultry in Egypt.  Figure S1).

Results and discussion
Analysis of virus population dynamics of the entire data set of the Egyptian H5N1 viruses showed a rise in genetic diversity in the 2.2.1.2 cluster from early 2008, shortly after the first introduction of the H5N1 viruses in the country in 2006. From 2009 to 2014, the 2.2.1.2 cluster exhibited a constant progressive adaptation to poultry and was considered to be an endemic cluster [11]. Genetically and antigenically distinct viruses emerged in Egypt in late 2007 after vaccination began in poultry (referred to as subgroups E and F [12] or subclade B [13]  There is an apparent lack of disease notification and reporting in the commercial poultry sector in Egypt [14]. Thus HA gene sequences of H5N1 viruses since 2012 from this sector is lacking in most governorates. Phylogeography can highlight the drivers of H5N1 emergence and spread. Qalubiya appears to represent a popular location for virus transmission as also has been explored in previous study [15]. In addition, Sharkia and Dakahlia in Delta and Giza were of the same character as they have all virus clusters recorded in different time periods (Fig. 3). However, there remains uncertainty about virus spread to and from those locations and thus more research needs to be conducted in order to investigate this phenomenon.

Analysis of selection pressures and evolutionary rates
The Nonsynonymous/Synonymous nucleotide substitution ratio (dN/dS) per site was greater than one for nine individual sites in the HA1 domain of the HA gene (   Table 1). The results indicated that the 2.2.1.2 clade is under positive selection pressure that leads to more adaptation of the viruses to the environment and the maintenance of its endemic state.
The population dynamics analysis revealed a rapid increase in the genetic diversity of A/goose/Guangdong/1/ 96 lineage viruses from mid-1999 to early 2000 [7]. In this study, it was shown that the Egyptian H5N1 viruses exhibited high evolution dynamics in almost all governorates of the country. The viruses from clades 2.2.1.2 and 2.2.1.1a had the highest record of positive selection sites (Table 1), which may be attributed to vaccination pressure due to long-standing application of vaccines with high virus load in the endemic environment. This reflects the continuous adaptation of Egyptian viruses to the poultry and to their environment with persistent changes every season [16]. The genetic variation among the Egyptian viruses was previously reported and the presence of positive selection was recorded. In this regard, Cattoli et al. [13] indicated that evolutionary dynamics and positive selection significantly increased in virus populations in countries applying the avian influenza vaccination for H5N1, compared to viruses in countries that had never used vaccination. They also   (Table 2). In addition, the Bayesian skyride analysis of the 2.2.1.2 viruses from 2009 to 2014 showed that the genetic diversity is directly proportional to the annual prevalence peaks. The genetic diversity of the variant clusters from 2007 to 2010 showed a higher pattern of increase followed by a sharp decline in 2011 (Fig. 5a, b).
The evolutionary analysis of Egyptian viruses revealed that these viruses have progressive rates of evolution. The factors related to this increase were mainly attributed to the sub-optimal use of vaccines and long-lasting virus persistence in the environment leading to the endemic prevalence of 2.2.1.2 viruses over six successive years. Cattoli et al. [13] revealed that the two main Egyptian clades (designated as A and B) have cocirculated in domestic poultry since late 2007 and exhibited different profiles of positively selected codons and rates of nucleotide substitution. The mean evolutionary rate of clade 2.2.1 H5N1 viruses was estimated in their study as 4.07 × 10 −3 nucleotide substitutions per site, per year whereas clade 2.2.1.1 viruses possessed a markedly higher substitution rate (8.87 × 10 −3 ) and that reflected the high genetic diversity among Egyptian viruses.

Molecular characterization and genetic analysis of HA gene
The analysis of the 368 HA genes enabled us to examine changes in the receptor binding site (RBS), antigenic sites (AS) and the cleavage site.

Changes in the receptor binding site
The most characteristic change in the receptor binding site of the Egyptian viruses included in this study was the observation of one amino acid deletion at site 129 (129Δ) that was not recorded in the ancestral strain (A/   1.1a), which showed S129L substitution. The substitution was more pronounced in the 2.2.1.1a cluster and was linked with another substitution (P74S) in most cases (28/35) ( Table 3). The most recently isolated viruses (n = 56) that were collected between 2012 and 2014 were examined for HA gene mutations and belonged to the 2.2.1.2 clade in which three mutations (129Δ, I151T, and S120(D,N)) were shown to be constant ( Table 3).
The H5N1 viruses from Egypt displayed four characteristic mutations (D43N, S120(D,N), (S,L)129Δ, and I151T). The results showed that 57 % of the HA sequenced genes showed a triple mutation (129Δ, S120(D,N), and I151T) ( Table 3). These triple mutations are characteristic in 2.2.1.2 clusters from different bird species such as chicken, duck, turkey, geese, ostrich, and quail; however, few of the 2.2.1.2 viruses did not carry them. The percentage of those viruses with the triple mutation reached 100 % from 2012 to 2014. Two mutations of those (129Δ, I151T) had increased attachment and infectivity to the human lower respiratory tract, but not in the larynx [17]; that indicates an increasing possibility of human infections in Egypt.
The majority of variant viruses of clade 2.2.1.1 had no changes in the receptor binding site at position 129 (129Δ) ( Table 3), and they were not responsible for any human infection in Egypt, except for one case in early 2008 during the beginning of widespread prevalence of this cluster [20]. In addition, Perovic et al. [21] indicated the extensive evolution of Egyptian H5N1 HPAI virus towards human. They reported that all G2 viruses (referred to as 2.2.1.2 clade in our study) displayed four characteristic mutations (D43N, S120(D,N), (S,L)129Δ and I151T). The other mutations that are linked to increased affinity to human receptors like "S223N, D183G, E186G, Q192R, Q222L and G224S" were not found in the Egyptian H5N1 viruses.

Changes in the antigenic sites
Amino acid substitutions at the antigenic sites can cause antigenic drift, possibly leading to vaccine escape as observed in the field. In comparison to the virus introduced in 2006, the characteristic changes that occurred in the antigenic sites of HA gene are very specific to each cluster (  (Table 3).
There have been 21 potential antigenic sites identified in the HA of H5N1 HPAI. It was shown that a single amino acid change in the HA of H5N1 HPAI virus can affect immune response and protection [22]. The antigenic analysis of the earlier H5N1 variant strains in Egypt demonstrated antigenic variation [12,23], which was driven by multiple mutations primarily occurring in the major antigenic sites at the globular head of HA [24]. Other studies showed that the classic clade 2.2.1 strains are antigenically related and crossreactive to the ancestral Asian H5N1 strains, but demonstrated weak cross-reactivity with the Egyptian variant 2.2.1.1 strains [23,25]. The majority of these mutations, alongside the other 19 amino acid mutations, were located within or adjacent to the receptor binding domain (RBD) in the HA1 that may affect the virus replication and transmission. There were six conserved mutations in previously reported antigenic sites (D43N, S120(N,D), S129Δ, I151T, D154N and R162K) between the early 2006 strain and the endemic 2.2.1.2 cluster strains. It was shown that the D43N mutation resulted in antigenic drift between classic 2.2.1 and 2.2.1.2 clusters [26].
The results of the present study showed clear differences among the virus clusters in terms of the absence or presence of certain changes in the antigenic sites. For instance, the majority of 2.2.1.2 cluster had four changes occurring at the same time in the antigenic sites (S120 (D,N), R162K, I151(T,L) and D145 (A,N)), while the latter two sites were lacking in the 2.2.1.1 cluster. Ibrahim et al., 2013 [26] observed an antigenic variation between different H5N1 clusters, especially between the variant 2.2.1.1a and the endemic 2.2.1.2 cluster strains showing a significant antigenic drift. Some residues were located in different antigenic sites like 133S, 154D and 156A, 190L and 192Q and 71L. They also confirmed that the 2.2.1.1 cluster showed a broader reactivity to all strains that represent different H5N1 clusters circulating in Egypt, as it shared residues with all the strains in the major antigenic sites.

Changes in the cleavage site
There were 13 groups of viruses identified based on the amino acid sequences at the cleavage site. Most of the H5N1 HPAI virus isolates belonging to 2.2.1 (42/75), 2.2.1.2 (105/224) and 2.2.1.1a (7/38) as well as all the viruses that belong to 2.2.1.1 cluster (31/31) have a common cleavage site of "ERRRKKR", described as the consensus cleavage site for clade 2.2 viruses [27]. However, the pattern "EKRRKKR" became dominant from 2013 and replaced the previous pattern which disappeared after 2012 ( Table 4). The pattern "EGRRKKR" of amino acid sequence was exclusively present in 2.2.1.1a cluster and represented the highest proportion (23/38) among the other six patterns in the same cluster, while the pattern "ERRRKR" was observed only in 30.6 % (23/75) of the viruses that belong to the 2.2.1 cluster ( Table 4).
The currently dominant amino acid cleavage site pattern "PQGEKRRKKR/GLF" is closely associated with the mutation 129Δ at the receptor binding site, and mutations at the antigenic sites (S120D, I151T, D145N and R162K) which are known to increase the binding ability to human receptors. All these mutations characterize the dominant cluster (2.2.1.2) from 2011 onwards [26].  significantly reduce pathogenicity without altering the transmission efficiency of H5N1 HPAI virus [28] and shows that non-adaptive mutations can play a role in virus evolution as the 2.2.1.1a cluster disappeared in Egypt since 2011. The high diversity of the HA gene in relation to some governorates indicates active virus circulation in different locations. In this study, the hypervariability of the HA gene was noticed in relation to the geographical location (data not shown). Viruses from Qalubiya, Giza, Menufia and Dakahlya governorates had the highest number of heterogeneous amino acid sequences of the HA gene. In addition, unequal virus distribution was noticed among governorates and that favour virus persistence in an endemic area (Fig. 3). All the above mentioned results support the existence of genetic diversity of HA gene in Egypt with progressive virus evolution in a model of intermittent re-emerging H5N1 viruses to a clean areas located inside an endemic environment.

Conclusions
Evolution of H5N1 HPAI viruses in Egypt continues to occur in all poultry farming and production systems and in almost all regions of the country. mutations associated with increased binding affinity to human receptors is an alarming finding of public health importance. Continuous monitoring of the circulating viruses and sequencing of HA and other genes, in particular the NA gene, is important to better select viruses for vaccine studies and to understand the evolution of viruses over time. Regular data sharing among professionals in the animal and public health sectors will allow linking of epidemiological and sequence information and will provide a clear picture on the virus evolution.

Nucleotide sequencing of HA gene
The H5N1 HPAI virus isolates and field samples from 368 cases of H5N1 were collected in Egypt during the period from 2006 to 2014. They were collected from different localities in Lower and Upper Egypt, different bird species (chicken, duck, turkey, geese, quail and ostrich) and different poultry value chain nodes like households, commercial poultry farms and live bird markets.
The full length HA gene sequencing has been conducted, where the ribonucleic acids (RNAs) of virus isolates or samples were extracted using QiaAmp viral RNA extraction kit (Qiagen, Germany) according to the manufacturer's instructions. A one-step RT-PCR was conducted on the extracted RNAs using specific primers for Matrix (M) and H5 genes [29,30]. The PCR products were purified using a QiaAmp purification kit (Qiagen, Germany). The HA gene sequencing was done using a Bigdye Terminator Kit (version 3.1; Applied Biosystems, Foster City, CA) on a 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA). The sequencing of the HA gene was conducted at NLQP and the data were regularly submitted to the GenBank and are available at the National Center for Biotechnology Information (NCBI) Influenza Virus Resource. Recently, new sequence data from 2012-2014 were added in the GenBank under accession numbers of KJ522707-KJ522745 and KP209286-KP209303.

Phylodynamics of HA gene
After excluding the sequences from duplicate strains, 365 out of 368 full-length HA genes of Egyptian H5N1 viruses were used for this analysis. For the estimation of the rates of nucleotide substitution among H5N1 viruses from Egypt, the Bayesian Markov Chain Monte Carlo (BMCMC) method (as implemented in BEAST v1.4.7) was applied [31]. The Bayesian GMRF skyride coalescent tree model was used [32]. The uncorrelated lognormal relaxed (UCLD) clock [33] that allows evolutionary rates to vary along branches within lognormal distributions was used and Hasegawa-Kishino-Yano (HKY) substitution model with empirical base frequencies and gamma site heterogeneity model at 4 categories. Mean evolutionary rates and divergence times were calculated using Tracer V.1.5 [34]. The phylogenetic trees were visualized with FigTree v.1.1.2 [35].

Analysis of selection pressures
The site-specific selection pressures for the HA gene of Egyptian H5N1 viruses was measured as the ratio of nonsynonymous (dN) to synonymous (dS) nucleotide substitutions per site (dN/dS) or omega (ω). Normalized dN-dS was estimated as the raw dN-dS divided by the total length of the tree measured in the number of expected substitutions per nucleotide per site. The estimates were made using the Single Likelihood Ancestor Counting (SLAC) method available at the Datamonkey online version of the Hy-Phy package [36]. This analysis used input Neighbor Joining (NJ) phylogenetic trees estimated according to the HKY model of nucleotide substitution. A site with ω > 1 is indicating positive selection. Statistical distributions were used to model the variation in ω among sites, allowing a subset of sites to have ω >1 while the rest of the sequence may be under purifying selection with ω < 1 with p-value of less than 0.05 [37].

Genetic characterization of HA gene
In the present study, HA genes of 368 Egyptian H5N1 viruses were genetically characterized and studied for evidence of genetic mutations in different parts of the gene, including receptor binding, antigenic and cleavage sites. Multiple and pairwise sequence alignments were constructed using the Clustal-W algorithm of Bio-edit® software V.7.1.11 [38]. The mutations in the receptor binding and antigenic sites were tabulated against the virus clusters to explore their proportion among the sequenced HA genes and in order to identify the common antigenic differences between the virus clusters. The acquisition or loss of glycosylation sites of known importance in the HA gene was recorded. The amino acid sequences at the receptor binding, antigenic and cleavage sites were grouped and tabulated per cluster and year.