Genetic diversity of HA1 domain of heammaglutinin gene of influenza A(H1N1)pdm09 in Tunisia

We present major results concerning isolation and determination of the nucleotide sequence of hemagglutinin (HA1) of the pandemic (H1N1)pdm09 influenza viruses found in Tunisia. Amino acid analysis revealed minor amino acid changes in the antigenic or receptor-binding domains. We found mutations that were also present in 1918 pandemic virus, which includes S183P in 4 and S185T mutation in 19 of 27 viruses analyzed from 2011, while none of the 2009 viruses carried these mutations. Also two specific amino acid differences into N-glycosylation sites (N288T and N276H) were detected. The phylogenetic analysis revealed that the majority of the Tunisian isolates clustered with clade A/St. Petersburg/27/2011 viruses characterized by D97N and S185T mutations. However it also reveals a trend of 2010 strains to accumulate amino acid variation and form new phylogenetic clade with three specific amino acid substitutions: V47I, E172K and K308E.


Introduction
The incidence of recurring epidemics is primarily attributed to the high frequency of mutational changes in the hemagglutinin (HA) and neuraminidase (NA) major surface glycoproteins. The hemagglutinin protein, responsible for viral uptake into host epithelial cells, is also the antigen targeted by antibodies. The hemagglutinin, a trimeric surface glycoprotein that binds the viral receptor and promotes fusion and penetration from low-pH endosomes, is the principal surface antigen on influenza virions [1]. HA presents conserved and variable epitopes, but neutralizing antibodies against the latter dominate the response to immunization and infection [2]. The HA1 subunit of the H1N1 hemagglutinin protein consists of the globular head and contains four major antibody binding sites: Sa, Sb, Ca1, and Ca2, [3,4]. There are determined on the basis of the assessment of antigenic variations of virus mutants [5,6]. Mutational changes (antigenic drift) at these four sites are thought to be driven by selective antibody pressure, which can produce novel strains capable of evading the immunologic response. Because of the high morbidity and mortality due to influenza epidemics, monitoring of the accumulated antigenic variations in circulating influenza viruses is important for predicting epidemics, severity, and for the design of future vaccines [7]. Glycosylation at antigenic sites is an important mechanism of immune evasion by influenza virus [2,8,9]. In fact, the well-known seasonal drift of influenza virus antigenicity accounts for the absence of long-term immune protection in previously infected individuals. In this study we try to present major mutation found in Tunisian pandemic strains.

Sources of specimens
Between May 2009 and January 2011, influenza throat swab specimens were collected from patient in the sentinel centre and patient hospitalized in the hospital in different region in Tunisia. The criteria included patients with a fever >37.8°C accompanied by cough or sore throat. Throat swabs were taken within 72 hours of the onset of symptoms, placed in viral transport media (Vircell, Sequencing of viral RNA genome Viral RNA was extracted from 140 μl of clinical sample using the QIAmp Viral RNA Mini kit ® (QIAGEN). In order to detect and assign the A(H1N1)pdm09 strains isolated from patients, a real-time PCR assay was used CDC protocol [10]. A representative number of influenza viruses were genotypically characterized by analysis of the nucleotide sequence of partial haemagglutinin HA1 chain (931 nucleotide residues). All viruses analysed were amplified and sequenced according to the protocol of National Influenza Centre Madrid [11]. Mega 4.0 software [12] was used to analyze variations in antigenic sites, virulence-related sites, and glycosylation sites of influenza A (H1N1)pdm09 viruses. Analysis of the phylogenetic relationship was based on nucleotide sequences of HA of the isolated viruses and carried out by the maximumlikelihood method against sequences from global isolates. The evolutionary history was inferred by using the maximum likelihood method based on the Kimura twoparameter model [13]. Sequences generated in this study were deposited in the Genbank database with accession numbers: JN037716-JN037761.

Amino acid variations in antigenic sites
Amino acid alignment of the HA of Tunisian isolates in 2009-2010 and 2010-2011 seasons against global isolates was performed, and the results are summarized in Table 1 (Table 2). Among these amino acid differences, H138Q, S203T, R205K, D222G, S162I, S183P, and S185T substitutions were located within the antigenic HA1 sites (Ca1, Ca2, Sa, and Sb) ( Table 3) in several Tunisian sequences studied. The highest number of variations in predicted antigenic sites were observed in Ca1, where S203T was observed in 46 out of 50 (92%) viruses analyzed and R205K in 10 out of 50 viruses (20%). Analysis of other positions of antigenic sites revealed that Ca1 (H138Q) with 1% substitution, and Sa where S162I (1%) to be highly conserved. Whereas Ca2 had a substitution (D222G/E) in 4 out of 50 (8%).
Several signature amino acids found in 1918-like viruses were observed in sequence data analyzed. Two additional mutations similar to 1918-like virus emerged in 2011 only: Proline at position 183 in four virus sequences from 2011 season ( Table 2); and threonine at position 185 in 19 sequences only from 2011 season.
The most common variation compared to the vaccine strain was P83S, which was actually observed in almost of the Tunisian isolates (96%). Furthermore, a high proportion, 98%, of Tunisian strains possessed the variation I321V. It was observed that the remaining virus that retained isoleukine at this position was associated with severe illness, patient developed pneumonia. Most importantly, it was one strain, A/Tunisia/15656/2009 with altered N-linked glycosylation site, due to variation N276H. All but one of the pandemic viruses had retained all the potential 6 N-glycosylation sites. The one exception (A/Tunisia/15656/2009) had lost a glycosylation site with a change at amino acid residue 276 from asparagine (N) to histidine (H).
It was an alteration to preexisting glycosylation site, and none of the Tunisian isolates gained new glycosylation sites. Moreover, It was one strain, A/Tunisia/2144/ 2011 showed a switch into N-linked glycosylation site (T288N), associated with severe pneumonia. Interestingly, this case has got three more substitutions in the antigenic sites (S203T, S162I and S183P). Another interesting strain, A/Tunisia/1423/2011 contained three substitutions (H138Q, S203T and R205K). This strain was associated with a severe pneumonia in pregnant woman.
Moreover, three viruses possessed the previously identified D222G/E mutation at the Ca2 antigenic site of the HA. Interestingly, three out of four patients with the D222G mutation (75%), suffered from pneumonia and were hospitalized. One of those infections was fatal (33.33%). However, this mutation was also detected in one virus that caused mild illness (25%).

Phylogentic analysis of H1N1pdm09 viruses from 2009 and 2011
To understand the genetic diversity of influenza A (H1N1)pdm09 in Tunisia, sequencing of 50 HA1 regions of isolates collected in either 2009-2010 season (n = 23) and 2010-2011 season (n = 27) was undertaken (Table 1). Tunisian strains were compared to that of the vaccine strain A/California/7/09 (H1N1) and a number of variations were detected. Substitutions are mentioned in detail in Table 2 and Table 3. Phylogenetic analysis of the HA1

Discussion
Pandemics are believed to arise when a novel avian or swine influenza HA and/or NA are acquired through reassortment between human, swine, and avian influenza viruses or by a non-human virus adapting to efficient human transmission [15]. The current pandemic H1N1pdm09, which is the result of genetic reassortment of multiple gene segments from different lineages, has a known evolutionary history of about a century [16]. Thus, the unpredictability of viral mutations that arise during continued circulation in humans raises questions about antigenic evolution and its impact on future severity of infection, and on vaccine composition. We herein report the characterization of 50 HA genes of the pandemic (H1N1) pdm09 influenza viruses isolated in Tunisia between September 2009 and February 2011, were analysed and their phylogenetic relationships with references strains were showed. We provide evidence that while overall the viruses were fairly homogeneous; there are subtle differences in predicted antigenic epitopes which are potential targets of positive selection pressure.
We further compared the amino acid residues constituting the 7 antigenic sites of A/California/07/2009 with those of the Tunisia isolates. Almost all residues were highly conserved in the Tunisian isolates, however that detailed characterization of the HA gene of H1N1pdm09 has identified some mutations in four antigenic sites (Ca1, Ca2, Sa, and Sb) that are exposed for antibody recognition [16]. The Ca site is proximal to the oligosaccharide at HA1 Asn87, which may interfere with antibody recognition of this region. The sequence data showed that amino acid substitutions were scattered at different loci. However positions 114, 185, 203, 205, 222, and 321 appear the most variable as amino acid residues at these positions vary more than once in the viruses examined. In fact, Melidou et al. 2010 suggest that the occurrence of 321I in severe cases could only be an effect of changing frequency over time rather than an association to severity [17]. An area of largely conserved residues lies between amino acids 280 and 327 and represent part of the stalk structures that supports the globular region of HA1 [18]. Some changes at different positions within the four antigenic domains were observed in sequences from Tunisia. In fact, 11 out 17 (64.7%) Tunisian strains defined by an amino acid change within the haemagglutinin (HA) gene at position 185 (S185T) were from severe cases. Recently, outbreaks varying in severity throughout the UK were caused by a 2009 A(H1N1)pdm09 variant [19]. Analysis of NCBI database sequences revealed the presence of serine at position 183 among H1N1pdm09, while all 1918 (H1N1) viruses have proline at this position. Whether substitution of serine at position 183 has reduced the receptor-binding ability of the (H1N1)pdm09 virus resulting in a less virulent phenotype remains to be determined. It is important to note that 4 of the 2011 viruses from Tunisian severe cases have acquired a proline at this position, which, in combination with additional mutations, could lead to altered pathogenicity [20]. Table 3 Dynamic changes of amino acid residues at antigenic and N-glycosylation sites of influenza A(H1N1)pdm09-HA in Tunisia (Continued) Tunisia/1713/2011  V  -T  -------A/Tunisia/2133/2011  V  -T  ----T  --A/Tunisia/2137/2011  V  -T  ----T -- In addition to the variations in the viral HA antigenic sites, observations were made on the number of Nglycosylation sequons in the globular head region of the HA. Since glycosylation can potentially affect the antigenic properties of influenza A virus, we analyzed the changes in the potential N-linked glycosylation sites of the pandemic virus strains. Potential N-glycosylation sites are found in the HA molecule of the A/California/ 07/2009 virus, six of which reside in the HA1 and the remaining two in the HA2 region. In fact, the absence of glycosylation may influence the function of the glycoprotein, as it is involved in the protein folding, oligomerization, quality control, sorting and transport [17,21]. Among the Tunisian isolates, there were two strains that had variations to glycosylation sites. A/Tunisia/15656/ 2009 bared the substitution N276H that resulted to the loss of a glycosylation site (NTT). And A/Tunisia/2144/ 2011 possessed the variation T288N into the glycosylation site. D222G substitution in the HA gene, reported for the first time in influenza A(H1N1)pdm09 virus on November 2009 in Norway [22], has been found again associated to severe patients during the winter 2011 [14]. In the present study, D222G change was detected in 5.2% of severe patients analyzed, percentage similar to that previously reported in 2009-2010 season in Tunisia [23].
Of greater significance is the finding that two of the viruses studied from 2011 shared the amino acid changes V47I, E172K and K308E. In fact, since the genetic characterization algorithms were established in 2010 for pandemic H1N1pdm09 viruses, a new genetic group characterized by these mutations, cited recently, in Spanish study [14]. According to the last ECDC report [24], eight genetic groups were presented whereas the new genetic group reported in this study was indentified as A/Madrid/SO8189/2010. Phylogenetic analysis of these strains showed that, as expected, they belonged to the same clade with the vaccine strain. Further antigenic analysis is needed to assess the characteristics of the rest of the circulating viruses, especially those that possessed variations to antigenic and glycosylation sites of the HA.
However, on the 28th of December, WHO reported that so far all of the circulating pandemic (H1N1) 2009 viruses are antigenically related to A/California/7/2009 [25]. Disease severity is difficult to assess, it can change with different geographic contexts and under different seasonal conditions and may alter as the virus adapts to its new host. Our findings confirm the genetic instability of influenza type A (H1N1) viruses and highlight the importance of continuous molecular surveillance for the effective management of influenza epidemics.
Abbreviations ECDC: European surveillance centre for disease prevention and control; WHO: World Health Organization.
(See figure on previous page.) Figure 1 Phylogenetic relationship of partial length HA sequences of influenza A(H1N1)pdm09 viruses in Tunisia. The tree was rooted with the vaccine strain A/California/07/2009 (boxed) as outgroup. Branch lengths are drawn to scale. Signature amino acid changes (H1 numbering) are annotated at the nodes of each cluster.