Influenza virus undergoes rapid evolution by both antigenic shift and antigenic drift. Antibodies, particularly those binding near the receptor-binding site of hemagglutinin (HA) or the neuraminidase (NA) active site, are thought to be the primary defense against influenza infection, and mutations in antibody binding sites can reduce or eliminate antibody binding. The binding of antibodies to their cognate antigens is governed by such biophysical properties of the interacting surfaces as shape, non-polar and polar surface area, and charge.
To understand forces shaping evolution of influenza virus, we have examined HA sequences of human influenza A and B viruses, assigning each amino acid values reflecting total accessible surface area, non-polar and polar surface area, and net charge due to the side chain. Changes in each of these values between neighboring sequences were calculated for each residue and mapped onto the crystal structures.
Areas of HA showing the highest frequency of pairwise changes agreed well with previously identified antigenic sites in H3 and H1 HAs, and allowed us to propose more detailed antigenic maps and novel antigenic sites for H1 and influenza B HA. Changes in biophysical properties differed between HAs of different subtypes, and between different antigenic sites of the same HA. For H1, statistically significant differences in several biophysical quantities compared to residues lying outside antigenic sites were seen for some antigenic sites but not others. All influenza B antigenic sites showed statistically significant differences in biophysical quantities, whereas no statistically significant differences were seen for any antigenic site in H3. In many cases, residues previously shown to be under positive selection at the genetic level also undergo rapid change in biophysical properties.
The biophysical consequences of amino acid changes introduced by antigenic drift vary from subtype to subtype, and between different antigenic sites. This suggests that the significance of antibody binding in selecting new variants may also be variable for different antigenic sites and influenza subtypes.
Influenza virus undergoes rapid evolution in nature by both genetic shift, where one (or more) of the eight gene segments is exchanged from one virus into another, and genetic drift, whereby mutations accumulate in viral genes, presumably due to the relatively error-prone replication of the viral RNA. This presents a significant challenge for vaccine design, as new vaccines must be produced almost every year in order to provide the best match with viruses likely to circulate in the coming influenza season. While other potential targets for vaccination to protect against influenza infection are under investigation [3, 4], it is likely that vaccines based on the intact surface proteins of influenza viruses will remain in use for the foreseeable future. The activities of both hemagglutinin (HA) and neuraminidase (NA) are essential to viral function, and antibodies recognizing HA and NA are the primary defense against viral infection. Antibodies binding near the receptor-binding site of HA [6, 7] or the substrate binding site of NA [8, 9] strongly inhibit viral function, so it is presumed that mutations in these binding sites which reduce or eliminate antibody binding confer a significant evolutionary advantage.
Studies of changes occurring in human influenza isolates and the selection of “escape mutant” variant viruses resistant to neutralizing monoclonal antibodies have allowed the delineation of critical neutralizing antigenic sites in both HA and NA. In many cases, a single amino acid change is sufficient to reduce, often drastically, the neutralizing effect of antibody. Studies of interactions between mutant influenza NA and monoclonal antibodies at the biochemical and structural level have revealed at least two classes of binding phenomena. For some antibody-antigen pairs, the contribution of some amino acids in the epitope is much more important than that of others, presumably because interactions with these amino acids contribute much more to the antibody binding energy [10, 11]. For other antibody-antigen pairs, the contribution of each amino acid in the epitope is approximately similar [12, 13], suggesting that considerations such as shape complementarity between the binding site on the antibody and the antigenic site are critical to antibody binding. Biophysical analyses of antigen/antibody pairs consisting of either lysozyme and monoclonal antibody or idiotype/anti-idiotype monoclonal antibody pairs suggest that epitopes that are tightly bound by antibody may often have a hydrophobic core surrounded by hydrophilic amino acids, suggesting that both entropy and electrostatics are important in antibody binding (reviewed in ). It should be noted that the total number of antibody/antigen pairs that have been analyzed at the biophysical level remains small, so any generalization must be made with caution.
As first suggested by Darwin, evolution is presumably governed by a complex interplay between positive selection for a novel function, such as a new enzyme specificity or escape from antibody binding, and negative selection against those changes which have a deleterious effect on the protein’s structure or critical functions or interactions. To begin to understand the forces shaping the evolution of influenza virus HA, we have examined HA sequences available in the National Center for Biotechnology Information (NCBI) Influenza Database. We reasoned that, if ongoing selection by neutralizing antibodies is important, those residues targeted by neutralizing antibody will continually change over time. Thus, we have made pairwise comparisons between aligned sequences to look for changes in closely related HAs. We have both quantitated the frequency of change of individual amino acids, and attempted to understand how these changes affect the biophysical properties of individual residues within HA. Our studies indicate that the types of changes observed at different antigenic sites vary between influenza subtypes, and between individual antigenic sites in the same HA. We also demonstrate that many HA residues shown by others to be under positive selection at the genetic level [17, 18] also have a propensity to undergo changes in biophysical properties. These data may prove useful in developing algorithms to better predict future changes in influenza antigens to improve influenza vaccine design.
Influenza sequences and sequence alignments
Amino acid sequences for the HA1 domain of HA were analyzed from human clinical H1N1 (n = 531, 1918–2008, i.e. excluding 2009 “Swine-origin” pandemic isolates), H3N2 (n = 968, 1968–2005), and influenza B (n = 209, 1940–2007, alignments performed without separating out Victoria and Yamagata lineages) isolates. Because many sequences did not contain complete sequence data for the HA2 portion of the molecule, analyses were performed solely for the HA1 portion. Amino acid sequences were obtained and a best fit alignment performed using MUSCLE, as implemented in the NCBI Influenza Virus Resource (http://www.ncbi.nlm.nih.gov/genomes/FLU). Incomplete and duplicate sequences were removed prior to alignment where possible. See Additional file 1 for sequence alignments used in this study.
Pairwise comparison of aligned sequences
Aligned sequences from NCBI were uploaded into Kalignvu (http://msa.sbc.su.se/cgi-bin/msa.cgi) to produce a dataset containing complete amino acid sequences, which was then imported into Excel (Microsoft, Redmond WA). The absolute number of pairwise changes at each position was determined and divided by the total number of sequences. This was designated ∆abs, and represents the frequency of any amino acid change at a given position. Note that, under this approach, a single change from the root sequence which is then perpetuated throughout the rest of the sequences in the alignment will have a low value for ∆abs, whereas a position where different amino acids can occur in different sequences will have a much higher ∆abs.
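The ∆abs calculation can be illustrated with a minimal Python sketch. This is not the spreadsheet workflow used in the study. The function name `delta_abs` and the toy four-residue alignment are hypothetical, and a change is assumed to be counted whenever a residue differs from the residue at the same position in the sequence immediately above it in the alignment table.

```python
def delta_abs(alignment):
    """Per-position frequency of change between neighboring aligned sequences.

    Assumption: a "pairwise change" is a mismatch between a sequence and the
    sequence directly above it; the count is divided by the number of sequences.
    """
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    freqs = []
    for pos in range(length):
        changes = sum(
            1
            for upper, lower in zip(alignment, alignment[1:])
            if upper[pos] != lower[pos]
        )
        freqs.append(changes / n_seqs)
    return freqs


# Hypothetical toy alignment (not real HA sequence data).
toy = ["KTSS", "KTSS", "RTSS", "RTNS", "RTSS", "RTNS"]
print(delta_abs(toy))
```

In the toy data, the first column changes once (K to R) and the change is then perpetuated, giving a low value, whereas the third column alternates between S and N and gives a higher value, mirroring the behavior noted in the text.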
Parameterization and calculation of change in biophysical properties
Each amino acid in the dataset was then assigned values for ∆ASAtot, ∆ASAnp, and ∆ASApol (Table 1, [21, 22]). Each amino acid was also assigned a value for net charge at pH 7.0 (Q, Table 1) based on the side chain pKa, with completely ionized acidic and basic residues being assigned values of −1 and +1, respectively. For every residue in HA, pairwise changes in each parameter were calculated by subtracting the assigned value from that at the same position in the sequence immediately above it in the alignment table (i.e. the most closely related sequence). The absolute values of these differences were averaged for the same position in all sequences in the alignment table, then normalized to ∆abs to generate Normalized Change Index (NCI) values for ∆∆ASAtot, ∆∆ASAnp, ∆∆ASApol, and ∆Q. Thus, in cases where no change was observed between the two sequences, the numerical value of the difference was zero, but where a difference occurred, the value represents the average magnitude of the difference every time a change occurs. Because of the normalization to ∆abs, a frequently occurring conservative change can be readily distinguished from a rarer, non-conservative change. Values for ∆abs, ∆∆ASAtot, ∆∆ASAnp, ∆∆ASApol, and ∆Q for each amino acid position in HA were analyzed statistically to determine the median, 75th percentile and 90th percentile values for each dataset using Kaleidagraph (Synergy Software). Rapidly changing residues were defined as those residues in the 75th percentile and above in terms of ∆abs.
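A minimal Python sketch of the NCI calculation for a single parameter follows. The function `nci` and the toy alignment are hypothetical. The `CHARGE` table is a simplification that sets fully ionized acidic and basic side chains to −1 and +1 and treats every other residue as 0, so it does not reproduce the Table 1 values. The sketch also assumes that the mean absolute difference and ∆abs share the same denominator (the number of sequences), so that the NCI reduces to the average magnitude of the difference per observed change.

```python
# Simplified side-chain charges; all other residues default to 0.0 (assumption).
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def nci(alignment, values, default=0.0):
    """Normalized Change Index per position for one biophysical parameter."""
    n_seqs = len(alignment)
    result = []
    for pos in range(len(alignment[0])):
        changes = 0
        total_abs_diff = 0.0
        for upper, lower in zip(alignment, alignment[1:]):
            if upper[pos] != lower[pos]:
                changes += 1
                total_abs_diff += abs(
                    values.get(lower[pos], default) - values.get(upper[pos], default)
                )
        d_abs = changes / n_seqs               # frequency of any change
        mean_abs_diff = total_abs_diff / n_seqs  # averaged over all sequences
        # Invariant positions have no change to normalize by, so report 0.0.
        result.append(mean_abs_diff / d_abs if changes else 0.0)
    return result


# Hypothetical toy alignment (not real HA sequence data).
toy = ["KTD", "RTD", "KTN", "RTD"]
print(nci(toy, CHARGE))
```

Here the first column changes at every step but always between K and R, so its charge NCI is zero (a frequent, conservative change), while the third column changes less often but gains or loses a full charge each time (a rarer, non-conservative change).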
Values for parameters assigned to each amino acid
aValues for ΔASAtot, ΔASAnp, and ΔASApol as in . Note that values for ΔASAtot are all based on comparison to the surface area of glycine, which is set to 0.
bCalculated at pH 7.0, setting completely ionized acids and bases at −1.00 and 1.00, respectively.
cValues for deleted or missing amino acids were chosen arbitrarily.
Structural analysis, assignment of antigenic sites, and statistical analysis
To allow comparison of changes in biophysical parameters with previously defined antigenic sites and the receptor binding pocket, amino acid residues in the crystal structures of H1, H3, and influenza B HA were color-coded to represent NCI values for biophysical parameters (see individual figure legends for structures used in each case), using Mac PyMol (DeLano Scientific LLC). Rapidly changing residues (∆abs ≥ 75th percentile) were color-coded based on whether the NCI value of interest fell below the median, was between the 50th and 75th percentile, between the 75th and 90th percentile, or above the 90th percentile for HA1 residues in terms of NCI values for ∆∆ASAtot, ∆∆ASAnp, ∆∆ASApol and ∆Q. See figure legends for further details.
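The binning used for color-coding can be sketched in Python as below. The percentile method (linear interpolation between ordered values), the handling of values that fall exactly on a boundary, the bin labels, and the function names are all assumptions made for illustration; the source does not state how Kaleidagraph computed percentiles or which colors correspond to which bin.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between ordered values (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    rank = (len(ordered) - 1) * pct / 100.0
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def bin_rapidly_changing(d_abs, nci_values):
    """Map each rapidly changing position to an NCI percentile bin."""
    cutoff = percentile(d_abs, 75)
    p50, p75, p90 = (percentile(nci_values, p) for p in (50, 75, 90))
    bins = {}
    for pos, (freq, value) in enumerate(zip(d_abs, nci_values)):
        if freq < cutoff:
            continue  # not in the 75th percentile and above for delta-abs
        if value < p50:
            bins[pos] = "below median"
        elif value < p75:
            bins[pos] = "50th-75th"
        elif value < p90:
            bins[pos] = "75th-90th"
        else:
            bins[pos] = "90th and above"
    return bins


# Hypothetical per-position values (not taken from the study).
d_abs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
nci_values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(bin_rapidly_changing(d_abs, nci_values))
```

The resulting position-to-bin mapping could then be translated into per-residue colors in a structure viewer.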
Rapidly changing amino acids on the outer surface of the respective HA1 monomers formed surface patches roughly analogous to the previously described neutralizing antigenic sites of H1, H3, and B HA, and were deemed to belong to these antigenic sites. The properties of these antigenic sites were compared statistically by comparing the value of each parameter for all the residues assigned to a particular antigenic site to a dataset comprising all amino acids from the HA1 portion of the same HA molecule not assigned to antigenic sites (non-antigenic site residues). It is assumed that non-antigenic site residues include both amino acids that cannot be altered without deleterious effects on structure or function and residues subject to genetic drift but where antibody-mediated selection is unlikely to occur. The majority of non-antigenic site residues undergoing rapid change are on solvent-exposed surfaces not likely to be accessible to antibody, such as on the back of the monomer. Statistical comparisons were performed using Kruskal-Wallis ANOVA with Dunn’s post-test (GraphPad Prism).
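For readers unfamiliar with the test, the Kruskal-Wallis H statistic can be computed as in the following Python sketch, which ranks the pooled values (tied values share their mean rank) and applies the standard tie correction. This is an illustration only and not the GraphPad Prism implementation; it omits the p-value (obtained by comparing H to a chi-square distribution with k − 1 degrees of freedom for k groups) and Dunn’s post-test. The function name and the example values are hypothetical.

```python
from itertools import chain


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    if len(groups) < 2 or any(len(group) == 0 for group in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)

    mean_rank = {}
    tie_term = 0
    start = 0
    while start < n_total:
        stop = start
        while stop < n_total and pooled[stop] == pooled[start]:
            stop += 1
        # Tied values occupy ranks start+1 .. stop; assign their mean rank.
        mean_rank[pooled[start]] = (start + 1 + stop) / 2.0
        ties = stop - start
        tie_term += ties ** 3 - ties
        start = stop

    correction = 1.0 - tie_term / float(n_total ** 3 - n_total)
    if correction == 0.0:
        raise ValueError("all values are identical; H is undefined")

    rank_sum_term = sum(
        sum(mean_rank[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3.0 * (n_total + 1)
    return h / correction


# Hypothetical NCI values for one antigenic site and for non-antigenic site residues.
site_values = [0.9, 1.2, 1.5]
non_site_values = [0.1, 0.2, 0.4]
print(kruskal_wallis_h(site_values, non_site_values))
```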
Effect of alignment on biophysical parameters
To test for potential biases due to a particular method of alignment, and any effect of potential alignment error, we generated a dataset to represent each sequence, composed of antigenic site residues paired with a set of randomly selected residues from the HA1 region of each HA. These amino acids were extracted from each sequence, then the datasets containing the extracted residues representing each sequence were re-organized such that each dataset (representing a single sequence) now had new “nearest neighbors” in the data table. Values for Δabs, ΔΔASAtot, ΔΔASAnp, ΔΔASApol, and ΔQ were recalculated for each amino acid in the dataset based on the new arrangement of sequences. This resorting process was carried out twenty times to achieve a partially randomized arrangement of datasets. Statistical comparisons between the parameter values for the antigenic site amino acids and the randomly selected residues were performed both for the original alignments and the resorted datasets using Kruskal-Wallis ANOVA with Dunn’s post-test.
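The effect of resorting on neighbor-based quantities can be sketched in Python as follows. The text does not specify the resorting operation, so this sketch assumes each round is a random shuffle of the row order; note that repeated full shuffles give a fully rather than partially randomized order, so this is a stand-in for the procedure rather than a reproduction of it. The function names, the seed, and the toy rows are hypothetical.

```python
import random


def neighbor_change_frequency(rows):
    """Per-column frequency of change between each row and the row above it."""
    n_rows = len(rows)
    return [
        sum(1 for upper, lower in zip(rows, rows[1:]) if upper[pos] != lower[pos])
        / n_rows
        for pos in range(len(rows[0]))
    ]


def resort_and_recalculate(rows, rounds=20, seed=0):
    """Reorder rows for the given number of rounds, then recompute the frequencies.

    Assumption: each round is a full random shuffle of the row order.
    The input list is left unchanged.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    for _ in range(rounds):
        rng.shuffle(shuffled)
    return neighbor_change_frequency(shuffled)


# Hypothetical extracted residues: two well-separated groups of identical rows.
rows = ["AAAA"] * 5 + ["BBBB"] * 5
original = neighbor_change_frequency(rows)
resorted = resort_and_recalculate(rows)
print(original, resorted)
```

In the ordered table, each column changes only once, at the boundary between the two groups. Any other ordering can only keep or increase the number of neighbor changes, which illustrates why the recalculated values depend on whether nearest neighbors in the table are in fact the most closely related sequences.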
Results and discussion
Sequence alignment and parameterization
Amino acid sequences were aligned using the multiple protein sequence alignment tool MUSCLE. Since we wish to test the hypothesis that antibody selection is a key player in virus evolution, and this acts at the protein level, we elected to align amino acid rather than nucleic acid sequences. An alignment algorithm based on pairwise sequence comparison was chosen over other approaches because we wished to compare sequences on the basis of pairwise differences in values reflecting amino acid properties. We reasoned that sequences aligned in such a fashion as to minimize pairwise differences, as is the case with MUSCLE, would provide the most conservative approach, although we cannot rule out the possibility that potentially important sequence differences might be obscured. Amino acids in the alignment tables were then parameterized based on one of four properties: side chain size (measured by solvent-accessible surface area), hydrophobicity (measured by solvent-accessible non-polar surface area), hydrophilicity (measured by solvent-accessible polar surface area), or side-chain charge. Values pertaining to each property of interest were then compared mathematically to determine whether there was any trend in changes at a particular site in the protein (see Methods).
Prediction of novel sites in potential antigenic sites in H1 and B HA
Neutralizing antigenic sites have been described for human H1 [23, 25], H3 [7, 26], and influenza B HA. For each HA, we calculated the average number of changes between neighboring aligned sequences (Δabs), and mapped these on to the surfaces of HA structures (Figure 1). There is reasonably good agreement between the previously described antigenic sites and residues with high Δabs values, especially those at or above the 75th percentile (red and orange residues in Figure 1a). This is particularly true for H3, the human influenza HA best characterized at the antigenic level. Residues in each of the previously described H3 HA antigenic sites (A-E) are represented in the residues with the highest Δabs values (Figure 1a, Table 2), suggesting that pairwise sequence analysis for determining frequencies of change is a useful method for predicting residues that may be evolving in response to antibody selection. Somewhat unexpectedly, we also find rapidly changing residues on the rear face of the monomer, which would not be expected to be accessible to antibody, at least in the neutral pH conformation. Two of these residues, amino acids 220 and 229, have been shown to be under positive selection at the genetic level based on comparing rates of synonymous and non-synonymous nucleotide substitutions.
aH1 numbered according to the amino acid position in A/Puerto Rico/8/34/Mount Sinai.
bResidues showing rates of change in the top 25% of all residues were assigned to previously described epitopes [7, 23, 24].
cNot defined in prior studies.
dAmino acid 132a deleted in A/Puerto Rico/8/34/Mount Sinai, but present in other strains prior to 1997.
eResidues assigned to this epitope in this work only are italicized.
fResidues assigned to this epitope in this work and in previous studies are underlined.
gResidues assigned to this epitope in previous studies not meeting our inclusion criteria are enclosed in parentheses.
hH3 HA residues numbered as for mature HA1 of A/Aichi/2/1968.
iInfluenza B HA residues numbered as for B/Lee/40.
jPrevious studies define a single epitope at the top of influenza B HA, but we have elected to divide this into two because of apparent functional differences suggested by the pattern of biophysical changes we observe.
When Δabs values were mapped onto the surface of the influenza H1 HA monomer from the crystal structure and compared to antigenic sites described for A/Puerto Rico/8/34 (H1N1, Figure 1a), there is good agreement between residues showing high Δabs values and the previously identified Sb and Sa antigenic sites on the top of the HA molecule ([23, 25], yellow and orange, respectively), roughly akin to the B antigenic site of H3 HA. The Ca2 antigenic site, below the receptor binding site (RBS), structurally analogous to the A site in H3 (blue in Figure 1a), shows some overlap with residues in this region showing high Δabs values, but higher values are seen for neighboring residues that form part of a prominent projection immediately below the RBS. Overlap with the remaining previously-described H1 antigenic sites, Ca1 and Cb (olive and red in Figure 1a), is less extensive. Additionally, high Δabs values predict an additional antigenic site composed of a shelf-like projection below the Cb antigenic site, analogous to the C antigenic site in H3. For ease of further discussion, we will refer to this as H1C. We note that a somewhat similar antigenic site in H1 HA has been reported elsewhere. Residues assigned to each antigenic site are listed in Table 2. As for H3, H1 residues at the rear of the monomer are also changing relatively rapidly, and one of these, amino acid 98, has been shown to be positively selected. Strikingly, differences in Δabs values between the Sa antigenic site on the top of the H1 monomer and non-antigenic site residues are not statistically significant, suggesting that the rate of change at this antigenic site is not high, possibly because this site is constrained to preserve some unknown function. This is the case even though the loss of a glycosylation site at this antigenic site seems to be a critical antigenic difference between “seasonal” H1N1 strains circulating between 1977 and 2008 and the pandemic “Swine-origin” 2009 H1N1 strains.
All other antigenic sites described are statistically significantly different from non-antigenic site residues in terms of Δabs values.
When compared to a previous antigenic map of B HA, residues with high Δabs values match well with the best defined antigenic site, analogous to the influenza A H3 B and H1 Sb antigenic sites, lying above the RBS. Antigenic sites analogous to the H3 B, D, and E antigenic sites were previously defined, some by as few as three residues. Based on high Δabs, our studies support the existence of important antigenic determinants in these areas of the molecule, and suggest the existence of two additional antigenic sites on influenza B HA. One is found on a shelf-like structure below the previously-described E antigenic site, analogous to the H3 C and H1C sites. For ease of discussion, we will refer to this as the BC antigenic site. On the top of the molecule, adjacent to the previously described H3 B-like antigenic site, we observe a putative novel antigenic site analogous to the Sa site in H1. For ease of further discussion, we will refer to these as the BB1 and BB2 antigenic sites, respectively. The BB1 antigenic site consists of a “knob” of residues above the RBS, while BB2 consists mainly of a ridge of rapidly changing residues across the top of the molecule.
Site-specific differences in biophysical properties in H1 HA
When biophysical properties of those residues in H1 undergoing most frequent changes (Δabs values in the 75th percentile and above) were examined, there were quite striking differences between different antigenic sites. Changes in NCI values for ΔΔASAtot for the Ca2, Sb, Sa, and H1C antigenic sites were not statistically significant compared to changes in NCI values for ΔΔASAtot for non-antigenic site residues in H1 HA (Figure 2, Table 3), suggesting that the volume occupied by individual amino acids, and hence the shape of the surface in these regions associated with antibody binding, is relatively conserved. This suggests that the overall shape of these antigenic sites is not particularly important in antibody recognition, and so changes to the shape of the antigenic site do not confer a selective advantage. Alternatively, the shape of the antigenic site must be conserved to prevent loss of some other important function, such as binding of cell surface receptors or putative co-receptors. In contrast, two antigenic sites on the side of the trimer, Ca1 and Cb, did show statistically significant changes in ΔΔASAtot NCI, suggesting that changes in the shape of the surface in this region are at least tolerated, if not advantageous due to disruption of antibody binding. The Ca1 antigenic site is close to the trimer interface, so changes in shape might alter interactions between monomers, potentially affecting stability and influencing the pH of the transition to the fusion-active conformation. No significant changes in ΔΔASAnp NCI are found in any of the H1 HA antigenic sites. The H1C and Cb sites show significant differences in changes in NCI values for ∆∆ASApol, and the Ca2 antigenic site shows significant differences in ∆Q compared to non-antigenic site residues in HA1.
Statistical comparison of antigenic sites to non-antigenic residues
bStatistics: Kruskal Wallis one-way ANOVA (non-parametric) with Dunn’s post-test
cNon-antigenic site residues are all residues in HA1 not assigned to a particular antigenic site
Biophysical properties of frequently changing H3 residues
Sequences of HA genes from 958 human H3N2 influenza isolates were analysed as described above (Figure 3, Tables 2 and 3). Unlike H1, we did not observe statistically significant differences in ∆∆ASAtot, ∆∆ASAnp, or ∆∆ASApol NCI between any of the H3 antigenic sites and non-antigenic site residues. Although the potential importance of charge in evolution of H3 antigenic sites has also been recently suggested, we did not find statistically significant differences in ∆Q between rapidly changing residues and non-antigenic site residues for any H3 antigenic site. Strikingly, some of the least conservative changes occur in residues within antigenic site D and at the rear of the trimer (Figure 1d), in areas of the molecule at least partially occluded by the neighboring monomer in the 3D structure. It has been suggested that these changes affect antibody binding at a distance by changing the conformation at the surface. Other studies demonstrate that the trimer may adopt a more open conformation than seen in the crystal structures, at least transiently, exposing these residues to antibody. Thus, changes in the region of the trimer interface may act to increase or decrease the stability of the trimer, and covariation of residues interacting in the interface between neighboring monomers might be expected to occur. Alternatively, the rate of change of residues expected to be occluded based on the crystal structure may represent a background rate of amino acid change, and all areas of the molecule undergoing change at lower rates may actually be undergoing negative selection to maintain important functions such as interaction with alternate receptors or putative co-receptors [29, 32].
Differences in biophysical properties define separate adjacent antigenic sites in B HA
HA genes from 209 influenza B isolates were also studied (Figure 4, Tables 2-3). Unlike influenza A H1 and H3 HAs, influenza B HA shows NCI values for ΔΔASAtot, ΔΔASAnp, ΔΔASApol, and ∆Q that differ significantly between antigenic site residues and non-antigenic site residues at every antigenic site, with the sole exception of ΔΔASApol NCI values at antigenic site BE. These findings suggest that changes in B HA antigenic sites may be more likely to confer selective advantage than those occurring in H1 and H3 HAs.
Observed changes in biophysical properties are dependent on alignment
To determine whether our findings were dependent upon the quality of the sequence alignment, NCI values for antigenic site residues were compared to a randomly chosen set of ten residues from the same HA (Table 4). The tables of sequences, with each sequence now represented by a dataset comprising the antigenic site residues (Table 2) and the randomly chosen residues, were then rearranged such that each sequence dataset now had new sequences as nearest neighbors, compared to its position in the original alignment. NCI values for Δabs, ΔΔASAtot, ΔΔASAnp, ΔΔASApol, and ∆Q were calculated for each amino acid position before rearrangement, and after twenty rounds of resorting, which we believe represents a partial randomization of the sequence order. In many cases, the degree of statistical significance differed between the same datasets in the original alignment and following partial randomization (Table 4). The fact that the statistical significance is altered when the data obtained reflect an alignment where the nearest neighbor sequences are not necessarily the most closely related suggests both that our analysis is yielding important information about changes between the most closely related sequences, and that our conclusions might be skewed if the alignment of sequences is poor.
Effect of partial randomization of sequence dataset on antigenic site statistics
pb from aligned sequencesc (pb from resorted sequencesd)
bProbability determined from comparison of antigenic site residues to randomly selected residues (see below) determined using Kruskal-Wallis ANOVA
cAmino acid sequences aligned using MUSCLE. See Additional file 1 for resultant sequence alignment.
dAntigenic site residues, along with a set of randomly selected residues (below), were extracted from each sequence, then the datasets containing the extracted residues representing each sequence were re-organized and Δabs, ΔΔASAtot, ΔΔASAnp, and ΔQ recalculated for each amino acid in the dataset based on the new arrangement of sequences (see Methods)
gRandomly selected influenza B HA residues: 4, 17, 36, 47, 78, 152, 196, 222, 251, 273, 300, 304
Comparison of changes in biophysical properties to other techniques to identify evolutionarily important residues
We wished to compare our results to those of others who have attempted to identify residues in influenza HA which might have evolutionarily predictive value (Table 5, Figure 5). A recent study of human seasonal H1N1 viruses identified eight residues in HA1 which were apparently under positive selection. Of these, all but one are also found in our dataset of amino acid residues (Table 2), and statistically significant differences are found between this dataset and the non-antigenic site residues from H1 HA1. Amino acid 98, the lone residue not assigned to an antigenic site in our studies, is highly variable, but is found on the solvent-exposed surface on the rear of the monomer. Studies of residues which were changed in viruses forming new branches within the H3N2 HA phylogenetic tree identified a group of 19 residues which seemed to be predictive of forming a new branch; of these, all but three are also assigned to antigenic sites in our study. Two of these (190 and 194) are adjacent to the receptor binding site and do not change at sufficiently high frequency to meet our inclusion criteria, and the remaining residue (262) is solvent exposed on the lip of the monomer at the trimer interface. This dataset is statistically significantly different from the H3 HA1 non-antigenic site residues in terms of the absolute frequency of amino acid change, but not in any other quantity examined. We also compared our data to a dataset of sites in H3 HA1 undergoing directional selection, another means of identifying accelerated substitutions at a specific site. As for the residues identified in the studies described above, many of the residues identified by this technique are also identified as antigenic site residues in our analysis. In contrast to the residues identified by Bush et al. and the antigenic site residues from our methodology, the dataset of directionally selected residues differs statistically significantly from non-antigenic site residues for both ∆abs and NCI values for ∆Q. We note that a large number of amino acids are invariant in our dataset, particularly in H1 and influenza B. For those residues making critical structural interactions, this is presumably the result of negative selection to maintain structural integrity, but for those residues on the surface it is difficult to distinguish between the effects of negative selection to maintain a previously unappreciated function and the background rate of mutation in the absence of positive selection.
Changes in biophysical properties of positively-selected or “predictive” amino acids
iSee  Residues not in HA1 are excluded from this analysis
Role of alteration in biophysical properties in antibody-mediated selection of variant viruses
Insights into the mechanism of antibody binding have been derived from structural, biophysical, and biochemical characterization of antibody-antigen pairs, particularly for hen-egg lysozyme and anti-idiotypic antibodies (reviewed in ), and influenza A HA (reviewed in ) and NA [10, 13, 37, 38]. Changes in shape of the antigenic sites due to changes in the volumes of individual side-chains were monitored by examining ΔΔASAtot. Larger NCI values for ΔΔASAtot suggest that an amino acid with a small side-chain surface area has been replaced with a larger amino acid or vice-versa. The biophysical quantities ΔΔASAnp, ΔΔASApol, and ΔQ measure the propensity of residues to participate in certain kinds of interactions. Charged residues will interact with residues of opposite charge and be repelled by residues of like charge. Charged and polar residues can also participate in hydrogen bonding, either with water molecules or with other proteins. Hydrophobic interactions between non-polar surfaces are important in protein-protein interactions because they contribute a positive entropy change that favors the bound state, and hydrophobic surfaces are a feature of at least some antibodies showing evidence of affinity maturation. However, solvent-exposed hydrophobic surfaces are energetically unfavorable.
Changes in shape may drive evolution of some antigenic sites
Statistically significant changes in ΔΔASAtot NCI values were seen for antigenic sites on the side of H1 HA (Cb and Ca1) and for all antigenic sites described for influenza B HA, but not for the antigenic sites at the top of the H1 HA (Sb and Sa), for the Ca2 or H1C sites, or for any antigenic site in H3 HA. Thus, the ΔΔASAtot NCI values we measured suggest that the shape of the surface is altered significantly by the accumulation of mutations for some antigenic sites, and thus changes in the overall shape of these antigenic sites may contribute to escape from antibody binding. For those antigenic sites not showing significant differences in ΔΔASAtot, such as the Sb and Sa antigenic sites on the top of H1, the shape of the surface may be critical to maintaining other hitherto unappreciated functions in virus binding or entry.
Changes in thermodynamic properties may influence antibody escape
Statistically significant changes in ΔΔASAnp NCI values were found for all antigenic sites in influenza B HA, and for previously described positively selected residues in H1 HA. The fact that the influenza B HA antigenic sites have some hydrophobic character might indicate that they play some other role in the function of HA, so there may be important functional reasons for hydrophobic residues to be retained. Antibody binding sites studied to date at the structural and biophysical level seem to fall into at least two classes: a first, in which the antigenic site consists of a central core of hydrophobic residues, often surrounded by an outer ring of hydrophilic amino acids, and a second, in which hydrophilic residues and immobilized water molecules seem to play an important role. In the first class, the so-called “O-ring” epitopes, much of the binding energy is contributed by the increase in entropy due to the liberation of the highly ordered water molecules at the hydrophobic residues in both antibody and antigen. Thus, mutation of hydrophobic residues in the antigenic site would be expected to reduce the binding energy of the antibody-antigen complex, as has been shown in vitro [11, 39]. We note that many of the positively selected residues identified by Li et al., which as a group show significant differences in ΔΔASAnp NCI values compared to non-antigenic site residues in H1 HA (Table 5), are also identified in our study. These residues are mainly found in the Ca2, Sb, and Sa antigenic sites. In the Sb and Sa antigenic sites, positively selected residues are clustered together near the center of our antigenic sites, suggesting that these amino acids may act as the hydrophobic core of “O-ring”-like epitopes (Figure 2).
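The entropic argument above can be stated with the standard Gibbs free energy relation. This is a textbook identity, not a result of this study, and the subscript "bind" is a label introduced here for illustration.

```latex
\Delta G_{\mathrm{bind}} = \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}}
```

When ordered water is released from hydrophobic surfaces on binding, the entropy term is positive, which makes the free energy of binding more negative and so favors the bound state. A mutation that removes a hydrophobic residue from the core of such an epitope would be expected to shrink this entropy gain and weaken binding.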
Biophysical and structural studies show that charge-charge interactions (“salt bridges”) can make critical contributions to both the extent and rate of antibody binding. It is therefore logical that changes in charge within an antigenic site may confer a selective advantage, as seen in the H1 Ca2 and influenza B HA antigenic sites. The loss of a critical charged residue would be expected to have a deleterious effect on both the rate and extent of antibody binding, and the gain of a novel charged residue could either prevent antibody binding through electrostatic repulsion or alter the rate of binding by changing the “electrostatic steering” required for correct alignment of an antibody with its cognate antigenic site (see  for review).
Forces shaping evolution of influenza HA may vary between subtypes and antigenic sites
Differences between the different HAs, and between antigenic sites of the same HA molecule, suggest that the “rules” for selecting changes at these sites may differ. The rates of change of amino acid identity were significant for all H3 and influenza B HA antigenic sites compared to non-antigenic site residues, and for all but the Sa antigenic site of H1 HA. This result is somewhat surprising, given that an important structural difference in this antigenic site between the pandemic “Swine-origin” H1N1 influenza virus emerging in 2009 and prior seasonal H1N1 apparently played an important role in the susceptibility of the many people born after 1957 to the pandemic virus. No statistically significant changes in the other quantities studied were observed for H3 HA, suggesting that the antibody repertoire against H3 HA, if responsible for selecting the changes observed, is sufficiently discriminatory that even highly conservative amino acid substitutions are sufficient to confer a selective advantage. Interestingly, some residues on the surface of the HA monomer that are apparently undergoing rapid change may not be antibody accessible, at least based on the available crystal structures, suggesting that their evolution may be controlled by other factors. This is particularly true of the rapidly changing residues on the “rear” of the HA monomer, which would not be expected to be solvent exposed in the neutral pH trimer form, although there is good evidence to suggest that the HA trimer is less rigid in vivo than expected from available crystallographic and electron microscopy data, allowing the trimer structure to open and close. These residues may vary simply because they are not under negative selection: they would not be expected to participate in any of the known functions of HA and are not involved in stabilizing its secondary, tertiary, or quaternary structure.
Possible implications for influenza evolution and immunity
Our data raise several important issues in understanding the function of influenza HA and the host immune system. First, there appear to be important differences between evolution of H3 HA and that of H1 and influenza B HA. This suggests that immune responses to H3 HA may be functionally different from the immune responses to H1 and influenza B. Differences in the role of antibody selection between influenza B and H3N2 viruses have been proposed previously. Our data suggest that even conservative structural or biophysical changes in H3 HA antigenic sites may be sufficient to confer a selective advantage. Influenza B and H1 HAs may also be more subject to structural or functional constraints, so fewer kinds of changes are permitted. A second possibility is that escape from antibody neutralization may not be a significant positive selection for H3N2 viruses in vivo, and changes in the neutralizing antigenic sites may be selected because they act in concert with other changes in replication to generate fitter progeny, as observed with recent human H3N2 isolates.
The specific kinds of changes observed in antigenic sites in influenza B and H1 HAs may also suggest that the antibody repertoires specific for these sites are more restricted than for H3N2 viruses, and thus a particular type of change may be reflected in the antibody response of many individuals. The primary anti-influenza antibody response in humans may not be truly polyclonal, at least against influenza B and H1 HAs. Instead, certain heavy and/or light chain rearrangements and combinations may be more likely to confer tight binding to individual antigenic sites. Studies in humans vaccinated against H1N1 and H3N2 also showed that the primary response is highly restricted, with some donors having only small numbers of unique VH and VL rearrangements represented but showing evidence of significant diversification due to somatic hypermutation. Similarly, studies in BALB/c mice immunized with influenza A/Puerto Rico/8/34 (PR8) showed that certain heavy and light chain genes, and particular VH-VL combinations, were overrepresented in the primary antibody response [45–47], with more than 50% of the antibodies in the primary response targeted to a particular antigenic site sharing a single VL gene. Interestingly, those antibodies most abundant in the primary response were not as frequent in the secondary response, which showed a broader representation of VH and VL genes. Thus, the apparent differences in behavior we observe at different antigenic sites could represent the effects of positive selection by a set of primordial anti-influenza antibodies overrepresented in the primary antibody response.
If positive selection by antibody does indeed play an important role, understanding how influenza virus persists in a large and outbred population with a highly diverse immune system, such as humans, presents something of a conundrum. The viruses circulating each year are closely related both to each other and to the viruses circulating in the previous year. It has been suggested that certain individuals in the population play a disproportionate role in the spread of influenza; such “superspreaders”, should they exist, might also play a role as “superselectors” in modulating the virus repertoire in the human population. The existence of some sort of primordial antibody response in which a particular VH, VL, or V(D)J rearrangement predominates would also explain apparent differences in behavior between different antigenic sites in the same molecule, since each antigenic site would be under the selection of a different set of primordial antibodies that are consistent from individual to individual. Thus, influenza viruses evolving to escape this primordial response in one individual would have a selective advantage in other human hosts.
The role of antibody selection remains a critical open question in understanding evolution of influenza virus in the human population. Our data suggest that the relative contribution of positive selection for antibody escape may vary from subtype to subtype and site to site. Other data suggest that there is a complex interplay between antigenicity and receptor utilization. For example, infection of immunized mice with mouse-adapted influenza virus gave rise to numerous HA mutations which simultaneously altered both receptor binding and antibody neutralization. Analyses of clinical H3N2 viruses from 2003 to 2008 indicated that these viruses had become progressively restricted in terms of the types of sialic acids bound, correlating with a decreased requirement for receptor-matched NA activity [50–52]. Since, as seen in HA, antigenic sites on NA are also located on the lip of the receptor binding pocket, adjustments in receptor binding could either drive or result from changes in antigenicity of HA, or even changes in NA. Finally, in the context of the polyclonal antibody response, the role of alterations in virus replication or innate immunity cannot be discounted.
We have attempted to integrate an understanding of the role of protein structure and the thermodynamics of protein-protein interactions into evolutionary studies of influenza virus. Our studies indicate important and surprising differences in the evolution of different influenza HAs in humans, and of different antigenic sites within these molecules, possibly due to differences in the immune response mounted to these viruses. Some antigenic sites show evidence that changes affecting specific biophysical properties may play critical roles in selecting novel influenza variants. Our findings may allow development of models to predict, or at least assess, the importance of novel influenza strains in the future, enhancing the effectiveness of vaccine design.
ASAnp: side-chain non-polar surface area
ΔΔASAnp: change in side-chain non-polar surface area
ASApol: side-chain polar surface area
ΔΔASApol: change in side-chain polar surface area
ASAtot: total solvent-exposed surface area due to the side-chain
ΔΔASAtot: change in total solvent-exposed surface area due to the side-chain
H1: hemagglutinin subtype 1
H3: hemagglutinin subtype 3
NCBI: National Center for Biotechnology Information
N1: neuraminidase subtype 1
N2: neuraminidase subtype 2
NCI: normalized change index
Q: side-chain net charge at pH 7.0
ΔQ: change in side-chain net charge at pH 7.0
RBS: receptor binding site.
We thank the following for helpful discussions and critical reading of the manuscript: GM Air, EM Bengtén, CR Bourne, JJ Coreia, VG Chinchar, LB King, ME Marquart, LS McDaniel, RJ O’Callaghan, DA Robinson, DC Sullivan, RR Thangavel, ME Sanders, DB Sittman, MR Wilson, and A Zlotnick. Work in the authors’ laboratory has been supported by the American Cancer Society (IRG 98-275-07) and the University of Mississippi Medical Center Intramural Research Support Program (SJS). The Base Pair Program (LBP) was supported by the Howard Hughes Medical Institute (Precollege and Undergraduate Science Education Grant #51006104).
Department of Microbiology, University of Mississippi Medical Center
Base Pair Program, University of Mississippi Medical Center
Present address: Sally McDonnell Barksdale Honors College, University of Mississippi
Desselberger U, Nakajima K, Alfino P, Pedersen F, Haseltine W, Hannoun C, Palese P: Biochemical evidence that "new" influenza virus strains in nature may arise by recombination (reassortment). Proc Natl Acad Sci U S A 1978, 75:3341–3345.
Schild G, Oxford J, Dowdle W, Coleman M, Pereira M, Chakraverty P: Antigenic variation in current influenza A viruses: evidence for a high frequency of antigenic 'drift' for the Hong Kong virus. Bull World Health Organ 1974, 51:1–11.
Fiers W, De Filette M, El Bakkouri K, Schepens B, Roose K, Schotsaert M, Birkett A, Saelens X: M2e-based universal influenza A vaccine. Vaccine 2009, 27:6280–6283.
Wang T, Tan G, Hai R, Pica N, Ngai L, Ekiert D, Wilson I, García-Sastre A, Moran T, Palese P: Vaccination with a synthetic peptide from the influenza virus hemagglutinin provides protection against distinct viral subtypes. Proc Natl Acad Sci U S A 2010, 107:18979–18984.
Hirst G: The quantitative determination of influenza virus and antibodies by means of red cell agglutination. J Exp Med 1942, 75:49–64.
Wiley DC, Wilson IA, Skehel JJ: Structural identification of the antibody-binding sites of Hong Kong influenza haemagglutinin and their involvement in antigenic variation. Nature 1981, 289:373–378.
Wilson IA, Cox NJ: Structural basis of immune recognition of influenza virus hemagglutinin. Annu Rev Immunol 1990, 8:737–771.
Air GM, Els MC, Brown LE, Laver WG, Webster RG: Location of antigenic sites on the three-dimensional structure of the influenza N2 virus neuraminidase. Virology 1985, 145:237–248.
Colman PM, Laver WG, Varghese JN, Baker AT, Tulloch PA, Air GM, Webster RG: Three-dimensional structure of a complex of antibody with influenza virus neuraminidase. Nature 1987, 326:358–363.
Nuss JM, Air GM: Defining the requirements for an antibody epitope on influenza virus neuraminidase: How tolerant are protein epitopes? J Mol Biol 1994, 235:747–759.
Nuss JM, Whitaker PB, Air GM: Identification of critical contact residues in the NC41 epitope of a subtype N9 influenza virus neuraminidase. Proteins, Struct, Funct Genet 1993, 15:121–132.
Lee J, Air G: Contacts between influenza virus N9 neuraminidase and monoclonal antibody NC10. Virology 2002, 300:255–268.
Venkatramani L, Bochkareva E, Lee J, Gulati U, Laver W, Bochkarev A, Air G: An epidemiologically significant epitope of a 1998 human influenza virus neuraminidase forms a highly hydrated interface in the NA-antibody complex. J Mol Biol 2006, 356:651–663.
Sundberg EJ, Mariuzza RA: Molecular recognition in antibody-antigen complexes. Adv Protein Chem 2002, 61:119–160.
Darwin C: On the origin of species by means of natural selection. Murray, J, London; 1859.
Bao Y, Bolotov P, Dernovoy D, Kiryutin B, Zaslavsky L, Tatusova T, Ostell J, Lipman D: The Influenza Virus Resource at the National Center for Biotechnology Information. J Virol 2008, 82:596–601.
Suzuki Y: Natural selection on the influenza virus genome. Mol Biol Evol 2006, 23:1902–1911.
Li W, Shi W, Qiao H, Ho S, Luo A, Zhang Y, Zhu C: Positive selection on hemagglutinin and neuraminidase genes of H1N1 influenza viruses. Virol J 2011, 8:183.
Edgar R: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32:1792–1797.
Lassmann T, Sonnhammer E: Kalign, Kalignvu and Mumsa: web servers for multiple sequence alignment. Nucleic Acids Res 2006, 34:W596–W599.
Hubbard S, Thornton J: NACCESS. Faculty of Life Sciences, University of Manchester; 1996.
Baker BM, Murphy KP: Prediction of binding energetics from structure using empirical parameterization. Methods Enzymol 1998, 295:294–315.
Caton AJ, Brownlee GG, Yewdell JW, Gerhard W: The antigenic structure of the influenza virus A/PR/8/34 hemagglutinin (H1) subtype. Cell 1982, 31:417–427.
Krystal M, Young JF, Palese P, Wilson IA, Skehel JJ, Wiley DC: Sequential mutations in hemagglutinins of influenza B virus isolates: definition of antigenic domains. Proc Natl Acad Sci U S A 1983, 80:4527–4531.
Gerhard W, Yewdell J, Frankel M, Webster RG: Antigenic structure of influenza virus hemagglutinin defined by hybridoma antibodies. Nature 1981, 290:713–717.
Wilson IA, Skehel JJ, Wiley DC: Structure of the hemagglutinin membrane glycoprotein of influenza virus at 3 Å resolution. Nature 1981, 289:366–373.
Deem M, Pan K: The epitope regions of H1-subtype influenza A, with application to vaccine efficacy. Protein Eng Des Sel 2009, 22:543–546.
Xu R, Ekiert D, Krause J, Hai R, Crowe JJ, Wilson I: Structural basis of preexisting immunity to the 2009 H1N1 pandemic influenza virus. Science 2010, 328:357–360.
Stray SJ, Cummings RD, Air GM: Influenza virus infection of desialylated cells. Glycobiology 2000, 10:649–658.
Pan K, Long J, Sun H, Tobin G, Nara P, Deem M: Selective Pressure to Increase Charge in Immunodominant Epitopes of the H3 Hemagglutinin Influenza Protein. J Mol Evol 2011, 72:90–103.
Yewdell J, Taylor A, Yellen A, Caton A, Gerhard W, Bächi T: Mutations in or near the fusion peptide of the influenza virus hemagglutinin affect an antigenic site in the globular region. J Virol 1993, 67:933–942.
Bush R, Bender C, Subbarao K, Cox N, Fitch W: Predicting the evolution of human influenza A. Science 1999, 286:1921–1925.
Kosakovsky Pond S, Poon A, Leigh Brown A, Frost S: A maximum likelihood method for detecting directional evolution in protein sequences and its application to influenza A virus. Mol Biol Evol 2008, 25:1809–1824.
Li Y, Li H, Yang F, Smith-Gill SJ, Mariuzza RA: X-ray snapshots of the maturation of an antibody response to a protein antigen. Nat Struct Biol 2003, 10:482–488.
Skehel JJ, Wiley DC: Receptor binding and membrane fusion in virus entry: the influenza hemagglutinin. Annu Rev Biochem 2000, 69:531–569.
Colman PM, Tulip W, Varghese JN, Tulloch PA, Baker AT, Laver WG, Air GM, Webster RG: Three-dimensional structures of influenza virus neuraminidase-antibody complexes. Phil Trans Roy Soc London B 1989, 323:511–518.
Gulati U, Hwang C, Venkatramani L, Gulati S, Stray S, Lee J, Laver W, Bochkarev A, Zlotnick A, Air G: Antibody epitopes on the neuraminidase of a recent H3N2 influenza virus (A/Memphis/31/98). J Virol 2002, 76:12274–12280.
Sundberg EJ, Urrutia M, Braden BC, Isern J, Tsuchiya D, Fields BA, Malchiodi EL, Tormo J, Schwarz FP, Mariuzza RA: Estimation of the hydrophobic effect in an antigen-antibody protein-protein interface. Biochemistry 2000, 39:15375–15387.
Sinha N, Mohan S, Lipschultz C, Smith-Gill S: Differences in electrostatic properties at antibody-antigen binding sites: implications for specificity and cross-reactivity. Biophys J 2002, 83:2946–2968.
Sinha N, Smith-Gill S: Electrostatics in protein binding and function. Curr Protein Pept Sci 2002, 3:601–614.
Air GM, Gibbs AJ, Laver WG, Webster RG: Evolutionary changes in influenza B are not primarily governed by antibody selection. Proc Natl Acad Sci U S A 1990, 87:3884–3888.
Memoli M, Jagger B, Dugan V, Qi L, Jackson J, Taubenberger J: Recent human influenza A/H3N2 virus evolution driven by novel selection factors in addition to antigenic drift. J Infect Dis 2009, 200:1232–1241.
Wrammert J, Smith K, Miller J, Langley W, Kokko K, Larsen C, Zheng N, Mays I, Garman L, Helms C, et al.: Rapid cloning of high-affinity human monoclonal antibodies against influenza virus. Nature 2008, 453:667–671.
Clarke S, Staudt L, Kavaler J, Schwartz D, Gerhard W, Weigert M: V region gene usage and somatic mutation in the primary and secondary responses to influenza virus hemagglutinin. J Immunol 1990, 144:2795–2801.
Kavaler J, Caton AJ, Staudt LM, Schwartz D, Gerhard W: A set of closely related antibodies dominates the primary antibody response to the antigenic site CB of the A/PR/8/34 influenza virus hemagglutinin. J Immunol 1990, 145:2312–2321.
Caton A, Stark S, Kavaler J, Staudt L, Schwartz D, Gerhard W: Many variable region genes are utilized in the antibody response of BALB/c mice to the influenza virus A/PR/8/34 hemagglutinin. J Immunol 1991, 147:1675–1686.
Hope-Simpson R, Golubev D: A new concept of the epidemic process of influenza A virus. Epidemiol Infect 1987, 99:5–54.
Hensley S, Das S, Bailey A, Schmidt L, Hickman H, Jayaraman A, Viswanathan K, Raman R, Sasisekharan R, Bennink J, Yewdell J: Hemagglutinin receptor binding avidity drives influenza A virus antigenic drift. Science 2009, 326:734–736.
Kumari K, Gulati S, Smith D, Gulati U, Cummings R, Air G: Receptor binding specificity of recent human H3N2 influenza viruses. Virol J 2007, 4:42.
Gulati U, Wu W, Gulati S, Kumari K, Waner J, Air G: Mismatched hemagglutinin and neuraminidase specificities in recent human H3N2 influenza viruses. Virology 2005, 339:12–20.
Gulati S, Smith D, Air G: Deletions of neuraminidase and resistance to oseltamivir may be a consequence of restricted receptor specificity in recent H3N2 influenza viruses. Virol J 2009, 6:22.
Thangavel R, Reed A, Norcross E, Dixon S, Marquart M, Stray S: "Boom" and "Bust" cycles in virus growth suggest multiple selective forces in influenza A evolution. Virol J 2011, 8:180.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.