Table 1 Positively selected sites in hemagglutinin from viruses from different clusters

| Dataset | Description | Number of sequences | Length of alignment (bp) | Number of sites under positive selection | | Positively selected sites¹ | | | |
|---|---|---|---|---|---|---|---|---|---|
| | | | | | | HA1 | | HA2 | |
| 1.1 | Avian strains | 75 | 1647 | 0 | 0 | | | | |
| 1.2 | North American swine strains | 196 | 1647 | 0 | 1 | | 138 | | |
| 1.3 | Eurasian swine strains | 90 | 1647 | 0 | 1 | | | | 399 |
| 1.4 | Seasonal human strains | 1404 | 1647 | 8 | 8 | 82B, 94B, 141B, 162BG, 186B, 187BR, 222BR | 82B, 94B, 160BG, 162BG, 186B, 187BR, 222BR | 451T | 451T |
| 1.5 | Pandemic 2009 human strains | 1891 | 1647 | 7 | 9 | 186B, 197, 203, 205B, 222BR, 223B, 261BT | 186B, 197, 203, 222BR, 261BT | | 411T, 451T, 460T, 530T |
¹ In these columns, B indicates that the site lies in the B-cell antigenic regions [18]. G means that it is a potential glycosylation site. T indicates that the site lies in the T-cell antigenic regions [19]. R indicates that it is a receptor-binding site [11]. We use the same numbering strategy as Deem and Pan [18] and start numbering from the amino acids DTLC.
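
For readers who want to process the site lists programmatically, the suffix letters defined in the footnote can be decoded with a few lines of Python. This is only a minimal sketch and not part of the original analysis: the helper name `parse_sites` and the dictionary `ANNOTATIONS` are hypothetical, and the cell format (a comma-separated list of a position followed by zero or more of the letters B, G, T, R) is assumed from the table entries.

```python
import re

# Suffix letters as defined in the table footnote.
ANNOTATIONS = {
    "B": "B-cell antigenic region",
    "G": "potential glycosylation site",
    "T": "T-cell antigenic region",
    "R": "receptor-binding site",
}

# Assumed label format: position digits followed by zero or more suffix letters.
SITE_PATTERN = re.compile(r"(\d+)([BGTR]*)")


def parse_sites(cell):
    """Split a table cell such as '186B,187BR' into (position, [annotations]) pairs."""
    sites = []
    for token in cell.split(","):
        token = token.strip()
        if not token:
            continue  # skip empty cells and stray separators
        match = SITE_PATTERN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized site label: {token!r}")
        position = int(match.group(1))
        labels = [ANNOTATIONS[letter] for letter in match.group(2)]
        sites.append((position, labels))
    return sites


if __name__ == "__main__":
    # One of the HA1 cells for the pandemic 2009 human strains (dataset 1.5).
    for position, labels in parse_sites("186B,197,203,222BR,261BT"):
        print(position, "; ".join(labels) if labels else "no annotation")
```

A site such as 222BR is returned with both of its annotations, while an unannotated site such as 197 is returned with an empty list.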