We and others have previously shown that reassortment has occurred between different EIV strains [4–8, 15, 16], but the lack of full genome sequences for EIV makes it difficult to ascertain the extent of reassortment between them, and whether reassortment has occurred between EIV and influenza A viruses from other species. We therefore developed a simple method for sequencing viral genomes that included primers with M13 sequence tags to improve sequencing efficiency. This was based upon the approach recommended by the WHO for sequencing swine influenza virus isolates in 2009; however, our method used only 26 PCR fragments to cover the segment-specific regions of EIV, rather than 46 fragments. Alternative methods have been employed for sequencing influenza A viruses, such as using universal primers to simultaneously amplify all eight genome segments, or segment-specific primers to amplify entire segments; however, in our hands such protocols resulted in poor amplification of the three largest genome segments (data not shown). Other methods based on amplification of small PCR fragments do not include M13 sequences in the primers; their inclusion here is what makes the method described simple and efficient. A method previously described for sequencing the NCRs of the influenza gene segments was also modified and successfully used for the first time on an equine influenza virus, with novel primers designed for an N8 subtype NA.
The two segments encoding the surface glycoproteins, HA and NA, contained a large number of amino acid differences between the two viruses. This was expected, as these two proteins are under constant immune-driven selection pressure to undergo antigenic drift. Interestingly, a high number of amino acid differences was found in PA, especially when compared to the other two polymerase subunits, PB2 and PB1; a similar finding was reported by Murcia et al. The other internal segments contained fewer changes, which is not surprising as they are smaller and may be under less immune pressure than the surface proteins.
Interestingly, a mutation in the +1 reading frame of PA, causing a premature stop codon in the translated amino acid sequence of PA-X, was observed in A/equine/Richmond/1/07. PA-X is a recently discovered protein containing the N-terminal 191 amino acids of PA and, in the majority of strains, a further 61 amino acids derived from a frameshift to the +1 reading frame of PA. PA-X has been implicated in the modulation of influenza virus pathogenicity and virulence in a mouse model, in which PA-X-deficient viruses caused greater clinical signs and were less able to shut off host cell responses than wild-type viruses with full-length PA-X. The premature stop codon in A/equine/Richmond/1/07 would lead to a truncation of the protein by 42 amino acids. Truncated forms of PA-X have been described previously; however, the majority of these are due to a nonsense mutation at codon 42 in the +1 reading frame. Sequencing of PA, as described here, revealed that several other virus isolates from different outbreaks in 2007, as well as from the same yard as A/equine/Richmond/1/07, had the same truncated form of PA-X; however, the truncated form did not persist in the UK.
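To make the effect of a nonsense mutation in the +1 reading frame concrete, the following is a minimal Python sketch, not the analysis pipeline used in this study. It assumes a simplified model in which a fixed number of codons is read in frame 0, one nucleotide is skipped, and translation continues in the +1 frame until the first stop codon. The function name `translate_with_plus1_shift` and the short toy sequences are hypothetical and serve only as illustration; for PA-X the text above gives 191 in-frame codons, whereas the toy example uses 2.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate_with_plus1_shift(cds, n_in_frame):
    """Translate n_in_frame codons in frame 0, then continue in the +1 frame.

    Simplified model (an assumption, not the authors' method): after the
    in-frame codons, one nucleotide is skipped and translation proceeds
    until the first stop codon or the end of the sequence.
    Returns (n_terminal_residues, plus1_frame_residues).
    """
    cds = cds.upper().replace("U", "T")
    n_terminal = "".join(
        CODON_TABLE[cds[i:i + 3]] for i in range(0, 3 * n_in_frame, 3)
    )
    x_orf = []
    for i in range(3 * n_in_frame + 1, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":  # stop codon ends the +1 frame product
            break
        x_orf.append(residue)
    return n_terminal, "".join(x_orf)


# Toy sequences (hypothetical, not PA): 2 in-frame codons, then a +1 shift.
wild_type = "ATGGAACAAATATGGGTAACC"
mutant = "ATGGAACAAATAAGGGTAACC"  # single T->A change creates TAA in the +1 frame

wt_head, wt_tail = translate_with_plus1_shift(wild_type, n_in_frame=2)
mut_head, mut_tail = translate_with_plus1_shift(mutant, n_in_frame=2)
print(wt_head, wt_tail)    # ME KYG
print(mut_head, mut_tail)  # ME K
print("residues lost:", len(wt_tail) - len(mut_tail))  # residues lost: 2
```

Applied to a full-length PA coding sequence with `n_in_frame=191`, the length of the second returned string would indicate whether an isolate carries a full-length or truncated +1 frame region under this simplified model.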
Sequence analysis of the NCRs from each segment showed that EIV strain A/equine/Richmond/1/07 had cytosine at position 4 of the 3′ end of the vRNA in the three polymerase segments, as found in other influenza viruses, and uracil at this position in the remaining five segments. This is the same pattern seen in the majority of other influenza A viruses for which the promoter sequences have been determined, including the prototype avian influenza virus, A/chicken/Rostock/34 (H7N1).
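The position 4 comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper name `position4_base` is hypothetical, and the two example termini are not sequences from this study but are derived from the commonly cited conserved 12-nucleotide 3′ terminus of influenza A vRNA, 3′-UCG(U/C)UUUCGUCC-5′, written here in the 5′→3′ direction.

```python
def position4_base(vrna_5_to_3):
    """Return the base at position 4 counted from the 3' end of a vRNA.

    The sequence is expected in 5'->3' orientation, so position 4 from
    the 3' end is the fourth character from the right.
    """
    seq = vrna_5_to_3.upper().replace("T", "U")
    if len(seq) < 4:
        raise ValueError("sequence must contain at least 4 nucleotides")
    return seq[-4]


# Illustrative termini (assumed, not data from this study), written 5'->3'.
example_termini = {
    "polymerase-like (C4)": "CCUGCUUUCGCU",
    "non-polymerase-like (U4)": "CCUGCUUUUGCU",
}

for label, terminus in example_termini.items():
    print(label, position4_base(terminus))
# polymerase-like (C4) C
# non-polymerase-like (U4) U
```

Running the same helper over the 3′ termini of all eight segments would give the per-segment C/U pattern discussed above.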
The methods outlined here can be used to determine the genome sequences of EIV, including the NCRs, from both clade 1 and clade 2 of the Florida sublineage. The techniques described are affordable, and the equipment required is available in most research laboratories. The sequence assembly process is simple and, unlike next-generation sequencing methodology, does not require in-depth bioinformatics. Given the small genome size and small sample numbers usually associated with EIV, this method is highly cost-effective and straightforward. Amplicon sequencing has also been shown to be less labour-intensive and more affordable than plasmid cloning methods. This method also permits the sequencing of individual gene segments with relative ease, as was done here for PA to investigate the frequency of the truncated form of PA-X.