Phylogenetic analyses of the polyprotein coding sequences of serotype O foot-and-mouth disease viruses in East Africa: evidence for interserotypic recombination

Background: Foot-and-mouth disease (FMD) is endemic in East Africa, with the majority of reported outbreaks attributed to serotype O virus. In this study, phylogenetic analyses of the polyprotein coding region of serotype O FMD viruses from Kenya and Uganda have been undertaken to infer evolutionary relationships and the processes responsible for the generation and maintenance of diversity within this serotype. FMD virus RNA was obtained from six samples following virus isolation in cell culture and in one case by direct extraction from an oropharyngeal sample. Following RT-PCR, the single long open reading frame, encoding the polyprotein, was sequenced.

Results: Phylogenetic comparisons of the VP1 coding region showed that the recent East African viruses belong to one lineage within the EA-2 topotype, while an older Kenyan strain, K/52/1992, is a representative of topotype EA-1. In phylogenies of the coding regions for the leader protease (L), the capsid region and almost the entire coding region, these viruses are monophyletic, except for K/52/1992, which is distinct. Furthermore, phylogenetic relationships for the P2 and P3 regions suggest that K/52/1992 is a probable recombinant between serotypes A and O. A bootscan analysis of K/52/1992 with East African FMD serotype A viruses (A21/KEN/1964 and A23/KEN/1965) and a serotype O viral isolate (K/117/1999) revealed that the P2 region is probably derived from a serotype A strain, while the P3 region appears to be a mosaic derived from both serotypes A and O.

Conclusions: Sequences of the VP1 coding region from recent serotype O FMDVs from Kenya and Uganda are all representatives of a specific East African lineage (topotype EA-2), which probably indicates that few, if any, introductions of this serotype from outside the region have occurred in the recent past. Furthermore, evidence for interserotypic recombination, within the non-structural protein coding regions, between FMDVs of serotypes A and O has been obtained. In addition to characterization using the VP1 coding region, analyses involving the non-structural protein coding regions should be performed in order to identify evolutionary processes shaping FMD viral populations.


Background
Foot-and-mouth disease virus (FMDV) is a member of the Picornaviridae family, belonging to the genus Aphthovirus [1], and is the causative agent of foot-and-mouth disease (FMD), a highly contagious infection of cloven-hoofed domestic animals and over 70 wildlife species [2]. The viral genome is a positive-sense, single-stranded RNA (about 8.3 kb, see Figure 1) encoding a polyprotein which is processed to yield structural and non-structural proteins [1]. The RNA genome undergoes a high rate of mutation due to error-prone replication by the RNA polymerase, resulting in high genetic diversity [3]; however, not all coding regions evolve at the same rate [4]. The structural protein coding region, particularly the sequence for VP1, has been shown to vary significantly between strains and serotypes, hence it is used extensively for inferring evolutionary relationships [5]. Persistent infection, recombination and quasi-species dynamics have also been suggested as contributing to the genetic variation [6][7][8]. Globally, the virus exists as seven distinct serotypes: the Southern African Territories (SAT) types 1-3 and the Eurasian types O, A, C and Asia 1. Immunity to one serotype does not confer protection against another. In Africa, FMD is endemic in the sub-Saharan region, with six of the known serotypes recorded in the Eastern African region [9]. Of the numerous outbreaks reported in this region, most are attributed to serotype O, followed by A, SAT 2 and SAT 1, but some cases of serotype C have been reported in Ethiopia, Kenya and Uganda [9]. SAT 3 was isolated in 1970 and 1997 from African buffalo in the Queen Elizabeth National Park (Uganda) but has not otherwise been recorded in East Africa [9,10]. Previous studies, based on VP1 coding sequences, have shown that four different lineages (EA 1-4) of type O FMDV are present in this region [11].
This complex intra-serotypic variation coupled with the presence of multiple serotypes has complicated disease control, which is achieved mainly by vaccination and restrictions on animal/animal product movement. In East Africa, most molecular studies have sought to identify phylogenetic relationships among viruses relying on the sequence of the VP1 coding region [12]. Although used extensively for phylogenetic studies, the VP1 sequence alone may be inadequate for revealing all processes shaping diversity and evolution of the viruses in the region [13][14][15].
Elsewhere on the African continent, remarkable progress in FMD molecular epidemiology has been made particularly in Southern Africa, a region also characterized by SAT type endemicity. Molecular epidemiological studies have been able to reveal the origin and routes of FMD transmission in this region. In addition, the important epidemiological role of the African buffalo (Syncerus caffer) in maintenance of this virus and transmission to other cloven-hoofed animals has been highlighted [16][17][18].
In this study, we have analyzed the polyprotein coding sequence (6719 nt) of serotype O FMD viruses to gain insight into the relationships among these viruses and to infer the processes shaping their diversity in the East African region. Sequences analyzed in this study include isolates from Kenya and Uganda obtained between the years 1992 and 2006 and sequences already in the GenBank database.

Viral isolates
The viruses investigated in this study were collected during outbreaks in the indicated years in Kenya (K) and Uganda (U).

RNA extraction, cDNA synthesis, PCR and Cycle sequencing
RNA was extracted from virus harvests and directly from oropharyngeal fluid using the QIAamp® Viral RNA kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. The cDNA synthesis was carried out using Ready-To-Go™ You-Prime First-Strand Beads (GE Healthcare Life Sciences, Sweden) with random hexamer primers (pd(N)6). PCR amplifications of overlapping fragments were performed in a final volume of 50 μl using 2-5 ng of cDNA, 0.2 pmol of primers (Table 1), 2.5 U of AmpliTaq Gold DNA polymerase (Applied Biosystems), 200 μM of each dNTP (dATP, dCTP, dGTP, dTTP) and 1.5 mM MgCl2. Following activation of the AmpliTaq Gold DNA polymerase at 95°C for 5 min, each cycle comprised denaturation at 95°C for 15 s, primer annealing at 60°C for 2 min and chain elongation at 72°C for 1 min 20 s. This cycle was repeated 30 times, followed by a final extension at 72°C for 5 min. The resultant PCR products were analysed by 2% agarose gel electrophoresis with a molecular weight marker (ΦX174-RF DNA, Amersham Biosciences). Purification of the PCR products to remove oligonucleotides, dNTPs and enzyme was achieved with a QIAquick® PCR purification kit (Qiagen, Hilden, Germany). Sequencing was performed in both directions using a BigDye Terminator v3.1 kit (Applied Biosystems) and run on an automated DNA sequencer (ABI PRISM® 3700) by Macrogen, Korea, using the same primers as employed in the PCRs.
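As a cross-check on the cycling conditions described above, the programme can be written down as a small data structure and its total hold time computed. This is only an illustrative sketch: the class and function names are hypothetical, the temperatures and times are those given in the text, and ramping time between temperatures (which depends on the instrument) is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the protocol text; names are illustrative only.
ACTIVATION = Step("polymerase activation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 95, 15),
    Step("annealing", 60, 2 * 60),
    Step("elongation", 72, 1 * 60 + 20),
)
N_CYCLES = 30
FINAL_EXTENSION = Step("final extension", 72, 5 * 60)


def programmed_seconds() -> int:
    """Sum of hold times only; ramping between temperatures is ignored."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return ACTIVATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


def dntp_nmol(conc_um: float = 200.0, volume_ul: float = 50.0) -> float:
    """Amount of each dNTP per reaction: uM x ul gives pmol; /1000 gives nmol."""
    return conc_um * volume_ul / 1000.0


if __name__ == "__main__":
    total = programmed_seconds()
    print(f"hold time: {total} s ({total / 60:.1f} min)")
    print(f"each dNTP: {dntp_nmol():.0f} nmol per 50 ul reaction")
```

Under these assumptions the programme amounts to just under two hours of hold time per run, and each 50 μl reaction contains 10 nmol of each dNTP.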

Sequence Analysis
Sequencher software 4.8 (Gene Code Corporation) was used to assemble the 12 overlapping fragment sequences generated per sample. Multiple alignments by log-expectation comparison were carried out using MUSCLE [19] incorporated within Geneious 4.7.6 software [20]. Models of evolution were evaluated by hierarchical likelihood-ratio tests of 24 models using PAUP* (v. 4.0 beta 10) [21] and MrModeltest (v. 2.2) [22]. The GTR+I+G model was selected, and Bayesian inference analysis was performed using MrBayes (v. 3.1.2) [23] with the following settings: the likelihood model used six substitution types (nst = 6), with base frequencies set to variable values ("statefreqpr = dirichlet(1,1,1,1)"). Rate variation across sites was modeled using a gamma distribution with a proportion of invariable sites (rates = invgamma). The Markov Chain Monte Carlo search was run with 4 chains for 500,000 generations with trees sampled every 100 generations; the first 1250 were discarded as burn-in [23]. Recombination between sequences was analysed using the SimPlot 2.5 software [24].
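The MrBayes settings listed above can be collected into a command block of the kind appended to a NEXUS alignment file. The following is a minimal sketch rather than the command file actually used in this study: the helper names are hypothetical, the use of `sump` and `sumt` to apply the burn-in is an assumption, and the value 1250 is interpreted here as a number of sampled trees.

```python
def mrbayes_block(ngen: int = 500_000, samplefreq: int = 100,
                  nchains: int = 4, burnin: int = 1250) -> str:
    """Build a MrBayes command block for a GTR+I+G analysis.

    Defaults follow the settings quoted in the text; applying the
    burn-in through sump/sumt is an assumption of this sketch.
    """
    lines = [
        "begin mrbayes;",
        "  lset nst=6 rates=invgamma;",              # GTR with gamma + invariable sites
        "  prset statefreqpr=dirichlet(1,1,1,1);",   # estimated base frequencies
        f"  mcmc ngen={ngen} samplefreq={samplefreq} nchains={nchains};",
        f"  sump burnin={burnin};",
        f"  sumt burnin={burnin};",
        "end;",
    ]
    return "\n".join(lines)


def burnin_fraction(ngen: int = 500_000, samplefreq: int = 100,
                    burnin: int = 1250) -> float:
    """Fraction of sampled trees discarded (initial tree not counted)."""
    return burnin / (ngen // samplefreq)


if __name__ == "__main__":
    print(mrbayes_block())
    print(f"burn-in fraction: {burnin_fraction():.0%}")
```

With sampling every 100 generations, 500,000 generations yield about 5000 sampled trees, so discarding the first 1250 corresponds to a burn-in of roughly 25%.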

Sequence Characteristics and Phylogenetic Relationships
Almost the entire polyprotein coding sequence (6719 nt) of five FMDV isolates and a single Ugandan FMDV RNA sample was determined. Four of these samples

Phylogenetic relationships have been determined using the VP1 coding region for serotype identification. Furthermore, analyses using the coding regions for the Leader (L) protease, the whole capsid precursor (P1-2A) and for the non-structural protein precursors P2 and P3 individually, as well as for almost the entire polyprotein, have been performed. The phylogenetic relationships identified between these East African serotype O strains and representatives of each of the different FMDV serotypes for the VP1 coding regions are shown in Figure 2. The recent strains from Kenya and Uganda analysed here each clearly belong to serotype O within topotype EA-2, while the older strain K/52/1992 belongs to topotype EA-1. These recent viral isolates each belong to a single evolutionary lineage, albeit within different sublineages. Figure 3 shows the inferred phylogenetic relationships between serotype O viruses for the polyprotein coding region. In this phylogeny, with the exception of K/52/1992, which belongs to EA-1, the other East African strains are grouped into a single lineage which is distinct from the serotype O viruses isolated from Asia, Europe and South America. Similar phylogenies were observed for the Leader protease and the entire capsid precursor (P1-2A) (data not shown). Figures 4 and 5 show the phylogenies inferred from the coding regions for the P2 (2BC) and P3 (3ABCD) precursors, respectively. The P2 coding region shows a pattern similar to that of the other coding regions of the East African strains, with the exception that the isolate K/52/1992 showed a close relationship in this region to the sequence of a serotype A isolate (Accession no. AY593766, A23/KEN/46/65) obtained in 1964 from Kenya.
Furthermore, within the P3 coding region, the 1992 Kenyan isolate is most closely related to a serotype A isolate obtained in 1965 from Azerbaijan in the former USSR (Accession no. X74812) and to the A23/KEN/1965 strain. It should be noted that in the other regions of the genome these serotype A viruses were most closely related to other serotype A virus strains (see Figures 2 and 3). These observations suggested that recombination may have occurred at some time between serotype O and A viral strains.

Detection of Recombination
As indicated above in Figures 4 and 5, comparison of the genome sequences within the P2 and P3 coding regions showed that the O/K/52/1992 virus sequence in these regions was most closely related to some serotype A viruses, although elsewhere within the genome this Kenyan virus was most closely related to other serotype O viruses. In order to establish possible recombination events, further analyses were performed using similarity and bootscan plots (see Figure 6). Examination of the points at which similarity between the query, O/K/52/1992, and the test sequences increased or decreased identified approximate break points. Figure 6 shows that the O/K/52/1992 sequence is most closely related to serotype O within the capsid region but has similar levels of percentage identity to both serotypes O and A within the P2 and P3 coding regions. The bootscan analysis (Figure 7) shows that this query sequence is closely related to A23/KEN/1965 within the P2 region and has a mosaic pattern of similarity to both O/K/117/1999 and A21/KEN/1964 within the P3 coding region.
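The principle behind a similarity plot can be illustrated with a short sliding-window calculation on aligned sequences. This is a simplified sketch and not the SimPlot implementation: it uses raw percentage identity rather than a distance model, the function names are hypothetical, and the default window and step sizes (200 nt and 20 nt) are assumptions, since the settings used are not stated here. The sequences in the usage example are artificial.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def similarity_scan(query: str, reference: str,
                    window: int = 200, step: int = 20):
    """Return (window midpoint, percent identity) along an aligned pair."""
    if len(query) != len(reference):
        raise ValueError("sequences must come from the same alignment")
    scan = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        scan.append((start + window // 2,
                     percent_identity(query[start:end], reference[start:end])))
    return scan


def closer_reference(scan_1, scan_2, label_1="O", label_2="A"):
    """For each window, report which of two references is closer to the query."""
    calls = []
    for (mid, id_1), (_, id_2) in zip(scan_1, scan_2):
        if id_1 > id_2:
            calls.append((mid, label_1))
        elif id_2 > id_1:
            calls.append((mid, label_2))
        else:
            calls.append((mid, "tie"))
    return calls


if __name__ == "__main__":
    # Artificial query: first half identical to one reference, second half to the other.
    query = "A" * 20 + "C" * 20
    scan_o = similarity_scan(query, "A" * 40, window=10, step=5)
    scan_a = similarity_scan(query, "C" * 40, window=10, step=5)
    for mid, call in closer_reference(scan_o, scan_a):
        print(mid, call)
```

In such a scan, the positions at which the closer reference changes from one sequence to the other indicate approximate break points, which is the reasoning applied to the O/K/52/1992 query above.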

Discussion
Sequencing of these recent viruses from Kenya and Uganda has allowed for the assessment of variation of the East African isolates across almost the entire coding region relative to viruses from Asia, Europe and, to a smaller extent, South America. These East African viruses belong to serotype O based on the sequence of the VP1 coding region (Figure 2) and are part of a single evolutionary lineage (topotype EA-2), except for K/52/1992 (topotype EA-1). The related virus strains U/312/2006 and O/UGA/2006 (accession no. EF611987) [25] have also been identified as serotype O by antigen ELISA (data not shown).
Within the East African lineage, two major divisions are observed corresponding to their geographical locations, with the Ugandan isolates comprising one sublineage and the Kenyan strains another. In addition, the    (Figures 4 and 5). This points to the possibility that this strain may have arisen by recombination between viruses of different serotypes, a process seen within many picornaviruses both in nature, e.g. for poliovirus and Human enterovirus B [6,27,28,29], and in the laboratory, including with FMD viruses [28]. The mosaic pattern observed within the P3 coding region may suggest recombination, although this could also indicate genetic convergence. Earlier studies have shown that interserotypic genetic exchange occurs more frequently among the Euro-Asiatic viruses than among the SAT serotypes or between SAT and non-SAT viruses [8].
It is noteworthy that the most prevalent serotypes in the East African region are O followed by A [9], which makes co-infection of some animals with two or more serotypes a distinct possibility.
The findings reveal the complexity of FMDV evolution, consistent with earlier studies that have shown that recombination is mainly restricted to non-structural coding regions. This study therefore supports recombination as an evolutionary force causing genetic diversity within FMDV and shows the need for full genome analyses to identify such events.