Phylogenetic history demonstrates two different lineages of dengue type 1 virus in Colombia

Background Dengue fever is one of the most important re-emerging viral diseases, affecting about 50 million people around the world, especially in tropical and subtropical countries. In Colombia, the virus was first detected in the early 1970s, when the disease became a major public health concern. Since then, all four serotypes of the virus have been reported. Although most of the large outbreaks reported in this country have involved dengue virus serotype 1 (DENV-1), there are no studies of its origin, genetic diversity and distribution. Results We used 224 bp corresponding to the carboxyl terminus of the envelope (E) gene from 74 Colombian isolates to reconstruct phylogenetic relationships and to estimate divergence times. The analyzed Colombian DENV-1 isolates belonged to the formerly defined genotype V. Only one virus isolate was classified in genotype I, likely representing a single introduction that did not spread. The oldest strains were closely related to those detected for the first time in America in 1977 from the Caribbean and were detected for two years before disappearing for about six years. Around 1987, a split generated two lineages that have been evolving separately, although no major amino acid changes were found in the analyzed region. Conclusion DENV-1 has been circulating in Colombia since 1978. Yet the phylogenetic relationships between strains isolated over the period covered suggest that viral strains detected in some years, although belonging to the same genotype V, have different recent origins corresponding to multiple re-introductions of viral strains that were circulating in neighboring countries. The viral strains used in the present study did not form a monophyletic group, which is evidence of a polyphyletic origin. We report the rapid spread patterns and high evolutionary rate of the different DENV-1 lineages.


Background
Dengue virus infection has had an important impact on humans over the last several years, with an estimated 50 million dengue infections and an average of 1 million cases reported annually in more than 100 countries in tropical and subtropical regions [1][2][3][4][5]. This mosquito-borne flavivirus causes a wide spectrum of clinical manifestations in humans, which include an acute, self-limited, flu-like illness known as dengue fever (DF). DF is characterized by headache, myalgia, arthralgia, retro-orbital pain and sometimes a maculopapular rash. Dengue haemorrhagic fever (DHF) is a severe illness characterized by haemoconcentration (a haematocrit increase of 20%) and evidence of plasma leakage, such as pleural effusion and ascites, as the major pathophysiological features. In some patients, DHF may progress to hypovolemic shock (Dengue Shock Syndrome, DSS) with circulatory failure [2,6,7,8].
Four DENV serotypes have been involved in Colombian epidemics, although DENV-1 and DENV-2 have had the highest circulation rates since 1971 [5,6,21]. Moreover, since the first case of DHF in Colombia at the end of 1989, these two serotypes have been associated with severe disease. To date, DENV-1 falls into five clades designated as genotype I (Southeast Asia, China and East Africa), genotype II (Thailand), genotype III (Malaysia), genotype IV (South Pacific) and genotype V (America, Africa). Additionally, the existence of lineages with distinctive geographical and temporal relationships has been suggested [12,20,26,28,35,36,37,38]. Given the importance of DENV in public health, the specific goals of this research were to reconstruct the phylogenetic history of DENV-1 and to date the phylogenetic tree, using isolation times as calibration points, in order to establish the date of introduction of the virus and the patterns of its evolutionary rate in Colombia.

Virus recovery and confirmation
Seventy-four viruses obtained from symptomatic patients were isolated in mosquito cell culture, identified as DENV-1 by monoclonal antibodies and confirmed by RT-PCR. The exact geographic origin could not be determined for 10 of the 74 samples. The remaining 64 isolates are listed in Table 1 with locality, isolation year, accession number and genotype.

Phylogenetic reconstruction of DENV-1
Sequences from the carboxyl terminus of the envelope (E) gene from the 74 Colombian DENV-1 isolates were aligned in CLUSTAL W [39,40] and compared with 52 previously reported sequences from elsewhere. The alignment was trivial because the sequences contained no indels. The Maximum Likelihood analysis comparing the 126 sequences is presented in figure 1.
The previously reported genotypes were represented in the tree, and most of the Colombian isolates nested in the genotype V clade (America, Africa), closely related to virus strains from Argentina, Brazil and Paraguay. Nevertheless, the oldest sequences, DENV-1/CO/261_Atlantico/1978, DENV-1/CO/263_Choco/1979 and DENV-1/CO/150_Choco/1979, were slightly distant from the remaining strains and appeared in close proximity to some Caribbean island and other American isolates (Trinidad, French Guiana). Interestingly, the isolate DENV-1/CO/267_Valle/1983 appeared in a different clade, as the sister branch of strains from Japan (DENV-1/JP/Mochizuki/1943), China (DENV-1/CN/GZ-80/1980), Ethiopia (DENV-1/DJ/Ethiopia/1998) and Cambodia (DENV-1/KH/1998), which have been defined as lineages of genotype I [26]. To obtain better resolution of the tree, we performed a phylogenetic reconstruction using only the Colombian isolates. Again, the oldest isolates were the most divergent, representing the first entry of the virus into Colombia. Although genotype V was the only one represented in this tree, two different lineages may be defined based on cladal distribution (figure 2).
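The statement that the alignment is trivial when no indels are present can be checked mechanically: if every sequence has the same length and none contains a gap character, no insertions or deletions were needed. The following is a minimal Python sketch of that check, not the procedure used in the study; the function names and the toy sequences are hypothetical.

```python
def has_indels(alignment):
    """Return True if any aligned sequence contains a gap character ('-')."""
    return any("-" in seq for seq in alignment.values())


def is_trivial_alignment(alignment):
    """An alignment is treated as trivial here if all sequences have the
    same length and none of them contains a gap."""
    lengths = {len(seq) for seq in alignment.values()}
    return len(lengths) == 1 and not has_indels(alignment)


# Hypothetical toy data; real input would be the 224-bp E gene fragments.
gap_free = {"isolate_a": "ACGTAC", "isolate_b": "ACGTAT"}
with_gap = {"isolate_a": "ACGTAC", "isolate_b": "AC-TAT"}

print(is_trivial_alignment(gap_free))
print(is_trivial_alignment(with_gap))
```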

Molecular clock
We used Bayesian inference based on MCMC to reconstruct the coalescent history of Colombian DENV-1. BEAST allowed the isolation year to be used as a calibration point to estimate divergence times and generated a posterior probability (PP) distribution of trees instead of bootstrap values [41][42][43][44]. The resulting tree clearly showed that the DENV-1 genotypes were already circulating globally before the first appearance of this serotype in the Americas between 1970 and 1980 (figure 3). According to the 95% highest posterior density (HPD) under the strict clock model, the estimated root for this phylogeny was 1929 and the substitution rate was 8.58 × 10⁻⁴ substitutions per site per year. To increase resolution, we used the strict molecular clock model to analyze the Colombian isolates alone (figure 4). Under the assumption of a constant substitution rate, the estimated root indicates 1945 as the date of the most recent common ancestor. In addition, there was a split around 1987 between the DENV-1/CO/188_Guaviare/1987 isolates and the remaining strains (PP = 0.82). Over time, there was a sustained increase in the number of isolates and a rapid spread of viruses with few changes among them, as shown by the branch lengths. By approximately 1992, another notable partition event generated two well-defined clades (PP = 0.77 and 1), which have evolved independently from the early 1990s until recent times.
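To give a sense of scale for the reported rate, it can be applied to the 224-bp fragment analyzed here. The sketch below is a back-of-the-envelope illustration only, assuming a strict clock along a single lineage; the function and variable names are hypothetical and the calculation is not part of the BEAST analysis.

```python
RATE = 8.58e-4         # substitutions per site per year (value reported in the text)
FRAGMENT_LENGTH = 224  # sites (bp) in the analyzed E gene fragment


def expected_substitutions(rate, sites, years):
    """Expected substitutions along one lineage under a strict clock."""
    return rate * sites * years


# Expected substitutions in the whole fragment per year (about 0.19).
per_year = expected_substitutions(RATE, FRAGMENT_LENGTH, 1)

# Average waiting time for one substitution in the fragment (about 5.2 years).
years_per_substitution = 1 / per_year

print(round(per_year, 3), round(years_per_substitution, 1))
```

On this reading, a single lineage would be expected to accumulate roughly one substitution in the fragment every five years, which is consistent with the short branch lengths described above.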

Discussion
Emerging and re-emerging diseases have become a major public health concern in developing countries, where dengue is perhaps the most important vector-borne viral disease in terms of morbidity. In Colombia, DF and DHF have been associated with the four DENV serotypes, with DENV-2 and DENV-1 predominating since 1971 after the re-appearance and spread of Aedes (Stegomyia) aegypti [6]. DENV-3 circulated for a short time in 1975 and was then not detected until 2002, when it was re-introduced, probably from Venezuela [27]. DENV-4 has been detected sporadically every year since 1984, when it was involved in several DF cases. The large genetic diversity of DENV has been extensively documented, starting perhaps with the Rico-Hesse proposal of different "genotypes" within serotypes 1 and 2 [10], followed by several studies and the definition of genotypes for DENV-3 and DENV-4. Five different genotypes have previously been defined for DENV-1 (genotypes I to V), suggesting significant genetic variation. In fact, various lineages have been proposed based on spatiotemporal clustering and clade distribution [26,28,35,36,37,38]. In the present study, 74 Colombian DENV-1 sequences were analyzed in an attempt to reconstruct the phylogenetic history of the virus in this country. Different genome regions have been used to infer DENV phylogeny, including short fragments [10,27,28]. Here we employed a sequence from the carboxyl terminus of the envelope (E) protein, which has been shown to provide a useful phylogenetic signal for defining genotype clustering [26]. It is important to note that better resolution of evolutionary patterns would be obtained from complete genomes. However, it was not possible to obtain larger fragments from the oldest isolates, probably because of RNA degradation over time.
As expected, all strains clustered with those from Brazil, Paraguay, Argentina and different Caribbean islands, corresponding to the formerly named genotype V (America/Africa) and forming a well-supported clade clearly separated from the other genotypes. The Colombian strains DENV-1/CO/261_Atlantico/1978, DENV-1/CO/263_Choco/1979 and DENV-1/CO/150_Choco/1979 were separated from the remaining isolates and appeared closer to those from the Caribbean islands, which represent the entry of serotype 1 into the Americas. The serotype was reported for the first time in 1977 in Jamaica and rapidly spread to the Antilles, including Cuba, Antigua & Barbuda, Aruba, Bahamas, Barbados, Curaçao, Dominica, Grenada, Guadeloupe, Guyana, Haiti, Martinique, Montserrat, Puerto Rico, St. Kitts, St. Martin, St. Vincent and the Grenadines, Trinidad, Turks and Caicos, and the Virgin Islands [5]. In 1978, DENV-1 was implicated in large mainland outbreaks, perhaps occurring at the same time in Colombia, Venezuela, Surinam and French Guiana, and eventually in Central America and Mexico. In Colombia, DENV-1 was isolated between 1977 and 1978, so the strain DENV-1/CO/261_Atlantico/1978 perhaps represents the first entry of the virus into the country. It spread rapidly until the next isolations in Choco (DENV-1/CO/263_Choco/1979 and DENV-1/CO/150_Choco/1979) and then faded away (or at least was not reported), probably displaced by DENV-2 (with DENV-1 maintained in silent, low-level circulation), until 1985, when it became established in different localities.
It is important to note that, even with the mobility between countries and the increasing opportunity for viral introduction, only one DENV-1 genotype is circulating in America. This differs from DENV-2 and DENV-3, for which at least two genotypes have been detected (America/Asia genotypes and I/III genotypes, respectively), suggesting, perhaps, dissimilar patterns of viral spread and transmission between DENV genotypes and even different capacities for adaptation. Many researchers have categorized DENV at unofficial taxonomic levels beneath genotype, based mainly on cladal distribution or geographical clustering. Circulation of these "lineages" has been defined in particular for DENV-1 in India, where at least four different lineages have been proposed (India-1, close to American strains; India-2, related to the Singapore 1993 isolate; India-3, in south India; and India-4, from Delhi and Gwalior) [26,28]. In our study, a notable cladogenesis event occurred around 1992 according to the molecular clock, generating two well-supported clades corresponding to putative Colombian DENV-1 lineages. Despite the eco-epidemiological similarities between Colombia and neighboring countries where dengue is a major concern, lineages have not previously been demonstrated for DENV-1. In fact, according to the ML phylogeny, most of the American strains (Argentina and Brazil) correspond to lineage 1, leaving lineage 2 restricted to Colombia. Although the geographic distribution of these lineages is not clearly delimited, it is evident that they are evolving independently and most likely in parallel in the same localities.
Although the emergence and rapid diversification of DENV have been a matter of special concern, the precise mechanisms of evolution remain unclear [45][46][47][48][49][50]. Human RNA viruses, including influenza virus, HIV and coronaviruses, have particularly high mutation and evolution rates, mostly because the RNA-dependent RNA polymerase lacks proofreading activity [51,52]. Nevertheless, arthropod-borne viruses (arboviruses) have shown slower mutation rates than viruses that infect the human host directly, probably because of the trade-off that occurs when the virus is obliged to adapt alternately to the invertebrate vector and the vertebrate host [51]. The resulting constraint has been assessed experimentally in vivo for Venezuelan equine encephalitis virus [52] and in vitro for DENV [51], demonstrating that fitness improves when a virus specializes in a single cell line but decreases in viruses undergoing alternating passages in different cells. Accordingly, overall DENV mutation rates have previously been inferred, ranging from 4.55 × 10⁻⁴ (DENV-1) to 9.01 × 10⁻⁴ (DENV-3) [19]. In the present study, we found a rate of 8.58 × 10⁻⁴ substitutions per site per year, suggesting faster evolution for Colombian strains, perhaps because of the high frequency of transmission, especially in hyperendemic areas, where the virus replicates in several human hosts, reducing the constraining effect of the vector. However, this high rate does not necessarily reflect a fitness advantage or a successful adaptation process. In fact, positive selection in DENV seems to be serotype- and genotype-dependent and, moreover, protein-specific. The envelope (E) protein apparently exhibits some evidence of adaptation in DENV-3, DENV-4 and various DENV-2 genotypes, but not in DENV-1, strongly suggesting purifying selection pressure, at least on this gene.
Nevertheless, further studies are needed to understand the adaptation process in DENV.
On the other hand, although most Colombian strains belong to genotype V, one isolate, DENV-1/CO/267_Valle/1983, was placed in genotype I, near strains from Asia, China and East Africa. The ML tree shows this strain close to DENV-1/JP/Mochizuki/1943, a strain considered extinct. Since we do not hold this virus as a reference in our laboratory, we can rule out cross-contamination during the assay. Moreover, the presence of this virus could be explained by migration from Asia to America, which officially began in Colombia by 1929 and continued until the mid-twentieth century [53]. Thus, the establishment of Asian colonies increased the number of visitors and perhaps favored the entry of viral strains. We can speculate that those viruses did not adapt to the new environment and that adaptation events were constrained by selective pressures, including different vectors and the human immune response.
According to the natural history of DENV, evolutionary events could produce new genetic variants and eventually increase the severity of disease. Although pathogenic markers remain unclear, hemorrhagic features of some Asian DENV-2 genotypes have been demonstrated, and Asian-derived DENV-3 genotypes associated with dengue fever and dengue hemorrhagic fever have been reported in Brazil [25]. Moreover, changes in the clinical manifestations of disease (atypical dengue), such as viscerotropism or encephalitis, may reflect the circulation of new DENV lineages with increased pathogenic potential.
Consequently, epidemiological programs should include not just virological diagnosis but genotype surveillance too.

Conclusion
This study shows, on a defined time scale, not just the first entry of DENV-1 into Colombia but also the process of viral evolution in a highly endemic area. As a major conclusion, only one genotype of DENV-1 has been circulating since the first epidemic reports in the continental area. Nevertheless, two different lineages have been evolving rapidly since the early 1990s according to the molecular clock. As these evolutionary events may result in marked pathogenic potential, surveillance programs should include molecular methodologies. In fact, the unusual presentations of disease currently reported by local health care institutions may be correlated with this evolutionary process. Further analyses using at least the complete E gene should be done to corroborate our results.

Virus strains
DENV-1 strains used in this study were obtained from the virus collection of the National Health Institute (INS, Virology Lab, Bogotá, Colombia) and comprise 74 isolates from outbreaks, epidemics and routine epidemiological surveillance. Clinical samples were collected between 1978 and 2007 from different localities all around the country, so they represent most viruses circulating in Colombia during the last 30 years (Table 1). All viral stocks were inoculated on C6/36 Aedes albopictus cells grown in Eagle's minimal essential medium (E-MEM) supplemented with 2% fetal calf serum (FCS). After 10 days of incubation at 28°C, the monolayer was disrupted and the supernatant was recovered by centrifugation and stored at -80°C until use. The remaining cells were washed with phosphate-buffered saline (PBS) and dropped onto slides; after fixation in cold acetone, the slides were incubated with monoclonal antibodies (anti-DENV-1 to anti-DENV-4, kindly donated by CDC, Puerto Rico) for one hour, washed with PBS and incubated again with a fluorescent conjugated antibody. Additionally, the DENV-1 serotype was confirmed by reverse transcription polymerase chain reaction (RT-PCR) using specific primers [54].
Amplified products (from RT-PCR or nested PCR) were purified using the QIAquick PCR Purification Kit (QIAGEN, Germany) and then used as templates for sequencing reactions with the ABI Prism Dye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA). Sequencing was carried out on both strands with 10 pmol of the primers used for nested PCR, and the products were analyzed on an ABI model 377 automated sequencer (Applied Biosystems, USA). Overlapping sequences obtained for each sample from the sense and antisense primers were combined into a consensus sequence using the SeqMan module of Lasergene (DNASTAR Inc. Software, Madison, Wis.). A total of 224 bp [corresponding to the carboxyl terminus of the envelope (E) gene] from 74 new sequences were compared with 52 previously sequenced strains from all over the world, available in GenBank. Consensus sequences were aligned using the program CLUSTAL W included in the MEGA package version 4.0 [35,36].

Phylogenetic analyses
Phylogenetic trees were constructed with the Maximum Parsimony and Maximum Likelihood (ML) methods implemented in the PAUP* 4.0 program [56]. Phylogenetic analyses were performed using the best model of nucleotide substitution selected with Modeltest [57] (analyses are available upon request). The statistical significance of the tree topology was assessed by bootstrap analysis with 1000 replicates. The resulting trees were visualized using the TreeView program [58].
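The bootstrap step can be illustrated with a short Python sketch. It shows only the generic resampling idea (drawing alignment columns with replacement and expressing clade support as a percentage of replicates); tree inference on each replicate, which the study performed in PAUP*, is not reproduced here. The function names and the toy alignment are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Build one pseudo-replicate by resampling alignment columns with
    replacement; every sequence keeps its name and original length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def bootstrap_support(replicates_with_clade, n_replicates):
    """Percentage of replicate trees in which a given clade was recovered."""
    return 100.0 * replicates_with_clade / n_replicates


# Hypothetical toy alignment (the real data are 224-bp E gene fragments).
toy = {"s1": "ACGTAC", "s2": "ACGTAT", "s3": "TCGAAT"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicate = bootstrap_replicate(toy, rng)

# A clade found in 770 of 1000 replicate trees has 77% bootstrap support.
print(bootstrap_support(770, 1000))
```

Resampling whole columns, rather than individual characters, keeps the sites of all sequences paired, so each replicate remains a valid alignment of the same taxa.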

Substitution rates and molecular clock
In addition, the estimated rate of evolutionary change (nucleotide substitutions per site per year) and the age of the tree root were obtained with the program BEAST (Bayesian Evolutionary Analysis by Sampling Trees) [41], which uses Bayesian Markov Chain Monte Carlo (MCMC) algorithms combined with the chosen model and prior knowledge of the sequence data to infer the posterior probability distribution of phylogenies [41][42][43][44]. We analyzed the data using the year of isolation as a calibration point to estimate divergence times in years. To avoid duplicates, sequences identical to others in the dataset were removed. Rate variation among branches was modeled under the strict molecular clock model, whereas substitution rates among sites were calculated with the General Time-Reversible (GTR) model combined with the gamma parameter and a proportion of invariant sites (GTR+Γ+I). The MCMC was run for 10,000,000 steps and sampled every 500 steps, and the first 10,000 steps of each run were discarded. BEAST input files were generated with the BEAUti graphical interface provided with the program, and the resulting trees were visualized with FigTree 1.2.2. Finally, statistical analysis was carried out in the Tracer package [41].
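The MCMC settings above imply a specific number of posterior samples, which can be worked out with simple arithmetic. The sketch below is illustrative only; the function name is hypothetical, and the count assumes that the initial state (step 0) is not logged, so the exact figures may differ by one depending on the logging convention.

```python
def mcmc_sample_counts(chain_length, sample_every, burn_in_steps):
    """Return (total, discarded, retained) numbers of logged samples,
    assuming the initial state is not logged."""
    total = chain_length // sample_every
    discarded = burn_in_steps // sample_every
    return total, discarded, total - discarded


# Settings reported in the text: 10,000,000 steps, sampled every 500 steps,
# with the first 10,000 steps discarded.
total, discarded, retained = mcmc_sample_counts(10_000_000, 500, 10_000)
burn_in_fraction = 10_000 / 10_000_000

print(total, discarded, retained, burn_in_fraction)
```

Under these assumptions the chain yields 20,000 logged samples, of which 20 are discarded and 19,980 are retained, corresponding to a burn-in of 0.1% of the chain.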