High prevalence and diversity of species D adenoviruses (HAdV-D) in human populations of four Sub-Saharan countries

Background Human adenoviruses of species D (HAdV-D) can be associated with acute respiratory illness, epidemic keratoconjunctivitis, and gastroenteritis, but subclinical HAdV-D infections with prolonged shedding have also been observed, particularly in immunocompromised hosts. To expand knowledge on HAdV-D in Sub-Saharan Africa, we investigated the prevalence, epidemiology and pathogenic potential of HAdV-D in humans from rural areas of 4 Sub-Saharan countries, Côte d’Ivoire (CI), Democratic Republic of the Congo (DRC), Central African Republic (CAR) and Uganda (UG). Methods Stool samples were collected from 287 people living in rural regions in CI, DRC, CAR and UG. HAdV-D prevalence and diversity were determined by PCR and sequencing. A gene block, spanning the genes pV to hexon, was used for analysis of genetic distance. Correlation between adenovirus infection and disease symptoms, prevalence differences, and the effect of age and gender on infection status were analyzed with cross tables and logistic regression models. Results The prevalence of HAdV-D in the investigated sites was estimated to be 66% in CI, 48% in DRC, 28% in CAR (adults only) and 65% in UG (adults only). Younger individuals were more frequently infected than adults; there was no difference in HAdV-D occurrence between genders. No correlation could be found between HAdV-D infection and clinical symptoms. Highly diverse HAdV-D sequences were identified, among which a number are likely to stand for novel types. Conclusions HAdV-D was detected with a high prevalence in study populations of 4 Sub-Saharan countries. The genetic diversity of the virus was high and further investigations are needed to pinpoint pathological potential of each of the viruses. High diversity may also favor the emergence of recombinants with altered tropism and pathogenic properties.


Background
Adenoviruses are non-enveloped icosahedral viruses with a linear, double-stranded DNA genome that can cause a wide spectrum of clinical diseases, including acute respiratory illness, epidemic keratoconjunctivitis, acute hemorrhagic cystitis, hepatitis, myocarditis and gastroenteritis [1,2]. Primary infection with human adenoviruses (HAdV) or reactivation of persistent HAdV can lead to severe systemic diseases in immunocompromised hosts, such as bone marrow and solid organ transplant recipients and patients with AIDS [3][4][5]. Based on serology, whole-genome sequencing, and phylogenomics, seven HAdV species (HAdV-A to -G) are differentiated within the genus Mastadenovirus, comprising a total of 69 recognized HAdV types. Most HAdV types belong to species D, and the majority of these HAdV-D types have been detected in HIV-positive patients [6][7][8]. Knowledge about the pathogenicity of HAdV-D types is limited. Types D8, D19, D37 and D54 are known to cause epidemic keratoconjunctivitis [9][10][11][12][13]. The virulent type D53, a recombinant between D8, D22, D37 and at least one unknown HAdV-D type, showed a modified tropism and induced inflammation of the cornea [14]. Another recombinant type, D56, was involved in fatal pneumonia in a neonate and keratoconjunctivitis in three adults [15]. Type D36 has been associated with obesity in animals and humans [16], and types D65 and D67 were detected in the stool of children with gastroenteritis [8,17]. The pathogenicity of the HAdV-D types frequently shed by patients with AIDS remains controversial [3,18-20]. Prolonged fecal and urinary shedding of different HAdV-D types and of recombinants has been observed [18,21]. Hence it has been proposed that recombination might play a greater role in HAdV-D evolution than base substitution [6].
Since HIV/AIDS and gastroenteritis are common diseases in Sub-Saharan Africa, and PCR-based, sequence-confirmed studies on the prevalence of HAdV-D in Sub-Saharan Africa are scarce [22,23], we analyzed the occurrence of HAdV-D in people from four Sub-Saharan countries: Côte d'Ivoire (CI), the Democratic Republic of the Congo (DRC), the Central African Republic (CAR), and Uganda (UG). We assessed possible connections between HAdV-D shedding and clinical symptoms or demographic data. In addition, we partially characterized the molecular isolates by comparing minimum genetic distances within the pV-hexon gene block.

Results
HAdV-D generic nested PCR and sequencing revealed a HAdV-D prevalence of 66% (95% CI 56-76%) in CI, 48% (95% CI 38-58%) in DRC, 28% (95% CI 13-51%) in CAR (adults only), and 65% (95% CI 53-75%) in UG (adults only). The prevalence in CI was significantly higher than in DRC (p < 0.01). When comparing the adult populations in all four countries, the prevalence in CI and UG was significantly higher than in DRC and CAR (p < 0.05). In CI and DRC, respectively, 100% and 68% of the younger children and 71% and 50% of the older children and adolescents were HAdV-D positive (Table 1); the proportion of infected people decreased further to 64% and 39% in the adults. Overall, there was a significant decrease in the proportion of infected individuals with increasing age group (regression coefficient -0.6, p < 0.05). Gender had no significant effect on infection status (p > 0.100). The prevalence per village ranged from 45-100% (data for individual villages not shown) and there was overall no significant difference between the villages (p > 0.05) (mean n/village = 16.7, range . The logistic regression model explained 11% of the variance in the dataset (pseudo R-square = 0.1064).
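The confidence intervals reported above can be approximated with a Wilson score interval. The following Python sketch is illustrative only: the interval method used in the study is not stated, and the counts (5 of 18 positives for CAR, 45 of 69 for UG) are assumptions inferred from the reported percentages and sample sizes, not figures given in the text.

```python
import math
from typing import Tuple


def wilson_interval(positives: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = positives / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts, inferred from the reported prevalence and sample sizes.
for label, positives, n in [("CAR", 5, 18), ("UG", 45, 69)]:
    lower, upper = wilson_interval(positives, n)
    print(f"{label}: prevalence {positives / n:.3f}, interval {lower:.3f}-{upper:.3f}")
```

Under these assumptions the intervals land close to the reported ranges. The small CAR sample produces a much wider interval than the larger UG sample.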
To investigate the pathogenicity of HAdV-D, we tested for a correlation between HAdV-D shedding and clinical symptoms. Overall, 57% of the study participants reported at least one clinical symptom. There was no correlation between infection status and individual clinical signs, or between infection status and poor health status, i.e. having reported any symptom (p > 0.05) (Table 1).
We finally analyzed whether the sequences of the current study were derived from novel HAdV-D types. For this purpose, we attempted to amplify a 4.8 kb fragment spanning the pV-hexon gene block from 35 randomly chosen HAdV-D-positive samples from CI, DRC, CAR and UG. Fragments were obtained from 14 samples after the 2nd or 3rd round of long-distance PCR and were completely sequenced. Thirteen different HAdV-D sequences were obtained: because 2 samples from UG yielded nearly identical sequences, only one of them (Hu4555_UG) was included in further analysis. BLAST analysis of the hyper-variable Loop 2 region from the 13 sequences revealed a pairwise identity of 86.4%-100% to known HAdV-D types (HAdV-D types 9, 13, 15, 17, 23, 25, 27, 29, 30, 32, 47, 48, 56, 65, 67). We then compared the minimum genetic distances (minGD) of the pV-hexon sequences from 43 known HAdV-D types with those of the 13 sequences identified in this study. First, we determined minGD between the 43 recognized types (see Methods) to estimate the range of intertypic minGD. 50% of all values lay between 0.02 and 0.05 nucleotide substitutions per site (Figure 1). Then, we determined minGD between the 13 sequences generated in this study and any recognized type: here 84.6% of all values lay between 0.02 and 0.05 nucleotide substitutions per site (Figure 1; Table 2). All 13 unique sequences identified in this study exhibited minGD >0.02; 5 sequences even exhibited a minGD >0.04, which exceeded >50% of intertypic minGD values (Figure 1; Table 2). Comparable results were obtained by using estimated instead of observed minGD (Table 2) and by analyzing only the region containing the hyper-variable hexon loops (Loops 1 and 2). These loops represent the major target for antibodies, are involved in immune escape, are particularly prone to recombination events and are among the target sequences for the characterization of HAdV types.
Although more sequence information would be desirable, our data already indicate that 5 of the 13 unique sequences likely represent novel types (under the conservative assumption that intertypic minGD > intratypic minGD in at least 50% of cases). Recombination analysis of our data confirmed the tendency of HAdV-D to recombine in the hyper-variable loop region of the hexon gene (data not shown).

Discussion
In this study, we estimated a relatively high prevalence of HAdV-D in several rural study populations in Sub-Saharan countries. The overall prevalence in CI was significantly higher than in DRC, which was also reflected by the fact that the proportion of HAdV-D-positive individuals was higher in CI than in DRC across all age classes (Table 1). UG also had a particularly high prevalence when comparing the adult population only. Since the model explained only 11% of the variance in the dataset, many factors that influence the difference in HAdV-D occurrence are likely not to have been included in this study. Considering the fecal-oral nature of HAdV transmission, such factors may include sources of local water supply and hygiene measures such as toilet facilities. Nutrition and the local occurrence of HIV and other infections might also play a role in the susceptibility to, and shedding of, HAdV-D. HIV prevalence was not determined for the populations in the present study, but it is possible to speculate that the high prevalence in CI is partly a result of the relatively high HIV-1 prevalence in the Taï area (7.2%) [24]. Differences in sampling strategies and detection methods limited the feasibility of directly comparing our cross-sectional study to other studies (Additional file 1: Table S1) [22,23,25-35]. In those studies, participants were within limited age groups, showed specific symptoms and/or had different lifestyles. Contrary to other studies, we applied a generic nested PCR using non-degenerate primers that target the hexon gene of all known HAdV-D types and sequenced all positive samples. Most primers used in other studies were degenerate and targeted several HAdV species, and where HAdV were confirmed by sequencing, this was only performed for a selection of samples (Additional file 1: Table S1). This might have resulted in a considerable underestimation of the HAdV-D prevalence.
Our study included participants of all age groups, ranging from young children and older children/adolescents to adults. We found that younger individuals shed HAdV-D significantly more frequently, which suggests on the one hand that exposure to this infection likely occurs early in life and on the other hand that adults might develop immunity leading to a reduction in HAdV-D shedding. It is not clear whether this high prevalence can be explained by a generally high susceptibility of children to any infection, or by a more likely ingestion of contaminated material in young children compared to adults. Although mothers should in theory be at higher risk of infection through baby care, no difference between men and women regarding HAdV-D shedding was observed. This could indicate that the majority of infections occur via generally available sources in the villages.
It has been shown that some HAdV-D types induce specific symptoms [8,9,14,16,17]. In our study, HAdV-D sequences could not be definitively assigned to specific HAdV-D types, which makes it less likely to find a correlation between HAdV-D infection and an individual symptom. In addition, the symptoms induced by HAdV-D types are not pathognomonic and can be associated with different pathogens. To determine the effect of specific HAdV-D types, it would be necessary to exclude other pathogens causing similar symptoms and to perform type-specific laboratory analyses.
We analyzed genetic distances to further characterize the HAdV-D types involved in this study. Historically, novel AdV types have been determined using serological assays based on recognition of specific epitopes on the viral capsid and on biological properties (oncogenic, haemagglutinating and morphological properties). Nowadays, phylogenetic analyses of complete sequences of the capsid proteins hexon, fiber and penton base have been shown to be good predictors for new types and for detection of recombination events [36][37][38]. We analyzed a 4.8 kb long sequence comprising the pV-hexon gene block. Molecular divergence within the hexon protein, the most significant protein for classification and recognition of types, can be used to estimate whether sequences are likely to represent novel AdV types [39,40]. Sequencing of the hyper-variable Loop 2 region of the hexon gene has been proposed to be sufficient for HAdV typing [39]. For HAdV-D, a genetic divergence ranging from 0.3 to 2.7% has been reported [40]. Although analysis of the pV-hexon sequence alone does not permit definite typing of HAdV-D, since recombination may have occurred in other genes, comparison of minimum genetic distances between recognized types and the study sequences strongly suggested that novel HAdV-D types are involved in our study (Figure 1; Table 2).

Conclusions
This study shows that HAdV-D is prevalent in rural populations across Sub-Saharan Africa, and that the virus is more frequently shed by younger than older individuals. HAdV-D shedding could not be linked to disease symptoms, but specific types were not investigated. The diversity of the virus was high, which may include pathogenic variants and favor the emergence of recombinants with altered tropism and pathogenic properties.

Subjects, sampling and statistical analysis
Between 2011 and 2012, we collected stool samples from individuals who had lived in or next to the tropical rain forest in Western (Taï National Park in CI) and Central (Salonga National Park in DRC) Africa for several years. The study populations were predominantly hunters, livestock breeders and cultivators with (CI) or without (DRC) regular contact with other populations. Volunteer participants (93 male; 107 female) were recruited from 12 villages (CI, n = 8; DRC, n = 4). Participants were between 0 and 77 years old (mean = 41) in CI, and between 0 and 78 years old (mean = 28) in DRC. A basic clinical examination was performed by a trained medical professional and, when necessary, free treatment and medical advice were provided. If treatment was not possible on site, individuals were referred to an appropriate medical facility. The final written medical history included age, sex, location of residence, and information about current medical condition. Participants were assigned to one of three age groups: young children (0-5 years old), older children and adolescents (6-19 years old), and adults (20 years and older). The same sampling procedure was applied in CAR and UG, where stool samples were collected from adult male field assistants of Dzanga-Sangha Protected Areas (CAR, n = 18) and Bwindi National Park (Uganda, n = 69). Clinical data were not recorded for these 2 groups. Before sampling, the aim of the study, as well as the possibility to quit the study at any point, was explained individually in the local language. An individual study number was assigned to every participant in order to protect the participant's privacy. Written informed consent was obtained from every study participant before sampling, and the collection was approved by the responsible ethics committee of every country and was performed according to the Declaration of Helsinki. The samples were preserved in liquid nitrogen and later stored at -80°C. They were exported to Germany on dry ice with the appropriate permissions. DNA extraction was performed using the Qiagen and Roboklon stool kits, according to the manufacturers' instructions.
Descriptive statistics, prevalence estimates and the effects of demographic data were generated and analyzed in Stata v12.0, using cross tables with Fisher's exact tests and logistic regression models with HAdV-D status as the dependent binomial variable and country, gender, age group and village as independent factors. Results from CAR and UG were used for prevalence estimation only.
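To illustrate how a cross table of infection status by country can be tested, the following Python sketch implements a two-sided Fisher's exact test for a 2x2 table. It is not the Stata procedure used in the study, and the counts in the usage lines are hypothetical, chosen only to show the call.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(x_min, x_max + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = two sites, columns = HAdV-D positive / negative.
p = fisher_exact_two_sided(20, 10, 12, 18)
print(f"two-sided p = {p:.4f}")
```

The test conditions on the table margins, which makes it suitable for the small per-village and per-age-group cell counts typical of a study of this size.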

Specific nested HAdV-D PCR
For the generic detection of members of the species HAdV-D, non-degenerate primers were designed based on an alignment of published HAdV-D sequences (Table 3)

Specific long-distance-PCR
Long-distance nested PCR was performed using the TaKaRa-EX PCR system according to the instructions of the manufacturer (Takara Bio Inc., Otsu, Japan) and resulted in a fragment of 4.8 kb. The sense primers bind in the 3′-end of the pVII gene and the antisense primers in the 3′-end of the hexon gene (Table 3). The target sequence lies between positions 15091 and 20547 of the genome of HAdV-D36. Depending on the DNA concentration, up to 7 μl of DNA was added in the first round to 15.5 μl of PCR mix (containing 10× ExTaq buffer with MgCl2, dNTPs at 2.5 mM each, 5.0 units of Ex Taq Polymerase and 10 μM of each first-round sense and antisense primer), and PCR-grade H2O was added to a final volume of 50 μl. In second- and third-round amplification, 1 μl of the reaction product generated in the preceding PCR round was added to 15.5 μl of PCR mix with 10 μM of the respective primers, and H2O was added to a final volume of 50 μl. Thermocyclers of type TgradientS (Biometra, Germany) were used under the following conditions for every round: activation of the polymerase at 94°C for 5 min and 15 cycles of, followed by 15 cycles of denaturation (98°C, 20 s), annealing (60°C, 30 s), and elongation (68°C, 8 min + 5 s) and a final elongation at 72°C for 30 min. All PCR products of expected size were purified and sequenced as described above.

Sequence analysis
The 4.8 kb sequences determined in this study (n = 13) were added to a data set consisting of sequences of the pV-hexon gene block available in GenBank from completely sequenced HAdV-D genomes (n = 43) (listed in Methods). They were aligned with the ClustalW multiple alignment method (EMBL, Heidelberg, Germany) (Alignment 1). Another alignment was performed comprising only the 43 GenBank sequences (Alignment 2). In addition, the hyper-variable loop regions of the hexon gene were extracted from these alignments, resulting in Alignments 1a and 2a. Conserved blocks were selected from all alignments using Gblocks [41] as implemented in SeaView v4 [42]. This resulted in alignments of 4,556 nucleotides (Alignments 1 and 2) and 1,065 nucleotides (Alignments 1a and 2a). Observed genetic distance matrices were obtained for every alignment with the program Geneious v6.1.6 [43]. Evolutionary distances were estimated using PAUP* v4.0 [44], based on the best-fitting nucleotide substitution model (GTR + I + G), as determined using jModeltest v2.1.3 [45].
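An observed genetic distance between two aligned sequences can be understood as the proportion of compared sites at which they differ. The following Python sketch shows this calculation. It is an illustration and not the Geneious implementation: skipping sites with a gap or ambiguity character in either sequence is an assumption, and the example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-?N") -> float:
    """Observed distance: differing sites divided by compared sites.

    Assumption: sites where either sequence has a gap or ambiguity
    character (listed in skip_chars) are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in skip_chars or y in skip_chars:
            continue
        compared += 1
        differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical aligned fragments, for illustration only.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

Model-based (estimated) distances such as those from GTR + I + G additionally correct for multiple substitutions at the same site and are therefore usually larger than the observed proportion.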
In the observed and estimated genetic distance matrices of Alignments 1 and 1a, the minimum genetic distance (minGD) was determined for every sequence of this study (minGD study 1-13) in relation to the 43 HAdV-D types from GenBank. Correspondingly, in the observed and estimated genetic distance matrices of Alignments 2 and 2a, the minGD was determined for every HAdV-D type from GenBank (minGD Genbank 1-43) in relation to the other 42 types. Subsequently, we visualized the 13 minGD study values and the 43 minGD Genbank values in a strip chart (generated with R software) (Figure 1) and assessed for every minGD study value the number and percentage of inferior minGD Genbank values (Table 2).
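The minGD comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code of the study: the distance matrix is represented as a nested dictionary, and all sequence names and distance values are hypothetical.

```python
from typing import Dict, List

Matrix = Dict[str, Dict[str, float]]


def study_min_distances(dist: Matrix, study: List[str], refs: List[str]) -> Dict[str, float]:
    """minGD of every study sequence relative to all reference types."""
    return {s: min(dist[s][r] for r in refs) for s in study}


def reference_min_distances(dist: Matrix, refs: List[str]) -> Dict[str, float]:
    """minGD of every reference type relative to the other reference types."""
    return {r: min(dist[r][o] for o in refs if o != r) for r in refs}


def percent_inferior(value: float, reference_values: List[float]) -> float:
    """Percentage of reference minGD values strictly below the given value."""
    below = sum(1 for v in reference_values if v < value)
    return 100.0 * below / len(reference_values)


# Hypothetical distances (substitutions per site), for illustration only.
dist = {
    "study_1": {"ref_A": 0.045, "ref_B": 0.030, "ref_C": 0.060},
    "study_2": {"ref_A": 0.015, "ref_B": 0.050, "ref_C": 0.033},
    "ref_A": {"ref_B": 0.020, "ref_C": 0.040},
    "ref_B": {"ref_A": 0.020, "ref_C": 0.035},
    "ref_C": {"ref_A": 0.040, "ref_B": 0.035},
}
refs = ["ref_A", "ref_B", "ref_C"]
ref_min = list(reference_min_distances(dist, refs).values())
for name, value in study_min_distances(dist, ["study_1", "study_2"], refs).items():
    print(name, value, round(percent_inferior(value, ref_min), 1))
```

A study sequence whose minGD exceeds a large share of the intertypic reference values is at least as distant from its nearest recognized type as many recognized types are from each other.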

Recombination analysis
Recombination events, likely parental isolates of recombinants, and recombination breakpoints were analyzed with the Recombination Detection Program v.4.16 (RDP4) under default settings, using the RDP, GENECONV, Chimaera, MaxChi and BOOTSCAN methods implemented in the program [46,47].