Prevalence and genetic diversity analysis of human coronaviruses among cross-border children
Virology Journal volume 14, Article number: 230 (2017)
Abstract
Background
More than a decade after the 2002 outbreak of severe acute respiratory syndrome (SARS), caused by a human coronavirus (HCoV), in Guangdong province and Hong Kong SAR of China, the disease has not recurred, but the evolution and recombination of coronaviruses in this region remain unknown. We therefore conducted surveillance of the prevalence and genetic variation of HCoVs circulating in this region.
Methods
A total of 3298 nasopharyngeal swab samples were collected from cross-border children (<6 years old, crossing the border between Southern China and Hong Kong SAR) showing symptoms of respiratory tract infection, such as fever (body temperature > 37.5 °C), from May 2014 to December 2015. Viral nucleic acids were analyzed and sequenced to study the prevalence and genetic diversity of the four common human coronaviruses. Statistical significance was evaluated with Fisher's chi-square test.
Results
Of the 3298 nasopharyngeal swab specimens, 78 (2.37%; 95% CI 1.8-2.8%) were positive for HCoVs: OC43 (36; 1.09%), HKU1 (34; 1.03%), NL63 (6; 0.18%) and 229E (2; 0.06%). Neither SARS-CoV nor MERS-CoV was detected. HCoVs circulated predominantly during the transition from winter to spring, especially in January and February, whereas NL63 was detected only in summer and fall. A complex population of coronaviruses with abundant genetic diversity was circulating, and the detected strains shared 99-100% homology with published strains. Phylogenetic analysis indicated that OC43 strains clustered into three clades (B, D, E), HKU1 strains into two clades (A, B) and NL63 strains into two clades (A, B). Moreover, several novel mutations were discovered in the gene encoding the spike glycoprotein on the viral surface, including nucleotide substitutions and an insertion.
Conclusions
The detection rate and epidemic trend of coronaviruses were stable, with no obvious fluctuations. The detected coronaviruses had conserved S and RdRp gene sequences. However, mutants of the epidemic strains were detected, suggesting that continuous monitoring of human coronaviruses is needed among cross-border children, who are more likely to become infected and to transmit the viruses across the border, as well as among the general public.
Background
Human coronaviruses (HCoVs) have caused outbreaks worldwide, including cases requiring hospitalization. Six types of coronaviruses (CoVs) are known to infect humans: two α-CoVs (229E and NL63), two β-CoVs of group A (HKU1 and OC43), one β-CoV of group B (Severe Acute Respiratory Syndrome Coronavirus, SARS-CoV) and one β-CoV of group C (Middle East Respiratory Syndrome Coronavirus, MERS-CoV). SARS-CoV and MERS-CoV are highly pathogenic to humans, cause serious disease or death, and have mortality rates of about 10% and 36%, respectively. OC43, HKU1, NL63 and 229E are the four most common HCoVs in most regions. They circulate worldwide with detection rates ranging from 1.1% to 8.5% and vary in their predominant circulating seasons and strains [2,3,4,5]. HCoVs rank third in detection rate among 17 respiratory viruses in southern China (Guangzhou) and pose a heavy burden on children's health care because they are associated with acute upper or lower respiratory tract infections, and deaths have been reported. Moreover, the high mutation rate caused by the low fidelity of RNA-dependent RNA polymerase (RdRp) leads to high diversity among HCoVs. Several studies of the genetic diversity of human coronaviruses in hospitalized patients have been carried out previously. A new OC43 genotype D, arising from recombination between genotypes B and C, was discovered in 2005. Two additional recombinants, E (CH) and E (FR), were reported in 2015 as products of homologous genome recombination [9, 10]. For NL63, at least three distinct circulating genotypes (A, B and C) and one recombinant (cluster R) were reported in the United States in 2011. Meanwhile, HKU1 strains have been grouped into three clusters (A, B and C) arising from natural recombination.
These previous reports focused on hospitalized patients, who have low mobility and seldom cross the border. This study is the first to report an analysis of cross-border children, mainly “cross-boundary students”, who are born and attend school in Hong Kong but reside in Mainland China [13, 14]. Because of the colonial history, a border still exists between Shenzhen in Mainland China and Hong Kong (SZ-HK port), and the two sides have different health care and education systems. Children have a high incidence of coronavirus infection, and studying “cross-boundary students”, who closely connect Hong Kong and Mainland China, will help us understand the epidemic characteristics of coronaviruses in the Pearl River Delta region. The emergence of new infectious coronaviruses and the variation of known coronaviruses in this region are of interest because coronaviruses have the potential to threaten global health systems and no vaccine is currently available [15, 16]. Therefore, surveillance of human coronaviruses in this region was carried out in this study.
Methods
Clinical specimen collection
This was a cross-sectional molecular epidemiology study of coronavirus infection. The minimum sample size was 1683, as determined from the Z distribution. A total of 3298 (>1683) nasopharyngeal swab samples were collected from children (<6 years old) who crossed the Shenzhen border, which links Southern China and Hong Kong SAR, from 2014 to 2015 and who showed symptoms of respiratory tract infection, such as fever (body temperature > 37.5 °C) and cough. Written informed consent was obtained from the guardians of all participants before sample and data collection.
Briefly, each nasopharyngeal swab was collected at the Shenzhen border and stored in a sterile EP tube with 5 mL of viral transport medium. All samples were immediately refrigerated at 2-8 °C, transported to the central laboratory of health quarantine of the Shenzhen Entry-exit Inspection and Quarantine Bureau (SZCIQ) within the same day, and stored at −80 °C until analysis.
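The text gives the minimum sample size but not the inputs used to derive it. As a rough illustration only, a common Z-based formula for a prevalence study is n = z² · p(1 − p) / d², where p is the expected prevalence and d is the acceptable margin of error. The function name and the input values below are hypothetical and are not intended to reproduce the figure of 1683.

```python
import math
from statistics import NormalDist


def min_sample_size(expected_prevalence, margin, confidence=0.95):
    """Minimum n for estimating a proportion: n = z^2 * p * (1 - p) / d^2.

    expected_prevalence (p) and margin (d) are proportions in (0, 1).
    All inputs here are illustrative assumptions, not values from the study.
    """
    if not 0 < expected_prevalence < 1:
        raise ValueError("expected_prevalence must be between 0 and 1")
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    # Two-sided critical value of the standard normal (Z) distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_prevalence * (1 - expected_prevalence) / margin ** 2
    return math.ceil(n)


# Hypothetical inputs: 5% expected prevalence, +/- 1 percentage point margin.
print(min_sample_size(0.05, 0.01))
```

A lower expected prevalence or a wider margin reduces the required n, so the result is sensitive to both assumptions.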
Molecular screening of viruses, and amplification and sequencing of RdRp and S genes
Viral nucleic acids were extracted from 200 μL of each respiratory sample using the MagNA Pure 96 DNA and Viral NA Small Volume Kit (Roche) and the EZ1 Virus Mini Kit v2.0 (Qiagen) according to the manufacturers’ instructions, and were stored at −80 °C until use. For coronavirus screening, quantitative real-time polymerase chain reaction (qRT-PCR) was performed in triplicate on an ABI 7500 qRT-PCR thermocycler. The specimens were first screened for influenza viruses according to a previously published procedure. Influenza-negative samples were then tested for pan-coronavirus as well as 13 other common respiratory viruses. The qRT-PCR reaction mixture was prepared according to the manufacturer’s instructions for the qRT-PCR Kit (Quant) and consisted mainly of 20.0 μL of buffer and 5.0 μL of RNA. The thermal cycling conditions were as follows: reverse transcription at 50 °C for 10 min, initial denaturation at 95 °C for 3 min, then 40 cycles of denaturation at 95 °C for 15 s and annealing/elongation at 60 °C for 45 s.
The partial S (S1 subunit) and RdRp genes were amplified from the samples that were positive in HCoV screening, using the forward (F) and reverse (R) primers listed in Table 1. The PCR mixture (25 μL) was prepared with the SuperScript® III/PT Taq Kit (Invitrogen) and contained 5.0 μL of RNA, 12.5 μL of 2× Rxn Mix, 1 μL each of forward and reverse primer (10 μM), 1.0 μL of MgSO4, 1.0 μL of BSA (0.1%), 1.0 μL of SuperScript® III/PT Taq enzyme, 0.5 μL of RNA inhibitor and 2.0 μL of nuclease-free water. The thermal cycling conditions were as follows: reverse transcription at 50 °C for 30 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 50–54 °C for 30 s and elongation at 68 °C for 150–180 s; and a final elongation at 68 °C for 5 min. PCR products at concentrations of 50 to 300 ng/μL were submitted for Sanger sequencing (Sangon Biotech) to study the homology and mutations of the samples.
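The 25 μL RT-PCR recipe above can be checked and scaled with a few lines of code. The sketch below is a bench-side convenience, not part of the published protocol: it assumes that the primer volume means 1 μL of each primer (the reading under which the listed components sum to 25 μL), and the `overage` pipetting allowance is a hypothetical parameter.

```python
# Per-reaction volumes in microlitres, taken from the 25 uL PCR mixture in the text.
# Assumption: 1 uL of EACH primer, which makes the components sum to 25 uL.
PER_REACTION_UL = {
    "2x Rxn Mix": 12.5,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "MgSO4": 1.0,
    "BSA (0.1%)": 1.0,
    "SuperScript III/PT Taq enzyme": 1.0,
    "RNA inhibitor": 0.5,
    "nuclease-free water": 2.0,
}
RNA_TEMPLATE_UL = 5.0  # added to each tube separately, not to the master mix


def reaction_volume():
    """Total volume of one reaction (master mix components plus RNA template)."""
    return sum(PER_REACTION_UL.values()) + RNA_TEMPLATE_UL


def master_mix(n_reactions, overage=0.10):
    """Scale every non-template component to n reactions.

    overage is a hypothetical pipetting allowance (10% by default).
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    print("Single reaction volume (uL):", reaction_volume())
    for name, vol in master_mix(24).items():
        print(f"{name}: {vol} uL")
```

Keeping the RNA template out of the master mix reflects the usual practice of dispensing the shared mix first and adding each sample individually.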
Genetic sequence data, including the features of the samples and sequences, have been submitted to a publicly available repository (GenBank) under accession numbers MF996589-MF996664.
Statistical and sequence analysis
Statistical analysis was performed with SPSS 20.0. All p-values were determined by Fisher’s chi-square test, and a p-value < 0.05 was considered statistically significant. DNASTAR was used to analyze the gene sequences and compare them with sequences in NCBI GenBank for the homology study. Phylogenetic trees were constructed in MEGA 7.0 using the best-fitting nucleotide substitution model, the neighbour-joining and maximum likelihood methods, and bootstrap support values.
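The analysis itself was run in SPSS, and the text does not say whether Fisher's exact test or a chi-square test was applied to each comparison. As a minimal sketch, assuming a two-sided Fisher's exact test on a 2×2 table (for example, infected versus non-infected by sex), the p-value can be computed from the hypergeometric distribution with the standard library alone. The function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: positives/negatives in two groups (not data from the study).
print(fisher_exact_two_sided(12, 488, 20, 480))
```

A result below 0.05 would be read as significant under the threshold stated above.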
Results
A total of 3298 nasopharyngeal swab samples were screened to study the prevalence and clinical characteristics of HCoV infection. All the coronaviruses detected in this study could be typed. Of the 3298 specimens, 78 (2.37%; 95% CI 1.8-2.8%) were positive for HCoVs: OC43 (36; 1.09%; 95% CI 0.74%-1.44%), HKU1 (34; 1.03%; 95% CI 0.69%-1.37%), NL63 (6; 0.18%; 95% CI 0.04%-0.32%) and 229E (2; 0.06%). Neither SARS-CoV nor MERS-CoV was detected. HCoVs circulated predominantly during the transition from winter to spring, especially in January and February, whereas NL63 was detected only in summer and fall (Fig. 1a). The clinical symptoms associated with these samples are shown in Table 2. Males and females had similar detection rates for the HCoVs studied, and no significant difference was found in the detection rates of the four strains. Likewise, Fisher’s chi-square test showed no significant difference in detection rates among children of different origins. The three most common clinical symptoms of HCoV infection were fever (p = 0.08), throat congestion (p = 0.58) and antiadoncus (tonsillar swelling; p = 0.09), but none differed significantly between HCoV-infected and non-infected patients. By age group, infants (<1 year old), who have weaker respiratory immunity, showed the highest infection rate for all HCoV types combined (p = 0.049) and for OC43 (p = 0.068) (Fig. 1b). Co-infection of human coronaviruses with other common respiratory viruses was also observed. Adenovirus (Adv) and rhinovirus (RV) were the two viruses most commonly detected together with HCoVs in children younger than 6 years old.
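The detection rates and confidence intervals above can be recomputed from the raw counts. The paper does not name the interval method; the sketch below assumes a normal-approximation (Wald) interval, which gives values close to the published intervals for OC43, HKU1 and NL63. The function name is illustrative.

```python
import math


def wald_ci(positives, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval.

    Assumption: the paper does not state its interval method; Wald is used here
    because it roughly matches the published intervals.
    """
    p = positives / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


TOTAL_SWABS = 3298
POSITIVE_COUNTS = {"All HCoVs": 78, "OC43": 36, "HKU1": 34, "NL63": 6, "229E": 2}

for name, count in POSITIVE_COUNTS.items():
    p, low, high = wald_ci(count, TOTAL_SWABS)
    print(f"{name}: {100 * p:.2f}% (95% CI {100 * low:.2f}%-{100 * high:.2f}%)")
```

With only two positives, the 229E proportion (2/3298, about 0.06%) is too small for the Wald approximation to be reliable; a Wilson or exact interval would be a more appropriate choice for counts this low.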
A total of 40 RdRp gene sequences (20 OC43, 15 HKU1, 4 NL63 and 1 229E) and 36 S gene sequences (16 OC43, 16 HKU1 and 4 NL63) were obtained for phylogenetic analysis. Because the RdRp gene is highly conserved, its phylogenetic tree is not shown here. Multiple alignment of the RdRp genes indicated that OC43 and HKU1 possessed 99–100% nt identities. The largest divergence was observed among HKU1 coronaviruses, which possessed 96-100% nt identities, but the sequences detected in this study were 99-100% homologous to published strains (Table 3). Phylogenetic trees constructed from 31 S gene sequences longer than 2 kb showed a high level of genetic diversity among the HCoVs (Fig. 2). The OC43 coronaviruses clustered into clade B (5, 41.7%), clade D (6, 50.0%) and clade E (1, 8.3%), while no strains of genotypes A or C were detected (Fig. 2I). One OC43 sequence (SW1502-30/2015/Shenzhen, China) clustered with the new recombinant genotype E (CH) (GenBank accession no. KP198611.1). Similarly, the HKU1 strains in this study clustered into clade A (7, 46.7%) and clade B (8, 53.3%), related to sequences detected in Beijing and Hong Kong SAR, respectively, while no clade C strain was detected (Fig. 2 II). The NL63 strains clustered into clade A (1, 25.0%) and clade B (3, 75.0%), related to strains isolated in the USA and Denmark, and no clade C strain was detected either (Fig. 2 III).
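The nucleotide (nt) identities quoted above were obtained with DNASTAR, whose exact settings are not given. As a minimal sketch of the underlying idea, percent identity between two already-aligned sequences is the share of compared columns with matching bases. Skipping gap columns is one common convention and is an assumption here; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent nucleotide identity between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped; this gap
    convention is an assumption, not necessarily what DNASTAR does.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences for illustration only (not from the study).
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Counting gap columns as mismatches instead would lower the identity for sequences carrying insertions, so the convention should be stated whenever such figures are compared across studies.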
Moreover, we found nucleotide mutations in some of the samples (Fig. 3). Three of 8 OC43 coronaviruses of genotype D carried a total of 11 base substitutions at nucleotide positions 25,059–25,112 of the S gene (GenBank accession number of the reference strain: KF923904.1) (Fig. 3a). Six of 8 HKU1 coronaviruses of genotype B carried an insertion at nucleotide position 24,465 of the genome, leading to the insertion of an additional amino acid, threonine, at amino acid position 510 of the spike protein (GenBank accession number of the reference strain: DQ415911.1) (Fig. 3b).
Discussion
The detection rate of total HCoVs in this study, 2.37% (95% CI 1.8-2.8%), was consistent with previous studies. All the coronaviruses detected were typed. OC43 was the most common coronavirus in our study, consistent with reports from Guangzhou, Hong Kong, the USA and England [4, 18,19,20], although some studies in Brazil, Kenya and Japan found that the prevalence of NL63 was similar to or even higher than that of OC43 [3, 21,22,23]. As in previous reports, 229E was detected at low levels throughout the study years, so its peak activity could not be determined. HCoVs circulated predominantly during the transition from winter to spring, especially in January and February. NL63 circulated predominantly in summer and fall, unlike the winter and spring peaks reported in temperate countries such as the USA and the Netherlands [24, 25]. No infection was found in the 1–2 years old group, even though more samples were collected from this group than from the infant group. In summary, we analyzed the prevalence and clinical characteristics of HCoV infection in cross-border children at the SZ-HK ports. Compared with previous reports, the detection rate and epidemic trend of coronaviruses were stable, with no obvious fluctuations. No novel infectious coronavirus, SARS-CoV or MERS-CoV was detected in this study.
The coronaviruses detected at the SZ-HK ports had high homology with published strains, indicating stable S and RdRp gene sequences. However, there was great genetic diversity among these circulating strains. The OC43 strains detected in this study clustered with genotype B, D and E strains, while no genotype A or C strains were detected, probably because genotype A strains had disappeared and genotype C strains were not included in this study. Six OC43 coronaviruses were closely related, based on S genes, to genotype B strains detected in Beijing. They possessed 99% nt identities and showed an incongruent phylogenetic relationship between the RdRp and S genes. New recombinant genotypes arising from high intra-specific diversity have been reported for OC43 coronaviruses circulating in France, where eight different recombinants were discovered and confirmed by in silico analysis of the available complete genomes and by partial genome sequencing. At present, the base substitutions in OC43 and the insertion in HKU1 appear to be novel, as no matches were found among the OC43 or HKU1 strains in GenBank. More importantly, these amino acid sites are located in one of the putative regions of the HKU1 receptor-binding domain. The protein structure and its related function, especially its effect on the efficiency of human infection, need to be investigated in the future.
Conclusions
The detection rate of coronaviruses was in line with previous reports, no novel infectious coronavirus was detected, the epidemic trend of coronaviruses was stable, and all infected children showed ordinary respiratory infection symptoms. In addition, the coronaviruses detected at the SZ-HK ports showed great genetic diversity, and all strains had high homology with published strains. However, mutants of the epidemic strains detected during our surveillance are increasing. Continuous monitoring of human coronaviruses is therefore needed among cross-border children, who are more likely to become infected and to transmit the viruses across the border, as well as among the general public.
Abbreviations
- Bat CoV: Bat coronavirus
- Cox A6: Coxsackievirus A6
- qRT-PCR: Quantitative real-time polymerase chain reaction
- RdRp: RNA-dependent RNA polymerase
- RSV: Respiratory syncytial virus
- RT-PCR: Reverse transcription polymerase chain reaction
References
1. Centers for Disease Control and Prevention. About Coronavirus. https://www.cdc.gov/coronavirus/about/index.html. Accessed 16 Apr 2017.
2. World Health Organization. Middle East respiratory syndrome coronavirus (MERS-CoV). http://www.who.int/emergencies/mers-cov/en/. Accessed 16 Apr 2017.
3. Cabeca TK, Granato C, Bellei N. Epidemiological and clinical features of human coronavirus infections among different subsets of patients. Influenza Other Respir Viruses. 2013;7(6):1040–7.
4. Dijkman R, Jebbink MF, Gaunt E, Rossen JW, Templeton KE, Kuijpers TW, van der Hoek L. The dominance of human coronavirus OC43 and NL63 infections in infants. J Clin Virol. 2012;53(2):135–9.
5. Amini R, Jahanshiri F, Amini Y, Sekawi Z, Jalilian FA. Detection of human coronavirus strain HKU1 in a 2 years old girl with asthma exacerbation caused by acute pharyngitis. Virol J. 2012;9:142.
6. Liu WK, Liu Q, Chen DH, Liang HX, Chen XK, Chen MX, Qiu SY, Yang ZY, Zhou R. Epidemiology of acute respiratory infections in children in Guangzhou: a three-year study. PLoS One. 2014;9(5):e96674.
7. Woo PC, Lau SK, Huang Y, Yuen KY. Coronavirus diversity, phylogeny and interspecies jumping. Exp Biol Med (Maywood). 2009;234(10):1117–27.
8. Vijgen L, Keyaerts E, Lemey P, et al. Circulation of genetically distinct contemporary human coronavirus OC43 strains. Virology. 2005;337(1):85–92.
9. Zhang Y, Li J, Xiao Y, Zhang J, Wang Y, Chen L, Paranhos-Baccala G, Ren L, Wang J. Genotype shift in human coronavirus OC43 and emergence of a novel genotype by natural recombination. J Infect. 2015;70(6):641–50.
10. Kin N, Miszczak F, Lin W, Gouilh MA, Vabret A, EPICOREM Consortium. Genomic analysis of 15 human Coronaviruses OC43 (HCoV-OC43s) circulating in France from 2001 to 2013 reveals a high intra-specific diversity with new recombinant genotypes. Viruses. 2015;7(5):2358–77.
11. Dominguez SR, Sims GE, Wentworth DE, Halpin RA, Robinson CC, Town CD, Holmes KV. Genomic analysis of 16 Colorado human NL63 coronaviruses identifies a new genotype, high sequence diversity in the N-terminal domain of the spike gene and evidence of recombination. J Gen Virol. 2012;93(Pt 11):2387–98.
12. Woo PC, Lau SK, Yip CC, Huang Y, Tsoi HW, Chan KH, Yuen KY. Comparative analysis of 22 coronavirus HKU1 genomes reveals a novel genotype and evidence of natural recombination in coronavirus HKU1. J Virol. 2006;80(14):7136–45.
13. Cross-Boundary Students. Hong Kong Special Administrative Region Government Press Releases. http://www.info.gov.hk/gia/general/201106/15/P201106150120.htm. Accessed 16 Apr 2017.
14. Overview of the Health Care System in Hong Kong. Hong Kong Special Administrative Region Government portal. http://www.gov.hk/en/residents/health/hosp/overview.htm. Accessed 16 Apr 2017.
15. Al-Tawfiq JA, Zumla A, Gautret P, Gray GC, Hui DS, Al-Rabeeah AA, Memish ZA. Surveillance for emerging respiratory viruses. Lancet Infect Dis. 2014;14(10):992–1000.
16. Geller C, Varbanov M, Duval RE. Human Coronaviruses: insights into environmental resistance and its influence on the development of new antiseptic strategies. Viruses. 2012;4(11):3044–68.
17. Loo JF, Wang SS, Peng F, He JA, He L, Guo YC, Gu DY, Kwok HC, Wu SY, Ho HP, Xie WD, Shao YH, Kong SK. A non-PCR SPR platform using RNase H to detect MicroRNA 29a-3p from throat swabs of human subjects with influenza A virus H1N1 infection. Analyst. 2015;140(13):4566–75.
18. Gaunt ER, Hardie A, Claas EC, Simmonds P, Templeton KE. Epidemiology and clinical presentations of the four human coronaviruses 229E, HKU1, NL63, and OC43 detected over 3 years using a novel multiplex real time PCR method. J Clin Microbiol. 2010;48(8):2940–7.
19. Woo PC, Yuen KY, Lau SK. Epidemiology of coronavirus-associated respiratory tract infections and the role of rapid diagnostic tests: a prospective study. Hong Kong Med J. 2012;18(Suppl 2):22–4.
20. Prill MM, Iwane MK, Edwards KM. Human coronavirus in young children hospitalized for acute respiratory illness and asymptomatic controls. Pediatr Infect Dis J. 2012;31(3):235–40.
21. Cabeca TK, Carraro E, Watanabe A, Granato C, Bellei N. Infections with human coronaviruses NL63 and OC43 among hospitalised and outpatient individuals in Sao Paulo, Brazil. Mem Inst Oswaldo Cruz. 2012;107(5):693–4.
22. Matoba Y, Abiko C, Ikeda T, Aoki Y, Suzuki Y, Yahagi K, Matsuzaki Y, Itagaki T, Katsushima F, Katasushima Y, Mizuta K. Detection of the human coronavirus 229E, HKU1, NL63, and OC43 between 2010 and 2013 in Yamagata, Japan. Jpn J Infect Dis. 2015;68(2):138–41.
23. Sipulwa LA, Ongus JR, Coldren RL, Bulimo WD. Molecular characterization of human coronaviruses and their circulation dynamics in Kenya, 2009–2012. Virol J. 2016;13(1):18.
24. Heald-Sargent T, Gallagher T. Ready, set, fuse! The coronavirus spike protein and acquisition of fusion competence. Viruses. 2012;4(4):557–80.
25. Pfefferle S, Oppong S, Drexler JF, Gloza-Rausch F, Ipsen A, Seebens A, Muller MA, Anna A, Vallo P, Adu-Sarkodie Y, Kruppa TF, Drosten C. Distant relatives of severe acute respiratory syndrome coronavirus and close relatives of human coronavirus 229E in bats, Ghana. Emerg Infect Dis. 2009;15(9):1377–84.
26. Qian Z, Ou X, Góes LGB, Osborne C, Castano A, Holmes KV, Dominguez SR. Identification of the receptor-binding domain of the spike glycoprotein of human Betacoronavirus HKU1. J Virol. 2015;89(17):8816–27.
This research was supported by the National Key Research and Development Plan of China (No. 2016YFF0203203, No. 2016YFC1201404), the National Natural Science Foundation of China (No. K16026), the Guangdong Science and Technology Foundation (No. 2016A020219005, No. 20160223, No. CXZZ20150504163004339, No. 2016A020247001, No. 2016A020223001), the Shenzhen Science and Technology Foundation (No. JCYJ20140419151618022, No. JCYJ20150330102720128, No. CXZZ20150504163004339, No. JCYJ20170307104024209, No. JCYJ20160427151920801) and the Science and Technology Foundation of Shenzhen Entry-exit Inspection and Quarantine Bureau (No. SZ2014101).
Availability of data and materials
Genetic sequence data have been submitted to a publicly available repository (GenBank) under accession numbers MF996589-MF996664.
Ethics approval and consent to participate
This study was ethically approved by Shenzhen Entry-exit Inspection and Quarantine Bureau, Shenzhen, China. Written informed consent was obtained from the guardians of all participants before the sample and data collection.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Peilin Liu and Lei Shi contributed equally to this work and are co-first authors.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Cite this article
Liu, P., Shi, L., Zhang, W. et al. Prevalence and genetic diversity analysis of human coronaviruses among cross-border children. Virol J 14, 230 (2017). https://doi.org/10.1186/s12985-017-0896-0
Keywords
- Human coronaviruses
- Cross-border children
- Molecular epidemiology
- Phylogenetic analysis
- Genetic diversity