Detection of novel viruses in porcine fecal samples from China

Background: Pigs are a well-known source of human infectious diseases. To better understand the spectrum of viruses present in pigs, we utilized the 454 Life Sciences GS-FLX high-throughput sequencing platform to sequence stool samples from healthy pigs. Findings: Total nucleic acid was extracted from stool samples of healthy piglets and randomly amplified. The amplified materials were pooled and processed using a high-throughput pyrosequencing technique. The raw sequences were deconvoluted on the basis of the barcode and then processed through a standardized bioinformatics pipeline. The unique reads (348, 70 and 13) had limited similarity to known astroviruses, bocaviruses and parechoviruses, respectively. Specific primers were synthesized to assess the prevalence of the viruses in healthy piglets. Our results indicate extremely high rates of positivity. Conclusions: Several novel astroviruses, bocaviruses and Ljungan-like viruses were identified in stool samples from healthy pigs. The new viruses were detected at high rates. The high detection rate, diverse sequences and categories indicate that pigs are well-established reservoirs for and likely sources of different enteric viruses.


The study
Many emerging viruses are of zoonotic origin and can cause epidemics in humans after overcoming the interspecies barrier through mutation or other genetic events, including recombination. The identification of previously unknown viruses in animal hosts is important for understanding their potential for cross-species transmission and emergence. Pigs are known to harbor a diverse array of viruses, including coronaviruses, astroviruses and kobuviruses [1][2][3][4], among which coronaviruses and astroviruses are well known to infect humans. To better understand the spectrum of viruses present in pigs, we utilized the 454 Life Sciences GS-FLX high-throughput sequencing platform (454 Life Sciences/Roche, Branford, CT, USA), which has emerged as a non-biased, comprehensive and powerful tool for virus discovery in complex environmental samples [5,6], to sequence stool samples from healthy pigs.
With the cooperation of the Lulong County CDC, stool samples were collected from healthy piglets < 3 months of age from several farms and scattered households that raised pigs in Lulong County during 2006-2009. The samples were transported on dry ice and stored at −80°C at the Chinese CDC.
The samples were screened for rotavirus, calicivirus and astrovirus by ELISA or routine PCR. Nine of the samples negative for these viruses were pooled, diluted (1:5 ratio, wt/vol) and then sequentially filtered through 0.45- and 0.22-μm membranes. The pooled sample included two samples that were previously demonstrated to contain a novel porcine bocavirus (6V/7V CHN) that was partially sequenced in our lab [7]. Total nucleic acid was extracted from the pooled sample and cDNA was generated using random hexamers. Random PCR amplification [8] was performed and the amplified cDNA was used as the template for standard library construction and sequencing using Roche Genome Sequencer FLX Titanium pyrosequencing technology.
The initial pyrosequencing runs produced in excess of 145,000,000 bases of high-quality nucleotide sequence with an average read length of 360 bp. We used a customized informatics pipeline as described previously [9], with minor modifications, to computationally identify viral sequences. Briefly, raw sequence reads were filtered to remove low-quality and repetitive sequences; BLASTn and BLASTx were used to identify sequences with similarity to known viruses in GenBank; and sequences identified as viral were then classified into viral families based on the taxonomy of the best hit. Sequence assembly was performed using Newbler (454 Life Sciences) with the parameters set to 90% nucleotide identity over 50 base pairs. Based on this analysis, 348, 70 and 13 unique reads had limited similarity to known astroviruses, bocaviruses and parechoviruses, respectively. Assembly of the raw reads for astrovirus and bocavirus generated 51 and 23 contigs, respectively. Due to the limited number of reads with similarity to parechoviruses, individual reads rather than contigs were analyzed. Most of the sequences detected were highly divergent compared to the most closely related viruses (Figure 1; Additional file 1: Table S1, Additional file 2: Table S2 and Additional file 3: Table S3). The reference genomes were selected to represent the top BLAST match for most of the sequences.
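The best-hit classification step described above can be sketched as follows. This is a minimal illustration, not the pipeline of reference [9]: it assumes BLAST results in the standard 12-column tabular layout (bit score in the last column), and the function names, the accession-to-family lookup and the toy reads are hypothetical.

```python
from collections import Counter


def best_hits(blast_lines):
    """Keep the highest-bit-score hit per read.

    Assumes 12-column tab-separated BLAST output:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    Returns {read_id: (subject_id, bitscore)}.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        read_id, subject_id, bitscore = fields[0], fields[1], float(fields[11])
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return best


def count_by_family(best, subject_to_family):
    """Count reads per viral family using the family of each read's best hit."""
    counts = Counter()
    for subject_id, _ in best.values():
        counts[subject_to_family.get(subject_id, "unclassified")] += 1
    return counts


# Hypothetical lookup; a real pipeline would resolve taxonomy from GenBank.
SUBJECT_TO_FAMILY = {
    "FJ402983": "Astroviridae",
    "NC_003976.2": "Picornaviridae",
}

# Toy hits (invented values, for illustration only).
TOY_HITS = [
    "read1\tFJ402983\t45.0\t100\t55\t0\t1\t300\t1\t100\t1e-10\t60.0",
    "read1\tNC_003976.2\t30.0\t80\t56\t0\t1\t240\t1\t80\t1e-3\t35.0",
    "read2\tNC_003976.2\t40.0\t90\t54\t0\t1\t270\t1\t90\t1e-8\t50.0",
]

if __name__ == "__main__":
    print(count_by_family(best_hits(TOY_HITS), SUBJECT_TO_FAMILY))
```

Ranking by bit score is one common convention for choosing a best hit; ranking by E-value would be an equally plausible choice and may give a different result for borderline reads.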
Astroviruses have been isolated from a number of host species, including humans, minks, sheep, pigs, chickens, ducks and turkeys [10][11][12], and more recently from marine mammals, dogs, cheetahs and bats [13][14][15][16]. They are generally associated with enteric diseases such as diarrhea and vomiting in a number of mammalian species [17]. The most abundant viral reads in this sequencing library were those from astroviruses. The size of the astrovirus contigs generated ranged from 150 to 5674 bp. These contigs shared 30.1-100% amino acid identity, typically over a small region, with the best hits in the NCBI nr database, which suggests that the sequences are highly diverged from those of known viruses. Interestingly, multiple contigs mapped to the same region of the reference genome (GenBank FJ402983), suggesting the presence of multiple variants of astroviruses in the samples. For example, contigs 1, 6, 13, 21 and 24 mapped to the same subregion of ORF1b. A phylogenetic analysis of these five contigs was performed using both maximum likelihood and maximum parsimony methods in the PHYLIP package with 100 bootstrap replicates. The tree demonstrated that contig 1 was most closely related to avian astroviruses, contig 6 was closely related to porcine astrovirus and contigs 13, 21 and 24 formed a separate branch that was distantly related to all the other astroviruses (Figure 2a). The observed diversity suggests the existence of novel astroviruses in pigs.
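One way to spot contigs that map to the same region of a reference genome, as described above, is to cluster their mapped intervals by overlap. The sketch below is an illustration under assumptions: the contig names and coordinates are hypothetical, and the source does not state how overlapping contigs were actually identified.

```python
def cluster_overlapping(intervals):
    """Group contigs whose mapped intervals on a reference overlap.

    intervals: {contig_name: (start, end)}, 1-based inclusive coordinates.
    Returns a list of clusters (lists of names), ordered by start position.
    Overlap is chained: if a overlaps b and b overlaps c, all three share a cluster.
    """
    clusters = []
    current, current_end = [], None
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1]):
        if current and start <= current_end:
            current.append(name)
            current_end = max(current_end, end)
        else:
            if current:
                clusters.append(current)
            current, current_end = [name], end
    if current:
        clusters.append(current)
    return clusters


# Hypothetical mapped coordinates, for illustration only.
TOY_INTERVALS = {
    "contig_a": (4000, 4400),
    "contig_b": (4100, 4600),
    "contig_c": (4300, 4700),
    "contig_d": (100, 500),
}

if __name__ == "__main__":
    for cluster in cluster_overlapping(TOY_INTERVALS):
        print(cluster)
```

A cluster with more than one member only flags candidates; whether the members are distinct variants still depends on their pairwise identity and on phylogenetic analysis.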
Human bocavirus (HBoV) was first identified in respiratory samples from patients with respiratory infections in 2005 [18]. Subsequently, various bocaviruses have been found in respiratory and fecal specimens of human or animal origin, including HBoV 2-4, Gorilla bocavirus (GBoV1) and porcine bocavirus (PBoV1-4, 6V, and 7V) [19,20]. The sizes of the 23 bocavirus contigs in this study ranged from 147 to 1256 bp, and they shared 74.6-99.2% nucleotide sequence identity with the best hit in the NCBI nt database. Contigs 2, 5, 18 and 19 mapped to the same portion of the NS1 region of the reference genome but shared less than 80% identity among themselves at the nucleotide level. A phylogenetic analysis of partial NS1 sequences showed that contig 2 was closely related to porcine bocavirus 1-2 while contig 18 was closely related to PBoV3 [20]. Contigs 5 and 19 shared 74.6 and 80.0% nucleotide identity with PBoV4, respectively (Figure 2b). When we compared the contigs to the sequences from porcine bocavirus 6V and 7V, one and two contigs, respectively, exhibited high nucleotide identity (> 99%), indicating that we were able to recover these viruses once again with our methods. According to the most recent International Committee on Taxonomy of Viruses species demarcation criteria for the genus Bocavirus (at least 5% nucleotide sequence divergence in their nonstructural gene), our data indicate the existence of multiple new Bocavirus species in pigs.
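The demarcation criterion quoted above (at least 5% nucleotide divergence in the nonstructural gene) can be checked on a pair of pre-aligned sequences as in the sketch below. The function names and the gap-handling convention (columns with a gap in either sequence are ignored) are assumptions; the source does not specify how identity was calculated.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent nucleotide identity between two pre-aligned sequences.

    Both strings must have equal length; '-' marks an alignment gap.
    Columns with a gap in either sequence are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def exceeds_divergence(aligned_a, aligned_b, min_divergence=5.0):
    """True if divergence (100 minus percent identity) is at least min_divergence."""
    return 100.0 - percent_identity(aligned_a, aligned_b) >= min_divergence


if __name__ == "__main__":
    # Toy sequences (invented), differing at 1 of 10 aligned positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(exceeds_divergence("ACGTACGTAC", "ACGTACGTAA"))
```

On short partial sequences such as contigs, the identity value is sensitive to which region is aligned, so a threshold check of this kind is indicative rather than conclusive.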
Parechovirus is a genus in the family Picornaviridae that comprises two species: human parechovirus (HPEV) and Ljungan virus (LV). HPEV causes mild gastrointestinal or respiratory illness, and has been implicated in cases of myocarditis and encephalitis [21]. LV can cause significant morbidity and mortality in wild rodents as well as in laboratory animals, and it is a suspected human pathogen [22]. Bank voles infected with LV in captivity develop several different pathologic signs and symptoms, including myocarditis, diabetes and encephalitis [23]. Thirteen unique viral reads with similarities to both HPEV and LV were identified in this study. They spanned the LV reference genome (GenBank NC_003976.2), to which the amino acid identities ranged from 22.3 to 55.8%. Based on the sequences, we infer the existence of at least two parecho-like viruses. Reads 4, 9 and 11 belong to one virus, and overlap with read 2 over 190-231 bp but with only 81.1-82.7% nucleotide identity. Although the reads in each of the pairs 1 and 12, 3 and 10, and 6 and 13 overlap when aligned with a common reference genome, the two reads in each pair share less than 50% nucleotide identity. Phylogenetic analyses using partial 3C region sequences demonstrated that reads 6 and 13 form a distinct branch in the Parechovirus genus (Figure 2c). Notably, these divergent sequences represent some of the first parecho-like viruses reported in pigs to date.
The three types of novel viruses were investigated further. Five sets of primers targeting these viruses were designed and synthesized based on the contigs containing the most reads for each viral type (primers A-E in Table 1). Positive and negative controls were included in every PCR run and all products were sequenced. An additional five sets of primers (primers 2ndA-E in Table 1), which targeted internal regions amplified in the first round of PCR, were designed and used in a second round of PCR to exclude potential contamination, and the products were sequenced (data not shown). All of the sequences are provided in Additional file 4: S4. The results of the second round of PCR validated all of the samples identified as positive in the first round