Several studies have indicated that many current HPV typing methods cannot reliably identify all types present in complex multiple infections [19, 21]. In the WHO HPV LabNet global genotyping proficiency study, most laboratories (90%) were able to identify HPV-16 and -18 as individual types, but fewer than 80% were able to identify HPV-56, -59 and -68. Of more concern was that 28/84 data sets reported false-positive results. The performance of most assays decreased notably when types were present in multiple infections: only 50 to 73% of the data sets generated by these assays correctly detected the types present. This suggests the need for further assessment of the tests used and for regular participation in proficiency testing to ensure the quality of data. It also suggests that epidemiological data may not be completely accurate and may be subject to detection biases.
Recently, a study was published using 454 NGS technology and HPV-specific primers targeting the conserved L1 gene. A good correlation was reported between the INNO-LiPA HPV Genotyping Extra assay (Innogenetics, Gent, Belgium) and the NGS data, but the NGS had a lower sensitivity. That study differs from ours in that we did not use HPV-specific primers, included a pre-amplification enrichment step using RCA, and generated sequence on the Illumina GAII system. Notably, our methodology showed greater sensitivity than LA genotyping. The present study demonstrates the use of NGS for genome sequencing and genotyping of the HPV types present in a complex multiple infection in an unbiased manner. The study design was a metagenomic-based approach, in which total DNA was extracted from a cervical specimen (HH015) without prior virus purification. Circular DNA present in the sample was enriched using phage phi29 DNA polymerase in a randomly primed RCA method. This allowed us to amplify whole HPV genomes in the sample in an unbiased, sequence-independent manner, unlike other amplification methods such as PCR. This robust technique has been used successfully to amplify a number of circular DNA viruses, including HPV, and provides ample quantities of sufficiently pure DNA for sequencing. Illumina sequencing was chosen for the high depth of coverage expected with this technology.
Approximately 20% of the short sequence reads generated by Illumina sequencing of the RCA-enriched DNA from specimen HH015 were identified as HPV sequences (Figure 2). Considering the small size of the HPV genome (8 kb) relative to the human genome (ca. 3000 Mb), even when HPV is present in relatively high copy numbers, this proportion of HPV reads indicates highly successful amplification or enrichment of the HPV DNA by the RCA technique. Complete or near-complete genomes were assembled for five HPV types (30, 39, 40, 16, 56). Both de novo assembly and reference mapping identified these types as the most abundant in the sample. We were not able to assemble full genomes de novo for the less abundant types in the sample, and instead used reference mapping to identify all the types present.
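The scale of this enrichment can be illustrated with a rough calculation. The sketch below uses only the genome sizes and the approximately 20% HPV read fraction quoted above; the HPV copy numbers per human genome are hypothetical values chosen for illustration, not measurements from HH015, and the calculation assumes that reads would otherwise be sampled in proportion to DNA mass.

```python
# Rough estimate of the HPV read fraction expected WITHOUT enrichment,
# compared with the ~20% observed after RCA.
HPV_GENOME_BP = 8_000             # ~8 kb, from the text
HUMAN_GENOME_BP = 3_000_000_000   # ca. 3000 Mb, from the text
OBSERVED_HPV_FRACTION = 0.20      # ~20% of reads, from the text


def expected_hpv_fraction(copies_per_human_genome: float) -> float:
    """Fraction of total DNA (by base pairs) that is HPV, assuming reads
    are sampled in proportion to DNA mass and no enrichment is applied."""
    hpv_bp = copies_per_human_genome * HPV_GENOME_BP
    return hpv_bp / (hpv_bp + HUMAN_GENOME_BP)


def fold_enrichment(copies_per_human_genome: float) -> float:
    """Observed HPV read fraction divided by the unenriched expectation."""
    return OBSERVED_HPV_FRACTION / expected_hpv_fraction(copies_per_human_genome)


if __name__ == "__main__":
    # Hypothetical copy numbers, for illustration only.
    for copies in (1, 100, 1000):
        print(copies, f"{expected_hpv_fraction(copies):.2e}",
              f"{fold_enrichment(copies):.0f}-fold")
```

Even under the generous hypothetical assumption of 1000 HPV copies per human genome, the unenriched HPV fraction would remain well below 1% of the DNA, which is consistent with the interpretation that RCA substantially enriched the viral sequences.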
As the HH015 sample contains a mixture of HPV types, reference mapping may be problematic. Short reads sequenced from one type may map to regions of high identity in the genomes of other types present. This could lead to an under- or over-estimation of a type's abundance or, worse, to a false positive, and would depend particularly on the presence of other highly related types. This is well illustrated by comparing the read counts and coverage obtained when we mapped reads to HPV genomes individually and simultaneously (Table 2). Visual inspection of the read coverage for different regions within the genomes showed unequal read counts. This has, however, been observed in many genome sequencing projects, where read coverage is known to be influenced by a number of factors. To overcome these problems, we performed stringent mappings to a highly variable sub-genomic region, the LCR. A consistent read count was obtained for each type whether the mapping was performed individually against a particular HPV type or simultaneously against all types (Table 3). This allowed a greater degree of confidence in identifying the less abundant HPV types. Further support came from our finding that the percentage of the genomes or LCRs sequenced did not differ significantly when reads were mapped individually or simultaneously. The relative coverage obtained for the HPV types should reflect the relative viral loads of each type in the specimen, assuming that RCA of the different types was equally efficient and free of amplification bias. Based on the mapping of sequence to the LCR, the type with the highest copy number was HPV-39, followed by HPV types 16, 40, 56, 74, 30, 71, 70, 35, 45, 59, 90, 55, 86 and 81, with HPV-53 having the lowest copy number.
Roche LA testing of DNA extracted from HH015 identified 12 HPV types (16, 39, 40, 45, 52, 53, 55, 59, 70, 71, 81 and 84). Illumina sequencing could reliably detect 16 HPV types (39, 16, 40, 56, 74, 30, 71, 70, 35, 45, 59, 90, 55, 86, 81, 53), based on de novo assembly and reference mapping to HPV genomes and LCR sequences. Both Illumina sequencing and Roche LA therefore detected HPV types 16, 39, 40, 45, 53, 55, 59, 70, 71 and 81. LA detected HPV-84 and -52, which were not detected by Illumina sequencing. Illumina sequencing identified an additional 6 types not detected by LA: HPV types 30, 35, 56, 74, 86 and 90. The HR types 35 and 56 are included in the LA, whereas HPV types 30, 74, 86 and 90 are not.
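The comparison between the two methods amounts to simple set operations on the type lists given above. The following is a minimal sketch of that bookkeeping, with the lists transcribed from the text; the variable names are illustrative and the code is not part of the original analysis pipeline.

```python
# HPV types reported for specimen HH015, transcribed from the text.
la_types = {16, 39, 40, 45, 52, 53, 55, 59, 70, 71, 81, 84}
illumina_types = {39, 16, 40, 56, 74, 30, 71, 70, 35, 45, 59, 90, 55, 86, 81, 53}

# Types stated in the text as NOT being included in the LA.
not_in_la_assay = {30, 74, 86, 90}

# Types detected by both methods.
shared = sorted(la_types & illumina_types)

# Types reported by one method only.
la_only = sorted(la_types - illumina_types)
illumina_only = sorted(illumina_types - la_types)

# Types the LA could in principle have detected but did not.
missed_by_la = sorted(set(illumina_only) - not_in_la_assay)

print("Detected by both:", shared)
print("LA only:", la_only)
print("Illumina only:", illumina_only)
print("Included in LA but not detected by it:", missed_by_la)
```

Separating the types that are absent from the LA from those that are included but were not detected distinguishes a limitation of assay design from a possible limitation of assay sensitivity.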
Illumina sequencing covered 88.6% of the complete HPV-35 genome and approximately 85% of the HPV-35 LCR, with 99.6% identity to the reference sequence. No reads mapping to HPV-52 were identified. In the LA, the probe for HPV-52 can cross-react with HPV-35, -33 and -58. A separate probe for HPV-35 is included in the LA, but was negative for HH015. This may be due to a low viral load in the specimen; however, it may also be that HPV-35 was mistyped as HPV-52 in the LA result. In the WHO HPV genotyping global proficiency study, LA testing was found to frequently give false-positive results for HPV-52. As no reads mapped to the HPV-84 LCR, the presence of this type, detected by LA, could not be confirmed by Illumina sequencing. A type-specific PCR using HPV-84-specific primers was also unable to detect HPV-84 in specimen HH015 (results not shown). HPV-84 may therefore have been mistyped in the Roche LA result.
The HR HPV-56 was identified by Illumina sequencing as one of the dominant HPV types in HH015. The complete genome was assembled at a coverage of 93.9 (Table 1) and mapped at a coverage of 97.4. This type was not detected by LA, probably because the limit of detection for HPV-56 in the LA is very high. Eklund et al. report that this type and HPV-52 are frequently undetected by many HPV genotyping assays, including LA, and that their prevalence is probably underestimated in epidemiological studies. This is especially so relative to the prevalences of HPV-16 and -18, for which most assays have a significantly lower detection limit.
Illumina sequencing identified several HPV types in HH015 that are not included in the LA (HPV types 30, 74, 86 and 90). HPV-30 has been classified as possibly carcinogenic. Although the remaining types are not classified as HR oncogenic types, we determined the frequency of all four types in our study population to establish whether they were common. HPV-30 (14.6%) and HPV-74 (12.8%) were found to be the third and fourth most common low-risk types in our cohort, after HPV-62 (23.9%) and HPV-70 (15.6%). The prevalence of HPV-86 and -90 was 4.6% and 8.3%, respectively. Although our study population was small (n = 109), the high prevalence of HPV-30 and -74 may warrant their inclusion in future HPV genotyping studies.
Inclusion of HPV types 30, 74, 86 and 90 in our previously reported HPV prevalence data for this cohort showed that only 9.2% of the women had no HPV (10/109), 18.3% had a single HPV infection (20/109) and 72.5% had multiple infections (79/109). The high prevalence of multiple HPV infections in this cohort (72.5%) and the small sample size limited our ability to assess the impact of the individual HPV types 30, 74, 86 and 90 on the cervical cytology results.
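The infection-status percentages follow directly from the counts quoted above. As a small arithmetic check, the sketch below recomputes them from those counts; the category labels and variable names are illustrative only.

```python
# Counts of women by HPV infection status, as quoted in the text.
COHORT_SIZE = 109
counts = {
    "no HPV": 10,
    "single infection": 20,
    "multiple infection": 79,
}

# The three categories should account for every woman in the cohort.
assert sum(counts.values()) == COHORT_SIZE

# Percentage of the cohort in each category, rounded to one decimal place.
percentages = {
    status: round(100 * n / COHORT_SIZE, 1) for status, n in counts.items()
}

for status, pct in percentages.items():
    print(f"{status}: {counts[status]}/{COHORT_SIZE} = {pct}%")
```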