Patients and clinical history
People included in the study give written consent to collect their data and biological samples. Patient 1 (Pt.1): 52-years-old Caucasian man, HIV infected for 35 years (CDC stage A1). Coinfected with HCV (genotype GT 1b), METAVIR F0, in stable antiretroviral therapy with raltegravir, abacavir and lamivudine, with suppressed plasma viral load and high-level of CD4 (stably > 37%, 650 cells/mL) and naïve to anti-HCV therapy. Patient 2 (Pt.2): 55-years-old Caucasian man, diabetic and HIV infected for 23 years (CDC stage B1). Coinfected with HCV (GT 3a), METAVIR F2, relapse to pegylated interferon plus ribavirin; in stable antiretroviral therapy with darunavir, cobicistat and lamivudine (with suppressed plasma viral load (< 39 cp/mL) and high-level of CD4 (stably > 30%, 600 cells/mL). Patient 3 (Pt.3): 74-years-old Caucasian man, infected with HCV (GT 2c), METAVIR F0, naïve to anti-HCV therapy. Patient 4 (Pt.4): 51-years-old Caucasian man, diabetic and HIV infected for 33 years (CDC stage B2). Coinfected with HCV (GT 4d), METAVIR F4 (cirrhosis CTP A5, MELD 9). Regarding antiretroviral therapy during first-round, the patient was treated with darunavir/ritonavir plus dolutegravir and for drug interactions shift to dolutegravir/rilpivirine plus emtricitabine/tenofovir alafenamide with undetectable HIV RNA (< 39 cp/mL) and high-level of CD4 (stably > 16%, > 400 cells/mL).
HCV-RNA extraction and amplification
HCV-RNA from collected plasma samples was automatically extracted using QIAamp Viral RNA Mini Kit on QIACube system (QIAGEN, Hilden, Germany) following manufacturers’ instructions. Reverse transcription and subsequent amplification were performed using a pan-genotypic commercially available kit, DeepChek® RT-PCR and Sequencing Kit (ABL SA, Luxembourg, Luxembourg), which led to the amplification of all three DAAs-HCV targeted genes (NS3, NS5A and NS5B). In detail, after a one-step reverse-transcription the amplification for each gene followed a specific and singular nested-PCR protocol, which led to the production of one amplicon of 650 bp for NS3 and of 707 bp for NS5A, and two amplicons for NS5B: a fragment of 693 bp at the 3′ portion of the gene and a fragment of 749 bp at the 5′ portion of the gene. PCR products were analyzed by electrophoresis on 1% agarose gel and size-specific expected bands were purified using QIAquick GEL Extraction Kit (QIAGEN, Hilden, Germany).
HCV sanger sequencing
Direct sequencing of NS3, NS5A and NS5B was performed by Sanger method on an automatic sequencer ABI PRISM 3100 genetic analyzer DNA Sequencer (Applied Biosystem, Foster City, CA, USA). The samples were prepared using BigDye Terminator v.3.1 cycle sequencing kit (Applied Biosystem, Foster City, CA, USA) followed by a purification step with BigDye XTerminator™ Purification Kit (Applied Biosystem, Foster City, CA, USA). The generated nucleotide sequences were analyzed with SeqScape® Software (ThermoFisher Scientific, Waltham, MA, USA) set at > 25% height of the second peak for IUB assignment and aligned with reference sequence: M58335 for 1b, D50409 for 2c, D17763 for 3a and DQ418786 for 4d (GenBank accession numbers). RASs identification was based on Geno2Pheno algorithm [16].
HCV next generation sequencing
The same amplicons generated on NS3, NS5A and NS5B, used for Sanger analysis, were sequenced on Illumina MiSeq NGS platform (Illumina, San Diego, CA, USA). Amplicon purification and quantification were performed by Agencourt AMPure XP (Beckman Coulter, Villepinte, France) and Qubit dsDNA Assay Kit (ThermoFisher Scientific, Waltham, MA, USA), respectively. Amplicons from the same patient were pooled and diluted to a final concertation of 0.2 ng/μl and pools were subsequently used for NGS library generation which was performed using Nextera XT DNA Library Prep Kit (Illumina, San Diego, CA, USA). The library generated was then diluted and sequenced with MiSeq Reagent Kit v3 (600-cycles) (Illumina, San Diego, CA, USA) on MiSeq platform. Before starting our sequencing experiment, we used the online ‘Sequencing Coverage Calculator’ (Illumina, San Diego, CA, USA) [17] to estimate the sequencing conditions that allow achieving a depth of coverage (100,000X), where coverage is the number of reads that align to a known reference sequence. This is an essential information because multiple observations per base are needed to obtain a reliable call and, in addition, coverage is also used as a unit for the statistical power of sequencing data [18].
NGS data analysis
RAS detection and frequency analysis were performed in parallel by using three different approaches: a commercially available certified user-friendly software and two homemade in silico pipeline, one based on the online informatic tool provided by Geno2Pheno (geno2pheno(ngs-freq)) and one based on VarScan (Fig. 1).
For the first two approaches, the quality of raw sequences obtained from MiSeq run was first checked using FastQC (v 0.11.5) (Babraham Bioinformatics), and it was increased using FLASH (v 1.2.11) (Johns Hopkins University Center for Computational Biology), which generated high-quality reads by overlapping paired-end reads using a minimum overlap between R1 and R2 of 20 bp with a maximum of 10% difference.
The approach based on the certified DeepChek®-HCV v2.0 analysis software [19] allowed inferring simultaneously four different HCV databases (Geno2Pheno [16], International Antiviral Society [20], Lontok et al. Hepatology 2015 [21] and Sorbo [22]) and permitted the analysis of minority HCV variant by setting arbitrary thresholds. For our analysis, 1, 5 and 10% thresholds were chosen.
In the second approach, the analysis included the use of fastqToFreq [23] or Burrows-Wheeler Aligner (BWA mem) [24] for the generation of aligned sequences converted as frequencies files, comma-separated values (CSV), that underwent Geno2pheno (ngs-freq) analysis (threshold settings: 1, 5 and 10%) [25].
For the third approach, VarScan (University of Washington in St. Louis) was used for variants identification. The reads were first trimmed and cropped using Trimmomatic [26] and the quality of the sequences was checked using FastQC. The resulting reads were further corrected using SPAdes software (Center for Algorithmic Biotechnology St. Petersburg State University), which correct errors based on specific parameters (reads corrected by BayesHammer) and, successively, reads were aligned with the reference sequence (the same as for Sanger analysis) by BWA mem. Mutations and frequencies were identified by VarScan, which generated variant call format (VCF) files, one for each sample, that were used to generate FastA sequences using Genome Analysis Toolkit (GATK) (Broad Institute). These FastA sequences were finally uploaded and analyzed by Geno2Pheno algorithm [16].