Prediction of T-cell epitopes of hepatitis C virus genotype 5a

Background Hepatitis C virus (HCV) is a public health problem, with almost 185 million people estimated to be infected worldwide, and is one of the leading causes of hepatocellular carcinoma. Currently, there is no vaccine for HCV infection, and the current treatment does not clear the infection in all patients. Because of the high diversity of HCV, protective vaccines will have to overcome significant viral antigenic diversity. The objective of this study was to predict T-cell epitopes from HCV genotype 5a sequences.
Methods HCV near full-length protein sequences were analysed to predict T-cell epitopes that bind human leukocyte antigen (HLA) class I and HLA class II in HCV genotype 5a using ProPred I and ProPred, respectively. The antigenicity scores of all the predicted epitopes were analysed using VaxiJen v2.0. All antigenic predicted epitopes were analysed for conservation using the IEDB database in comparison with 406, 221, 98, 33, 45 and 45 randomly selected sequences from HCV genotypes 1a, 1b, 2, 3, 4 and 6, respectively, downloaded from GenBank. The IEDB epitope analysis tool was used to predict epitope binding to HLA alleles common in South Africa.
Results A total of 24 and 77 antigenic epitopes that bind HLA class I and HLA class II, respectively, were predicted. The highest number of HLA class I binding epitopes was predicted within NS3 (63%), followed by NS5B (21%). For HLA class II, the highest number of epitopes was predicted in NS3 (30%), followed by NS4B (23%). In the conservation analysis, 8 and 31 predicted epitopes were conserved in different genotypes for HLA class I and HLA class II alleles, respectively. Several epitopes bound with high affinity to both HLA class I and HLA class II alleles common in South Africa.
Conclusion The predicted conserved T-cell epitopes analysed in this study will contribute towards the future design of HCV vaccine candidates that avoid variation among genotypes and are therefore capable of inducing broad HCV-specific immune responses.


Introduction
Hepatitis C virus (HCV) is estimated to infect 185 million people worldwide [1]. Chronic HCV infection leads to progressive liver disease, being one of the major causes of hepatocellular carcinoma and one of the most common indications for liver transplantation [2]. The World Health Organization (WHO) strongly recommends combination therapy with pegylated interferon and ribavirin for chronically infected patients who qualify for treatment [1]. Recently, two NS3 protease inhibitors (boceprevir and telaprevir) have been approved by the US Food and Drug Administration, and the WHO conditionally (until more evidence has accumulated) recommends that these drugs should be given in combination with pegylated interferon and ribavirin for the treatment of chronic HCV genotype 1 infections. Also, the WHO strongly recommends that sofosbuvir be given in combination with ribavirin alone in patients who cannot tolerate interferon and are chronically infected with genotypes 1, 2, 3 and 4 [1]. However, these therapies are still not affordable in most developing countries.
As a result, the development of an effective HCV vaccine is undoubtedly the best solution for the ultimate control of HCV infections, and is a public health priority.
Prophylactic vaccines against viral infections are generally aimed at inducing a humoral (B-cell) immune response, while therapeutic vaccines preferably activate both humoral and cellular (T-cell) immune responses [3]. A successful HCV vaccine will need to stimulate both arms of the adaptive immune response: although both cellular and humoral immune responses occur in a naturally infected host, the current consensus is that a strong cellular response is vital for viral clearance and protection [4].
The development of an effective HCV vaccine requires an understanding of the host's adaptive immune response to natural infection. As with other viral infections, viral antigens are presented to CD4+ and CD8+ T-cells via human leukocyte antigen (HLA) class II and class I molecules, respectively [5]. Different HLA alleles have been found to be associated with the outcome of HCV infection. For example, HLA-A*11, HLA-Cw*04 and HLA-B*53 have been associated with HCV persistence [6,7], while HLA-B*27, HLA-A*11:01, HLA-B*57, HLA-Cw*01:02 and HLA-A*03 have been associated with spontaneous HCV clearance [8,9]. Also, HLA-DRB1*11 and HLA-DQB1*03:01 have been associated with decreased disease severity of HCV infection globally, suggesting that they may present HCV-derived epitopes to CD4+ T-cells more efficiently than other alleles and thus promote viral clearance [10]. During acute HCV infection, the development and persistence of strong specific responses by CD8+ and CD4+ T-cells [11,12] and neutralizing antibodies [13] are associated with viral clearance, whereas HCV-specific CD8+ and CD4+ T-cell responses are usually transient or absent in patients who develop persistent infections. The rate of chronic liver disease progression has been shown to be determined by the magnitude of HCV-specific CD4+ T-cell responses, since these cells are essential for both the cellular and humoral responses [14]. CD8+ T-cells are essential for long-term protection against chronic HCV [15], while CD4+ T-cells play a role in viral clearance [16].
HCV evades the host's immune system by generating immune escape variants: alterations in the virus's HLA-restricted epitopes prevent recognition by T-cells and neutralizing antibodies [14]. Thus, effective HCV vaccines will need to target protective epitopes that display minimal cross-genotype amino acid variability, as these will provide broad potency [17]. Peptides corresponding to protective epitopes are desirable vaccine candidates because they are easy to construct and produce, and they do not contain infectious material [18]. The first step in the process of epitope-based vaccine design and development is the in-silico prediction of peptide binding affinities to HLA proteins [19]. Genotype 5 accounts for over 50% of HCV infections in South Africa [20], and is becoming more prevalent in Europe and North America [21]. It is the most conserved of the HCV genotypes, being classified into only one subtype (5a) [22][23][24]. The growing prevalence of HCV genotype 5a in different parts of the world necessitates its molecular characterization in order to improve the formulations of vaccine candidates that are in development. Thus, the aim of this study was to assess immunological determinants by predicting conserved epitopes in near full-length HCV genotype 5a sequences using a suite of online programmes, to help in the design of new vaccine candidates.

1812 LGGWVASQI 16 0.608 LGGWVAAQL (100) LGGWVAAQL (95) LGGWVATHL (88) LGGWVASQI (
# indicates that the epitope has been experimentally proven to be a true positive. Bold indicates the percentage for an epitope that is 100% conserved in more than 70% of the sequences analysed in each genotype. Italic indicates amino acid variation(s) in the epitope in comparison with the predicted epitope.

Validation of epitopes
Comparison with the epitopes in the IEDB resource showed that seven of the predicted epitopes had previously been confirmed experimentally as true positives by other studies. The majority of the epitopes predicted in this study have not previously been tested experimentally. The 'true epitopes' are marked with (#) in Tables 1, 2 and 4.

Discussion
Several studies that have published HCV epitopes focused mainly on genotype 1 [41,42], and most of these studies do not take into account the diversity of the other genotypes that are common in developing countries, including most African countries. In the present study, antigenic epitopes of HCV genotype 5a proteins from South Africa were predicted and then analysed for conservation against randomly selected genotype 1-6 reference sequences from GenBank. Several studies have confirmed the value of immunoinformatics tools as predictors of HLA ligands, T-cell epitopes and immunogenicity [43]. As a result, several immunoinformatics methods have been developed to assist in the identification of HLA binding peptides [44,45].
For this analysis, near full-length sequences covering all HCV proteins, with the exclusion of the 3′ end of NS5B, were included to maximize the number of epitopes predicted. The use of the whole viral genome for developing epitope vaccines has the potential to control the immune response and eliminate side effects [43], and it also increases the chance of detecting a virus at any developmental stage [46]. It has been shown that multiple epitopes from different parts of the HCV genome are important to produce a vaccine that can elicit strong humoral immune responses and multiple specific cellular immune responses [47]. A polyepitope-based strategy with multiple components, combining core, E1 and E2 proteins with conserved T-cell epitopes in the NS proteins, has been suggested to be a good vaccine candidate for HCV [48]. A higher number of epitopes was predicted for HLA class II than for class I. The findings of this study are consistent with a study by Shehzadi et al. that predicted epitopes in genotype 3 from Pakistan. That study showed that the majority of predicted epitopes were found in the NS3 protein for both HLA class I and HLA class II alleles, and that most of the epitopes were conserved among different genotypes [49]. Although the NS3 region is one of the conserved regions in HCV, variability in its nucleotide and amino acid sequences has been reported by several studies, both within the same genotype and between different genotypes [50,51]. A recent study that analysed 1568 NS3 protease sequences from genotypes 1-6 reported that the protease amino acid sequence was moderately conserved and that the majority of the amino acids clustered in small regions. Of the 181 amino acid positions analysed, 47% showed <1% variability among all HCV genotypes, and 17.1% showed >25.1% variability [51].
Table 3 Distribution of genotype 5a HLA class I and II predicted epitopes in each of the HCV gene
NS3 is considered to be a good cellular target candidate for a therapeutic vaccine [52], since the majority of the HCV epitopes recognized by CD8+ and CD4+ T-cells are located in the NS3 region [53][54][55][56]. NS3-specific CD4+ and CD8+ T-cell responses have been reported in patients who responded to interferon therapy [57] and in spontaneous clearance of HCV [58].
Most of the predicted epitopes in the study sequences were found to be conserved across different HCV genotypes, with a higher number of epitopes conserved at the anchor residues. The anchor residues are important for high-affinity binding of an epitope to HLA [59]. Epitope conservation might influence immunogenic potential, since mutations within epitopes can increase the chance of immune escape [60]. For a vaccine to be effective globally, the selected epitopes must cover the HLAs of different populations and must also be conserved among different genotypes. The high mutation rates of viral epitopes and HLA polymorphism are among the challenges associated with the development of peptide vaccines [61]. Successful epitope vaccine design requires a broad knowledge of HCV genotype diversity, which will help in the proper selection of conserved HCV-specific T-cell epitopes and thereby help to avoid HCV immune evasion [62]. This study attempted to ensure maximal coverage of HLA polymorphism and of different genotypes by analysing conserved epitopes against different HLA alleles. The majority of the epitopes predicted from the proteins of South African HCV genotype 5a isolates were good binders to HLA alleles that are found worldwide. HLA is both polygenic and polymorphic, and the pool of HLA molecules differs for every individual. Different HLA alleles bind peptides with a particular sequence pattern [63]. For an HLA allele to be covered by a set of epitopes, at least one of the epitopes should be capable of inducing an immune response when bound to the corresponding HLA molecule [46]. The epitopes predicted in this study bind to many HLA alleles, including those common in South Africa, and can be used for designing good vaccine candidates that will eventually work in genetically diverse populations. In-vitro and in-silico studies have shown that HLA alleles preferentially bind to conserved regions of viral proteins in human viruses [64].
Very few of the predicted epitopes had been experimentally confirmed as true positives; this may be because most previous studies focused on genotype 1. A limitation of the study was the lack of in-vivo and in-vitro studies to confirm the predicted immunogenic epitopes, which will be the focus of future studies. However, in-silico studies still provide the basis for designing good vaccine candidates.
In conclusion, this study identified antigenic T-cell epitopes, derived from genotype 5a sequences, that are conserved among genotypes, are good HLA binders and can be good candidates for vaccine development. The predicted epitopes analysed in this study will contribute to the future design of an efficient vaccine that uses conserved epitopes to avoid variation among genotypes and is therefore able to induce broad HCV-specific immune responses. Epitopes conserved among different genotypes will be tested experimentally in the future to determine their involvement in the immune response.

Ethical statement
The study was approved by the Medunsa Research and Ethics Committee (MREC) of the Faculty of Health Sciences at the University of Limpopo as project no. MREC/p/142/2009:PG. The MREC is registered as an Independent Review Board with reference no. IRB00005122.

Prediction of T-cell epitopes
Genotype 5a full-length sequences available in GenBank and 6 near full-length sequences generated in a previous study conducted by our group [24] were aligned, and consensus sequences were created using BioEdit [65] for the prediction of T-cell epitopes. For HLA class I, prediction of binding alleles was performed using ProPred I (http://www.imtech.res.in/raghava/propred1/) at the default 4% threshold, with the proteasome and immunoproteasome filters kept on at a 5% threshold. ProPred I predicts antigenic epitopes for 47 HLA class I alleles [44]. For HLA class II, prediction was performed using ProPred (http://www.imtech.res.in/raghava/propred/) at the default 3% threshold. ProPred predicts antigenic epitopes for 51 HLA class II alleles [45].
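The first steps of this workflow (building a consensus from aligned sequences and listing candidate 9-residue peptides) can be illustrated with a minimal Python sketch. It is not the BioEdit or ProPred implementation: the majority-rule consensus, the function names `consensus` and `nonamers`, and the toy sequences are assumptions made for illustration, and the matrix-based HLA binding scoring performed by ProPred I and ProPred is not reproduced.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Gap characters ('-') are ignored; a column made up only of gaps is dropped.
    Ties are resolved in favour of the residue seen first in the column.
    This is an illustrative rule, not necessarily the one BioEdit applies.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    residues = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        if counts:
            residues.append(counts.most_common(1)[0][0])
    return "".join(residues)


def nonamers(sequence, k=9):
    """Return (1-based start position, peptide) for every k-residue window."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Toy alignment (hypothetical, not study data).
toy_alignment = ["LGGWVASQIA", "LGGWVAAQLA", "LGGWVASQIA"]
toy_consensus = consensus(toy_alignment)
candidate_peptides = nonamers(toy_consensus)
```

Each candidate window would then be scored against allele-specific matrices by the prediction servers; only the windowing step is shown here.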

Antigenicity of the epitopes
The antigenicity scores of all the predicted epitopes were analysed using the VaxiJen v2.0 online antigen prediction server (www.ddg-pharmfac.net/vaxijen/). Epitopes with an antigenic score >0.5 were selected as antigenic. The VaxiJen server performs well for viruses, with 87% accuracy at an antigenic score threshold of 0.5. VaxiJen v2.0 allows antigen classification based on the physicochemical properties of proteins without recourse to sequence alignment.
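The selection rule (score >0.5) amounts to a simple filter over the scores returned by the server. The sketch below assumes the scores have already been collected into a dictionary by hand; the function name `select_antigenic` is hypothetical, the value 0.608 for LGGWVASQI is assumed to be the VaxiJen score reported in the table row for that epitope, and the second peptide and its score are invented for illustration.

```python
def select_antigenic(scores, threshold=0.5):
    """Keep epitopes whose antigenic score is strictly above the threshold."""
    return {peptide: score for peptide, score in scores.items() if score > threshold}


# Example input: one value assumed from the table row, one invented.
example_scores = {
    "LGGWVASQI": 0.608,   # assumed to be the VaxiJen score from the table row
    "AAAAAAAAA": 0.412,   # hypothetical peptide and score
}
antigenic = select_antigenic(example_scores)
```

A strict inequality is used so that a score of exactly 0.5 is not selected, matching the ">0.5" wording above.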

Epitope conservation analysis
All predicted epitopes were analysed for conservation using the IEDB conservancy tool (http://tools.immuneepitope.org/tools/conservancy/iedb_input) at a threshold of 100% conservation, in comparison with 406, 221, 98, 33, 45 and 45 randomly selected sequences from HCV genotypes 1a, 1b, 2, 3, 4 and 6, respectively, downloaded from the public database. An epitope was considered conserved in another genotype if it showed 100% identity across the whole epitope in at least 70% of the sequences selected for that genotype. In addition, epitope variants that were conserved in at least 70% of the sequences were analysed for conservancy of the anchor residues at positions 2 and 9 for HLA class I and positions 1, 4, 6 and 9 for HLA class II.
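The two criteria in this section (100% identity in at least 70% of a genotype's sequences, and identity at the anchor positions) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the IEDB tool itself: exact substring matching stands in for the 100% identity setting, and the function names, the toy sequences and the variant LGGWVAAQI are hypothetical.

```python
CLASS_I_ANCHORS = (2, 9)         # 1-based anchor positions, HLA class I
CLASS_II_ANCHORS = (1, 4, 6, 9)  # 1-based anchor positions, HLA class II


def conserved_fraction(epitope, sequences):
    """Fraction of sequences containing the epitope with 100% identity."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(1 for seq in sequences if epitope in seq)
    return hits / len(sequences)


def is_conserved(epitope, sequences, min_fraction=0.70):
    """True if the epitope is fully identical in at least min_fraction of sequences."""
    return conserved_fraction(epitope, sequences) >= min_fraction


def anchors_conserved(predicted, variant, positions):
    """True if the variant matches the predicted epitope at every anchor position."""
    if len(predicted) != len(variant):
        raise ValueError("epitope and variant must have the same length")
    return all(predicted[p - 1] == variant[p - 1] for p in positions)


# Toy genotype set (hypothetical): 3 of 4 sequences carry the exact epitope.
toy_genotype = [
    "AAALGGWVASQIAAA",
    "AAALGGWVAAQLAAA",
    "LGGWVASQI",
    "TTLGGWVASQITT",
]
fraction = conserved_fraction("LGGWVASQI", toy_genotype)
```

With these helpers, a variant such as LGGWVAAQL differs from LGGWVASQI at position 9 and therefore fails the class I anchor check, while a variant that changes only position 7 passes it.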

Validation of predicted epitopes
All the predicted epitopes were submitted to the IEDB database (http://www.immuneepitope.org/) to determine whether they had previously been tested by other studies. The Immune Epitope Database contains experimentally confirmed data on antibody and T-cell epitopes, HLA binding, HLA restriction and HLA class.