The most prominent features of occult hepatitis B infection (OBI) are the absence of detectable HBsAg and low-level viremia. Although the mechanisms underlying OBI remain to be clarified, both features might be due to low-level HBV replication and expression, and it can be hypothesized that, at least in some cases, they are caused by point mutations in regulatory elements that control viral replication and expression. However, data from the present study do not support this hypothesis: none of the nucleotide variations significantly enriched in the OBI dataset mapped to known viral regulatory regions. Thus, the only possible contribution of these variations to OBI appears to be through amino acid changes in HBV proteins.
Mechanisms that could explain low-level viremia might involve steps of the replicative cycle such as assembly, budding and entry of viral particles, and/or the efficiency with which viral particles are removed by the immune system; indeed, a low viral load might also result from low-level replication in the presence of incomplete immune control. Mutations in viral proteins might affect any of these processes; however, a mechanistic link with the occult state is not obvious for all of the significant amino acid changes identified.
The identified G to A change of nucleotide 78 produces a Gly to Glu amino acid change in the Pre-S2 region of the Surface ORF, thus replacing a neutral residue with a negatively charged one. Both the Large and the Middle S proteins include the amino acids encoded by the Pre-S2 region, thus both proteins are affected by the observed change. The Large S, due to alternative folding, exists in the form of two alternative structures, one of them mediating virion assembly and the other involved in virion binding to the cellular receptor. According to the current model of Large S protein structure, it can be speculated that the Gly to Glu change in the Pre-S2 region might affect protein folding and, thus, one or both processes.
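The charge argument above can be illustrated with a small sketch. The text does not state which glycine codon is present at this position, so the codons below are hypothetical inputs, and `CODON_TABLE`, `CHARGE`, `substitute` and `describe_change` are illustrative names rather than part of the study's methods. The sketch only shows that a G to A change at the second codon position turns a glycine codon into a codon for a negatively charged residue.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TABLE = {
    "GGA": "Gly", "GGG": "Gly", "GGC": "Gly", "GGT": "Gly",
    "GAA": "Glu", "GAG": "Glu", "GAC": "Asp", "GAT": "Asp",
}

# Net side-chain charge at physiological pH for the residues above.
CHARGE = {"Gly": 0, "Glu": -1, "Asp": -1}


def substitute(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def describe_change(codon, position, new_base):
    """Return (residue before, residue after, change in side-chain charge)."""
    mutant = substitute(codon, position, new_base)
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    return before, after, CHARGE[after] - CHARGE[before]


# Hypothetical example: the actual codon in the Pre-S2 region is an assumption.
print(describe_change("GGA", 1, "A"))
```

Of the four glycine codons, only GGA and GGG yield Glu after a G to A change at the second position; GGC and GGT would yield Asp instead, which is also negatively charged.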
The nucleotide substitutions at positions 233, 418 and 2485 produce amino acid changes in the polymerase, and they might directly influence the replication efficiency of the HBV genome by affecting polymerase activity. In addition, the 233 substitution also affects the Small S ORF, which is shared by the Large S, Middle S and Small S proteins, producing a Thr (hydrophilic) to Ala (hydrophobic) amino acid change. Thus, the 233 substitution simultaneously affects four different HBV proteins.
Finally, the AC to GT co-variation of nucleotides 2240 and 2241 and the C to A change of nucleotide 2435 result in amino acid changes in the core ORF. The former affects a T-helper epitope and thus possibly interferes with the immune response; the latter replaces a polar uncharged amino acid with a positively charged one, in a region already rich in positively charged amino acids.
On the whole, the results show that the Polymerase, the Large, Middle and Small S and the Core proteins, but not the X protein, are the targets of the observed significant variations. These findings imply that several different viral proteins may contribute to OBI and suggest that heterogeneous mechanisms are involved in the genesis or maintenance of the occult clinical status.
Overall, in our dataset only 11 out of 41 (26.8%) OBI sequences showed the identified significant amino acid variants. This result is likely due to the small size of the OBI dataset, which affects statistical significance and might have precluded the identification of further relevant positions. Nevertheless, it cannot be excluded that point mutations of the HBV genome play a significant role in only about one third of OBIs, with other factors being involved in the remaining two thirds. Other possible factors include: (a) co-infection with other viruses, such as Hepatitis C Virus (HCV), that might interfere with HBV replication: indeed, OBI has been frequently identified in HCV-infected patients [reviewed in 22]; (b) incomplete immune control by the host; (c) DNA methylation of the HBV genome; (d) large deletions, rather than point mutations, in regulatory elements of the HBV genome, as reported in genotype C OBI patients.
Due to the small size of the OBI dataset, we could not carry out intra-genotype univariable and multivariable analyses; thus, our data cannot exclude the existence of genotype-specific variations associated with OBI.
Although the frequency of Stop codons in the S ORF was higher in OBI than in non-OBI sequences, the difference was not statistically significant. It appears that the absence of circulating HBsAg due to mutations introducing a Stop codon in the Small S is not a general mechanism responsible for OBI, although it could be involved in a fraction of cases. Further studies with larger datasets are needed to better address this point.
An escape mutation mechanism has been associated with genotype D OBI on the basis of a high substitution rate in the S protein, particularly in the MHR of anti-HBs positive OBIs. Analysis of entropy plots of genotype D sequences from our OBI and non-OBI datasets showed higher variability in OBIs (Figure 3A): the most prominent region of amino acid variation corresponds to amino acids 110-140, a region spanning the "a" determinant of the HBV S protein. The "a" determinant is located within the immunodominant loops and is included in the MHR (residues 100-151). It is exposed on the surface of the HBV particle and is a highly immunogenic region; it is the primary target of neutralizing antibodies and is common to all HBV genotypes. However, it remains to be investigated whether in genotype D OBIs such an escape mechanism is the main factor or one of several factors responsible for OBI, and whether this mechanism also plays a role in the other genotypes.
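As an illustration of how a per-position variability profile of this kind can be computed, the following is a minimal sketch that assumes the entropy plots are based on Shannon entropy of each alignment column; the text does not specify the exact formula, logarithm base or software used. The function names and the toy alignment are hypothetical and do not come from the study's datasets.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # sum of p * log2(1/p) over the residues observed in this column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(sequences):
    """Per-position entropy for a list of aligned, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*sequences)]


# Toy aligned amino acid sequences (hypothetical, for illustration only).
toy_alignment = ["MTK", "MSK", "MTR", "MAK"]
profile = entropy_profile(toy_alignment)
print(profile)
```

A fully conserved column has entropy 0, and the value grows as more residues occur at similar frequencies, so comparing the profiles of two groups of sequences highlights the regions where one group is more variable.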
Prediction models are being developed as promising tools to help clinicians with diagnosis and patient management. In the present study, we evaluated the feasibility of using bioinformatics prediction models to classify HBV infections into OBI and non-OBI on the basis of molecular data. The performance of the models was evaluated by accuracy, AUC, TNR and TPR, four parameters measuring the predictive properties of the test. Although the overall prediction performance showed high accuracy and AUC, it may not be satisfactory in terms of specificity (TNR) and sensitivity (TPR) for an in silico genotype-based phenotype prediction model (Table 3). A more complex input encoding, considering base triplets instead of single base positions, produced similar results. The unsatisfactory performance was likely due to the small sample size of OBIs and to the highly unbalanced (4:1) dataset, skewed towards non-OBI. However, the results are promising in view of a broader collection of OBI sequences and indicate that it is feasible to derive prediction models that classify HBV infections into OBI and non-OBI on the basis of molecular data.
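The effect of class imbalance on these parameters can be illustrated with a minimal sketch. All counts below are hypothetical and are not the study's results; the sketch assumes OBI is treated as the positive class (label 1), which the text does not state, and `classification_metrics` is an illustrative helper. It shows how a 4:1 dataset can yield high accuracy while the true positive rate remains poor.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, TPR (sensitivity) and TNR (specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
    }


# Hypothetical 4:1 dataset: 80 non-OBI (0) and 20 OBI (1) sequences.
y_true = [0] * 80 + [1] * 20
# Hypothetical classifier: every non-OBI correct, only 5 of 20 OBI detected.
y_pred = [0] * 80 + [1] * 5 + [0] * 15

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is driven almost entirely by the majority class, which is why reporting TPR and TNR alongside accuracy and AUC is informative for unbalanced datasets.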