Characteristics of HIV-1 molecular transmission networks and drug resistance among men who have sex with men in Tianjin, China (2014–2018)

Background: In Tianjin, China, the prevalence of HIV among men who have sex with men (MSM) is relatively high, and the number of HIV cases is increasing. We investigated the HIV molecular transmission network, genetic tropism, and drug resistance mutations in Tianjin.
Methods: Blood samples were collected from 510 newly diagnosed, antiretroviral therapy (ART)-naïve, HIV-1-infected MSM in Tianjin. Partial pol and env genes were sequenced and used for phylogenetic, genetic tropism, and genotypic drug resistance analyses. Molecular clusters were identified with 1.5% genetic distance and 90% bootstrap support.
Results: Among the 436 HIV-1 pol sequences obtained from the study participants, various genotypes were identified, including CRF01_AE (56.9%), CRF07_BC (27.8%), B (7.3%), CRF55_01B (4.1%), unique recombinant forms (URFs) (3.7%), and CRF59_01B (0.2%). A higher prevalence of X4 viruses was observed in individuals infected with CRF55_01B (56.3%) and CRF01_AE (46.2%) than with other subtypes. Of all 110 sequences in the 36 clusters, 62 (56.4%) were observed in 23 CRF01_AE clusters and 18 (16.4%) in four CRF07_BC clusters. Eight clustered sequences shared the same drug resistance mutation (DRM) with at least one other sequence in their cluster. The distributions of individuals by age, presence of sexually transmitted disease, and presence of DRMs differed significantly across cluster sizes.
Conclusion: We revealed the characteristics of HIV molecular transmission, tropism, and DRMs of ART-naïve HIV-infected individuals among the MSM population in Tianjin. Identifying infected persons at risk of onward transmission is necessary so that counseling and treatment can be offered to reduce the risk of HIV transmission.


Introduction
Since the first patient was diagnosed with acquired immunodeficiency syndrome (AIDS) in Beijing in 1985 [1], HIV-1 has evolved rapidly in China over 30 years, with an increasing number of infected individuals and increased genotype complexity [2,3]. Sexual transmission has contributed greatly to the current HIV-1 epidemic in China. According to relevant reports, in 2017, sexual transmission accounted for more than 90% of newly reported HIV/AIDS cases in China, 20% of which were attributed to homosexual transmission among MSM [2][3][4][5][6][7]. Tianjin is one of the four direct-controlled municipalities in China, the gateway to Beijing, and an important trading port; it neighbors Beijing and Hebei and has a total population of over 15 million people. Sexual transmission is the historically predominant route of HIV infection in Tianjin. In recent years, MSM transmission, as opposed to heterosexual transmission, has resulted in a tenfold increase in the total number of infections, particularly among young men in Tianjin [8]. The Tianjin Municipal Health and Family Planning Commission reported that MSM transmission cases accounted for 74.92% of all new diagnoses from January to October 2018, according to statistics from the AIDS prevention and control information management system. Phylogenetic analysis provides insight into the HIV transmission network structure. Typically, clusters are defined by the genetic distance between analyzed sequences and/or statistical support within the inferred tree [3]. Characterizing populations and forming transmission networks allow for targeted interventions to individuals at risk [4].
By identifying common clustering among MSM, we examined the HIV molecular transmission network and transmitted drug resistance mutations to determine the characteristics of the HIV epidemic in Tianjin at the provincial level to guide HIV prevention measures.

Study subjects
In total, 510 cases of newly diagnosed ART-naïve HIV infections were selected from the MSM population in Tianjin according to stratified random sampling after obtaining informed consent. Participants' demographic data were obtained through face-to-face interviews prior to blood collection. Plasma (500 μL) was separated from whole blood within 24 h of collection and was used to determine the HIV-1 nucleotide sequences for subsequent analysis.

HIV-1 RNA extraction, amplification, and sequencing
HIV-1 RNA was extracted from plasma using a QIAamp Viral RNA Mini kit (Qiagen, Hilden, Germany). Partial sequences of pol (HXB2: 2167-3440) and env (HXB2: 7022-7647) were amplified from extracted viral RNA [7,9]. The pol and env fragments were amplified in a one-step reverse transcription polymerase chain reaction (RT-PCR) with primers using a Takara one-step RT-PCR kit (Shiga, Japan). Second-round PCR (nested PCR) was performed with primers using 2 × Taq PCR MasterMix (Takara) to increase the sensitivity and specificity of PCR. The primers and reaction cycling conditions are shown in Table 1. PCR products were identified by 1% agarose gel electrophoresis. Finally, positive products were sent to Anpu Biotechnology Company (Beijing, China) for sequencing.
The obtained sequence fragments in the pol and env regions were edited and assembled using Sequencher 5.0 software (Gene Codes, Ann Arbor, MI, USA). The assembled sequences were aligned and checked manually together with the reference sequences retrieved from the Los Alamos HIV database (www.hiv.lanl.gov) in BioEdit [7,9]. Additionally, to avoid potential errors, the sequences were compared with all known sequences in the HIV Database from Los Alamos National Laboratory, using the online HIV Basic Local Alignment Search Tool, HIV BLAST (https://www.hiv.lanl.gov/content/sequence/BASIC_BLAST/basic_blast.html). The genotype of each patient was determined based on the genotypes of both the pol and env regions. If only the pol region was available, the genotype was determined from that region alone.
The nucleotide sequences of the pol gene, containing the full-length protease gene and the first 299 codons of the reverse transcriptase gene, were submitted to the Stanford HIV Drug Resistance Database. HIV-1-transmitted drug resistance mutations (DRMs) were identified according to the WHO surveillance list for nucleoside reverse transcriptase inhibitors (NRTIs), non-nucleoside reverse transcriptase inhibitors (NNRTIs), and protease inhibitors (PIs) using the current Calibrated Population Resistance tool v5.0 (https://hivdb.stanford.edu/cpr/). The env sequences were analyzed for prediction of viral coreceptor usage based on env V3 loop sequences using the online Geno2pheno algorithm (https://coreceptor.geno2pheno.org) [10], with a false-positive rate cut-off of 10% according to European guidelines [8,11].
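The coreceptor call described above reduces to a threshold on the false-positive rate (FPR) reported by the prediction tool. The following is a minimal sketch of that decision rule only, not of the Geno2pheno algorithm itself. The function names and the FPR values are hypothetical and are not study data, and treating an FPR exactly equal to the cut-off as X4 is an assumption.

```python
# Illustrative only: classify predicted tropism from a Geno2pheno-style FPR.
# The FPR values used below are hypothetical, not study data.

FPR_CUTOFF = 10.0  # percent; the cut-off stated in the text


def predict_tropism(fpr_percent: float, cutoff: float = FPR_CUTOFF) -> str:
    """Return 'X4' when the FPR is at or below the cut-off, otherwise 'R5'.

    Assumption: a boundary value (FPR == cutoff) is called X4.
    """
    if not 0.0 <= fpr_percent <= 100.0:
        raise ValueError("FPR must be a percentage between 0 and 100")
    return "X4" if fpr_percent <= cutoff else "R5"


def x4_fraction(fprs, cutoff: float = FPR_CUTOFF) -> float:
    """Fraction of sequences predicted to be X4-tropic."""
    calls = [predict_tropism(f, cutoff) for f in fprs]
    if not calls:
        raise ValueError("at least one FPR value is required")
    return calls.count("X4") / len(calls)


# Hypothetical FPR values (percent) for four V3 sequences.
hypothetical_fprs = [1.7, 8.9, 25.0, 60.3]
calls = [predict_tropism(f) for f in hypothetical_fprs]
```

A lower cut-off would call fewer sequences X4, so the choice of cut-off directly affects the reported X4 prevalence.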

Phylogenetic and clustering analysis
We constructed the HIV pol genetic transmission network at a 1.5% distance threshold using the Tamura-Nei93 (TN93) nucleotide substitution model in HIV-TRACE (www.hivtrace.org), and all sequences with pairwise distance ≤ 1.5% were identified. Phylogenetic trees of the HIV-1 pol and env sequences, together with reference sequences, were constructed by the neighbor-joining method using the Tamura-Nei model in MEGA X, and branch support was assessed by bootstrap analysis based on 1000 resamplings. Robust pol clusters were identified by combining the genetic distance in HIV-TRACE with bootstrap support values from 1000 replicates in MEGA X, and were checked for consistency with the env clusters [12]. The codons associated with major DRMs defined by Lewis [13] were excluded to avoid the potential impact of convergent evolution [14].
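The distance-threshold step above can be sketched as linking every pair of sequences whose distance is at or below 1.5% and reading off the connected components as putative clusters. This is a minimal illustration, not the HIV-TRACE implementation: it assumes the pairwise distances have already been computed, it does not model the bootstrap filter, and the sequence names and distances are hypothetical.

```python
# Illustrative sketch: single-linkage clustering at a fixed distance threshold.
# Sequence names and distances are hypothetical, not study data.
from collections import defaultdict

DISTANCE_THRESHOLD = 0.015  # 1.5% substitutions per site


def find_clusters(pairwise, threshold=DISTANCE_THRESHOLD):
    """Group sequence IDs into clusters of two or more members.

    `pairwise` maps (id_a, id_b) tuples to a genetic distance.
    Two sequences share a cluster if a chain of links, each with
    distance <= threshold, connects them.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in pairwise.items():
        root_a, root_b = find(a), find(b)
        if dist <= threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Singletons are not clusters; sort for a stable result.
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


example = {
    ("s1", "s2"): 0.004,
    ("s2", "s3"): 0.012,
    ("s1", "s3"): 0.020,  # above threshold, but s1 and s3 are linked via s2
    ("s4", "s5"): 0.015,
    ("s5", "s6"): 0.031,  # s6 stays unclustered
}
clusters = find_clusters(example)
```

Because links are transitive, two sequences can fall in the same cluster even when their own pairwise distance exceeds the threshold, which is one reason an additional criterion such as bootstrap support is useful.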

Statistics
Statistical comparisons were performed using Fisher's exact and chi-square (χ²) tests for selected variables. Continuous variables were analyzed using the Mann-Whitney U-test for nonparametric statistics. Commercial software SPSS 24.0 (SPSS, Inc., Chicago, IL, USA) was used for statistical calculations. All tests were two-tailed, and values of p < 0.05 were considered statistically significant.

Demographics and HIV genotyping of study subjects
HIV pol sequences were obtained for 436 of the 510 study subjects, with a success rate of 85.5%. A total of 384 HIV env sequences were obtained from the 436 subjects whose pol sequences were successfully amplified. The median age of these subjects was 29 years (range, 16-

Characteristics of transmission clustering
The pol sequences of 213 individuals were grouped into 42 clusters at a 1.5% genetic distance threshold using the TN93 model in HIV-TRACE (Fig. 1). Of the 213 pol sequences, 110 fell into 34 clusters with bootstrap support values ≥ 90% in the neighbor-joining trees (Fig. 2a). The clustering based on partial pol was mostly supported by the env sequences (Fig. 2b), except that two pol clusters were each divided into two smaller clusters in the env tree and were therefore counted as four clusters. Thus, 36 robust clusters including 110 individuals (25.2%, 110/436) were identified according to the consistency of clustering and with reference to the relevant literature [15,16]. Of the 110 clustered individuals in the 36 clusters, 78.2% (86/110) were in 32 small clusters (including 2-5 nodes) and 21.8% (24/110) were in four large clusters (including > 5 nodes) (Figs. 1, 2). The annual distribution of the clustered individuals showed an increasing trend from 2014 to 2016 followed by a decreasing trend (Table 3).
Among individuals with sexually transmitted diseases (STDs), the proportions in large and small clusters (8.9% and 24.7%, respectively) were higher than among individuals without STDs (3.8% and 17.2%, respectively) (Table 4, χ² = 9.166, p = 0.010). The proportion of individuals in large clusters was higher among those with DRMs (7/30, 23.3%) than among those without DRMs (17/406, 4.2%) (Table 4, χ² = 20.990, p = 0.003). Of the 11 clustered sequences with at least one DRM, eight shared the same DRM with at least one other sequence in their cluster. One of the three strains harboring multiple DRMs to both NRTIs and NNRTIs (K70R + M184V + K103N + Y181C) was confirmed to be clustered in the genetic transmission networks.
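As a rough consistency check on the STD comparison reported above (χ² = 9.166, p = 0.010), the p-value can be recovered from the statistic. The sketch below assumes a Pearson χ² test on a 2 × 3 table (STD status by large cluster, small cluster, or no cluster), which gives 2 degrees of freedom; the degrees of freedom are an assumption, since they are not stated in the text. For 2 degrees of freedom the upper-tail probability has the closed form exp(−x/2), so no statistics library is needed. The function name is illustrative.

```python
# Illustrative check: p-value of a chi-square statistic with 2 degrees of freedom.
# Assumption: the STD-by-cluster-size comparison used a 2 x 3 table (df = 2).
import math


def chi2_sf_df2(statistic: float) -> float:
    """Upper-tail probability P(X >= statistic) for chi-square with df = 2."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2.0)


p_std = chi2_sf_df2(9.166)
print(f"{p_std:.3f}")  # prints 0.010
```

Under this assumption the computed value agrees with the reported p = 0.010 to three decimal places.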

Discussion
Based on phylogenetic and demographic parameters, we analyzed nucleotide sequences for 436 newly diagnosed patients among MSM to track the characteristics of HIV-1 transmission networks. The results revealed an epidemic characterized by high heterogeneity in the subtypes and a high prevalence of recombinant forms of infection (92.7%). In our local MSM cohort, CRF55_01B strains have been detected successively since 2014, and the type and number of recombinant genotypes are increasing, leading to more complicated epidemic trends. Recombination is an important mechanism contributing to the genetic diversity of HIV-1 [16]. Thus, an increasing number of circulating recombinant forms (CRFs) and URFs have been reported on a global scale [17,18]. A total of 102 HIV-1 CRFs are listed in the Los Alamos National Laboratory HIV database (https://www.hiv.lanl.gov/content/sequence/HIV/CRFs/CRFs.html). The emergence of novel recombinant forms may easily occur via co-circulation and dual infection of multiple HIV-1 genotypes among MSM in Tianjin, such as subtype B, CRF01_AE, and CRF07_BC. The recombinant forms are increasing the complexity of the HIV-1 epidemic among the MSM cohort in Tianjin. Therefore, effective HIV-1 molecular epidemiologic investigations are needed to identify the transmission of potential HIV-1 recombinant forms in Tianjin, China [9].
HIV transmission clusters are most often identified by phylogenetic analysis based on similarities in viral sequences [19,20]. Cluster inclusion thresholds tend to be ad hoc, and there is no widely accepted definition [20]. Comparison of the results from several previous studies can be confounded by varying populations, sampling fraction, individual risk profiles, and varying methods used for cluster identification [21]. Traditionally, statistical node support for the relationships in a phylogenetic tree is evaluated by bootstrapping [22]. Different studies have used bootstrap values ranging from 70 to 99% in combination with genetic distances of 1% to 4.5%, with values above 90% indicating strong support for a group [23]. In a study of local and national HIV surveillance in the USA, a pol genetic distance of ≤ 1.5% between two individuals implied a direct or indirect epidemiological linkage [24]. Previous studies in China estimated mean pol evolutionary rates for CRF01_AE, CRF07_BC, and B of 2.54-2.97 × 10⁻³, 1.71-2.03 × 10⁻³, and 2.09 × 10⁻³ substitutions/site/year, respectively [5][6][7]. For this study, we used strict criteria to identify all clusters, based on a mean genetic distance of ≤ 1.5% and a bootstrap value ≥ 90%, taking into account the sampling fraction, methodology, and convenience of follow-up. In this analysis, we found that 25.2% of all newly diagnosed cases among MSM were included in 36 transmission clusters. Across the different cluster sizes, the CRF55_01B, CRF01_AE, and CRF07_BC (6/18) viruses were mostly in small clusters, whereas B viruses were mostly in large clusters. The clustering characteristics of different subtypes may be related to strain variation and/or a lack of linkages because of sample density [5][6][7]. However, individuals with more linkages may have a higher transmission risk [5][6][7]. Thus, intra-group concentrated transmission of different subtypes requires further analysis.
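To give a sense of scale for the 1.5% threshold, the evolutionary rates quoted above can be converted into an approximate time back to a shared ancestor. This is a back-of-the-envelope sketch, not a calculation performed in the study: it assumes a strict molecular clock in which the pairwise distance d between two sequences equals twice the rate r times the time t since their common ancestor (d = 2rt), and the function name is illustrative.

```python
# Illustrative only: time to a common ancestor implied by a pairwise distance,
# assuming a strict molecular clock (d = 2 * r * t). Rates are those quoted
# in the text, in substitutions/site/year.

THRESHOLD = 0.015  # 1.5% pairwise genetic distance

RATES = {
    "CRF01_AE": (2.54e-3, 2.97e-3),
    "CRF07_BC": (1.71e-3, 2.03e-3),
    "B": (2.09e-3, 2.09e-3),
}


def years_to_common_ancestor(distance: float, rate: float) -> float:
    """Years since the common ancestor of two sequences under d = 2 * r * t."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


for subtype, (low_rate, high_rate) in RATES.items():
    # A faster rate reaches the threshold sooner, so it gives the shorter time.
    shortest = years_to_common_ancestor(THRESHOLD, high_rate)
    longest = years_to_common_ancestor(THRESHOLD, low_rate)
    print(f"{subtype}: {shortest:.1f} to {longest:.1f} years")
```

Under these simplifying assumptions, a 1.5% distance corresponds to a common ancestor roughly 2.5 to 4.4 years in the past, depending on the subtype.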
The annual trends in the number of clustered individuals from 2014 to 2018 may be associated with current treatment strategies in China and/or with the absence of some lineages caused by sample selection bias. In 2014, the CD4 threshold for free antiviral treatment for AIDS in China was raised from 350 to 500 cells/µL. In 2016, the standard was further adjusted to "Discovery is Treatment". Our study in Tianjin also revealed a decreased annual prevalence of DRMs in 2018 after an overall increase from 2014 to 2017 (Table 2). Whether universal free access to medical care and ART for more patients can reduce further transmission should be evaluated in longer surveillance studies.
In China, the regimen composed of TDF, 3TC, and EFV is currently the most commonly used free first-line therapy [25]. Our study highlights that DRMs affecting the efficacy of NNRTIs are the most common, followed by those affecting NRTIs and PIs, which is consistent with the results of domestic and foreign studies [25][26][27]. Related studies in the USA showed that the K103N mutation was the most common, and its generation and transmission were related to the failure of early antiviral therapy caused by patients' long-term and frequent use of the NNRTI drug efavirenz [26]. In our study, K101E was the most frequently observed NNRTI resistance mutation, followed by K103N. Whether this is related to region, race, or the extensive application of NNRTI drugs and early antiviral failure in antiviral therapy in China should be further evaluated. However, our study showed that viruses containing DRMs were significantly more frequent in large clusters (Table 4), and one of the three strains harboring mutations responsible for drug resistance to both NRTIs and NNRTIs was confirmed to be clustered in the genetic transmission networks. As early as 2007, a study by Art et al. showed that approximately half of transmitted resistance can be attributed to clustered infections [28]. Because of the increase in the prevalence of drug resistance and the emergence of multi-resistant mutant strains, DRM surveillance is necessary to prevent the spread of HIV.
Since the CCR5 blocker maraviroc came into extensive clinical use in Europe and America for treating patients harboring R5 viruses, studies have focused on HIV-1 tropism [11]. Nevertheless, current HIV-1 coreceptor usage in China has not been fully characterized. Understanding the co-receptor usage of HIV strains is essential for assessing the candidacy of CCR5 antagonists for treating HIV infection in China [29,30]. Binding to CD4 and to one of the two main co-receptors, CCR5 or CXCR4, is a necessary condition for HIV to infect target cells. Co-receptor selectivity is determined by genetic sequences within the HIV env "V3" region, which is involved in co-receptor binding [30]. HIV-1 variants are classified as R5- and X4-tropic viruses according to their ability to use CCR5 or CXCR4, respectively. The CCR5 antagonists, which inhibit HIV-1 binding to the CCR5 co-receptor, are only active against R5-tropic viruses, so tropism must be determined before prescription. Therefore, the determination of tropism is useful in clinical practice [31]. Several studies have supported an association between tropism and HIV disease progression [32][33][34]. R5-tropic viruses are predominant in the early stages of HIV infection because they are preferentially transmitted across the biological bottleneck inherent to the genital mucosa [34]. We found that the proportion of R5-tropic viruses was slightly higher among clustered than among non-clustered individuals, although the difference was not significant. In addition, several studies have shown that X4 tropism in CRF01_AE recombinants is associated with accelerated progression to AIDS [32][33][34]. We observed a high prevalence of CRF01_AE and CRF55_01B X4 strains. CRF01_AE is the most prevalent in China and has contributed to 84% of HIV infections in Asia [32][33][34]. Further transmission of X4-tropic CRF01_AE and its second-generation recombinant strains may impede treatment with CCR5 antagonists in the future.
Various antagonists are urgently required for effective and targeted treatment in this situation.
By utilizing the information from the inferred molecular HIV transmission network, we combined the demographic, clinical, and molecular data and found that 33.6% of individuals with STDs appeared in the clusters. This is consistent with recent reports on the increasing coincidence of HIV with STDs and high-risk sexual behaviors [35]. Seventeen of the 36 transmission clusters in the network contained 23 individuals whose permanent household registration was outside Tianjin. These individuals were primarily from northern Chinese provinces, including Hebei, Jilin, and Heilongjiang. Therefore, measures for the prevention and control of HIV transmission between Tianjin and these provinces are needed.

Conclusion
Our study illustrated the characteristics of HIV molecular transmission, tropism, and drug resistance of ART-naïve HIV infections among the MSM population in Tianjin. It is necessary to determine which individuals of a population are at an increased risk of infection to intervene before further transmission occurs and to administer appropriate treatments. Moreover, the cooperation between Tianjin and neighboring provinces regarding HIV prevention and control should be strengthened. Based on the prevalence of tropism, we suggest that tropism testing of the HIV-1 V3 gene is pivotal for controlling transmission and treatment of HIV infections in China.