Comprehensive Mapping of the Virus and Host Factors that Guide the Paths of HIV-1 Escape from a Therapeutic

preprint OA: closed CC-BY-NC-ND-4.0

Keywords

Antiviral therapeutics, HIV-1, Virus escape, Envelope glycoproteins, Virus Evolution, Selection Pressures, Virus fitness, Fostemsavir, BMS-626529.

bioRxiv preprint doi: https://doi.org/10.64898/2025.12.27.696684; this version posted December 27, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Abstract

HIV-1 resistance to therapeutics can emerge through diverse mutational routes, yet the determinants guiding pathway selection in vivo remain unclear. Through comprehensive screening, we identified 18 mutations in the HIV-1 Env protein that enhance resistance to the FDA-approved small-molecule therapeutic temsavir. We then examined their occurrence in HIV-infected individuals who developed resistance on therapy. Only a subset of the resistance-enhancing mutations emerged in vivo. On-treatment mutation frequencies correlated with their spontaneous emergence rates in temsavir-untreated individuals, and were governed by two parameters: (i) probability of mutation appearance, determined by the number and type of nucleotide changes required, and (ii) probability of mutation persistence, determined by Env functional and immune fitness. Notably, non-neutralizing antibodies commonly elicited in HIV-infected individuals restricted emergence of multiple resistant forms, driving convergence to a narrow set of escape routes. These findings establish a quantitative framework for predicting therapeutic resistance and reveal how host immunity constrains viral evolution during treatment.

Introduction

Human immunodeficiency virus type 1 (HIV-1) has a remarkable ability to adapt to the host environment. Due to the low fidelity of the viral replication machinery (1, 2), nucleotide substitutions are introduced randomly across the genome, allowing the virus to sample new variants of its proteins and regulatory elements (3, 4). Such changes facilitate the broadening of cell tropism (5, 6), evasion of the host immune responses (7, 8), and development of resistance to antiviral therapeutics (9). Clinical resistance can emerge by de novo mutations that occur during treatment or by resurfacing of archived forms from latent cell reservoirs (10). Given that therapeutics dock to their protein targets via interaction with multiple residues, several mutational pathways are available for the virus to evade recognition. However, as most antiretrovirals target sites associated with protein function, resistance-enhancing mutations often compromise virus fitness, limiting the ability of these variants to persist in the host. In such cases, adaptive mutations that increase fitness yet retain the resistance phenotype can facilitate persistence (11-13).

While functional fitness and inhibitor resistance are clearly traits essential for virus replication, the principles that govern selection of escape paths from therapeutics remain incompletely understood. First, there is a need for a more quantitative understanding of the contributions of these two traits. For example, is a modestly resistant but highly fit variant more likely to persist than a highly resistant but poorly fit form, and what thresholds define these distinctions? Second, to what extent do factors other than functional fitness and inhibitor resistance contribute to appearance and persistence of the new variants? For example, amino acid substitutions that require a single nucleotide change are more likely than those requiring two changes. However, the likelihoods for all nucleotide changes, at least in vitro, are not identical, with transitions exhibiting higher rates than transversions (14-16). Whether such preferences influence escape pathway selection in vivo is unknown. Third, the extent to which we can identify all possible mutational paths that impart resistance to a therapeutic is still unclear. Indeed, HIV-1 proteins, and particularly the envelope glycoproteins (Envs), exhibit considerable within- and between-host variability (17, 18). Such variability results in strain-specific conformations and allosteric effects. Consequently, some mutations may increase resistance or reduce fitness of some isolates but exert limited effects on others. Context-dependence of mutation effects may limit our ability to estimate the resistance of HIV-1 isolates from sequence data, reducing the potential for personalized antiviral therapies. Context-specific effects may also hinder the generalization of in vitro findings to the diverse viruses that circulate in the population.

To address these questions, we investigated the in vivo escape paths of HIV-1 from the Env-targeting inhibitor temsavir (TMR, BMS-626529), the active metabolite of the FDA-approved prodrug fostemsavir (19, 20). TMR effectively reduces viral loads in HIV-infected individuals (21-23). In addition, it has been shown to reduce the proinflammatory effects of soluble gp120 and to prevent killing of uninfected CD4-positive cells via effector-mediated mechanisms (24-26). We first conducted a comprehensive effort to identify the mutations that increase HIV-1 resistance to TMR in vitro. Then, we examined their representation in participants of the BRIGHTE clinical trial who developed resistance to this agent during treatment (21, 22, 27). Interestingly, only a subset of the resistance-enhancing mutations (REMs) identified in vitro emerged on treatment. REM emergence frequencies correlated strongly with their spontaneous emergence rates in the fostemsavir-untreated population. In vitro analysis of the functional and antigenic properties of all REMs as well as in silico modeling of their substitution likelihoods revealed that their rates of emergence in vivo were explained well by these parameters. Remarkably, non-neutralizing antibodies against Env, which are commonly produced in HIV-infected individuals, appeared to restrict the range of resistance mutations that emerged in vivo. Together, our findings establish a quantitative framework for understanding the emergence of resistance to antiviral therapeutics and inform strategies to predict the rate and paths of escape in each individual.

Results

Mutations in the core epitope of TMR account for most, but not all, resistance to TMR

The small-molecule inhibitor TMR is generated by hydrolysis of the orally administered prodrug fostemsavir in the gut (Fig 1A) (28-30). TMR targets the CD4-binding pocket of Env and stabilizes the trimer in a CD4-unbound state, preventing activation of the entry cascade by the CD4 receptor (Fig 1B) (31). The main contact sites for TMR are the side chains of Env positions 375, 426 and 434, which cradle this molecule in the pocket (27, 32-36). Mutations at position 475 have also been shown to impact resistance (36-39). The consensus amino acid motif at these four sites for TMR-sensitive strains is Ser at position 375 and Met at positions 426, 434 and 475, collectively designated herein the SM3 motif. Mutations at these sites reduce virus sensitivity to TMR in vitro and were associated with clinical resistance in treated individuals (22, 27, 32).

Several trials were conducted to test the efficacy of fostemsavir and predecessor compounds. The largest trial, BRIGHTE, enrolled 371 HIV-positive participants, who were treated for up to five years (see trial design in Fig S1A) (21-23, 26, 27, 40). Plasma samples were collected from the participants before and during treatment and analyzed by the Monogram Biosciences PhenoSense GT test (Fig S1B). The assay is based on amplification of env from plasma virus, and bulk cloning of the amplicons into an expression vector (33, 41). The library of plasmids from each sample is then sequenced and used to generate a library of pseudoviruses that is tested for resistance to TMR in vitro, as measured by the concentration required to achieve a 50% reduction of infection (IC50). The trial thus provided genomic and phenotypic data that we used to establish the escape paths of the virus in vivo (see Data File S1).

Sequence and TMR IC50 data were available for 570 plasma samples from 360 BRIGHTE participants. We first examined the distribution of IC50 values based on their HIV-1 subtype associations (see phylogenetic tree in Fig S2A and Data File S2). Of the 360 samples, 83.3% were identified as clade B, 3.5% as clade B recombinants, and 9.5% as clade F1. Considerable variability was observed in IC50 values within and between the clades (Fig 1C). This variability was significantly reduced in the subgroup of samples that contained the SM3 motif (Fig 1D). Nevertheless, some SM3-containing samples still exhibited high IC50 values, suggesting that additional sites impact resistance. We also performed the above analysis for a previously published panel of 208 Envs (designated herein the Single-Env Dataset) that were tested for their resistance to TMR (38). A similar decrease in the intra- and inter-clade variability in IC50 values was observed for the SM3-containing Envs of this panel (Fig S2B and Data File S3). This finding suggested that sensitivity to TMR involves only modest "clade context" effects and that positions outside the SM3 motif may contribute to resistance.

To establish a threshold that distinguishes between sensitive and resistant samples, we examined the distribution of IC50 values for the BRIGHTE dataset. A right-skewed distribution was observed for the pre-treatment samples and a bimodal one for the post-treatment samples (Fig 1E). Based on these data, we selected an IC50 threshold of 50 nM TMR, which represents the 85th percentile for the pre-treatment samples. This threshold was then used to define changes in resistance status during treatment. Among the 360 participants, only 132 had sequence and IC50 data for both pre- and post-treatment time points (Fig 1F). Of these, 24 were resistant to TMR before treatment, 43 remained sensitive to TMR throughout treatment, and 65 gained resistance on treatment (with a median time of 218 days). For this latter group, designated herein the escape group, we sought to identify and characterize the mutational paths used by the virus to gain resistance.

A combined approach to identify Env mutations suspected of increasing HIV-1 resistance to TMR

To identify the escape paths from TMR in the BRIGHTE participants and compare them with all possible paths available to the virus, we pursued the approach described in Fig 2A. First, we used four strategies to identify Env mutations suspected of increasing resistance to TMR. We then tested them in vitro for their effects on TMR resistance and Env function using a pseudovirus infection assay. Finally, for the resistance-enhancing mutations identified, we examined their emergence frequencies after treatment in the escape group, to determine if the observed frequencies can be explained by their effects on fitness and/or TMR resistance.
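As an illustration of the threshold-based grouping of participants described earlier in this section, the following minimal Python sketch classifies a participant from pre-treatment and on-treatment IC50 values. The 50 nM threshold and the 85th percentile come from the text. The function names, the linear-interpolation percentile, the rule that a single on-treatment sample above the threshold counts as gained resistance, and all IC50 values are illustrative assumptions, not the trial's analysis code or data.

```python
THRESHOLD_NM = 50.0  # IC50 threshold reported in the text (nM TMR)


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def classify_participant(pre_ic50_nm, on_treatment_ic50s_nm, threshold_nm=THRESHOLD_NM):
    """Assign one of the three resistance-status groups described in the text."""
    if pre_ic50_nm > threshold_nm:
        return "resistant before treatment"
    # Assumption: one on-treatment sample above the threshold is sufficient.
    if any(value > threshold_nm for value in on_treatment_ic50s_nm):
        return "gained resistance"
    return "remained sensitive"


# Hypothetical IC50 values (nM), for illustration only.
pre_treatment = [0.4, 0.9, 1.5, 2.2, 3.8, 7.0, 12.0, 30.0, 48.0, 900.0]
print(percentile(pre_treatment, 85))
print(classify_participant(1.5, [2.0, 640.0]))
```

In this sketch, a sample is called resistant only when its IC50 is strictly greater than the threshold, matching the "greater than 50 nM" definition used for the machine learning outcome.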
To determine if the genotype-phenotype datasets from the BRIGHTE trial are sufficiently informative to identify mutations that increase resistance to TMR, we tested different machine learning models (see details in the Methods Section). As input for the models, we used the amino acids at all 856 positions of Env according to the HXBc2 numbering system (42). The outcome to be predicted was the presence of resistance (IC50 value greater than 50 nM). The XGBoost and Gradient Boosting classifier algorithms (43) achieved the highest performance across all key metrics (Fig 2B and Fig S3), and were thus chosen as the foundations for subsequent models. We then evaluated the optimal number of Env positions to include as input, which were selected based on their minimal distance (in Ångstroms, Å) from the TMR molecule using coordinates of the TMR-liganded Env (PDB ID 5U7O) (38). Positions within 4 Å, 5 Å, 7.5 Å, 10 Å or 12.5 Å from any TMR atom were tested (see positions in Table S1). In addition, we tested the four sites of the SM3 motif, and all 856 positions of Env. Fig 2C shows the area under the curve (AUC) metric for these tests, which describes the ability of the model to distinguish between resistant and sensitive samples by sequence, whereby a value of 1 indicates perfect discrimination and 0.5 corresponds to random classification. AUC values greater than 0.9 were observed for most sets, peaking at 0.96 for the 56 positions located within 7.5 Å of TMR (see all metrics in Fig S4 and S5). Given that the variability in the model's performance metrics across the five folds was also low for the positions within 7.5 Å (see standard deviations at bottom of Fig 2C), we selected this subset for subsequent analyses.
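To make the AUC metric concrete, the sketch below computes it as the fraction of resistant/sensitive sample pairs in which the resistant sample receives the higher predicted score (ties count as one half), which is equivalent to the area under the ROC curve. It is a generic illustration in plain Python, not the XGBoost or Gradient Boosting pipeline used in the study; the function name and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Pairwise (rank-based) estimate of the area under the ROC curve.

    labels: 1 for resistant, 0 for sensitive.
    scores: predicted resistance scores, higher means more likely resistant.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # resistant sample correctly ranked higher
            elif p == n:
                wins += 0.5      # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores, for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.1]
print(roc_auc(labels, scores))
```

A model that ranks every resistant sample above every sensitive one yields 1, and a model that assigns the same score to all samples yields 0.5, in line with the interpretation given in the text.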
Finally, to identify the specific mutations that impact resistance, we used a GB Regressor algorithm with the combined set of 570 BRIGHTE samples and 208 Envs of the Single-Env Dataset (see normalization of the IC50 values and performance in Fig S6). The Shapley Additive Explanations (SHAP) value was used to quantify the contribution of mutations to the model's predictive capacity (44). As shown in Fig 2D, performance of the GB Regressor model was high. Importantly, it provided a list of mutations, each with an estimated effect on resistance (Fig S7). The 14 mutations with mean absolute SHAP values greater than 0.01 (Data File S4) were defined as suspected of impacting resistance, and were further tested in vitro as described below.

Machine learning algorithms identify mutations that are frequently sampled in the datasets. To identify mutations that are less frequently sampled, we used a probabilistic modeling approach that only considered mutations appearing in less than 5% of samples (see Methods Section). For each mutation, the model uses the IC50 values of the samples that contain it to calculate the relative likelihood that the mutation contributes to resistance (see Methods Section and Fig S8A). A total of 17 unique mutations were identified by this approach (see Fig S8B and Data File S5). In addition, to account for mutations that are not represented in our datasets, we used the structure of the TMR-bound Env (5U7O) to identify Env positions with side chains that extend into the CD4-binding pocket (Table S2). We then examined the frequency of all residues at these positions in HIV-1 clade B viruses, represented by a panel of 2,535 Envs from fostemsavir-untreated individuals (see alignment in Data File S6). Variants that appeared in at least 0.5% of this panel were selected. A total of 17 such unique mutations were identified. Finally, we identified 11 unique mutations noted in previous studies to increase resistance to TMR or its structural analogs (Table S3) (32, 34, 35, 38, 45, 46). The four approaches yielded a total of 59 unique mutations at 21 Env positions that were suspected of increasing Env resistance to TMR (Fig 2E).

Phenotypic analysis of mutations suspected of increasing HIV-1 resistance to TMR

To determine the effects of the 59 mutations on Env fitness and TMR resistance, we introduced them individually into the Env protein of HIV-1 strain AD8. This well-characterized clade B strain occupies a closed conformation typical of Tier-2-like primary HIV-1 isolates (47, 48). Pseudoviruses that contain the mutant Envs were tested for their infectivity using Cf2Th cells that express CD4 and CCR5, and values were normalized for viral particle count by the reverse transcriptase activity in each sample (49). All 59 mutants were also tested for their resistance to TMR. In addition, for each mutation, we examined the emergence frequency after treatment in the escape group, as well as the proportion of the 65 participants that required a single nucleotide change to acquire the mutation. As shown in Fig 3A, most suspected mutations did not alter resistance to TMR, while several increased it considerably (e.g., L116P). Interestingly, some mutations were significantly enriched in the escape group, most notably 426L and 375N, which appeared in 74% and 38% of these individuals, respectively (Fig 3B). To better understand the basis for the differential emergence frequencies of mutations after treatment, and based on the observed variability in the IC50 values, we focused on the subgroup of 18 mutations that increased the IC50 by 3.5-fold or more (see shaded region in Fig 3A). We designate this subgroup resistance-enhancing mutations (REMs). For some REMs (e.g., 375H, 375M and 375Y), their low frequency in the escape group could be explained by the number of nucleotide changes required (see color of data points in Fig 3A). However, several poorly sampled REMs (e.g., 375I) only required a single nucleotide substitution. REM emergence frequencies were not associated with their effects on resistance to TMR (Fig 3C). By contrast, a threshold effect was observed for the relationship between fitness and emergence rate, whereby low-fitness REMs (e.g., 434K, 255M and 204D) were poorly sampled whereas REMs with favorable fitness profiles and nucleotide substitution requirements showed variable frequencies of emergence.

In most individuals, REMs appeared at more than one Env position (Fig 3D). We thus examined the fitness-resistance profiles of the two-site mutation combinations that appeared frequently after treatment, as well as combinations that appeared less frequently or not at all (Fig 3E and 3F). The most frequent combination in the escape group, 375N/426L, which requires one nucleotide substitution in each codon, exhibited a favorable fitness-resistance profile relative to other two-substitution combinations. Nevertheless, it did not exhibit a unique synergistic advantage in fitness or TMR resistance that could explain the observed high prevalence of this combination or of the individual changes (Fig S9).

In the above tests, we compared the emergence frequencies of REMs with their fitness and TMR resistance levels measured using a single HIV-1 isolate (strain AD8). Nevertheless, it could be argued that some poorly sampled REMs, such as 116P, may only increase Env resistance in the context of a limited number of strains (e.g., AD8). Conversely, the 113E and 434I mutations, which appeared frequently after treatment but did not increase AD8 Env resistance significantly, could exhibit limited effects in AD8 relative to other strains. To address this concern, we introduced the 116P, 113E and 434I mutations into the Envs of two clade B transmitted/founder strains, 700010040.C9.4520 and WEAUd15.410.5017 (48). For the 113E mutation, which emerged frequently in BRIGHTE, we also tested the Envs of strains QH_692 and JRFL. For all three mutations, the effects on fitness and resistance were qualitatively similar to those observed for AD8 (Fig S10). While these tests included only 2-4 variants, they suggested that the effects of the mutations observed in the Env of strain AD8 can be generalized to other clade B isolates.

In summary, we identified 18 mutations that increase Env resistance to TMR. Some REMs emerged after treatment at significantly higher frequencies than others. This preference could not be fully explained by the number of nucleotide changes required or by the level of resistance to TMR they impart, and only partially by their effects on Env fitness.

The emergence frequency of REMs in the escape group corresponds with their spontaneous emergence rate in the population

The 426L mutation, which appeared in 74% of subjects in the escape group, increased TMR resistance by 48-fold, while four other suspected REMs at this position (Ile, Thr, Lys and Arg) did not increase resistance significantly (Fig 3A). To assess the comprehensiveness of our screening approach, we examined if additional variants at this position may also increase resistance to TMR. To this end, we performed saturation mutagenesis to test the effects of all possible mutations at position 426 on Env fitness and resistance to TMR (50). Replication-competent libraries of HIV-1 AD8 that contain a degenerate codon at this position were used to infect the T cell line A3R5.7 in the absence or presence of TMR (250 nM). The frequency of each form in the infected cells was compared with the frequency in the virus library used for infection (see protocol in Methods Section). While several residues at this position showed fitness levels similar to the wild-type Met, remarkably, only Leu was able to infect the cells in the presence of this concentration of TMR (Fig 4A).

We also tested the fitness and TMR resistance profiles of all residues at position 375 (Fig 4B). REMs 375H, 375I, 375M, 375N and 375Y, which showed considerable increases in resistance using the pseudovirus system (Fig 3A), also showed high resistance in these experiments (see correlation in Fig 4C). Interestingly, in addition, the 375W and 375F variants showed high resistance to TMR.
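One simple way to express the comparison between variant frequencies in the infected cells and in the input virus library from the saturation mutagenesis experiments is a per-residue log2 enrichment ratio. The sketch below is an assumption about how such a comparison could be computed, not the protocol from the Methods Section; the pseudocount, the function names and the read counts are all hypothetical.

```python
import math


def frequencies(counts):
    """Convert per-residue counts to fractions of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total count must be positive")
    return {residue: count / total for residue, count in counts.items()}


def log2_enrichment(library_counts, selected_counts, pseudocount=0.5):
    """log2(frequency after infection / frequency in the input library).

    A pseudocount (an assumption of this sketch) avoids division by zero
    for residues that are absent from one of the two samples.
    """
    residues = sorted(set(library_counts) | set(selected_counts))
    lib = frequencies({r: library_counts.get(r, 0) + pseudocount for r in residues})
    sel = frequencies({r: selected_counts.get(r, 0) + pseudocount for r in residues})
    return {r: math.log2(sel[r] / lib[r]) for r in residues}


# Hypothetical read counts at one codon position, for illustration only.
library = {"M": 100, "L": 100, "I": 100}
with_inhibitor = {"M": 2, "L": 290, "I": 8}
for residue, value in log2_enrichment(library, with_inhibitor).items():
    print(residue, round(value, 2))
```

Positive values indicate residues that are over-represented after infection relative to the library, and negative values indicate depletion.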
The 375W and 375F forms were not identified by our four approaches because they did not appear in the sequence-IC50 datasets or in the panel of 2,535 clade B isolates from fostemsavir-untreated individuals. Indeed, the emergence frequency of mutations at position 375 in BRIGHTE participants correlated well with their proportion in the untreated clade B population (Fig 4D). To quantify the propensity for spontaneous emergence of the 18 REMs in fostemsavir-untreated individuals, we used the panel of 2,535 clade B Envs. The sequences were used to construct a phylogenetic tree that was partitioned into subgroups. Each position was tested separately, by assigning the taxa their amino acid occupancy at that position, and subgroups dominated by the non-ancestral form at that position were excluded (see example of position 426 in Fig 4E). The average number of independent emergence events of each REM and the variability across the subgroups were then calculated (see Fig 4F). A strong correlation was observed between the average rate of spontaneous REM emergence in clade B and their emergence in the escape group (see Fig 4G and Fig S11). These findings suggested that the same selection pressures exerted in fostemsavir-untreated individuals may determine the likelihood of the REMs to emerge on treatment.

Immune pressures restrict the escape paths of HIV-1 from TMR

Selection pressures on replicative fitness and inhibitor resistance are primary factors that guide the evolution of virus proteins in treated individuals. Interestingly, the relationships shown in Fig 3C only suggested a threshold effect for fitness and none for the level of resistance imparted by the REMs. We thus sought to identify additional factors that may contribute to the observed preference for some REMs to emerge (on treatment and in the clade B population). In most HIV-infected individuals, antibodies are elicited against conserved epitopes that overlap the coreceptor-binding site (CoR-BS), CD4-binding site (CD4-BS), and inner domain of gp120 (51, 52). These epitopes are not exposed on the native state of Env in most primary isolates, and antibodies against them are thus designated non-neutralizing. The non-neutralizing antibody response was suggested to reduce the risk of infection in vaccine trials (53, 54) and alter the evolutionary path of HIV during early stages of infection (55-57). In both cases, the attributed mechanism was based on Fc-mediated effector functions. To examine the potential effect of such antibodies on mutation emergence during treatment, we measured the sensitivity of the REMs to plasma samples from two HIV-positive individuals that do not reduce infectivity of Env AD8 at the lowest dilution tested (1:160). As shown in Fig 5A, several REMs that did not emerge on treatment were neutralized by the plasma (e.g., 116P and 434K), whereas REMs that were sampled more frequently were resistant.

To determine the epitope specificity of the antibodies in the plasma that differentially neutralize these mutants, we performed a preliminary experiment using the frequently sampled (plasma-resistant) 426L and 375N mutants, and the poorly sampled (plasma-sensitive) 116P and 434K mutants. Plasmids that encode the Envs were used to transfect human osteosarcoma (HOS) cells, which express on their surface fully cleaved Env trimers in their native closed form (58). Monoclonal antibodies that target different Env epitopes (Fig 5B) were added to the cells, and their binding efficiency was detected by cell-based ELISA (59). As shown in Fig 5C, the 116P and 434K changes enhanced binding of antibodies against otherwise-cryptic epitopes that overlap the CoR-BS, V3 loop and CD4-BS. In addition, these changes reduced binding of antibody PGT145 that targets a quaternary epitope at the apex of the trimer (60). These features are consistent with a CD4-bound-like conformation of Env (61, 62). We extended the analysis to all 18 REMs and observed that, in addition to 116P and 434K, mutations 423S, 202E and 204D, which appeared infrequently in the escape group and were sensitive to non-neutralizing plasma (Fig 5A), also increased binding of the CoR-BS antibody 17b (Fig 5D). Given the CD4-bound-like conformation of these REMs, we also tested their ability to infect CD4-negative cells (47). Indeed, a strong relationship was observed between this variable and the level of 17b binding (Fig 5E and Fig S12). Interestingly, only the 116P mutant showed high binding of 17b but no CD4-independent infection, suggesting two potential structural-functional outcomes for these changes.

We also examined the effects of the 18 REMs on Env stability. The Envs of diverse HIV-1 isolates can exhibit different levels of conformational stability (47, 59) and sensitivities to inactivation at physiological temperature (63, 64). Nevertheless, a relationship between stability of Env variants and their likelihood to appear in HIV-infected individuals has not been established. To this end, we incubated viruses containing the 18 REMs at 37°C and measured the changes in residual infectivity over time. Consistent with previous results for the related ADA strain (64), the half-life of the wild-type AD8 Env was approximately 7 hours (Fig 5F). Interestingly, the infrequently sampled REMs that were sensitive to non-neutralizing plasma and enhanced exposure of the CoR-BS also demonstrated low functional stability at 37°C (see correlations in Fig S13).

Taken together, these results suggest that REMs 116P, 434K, 204D, 202E and 423S increase sensitivity to non-neutralizing plasma by inducing an open (CD4-bound-like) form of Env that exposes cryptic epitopes targeted by non-neutralizing antibodies. These mutations also reduce Env stability and functional fitness. Their enhanced resistance to TMR can be explained by a change in the CD4-binding pocket to a CD4-bound-like conformation that is not conducive to binding of TMR (note location of the SM3 sites in Fig S14). However, their resistance is associated with a cost: elimination by the non-neutralizing antibody response.

Amino acid substitution likelihoods impact the path of resistance in vivo

We examined the relationships between the above-described features of the 18 REMs, including their emergence frequencies during treatment and in the clade B population (see values in Fig 6A and P-values for the Spearman correlation tests in Fig 6B). We divide these features into four groups: (i) functional fitness, measured by the fusion competence of the Env, (ii) functional stability, measured by resistance to inactivation at 37°C, (iii) immune fitness, captured by exposure of the CoR-BS, sensitivity to the non-neutralizing plasma, and indirectly by the requirement for CD4 to infect cells, and (iv) therapeutic fitness, measured by the in vitro resistance to TMR. Strong correlations were observed between the variables that capture functional fitness, stability, and immune fitness. However, none of them alone could fully explain the emergence frequency of the 18 REMs in the escape group, other than a modest correlation with Env stability at 37°C. As such, we decided to generate a combined fitness variable that describes the functional fitness, stability and immune resistance of each REM. It was calculated as the product of their relative infectivity, plasma resistance, and stability at 37°C (expressed as a fraction of the values measured for the wild-type Env). As shown in Fig 6C and 6D, the combined fitness of the REMs correlated well with their emergence rates in the BRIGHTE subjects and in clade B. Yet some REMs, such as 375Y, showed high combined fitness but did not emerge on treatment.
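The combined fitness variable described above is a simple product of three wild-type-normalized traits. The following sketch shows the calculation and a ranking by the resulting value; the function name, the variant labels and all numeric values are hypothetical and are not measurements from the study.

```python
def combined_fitness(relative_infectivity, plasma_resistance, stability_37c):
    """Product of three traits, each expressed as a fraction of the wild-type value."""
    traits = (relative_infectivity, plasma_resistance, stability_37c)
    if any(value < 0 for value in traits):
        raise ValueError("relative trait values must be non-negative")
    return relative_infectivity * plasma_resistance * stability_37c


# Hypothetical relative values (wild type = 1.0 for every trait).
variants = {
    "wild type": (1.0, 1.0, 1.0),
    "variant A": (0.9, 1.0, 0.8),
    "variant B": (0.5, 0.2, 0.3),
}

# Rank variants from highest to lowest combined fitness.
ranked = sorted(variants, key=lambda name: combined_fitness(*variants[name]), reverse=True)
for name in ranked:
    print(name, round(combined_fitness(*variants[name]), 3))
```

Because the traits are multiplied rather than averaged, a severe defect in any single trait lowers the combined value regardless of the other two.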
Based on these findings, we also examined the likelihood of the mutations to appear, as determined by: (i) The pre-treatment nucleotide sequence of the participants, (ii) The number of nucleotide changes needed to acquire each REM, and (iii) The type of nucleotide changes required, based on the transition and transversion rates determined by Martinez Del Rio et al. (15) (Fig S15A). An algorithm that accounts for these variables was generated, in which mutation appearance was modeled as a series of stochastic events with mutation-specific probabilities (see flowchart in Fig S15B). As shown in Fig 6A (bottom rows), similar REM appearance likelihoods were calculated using the clade B consensus and BRIGHTE pre-treatment sequences as the starting state. We examined whether these likelihoods could explain the outliers in Fig 6C and 6D (see colors of datapoints). As expected, several REMs with high fitness but low emergence frequencies (most notably 375Y, 424R and 375I) exhibited low substitution likelihoods.

We extended our analyses to account for the variables associated with both appearance and persistence of each REM (see flowchart in Fig S15C). Here, we examined the ability to predict the emergence frequencies of the 18 REMs in the escape group based on their: (i) Pre-treatment trinucleotide sequence at these sites, (ii) Likelihood for appearance of the REMs (by number and type of the nucleotide substitutions), and (iii) The combined fitness metric that incorporates the functional fitness, stability, and immune fitness. As shown in Fig 6E, the model that integrated all variables demonstrated good predictive performance (AUC = 0.82) and was robust across multiple cross-validation folds, as indicated by the error bars. The combined fitness variable and the nucleotide substitution likelihoods contributed to the prediction significantly.
Consistent with the data in Fig 6A, the use of the participants’ pre-treatment sequences did not improve performance relative to the clade B consensus sequence.

Finally, we explored the contribution of the nucleotide substitution likelihoods to the observed profile of mutations in BRIGHTE. To this end, we examined separately the effects of the number of changes required and the probability for each transition or transversion (15). For these tests, we focused on position 375, which contains the largest number of REMs. Interestingly, both the number of changes and the expected rate of each substitution contributed to predictions of REM emergence (Fig 6F). For example, position 375 was occupied in most subjects by Ser with the trinucleotide sequence AGT (or less frequently by AGC). Substitutions to Asn (AAT/AAC) or Ile (ATT/ATC) constitute single nucleotide changes (from G to A, or G to T, respectively). However, the G to A transition is more common than the G to T transversion, likely explaining the higher frequency of Asn in the escape group and population relative to Ile despite favorable fitness profiles in both. Indeed, given similar immune and functional fitness levels for the five REMs at position 375 (Fig 6A), addition of the selection pressures did not further improve prediction of the on-treatment mutational outcomes (Fig 6F). We note that these calculations are not intended to capture the quantitative effects of the variables on appearance or persistence of the REMs.
Nevertheless, they clearly demonstrate that the mutational path is explained well by the properties of Env measured in vitro and modeled in silico.
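The position 375 example (Ser AGT changing to Asn, Ile, His, Met or Tyr) can be made concrete by counting the nucleotide changes each substitution requires and classifying each change as a transition or a transversion. The sketch below does this with the standard genetic code. It is an illustration under stated assumptions rather than the algorithm of Fig S15B: the function names are hypothetical, and the per-change weights in `HYPOTHETICAL_WEIGHTS` are placeholders that only encode "transitions are more likely than transversions"; they are not the published rates from reference (15).

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

# Placeholder weights (assumption, NOT measured rates).
HYPOTHETICAL_WEIGHTS = {"transition": 0.10, "transversion": 0.05}


def change_type(old: str, new: str) -> str:
    """Purine<->purine or pyrimidine<->pyrimidine is a transition; otherwise a transversion."""
    pair = {old, new}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def required_changes(start_codon: str, target_codon: str) -> list:
    """List (old, new, type) for every codon position that differs."""
    return [
        (old, new, change_type(old, new))
        for old, new in zip(start_codon, target_codon)
        if old != new
    ]


def easiest_route(start_codon: str, target_aa: str) -> dict:
    """Pick the codon for target_aa with the highest placeholder likelihood."""
    best = None
    for codon, aa in CODON_TABLE.items():
        if aa != target_aa or codon == start_codon:
            continue
        changes = required_changes(start_codon, codon)
        likelihood = 1.0
        for _, _, kind in changes:
            likelihood *= HYPOTHETICAL_WEIGHTS[kind]
        if best is None or likelihood > best["likelihood"]:
            best = {"codon": codon, "changes": changes, "likelihood": likelihood}
    return best


# Ser (AGT) at position 375 changing to Asn, Ile, His, Met or Tyr.
for amino_acid in "NIHMY":
    route = easiest_route("AGT", amino_acid)
    print(amino_acid, route["codon"], len(route["changes"]), route["changes"])
```

With these placeholder weights, Asn (AGT to AAT, one transition) ranks above Ile (AGT to ATT, one transversion), and His, Met and Tyr each require two changes, which mirrors the qualitative ordering described in the text.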

Discussion

The forces that guide the path of HIV-1 escape from a therapeutic in vivo

Through a comprehensive screen, we identified and then tested 59 mutations suspected of increasing HIV-1 resistance to TMR, of which 18 (the REMs) significantly enhanced resistance. Analysis of the utilization of these paths in individuals treated with fostemsavir revealed that some were highly preferred, whereas others, despite favorable fitness and resistance profiles, were sampled infrequently or not at all. Given that the emergence frequency of the REMs in BRIGHTE corresponded well with their spontaneous appearance rates in fostemsavir-untreated individuals, we hypothesized that additional selection pressures may restrict some paths of escape. Indeed, several REMs poorly represented in vivo exhibited “open” forms of Env; their mechanism of resistance is based on induction of a CD4-bound-like conformation that is not conducive to binding of TMR. However, the penalty associated with this path is exposure of otherwise-cryptic epitopes that overlap the CoR-BS and V3 loop. Such forms are readily eliminated by non-neutralizing antibodies that are commonly elicited in HIV-infected individuals (51, 52). For other REMs, most notably at position 375, their poor representation in the escape group was explained by the low likelihood of the substitutions to occur, due to the number or type of nucleotide changes required. Thus, we quantitatively capture the in vivo factors that impact the two phases of virus escape from a therapeutic: (i) Appearance of the mutation, as determined by the number and type of nucleotide substitutions, and (ii) Persistence of the new variant, as determined by the effects on functional fitness, immune fitness, and stability of the protein. Interestingly, a very high
Interestingly, a very high 403 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 14 level of resistance imparted by some REMs (e.g., 116P) was not able to overcome low levels of 404 functional and immune fitness. The ability to explain the path of resistance by effects of the 405 individual mutations strongly suggests a limited contribution of adaptive changes to escape of the 406 virus. Indeed, REMs with poor functional or immune fitness were rarely observed in vivo. Instead, 407 for up to 5 years of follow -up, the mutations with the most favorable profiles emerge d. These 408 findings are consistent with the catastrophic effect of the most readily available mechanism of 409 resistance – transition to a CD4-bound like conformation. 410 411 Role of the non-neutralizing antibody response in restricting escape paths from a 412 therapeutic 413 An unexpected finding of this study is the effect of non-neutralizing antibodies on the range 414 of mutations that can mediate escape . While the neutralizing response against HIV -1 Env is 415 usually swarm-specific and exhibits limited cross-neutralization with other strains, the specificity 416 of the non -neutralizing response is conserved across different individuals (65, 66). Our results 417 suggest that it indeed constitutes a deterministic pressure that guides the evolutionary path of 418 HIV-1, similar to the requirement for functional fitness . Such pressure is consistent with the 419 inability to isolate CD4-independent primary strains from HIV -infected individuals, despite the 420 clear advantage associated with overriding the need for the CD4 receptor (67). 
The five REMs that were poorly sampled in the escape group (116P, 434K, 202E, 204D and 423S) induced an open form of Env that exposes the CoR-BS, which likely increased their sensitivity to non-neutralizing antibodies against this epitope in the HIV-positive plasma we used. Interestingly, three of these sites (positions 116, 434 and 204) form a clasp-like structure that appears to stabilize Env in the native untriggered state (Fig S14C). Similar interactions across the gp120 and gp41 subunits have been noted to maintain trimer stability, allowing Env to retain the high potential energy required to drive the fusion process (68, 69). Weakening of these interactions results in transitions to lower-energy metastable CD4-bound-like forms which, in the absence of a coreceptor, undergo spontaneous and irreversible conformational changes into non-functional states (70). Indeed, here we observe strong relationships between the functional fitness of the variants, their level of “openness”, and their stability at 37°C. Given the large number of such Env-stabilizing interactions, it is plausible that mutations at other sites, outside the CD4-binding pocket, may also increase resistance to TMR. While these mutations may emerge during in vitro selection experiments, they appear to be restricted during in vivo escape from this therapeutic.

Effects of Env structural context on availability of escape paths

Env exhibits tremendous diversity in amino acid sequence within and between hosts (17, 18).
Divergent profiles of antigenicity and networks of allosteric interactions result in distinct profiles of sensitivity to therapeutics. In addition, the fitness profile of Env positions may vary between strains, including those derived from the same HIV-1 clade (50). As such, a REM could exhibit high fitness in the context of one strain but low fitness in another, restricting viability of that escape path. Context specificity can limit the ability to extrapolate, for example, from our tests with the AD8 Env to other HIV-1 strains. Here, to better understand the high frequency of two mutations that do not increase TMR resistance in our assays (113E and 434I), and the absence of REM 116P that increases resistance significantly, we tested their effects on additional isolates. Qualitatively similar effects were observed on fitness and TMR resistance of these strains, suggesting that, at least for the CD4-binding pocket, the effects measured for Env AD8 are generalizable. This notion was also supported by the observation that the frequency of most REMs in the escape group was explained well by their appearance and persistence likelihoods based on the in vitro data measured for AD8 Env. An exception to this rule is REM 475L, which exhibited a relatively high likelihood for appearance and persistence but was poorly represented in the BRIGHTE escape group. Consistent with this finding, 475L only appeared in one of the 2,535 clade B isolates from fostemsavir-untreated individuals. Comprehensive characterization of the effects of this mutation on Env functional and immune fitness in diverse strains, as well as the potential for other selection forces such as the cytotoxic T lymphocyte response (71, 72), may provide greater insight into the basis for the low emergence rate of this mutation.
Personalization of antiviral treatments based on virus genotype

Several sites located in the CD4-binding pocket show considerable within-clade variation relative to that expected for the critical function of this domain (Table S2). Most prominent are positions 426 and 375, which also harbor the two most frequent REMs. Both sites exhibit broad fitness profiles, as measured by saturation mutagenesis (Fig 4A and 4B). For example, the ancestral Met at position 426 is often replaced by Arg and Leu (Fig 4F); both are fit, but only Leu is resistant to TMR. Similarly, several variants at position 375 can impart resistance and exhibit high levels of functional and immune fitness (Fig 6A); however, they are restricted by a low likelihood for appearance, due to the number (His, Met and Tyr) or type (Ile) of nucleotide changes required. The frequencies of 426L and 375N in HIV-1 clade B increased at early stages of the AIDS pandemic and seem to have reached steady states, likely reflecting the lack of an advantage relative to the ancestral forms (Fig S16). Thus, the prevalence of variants that enhance resistance to TMR, and potentially to other inhibitors against the CD4-binding pocket (73, 74), is not low but is also not increasing. Indeed, 24 of the 132 subjects that were sampled before and after treatment showed resistance to TMR prior to initiation of therapy, and 65 developed resistance on treatment (Fig 1F).
The high emergence frequencies of 426L and 375N reflect their favorable immune and functional fitness profiles and suggest that future inhibitors targeting the CD4-binding pocket should be designed to maintain efficacy against variation at these positions.

A longstanding goal in the field of infectious disease management is development of tools to personalize antiviral therapeutics by rapidly estimating the properties of the infecting virus swarm before treatment (75-78). The non-negligible frequency of TMR-resistant strains in the population and the high rate of resistance that emerges on treatment suggest a need for such tools. Nevertheless, the diversity and complexity of Env can challenge our ability to predict resistance from sequence data (79). As we show here, HIV-1 sensitivity to TMR can be readily inferred from the amino acid sequence of Env, with AUC values exceeding 0.95. Given the limited effects of Env context on these predictions, our algorithms can be applied to strains from diverse HIV-1 clades. An example of the relationship between the predicted and measured IC50 values using the GB regressor model is shown in Fig 2D. For predicting resistance above 50 nM TMR, the false positive rate was 3% and the false negative rate was 7.1%. Notably, the linearity of the relationship between the probabilities for resistance and the measured IC50 values allows us to also consider the estimated level of resistance during clinical decision making. Such performance is high relative to previous analyses of epitopes targeted by BNAbs (79-81), which may be attributed to the relatively low complexity of the TMR epitope, or the restricted range of mutations that can impart resistance and persist in the host.

The sequence data derived from the PhenoSense GT assay and applied in this study does not capture the full diversity of plasma virus variants in each participant.
Indeed, genotypic profiling assays used clinically are generally limited to detection of mutations with frequencies greater than 20% of all sequences (82, 83). We expect that sequencing of plasma virus at greater depth will provide the accuracy required to predict initial clinical responses as well as short-term emergence of resistant forms from low-frequency circulating variants. Processing of clinical samples by deep sequencing followed by sequence-based estimation of resistance, as described here, can be achieved with limited expense in short time frames relative to phenotypic in vitro assays. The approach has the potential to improve both the time-to-treatment initiation and the efficacy of personalized TMR treatments. Furthermore, if resistant forms are not detected in the pre-treatment sequences, the data can be used to quantitatively describe the likelihood for emergence of each resistance mutation, based on the nucleotide substitution requirements and the estimated effects on functional and immune fitness of the virus.

Methods

Processing and analysis of sequence data

All BRIGHTE trial samples used in this study were collected and analyzed with informed consent. Samples were tested for viral loads and CD4 counts. In addition, some samples were analyzed using the Monogram Biosciences PhenoSense GT assay. Nucleotide and amino acid sequences for 580 samples from 371 subjects were available for analysis. TMR resistance values were available for 570 of these samples. Multiple Env positions contained ambiguous nucleotide or amino acid designations, which reflect the presence of more than one sequenced variant in the donor. To align such sequences, we initially demultiplexed the variants by retaining for each position the amino acid or nucleotide found in the clade B ancestral sequence, or, if not present, the first variant listed at that position. We then used custom Python code to run MAFFT 7.520 for alignment (84). Sequences were then used to determine their clade associations using the Recombinant Identification Program (RIP) tool (85). Sequence variability information was subsequently reintegrated into the samples to allow more than one form at each position, and sequences were trimmed to the subset of 856 positions according to HXBc2 numbering (42).

The Single-Env dataset and transformation of TMR resistance values

We also used a previously published dataset composed of 208 Envs from diverse clades, each associated with an amino acid sequence and TMR resistance value measured in vitro (38). Accession numbers (listed in Data File S3) were used to download the sequences from the Los Alamos National Lab (LANL) database (86), which were then aligned and processed as above. TMR IC50 values for the Single-Env set were measured using an in-house assay (38), whereas the BRIGHTE trial samples were analyzed using the Monogram Biosciences Entry assay.
To allow us to combine the datasets, we converted the Single-Env set values to a distribution similar to that of the pre-treatment samples from the BRIGHTE trial (Fig S6A). For this purpose, we first compared the fit of several distribution types to the BRIGHTE and Single-Env datasets, including Beta, Lognormal, Exponential, Gamma, and Pareto. Based on the sum of squared errors (SSE) metric, both datasets conformed best to a Beta distribution. We then converted the Single-Env set IC50 values to the distribution of the pre-treatment BRIGHTE samples using a probability integral transform approach. In brief, we first mapped the Single-Env set values to their cumulative probabilities, then applied the inverse cumulative distribution function of the BRIGHTE Beta distribution, and finally rescaled to the target support, ensuring the values follow the BRIGHTE distribution while preserving their rank ordering.

Classification algorithms to estimate Env resistance to TMR by sequence

We evaluated different classification algorithms for their performance in estimating TMR resistance by amino acid sequence. For these evaluations, as well as for hyperparameter tuning, we defined an IC50 of 50 nM TMR as the threshold to distinguish between sensitive and resistant samples. As input for the model, we first used the amino acids at the 856 positions of Env according to the HXBc2 numbering system.
To account for sequence ambiguity in the BRIGHTE data, we used one-hot encoding to convert each position into binary features representing the absence (0) or presence (1) of each amino acid in the sample. In addition, given that some subjects had multiple samples analyzed (before and after treatment), we used a group-stratified 5-fold cross validation approach, which ensured that all sequences of each subject are assigned to either the training or test folds.

Following preliminary testing of different classification algorithms, we pursued Extreme Gradient Boosting (XGBoost) from the xgboost package in Python (43), for its high performance across several key metrics (Fig S3). Parameter fine-tuning was performed through grid search in Python to enhance the model's predictive capacity. The following hyperparameters were tuned: The number of estimators was set to 100, 150, 200, 250 and 300 trees; learning rate (η) was set to 0.1, 0.2 and 0.3; maximum depth of trees was set to 3, 5 and 8; Lambda Regularization was 1, 2, and 3; and the fraction of features and samples used by the algorithm was 1. The AUC was used as the objective function for optimization. Classification metrics were calculated using the metrics module from the sklearn library (87).

Gradient Boosting Regressor to identify Env features that impact resistance to TMR

To identify the sequence features that contribute to TMR resistance, we used the Gradient Boosting Regressor (GBR) algorithm. The algorithm was trained on the combined dataset of 570 BRIGHTE and 208 Single-Env samples. Amino acid sequences were used as the input feature set, and the log10-transformed TMR IC50 values as the response variable.
To prepare the sequences, we first excluded Env positions with a minimal distance greater than 7.5 Å from any TMR atom on the TMR-liganded structure of Env (PDB ID 5U7O), resulting in 56 remaining positions (Table S1). Next, we applied one-hot encoding to convert the amino acid occupancy at each position into binary features. Of the 243 remaining features, nine exhibited no variation and were excluded. To further reduce dimensionality and mitigate multicollinearity, we dropped the least frequent feature at each position that meets two criteria: (i) The position contains at least two unique AA variants in the dataset, and (ii) The mutation is included in the feature set tested by the probabilistic approach (see below). This reduced the feature set to 129.

The GBR model was trained using mean squared error (MSE) as the objective function for optimization with the Friedman improvement score as the criterion for measuring split quality, and a learning rate (η) value of 0.1. A k-fold nested grouped cross-validation strategy was used with k = 10 for the outer folds and k = 5 for the inner folds. This nested structure separates the parameter optimization process from the model assessment, with the inner cross-validation loop for hyperparameter tuning and the outer loop for model evaluation. The grouping strategy ensured that sequences from the same subject appeared in either the training or test fold, but not both.
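The grouping constraint in the nested cross-validation described above can be illustrated with a standard-library-only sketch. This is not the authors' implementation; the function names `grouped_folds` and `nested_grouped_folds`, the greedy size-balancing rule, and the synthetic subject labels are assumptions made for illustration. In practice, scikit-learn's `GroupKFold` offers comparable grouped splitting.

```python
from collections import defaultdict


def grouped_folds(groups, n_folds):
    """Return (train, test) index lists; every group lands entirely in one fold."""
    members = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)
    if len(members) < n_folds:
        raise ValueError("need at least as many groups as folds")

    # Greedy balancing (an assumption): place the largest groups first,
    # always into the fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_folds)]
    for group in sorted(members, key=lambda g: (-len(members[g]), g)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[group])

    splits = []
    for held_out in range(n_folds):
        test = sorted(folds[held_out])
        train = sorted(i for f in range(n_folds) if f != held_out for i in folds[f])
        splits.append((train, test))
    return splits


def nested_grouped_folds(groups, outer_k=10, inner_k=5):
    """Outer folds for evaluation; inner folds (within outer-train) for tuning."""
    nested = []
    for outer_train, outer_test in grouped_folds(groups, outer_k):
        inner_groups = [groups[i] for i in outer_train]
        inner = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in grouped_folds(inner_groups, inner_k)
        ]
        nested.append((outer_train, outer_test, inner))
    return nested


# Synthetic example: 30 hypothetical subjects, half of them with two samples.
subjects = [f"subject_{i:02d}" for i in range(30) for _ in range(1 + i % 2)]
splits = nested_grouped_folds(subjects)
print(len(splits), [len(test) for _, test, _ in splits])
```

Hyperparameters would be selected on the inner splits only, and each outer test fold would be scored once with the selected setting, so that no subject contributes sequences to both tuning and evaluation of the same model.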
Hyperparameters of the GBR model were optimized via grid search, including the number of estimators (20, 50, or 100) and the maximum depth of individual regression estimators (3, 5, 10, or full tree). Predicted IC50 values that exceeded 5 µM were capped at the maximal allowable value of 5 µM TMR. To estimate the contribution of features to the model’s predictions, we used the Shapley Additive Explanations (SHAP) value (44). SHAP values quantitatively describe the impact of each feature on the model’s performance to predict the outcome, including their magnitude, direction and distribution.

Probabilistic approach to identify low-prevalence mutations that impact TMR resistance

The occurrence of mutations at a low frequency in the dataset limits the ability of algorithms to model their effect. To estimate their impact, we implemented a separate procedure. Mutations included in this set were defined by three criteria: (i) They must appear in at least two unique sequences, (ii) They must not appear in more than 5% of all sequences, and (iii) The side chain of the position must be located within 7.5 Å of the TMR molecule. This filtering process resulted in 107 mutations. If a mutation appeared in more than one sample of a BRIGHTE participant, the average value of the log10(IC50) was used.

A prioritization algorithm was used to rank these mutations by their potential impact. For each of the 107 mutations, described as amino acid a and position p, we modeled the log10-converted IC50 values of samples containing that mutation using a normal distribution X_{a,p}, with average μ and variance σ². We assume that X_{a,p} corresponds with the distribution of their effects on the greater population of HIV-1 strains (Y_{a,p}). To compare between the 107 mutations, we randomly sampled with replacement the distribution X_{a,p} of each mutation. This value was compared with the randomly sampled values from all other 106 mutations and ranked (highest value received the lowest rank). This process was repeated 10,000 times, and the average ranks for all 107 mutations across all iterations were calculated. Fig S8B summarizes the top features identified by this approach.

Production and testing of pseudoviruses containing the Env mutations

All 59 mutations suspected of increasing Env resistance to TMR were introduced by site-directed mutagenesis into a pSVIII vector that expresses the Env of strain AD8 under control of the LTR promoter (88). Pseudoviruses that contain the variants were generated by transfection of HEK 293T cells. Briefly, 9.5 × 10⁵ cells were seeded in each well of a 6-well plate, and transfected the next day with 0.4 μg of the HIV-1 packaging construct pCMVΔP1ΔenvpA, 1.2 μg of the firefly luciferase-expressing construct pHIvec2.luc, 0.4 μg of pSVIII expressing HIV-1 Env, 0.2 μg of pRev expressing HIV-1 Rev, and 4.2 μL of JetPrime reagent (PolyPlus). The medium was replaced the next day, and virus-containing supernatant was collected 24 h later. Samples were cleared of cell debris by centrifugation at 800 × g and filtered through 0.45 μm pore sized membranes.

As a measure of virus particle content, we quantified the reverse transcriptase activity in the samples using a modified version of a previously published protocol (49), where TaqMan chemistry was used in place of SYBR Green.
In brief, RT-qPCR reactions (25 μl total volume) were prepared with TaqMan Gene Expression Master Mix (ThermoFisher) and contained 2.5 mU of MS2 Bacteriophage RNA (Roche), 1 μM of MS2 Forward Primer (5’-TCCTGCTCAACTTCCTGTCGAG-3’), 1 μM of MS2 Reverse Primer (5’-CACAGGTCAAACCTCCTAGGAATG-3’), 200 nM Probe (6[FAM] CGAGACGCTACCATGGCTATCGCTGTAG [TAMsp]), 0.5 U RNase Inhibitor (Fermentas, EO0381), and 2 μl of pseudovirus sample. Pseudovirus samples were lysed in 0.125% Triton X-100, 50 mM KCl, 100 mM Tris HCl (pH 7.4), 0.4 U/μl RNase Inhibitor, and 20% glycerol. A standard curve was generated using a dilution series of recombinant HIV reverse transcriptase (NIH AIDS Reagent Program, Cat. No. 12583) ranging from 10⁴ to 10¹² pU/μl prepared in the same lysis buffer. Reactions were run in 0.1 ml MicroAmp plates (Applied Biosystems, 4306737) on a QuantStudio 3 Real-Time PCR System under the following cycling conditions: 42 °C for 20 min, 50 °C for 2 min, 95 °C for 10 min, followed by 50 cycles of 95 °C for 15 s and 60 °C for 1 min. Virus infectivity was expressed as the mean luciferase activity measured for each virus stock (in relative light units, see below) divided by the reverse transcriptase activity in that sample.

To measure sensitivity of the variants to TMR (BMS-626529, MedChemExpress) or plasma from HIV-infected individuals, Cf2Th-CD4+CCR5+ cells were seeded in 96-well opaque white plates at 2 × 10⁴ cells per well and infected the next day.
For neutralization assays, pseudovirus samples were pre-incubated with the TMR or plasma for one hour at 37°C. Samples were then added to the target cells and incubated for 3 days in a 37°C 5% CO2 incubator. To measure infection, the medium was removed, 35 μL of passive lysis buffer (Promega) was added, and samples were subjected to three freeze-thaw cycles. To measure luciferase activity, 100 μL of luciferin buffer (15 mM MgSO4, 15 mM KPO4 [pH 7.6], 1 mM ATP, and 1 mM dithiothreitol) and 50 μL of 1 mM d-luciferin potassium salt (Syd Labs, MA) were added to each sample. Luminescence was recorded using a Synergy H1 microplate reader (BioTek).

To measure the decay rate of virus infectivity at 37°C, pseudovirus stocks were divided into aliquots (one sample for each time point), and all samples were snap-frozen on dry ice immersed in ethanol and stored at −80°C. At different time points, samples were thawed in a 37°C water bath for 2 min and then further incubated at 37°C for 2-48 hours. All samples were subsequently added to the Cf2Th-CD4+CCR5+ cells and infectivity measured 3 days later by luciferase activity. All tests of infectivity and neutralization were performed in at least three independent experiments that contain three replicates each. Standard errors of the mean were used to quantify the variability in the data.

Statistical analysis of mutation enrichment in the escape group after treatment

We compared the frequency of all 59 suspected mutations in the samples collected before and during treatment in the escape group. The escape group was defined as participants for whom all pre-treatment samples had IC50 values lower than 50 nM TMR, and at least one on-treatment sample with an IC50 greater than 50 nM.
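The escape-group definition above can be written as a short membership rule. The sketch below is illustrative only: the function name `in_escape_group`, the participant identifiers and the IC50 values are hypothetical, and treating participants with no pre-treatment or no on-treatment sample as non-members is an assumption not stated in the text.

```python
RESISTANCE_THRESHOLD_NM = 50.0  # IC50 threshold for TMR resistance, in nM


def in_escape_group(pre_treatment_ic50, on_treatment_ic50,
                    threshold=RESISTANCE_THRESHOLD_NM):
    """True if every pre-treatment IC50 is below the threshold and
    at least one on-treatment IC50 is above it."""
    # Assumption: participants lacking either sample type are not classified.
    if not pre_treatment_ic50 or not on_treatment_ic50:
        return False
    all_pre_sensitive = all(value < threshold for value in pre_treatment_ic50)
    any_on_resistant = any(value > threshold for value in on_treatment_ic50)
    return all_pre_sensitive and any_on_resistant


# Hypothetical IC50 values (nM), used only to exercise the rule.
participants = {
    "P1": {"pre": [2.0, 5.5], "on": [12.0, 430.0]},  # sensitive, then resistant
    "P2": {"pre": [80.0], "on": [900.0]},            # resistant before treatment
    "P3": {"pre": [1.0], "on": [3.0, 8.0]},          # remained sensitive
}
escape_group = sorted(
    pid for pid, samples in participants.items()
    if in_escape_group(samples["pre"], samples["on"])
)
print(escape_group)
```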
To determine enrichment of mutations after treatment in these 65 participants, we tested the null hypothesis that the frequency of the mutations in the pre-treatment samples was equal to or greater than their frequency in the post-treatment samples. Briefly, for each subject we merged all amino acids at each position into a single pre-treatment sample or post-treatment sample. We then calculated for each of the 59 suspected mutations the ratio between the mutation frequency in the post-treatment samples relative to the pre-treatment samples. If a participant contained a mutation in both pre- and post-treatment samples, the individual was excluded from the analysis of that mutation. Timepoint identifiers were then permuted 10,000 times and for each iteration the ratio was calculated. To avoid dividing by zero, a small constant value was added to the denominator. The fraction of iterations in which the ratio for the permuted data was the same or larger than the non-permuted data was applied as the P-value for this one-sided test.

Rate of independent substitution events in HIV-1 clade B

We calculated the emergence frequency of the 18 REMs in HIV-1 clade B. To this end, we downloaded Env amino acid sequences from the LANL database, and aligned them to the HXBc2 strain. All sequences that contain any ambiguity or stop codons were removed, as well as sequences within 0.05 amino acid substitutions per site from any other sequence. The resulting dataset consisted of 2,535 Env sequences.
We note that while no information associated with these sequences indicates whether the hosts were untreated with fostemsavir, the sample collection dates (99.2% collected before FDA approval of fostemsavir) and the sources of the remaining samples render the likelihood that they derive from fostemsavir-treated individuals negligible. Phylogenetic relationships between sequences were calculated using FastTree (89) on the Galaxy platform (90), and the tree was divided into sublineages comprising 80 to 150 sequences each using the depth-first search algorithm (91). All sublineages were processed using HyPhy SLAC (92) on Galaxy to infer the most likely ancestral sequence. Custom Python code was then used to calculate the rate of independent mutation events at each Env position, using the SLAC results as input. In brief, each subgroup tree was recursively searched to determine all substitution events relative to the prior inferred node. A subgroup was ignored if the majority variant differed from the clade consensus sequence. The rate of independent emergence events for each amino acid was calculated as the ratio between the number of emergence events and the total number of nodes. Daughter nodes of inferred substitution events were excluded from this count unless a new substitution event occurred. The code is available through our GitHub repository at https://github.com/haimlab/Independent_mutation_Tree.

Cell-based ELISA to measure binding of antibodies to cell-surface Env variants

To measure effects of the mutations on Env antigenicity, we expressed the different variants on the surface of human osteosarcoma (HOS) cells and measured binding of monoclonal antibodies by cell-based ELISA, as we described previously (7, 59, 93).
Briefly, HOS cells were seeded in 96-well plates (1.4×10⁴ cells per well) and transfected the next day with plasmids that express Env, Tat, and Rev, using 60, 11, and 6 ng of plasmid per well, respectively, and 0.12 μL per well of JetPrime reagent. Background antibody binding was quantified using cells transfected with a pSVIII construct containing a premature stop codon at Env position 46. Three days later, cells were washed in blocking buffer (BB) composed of a Tris-saline (TS) buffer (140 mM NaCl, 1.8 mM CaCl2, 1 mM MgCl2, and 25 mM Tris [pH 7.5]) supplemented with 3% bovine serum albumin and 1.1% skim milk. Cells were then incubated for 45 min at room temperature in BB containing the antibodies at the following concentrations: 17b and F105 at 5 μg/mL; 39F and 10E8 at 2 μg/mL; CD4-Ig, N6, 447-52D, 35O22, PGT145 and 2G12 at 1 μg/mL. Cells were then washed six times with BB and incubated with a horseradish peroxidase-conjugated goat anti-human IgG for 1 h at room temperature. Cells were subsequently washed six times with BB and six times with TS buffer. Antibody binding was measured by chemiluminescence using 35 μL per well of a 1:1 mix of SuperSignal West Pico chemiluminescent peroxide and luminol enhancer solutions (Thermo Scientific) supplemented with 150 mM NaCl on a Synergy H1 microplate reader. To normalize antibody binding values to the expression level of each Env, we also measured in each experiment their recognition by antibody 2G12, which targets an exposed epitope on the high-mannose patch of gp120 (94).
The antibody-to-2G12 binding ratio was calculated for each Env variant, and the value was expressed as a fraction of this ratio measured for the wild-type AD8 Env. Binding assays were conducted at least three times in independent experiments, each with three replicate samples. Standard error values were used to describe the variability in the data.

Saturation mutagenesis to determine effects on Env fitness and resistance to TMR

To introduce a degenerate codon that encodes all 20 amino acids (and one stop codon), we used primers that contain the trinucleotide sequence NNK at position 375 or 426 (see schematic of approach and all primers in Fig S17). N represents any nucleotide, and K represents G or T. The NNK-containing primers were used to amplify an Env segment with PrimeSTAR Max DNA Polymerase (Takara), using as template the proviral vector pNL4-3 that encodes the Env of strain AD8. This fragment was then joined to a second fragment by overlapping PCR to generate a combined fragment that spans the entire env gene. This product was then cloned into the env-deleted pNLAD8 vector (GenBank ID PV345784) using In-Fusion assembly (Takara), and the product was transformed into Zymo Mix&Go DH5α chemically competent cells. At least 350 colonies for each library were collected and pooled, and plasmids were purified using ZR Plasmid Miniprep-Classic. Two micrograms of the provirus library were then used to transfect 1×10⁶ HEK 293T cells cultured in wells of 6-well plates using JetPrime reagent. The medium was changed after 4 hours, and 48 hours after transfection the virus was harvested, passed through 0.45 μm filters, and treated with 100 U/mL DNase I (Roche) for 30 min at 37°C. Samples were then divided into aliquots, snap-frozen using dry ice immersed in ethanol, and stored at −80°C until use.
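The property of the NNK codon noted in this subsection (coverage of all 20 amino acids with a single stop codon) can be checked by enumeration. The sketch below uses the standard genetic code; the function names are illustrative and are not part of the study's software.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_codons():
    """All codons matching NNK: N is any nucleotide, K is G or T."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def summarize_nnk():
    """Map each encoded residue ('*' denotes stop) to the NNK codons that encode it."""
    encoded = {}
    for codon in nnk_codons():
        encoded.setdefault(CODON_TABLE[codon], []).append(codon)
    return encoded


summary = summarize_nnk()
amino_acids = sorted(aa for aa in summary if aa != "*")
print(len(nnk_codons()), "codons;", len(amino_acids), "amino acids; stop codons:", summary["*"])
```

Because the 32 NNK codons are not distributed evenly among residues (for example, three encode leucine but only one encodes methionine), amino acids are not equally represented in an NNK library before any selection is applied.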
Virus titers were determined by plaque assay after a 96-hour infection of TZM-bl-GFP cells (BEI Resources, HRP-20041, contributed by David G. Russell and David W. Gludish) in DMEM/FCS supplemented with 20 μg/mL DEAE dextran.

To eliminate mixed-allele virions from the samples, the above virus libraries were used to infect 2×10⁶ A3R5.7 acute lymphoblastic leukemia T cells in 2 mL at an MOI of 0.003 in RPMI/FCS supplemented with 20 μg/mL DEAE dextran. Cells were resuspended in 4 mL fresh culture medium 24 hours after infection, and 2 days later the virus was harvested and filtered as above. Titers were determined using TZM-bl-GFP cells, and the samples were used to infect a culture of 1×10⁶ A3R5.7 cells in RPMI/FCS supplemented with 40 μg/mL DEAE dextran at an MOI of 0.05. These infections were performed in the absence or presence of 250 nM TMR. At 16 hours post-infection, cells were pelleted and non-integrated viral DNA was purified using the QIAprep Spin Miniprep kit (Qiagen). A 900-bp region encompassing positions 375 and 426 was then amplified using PrimeSTAR Max DNA Polymerase. For each biological replicate, an initial 30-cycle amplification was performed in triplicate, and the samples were pooled and gel-purified using the Zymoclean Gel DNA Recovery Kit (Zymo). If necessary, an additional 8-12 rounds of PCR were performed to obtain sufficient sample for sequencing, which was purified using the QIAquick PCR Purification Kit (Qiagen).
To sequence the input virus used for the second round of A3R5.7 cell infection, viral RNA was purified using the Quick RNA Viral Kit (Zymo), reverse transcribed using SuperScript IV (Invitrogen), and PCR-amplified using the same primers used with the purified product from the infected cells. Finally, all samples were sequenced using Oxford Nanopore technology (Plasmidsaurus), which provided an average of 5,000 reads per sample.

Sequence data (in fastq format) were used to calculate amino acid preferences at positions 375 and 426 in the absence and presence of TMR, based on the method described by Haddox and colleagues (50). This approach calculates the frequency of each amino acid in viral DNA isolated from cells infected by the virus library relative to its frequency in the virus library used for infection. To determine whether calculations should be corrected for the error rate in the sequencing reactions, we first examined the incorrect variant calls at the two positions by sequencing three samples infected by the wild-type virus. The average frequency of minority variants across the six samples was 0.13% (standard deviation 0.12%). Given this low error rate, we performed subsequent calculations in an error-agnostic manner.

To calculate the enrichment ratio ($\phi$) for each amino acid $a$ at position $p$, we used:

\[
\phi_{p,a} = \frac{f^{\mathrm{cell}}_{p,a} \,/\, f^{\mathrm{cell}}_{p,wt(p)}}{f^{\mathrm{virus}}_{p,a} \,/\, f^{\mathrm{virus}}_{p,wt(p)}}
\]

where $f^{\mathrm{cell}}_{p,a}$ and $f^{\mathrm{cell}}_{p,wt(p)}$ are the frequencies in the cell lysate of amino acid $a$ or the wild-type ($wt$) amino acid for position $p$, and $f^{\mathrm{virus}}_{p,a}$ and $f^{\mathrm{virus}}_{p,wt(p)}$ are the frequencies of these forms in the input virus sample used for infection. Finally, we calculated the amino acid preference ($\pi$) for each amino acid as:

\[
\pi_{p,a} = \frac{\phi_{p,a}}{\sum_{a'} \phi_{p,a'}}
\]

where $\sum_{a'} \phi_{p,a'}$ is the sum of enrichments for all amino acids $a'$ at position $p$. A complete software package to calculate amino acid preferences is available through our GitHub repository at https://github.com/haimlab/EZDMS.

Algorithms to estimate appearance and persistence of REMs

To evaluate the contribution of the different variables to the observed mutational profiles in treated individuals, we designed two algorithms (see flowcharts in Fig S15B and S15C). In both, mutation appearance and persistence were modeled as stochastic events with mutation-specific probabilities. The first algorithm calculated, for each of the 18 REMs individually, the probability of appearance based on the number and type of nucleotide changes required from the initial trinucleotide sequence. To test the probability for REM appearance in clade B, the algorithm was initiated with the trinucleotide sequence at the Env position in the clade B consensus sequence. To test the probability for REM appearance in the 65 participants of the escape group, the algorithm was initiated with the trinucleotide sequence in the pre-treatment samples of these participants.
One of the three nucleotide sites was selected randomly, and a nucleotide substitution was randomly introduced at that site based on the transition and transversion likelihoods determined by Martinez Del Rio et al. (15) (see values in Fig S15A). Up to two consecutive mutations were allowed. If the desired REM was acquired within the two attempts, a success event was recorded. For each REM, the algorithm was repeated 1,000 times for the clade B consensus sequence, and 1,000 times for each escape group participant. The fraction of success events among the 1,000 iterations in each participant was defined as their probability for mutation appearance.

The second algorithm (Fig S15C) introduced an additional module to the mutation appearance step, which aimed to capture the ability of each REM to persist in the host. It was based on the in vitro measured effects of the REMs on: (i) infectivity, (ii) functional stability at 37°C, and (iii) resistance to non-neutralizing plasma. The three values measured for each REM were expressed as a fraction of the wild-type AD8 Env values, and their product was defined as the Combined Fitness value of each REM. For each success event from the appearance module, persistence was determined by a random process, in which the Combined Fitness value was used as the probability of the REM to persist. If the mutation persisted, a success event was recorded. The fraction of success events among the 1,000 iterations was defined as the probability for REM emergence.
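The two modules described above can be sketched as a small Monte Carlo simulation. This is a simplified illustration, not the authors' implementation: the single `p_transition` parameter and its default value are placeholders standing in for the substitution likelihoods of Fig S15A, and the example codon, target residue, and fitness values are hypothetical.

```python
import random
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}


def mutate_base(base, p_transition, rng):
    """Substitute one nucleotide: a transition with probability p_transition,
    otherwise one of the two transversions chosen uniformly (placeholder model)."""
    if rng.random() < p_transition:
        return TRANSITION[base]
    return rng.choice([b for b in "ACGT" if b not in (base, TRANSITION[base])])


def appearance_trial(codon, target_aa, p_transition, rng, max_steps=2):
    """Apply up to max_steps consecutive substitutions at randomly chosen sites;
    succeed as soon as the codon encodes the target amino acid."""
    current = codon
    for _ in range(max_steps):
        site = rng.randrange(3)
        new_base = mutate_base(current[site], p_transition, rng)
        current = current[:site] + new_base + current[site + 1:]
        if CODON_TABLE[current] == target_aa:
            return True
    return False


def emergence_probability(codon, target_aa, combined_fitness=1.0,
                          p_transition=0.7, iterations=1000, seed=0):
    """Fraction of iterations in which the mutation appears and then persists.
    With combined_fitness=1.0 this reduces to the appearance-only algorithm;
    values above 1 behave as 1."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(iterations):
        if appearance_trial(codon, target_aa, p_transition, rng) and rng.random() < combined_fitness:
            successes += 1
    return successes / iterations


# Hypothetical example: serine codon AGT mutating to asparagine (AAT or AAC).
p_appear = emergence_probability("AGT", "N", combined_fitness=1.0)
p_emerge = emergence_probability("AGT", "N", combined_fitness=0.5)
print(p_appear, p_emerge)
```

In this sketch the Combined Fitness value would be supplied as the product of the three measurements expressed relative to wild type, as described in the text.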
Performance was evaluated by compiling the probability values for the 18 REMs in the 65 participants of the escape group, which were compared with their outcomes: the absence (0) or presence (1) of REM emergence in the participant after treatment. Performance was measured using the AUC metric. The code for the above algorithm can be found on our GitHub repository at https://github.com/haimlab/REM-Emergence-Modeling.
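For readers unfamiliar with the metric, the sketch below computes an AUC from predicted probabilities and binary outcomes. It assumes that AUC here refers to the area under the receiver operating characteristic curve, which the text does not specify, and it uses the rank-based (Mann-Whitney) formulation. The function name and the example values are illustrative.

```python
def roc_auc(scores, labels):
    """Probability that a randomly chosen positive case (label 1) receives a higher
    score than a randomly chosen negative case (label 0); ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted emergence probabilities and observed outcomes (1 = REM emerged).
predicted = [0.30, 0.05, 0.22, 0.01, 0.12]
observed = [1, 0, 1, 0, 0]
print(roc_auc(predicted, observed))
```

An AUC of 0.5 corresponds to predictions no better than chance, and 1.0 to perfect separation of the participants in whom a REM emerged from those in whom it did not.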

Acknowledgements

This work was supported by National Institutes of Health (NIH) grant R01 AI170205 to HH, and by ViiV Healthcare Investigator Sponsored Study (ID 4104) to HH. The funders played no role in study design, data analysis, in vitro data acquisition or analysis, or the decision to publish this work. The authors are grateful to Anthony Roberth Rojas Chávez for assistance with processing of the sequence data.

SUPPLEMENTAL DATA FILES

Data File S1. Amino acid sequence and TMR resistance values for the 570 BRIGHTE trial samples. Sequences were obtained by the Phenosense GT assay. For each sample, the complete amino acid sequence is provided, as well as the sequence at the 856 positions of Env according to the HXBc2 reference strain. TMR IC50 values for the samples, as measured using the Phenosense GT assay, are indicated. Clade associations were inferred using the RIP tool.

Data File S2. Phylogenetic tree (in Newick format) of 360 pre-treatment samples from BRIGHTE subjects. The tree is rooted to the Env of the HXBc2 reference strain. These data were used to generate the tree in Fig S2A.

Data File S3. Amino acid sequence and TMR resistance values for the 208 Envs of the Single-Env dataset. Accession numbers reported in Pancera et al. (Nat Chem Biol, 2017) were used to download the amino acid sequences of the 208 Envs included in our study. Aligned amino acid sequences are provided for all Env positions and for the 856 positions of Env according to the HXBc2 reference strain.
TMR IC50 values for the Envs were measured using a TZM-bl neutralization assay.

Data File S4. GB regressor model to identify mutations that affect HIV-1 resistance to TMR. Env positions within 7.5 Å of the TMR molecule were used as input for the model. SHAP values for all residues at these positions were averaged across all 778 samples used for the prediction. Mean SHAP values as well as the values for all 778 samples are provided.

Data File S5. Probabilistic model to detect mutations that impact Env resistance to TMR. Mutations at Env positions within 7.5 Å of the TMR molecule were used as input for the model using the 778 samples of the BRIGHTE trial and Single-Env datasets. Only mutations that appear in less than 5% of all samples were analyzed. The average IC50 value for all samples that contain each mutation and the variability (measured by the standard deviation) are shown, and were used to rank the mutations according to their likelihood for impacting Env resistance to TMR.

Data File S6. Amino acid alignment (in fasta format) of 2,535 Env sequences from HIV-1 clade B.

Data File S7. Emergence frequency of amino acid variants at all Env positions in HIV-1 clade B. Values were calculated using the phylogenetic tree constructed from 2,535 amino acid sequences of clade B Envs. They describe the rate of substitution to each amino acid (from the clade ancestral form) at the 856 positions of Env according to HXBc2 numbering. The number of sequences in each subgroup is indicated above each column.

Data File S8. Phylogenetic tree (in Newick format) of 2,535 Env sequences from HIV-1 clade B.

References

1. Mansky LM, Temin HM. 1995. Lower in-Vivo Mutation-Rate of Human-Immunodeficiency-Virus Type-1 Than That Predicted from the Fidelity of Purified Reverse-Transcriptase. J Virol 69:5087-5094.
2. Sanjuan R, Nebot MR, Chirico N, Mansky LM, Belshaw R. 2010. Viral mutation rates. J Virol 84:9733-48.
3. Ji JP, Loeb LA. 1992. Fidelity of HIV-1 reverse transcriptase copying RNA in vitro. Biochemistry 31:954-8.
4. Yeo JY, Goh GR, Su CT, Gan SK. 2020. The Determination of HIV-1 RT Mutation Rate, Its Possible Allosteric Effects, and Its Implications on Drug Resistance. Viruses 12.
5. Duncan CJ, Sattentau QJ. 2011. Viral determinants of HIV-1 macrophage tropism. Viruses 3:2255-79.
6. Delobel P, Sandres-Saune K, Cazabat M, Pasquier C, Marchou B, Massip P, Izopet J. 2005. R5 to X4 switch of the predominant HIV-1 population in cellular reservoirs during effective highly active antiretroviral therapy. J Acquir Immune Defic Syndr 38:382-92.
7. Rojas Chavez RA, Boyt D, Schwery N, Han C, Wu L, Haim H. 2022. Commonly Elicited Antibodies against the Base of the HIV-1 Env Trimer Guide the Population-Level Evolution of a Structure-Regulating Region in gp41. J Virol 96:e0040622.
8. Beitari S, Wang Y, Liu SL, Liang C. 2019. HIV-1 Envelope Glycoprotein at the Interface of Host Restriction and Virus Evasion. Viruses 11.
9. Menendez-Arias L. 2013. Molecular basis of human immunodeficiency virus type 1 drug resistance: overview and recent developments. Antiviral Res 98:93-120.
10. Noe A, Plum J, Verhofstede C. 2005. The latent HIV-1 reservoir in patients undergoing HAART: an archive of pre-HAART drug resistance. J Antimicrob Chemother 55:410-2.
11. Gonzalez-Ortega E, Ballana E, Badia R, Clotet B, Este JA. 2011. Compensatory mutations rescue the virus replicative capacity of VIRIP-resistant HIV-1. Antiviral Res 92:479-83.
12. Lynch RM, Boritz E, Coates EE, DeZure A, Madden P, Costner P, Enama ME, Plummer S, Holman L, Hendel CS, Gordon I, Casazza J, Conan-Cibotti M, Migueles SA, Tressler R, Bailer RT, McDermott A, Narpala S, O'Dell S, Wolf G, Lifson JD, Freemire BA, Gorelick RJ, Pandey JP, Mohan S, Chomont N, Fromentin R, Chun TW, Fauci AS, Schwartz RM, Koup RA, Douek DC, Hu Z, Capparelli E, Graham BS, Mascola JR, Ledgerwood JE, Team VRCS. 2015. Virologic effects of broadly neutralizing antibody VRC01 administration during chronic HIV-1 infection. Sci Transl Med 7:319ra206.
13. Van Duyne R, Kuo LS, Pham P, Fujii K, Freed EO. 2019. Mutations in the HIV-1 envelope glycoprotein can broadly rescue blocks at multiple steps in the virus replication cycle. Proc Natl Acad Sci U S A 116:9040-9049.
14. Abram ME, Ferris AL, Das K, Quinones O, Shao W, Tuske S, Alvord WG, Arnold E, Hughes SH. 2014. Mutations in HIV-1 reverse transcriptase affect the errors made in a single cycle of viral replication. J Virol 88:7589-601.
15. Martinez Del Rio J, Frutos-Beltran E, Sebastian-Martin A, Lasala F, Yasukawa K, Delgado R, Menendez-Arias L. 2024. HIV-1 Reverse Transcriptase Error Rates and Transcriptional Thresholds Based on Single-strand Consensus Sequencing of Target RNA Derived From In Vitro-transcription and HIV-infected Cells. J Mol Biol 436:168815.
16. Rawson JM, Landman SR, Reilly CS, Mansky LM. 2015. HIV-1 and HIV-2 exhibit similar mutation frequencies and spectra in the absence of G-to-A hypermutation. Retrovirology 12:60.
17. DeLeon O, Hodis H, O'Malley Y, Johnson J, Salimi H, Zhai Y, Winter E, Remec C, Eichelberger N, Van Cleave B, Puliadi R, Harrington RD, Stapleton JT, Haim H. 2017. Accurate predictions of population-level changes in sequence and structural properties of HIV-1 Env using a volatility-controlled diffusion model. PLoS Biol 15:e2001549.
18. Han C, Johnson J, Dong R, Kandula R, Kort A, Wong M, Yang T, Breheny PJ, Brown GD, Haim H. 2020. Key Positions of HIV-1 Env and Signatures of Vaccine Efficacy Show Gradual Reduction of Population Founder Effects at the Clade and Regional Levels. mBio 11.
19. Markham A. 2020. Fostemsavir: First Approval. Drugs 80:1485-1490.
20. Wang T, Ueda Y, Zhang Z, Yin Z, Matiskella J, Pearce BC, Yang Z, Zheng M, Parker DD, Yamanaka GA, Gong YF, Ho HT, Colonno RJ, Langley DR, Lin PF, Meanwell NA, Kadow JF. 2018. Discovery of the Human Immunodeficiency Virus Type 1 (HIV-1) Attachment Inhibitor Temsavir and Its Phosphonooxymethyl Prodrug Fostemsavir. J Med Chem 61:6308-6327.
21. Aberg JA, Shepherd B, Wang M, Madruga JV, Mendo Urbina F, Katlama C, Schrader S, Eron JJ, Kumar PN, Sprinz E, Gartland M, Chabria S, Clark A, Pierce A, Lataillade M, Tenorio AR. 2023. Week 240 Efficacy and Safety of Fostemsavir Plus Optimized Background Therapy in Heavily Treatment-Experienced Adults with HIV-1. Infect Dis Ther 12:2321-2335.
22. Kozal M, Aberg J, Pialoux G, Cahn P, Thompson M, Molina JM, Grinsztejn B, Diaz R, Castagna A, Kumar P, Latiff G, DeJesus E, Gummel M, Gartland M, Pierce A, Ackerman P, Llamoso C, Lataillade M, Team BT. 2020. Fostemsavir in Adults with Multidrug-Resistant HIV-1 Infection. N Engl J Med 382:1232-1243.
23. Lataillade M, Lalezari JP, Kozal M, Aberg JA, Pialoux G, Cahn P, Thompson M, Molina JM, Moreno S, Grinsztejn B, Diaz RS, Castagna A, Kumar PN, Latiff GH, De Jesus E, Wang M, Chabria S, Gartland M, Pierce A, Ackerman P, Llamoso C. 2020. Safety and efficacy of the HIV-1 attachment inhibitor prodrug fostemsavir in heavily treatment-experienced individuals: week 96 results of the phase 3 BRIGHTE study. Lancet HIV 7:e740-e751.
24. Benlarbi M, Richard J, Bourassa C, Tolbert WD, Chartrand-Lefebvre C, Gendron-Lepage G, Sylla M, El-Far M, Messier-Peet M, Guertin C, Turcotte I, Fromentin R, Verly MM, Prevost J, Clark A, Mothes W, Kaufmann DE, Maldarelli F, Chomont N, Begin P, Tremblay C, Baril JG, Trottier B, Trottier S, Duerr R, Pazgier M, Durand M, Finzi A. 2024. Plasma Human Immunodeficiency Virus 1 Soluble Glycoprotein 120 Association With Correlates of Immune Dysfunction and Inflammation in Antiretroviral Therapy-Treated Individuals With Undetectable Viremia. J Infect Dis 229:763-774.
25. Richard J, Prevost J, Bourassa C, Brassard N, Boutin M, Benlarbi M, Goyette G, Medjahed H, Gendron-Lepage G, Gaudette F, Chen HC, Tolbert WD, Smith AB, 3rd, Pazgier M, Dube M, Clark A, Mothes W, Kaufmann DE, Finzi A. 2023. Temsavir blocks the immunomodulatory activities of HIV-1 soluble gp120. Cell Chem Biol 30:540-552.e6.
26. Clark A, Prakash M, Chabria S, Pierce A, Castillo-Mancilla JR, Wang M, Du F, Tenorio AR. 2024. Inflammatory Biomarker Reduction With Fostemsavir Over 96 Weeks in Heavily Treatment-Experienced Adults With Multidrug-Resistant HIV-1 in the BRIGHTE Study. Open Forum Infect Dis 11:ofae469.
27. Gartland M, Cahn P, DeJesus E, Diaz RS, Grossberg R, Kozal M, Kumar P, Molina JM, Mendo Urbina F, Wang M, Du F, Chabria S, Clark A, Garside L, Krystal M, Mannino F, Pierce A, Ackerman P, Lataillade M. 2022. Week 96 Genotypic and Phenotypic Results of the Fostemsavir Phase 3 BRIGHTE Study in Heavily Treatment-Experienced Adults Living with Multidrug-Resistant HIV-1. Antimicrob Agents Chemother doi:10.1128/aac.01751-21:e0175121.
28. Brown J, Chien C, Timmins P, Dennis A, Doll W, Sandefer E, Page R, Nettles RE, Zhu L, Grasela D. 2013. Compartmental absorption modeling and site of absorption studies to determine feasibility of an extended-release formulation of an HIV-1 attachment inhibitor phosphate ester prodrug. J Pharm Sci 102:1742-1751.
29. Heidary M, Shariati S, Nourigheimasi S, Khorami M, Moradi M, Motahar M, Bahrami P, Akrami S, Kaviar VH. 2024. Mechanism of action, resistance, interaction, pharmacokinetics, pharmacodynamics, and safety of fostemsavir. BMC Infect Dis 24:250.
30. Lai YT. 2021. Small Molecule HIV-1 Attachment Inhibitors: Discovery, Mode of Action and Structural Basis of Inhibition. Viruses 13.
31. Langley DR, Kimura SR, Sivaprakasam P, Zhou N, Dicker I, McAuliffe B, Wang T, Kadow JF, Meanwell NA, Krystal M. 2015. Homology models of the HIV-1 attachment inhibitor BMS-626529 bound to gp120 suggest a unique mechanism of action. Proteins 83:331-50.
32. Lataillade M, Zhou N, Joshi SR, Lee S, Stock DA, Hanna GJ, Krystal M, team AIs. 2018. Viral Drug Resistance Through 48 Weeks, in a Phase 2b, Randomized, Controlled Trial of the HIV-1 Attachment Inhibitor Prodrug, Fostemsavir. J Acquir Immune Defic Syndr 77:299-307.
33. Gartland M, Zhou N, Stewart E, Pierce A, Clark A, Ackerman P, Llamoso C, Lataillade M, Krystal M. 2021. Susceptibility of global HIV-1 clinical isolates to fostemsavir using the PhenoSense(R) Entry assay. J Antimicrob Chemother 76:648-652.
34. Ray N, Hwang C, Healy MD, Whitcomb J, Lataillade M, Wind-Rotolo M, Krystal M, Hanna GJ. 2013. Prediction of virological response and assessment of resistance emergence to the HIV-1 attachment inhibitor BMS-626529 during 8-day monotherapy with its prodrug BMS-663068. J Acquir Immune Defic Syndr 64:7-15.
35. Zhou N, Nowicka-Sans B, McAuliffe B, Ray N, Eggers B, Fang H, Fan L, Healy M, Langley DR, Hwang C, Lataillade M, Hanna GJ, Krystal M. 2014. Genotypic correlates of susceptibility to HIV-1 attachment inhibitor BMS-626529, the active agent of the prodrug BMS-663068. J Antimicrob Chemother 69:573-81.
36. Prevost J, Chen Y, Zhou F, Tolbert WD, Gasser R, Medjahed H, Nayrac M, Nguyen DN, Gottumukkala S, Hessell AJ, Rao VB, Pozharski E, Huang RK, Matthies D, Finzi A, Pazgier M. 2023. Structure-function analyses reveal key molecular determinants of HIV-1 CRF01_AE resistance to the entry inhibitor temsavir. Nat Commun 14:6710.
37. Gartland M, Arnoult E, Foley BT, Lataillade M, Ackerman P, Llamoso C, Krystal M. 2021. Prevalence of gp160 polymorphisms known to be related to decreased susceptibility to temsavir in different subtypes of HIV-1 in the Los Alamos National Laboratory HIV Sequence Database. J Antimicrob Chemother 76:2958-2964.
38. Pancera M, Lai YT, Bylund T, Druz A, Narpala S, O'Dell S, Schon A, Bailer RT, Chuang GY, Geng H, Louder MK, Rawi R, Soumana DI, Finzi A, Herschhorn A, Madani N, Sodroski J, Freire E, Langley DR, Mascola JR, McDermott AB, Kwong PD. 2017. Crystal structures of trimeric HIV envelope with entry inhibitors BMS-378806 and BMS-626529. Nat Chem Biol 13:1115-1122.
39. Zuze BJL, Radibe BT, Choga WT, Bareng OT, Moraka NO, Maruapula D, Seru K, Mokgethi P, Mokaleng B, Ndlovu N, Kelentse N, Pretorius-Holme M, Shapiro R, Lockman S, Makhema J, Novitsky V, Seatla KK, Moyo S, Gaseitsiwe S. 2023. Fostemsavir resistance-associated polymorphisms in HIV-1 subtype C in a large cohort of treatment-naive and treatment-experienced individuals in Botswana. Microbiol Spectr 11:e0125123.
40. Hanna GJ, Lalezari J, Hellinger JA, Wohl DA, Nettles R, Persson A, Krystal M, Lin P, Colonno R, Grasela DM. 2011. Antiviral activity, pharmacokinetics, and safety of BMS-488043, a novel oral small-molecule HIV-1 attachment inhibitor, in HIV-1-infected subjects. Antimicrob Agents Chemother 55:722-8.
41. Nettles RE, Schurmann D, Zhu L, Stonier M, Huang SP, Chang I, Chien C, Krystal M, Wind-Rotolo M, Ray N, Hanna GJ, Bertz R, Grasela D. 2012. Pharmacodynamics, safety, and pharmacokinetics of BMS-663068, an oral HIV-1 attachment inhibitor in HIV-1-infected subjects. J Infect Dis 206:1002-11.
42. Korber B, Foley B, Kuiken C, Pillai S, Sodroski J. 1998. Numbering positions in HIV relative to HXBc2. Los Alamos: Los Alamos Natl Lab:iii-102-iii-103.
43. Chen TQ, Guestrin C. 2016. XGBoost: A Scalable Tree Boosting System. Kdd'16: Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining doi:10.1145/2939672.2939785:785-794.
44. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee SI. 2020. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat Mach Intell 2:56-67.
45. Gartland M, Stewart E, Zhou N, Li Z, Rose R, Beloor J, Clark A, Tenorio AR, Krystal M. 2024. Characterization of clinical envelopes with lack of sensitivity to the HIV-1 inhibitors temsavir and ibalizumab. Antiviral Res 228:105957.
46. Zhou N, Fan L, Ho HT, Nowicka-Sans B, Sun Y, Zhu Y, Hu Y, McAuliffe B, Rose B, Fang H, Wang T, Kadow J, Krystal M, Alexander L, Colonno R, Lin PF. 2010. Increased sensitivity of HIV variants selected by attachment inhibitors to broadly neutralizing antibodies. Virology 402:256-61.
47. Haim H, Strack B, Kassa A, Madani N, Wang L, Courter JR, Princiotto A, McGee K, Pacheco B, Seaman MS, Smith AB, 3rd, Sodroski J. 2011. Contribution of intrinsic reactivity of the HIV-1 envelope glycoproteins to CD4-independent infection and global inhibitor sensitivity. PLoS Pathog 7:e1002101.
1027 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 33 48. Seaman MS, Janes H, Hawkins N, Grandpre LE, Devoy C, Giri A, Coffey RT, Harris L, 1028 Wood B, Daniels MG, Bhattacharya T, Lapedes A, Polonis VR, McCutchan FE, Gilbert 1029 PB, Self SG, Korber BT, Montefiori DC, Mascola JR. 2010. Tiered categorization of a 1030 diverse panel of HIV -1 Env pseudoviruses for assessment of neutralizing antibodies. J 1031 Virol 84:1439-52. 1032 49. Vermeire J, Naessens E, Vanderstraeten H, Landi A, Iannucci V, Van Nuffel A, Taghon T, 1033 Pizzato M, Verhasselt B. 2012. Quantification of reverse transcriptase activity by real-time 1034 PCR as a fast and accurate method for titration of HIV, lenti- and retroviral vectors. PLoS 1035 One 7:e50859. 1036 50. Haddox HK, Dingens AS, Hilton SK, Overbaugh J, Bloom JD. 2018. Mapping mutational 1037 effects along the evolutionary landscape of HIV envelope. Elife 7. 1038 51. Guan Y, Pazgier M, Sajadi MM, Kamin-Lewis R, Al-Darmarki S, Flinko R, Lovo E, Wu X, 1039 Robinson JE, Seaman MS, Fouts TR, Gallo RC, DeVico AL, Lewis GK. 2013. Diverse 1040 specificity and effector function among human antibodies to HIV-1 envelope glycoprotein 1041 epitopes exposed by CD4 binding. Proc Natl Acad Sci U S A 110:E69-78. 1042 52. Richard J, Nguyen DN, Tolbert WD, Gasser R, Ding S, Vezina D, Yu Gong S, Prevost J, 1043 Gendron-Lepage G, Medjahed H, Gottumukkala S, Finzi A, Pazgier M. 2021. Across 1044 Functional Boundaries: Making Nonneutralizing Antibodies To Neutralize HIV -1 and 1045 Mediate Fc-Mediated Effector Killing of Infected Cells. mBio 12:e0140521. 1046 53. Forthal D, Hope TJ, Alter G. 2013. 
New paradigms for functional HIV -specific 1047 nonneutralizing antibodies. Curr Opin HIV AIDS 8:393-401. 1048 54. Mayr LM, Su B, Moog C. 2017. Non-Neutralizing Antibodies Directed against HIV and 1049 Their Functions. Front Immunol 8:1590. 1050 55. Horwitz JA, Bar-On Y, Lu CL, Fera D, Lockhart AAK, Lorenzi JCC, Nogueira L, Golijanin 1051 J, Scheid JF, Seaman MS, Gazumyan A, Zolla -Pazner S, Nussenzweig MC. 2017. Non-1052 neutralizing Antibodies Alter the Course of HIV-1 Infection In Vivo. Cell 170:637-648 e10. 1053 56. Mielke D, Bandawe G, Zheng J, Jones J, Abrahams MR, Bekker V, Ochsenbauer C, 1054 Garrett N, Abdool Karim S, Moore PL, Morris L, Montefiori D, Anthony C, Ferrari G, 1055 Williamson C. 2021. ADCC -mediating non -neutralizing antibodies can exert immune 1056 pressure in early HIV-1 infection. PLoS Pathog 17:e1010046. 1057 57. Santra S, Tomaras GD, Warrier R, Nicely NI, Liao HX, Pollara J, Liu P, Alam SM, Zhang 1058 R, Cocklin SL, Shen X, Duffy R, Xia SM, Schutte RJ, Pemble Iv CW, Dennison SM, Li H, 1059 Chao A, Vidnovic K, Evans A, Klein K, Kumar A, Robinson J, Landucci G, Forthal DN, 1060 Montefiori DC, Kaewkungwal J, Nitayaphan S, Pitisuttithum P, Rerks-Ngarm S, Robb ML, 1061 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 34 Michael NL, Kim JH, Soderberg KA, Giorgi EE, Blair L, Korber BT, Moog C, Shattock RJ, 1062 Letvin NL, Schmitz JE, Moody MA, Gao F, Ferrari G, Shaw GM, Haynes BF. 2015. Human 1063 Non-neutralizing HIV-1 Envelope Monoclonal Antibodies Limit the Number of Founder 1064 Viruses during SHIV Mucosal Infection in Rhesus Macaques. PLoS Pathog 11:e1005042. 1065 58. Haim H, Salas I, Sodroski J. 2013. 
Proteolytic processing of the human immunodeficiency 1066 virus envelope glycoprotein precursor decreases conformational flexibility. J Virol 1067 87:1884-9. 1068 59. Johnson J, Zhai Y, Salimi H, Espy N, Eichelberger N, DeLeon O, O'Malley Y, Courter J, 1069 Smith AB, 3rd, Madani N, Sodroski J, Haim H. 2017. Induction of a Tier-1-Like Phenotype 1070 in Diverse Tier -2 Isolates by Agents That Guide HIV -1 Env to Perturbation -Sensitive, 1071 Nonnative States. J Virol 91. 1072 60. Lee JH, Andrabi R, Su CY, Yasmeen A, Julien JP, Kong L, Wu NC, McBride R, Sok D, 1073 Pauthner M, Cottrell CA, Nieusma T, Blattner C, Paulson JC, Klasse PJ, Wilson IA, Burton 1074 DR, Ward AB. 2017. A Broadly Neutralizing Antibody Targets the Dynamic HIV Envelope 1075 Trimer Apex via a Long, Rigidified, and Anionic beta-Hairpin Structure. Immunity 46:690-1076 702. 1077 61. Cale EM, Driscoll JI, Lee M, Gorman J, Zhou T, Lu M, Geng H, Lai YT, Chuang GY, Doria-1078 Rose NA, Mothes W, Kwong PD, Mascola JR. 2022. Antigenic analysis of the HIV -1 1079 envelope trimer implies small differences between structural states 1 and 2. J Biol Chem 1080 298:101819. 1081 62. Munro JB, Gorman J, Ma X, Zhou Z, Arthos J, Burton DR, Koff WC, Courter JR, Smith 1082 AB, 3rd, Kwong PD, Blanchard SC, Mothes W. 2014. Conformational dynamics of single 1083 HIV-1 envelope trimers on the surface of native virions. Science 346:759-63. 1084 63. Agrawal N, Leaman DP, Rowcliffe E, Kinkead H, Nohria R, Akagi J, Bauer K, Du SX, 1085 Whalen RG, Burton DR, Zwick MB. 2011. Functional stability of unliganded envelope 1086 glycoprotein spikes among isolates of human immunodeficiency virus type 1 (HIV -1). 1087 PLoS One 6:e21339. 1088 64. Gift SK, Leaman DP, Zhang L, Kim AS, Zwick MB. 2017. Functional Stability of HIV -1 1089 Envelope Trimer Affects Accessibility to Broadly Neutralizing Antibodies at Its Apex. J Virol 1090 91. 1091 65. 
Veillette M, Coutu M, Richard J, Batraville LA, Dagher O, Bernard N, Tremblay C, 1092 Kaufmann DE, Roger M, Finzi A. 2015. The HIV -1 gp120 CD4 -bound conformation is 1093 preferentially targeted by antibody-dependent cellular cytotoxicity-mediating antibodies in 1094 sera from HIV-1-infected individuals. J Virol 89:545-51. 1095 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 35 66. Williams KL, Cortez V, Dingens AS, Gach JS, Rainwater S, Weis JF, Chen X, Spearman 1096 P, Forthal DN, Overbaugh J. 2015. HIV-specific CD4-induced Antibodies Mediate Broad 1097 and Potent Antibody-dependent Cellular Cytotoxicity Activity and Are Commonly Detected 1098 in Plasma From HIV-infected humans. EBioMedicine 2:1464-77. 1099 67. Kolchinsky P, Kiprilov E, Sodroski J. 2001. Increased neutralization sensitivity of CD4 -1100 independent human immunodeficiency virus variants. J Virol 75:2041-50. 1101 68. Salimi H, Johnson J, Flores MG, Zhang MS, O'Malley Y, Houtman JC, Schlievert PM, 1102 Haim H. 2020. The lipid membrane of HIV -1 stabilizes the viral envelope glycoproteins 1103 and modulates their sensitivity to antibody neutralization. J Biol Chem 295:348-362. 1104 69. Zhang Z, Wang Q, Nguyen HT, Chen HC, Chiu TJ, Smith Iii AB, Sodroski JG. 2023. 1105 Alterations in gp120 glycans or the gp41 fusion peptide -proximal region modulate the 1106 stability of the human immunodeficiency virus (HIV-1) envelope glycoprotein pretriggered 1107 conformation. J Virol 97:e0059223. 1108 70. Haim H, Si Z, Madani N, Wang L, Courter JR, Princiotto A, Kassa A, DeGrace M, McGee-1109 Estrada K, Mefford M, Gabuzda D, Smith AB, 3rd, Sodroski J. 2009. 
Soluble CD4 and 1110 CD4-mimetic compounds inhibit HIV -1 infection by induction of a short -lived activated 1111 state. PLoS Pathog 5:e1000360. 1112 71. Currenti J, Chopra A, John M, Leary S, McKinnon E, Alves E, Pilkinton M, Smith R, Barnett 1113 L, McDonnell WJ, Lucas M, Noel F, Mallal S, Conrad JA, Kalams SA, Gaudieri S. 2019. 1114 Deep sequence analysis of HIV adaptation following vertical transmission reveals the 1115 impact of immune pressure on the evolution of HIV. PLoS Pathog 15:e1008177. 1116 72. McBrien JB, Kumar NA, Silvestri G. 2018. Mechanisms of CD8(+) T cell -mediated 1117 suppression of HIV/SIV replication. Eur J Immunol 48:898-914. 1118 73. Fritschi CJ, Anang S, Gong Z, Mohammadi M, Richard J, Bourassa C, Severino KT, 1119 Richter H, Yang D, Chen HC, Chiu TJ, Seaman MS, Madani N, Abrams C, Finzi A, 1120 Hendrickson WA, Sodroski JG, Smith AB, 3rd. 2023. Indoline CD4-mimetic compounds 1121 mediate potent and broad HIV-1 inhibition and sensitization to antibody-dependent cellular 1122 cytotoxicity. Proc Natl Acad Sci U S A 120:e2222073120. 1123 74. Matsumoto K, Kuwata T, Tolbert WD, Richard J, Ding S, Prevost J, Takahama S, Judicate 1124 GP, Ueno T, Nakata H, Kobayakawa T, Tsuji K, Tamamura H, Smith AB, 3rd, Pazgier M, 1125 Finzi A, Matsushita S. 2023. Characterization of a Novel CD4 Mimetic Compound YIR -1126 821 against HIV-1 Clinical Isolates. J Virol 97:e0163822. 1127 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 36 75. Beerenwinkel N, Daumer M, Oette M, Korn K, Hoffmann D, Kaiser R, Lengauer T, Selbig 1128 J, Walter H. 2003. Geno2pheno: Estimating phenotypic drug resistance from HIV -1 1129 genotypes. Nucleic Acids Res 31:3850-5. 1130 76. 
Eshleman SH, Hackett J, Jr., Swanson P, Cunningham SP, Drews B, Brennan C, Devare 1131 SG, Zekeng L, Kaptue L, Marlowe N. 2004. Performance of the Celera Diagnostics 1132 ViroSeq HIV -1 Genotyping System for sequence -based analysis of diverse human 1133 immunodeficiency virus type 1 strains. J Clin Microbiol 42:2711-7. 1134 77. Liu TF, Shafer RW. 2006. Web resources for HIV type 1 genotypic -resistance test 1135 interpretation. Clin Infect Dis 42:1608-18. 1136 78. Sanchez V, Masia M, Robledano C, Padilla S, Ramos JM, Gutierrez F. 2010. Performance 1137 of genotypic algorithms for predicting HIV -1 tropism measured against the enhanced -1138 sensitivity Trofile coreceptor tropism assay. J Clin Microbiol 48:4135-9. 1139 79. Rawi R, Mall R, Shen CH, Farney SK, Shiakolas A, Zhou J, Bensmail H, Chun TW, Doria-1140 Rose NA, Lynch RM, Mascola JR, Kwong PD, Chuang GY. 2019. Accurate Prediction for 1141 Antibody Resistance of Clinical HIV-1 Isolates. Sci Rep 9:14696. 1142 80. Magaret CA, Benkeser DC, Williamson BD, Borate BR, Carpp LN, Georgiev IS, Setliff I, 1143 Dingens AS, Simon N, Carone M, Simpkins C, Montefiori D, Alter G, Yu WH, Juraska M, 1144 Edlefsen PT, Karuna S, Mgodi NM, Edugupanti S, Gilbert PB. 2019. Prediction of VRC01 1145 neutralization sensitivity by HIV -1 gp160 sequence features. PLoS Comput Biol 1146 15:e1006952. 1147 81. Williamson BD, Magaret CA, Karuna S, Carpp LN, Gelderblom HC, Huang Y, Benkeser 1148 D, Gilbert PB. 2023. Application of the SLAPNAP statistical learning tool to broadly 1149 neutralizing antibody HIV prevention research. iScience 26:107595. 1150 82. Kantor R. 2024. Overview of HIV -1 drug resistance testing assays, on UpToDate. 1151 https://www.uptodate.com/contents/overview-of-hiv-1-drug-resistance-testing-1152 assays#H1277302675. Accessed Oct 18, 2025. 1153 83. 
Simen BB, Simons JF, Hullsiek KH, Novak RM, Macarthur RD, Baxter JD, Huang C, 1154 Lubeski C, Turenchalk GS, Braverman MS, Desany B, Rothberg JM, Egholm M, Kozal 1155 MJ, Terry Beirn Community Programs for Clinical Research on A. 2009. Low-abundance 1156 drug-resistant viral variants in chronically HIV -infected, antiretroviral treatment -naive 1157 patients significantly impact treatment outcomes. J Infect Dis 199:693-701. 1158 84. Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: 1159 improvements in performance and usability. Mol Biol Evol 30:772-80. 1160 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. ; https://doi.org/10.64898/2025.12.27.696684doi: bioRxiv preprint 37 85. Siepel AC, Halpern AL, Macken C, Korber BT. 1995. A computer program designed to 1161 screen rapidly for HIV type 1 intersubtype recombinant sequences. AIDS Res Hum 1162 Retroviruses 11:1413-6. 1163 86. Kuiken C, Korber B, Shafer RW. 2003. HIV sequence databases. AIDS Rev 5:52-61. 1164 87. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, 1165 Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, 1166 Perrot M, Duchesnay E. 2011. Scikit -learn: Machine Learning in Python. Journal of 1167 Machine Learning Research 12:2825-2830. 1168 88. Helseth E, Kowalski M, Gabuzda D, Olshevsky U, Haseltine W, Sodroski J. 1990. Rapid 1169 complementation assays measuring replicative potential of human immunodeficiency 1170 virus type 1 envelope glycoprotein mutants. J Virol 64:2416-20. 1171 89. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 --approximately maximum -likelihood 1172 trees for large alignments. PLoS One 5:e9490. 1173 90. Galaxy C. 2024. 
The Galaxy platform for accessible, reproducible, and collaborative data 1174 analyses: 2024 update. Nucleic Acids Res 52:W83-W94. 1175 91. Prosperi MC, Ciccozzi M, Fanti I, Saladini F, Pecorari M, Borghi V, Di Giambenedetto S, 1176 Bruzzone B, Capetti A, Vivarelli A, Rusconi S, Re MC, Gismondo MR, Sighinolfi L, Gray 1177 RR, Salemi M, Zazzi M, De Luca A, group Ac. 2011. A novel methodology for large-scale 1178 phylogeny partition. Nat Commun 2:321. 1179 92. Kosakovsky Pond SL, Poon AFY, Velazquez R, Weaver S, Hepler NL, Murrell B, Shank 1180 SD, Magalis BR, Bouvier D, Nekrutenko A, Wisotsky S, Spielman SJ, Frost SDW, Muse 1181 SV. 2020. HyPhy 2.5-A Customizable Platform for Evolutionary Hypothesis Testing Using 1182 Phylogenies. Mol Biol Evol 37:295-299. 1183 93. Haim H, Salas I, McGee K, Eichelberger N, Winter E, Pacheco B, Sodroski J. 2013. 1184 Modeling virus- and antibody-specific factors to predict human immunodeficiency virus 1185 neutralization efficiency. Cell Host Microbe 14:547-58. 1186 94. Trkola A, Purtscher M, Muster T, Ballaun C, Buchacher A, Sullivan N, Srinivasan K, 1187 Sodroski J, Moore JP, Katinger H. 1996. Human monoclonal antibody 2G12 defines a 1188 distinctive neutralization epitope on the gp120 glycoprotein of human immunodeficiency 1189 virus type 1. J Virol 70:1100-8. 1190 1191 .CC-BY-NC-ND 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 27, 2025. 
Figure 1. TMR resistance and escape in the BRIGHTE clinical trial. (A) Structure of fostemsavir and the active metabolite TMR. (B) Cryo-electron microscopy model of the Env trimer bound to TMR (in magenta, PDB ID 8TTW). The inset shows the CD4-binding pocket with the SM3 sites labeled. (C,D) Resistance values of all 570 samples from BRIGHTE trial participants and of the subset of samples that contain the TMR-sensitive SM3 motif. Samples are grouped by their inferred clade associations. (E) Distribution of IC50 values for samples collected before and after fostemsavir treatment. The threshold used to define resistance (50 nM) is shown by a dotted line. (F) TMR resistance outcomes in the 132 BRIGHTE trial participants for whom genotype and IC50 data were available for samples collected both before and after treatment.
Labels recovered from the Figure 1 image: temsavir (BMS-626529) and fostemsavir (BMS-663068); SM3 sites 375S, 426M, 434M and 475M; resistance, log10(IC50), for all samples and for SM3-containing samples, grouped by clade (A1; A1,G; AE; B; B,D; B,F1; B,G; C; D; F1; G); frequency (%) versus log10(IC50), µM, for pre-treatment (n=385) and post-treatment (n=185) samples; frequency (subjects) versus on-treatment emergence (days after initiation).
Figure 2. Identification of mutations suspected of increasing HIV-1 resistance to TMR. (A) Our approach to identify the mutations. The number of mutations identified by each approach is shown. (B) Performance of different algorithms to predict TMR resistance by sequence. Amino acid sequence at the 856 positions of Env in the 570 BRIGHTE trial samples was used as input. Average metrics for five-fold cross-validation are shown. Error bars, standard deviation (SD). (C) Performance of the XGBoost algorithm to predict TMR resistance in the BRIGHTE samples by Env sequence. As input, we used amino acids at the four SM3 positions, all 856 positions, or positions within the indicated distances from TMR on the TMR-bound structure of Env. Performance was tested using the indicated IC50 thresholds to define resistance. Average AUC values and their variability (SD) across the five folds are shown. (D) Performance of a Gradient Boosting Regressor model to predict resistance of 778 samples from the BRIGHTE and Single-Env datasets by sequence. MSE, mean squared error. (E) The 59 Env mutations suspected of increasing resistance to TMR identified by the four approaches shown in panel A.
Labels recovered from the Figure 2 image: workflow inputs are the sequence-IC50 data (BRIGHTE, 570 samples; Single-Env set, 208 Envs), the TMR-bound structure, prior studies, and mutation frequency in HIV-1 clade B. Suspected mutations were identified by a GB Regressor model for high-frequency mutations (n = 14), a probabilistic model for low-frequency mutations (n = 17), a structure-guided estimate (n = 17), and prior studies (n = 11), giving 59 mutations. Subsequent steps: introduce mutations in HIV-1 AD8 Env; measure in vitro effects on fitness and TMR resistance; evaluate emergence frequency in BRIGHTE subjects that developed resistance. Metrics in panel B: AUC, accuracy, F1 score, precision, sensitivity, specificity. Algorithms: XGBoost, Gradient Boosting, Logistic Regression, AdaBoost, Random Forest, SVM. Panel C axes: average AUC and variability in AUC, for resistance thresholds of 0.05, 0.1, 0.5 and 1 µM. Panel D axes: predicted versus measured log10(IC50); R2: 0.75, MSE: 0.63.

Figure 3. Mutations that increase resistance to TMR are not represented equally in individuals that develop resistance. (A) The 59 mutations suspected of increasing resistance to TMR were introduced in the AD8 Env and tested in a pseudovirus system for their fitness (infectivity normalized for virus particle count) and resistance to TMR. Datapoint size corresponds with emergence frequency in the escape group. Color corresponds with the percent of BRIGHTE subjects that required one nucleotide (NT) change to acquire the mutation. (B) Emergence frequency of the mutations in the escape group. Significance of mutation enrichment in the post-treatment samples, as determined in a permutation test, is indicated. (C) Relationship between frequency of the 18 mutations that increase resistance by more than 3.5-fold (designated REMs) and their effects on TMR resistance or relative fitness. (D) Number of REMs that emerged in each of the 65 escape group subjects. (E) Emergence frequency of mutation combinations in the escape group. (F) Fitness versus TMR resistance of two-mutation combinations. Datapoint size describes their frequency in the escape group, and color describes the number of NT changes required to acquire the mutation from the clade B ancestral form.
Labels recovered from the Figure 3 image: fitness (fold wild-type) versus temsavir resistance, IC50 (fold wild-type); emergence frequency in escape group (%), with a "not observed" category; percent of participants requiring a 1-NT change (>90%, 50-90%, 10-50%, <10%); NT changes from Anc_B (2 or 3); number of REMs versus number of subjects; fold-increase in IC50 threshold of 3.5; enrichment in post-treatment samples, P-value: *, <0.05; ***, <0.0005.
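The Figure 3 caption ties emergence to the number of nucleotide (NT) changes a mutation requires. The sketch below is one way to count those changes with the standard genetic code. It is an illustration, not the authors' code: the function names are hypothetical, and the serine codon `AGT` used for position 375 is an assumed example (methionine at position 426 has only one codon, `ATG`).

```python
from itertools import product

# Standard genetic code, enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PURINES = {"A", "G"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (a in PURINES) == (b in PURINES)


def min_nt_changes(codon, target_aa):
    """Find the closest codon that encodes target_aa.

    Returns (n_changes, n_transitions, n_transversions, target_codon).
    Ties in the number of changes are broken in favor of fewer
    transversions (an assumption made for this sketch).
    """
    best_key, best = None, None
    for candidate, aa in CODON_TABLE.items():
        if aa != target_aa:
            continue
        diffs = [(a, b) for a, b in zip(codon, candidate) if a != b]
        n_ti = sum(is_transition(a, b) for a, b in diffs)
        n_tv = len(diffs) - n_ti
        key = (len(diffs), n_tv)
        if best_key is None or key < best_key:
            best_key, best = key, (len(diffs), n_ti, n_tv, candidate)
    return best


if __name__ == "__main__":
    # Hypothetical starting codons for positions 426 (M) and 375 (S).
    for label, codon, target in [("426L", "ATG", "L"),
                                 ("375N", "AGT", "N"),
                                 ("375M", "AGT", "M")]:
        print(label, min_nt_changes(codon, target))
```

Under these assumptions, a mutation reachable by a single change would be expected to appear more readily than one needing two, which is the pattern the caption describes.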
Figure 4. The frequency of REM emergence in the escape group corresponds with their spontaneous emergence frequency in the population. (A,B) Saturation mutagenesis to determine the effects of all amino acid changes at positions 426 and 375 on HIV-1 fitness ("No inhibitor") and resistance to TMR (250 nM). The wild-type form is shown in maroon. (C) Relationship between the preference for amino acids at position 375 in the presence of TMR and their IC50 values measured using the pseudovirus system. (D) Relationship between the emergence frequency of amino acids at position 375 in the escape group and their frequency in a panel of 2,535 clade B Envs from Fostemsavir-untreated individuals. (E,F) Example of the emergence rate of mutations at position 426 in HIV-1 clade B. The tree was constructed using amino acid sequences of the 2,535 Envs. Branches are colored by the amino acid in each taxon at position 426. The tree was partitioned into subgroups, which were excluded if the dominant form at that position differed from the clade B consensus. For all remaining subgroups, the number of new substitution events at position 426 to each amino acid was calculated, and these values were averaged. Error bars, standard errors of the mean. (G) Relationship between emergence frequency of the 18 REMs in clade B and their emergence frequency in the BRIGHTE escape group.
Labels recovered from the Figure 4 image: AA preference at position 426 (M) and position 375, with no inhibitor and with TMR 250 nM (replicative HIV-1); AA preference with TMR 250 nM versus TMR IC50 (nM), pseudovirus system; emergence frequency in BRIGHTE escape group (%) versus frequency in clade B (%); average emergence rate in clade B (%); emergence frequency in BRIGHTE escape group (%) versus average emergence rate in clade B (%). A Spearman correlation of rS = 0.82 (P = 0.00004) is annotated in the figure.

Figure 5. REMs that are poorly sampled in the BRIGHTE escape group exhibit an open conformation of Env that is sensitive to non-neutralizing antibodies. (A) Sensitivity of the 18 REMs to plasma from two HIV-infected individuals. Their emergence rates in the escape group are shown as magenta bars. (B) Cryo-EM structure of the Env trimer ectodomain in the CD4-unliganded state (PDB IDs 6U59 and 6UJV). Residues associated with binding of the antibodies we tested are colored. (C) Envs containing the indicated REMs were expressed on HOS cells, and binding of monoclonal antibodies was measured by cell-based ELISA. Values were normalized for cell surface expression of the Envs using antibody 2G12, and are expressed as a fraction of their binding to the wild-type AD8 Env. (D) Binding efficiency of the 18 REMs to antibody 17b that targets a cryptic epitope overlapping the CoR-BS. (E) Relationship between binding of the REMs to 17b and their infection of CD4-negative cells, expressed as a percent of their measured infection of CD4-positive cells. (F) Resistance of the variants to incubation at 37°C. Values indicate the time until a 50% decrease in infectivity is detected.
Labels recovered from the Figure 5 image: resistance to HIV+ plasma (1/dilution) for Plasma 1295 and Plasma 1641, with emergence frequency in the BRIGHTE escape group; top and side views of the trimer with epitopes CoR-BS (cryptic), CD4-BS (cryptic), CD4-BS (exposed), V3 loop (cryptic), Apex, Interface and MPER; normalized antibody binding (fold wild-type); normalized 17b binding (fold WT); relative CD4 independence (%) versus 17b binding (fold-WT); resistance to 37°C (IT50, hours).
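The normalization described for Figure 5C (binding corrected for cell-surface expression measured with 2G12, then expressed relative to wild-type) can be written as a ratio of ratios. The sketch below is a minimal illustration under that reading of the caption; the function name and the example readings are hypothetical, not values from the paper.

```python
def normalized_binding(signal, expression, wt_signal, wt_expression):
    """Antibody binding normalized for surface expression, as fold wild-type.

    signal / expression corrects the test-antibody reading for how much Env
    is on the cell surface (the 2G12 reading in the caption); dividing by
    the same ratio for wild-type Env gives fold wild-type.
    """
    if expression <= 0 or wt_expression <= 0 or wt_signal <= 0:
        raise ValueError("expression and wild-type readings must be positive")
    return (signal / expression) / (wt_signal / wt_expression)


if __name__ == "__main__":
    # Hypothetical ELISA readings (arbitrary units): a variant that binds
    # the test antibody 3.0 with expression 1.5, versus wild-type 2.0 / 2.0.
    print(normalized_binding(3.0, 1.5, 2.0, 2.0))
```

A value above 1 would indicate more binding per unit of surface Env than wild-type, which is how the caption's "fold wild-type" axis reads.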
Figure 6. Virus and host factors that guide emergence of the REMs. (A) Summary of all features measured for the 18 REMs. Substitution likelihoods from the clade B consensus (con_B) trinucleotide sequence describe the number of nucleotide changes required and the transition/transversion rate for each change. This likelihood was also calculated using the pre-treatment nucleotide sequences of the 65 escape group subjects. (B) P-values for a Spearman correlation test that compares the values shown in panel A. ns, not significant. (C,D) The combined fitness metric for each REM was calculated as the product of their effects on fitness, resistance to plasma, and stability at 37°C. This value was compared with REM emergence frequencies in clade B or in the escape group. Data points are colored by the substitution likelihood based on the con_B sequence or the subjects' pre-treatment sequences, respectively. (E) Simulations of REM emergence were performed with different combinations of the indicated variables. The fraction of success events (mutation acquisition) in the 1000 iterations was compared with REM frequency in the 65 subjects (see algorithm in Fig S15C). Data for all 18 REMs were compiled to calculate the AUC. Error bars, SDs for 10 simulations. (F) Contribution of number and type of nucleotide substitutions to predict emergence of the five REMs at position 375 (H, I, M, N and Y).

Panel A values recovered from the Figure 6 image. Each row lists the 18 REMs in this order: 426L, 375N, 475I, 375M, 429R, 424R, 434T, 375H, 202E, 375I, 423S, 434K, 116P, 116Q, 204D, 255M, 375Y, 475L.
Frequency in escape group (%): 74.2, 42.6, 20, 9.5, 6.3, 6.2, 4.8, 4.6, 1.5, 1.5, 1.5, 1.5, 0, 0, 0, 0, 0, 0
Emergence rate in clade B: 10.3, 3.3, 1.1, 0.65, 4.9, 0.05, 0.6, 0.8, 0, 1.8, 0, 0.09, 0, 0, 0, 0.04, 0, 0.05
Plasma resistance: >6.3, >6.3, >6.3, >6.3, >6.3, >6.3, 4.9, >6.3, 0.09, >6.3, 0.08, 0.07, 0.25, 1.3, 0.06, >6.3, >6.3, >6.3
CoR-BS exposure: 0.95, 0.48, 0.89, 0.41, 1.6, 0.48, 0.91, 0.65, 4.6, 0.49, 3.3, 5.1, 4.7, 0.85, 6.36, 0.77, 0.57, 0.58
CD4 independence: 0.03, 0.01, 0.01, 0.01, 0.003, 0.08, 0.02, 0.03, 4.3, 0.01, 7.4, 18.7, 0.09, 0.04, 15.3, 0.24, 0.00, 0.03
Fitness (infectivity): 0.68, 2.1, 0.63, 0.80, 1.4, 0.68, 1.1, 1.8, 0.33, 0.83, 0.07, 0.09, 0.72, 0.83, 0.01, 0.17, 1.8, 0.49
Stability at 37°C: 7.7, 12.3, 11.4, 7.2, 9.9, 8.3, 5.6, 5.7, 0.85, 5.5, 1.05, 0.97, 2.5, 3.8, 3.3, 2.4, 10.3, 6.4
Increase in TMR IC50 (fold WT): 48.2, 3.7, 11.8, 82.2, 4.9, 227.9, 11.8, 129.3, 3.5, 30, 10, 53.2, 11535, 9.5, 54.8, 48.3, 134.8, 14.9
Substitution likelihood from con_B: 0.24, 0.32, 0.35, 0.007, 0.06, 0.04, 0.39, 0.02, 0.05, 0.13, 0.13, 0.07, 0.35, 0.07, 0.17, 0.07, 0.05, 0.26
Avg. substitution likelihood in BRIGHTE: 0.22, 0.27, 0.33, 0.008, 0.12, 0.04, 0.37, 0.03, 0.05, 0.16, 0.15, 0.08, 0.34, 0.06, 0.17, 0.13, 0.05, 0.25

Panel B values recovered from the Figure 6 image (log10 P-values; ns, not significant). Values are listed left to right as extracted, against the column order: frequency in escape group (%), emergence rate in clade B, plasma resistance, CoR-BS exposure, CD4 independence, fitness (infectivity), stability at 37°C.
Increase in TMR IC50 (fold WT): ns, ns, ns, ns, ns, ns, ns
Stability at 37°C: -1.8, -2.0, -3.5, -2.1, < -4, -2.3
Fitness (infectivity): ns, ns, -1.8, -1.7, -4
CD4 independence: ns, -1.8, -3.5, -2.3
CoR-BS exposure: ns, ns, < -4
Plasma resistance: ns, -2.3
Emergence rate in clade B: < -4

Other labels recovered from the Figure 6 image: combined fitness versus emergence in clade B (%) and emergence in BRIGHTE (%), with annotated correlations rS = 0.62 (P = 0.006) and rS = 0.54 (P = 0.02); average AUC for simulations that include or exclude pre-treatment sequence, substitution likelihood and combined fitness (panel E), and number of nucleotide substitutions, Ti and Tv likelihoods and combined fitness (panel F).
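The Figure 6 caption defines the combined fitness metric as a product of three measurements and compares it with emergence frequency by Spearman correlation. The sketch below shows both calculations using only the Python standard library. It is a hedged illustration rather than the authors' pipeline: the function names and the toy input values are assumptions, and the caption does not say how censored readings (such as ">6.3") were handled.

```python
from math import sqrt


def combined_fitness(infectivity, plasma_resistance, stability):
    """Product of the three effects named in the Figure 6 caption."""
    return infectivity * plasma_resistance * stability


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy values for four hypothetical variants (not data from the paper):
    # (infectivity, plasma resistance, stability) and emergence frequency.
    features = [(0.7, 6.0, 8.0), (2.0, 6.0, 12.0), (0.3, 0.1, 0.9), (0.1, 0.1, 1.0)]
    emergence = [40.0, 70.0, 2.0, 0.0]
    scores = [combined_fitness(*f) for f in features]
    print(scores, spearman(scores, emergence))
```

Working on ranks rather than raw values keeps the comparison insensitive to the wide dynamic range of the measurements, which is the usual reason for choosing Spearman over Pearson in this setting.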
