A spontaneous proton transfer is key for enzymatic C-glycosylation and restricts the scope of natural C-glycosides

preprint OA: closed
AI-generated deep summary by claude@2026-06, 2026-06-24 · read from full text

This preprint investigates the mechanism and substrate scope of the maize C-glycosyltransferase Zm UGT708A6 by combining assays screening 125 glycosyl acceptors, pH-dependent kinetics, catalytic dyad mutagenesis, X-ray crystallography (UDP-bound structures), and extensive atomistic molecular dynamics with Markov state modeling. The authors find a conserved His-Asp catalytic machinery and show that C-glycosylation is strongly pH-dependent, with mutations largely abolishing C-activity at neutral pH and only partially restoring it at higher pH, consistent with critical proton-transfer steps involving a spontaneous water-mediated proton transfer stabilizing a key σ-complex in a stepwise SEAr-like process. They identify two flexible “gates” that regulate donor and acceptor access/reactivity and argue that substrate intrinsic ability to stabilize the σ-complex restricts which compounds can be C-glycosylated. A major caveat is that the work is a preprint and, while it includes simulations and structural snapshots, the mechanistic model is inferential rather than directly observed. The paper does not explicitly discuss endometriosis or adenomyosis; it was included in the corpus via a keyword match in the upstream search index.

Read from the paper's body, not the abstract. Not a substitute for reading the paper. No clinical advice.

Full text 122,525 characters · extracted from preprint-html
Research Square | Article

Ditte Welner, Lluís Raich, David Teze, Gonzalo Bidart, Folmer Fredslund, and 4 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5591657/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review · Version 1 posted

Abstract

C-glycosides are valuable compounds containing hydrolytically stable C-C bonds. However, their scarcity in nature and their complex synthesis limit their availability. Enzymes represent an environmentally mild paradigm for the synthesis of C-glycosides, but only a few enzymes with C-glycosylation activity are known and their catalytic mechanism remains unclear.
In this work, we study the intricacies of a C-glycosyltransferase using X-ray crystallography, biochemical assays, and atomistic simulations. We identify two dynamic gates that control substrate access and reactivity, and investigate the molecular mechanism of C-glycosylation, identifying a stepwise SEAr process that passes through a critical intermediate, which is stabilized by a spontaneous water-mediated proton transfer. This stabilization is related to the chemical properties of the substrate, which dictate whether a compound can be C-glycosylated. Our results provide detailed knowledge and enhance our understanding of this class of enzymes, paving the way for their widespread utilization and engineering.

Biological sciences/Computational biology and bioinformatics; Biological sciences/Biochemistry

1. Introduction

C-glycosides are high-value products in several industries, including food, cosmetics, and pharmaceuticals. For example, carminic acid derived from the insect Dactylopius coccus is one of the oldest natural red colorants in food [1], aloesin extracted from aloe vera is a depigmenting agent for the skin [2], and dapagliflozin, inspired by phlorizin, is a potent and selective SGLT2 inhibitor used to treat type II diabetes [3]. Hence, efforts are devoted to establishing synthetic pathways that could give easy access to these molecules [4]. One of the main synthetic challenges is the number of undesired regio- and stereoisomers that can be generated when forming a glycoside. This renders most chemical pathways impractical, as they require several steps of protection, activation, and deprotection of reactive centers, making the synthetic process complex and diminishing product yields.

In nature, glycosides are synthesized by family 1 glycosyl transferases (GT1s). These are inverting enzymes that generally use UDP-activated α-sugars (e.g. UDP-glucose) to generate single β-products.
They are thus also known as UDP-dependent glycosyl transferases, or UGTs, and can transfer glycosyl moieties to C(sp2), N(sp3), O(sp3), and S(sp3) atoms of the acceptor molecule. Interestingly, the four types of glycosylation occur with the same catalytic machinery, most commonly a His-Asp dyad, and a single enzyme can catalyze all types of glycosidic bonds [5,6]. While the mechanisms of N-, O-, and S-glycosylation have been investigated in depth [7], the C-glycosylation mechanism is still unclear. It is anticipated to differ significantly from the others due to the involvement of an sp2-hybridized carbon atom, suggesting a mechanism of electrophilic aromatic substitution (SEAr) [8,9]. This mechanism typically involves the formation of a σ-complex and an elimination step to restore aromaticity (Fig. 1). However, given that the catalytic histidine would become protonated during the formation of the σ-complex, it is not clear what would serve as a base in the elimination step. Previous studies on C-GT mutants of this residue have demonstrated that C-glycosylation products can still form, albeit at considerably slower rates [10]. This observation suggests that alternative residues or solvent molecules may play a role in facilitating proton transfers for both chemical steps, although there is currently no evidence to support this hypothesis.

To deepen our understanding of C-GTs, we here focused on an enzyme from maize [11], Zm UGT708A6, the crystal structure of which is available (PDB 6LF6) [10,11]. We undertook an interdisciplinary approach combining experiments and atomistic simulations. We screened 125 glycosyl acceptors and observed O-, S-, and C-glycosylation activity on different substrates. The pH profiles for O- and C-glycosylation are significantly different, with C-glycosylation being much more affected by pH using either the WT enzyme or a range of catalytic mutants, suggesting that proton transfer steps are critical during C-glycosylation.
We also solved the UDP-bound crystal structure of this enzyme, revealing a tightly packed, apparently rigid, and hydrophobic acceptor binding site. This aspect contrasts with the available structure and the promiscuous activity of the enzyme, suggesting that structural plasticity and perhaps dynamics may be relevant for the access and accommodation of different substrates. We then performed extensive molecular dynamics simulations to explore the conformational landscape of the enzyme, characterizing two flexible gates whose opening motions modulate substrate reactivity. Moreover, these gates seem to be conserved across GTs, and we show the crystal structure of a distantly related GT1, Nt UGT72B82, which presents both states of the donor gate in the same asymmetric unit. We studied the mechanism of C-glycosylation, providing evidence for a classic two-step SEAr reaction (C-C bond formation and C-H proton abstraction) passing through a key σ-complex that requires the assistance of water molecules to be stabilized. Finally, we showed that the intrinsic ability of a chemical to stabilize the σ-complex determines whether a given substrate can be C-glycosylated or not, in accordance with the relative scarcity of C-glycosides in nature.

2. Results and discussion

Promiscuous glycosylation activity with a conserved catalytic machinery.

We screened 125 glycoside acceptors and found Zm UGT708A6 to have a clear preference for chalcones, displaying high C-glycosylation activity against phloretin, phloracetophenone, and 3-hydroxyphloretin. It also presents O-glycosylation activity on naringenin and on the lignans magnolol and honokiol, and S-glycosylation activity on 3,4-dichlorothiophenol, but no N-glycosylation activity on the corresponding 3,4-dichloroaniline (Supplementary Table 1). In total, significant activity was found on 24 acceptors.
Interestingly, Zm UGT708A6 has a pH optimum of 8 for O-glycosylation on naringenin, and a bimodal pH-activity profile for C-glycosylation on phloretin, with a minimum at pH 8 flanked by higher-activity zones at pH 6-7 and pH 10 (Fig. 2a). A similar decrease in conversion rates was observed at pH 7.75 to 8.75 (Supplementary Fig. 1). These observations are in accordance with newly discovered C-GTs showing high activity in unbuffered diluted sodium hydroxide [12], and hint at a mechanistically distinct pathway at high pH. Indeed, a high activity at pH 6-7 is consistent with activity against a fully protonated phloretin (pKa 7.4 [13]), while at pH 10 a few di-anionic keto/enol forms are expected to be in equilibrium, according to studies on phloroglucinol [14]. The kinetic parameters showed a moderately higher activity at pH 10 compared to pH 7 (Supplementary Fig. 2).

To shed further light on the different pH effects on O- and C-glycosylation, we investigated six conservative mutations of the catalytic dyad. We found that all mutants except H25A displayed decreased O-activity on naringenin at neutral pH (Fig. 2b and Supplementary Fig. 2). No mutant showed detectable C-activity on phloretin at neutral pH, but they all recovered activity at higher pH. This indicates that the catalytic histidine is critical for the C-glycosylation of phloretin, for the glycosylation step itself and/or for deprotonating the σ-complex to recover aromaticity. Note that no variant displayed a specificity change, i.e. no C-glycosylation activity was observed on naringenin, nor O-glycosylation activity on phloretin. Moreover, at high pH, mutants introducing negative charges exhibited reduced activity compared to mutants eliminating them (e.g. H25D and H25E are the least efficient mutants, while D122N is the most active), hinting at a repulsive interaction with either the substrate or reaction intermediates.

Two flexible gates control substrate accessibility and reactivity.
We solved the structure of Zm UGT708A6 in complex with UDP at a resolution of 2.04 Å (PDB 8CGQ, Fig. 3a and Supplementary Table 2). The overall fold and the positioning of individual residues closely resembled the previously reported structure of Zm UGT708A6 (PDB 6LF6 [10]). Nonetheless, we observed a significant difference adjacent to the acceptor pocket, where our structure displayed a smaller, more compact cavity compared to the larger solvent-exposed cavity present in 6LF6 (volumes of 1056 and 1992 Å³, respectively). This difference arises from the displacement of the α3-helix from the binding site, while F94 shifts its orientation away from F201. Additionally, at the donor site, we noted a slight rearrangement of a loop that covers the UDP binding pocket, indicating a plasticity that could be relevant to explain the access of UDP-Glc into the buried binding site. This flexibility can also explain the substrate promiscuity, as also evidenced by other studies [6].

We further characterized the dynamical nature of Zm UGT708A6 by computational methods. Starting from our crystal structure (8CGQ), we modelled a complex with UDP-glucose and phloretin at physiological conditions. We launched 128 parallel MD simulations of 100 ns each, collecting a total of 12.8 µs of cumulative data. Next, we analyzed all trajectories using Markov state models [15–17]. We used a time-lagged independent component analysis (TICA) and unveiled an intricate conformational landscape displaying several minima (Supplementary Fig. 3–7). This helped us to identify four biologically relevant states from the slowest TICs, corresponding to the open/closed forms of the two flexible gates that cover the donor and acceptor substrates (Fig. 3b). The opening of the donor gate involves the detachment of a flexible loop (residues S315-T323) that sits on top of the α2-helix, locked by the interactions of D319 with the backbone amides that cap the helix (e.g. S55; Fig. 3b right).
This motion leaves S286 exposed to the solvent, a residue located in an internal loop that covers UDP and whose role is related to substrate recognition and stabilization (see Supplementary Fig. 8 for a further shift of S286 that leaves UDP more exposed to the solvent). Therefore, the opening/closing of this gate is not only a necessary step for the access of the donor substrate into the catalytic site, but also for the establishment of key interactions that favor the reaction.

The opening of the acceptor gate allowed us to establish a connection between our crystal structure and 6LF6 (free energy landscape in Fig. 3b). We observed the same displacement of the α3-helix from the binding pocket, including the shift of F94, which leaves the substrate exposed to the solvent (Fig. 3b left). Crucially, this motion affects the number of reactive states for C- and O-glycosylation, both being less likely to occur in the open state (Fig. 3b and Supplementary Fig. 9-10 for reaction criteria and structural renders). The catalytic form of the enzyme is thus likely the closed state, as it is the one in which the substrate is spatially restricted and can fulfill the reactive criteria more frequently.

We found two additional C-GT structures in the PDB that further confirm the existence of the open/closed states (PDBs 6L5R and 6L5P; Fig. 3c) [5]. Interestingly, in 6L5P the donor gate is not resolved, indicative of an open state with high mobility. Similarly, in the acceptor gate the side chain of F92 (F94 in Zm UGT708A6) is unresolved and the Cα-Cβ bond points towards the solvent, in line with what we find in our simulations (Supplementary Fig. 11). These similarities are remarkable considering that the C-GT structures have only 45% sequence identity with Zm UGT708A6.
We also solved the structure of Nt UGT72B82 (PDB 8CHD), an O-GT from Nicotiana tabacum [18] with only 29% sequence identity to Zm UGT708A6, and observed a large difference in the conformation of the donor gate in the two enzyme molecules of the asymmetric unit (Fig. 3d and Supplementary Table 2 for data collection and refinement statistics). In one molecule, the donor gate is very open, exposing UDP to the solvent and leaving it ready to be replaced by a new UDP-Glc donor substrate. Overall, these observations support that the opening/closing motions unveiled by MD simulations are likely a common feature in GT1s, even for evolutionarily distant members.

C-glycosylation develops through a stepwise mechanism mediated by water stabilization.

To complete our understanding of Zm UGT708A6, we studied the reaction mechanism of C-glycosylation. The mechanistic details are currently unclear, and only a few pathways have been proposed based on chemical intuition [19]. We used QM/MM simulations to address this challenge, treating a small active region at the DFT level and the rest with the MM force field (Fig. 4a and details in the Methods section). We selected a reactive state from the most populated region, having both gates closed as in our crystal structure. During the initial equilibration of the complex, we observed spontaneous back-and-forth proton transitions from the ortho-hydroxyl group of phloretin to the catalytic histidine (o-OH and H25 in Fig. 4). This agrees with the expected acidity of phloretin hydroxyl groups (pKa 7.4 [13]) and the H25-D122 dyad (pKa 7.5 for chymotrypsin [20]). From a mechanistic point of view, the deprotonation of o-OH increases the nucleophilicity of the carbons at the ortho- and para-positions, as evidenced by their charges (Supplementary Fig. 12). This is a key aspect to favor the formation of the C-C bond with the anomeric carbon of the glycan, which is expected to be positively charged in the transition state region.
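The back-and-forth proton exchange described above is what one would expect when the donor and acceptor have nearly matched pKa values. As a rough illustration only, the following Python sketch converts the two quoted pKa values (7.4 for phloretin, 7.5 for the His-Asp dyad, the latter taken from chymotrypsin) into a standard free energy of proton transfer and into protonation fractions at pH 7. It assumes that solution pKa values carry over to the active site, which is not guaranteed; the function names are illustrative, and 310 K is the temperature given for the MD simulations.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def proton_transfer_dg(pka_donor, pka_acceptor, temperature_k):
    """Standard free energy (kcal/mol) of moving a proton from donor to acceptor.

    Uses dG = ln(10) * R * T * (pKa_donor - pKa_acceptor).
    Negative values favour the transfer.
    """
    return math.log(10) * R_KCAL * temperature_k * (pka_donor - pka_acceptor)


def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch fraction of the protonated (acid) form at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Values quoted in the text; applying them inside the enzyme is an assumption.
PKA_PHLORETIN = 7.4
PKA_DYAD = 7.5
T_MD = 310.0  # K

dg = proton_transfer_dg(PKA_PHLORETIN, PKA_DYAD, T_MD)
print(f"dG(phloretin o-OH -> His) = {dg:+.2f} kcal/mol")
print(f"phloretin protonated at pH 7: {fraction_protonated(PKA_PHLORETIN, 7.0):.2f}")
print(f"histidine protonated at pH 7: {fraction_protonated(PKA_DYAD, 7.0):.2f}")
```

Under these assumptions the transfer free energy is about -0.14 kcal/mol, well below the thermal energy at 310 K (RT is about 0.6 kcal/mol), which is consistent with a proton that can hop in either direction.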
Next, we used metadynamics to enhance the sampling of high-energy configurations and explore reactive pathways (see Methods and Supplementary Fig. 13). We first characterized the C-C bond formation, which proceeds with the dissociation of the C1-OP bond between Glc and UDP, followed by the approach of Glc to phloretin and the formation of the C1-Cr bond (Fig. 4c and states R, TS1, and I1 in Fig. 4d). As expected, this approach makes the o-OH proton of phloretin more acidic, and it ends up being fully transferred to H25 before the transition state, releasing a pair of electrons that can delocalize through the phenolic ring and help to form the C-C bond. Formally, this first step is an electrophilic migration of Glc from UDP to phloretin, passing through a flat energy region that corresponds to a short-lived oxocarbenium ion. This is similar to previous results obtained for retaining O-GTs of different families [21,22], even though the system we study is a C-GT of inverting type. The distance that Glc must travel before reaching the acceptor is likely the reason behind the appearance of this species. The reaction barrier for this step is ~20 kcal·mol⁻¹, in close agreement with the value derived from the catalytic constant (5 s⁻¹; ~17 kcal·mol⁻¹). Moreover, this is the rate-limiting step along the entire free energy profile, which is in line with the fact that, experimentally, both C- and O-glycosylation have relatively similar rates, suggesting that the breaking of the UDP-Glc bond may be rate-determining. Notably, the formed intermediate (I1) is ~16 kcal·mol⁻¹ high in energy, mainly because the phenolic ring loses aromaticity after the electrophilic addition. More importantly, in this state H25 is not able to act as a base for the subsequent C-H proton abstraction (Cr-Hr bond in Fig. 4d), given that it has already received a proton from the o-OH.
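The conversion from the catalytic constant to an apparent barrier quoted above can be reproduced with the Eyring equation from transition state theory. The sketch below is an illustration, not the authors' stated procedure: it assumes a transmission coefficient of 1, and it evaluates two temperatures because the text does not say which one was used (293 K is given for the initial-rate assays and 310 K for the MD simulations). The function names are illustrative.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def eyring_barrier(rate_per_s, temperature_k):
    """Activation free energy (kcal/mol) from a rate constant, assuming kappa = 1."""
    prefactor = K_B * temperature_k / H  # attempt frequency in 1/s
    return R_KCAL * temperature_k * math.log(prefactor / rate_per_s)


def eyring_rate(barrier_kcal, temperature_k):
    """Rate constant (1/s) from an activation free energy, assuming kappa = 1."""
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature_k))


K_CAT = 5.0  # 1/s, catalytic constant quoted in the text

for temperature in (293.0, 310.0):
    barrier = eyring_barrier(K_CAT, temperature)
    print(f"T = {temperature:.0f} K: dG_act = {barrier:.1f} kcal/mol")

# Rate that a 20 kcal/mol barrier would correspond to at 310 K.
print(f"k(20 kcal/mol, 310 K) = {eyring_rate(20.0, 310.0):.3f} 1/s")
```

With these assumptions a rate of 5 s⁻¹ maps to roughly 16 to 17 kcal·mol⁻¹ depending on the temperature. A difference of about 3 kcal·mol⁻¹ between the computed and derived barriers corresponds to roughly two orders of magnitude in rate.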
Ideally, H25 should transfer this proton back to the phosphate group, neutralizing the charge developed during the reaction and restoring the basicity of the catalytic histidine. However, this does not seem geometrically feasible. At this point, we reasoned that water molecules could be responsible for mediating the C-H abstraction. Therefore, we included a subset of them in the QM region, enabling their direct participation in chemical reorganizations. We observed a spontaneous and irreversible proton transfer from the p-OH of phloretin to the highly charged phosphate, mediated by a water molecule and the 2-OH of Glc (see states I1 and I2 in Fig. 4d). This rapid transfer highlights that the OH groups of phloretin become very acidic in the adduct state, and their deprotonation allows the recovery of partial aromaticity through the phenolic ring, rendering a much more stable intermediate. Subsequently, H25 performs a stepwise acid/base catalysis to abstract the C-H proton (see states I3, TS2, and P in Fig. 4d). First, it protonates the o-OH of phloretin, rendering a state similar to I1 in terms of energy, and then acts as a base, abstracting the C-H proton with a very low activation energy (~2 kcal·mol⁻¹). This process leads to the final product, a stable glycoconjugate linked by a C-C bond. Overall, our simulations indicate that C-GTs operate through an electrophilic aromatic substitution that involves a classical addition-elimination pathway (C-C bond formation and C-H proton abstraction), together with a critical reorganization of the intermediate in order to stabilize it and allow the recovery of H25 basicity.

The C-glycosylation mechanism restricts the acceptor substrate scope.

One of the most critical aspects of the mechanism is the stabilization of the σ-complex through the spontaneous deprotonation of aromatic hydroxyls.
This mechanism of stabilization resembles the keto-enol tautomerization of phloroglucinol in solution, where a seminal study showed that deprotonated forms of σ-complexes are prevalent [14]. To give further insights into this stabilization mechanism, as well as into the structural factors of the acceptor that may favor or disfavor C-glycosylation, we compared the energetics of O-, σ-, and C-glycosides of different acceptors without the enzymatic scaffold, using full DFT calculations in implicit water solvent (Fig. 5, Supplementary Fig. 14–25 for all the evaluated isomers, and Methods for more details). Our results highlight three key aspects: (1) C-glycosylation is energetically favored for all acceptors, suggesting that O- and C-glycosylation are involved in a kinetic/thermodynamic competition; (2) the stability of σ-complexes correlates with the number of OHs in the reactive ring; and (3) acceptors with 2 or 3 OHs can stabilize their σ-complexes through deprotonation, while acceptors bearing a single OH cannot. Interestingly, in the hydroxyflavone series, apigenin (acceptor 7) and 7,4'-dihydroxyflavone (acceptor 8) can also stabilize through the deprotonation of the distal phenolic group, while 7-hydroxyflavone (acceptor 9) cannot. This explains the prevalence of C-glucosides for the former molecules, while a C-glucoside has never been reported for the latter, even though the only difference is a hydroxyl far away from the glycosylation site. Notably, daidzein (acceptor 11) is also unable to significantly stabilize the σ-complex through the deprotonation of its distal hydroxyl, even though the only difference with respect to 7,4'-dihydroxyflavone is the connectivity of the phenolic group (isoflavone vs flavone). This is because isoflavones cannot delocalize electrons from the phenolic group to the reactive scaffold, as evidenced by resonance structures.
This subtle detail is in accordance with the relative scarcity of isoflavone C-glycosides compared to their flavone counterparts. Indeed, only a single enzyme, Pl UGT43 [23], has been reported to present C-activity on the isoflavone daidzein, albeit very low. Conversely, there is no clear difference between flavone and isoflavone energetics for O-glycosides.

Our calculations are also useful to determine the most stable isomers of a given acceptor, allowing predictions of regioselectivities. Indeed, we observed large energy differences between σ-complexes of C-glycosylation sites and non-reactive sites, even when those surround the same reactive hydroxyl. For instance, there is a difference of about 10 kcal·mol⁻¹ between the σ-complexes of aloesin (acceptor 12) at position 8 and its analogues at position 6, both surrounding the phenolic hydroxyl at position 7 (Supplementary Fig. 25). As far as we are aware, a 6-C-glycoside of aloesone has never been reported, and this can be explained purely in energetic terms. It is also worth noting that all O- and C-glycosides have similar energies between acceptors, and σ-complexes are the only states that show significant differences. Therefore, the relative stability of σ-complexes is critical to discern between acceptors that will likely render O- or C-products. Furthermore, compounds featuring a single aromatic hydroxyl lack the capacity for spontaneous deprotonation to stabilize the σ-complex and are also inherently less reactive due to the low nucleophilicity of the ring. Indeed, Zm UGT708A6 catalyzes the C-glycosylation of 2,4,6-trihydroxyacetophenone (acceptor 1) about 20-fold faster than that of 2,4-dihydroxyacetophenone (acceptor 2) or 2,4,5-trihydroxyacetophenone, and appears inactive with 4-hydroxyacetophenone (acceptor 3) (Supplementary Fig. 2).
Similarly, in a recent and compelling study, Ab CGT was tested against a wide array of substrates, showing that acceptors with a single hydroxyl were only O-glycosylated, acceptors with two hydroxyls were O- and C-glycosylated, and acceptors with three hydroxyls were only C-glycosylated [24]. Hence, these structural details of acceptor substrates emerge as the critical determinant of their potential for C-glycosylation.

3. Conclusions

In this work, we have characterized Zm UGT708A6 and its C-glycosylation mechanism. We revealed an intricate conformational landscape characterized by the dynamic opening and closing of two gates, crucial for controlling the accessibility of donor and acceptor substrates. Our findings show that the open state of the acceptor gate exposes the substrate to the solvent, facilitating substrate binding and product release, while the closed state increases the number of reactive poses, limiting the mobility and orientation of the substrate in the binding cavity. Thus, we propose that the catalytic cycle commences with the apo state and both gates open. Subsequently, both donor and acceptor substrates bind, the gates close to enhance reactivity, the reaction step occurs, and finally the gates open again to release the products, restoring the apo state and completing the cycle. These results offer insights into the remarkable flexibility of C-GTs, and of GT1s in general, providing a rational foundation for understanding how the two deeply buried substrates can access the active site, and how the acceptor cavity can bind a diverse array of molecules and modulate their reaction outcomes.

Our mechanistic data open several questions about the mechanism at different pH values. At physiological conditions, where polyphenols are predominantly protonated, C-glycosylation likely proceeds through the mechanism that we uncovered, involving a water-mediated stabilization of the σ-complex.
At more basic conditions, substrates could either enter the active site deprotonated, moving directly from the reactant state to the stable σ-complex, or be assisted by solvent bases to proceed with the reaction. This hypothesis is supported by the observed variations in pH-activity profile between O- and C-glycosylation, with the latter restored at elevated pH values, pointing to the critical importance of proton transfer steps in the mechanism. Critically, our results also suggest that substrates with a single hydroxyl may require enzymes with alternative mechanisms to make C-glycosides kinetically accessible, as the σ-complexes of these substrates cannot be stabilized through deprotonation. This may be the case for Pl UGT43, which has an asparagine instead of the catalytic histidine, and yet can C-glycosylate the isoflavone daidzein. Finally, we emphasize that the potential to C-glycosylate an acceptor relates directly to the intrinsic stabilization of its σ-intermediates, whose energies can be evaluated by simple computational methods. Hence, we propose that, for molecules presenting a high-energy σ-intermediate, substrate engineering to C-glycosylate a precursor of the desired compound may be preferable to extensive enzyme mining or engineering. It is possible that this strategy represents the natural pathway to complex C-glycosides.

4. Methods

Kinetic assays and X-ray crystallography

Materials. Buffers, chemicals and reagents were purchased from Sigma Aldrich and used without further purification.

Enzyme expression and purification. The full-length histidine-tagged DNA sequence was cloned into a pET28a(+) expression vector by GenScript (USA), without codon optimization. The plasmid was transformed into E. coli BL21 Star(DE3) (Fisher Scientific). Overexpression was induced by the addition of 250 µM IPTG to cultures that had reached OD600 = 0.8–1.0 in 2xYT medium at 37°C (200 rpm).
Thereafter, the cultures were incubated for 20 hours at 20°C (200 rpm). The cultures were centrifuged, the supernatant was discarded, and the pellet was resuspended in 50 mM sodium phosphate buffer (pH 7.4). Lysis was carried out by 2 rounds of high-pressure homogenization at 10,000 psi (Avestin Emulsiflex C5). Cell debris was removed by centrifugation (15,000 × g, 30 min, 4 °C), and the lysate was filtered and purified using immobilized metal affinity chromatography on an ÄKTA Pure with a HisTrap FF column (Cytiva). The proteins were stored in 25 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer, 50 mM NaCl, pH 7.

HPLC analysis. Samples were analyzed by RP-HPLC on an Ultimate 3000 series apparatus (Dionex) with a Kinetex 2.6 µm C18 100 Å 100x4.6 mm analytical column (Phenomenex) maintained at 40°C. MilliQ water containing 0.1% formic acid and acetonitrile were used as mobile phases A and B, respectively, with the following method in percentages of mobile phase B at 1 mL/min: 0–0.5 min 2%, 0.5–1.5 min 35%, 1.5–3 min 35–80% (gradient), 3–4.2 min 98%, 4.2–5 min 2%. Chromatograms were recorded at 300 nm and processed with Chromeleon 7.2.7 (Dionex).

Initial rates kinetics. Acceptor (phloretin or 3-hydroxyphloretin) concentrations from 0 to 250 µM were used in 25 mM Tris buffer at pH 7 and 10 in the presence of 500 µM UDP-Glc. The reactions were carried out at 293 K for 30 s in the presence of 48 ng/mL Zm UGT708A6 and quenched using 1% acetic acid. Km and kcat values were calculated, and Michaelis-Menten plots generated, in R using the drc package.

pH characterization. The reactions were carried out at 298 K in 70 mM Tris-Bis-Tris (TBT) buffer in a pH range from 5 to 10, in the presence of 500 µM sugar donor (UDP-Glc), 100 µM phloretin, and 10 µg/mL Zm UGT708A6 enzyme (100 µg/mL enzyme for the mutant enzymes). The reaction was quenched by 25x dilution in 0.1% acetic acid at time points 0, 5, and 10 minutes.

Time course reactions.
The reactions were carried out at 298 K in 25 mM Tris buffer pH 7, in the presence of 500 µM sugar donor (UDP-Glc), 50 µM acceptor, and 15 µg/mL (2,4,6-trihydroxyacetophenone) or 150 µg/mL (2,4,5-trihydroxyacetophenone, 2,4-dihydroxyacetophenone, 4-hydroxyacetophenone) Zm UGT708A6 enzyme. The reaction was quenched by 25x dilution in 0.1% acetic acid at time points 0, 1, 2, 3, 4, 5, and 6 hours.

Zm UGT708A6 crystallographic structure determination. Zm UGT708A6 was co-crystallized with UDP-Glucose in sitting drops consisting of 0.2 μL protein solution (6.9 mg/mL in 25 mM HEPES pH 7.0, 50 mM NaCl, 1 mM DTT, 5 mM UDP-Glucose) and 0.2 μL crystallization buffer (PACT++ screen (Jena Bioscience) solution G10: polyethylene glycol 3,350 20% w/v, BIS-TRIS propane 100 mM, pH 7.5). The drops were set up with a Phoenix crystallization robot in 3-drop Intelli-Plate 96-well crystallization plates (Art Robbins Instruments). Crystals appeared after 1 day, were cryoprotected in 10% glycerol, and mounted on a nylon cryoloop. 270° of data were collected at 100 K, 1.0000 nm, at the BioMax beamline of MAX-IV, Lund, Sweden, with a 1° oscillation and 5 s exposure time and an ADSC Q315r CCD detector. The data were processed with Xia2 [25] and XDS [26]. The structure was solved by molecular replacement using PDB ID 2VCE [27] as a search model and Phaser [28] from the Phenix software package. The structure was refined using phenix.refine [29] and Coot [30]. The final structure was validated with MolProbity [31] and deposited in the Protein Data Bank with PDB ID 8CGQ.

Computational modelling. The structure of Zm UGT708A6 was taken from the crystal structure solved in this work. Protonation states of titratable residues were assigned at pH 7 with Propka [32] (v 3.1) and unresolved loops were added using Modeller [33] (v 9.25). The coordinates of UDP-Glc and phloretin were taken by superimposition from PDB 6L5P and PDB 6L5R, respectively.
The system was solvated with TIP3P 34 water molecules and 0.15 M NaCl using the tleap module of Ambertools 35 (v 17.0). The enzyme was parameterized with the ff14SB 36 force field, the glucose unit with GLYCAM 37 , and the UDP and phloretin molecules with GAFF2 38 . Molecular dynamics (MD) simulations were carried out with the OpenMM 39 engine (v 7.4.0). A Langevin integrator with a 1 ps friction coefficient and a Monte Carlo barostat were used to maintain the temperature and pressure of the system at 310 K and 1 bar. Long-range electrostatics were computed using the Particle Mesh Ewald scheme, and van der Waals interactions were truncated with a 1 nm cutoff. Hydrogen mass repartitioning with a factor of 3 was used together with an integration time step of 4 fs, constraining the lengths of all bonds. A total of 128 parallel trajectories of 100 ns each were collected, starting from different seeds that were obtained from rounds of simulation runs and analyses. The generated data were analyzed with PyEMMA 40 (v 2.5.9). Crystallographic contacts were used as features to describe the system, including 2,216 pairs of residues in total. The dimensionality of this space was reduced by transforming the data with a time-lagged independent component analysis (TICA 16,17 ), using a lag time of 10 ns and selecting up to 9 components. Projection of the data onto TIC0-TIC2 showed 4 clear and stable basins that correspond to open and closed states of the acceptor and donor gates (Supplementary Fig. 3). Single features highly correlated with these TICs were taken as simplified variables to allow clear and simple analyses of the systems. These variables are the F94-F201 Cα-Cα distance for the acceptor gate (0.79 Pearson correlation coefficient with TIC0), and the minimum distance between the Cα atoms of residues 53-55 and those of residues 318-320 for the donor gate (0.66 Pearson correlation coefficient with TIC2). Together they separate the clustered TIC densities cleanly (Supplementary Fig. 4-5).
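The two simplified gate variables described above can be illustrated with a minimal sketch. It assumes that atomic coordinates are already available as a NumPy array of shape (frames, atoms, 3); the function names and the atom indices passed to them are hypothetical, and this is not the authors' analysis code. It computes a per-frame distance between two atoms (as for the F94-F201 Cα-Cα distance), a per-frame minimum distance between two groups of atoms (as for the donor gate), and the Pearson correlation between such a variable and another time series, for instance a TICA component.

```python
import numpy as np


def pair_distance(traj, i, j):
    """Per-frame distance between atoms i and j.

    traj: array of shape (n_frames, n_atoms, 3).
    """
    return np.linalg.norm(traj[:, i, :] - traj[:, j, :], axis=1)


def min_group_distance(traj, group_a, group_b):
    """Per-frame minimum distance between any atom of group_a and any atom of group_b."""
    a = traj[:, group_a, :]                      # (n_frames, len_a, 3)
    b = traj[:, group_b, :]                      # (n_frames, len_b, 3)
    diff = a[:, :, None, :] - b[:, None, :, :]   # (n_frames, len_a, len_b, 3)
    return np.linalg.norm(diff, axis=-1).min(axis=(1, 2))


def pearson(x, y):
    """Pearson correlation coefficient between two 1D series."""
    return float(np.corrcoef(x, y)[0, 1])
```

In practice the index arguments would be the Cα atom indices of the residues named in the text, and the second series passed to `pearson` would be the projection of each frame onto the relevant TICA component.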
The TIC space was clustered with the k-means algorithm using one cluster center per basin, and analysis of the discretized trajectories revealed high metastability with very few transitions (Supplementary Fig. 6-7). States fulfilling the conditions shown in Supplementary Fig. 9 were considered as C- and O-reactive. Specifically, we considered three criteria for C-glycosylation and two for O-glycosylation: (1) a “nucleophilic attack” criterion, involving a reactive distance between UDP and the acceptor center (either C or O) and a proper angle of attack, (2) a “proton transfer” criterion, involving the establishment of a hydrogen bond between the activating hydroxyl and the catalytic histidine, and (3) a “ring orientation” criterion, which ensures the proper orientation of the reactive ring with respect to the activated glucose. A state fulfilling both 1 and 2 is considered O-reactive, while a state fulfilling all three conditions is considered C-reactive. A snapshot with both gates closed and fulfilling the conditions of a C-reactive state was selected to explore the reaction mechanism.

Quantum mechanics / Molecular mechanics (QM/MM) simulations were carried out with CP2K 41 (v 8.2) coupled with Plumed 42 (v 2.7.3). The QM region included 116 atoms for the first reaction step: 3 water molecules surrounding UDP, the side chains of the His25, Asp122, Ser286, His366, Asn370, and Ser371 residues, the phloretin and UDP-glucose substrates capped through C-C bonds, and 8 capping hydrogens to saturate bonds at the boundary between the QM and the MM regions. For the second and third reaction steps, 3 additional water molecules were included, leading to 125 QM atoms in total (Fig. 4a).
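The reactivity criteria listed above, before the QM/MM setup, amount to a simple decision rule, sketched below under stated assumptions. The distance and angle thresholds are hypothetical placeholders (the actual conditions are given in Supplementary Fig. 9 and are not reproduced here), and the function names are illustrative only.

```python
def attack_ok(distance, angle, max_distance=4.0, min_angle=150.0):
    """Criterion 1 ("nucleophilic attack") for one acceptor center.

    distance: separation between the donor and the acceptor center (angstroms).
    angle: angle of attack (degrees).
    The default thresholds are hypothetical and only serve the example.
    """
    return distance <= max_distance and angle >= min_angle


def classify_state(attack_c, attack_o, his_hbond, ring_oriented):
    """Label a state from boolean criteria.

    attack_c / attack_o: criterion 1 evaluated on the C or O center.
    his_hbond: criterion 2, hydrogen bond between the activating hydroxyl
               and the catalytic histidine.
    ring_oriented: criterion 3, proper orientation of the reactive ring.
    """
    labels = set()
    if attack_o and his_hbond:
        labels.add("O-reactive")   # criteria 1 and 2
    if attack_c and his_hbond and ring_oriented:
        labels.add("C-reactive")   # criteria 1, 2, and 3
    return labels
```

Evaluating criterion 1 separately for the C and O centers is a design choice of this sketch; it lets a single frame be tested against both definitions without assuming they are mutually exclusive.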
Molecular orbitals were expanded using a combination of atom-centered Gaussian functions (triple zeta valence polarized basis set with Goedecker-Teter-Hutter pseudopotentials 43 ) and auxiliary plane-waves with a cutoff of 350 Ry distributed in a cell of 20.8 x 27.9 x 22.5 Å. The PBE 44 density functional and DFT-D3 45 van der Waals corrections were used to describe the Hamiltonian of the system. This selection of DFT functional and basis sets represents a good compromise between accuracy and computational cost. All simulations were run with a timestep of 0.5 fs, within the NVT ensemble, at a temperature of 310 K controlled by a CSVR 46 thermostat (Canonical Sampling through Velocity Rescaling). The system was equilibrated for 3 ps, and subsequently metadynamics 47 was used to unveil the reaction mechanism of C-glycosylation. To this end, three independent and consecutive simulations were carried out, one for each of the following steps: (1) C-C bond formation, (2) σ-complex stabilization, and (3) C-H deprotonation. The collective variables (CVs) used in each simulation are shown in Supplementary Fig. 13, together with the integrated free energy landscapes. Gaussian height, width, and pace were set to 1 kcal·mol⁻¹, 0.4 Å, and 50 fs for the first step, 0.5 kcal·mol⁻¹, 0.2 Å, and 50 fs for the second step, and 1 kcal·mol⁻¹, 0.2 Å (for CV1 and CV2), and 50 fs for the third step. Additionally, for the first step the parameters were reduced to 0.5 kcal·mol⁻¹, 0.4 Å, and 100 fs after 8 ps to obtain a more accurate barrier. Simulations were stopped following a first crossing criterion 48 , with a total simulation time of 12 ps for the first step, 11 ps for the second, and 32 ps for the third.

Small molecule QM calculations were done with Orca 49 (v 4.2.1). The PBE 44 functional was considered for the Hamiltonian, using a def2-TZVP basis set. The SMD 50 implicit solvation model was used considering water as solvent.
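To give a sense of scale for the metadynamics settings reported above, the following short calculation estimates how many Gaussians were deposited in each of the three steps from the stated simulation lengths and deposition paces. It is a back-of-the-envelope sketch and assumes that the 8 ps run at the initial pace in the first step is counted within the 12 ps total; the variable and function names are illustrative.

```python
TIMESTEP_FS = 0.5  # QM/MM integration time step reported in the text


def count_gaussians(segments):
    """Number of deposited Gaussians for a list of (duration_ps, pace_fs) segments."""
    return sum(round(duration_ps * 1000.0 / pace_fs) for duration_ps, pace_fs in segments)


# (duration in ps, deposition pace in fs) per step, taken from the text.
schedules = {
    "C-C bond formation": [(8.0, 50.0), (4.0, 100.0)],  # assumes 8 ps + 4 ps = 12 ps
    "sigma-complex stabilization": [(11.0, 50.0)],
    "C-H deprotonation": [(32.0, 50.0)],
}

hills = {step: count_gaussians(segments) for step, segments in schedules.items()}
md_steps_per_gaussian = 50.0 / TIMESTEP_FS  # MD steps between depositions at a 50 fs pace

for step, n in hills.items():
    print(f"{step}: {n} Gaussians")
```

Under these assumptions the three steps correspond to 200, 220, and 640 Gaussians, respectively, with one Gaussian added every 100 MD steps at the 50 fs pace.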
Visual inspection of structures and molecular rendering were done with Pymol (v 2.5.0). MDtraj 51 (v 1.9.7) was used for the computation of different structural parameters, including the solvent accessible surface area of phloretin. Fpocket 52 (v 4.0) was used to compute pocket volumes. All plots were done with Matplotlib (v 3.5.1). The fittings shown in Fig. 2a and 2b were done with Scipy (v 1.7.3), using Gaussian and logistic functions, respectively, and the histogram fitting shown in Fig. 3b with a 1D cubic interpolation.

Declarations

6. Data availability

Crystallographic data for the structures reported in this article have been deposited at the Protein Data Bank, under PDB IDs 8CGQ and 8CHD. Copies of the data can be obtained free of charge via https://www.rcsb.org. All molecular structures, trajectories, and notebooks to reproduce the results of this paper are available in Zenodo at https://doi.org/X/X. All other relevant data generated and analysed during this study are included in this article and its supplementary information.

7. Acknowledgements

L. R. acknowledges funding by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement no. 897414, and Ministerio de Ciencia e Innovación / Agencia Estatal de Investigación for the RYC2021-032530-I grant. This work was funded by the Novo Nordisk foundation through grants NNF20CC0035580 and NNF18OC0034744. This work was performed on the Horeka supercomputer, funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research (Germany).

8. Author contributions

L. R., D. T., F. N. and D. W. conceptualized the study; D. T., G. B., M. H., and N. P. performed the screening, mutational and pH analyses; D. T., G. B., F. F., and D. W. performed crystallographic experiments; S. K. performed preliminary docking calculations; L. R. performed MD, QM/MM, and QM calculations; L. R. and D. T.
wrote the manuscript with contributions from all authors; L. R., F. N. and D. W. provided the funding.

References

1. Yang, D., Jang, W. D. & Lee, S. Y. Production of Carminic Acid by Metabolically Engineered Escherichia coli. J Am Chem Soc 143, 5364–5377 (2021).
2. Wang, Z. et al. Effects of aloesin on melanogenesis in pigmented skin equivalents. Int J Cosmet Sci 30, 121–130 (2008).
3. Li, Y. et al. Discovery of a Potent, Selective Renal Sodium-Dependent Glucose Cotransporter 2 (SGLT2) Inhibitor (HSK0935) for the Treatment of Type 2 Diabetes. J Med Chem 60, 4173–4184 (2017).
4. Yang, Y. & Yu, B. Recent Advances in the Chemical Synthesis of C-Glycosides. Chem Rev 117, 12281–12356 (2017).
5. Zhang, M. et al. Functional Characterization and Structural Basis of an Efficient Di-C-glycosyltransferase from Glycyrrhiza glabra. J Am Chem Soc 142, 3506–3512 (2020).
6. He, J. Bin et al. Molecular and Structural Characterization of a Promiscuous C-Glycosyltransferase from Trollius chinensis. Angewandte Chemie - International Edition 58, 11513–11520 (2019).
7. Teze, D. et al. O-/N-/S-Specificity in Glycosyltransferase Catalysis: From Mechanistic Understanding to Engineering. ACS Catal 11, 1810–1815 (2021).
8. Galabov, B., Nalbantova, D., Schleyer, P. V. R. & Schaefer, H. F. Electrophilic Aromatic Substitution: New Insights into an Old Class of Reactions. Acc Chem Res 49, 1191–1199 (2016).
9. Stamenković, N., Ulrih, N. P. & Cerkovnik, J. An analysis of electrophilic aromatic substitution: a “complex approach”. Physical Chemistry Chemical Physics 23, 5051–5068 (2021).
10. Wang, Z. L. et al. Dissection of the general two-step di-C-glycosylation pathway for the biosynthesis of (iso)schaftosides in higher plants. Proc Natl Acad Sci U S A 117, 30816–30823 (2020).
11. Ferreyra, M. L. F. et al. Identification of a bifunctional Maize C- and O-glucosyltransferase. Journal of Biological Chemistry 288, 31678–31688 (2013).
12. Putkaradze, N., Gala, V. Della, Vaitkus, D., Teze, D. & Welner, D. H. Sequence mining yields 18 phloretin C-glycosyltransferases from plants for the efficient biocatalytic synthesis of nothofagin and phloretin-di-C-glycoside. Biotechnol J 18, 1–10 (2023).
13. Strichartz, G. R., Oxford, G. S. & Ramon, F. Effects of the dipolar form of phloretin on potassium conductance in squid giant axons. Biophys J 31, 229–246 (1980).
14. Lohrie, M. & Knoche, W. Dissociation and Keto-Enol Tautomerism of Phloroglucinol and Its Anions in Aqueous Solution. J Am Chem Soc 115, 919–924 (1993).
15. Prinz, J. H. et al. Markov models of molecular kinetics: Generation and validation. Journal of Chemical Physics 134, 174105 (2011).
16. Pérez-Hernández, G., Paul, F., Giorgino, T., De Fabritiis, G. & Noé, F. Identification of slow molecular order parameters for Markov model construction. J. Chem. Phys. 139, 015102 (2013).
17. Schwantes, C. R. & Pande, V. S. Improvements in Markov State Model construction reveal many non-native interactions in the folding of NTL9. J Chem Theory Comput 9, 2000–2009 (2013).
18. de Boer, R. M. et al. Regioselective glycosylation of polyphenols by family 1 glycosyltransferases: experiments and simulations. ChemRxiv (2023) doi:10.26434/chemrxiv-2023-35mmv.
19. Putkaradze, N., Teze, D., Fredslund, F. & Welner, D. H. Natural product: C-glycosyltransferases-a scarcely characterised enzymatic activity with biotechnological potential. Nat Prod Rep 38, 432–443 (2021).
20. Hofer, F., Kraml, J., Kahler, U., Kamenik, A. S. & Liedl, K. R. Catalytic Site pKa Values of Aspartic, Cysteine, and Serine Proteases: Constant pH MD Simulations. J Chem Inf Model 60, 3030–3042 (2020).
21. Ardèvol, A. & Rovira, C. The molecular mechanism of enzymatic glycosyl transfer with retention of configuration: Evidence for a short-lived oxocarbenium-like species. Angewandte Chemie - International Edition 50, 10897–10901 (2011).
22. Ardèvol, A. & Rovira, C. Reaction mechanisms in carbohydrate-active enzymes: glycoside hydrolases and glycosyltransferases. Insights from ab initio quantum mechanics/molecular mechanics dynamic simulations. J. Am. Chem. Soc. 137, 7528–7547 (2015).
23. Wang, X., Li, C., Zhou, C., Li, J. & Zhang, Y. Molecular characterization of the C-glucosylation for puerarin biosynthesis in Pueraria lobata. Plant Journal 90, 535–546 (2017).
24. Xie, K., Zhang, X., Sui, S., Ye, F. & Dai, J. Exploring and applying the substrate promiscuity of a C-glycosyltransferase in the chemo-enzymatic synthesis of bioactive C-glycosides. Nat Commun 11, 1–12 (2020).
25. Winter, G. Xia2: An expert system for macromolecular crystallography data reduction. J Appl Crystallogr 43, 186–190 (2010).
26. Kabsch, W. et al. XDS. Acta Crystallogr D Biol Crystallogr 66, 125–132 (2010).
27. Brazier-Hicks, M. et al. Characterization and engineering of the bifunctional N- and O-glucosyltransferase involved in xenobiotic metabolism in plants. Proc Natl Acad Sci U S A 104, 20238–20243 (2007).
28. McCoy, A. J. et al. Phaser crystallographic software. J Appl Crystallogr 40, 658–674 (2007).
29. Afonine, P. V. et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr 68, 352–367 (2012).
30. Emsley, P. & Cowtan, K. Coot: Model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60, 2126–2132 (2004).
31. Chen, V. B. et al. MolProbity: All-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr 66, 12–21 (2010).
32. Olsson, M. H. M., Søndergaard, C. R., Rostkowski, M. & Jensen, J. H. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J. Chem. Theor. Comput. 7, 525–537 (2011).
33. Šali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. Journal of Molecular Biology 234, 779–815 (1993). https://doi.org/10.1006/jmbi.1993.1626
34. Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. & Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
35. Salomon-Ferrer, R., Case, D. A. & Walker, R. C. An overview of the Amber biomolecular simulation package. Wiley Interdiscip Rev Comput Mol Sci 3, 198–210 (2013).
36. Maier, J. A. et al. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J Chem Theory Comput 11, 3696–3713 (2015).
37. Kirschner, K. N. et al. GLYCAM06: a generalizable biomolecular force field. Carbohydrates. J. Comput. Chem. 29, 622–655 (2008).
38. Wang, J. M., Wolf, R. M., Caldwell, J. W., Kollman, P. A. & Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 25, 1157–1174 (2004).
39. Eastman, P. et al. OpenMM 4: A reusable, extensible, hardware independent library for high performance molecular simulation. J Chem Theory Comput 9, 461–469 (2013).
40. Scherer, M. K. et al. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J Chem Theory Comput 11, 5525–5542 (2015).
41. Kühne, T. D. et al. CP2K: An electronic structure and molecular dynamics software package - Quickstep: Efficient and accurate electronic structure calculations. J. Chem. Phys. 152, 194103 (2020).
42. Tribello, G. A., Bonomi, M., Branduardi, D., Camilloni, C. & Bussi, G. PLUMED 2: New feathers for an old bird. Comp. Phys. Commun. 185, 604–613 (2014).
43. Goedecker, S., Teter, M. & Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 54, 1703–1710 (1996).
44. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
45. Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. Journal of Chemical Physics 132 (2010).
46. Bussi, G., Donadio, D. & Parrinello, M. Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 14101 (2007).
47. Laio, A. & Parrinello, M. Escaping free-energy minima. Proc. Natl. Acad. Sci. USA 99, 12562–12566 (2002).
48. Ensing, B., Laio, A., Parrinello, M. & Klein, M. L. A recipe for the computation of the free energy barrier and the lowest free energy path of concerted reactions. J Phys Chem B 109, 6676–6687 (2005).
49. Neese, F., Wennmohs, F., Becker, U. & Riplinger, C. The ORCA quantum chemistry program package. Journal of Chemical Physics 152 (2020).
50. Marenich, A. V., Cramer, C. J. & Truhlar, D. G. Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions. Journal of Physical Chemistry B 113, 6378–6396 (2009).
51. McGibbon, R. T. et al. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys J 109, 1528–1532 (2015).
52. Le Guilloux, V., Schmidtke, P. & Tuffery, P. Fpocket: An open source platform for ligand pocket detection. BMC Bioinformatics 10 (2009).

Additional Declarations

There is NO Competing Interest.

Supplementary Files

20240827CGTSI.docx (Supplementary information)
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-5591657","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":389874212,"identity":"deafb9c8-a643-4383-8ca8-4ae187eefeae","order_by":0,"name":"Ditte Welner","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA9ElEQVRIie3PvYoCMRDA8djEJpp2FnyIPRbEzlfJsGC1aHGwbCFitaXXWogvIaxtjsDarNpaWHgIVwcEsRKjyF0h8aOzyB8mMMUPMoS4XO+YMrNJJLuuYEaYoeI+EcWVyAsJHxBpBlP5vzwk1Rn72OB4XeO8CHa622iTcicHEresxFMs8DH7Zd4wqoPM4ZOwLQWyiKzEV5UMMFPMX7E6kRSwDyGFUprcI9MDjgxZFoGWx+dIRrBviIx8+E7/iP1jnuI7EPn5llYM8wFgyrZBQyzs51eXOWrdXTc5VxOd7Hv4VcaflY5DK7mNnh/xAnC5XC7XbSdiaVZtUH2o7gAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0000-0001-9297-4133","institution":"Technical University of Denmark","correspondingAuthor":true,"prefix":"","firstName":"Ditte","middleName":"","lastName":"Welner","suffix":""},{"id":389874213,"identity":"96e16204-5325-4eef-930e-c831b07a6222","order_by":1,"name":"Lluís Raich","email":"","orcid":"","institution":"Freie Universität Berlin","correspondingAuthor":false,"prefix":"","firstName":"Lluís","middleName":"","lastName":"Raich","suffix":""},{"id":389874214,"identity":"f64aed06-e5a2-4ab6-9a42-4d8b925ba796","order_by":2,"name":"David Teze","email":"","orcid":"https://orcid.org/0000-0002-6865-6108","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"David","middleName":"","lastName":"Teze","suffix":""},{"id":389874215,"identity":"52ee2817-1191-4687-8d74-7ef5a211db2c","order_by":3,"name":"Gonzalo 
Bidart","email":"","orcid":"","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"Gonzalo","middleName":"","lastName":"Bidart","suffix":""},{"id":389874216,"identity":"24334b4b-6949-4721-8691-7ae426efceb9","order_by":4,"name":"Folmer Fredslund","email":"","orcid":"","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"Folmer","middleName":"","lastName":"Fredslund","suffix":""},{"id":389874217,"identity":"28bb9a71-3ef3-4bbc-8db6-7e17959a26e9","order_by":5,"name":"Mandy Hobusch","email":"","orcid":"","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"Mandy","middleName":"","lastName":"Hobusch","suffix":""},{"id":389874218,"identity":"8b105c07-6394-4938-bbe8-c0373e4372eb","order_by":6,"name":"Sonja Kunstmann","email":"","orcid":"","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"Sonja","middleName":"","lastName":"Kunstmann","suffix":""},{"id":389874219,"identity":"bf210214-a27a-47ce-b7e1-dd9c29fbaa24","order_by":7,"name":"Natalia Putkaradze","email":"","orcid":"https://orcid.org/0000-0001-5401-8378","institution":"Technical University of Denmark","correspondingAuthor":false,"prefix":"","firstName":"Natalia","middleName":"","lastName":"Putkaradze","suffix":""},{"id":389874220,"identity":"ab9b4438-92fc-4b94-b448-2ccdac75a65a","order_by":8,"name":"Frank Noé","email":"","orcid":"","institution":"Microsoft Research AI4Science","correspondingAuthor":false,"prefix":"","firstName":"Frank","middleName":"","lastName":"Noé","suffix":""}],"badges":[],"createdAt":"2024-12-06 
07:30:28","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5591657/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5591657/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":71509752,"identity":"429f993a-ac2b-4b14-a5b9-5b6d1cc54c51","added_by":"auto","created_at":"2024-12-16 10:17:25","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":26099,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSimplified mechanisms of O- and C-glycosylation in GTs, exemplified on trihydroxyacetophenone.\u003c/strong\u003e (Top) Single step O-glycosylation, (Bottom) Two step C-glycosylation through a s-complex.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/62bb468992968b15101ac3df.png"},{"id":71509093,"identity":"9ff0056c-c0ae-4687-b21b-e192c2885213","added_by":"auto","created_at":"2024-12-16 10:09:29","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":24117,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDistinctive pH-activity profiles.\u003c/strong\u003e (A) Wildtype initial rates on phloretin (\u003cem\u003eC\u003c/em\u003e-Glc, blue) and naringenin (\u003cem\u003eO\u003c/em\u003e-Glc, red); (B) \u003cem\u003eC\u003c/em\u003e-Glc conversion rates on phloretin for mutants of the catalytic dyad.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/3468348fe8078ba50e663b71.png"},{"id":71509094,"identity":"4a7f9245-22b4-4fdb-96e6-bba264e9c275","added_by":"auto","created_at":"2024-12-16 10:09:30","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":694246,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTwo flexible gates control substrate accessibility and reactivity. 
(\u003c/strong\u003eA) Comparison of crystal structures 8CGQ (cyan, closed acceptor) and 6LF6 (blue, open acceptor) of \u003cem\u003eZm\u003c/em\u003eUGT708A6. Note that the donor gate shows a slight conformational rearrangement but it is closed in both crystals. (B) Structural ensembles of closed and open states (left acceptor, right donor) obtained by MD simulations from 8CGQ. The acceptor and donor molecules are shown in yellow and black, respectively. Plots of relevant metrics along the acceptor and donor gate openings, and raw free energy landscape with the structures 8CGQ and 6LF6 projected as star symbols. (C) Opening motion of the acceptor gate in \u003cem\u003eGg\u003c/em\u003eUGT 6L5R (cyan, closed) and 6L5P (blue, open). In 6L5R the donor gate is closed and in 6L5P unresolved, indicative of a high mobility. (D) Opening motion of the donor gate in \u003cem\u003eNt\u003c/em\u003eUGT72B82 (8CHD), the closed state is present in chain B (cyan) and the open state in chain A (blue).\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/5c5ad4a283a1795526383479.png"},{"id":71509097,"identity":"5465a3d4-8884-470a-88fb-1cd9bfca1299","added_by":"auto","created_at":"2024-12-16 10:09:31","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":395947,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eComplete enzymatic mechanism of \u003c/strong\u003e\u003cem\u003e\u003cstrong\u003eC\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e-glycosylation.\u003c/strong\u003e (A) Closeup of the QM region. Enzymatic residues are shown in cyan, and the donor and acceptor substrates are shown in black and yellow, respectively. The dashed red lines indicate QM-MM boundaries capped with hydrogen atoms. (B) Proton transfer exchange between phloretin o-OH and H25 during the equilibration period. 
(C) Evolution of key distances along the \u003cem\u003eC\u003c/em\u003e-glycosylation mechanism. Vertical gray lines separate the three mechanistic steps. (D) Representative snapshots and free energy profile along the catalytic pathway.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/f23e749e58745125f731787a.png"},{"id":71509091,"identity":"08a38339-b64c-45fb-869e-580f0e4ce0d7","added_by":"auto","created_at":"2024-12-16 10:09:27","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":32789,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRing nucleophilicity and stabilization by deprotonation correlate with substrate reactivity.\u003c/strong\u003e (a) Chemical structure of the evaluated acceptors. (b) Energy differences between the stabler \u003cem\u003eO\u003c/em\u003e-, s-, and \u003cem\u003eC\u003c/em\u003e-glycosides of each acceptor and a reference s-complex of 2,4,6-hydroxyacetophenone (acceptor 4; see chemical reaction examples below). Deprotonation energies of s-complexes are computed using HPO\u003csub\u003e4\u003c/sub\u003e\u003csup\u003e2-\u003c/sup\u003e as base. 
All energies are computed at the PBE level and implicit water solvent.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/c0c2f63a611e71954873dfc0.png"},{"id":71509760,"identity":"c99eab7e-6fb3-4713-9d2f-2c17ff8cc122","added_by":"auto","created_at":"2024-12-16 10:17:33","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1692078,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/6fde5b6a-0e46-42be-9d12-872d099fb0af.pdf"},{"id":71509749,"identity":"7473e428-286a-48b2-8c55-1dce2ebd270a","added_by":"auto","created_at":"2024-12-16 10:17:22","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":9668507,"visible":true,"origin":"","legend":"Supplementary information","description":"","filename":"20240827CGTSI.docx","url":"https://assets-eu.researchsquare.com/files/rs-5591657/v1/26cf6a7fb2a91bd82085b2e0.docx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"A spontaneous proton transfer is key for enzymatic C-glycosylation and restricts the scope of natural C-glycosides","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003e\u003cem\u003eC-\u003c/em\u003eglycosides are high value products in several industries, including food, cosmetics, and pharmaceuticals. For example, carminic acid derived from the insect \u003cem\u003eDactylopius coccus\u003c/em\u003e is one of the oldest natural red colorants in food\u003csup\u003e1\u003c/sup\u003e, aloesin extracted from \u003cem\u003ealoe vera\u003c/em\u003e is a depigmenting agent for the skin\u003csup\u003e2\u003c/sup\u003e, and dapagliflozin inspired from phlorizin is a potent and selective SGLT2 inhibitor to treat type II diabetes\u003csup\u003e3\u003c/sup\u003e. 
Hence, efforts are devoted to establishing synthetic pathways that could give easy access to these molecules\u003csup\u003e4\u003c/sup\u003e. One of the main synthetic challenges is to overcome the number of undesired regio- and stereoisomers that can be generated when forming a glycoside. This renders most chemical pathways impractical, as they require several steps of protection, activation, and deprotection of reactive centers, making the synthetic process complex and diminishing product yields.\u003c/p\u003e\n\u003cp\u003eIn nature, glycosides are synthesized by family 1 glycosyl transferases (GT1s). These are inverting enzymes that generally use UDP-activated\u0026nbsp;a-sugars (\u003cem\u003ee.g.\u003c/em\u003e UDP-glucose) to generate single\u0026nbsp;b-products. They are thus also known as UDP-dependent glycosyl transferases, or UGTs, and can transfer glycosyl moieties to C\u003csub\u003esp2\u003c/sub\u003e, N\u003csub\u003esp3\u003c/sub\u003e, O\u003csub\u003esp3\u003c/sub\u003e, and S\u003csub\u003esp3\u003c/sub\u003e atoms of the acceptor molecule. Interestingly, the four types of glycosylation occur with the same catalytic machinery, most commonly a His-Asp dyad, and a single enzyme can catalyze all types of glycosidic bonds\u003csup\u003e5,6\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eWhile the mechanisms of \u003cem\u003eN\u003c/em\u003e-, \u003cem\u003eO\u003c/em\u003e-, and \u003cem\u003eS\u003c/em\u003e-glycosylation have been investigated in-depth\u003csup\u003e7\u003c/sup\u003e, the \u003cem\u003eC\u003c/em\u003e-glycosylation mechanism is still unclear. It is anticipated to differ significantly from the others due to the involvement of a sp2-hybridized carbon atom, suggesting a mechanism of aromatic electrophilic substitution (S\u003csub\u003eE\u003c/sub\u003eAr)\u003csup\u003e8,9\u003c/sup\u003e. This mechanism typically involves the formation of a\u0026nbsp;s-complex and an elimination step to restore aromatization (Fig. 1). 
However, given that the catalytic histidine would become protonated during the formation of the\u0026nbsp;s-complex, it is not clear what would serve as a base in the elimination step. Previous studies on \u003cem\u003eC\u003c/em\u003e-GT mutants of this residue have demonstrated that \u003cem\u003eC\u003c/em\u003e-glycosylation products can still form, albeit at considerably slower rates\u003csup\u003e10\u003c/sup\u003e. This observation suggests that alternative residues or solvent molecules may play a role in facilitating proton transfers for both chemical steps, although currently there is no evidence to strengthen this hypothesis.\u003c/p\u003e\n\u003cp\u003eTo deepen our understanding of \u003cem\u003eC\u003c/em\u003e-GTs, we here focused on an enzyme from maize\u003csup\u003e11\u003c/sup\u003e, \u003cem\u003eZm\u003c/em\u003eUGT708A6, the crystal structure of which is available (PDB 6LF6)\u003csup\u003e10,11\u003c/sup\u003e. We undertook an interdisciplinary approach combining experiments and atomistic simulations. We screened 125 glycosyl acceptors, and observed \u003cem\u003eO\u003c/em\u003e-, \u003cem\u003eS\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylation activity on different substrates. The pH profiles for \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylation are significantly different, with \u003cem\u003eC\u003c/em\u003e-glycosylation being much more impacted by pH using either the WT enzyme or a range of catalytic mutants, suggesting that proton transfer steps are critical during \u003cem\u003eC\u003c/em\u003e-glycosylation. We also solved the UDP-bound crystal structure of this enzyme, revealing a tightly packed, apparently rigid, and hydrophobic acceptor binding site. This aspect contrasts with the available structure and the promiscuous activity of the enzyme, suggesting that structural plasticity and perhaps dynamics may be relevant for the access and accommodation of different substrates. 
We then performed extensive molecular dynamics simulations to explore the conformational landscape of the enzyme, characterizing two flexible gates whose opening motions modulate substrate reactivity. Moreover, these gates seem to be conserved across GTs, and we show the crystal structure of a distantly related GT1, \u003cem\u003eNt\u003c/em\u003eUGT72B82, which presents structures with both states of the donor gate in the same asymmetric unit. We studied the mechanism of \u003cem\u003eC\u003c/em\u003e-glycosylation, providing evidence for a classic two-step S\u003csub\u003eE\u003c/sub\u003eAr reaction (C-C bond formation and C-H proton abstraction) passing through a key\u0026nbsp;σ-complex\u0026nbsp;that requires the assistance of water molecules to be stabilized. Finally, we showed that the intrinsic ability of a chemical to stabilize the\u0026nbsp;σ-complex determines whether a given substrate can be \u003cem\u003eC\u003c/em\u003e-glycosylated or not, in accordance with the relative scarcity of \u003cem\u003eC\u003c/em\u003e-glycosides in nature.\u003c/p\u003e"},{"header":"2. Results and discussion","content":"\u003cp\u003e\u003cstrong\u003ePromiscuous glycosylation activity with a conserved catalytic machinery.\u0026nbsp;\u003c/strong\u003eWe screened 125 glycoside acceptors and found \u003cem\u003eZm\u003c/em\u003eUGT708A6 to have a clear preference for chalcones, displaying high \u003cem\u003eC\u003c/em\u003e-glycosylation activity against phloretin, phloracetophenone, and 3-hydroxyphloretin. It also presents \u003cem\u003eO\u003c/em\u003e-glycosylation activity on naringenin, the lignans magnolol and honokiol, \u003cem\u003eS\u003c/em\u003e-glycosylation activity on 3,4-dichlorothiophenol, but no \u003cem\u003eN\u003c/em\u003e-glycosylation activity on the corresponding 3,4-dichloroaniline (Supplementary Table 1). In total, significant activity was found on 24 acceptors. 
Interestingly, \u003cem\u003eZm\u003c/em\u003eUGT708A6 has a pH optimum of 8 for \u003cem\u003eO\u003c/em\u003e-glycosylation on naringenin, and a bimodal pH-activity profile for \u003cem\u003eC\u003c/em\u003e-glycosylation on phloretin, with a minimum at pH 8 flanked by higher activity zones between pH 6-7 and pH 10 (Fig. 2a). A similar decrease in conversion rates was observed\u0026nbsp;at pH 7.75 to 8.75 (Supplementary Fig. 1). These observations are in accordance with newly discovered \u003cem\u003eC\u003c/em\u003e-GTs showing high activity in unbuffered diluted sodium hydroxide\u003csup\u003e12\u003c/sup\u003e, and hint at a mechanistically distinct pathway at high pH. Indeed, a high activity at pH 6-7 is consistent with an activity against a fully protonated phloretin (p\u003cem\u003eK\u003c/em\u003e\u003csub\u003ea\u003c/sub\u003e 7.4\u003csup\u003e13\u003c/sup\u003e), while at pH 10 a few di-anionic keto/enol forms are expected to be in equilibrium, according to studies on phloroglucinol\u003csup\u003e14\u003c/sup\u003e. The kinetic parameters showed a moderately higher activity at pH 10 compared to 7 (Supplementary Fig. 2).\u003c/p\u003e\n\u003cp\u003eTo shed further light on the different pH effects on \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylation, we investigated six conservative mutations of the catalytic dyad. We found that all mutants except H25A displayed decreased \u003cem\u003eO\u003c/em\u003e-activity on naringenin at neutral pH (Fig. 2b and Supplementary Fig. 2). No mutant showed detectable \u003cem\u003eC\u003c/em\u003e-activity on phloretin at neutral pH, but they all recovered activity at higher pH. This\u0026nbsp;indicates that the catalytic histidine is critical for the \u003cem\u003eC\u003c/em\u003e-glycosylation of phloretin, for the glycosylation step itself and/or to deprotonate the\u0026nbsp;σ-complex and recover aromaticity. 
Note that no variant displayed a specificity change, \u003cem\u003ei.e.\u003c/em\u003e no \u003cem\u003eC\u003c/em\u003e-glycosylation activity was observed on naringenin, nor \u003cem\u003eO\u003c/em\u003e-glycosylation activity on phloretin. Moreover, at high pH, mutants introducing negative charges exhibited reduced activity compared to mutants eliminating them (\u003cem\u003ee.g.\u003c/em\u003e H25D and H25E are the least efficient mutants, while D122N is the most active), hinting at a repulsive interaction with either the substrate or reaction intermediates.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTwo flexible gates control substrate accessibility and reactivity.\u0026nbsp;\u003c/strong\u003eWe solved the structure of \u003cem\u003eZm\u003c/em\u003eUGT708A6 in complex with UDP at a resolution of 2.04 Å (PDB 8CGQ, Fig. 3a and Supplementary Table 2). The overall fold and the positioning of individual residues closely resembled the previously reported structure of \u003cem\u003eZm\u003c/em\u003eUGT708A6 (PDB 6LF6\u003csup\u003e10\u003c/sup\u003e). Nonetheless, we observed a significant difference adjacent to the acceptor pocket, where our structure displayed a smaller, more compact cavity compared to the larger solvent-exposed cavity present in 6LF6 (volumes of 1056 and 1992 Å\u003csup\u003e3\u003c/sup\u003e, respectively). This difference arises from the displacement of the\u0026nbsp;α\u003csub\u003e3\u003c/sub\u003e-helix from the binding site, while F94 shifts its orientation away from F201. Additionally, at the donor site, we noted a slight rearrangement of a loop that covers the UDP binding pocket, indicating a plasticity that could be relevant to explain the access of UDP-Glc into the buried binding site. 
This flexibility can also explain the substrate promiscuity, as also evidenced by other studies.\u003csup\u003e6\u003c/sup\u003e\u003c/p\u003e\n\u003cp\u003eWe further characterized the dynamical nature of \u003cem\u003eZm\u003c/em\u003eUGT708A6 by computational methods. Starting from our crystal structure (8CGQ), we modelled a complex between UDP-Glucose and phloretin at physiological conditions. We launched 128 parallel MD simulations of 100 ns each, collecting a total of 12.8\u0026nbsp;µs of cumulative data. Next, we analyzed all trajectories using Markov state models\u003csup\u003e15–17\u003c/sup\u003e. We used a time-lagged independent component analysis (TICA) and unveiled an intricate conformational landscape displaying several minima (Supplementary Fig. 3‒7). This helped us to identify four biologically relevant states from the slowest TICs, corresponding to the open/closed forms of the two flexible gates that cover the donor and acceptor substrates (Fig. 3b).\u003c/p\u003e\n\u003cp\u003eThe opening of the donor gate involves the detachment of a flexible loop (residues S315-T323) that sits on top of the\u0026nbsp;α\u003csub\u003e2\u003c/sub\u003e-helix, locked by the interactions of D319 with the backbone amides that are capping the helix (\u003cem\u003ee.g\u003c/em\u003e. S55; Fig. 3b right). This motion leaves S286 exposed to the solvent, a residue\u0026nbsp;located in an internal loop that covers UDP and whose role is related to substrate recognition and stabilization (Supplementary Fig. 8 for a further shift of S286 that leaves UDP more exposed to the solvent). Therefore, the opening/closing of this gate is not only a necessary step for the access of the donor substrate into the catalytic site, but also for the establishment of key interactions that favor the reaction.\u003c/p\u003e\n\u003cp\u003eThe opening of the acceptor gate allowed us to establish a connection between our crystal structure and 6LF6 (free energy landscape in Fig. 3b). 
We observed the same\u0026nbsp;displacement of the\u0026nbsp;α\u003csub\u003e3\u003c/sub\u003e-helix from the binding pocket, including the shift of F94, which leaves the substrate exposed to the solvent (Fig. 3b left). Crucially, this motion affects the number of reactive states for \u003cem\u003eC\u003c/em\u003e- and \u003cem\u003eO\u003c/em\u003e-glycosylation, both being less likely to occur in the open state (Fig. 3b and Supplementary Fig. 9-10 for reaction criteria and structural renders). The catalytic form of the enzyme is thus likely the closed state, as it is the one in which the substrate is spatially restricted and can fulfill the reactive criteria more frequently.\u003c/p\u003e\n\u003cp\u003eWe found two additional \u003cem\u003eC\u003c/em\u003e-GT structures in the PDB that further confirm the existence of the open/closed states (PDBs 6L5R and 6L5P; Fig. 3c)\u003csup\u003e5\u003c/sup\u003e. Interestingly, in 6L5P the donor gate is not resolved, indicative of an open state with high mobility. Similarly, in the acceptor gate the side chain of F92 (F94 in \u003cem\u003eZm\u003c/em\u003eUGT708A6) is unresolved and the Cα-Cβ\u0026nbsp;bond is pointing towards the solvent, in line with what we find in our simulations (Supplementary Fig. 11). These similarities are remarkable considering that the \u003cem\u003eC\u003c/em\u003e-GT structures have only 45% sequence identity with \u003cem\u003eZm\u003c/em\u003eUGT708A6. We also solved the structure of \u003cem\u003eNt\u003c/em\u003eUGT72B82 (PDB 8CHD), an \u003cem\u003eO\u003c/em\u003e-GT from \u003cem\u003eNicotiana tabacum\u003c/em\u003e\u003csup\u003e18\u003c/sup\u003e with only 29% sequence identity to \u003cem\u003eZm\u003c/em\u003eUGT708A6, and observed a large difference in the conformation of the donor gate in the two enzyme molecules of the asymmetric unit (Fig. 3d and Supplementary Table 2 for data collection and refinement statistics). 
In one molecule, the donor gate is very open, exposing UDP to the solvent and leaving it ready to be replaced by a new UDP-Glc donor substrate. Overall, these observations support that the opening/closing motions unveiled by MD simulations are likely a common feature in GT1s, even for evolutionarily distant members.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eC\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e-glycosylation develops through a stepwise mechanism mediated by water stabilization.\u003c/strong\u003e To complete our understanding of \u003cem\u003eZm\u003c/em\u003eUGT708A6, we studied the reaction mechanism of \u003cem\u003eC\u003c/em\u003e-glycosylation. The mechanistic details are currently unclear, and only a few pathways have been proposed based on chemical intuition\u003csup\u003e19\u003c/sup\u003e. We used QM/MM simulations to address this challenge, treating a small active region at DFT level and the rest with the MM force field (Fig. 4a and details in the Methods section). We selected a reactive state from the most populated region, having both gates closed as in our crystal structure.\u003c/p\u003e\n\u003cp\u003eDuring the initial equilibration of the complex, we observed spontaneous back-and-forth proton transfers from the \u003cem\u003eortho\u003c/em\u003e-hydroxyl group of phloretin to the catalytic histidine (o-OH and H25 in Fig. 4). This agrees with the expected acidity of phloretin hydroxyl groups (p\u003cem\u003eK\u003c/em\u003e\u003csub\u003ea\u003c/sub\u003e 7.4\u003csup\u003e13\u003c/sup\u003e) and the H25-D122 dyad (p\u003cem\u003eK\u003c/em\u003e\u003csub\u003ea\u003c/sub\u003e 7.5 for chymotrypsin\u003csup\u003e20\u003c/sup\u003e). From a mechanistic point of view, the deprotonation of \u003cem\u003eo\u003c/em\u003e-OH increases the nucleophilicity of the carbons at \u003cem\u003eortho\u003c/em\u003e- and \u003cem\u003epara\u003c/em\u003e- positions, as evidenced by their charges (Supplementary Fig. 12). 
This is a key aspect to favor the formation of the C-C bond with the anomeric carbon of the glycan, which is expected to be positively charged in the transition state region.\u003c/p\u003e\n\u003cp\u003eNext, we used metadynamics to enhance the sampling of high-energy configurations and explore reactive pathways (see Methods and Supplementary Fig. 13). We first characterized the C-C bond formation, which proceeds with the dissociation of the C1-O\u003csub\u003eP\u003c/sub\u003e bond between Glc and UDP, followed by the approach of Glc to phloretin and the formation of the C1-Cr bond (Fig. 4c and states R, TS1, and I1 in Fig. 4d). As expected, this approach makes the \u003cem\u003eo\u003c/em\u003e-OH proton of phloretin more acidic, and it ends up being fully transferred to H25 before the transition state, releasing a pair of electrons that can delocalize through the phenolic ring and help to form the C-C bond.\u003c/p\u003e\n\u003cp\u003eFormally, this first step is an electrophilic migration of Glc from UDP to phloretin, passing through a flat energy region that corresponds to a short-lived oxocarbenium ion. This is similar to previous results obtained for retaining \u003cem\u003eO\u003c/em\u003e-GTs of different families\u003csup\u003e21,22\u003c/sup\u003e, even though the system we study is a \u003cem\u003eC\u003c/em\u003e-GT of inverting type. The distance that Glc must travel before reaching the acceptor is likely the reason behind the appearance of this species.\u003c/p\u003e\n\u003cp\u003eThe reaction barrier for this step is ~20 kcal·mol\u003csup\u003e‒1\u003c/sup\u003e, in close agreement with the value derived from the catalytic constant (5 s\u003csup\u003e‒1\u003c/sup\u003e; ~17 kcal·mol\u003csup\u003e‒1\u003c/sup\u003e). 
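As a rough cross-check of the conversion quoted above, the Eyring equation from transition state theory relates a rate constant k to an activation free energy, ΔG‡ = RT ln(k_B T / (h k)). The sketch below applies it to the reported catalytic constant of 5 s‒1. The function name, the assumed transmission coefficient of 1, and the two temperatures (293 K as in the kinetic assays, 310 K as in the simulations) are illustrative choices and are not taken from the authors' workflow.

```python
import math

# Physical constants
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_barrier_kcal(k_cat, temperature):
    """Activation free energy (kcal/mol) from a rate constant (1/s).

    Assumes a transmission coefficient of 1 (plain Eyring equation).
    """
    return R_KCAL * temperature * math.log(K_B * temperature / (H * k_cat))


# Hypothetical usage: k_cat = 5 1/s at the assay and simulation temperatures
for temp in (293.0, 310.0):
    barrier = eyring_barrier_kcal(5.0, temp)
    print(f"T = {temp:.0f} K: barrier = {barrier:.1f} kcal/mol")
```

Under these assumptions both temperatures give a barrier of roughly 16 to 17 kcal·mol‒1, in line with the experimental estimate quoted in the text.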
Moreover, this is the rate-limiting step along the entire free energy profile, which is in line with the fact that, experimentally, both \u003cem\u003eC\u003c/em\u003e- and \u003cem\u003eO\u003c/em\u003e-glycosylation have relatively similar rates, suggesting that the breaking of the UDP-Glc bond may be rate-determining.\u0026nbsp;Notably, the formed intermediate (I1) is ~16 kcal·mol\u003csup\u003e‒1\u003c/sup\u003e high in energy, mainly because the phenolic ring loses aromaticity after the electrophilic addition. More importantly, in this state H25 is not able to act as a base for the subsequent C-H proton abstraction (Cr-Hr bond in Fig. 4d), given that it has already received a proton from the \u003cem\u003eo\u003c/em\u003e-OH. Ideally, H25 should transfer back this proton to the phosphate group, neutralizing the charge developed during the reaction and restoring the basicity of the catalytic histidine. However, this does not seem geometrically feasible.\u003c/p\u003e\n\u003cp\u003eAt this point, we reasoned that water molecules could be responsible for mediating the C-H abstraction. Therefore, we included a subset of them in the QM region, enabling their direct participation in chemical reorganizations. We observed a spontaneous and irreversible proton transfer from the\u003cem\u003e\u0026nbsp;p\u003c/em\u003e-OH of\u0026nbsp;phloretin to the highly charged phosphate, mediated by a water molecule and the 2-OH of Glc (see states I1 and I2 in Fig. 4d). This rapid transfer highlights that the OH groups of phloretin become very acidic in the adduct state, and their deprotonation allows the recovery of partial aromaticity through the phenolic ring, rendering a much more stable intermediate. Subsequently, H25 performs a stepwise acid/base catalysis to abstract the C-H proton (see states I3, TS2, and P in Fig. 4d). 
First, it protonates the \u003cem\u003eo\u003c/em\u003e-OH of phloretin, rendering a similar state as I1 in terms of energy, and then acts as a base, abstracting the C-H proton with a very low activation energy (~2 kcal·mol\u003csup\u003e‒1\u003c/sup\u003e). This process leads to the final product, a stable glycoconjugate linked by a \u003cem\u003eC\u003c/em\u003e-\u003cem\u003eC\u003c/em\u003e bond.\u003c/p\u003e\n\u003cp\u003eOverall, our simulations indicate that \u003cem\u003eC\u003c/em\u003e-GTs operate through an electrophilic aromatic substitution that involves a classical addition-elimination pathway (C-C bond formation and C-H proton abstraction), together with a critical reorganization of the intermediate in order to stabilize it and allow the recovery of H25 basicity.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eThe\u003cem\u003e\u0026nbsp;C\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003e-glycosylation mechanism restricts the acceptor substrate scope.\u003c/strong\u003e One of the most critical aspects of the mechanism is the stabilization of the\u0026nbsp;σ-complex through the spontaneous deprotonation of aromatic hydroxyls. This mechanism of stabilization resembles the keto-enol tautomerization of phloroglucinol in solution, where a seminal study showed that deprotonated forms of\u0026nbsp;σ-complexes are prevalent\u003csup\u003e14\u003c/sup\u003e. To give further insights into this stabilization mechanism, as well as to the structural factors of the acceptor that may favor or disfavor \u003cem\u003eC\u003c/em\u003e-glycosylation, we compared the energetics between \u003cem\u003eO\u003c/em\u003e-,\u0026nbsp;σ-, and \u003cem\u003eC\u003c/em\u003e-glycosides of different acceptors without the enzymatic scaffold, using full DFT calculations in implicit water solvent (Fig. 5, Supplementary Fig. 14‒25 for all the evaluated isomers, and Methods for more details). 
Our results show three key aspects to highlight: (1) \u003cem\u003eC\u003c/em\u003e-glycosylation is energetically favored for all acceptors, suggesting that \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylation are involved in a kinetic / thermodynamic competition; (2) the stability of\u0026nbsp;σ-complexes correlates with the number of OHs in the reactive ring; and (3) acceptors with 2 or 3 OHs can stabilize their\u0026nbsp;σ-complexes through deprotonation, while acceptors bearing a single OH cannot.\u003c/p\u003e\n\u003cp\u003eInterestingly, in the hydroxyflavone series, apigenin (acceptor 7) and 7,4’-dihydroxyflavone (acceptor 8) can also stabilize through the deprotonation of the distal phenolic group, while 7-hydroxyflavone (acceptor 9) cannot. This explains the prevalence of \u003cem\u003eC\u003c/em\u003e-glucosides for the former molecules, while a \u003cem\u003eC\u003c/em\u003e-glucoside has never been reported for the latter, even though the only difference is a hydroxyl far away from the glycosylation site. Notably, daidzein (acceptor 11) is also unable to significantly stabilize the\u0026nbsp;σ-complex through the deprotonation of its distal hydroxyl, even though the only difference with respect to 7,4’-dihydroxyflavone is the connectivity of the phenolic group (isoflavone vs flavone). This is because isoflavones cannot delocalize electrons from the phenolic group to the reactive scaffold, as evidenced by resonance structures. This subtle detail is in accordance with the relative scarcity of isoflavone \u003cem\u003eC\u003c/em\u003e-glycosides compared to their flavone counterparts. Indeed, only a single enzyme, \u003cem\u003ePl\u003c/em\u003eUGT43\u003csup\u003e23\u003c/sup\u003e, has been reported to present \u003cem\u003eC\u003c/em\u003e-activity on the isoflavone daidzein, albeit very low. 
Conversely, there is no clear difference between flavone and isoflavone energetics for \u003cem\u003eO\u003c/em\u003e-glycosides.\u003c/p\u003e\n\u003cp\u003eOur calculations are also useful to determine the most stable isomers of a given acceptor, allowing predictions of regioselectivities. Indeed, we observed large energy differences between\u0026nbsp;σ-complexes of \u003cem\u003eC\u003c/em\u003e-glycosylation sites and non-reactive sites, even when those surround the same reactive hydroxyl. For instance, there is about 10 kcal·mol\u003csup\u003e‒1\u003c/sup\u003e difference between the\u0026nbsp;σ-complexes of aloesin (acceptor 12) at position 8 and its analogues at position 6, both surrounding the phenolic hydroxyl at position 7 (Supplementary Fig. 25). As far as we are aware, a 6-\u003cem\u003eC\u003c/em\u003e-glycoside of aloesone has never been reported, and this can be explained purely by energetic terms.\u003c/p\u003e\n\u003cp\u003eIt is also worth noting that all \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosides have similar energies between acceptors, and\u0026nbsp;σ-complexes are the only states that show significant differences. Therefore, the relative stability of\u0026nbsp;σ-complexes is critical to discern between acceptors that will likely render \u003cem\u003eO\u003c/em\u003e- or \u003cem\u003eC\u003c/em\u003e-products. Furthermore, compounds featuring a single aromatic hydroxyl lack the capacity for spontaneous deprotonation to stabilize the\u0026nbsp;σ-complex and are also inherently less reactive due to the low nucleophilicity of the ring. Indeed, \u003cem\u003eZm\u003c/em\u003eUGT708A6 catalyzes the \u003cem\u003eC\u003c/em\u003e-glycosylation of 2,4,6-trihydroxyacetophenone (acceptor 1) about 20-fold faster compared to 2,4-dihydroxyacetophenone (acceptor 2) or 2,4,5-trihydroxyacetophenone, and appears inactive with 4-hydroxyacetophenone (acceptor 3) (Supplementary Fig. 2). 
Similarly, in a recent and compelling study, \u003cem\u003eAb\u003c/em\u003eCGT was tested against a wide array of substrates, showing that acceptors with a single hydroxyl were only \u003cem\u003eO\u003c/em\u003e-glycosylated, acceptors with two hydroxyls were \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylated, and acceptors with three hydroxyls were only \u003cem\u003eC\u003c/em\u003e-glycosylated.\u003csup\u003e24\u003c/sup\u003e Hence, these structural details of acceptor substrates emerge as the critical determinant in governing their potential for \u003cem\u003eC\u003c/em\u003e-glycosylation.\u003c/p\u003e"},{"header":"3. Conclusions","content":"\u003cp\u003eIn this work, we have characterized \u003cem\u003eZm\u003c/em\u003eUGT708A6 and its \u003cem\u003eC\u003c/em\u003e-glycosylation mechanism.\u003c/p\u003e\n\u003cp\u003eWe revealed an intricate conformational landscape characterized by the dynamic opening and closing of two gates, crucial for controlling the accessibility of donor and acceptor substrates. Our findings show that the open state of the acceptor gate exposes the substrate to the solvent, facilitating substrate binding and product release, while the closed state increases the number of reactive poses, limiting the mobility and orientation of the substrate in the binding cavity. Thus, we propose that the catalytic cycle commences with the apo state and both gates open. Subsequently, both donor and acceptor substrates bind, the gates close to enhance reactivity, the reaction step occurs, and finally the gates open again to release the products, restoring the apo state and completing the cycle. 
These results offer insights into the remarkable flexibility of \u003cem\u003eC\u003c/em\u003e-GTs, and of GT1s in general, providing a rational foundation for understanding how the two deeply buried substrates can access the active site, and how the acceptor cavity can bind a diverse array of molecules and modulate their reaction outcomes.\u003c/p\u003e\n\u003cp\u003eOur mechanistic data open several questions about the mechanism at different pH. At physiological conditions, where polyphenols are predominantly protonated, \u003cem\u003eC\u003c/em\u003e-glycosylation likely proceeds through the mechanism that we uncovered, involving a water-mediated stabilization of the\u0026nbsp;σ-complex. At more basic conditions, substrates could either enter deprotonated into the active site, moving directly from the reactant state to the stable\u0026nbsp;σ-complex, or be assisted by solvent bases to proceed with the reaction. This hypothesis is supported by the observed variations in pH-activity profile between \u003cem\u003eO\u003c/em\u003e- and \u003cem\u003eC\u003c/em\u003e-glycosylation, with the latter restored at elevated pH values, pointing to the critical importance of proton transfer steps in the mechanism. Critically, our results also suggest that substrates with a single hydroxyl may require enzymes with alternative mechanisms to make \u003cem\u003eC\u003c/em\u003e-glycosides kinetically accessible, as the\u0026nbsp;σ-complexes of these substrates are unable to stabilize through deprotonation. This may be the case of \u003cem\u003ePl\u003c/em\u003eUGT43, which has an asparagine instead of the catalytic histidine, and yet it can \u003cem\u003eC\u003c/em\u003e-glycosylate the isoflavone daidzein.\u003c/p\u003e\n\u003cp\u003eFinally, we emphasize that the potential to \u003cem\u003eC\u003c/em\u003e-glycosylate an acceptor relates directly to the intrinsic stabilization of its\u0026nbsp;σ-intermediates, whose energies can be evaluated by simple computational methods. 
Hence, we propose that, for molecules presenting a high energy\u0026nbsp;σ-intermediate, substrate engineering to \u003cem\u003eC\u003c/em\u003e-glycosylate a precursor of the desired compound may be preferable to extensive enzyme mining or engineering. It is possible that this strategy represents the natural pathway to complex \u003cem\u003eC\u003c/em\u003e-glycosides.\u003c/p\u003e"},{"header":"4. Methods","content":"\u003cp\u003e\u003cstrong\u003eKinetic assays and X-ray crystallography\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMaterials\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBuffers, chemicals and reagents were purchased from Sigma Aldrich and used without further purification.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEnzyme expression and purification.\u0026nbsp;\u003c/strong\u003eThe full-length histidine-tagged DNA sequence was cloned into a pET28a(+) expression vector by GenScript (USA), without codon optimization. The plasmid was transformed into \u003cem\u003eE. coli\u003c/em\u003e BL21 Star(DE3) (Fisher Scientific). Overexpression was induced by the addition of 250 µM IPTG to the cultures that had reached OD\u003csub\u003e600\u003c/sub\u003e = 0.8–1.0 in 2xYT medium at 37°C (200 rpm). Thereafter, the cultures were incubated for 20 hours at 20°C (200 rpm). The cultures were centrifuged, the supernatant was discarded, and the pellet was resuspended in 50 mM sodium phosphate buffer (pH 7.4). Lysis was carried out by 2 rounds of high-pressure homogenization at 10,000 psi (Avestin Emulsiflex C5). Cell debris was removed by centrifugation (15,000 × \u003cem\u003eg\u003c/em\u003e, 30 min, 4 °C), and the lysate was filtered and purified using immobilized metal affinity chromatography on an ÄKTA Pure with a Histrap FF column (Cytiva). 
The proteins were stored in 25 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer, 50 mM NaCl, pH 7.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eHPLC analysis.\u003c/strong\u003e Samples were analyzed by RP-HPLC on an Ultimate 3000 series apparatus (Dionex) with a Kinetex 2.6 µm C18 100 Å 100x4.6 mm analytical column (Phenomenex) maintained at 40°C. MilliQ water containing 0.1% formic acid and acetonitrile were used as mobile phases A and B, respectively, with the following method in percentages of mobile phase B at 1 mL/min: 0–0.5 min 2%, 0.5–1.5 min 35%, 1.5–3 min 35–80% (gradient), 3‒4.2 min 98%, 4.2‒5 min 2%. Chromatograms were recorded at 300 nm and processed with Chromeleon 7.2.7 (Dionex).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInitial rates kinetics.\u003c/strong\u003e Acceptor (phloretin or 3-hydroxyphloretin) concentrations from 0 to 250 µM were used in 25 mM Tris buffer at pH 7 and 10 in presence of 500 µM UDP-Glc. The reactions were carried out at 293 K for 30 s in presence of 48 ng/mL \u003cem\u003eZm\u003c/em\u003eUGT708A6 and quenched using 1% acetic acid. The calculated \u003cem\u003eK\u003c/em\u003e\u003csub\u003em\u003c/sub\u003e and \u003cem\u003ek\u003c/em\u003e\u003csub\u003ecat\u003c/sub\u003e values and Michaelis-Menten plots were generated and analyzed in R using the drc package.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003epH characterization.\u0026nbsp;\u003c/strong\u003eThe reactions were carried out at 298 K in 70 mM Tris-Bis-Tris (TBT) buffer in a pH range from 5 to 10, in presence of 500 µM sugar donor (UDP-Glc), 100 µM phloretin, and 10 µg/mL \u003cem\u003eZm\u003c/em\u003eUGT708A6 enzyme (100 µg/mL enzyme for the mutant enzymes). 
The reaction was quenched by 25x dilution in 0.1% acetic acid at time points 0, 5, and 10 minutes.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTime course reactions.\u0026nbsp;\u003c/strong\u003eThe reactions were carried out at 298 K in 25 mM Tris buffer pH 7, in presence of 500 µM sugar donor (UDP-Glc), 50 µM acceptor, and 15 µg/mL (2,4,6-trihydroxyacetophenone) or 150 µg/mL (2,4,5-trihydroxyacetophenone, 2,4-dihydroxyacetophenone, 4-hydroxyacetophenone) \u003cem\u003eZm\u003c/em\u003eUGT708A6 enzyme. The reaction was quenched by 25x dilution in 0.1% acetic acid at time points 0, 1, 2, 3, 4, 5, and 6 hours.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eZm\u003c/em\u003e\u003c/strong\u003e\u003cstrong\u003eUGT708A6 Crystallographic structure determination.\u003c/strong\u003e \u003cem\u003eZm\u003c/em\u003eUGT708A6 was co-crystallized with UDP-Glucose in sitting drops consisting of 0.2 μL protein solution (6.9 mg/mL in 25 mM HEPES pH 7.0, 50 mM NaCl, 1 mM DTT, 5 mM UDP-Glucose) and 0.2 μL crystallization buffer (PACT++ screen (Jena Bioscience) solution G10: Polyethylene glycol 3,350 20% w/v, BIS-TRIS propane 100 mM, pH 7.5). The drops were set up with a Phoenix crystallization robot in 3-drop Intelli-Plate 96-well crystallization plates (Art Robbins Instruments). Crystals appeared after 1 day, were cryoprotected in 10% glycerol, and mounted on a nylon cryoloop. 270° of data were collected at 100 K, 1.0000 nm, at the BioMax beamline of MAX-IV, Lund, Sweden, with a 1° oscillation and 5 s exposure time and an ADSC Q315r CCD detector. The data were processed with Xia2\u003csup\u003e25\u003c/sup\u003e and XDS\u003csup\u003e26\u003c/sup\u003e. The structure was solved by molecular replacement using PDB ID 2VCE\u003csup\u003e27\u003c/sup\u003e as a search model and Phaser\u003csup\u003e28\u003c/sup\u003e from the Phenix software package. 
The structure was refined using phenix.refine\u003csup\u003e29\u003c/sup\u003e and Coot\u003csup\u003e30\u003c/sup\u003e. The final structure was validated with MolProbity\u003csup\u003e31\u003c/sup\u003e and deposited in the Protein Data Bank with PDB ID 8CGQ.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eComputational modelling.\u003c/strong\u003e The structure of \u003cem\u003eZm\u003c/em\u003eUGT708A6 was taken from the crystal resolved in this work. Protonation states of titratable residues were assigned at pH 7 with Propka\u003csup\u003e32\u003c/sup\u003e (v 3.1) and unresolved loops were added using Modeller\u003csup\u003e33\u003c/sup\u003e (v 9.25). The coordinates of UDP-Glc and phloretin were taken by superimposition from PDB 6L5P and PDB 6L5R, respectively. The system was solvated with TIP3P\u003csup\u003e34\u003c/sup\u003e water molecules and 0.15 M of NaCl salt concentration using the tleap module of Ambertools\u003csup\u003e35\u003c/sup\u003e (v 17.0). The enzyme was parameterized with the FF14SB\u003csup\u003e36\u003c/sup\u003e force field, the glucose unit with GLYCAM\u003csup\u003e37\u003c/sup\u003e, and UDP and phloretin molecules with Gaff2\u003csup\u003e38\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003eMolecular dynamics (MD) simulations were carried out with the OpenMM\u003csup\u003e39\u003c/sup\u003e engine (v 7.4.0). A Langevin integrator with a 1 ps friction coefficient and a Monte Carlo barostat were used to maintain the temperature and pressure of the system at 310 K and 1 bar. Long-range electrostatics were computed using the Particle Mesh Ewald scheme, and van der Waals interactions were truncated with a 1 nm cutoff. A hydrogen mass repartition with a factor of 3 was used together with an integration time step of 4 fs, constraining the lengths of all bonds. A total of 128 parallel trajectories of 100 ns each were collected, starting from different seeds that were obtained from rounds of simulation runs and analyses. 
The generated data was analyzed with PyEMMA\u003csup\u003e40\u003c/sup\u003e (v 2.5.9). Crystallographic contacts were used as features to describe the system, including 2,216 pairs of residues in total. The dimensionality of this space was reduced by transforming the data with a time-lagged independent component analysis (TICA\u003csup\u003e16,17\u003c/sup\u003e), using a lag time of 10 ns and selecting up to 9 components. Projection of the data onto TIC0-TIC2 showed 4 clear and stable basins that correspond to open and closed states of the acceptor and donor gates (Supplementary Fig. 3). Single features highly correlated with these TICs were taken as simplified variables to allow clear and simple analyses of the systems. These variables are the F94-F201 C\u003csub\u003eα\u003c/sub\u003e-C\u003csub\u003eα\u003c/sub\u003e distance for the acceptor gate (0.79 Pearson correlation coefficient with TIC0), and the minimum distance between residues 53-55 C\u003csub\u003eα\u003c/sub\u003e and 318-320 C\u003csub\u003eα\u003c/sub\u003e for the donor gate (0.66 Pearson correlation coefficient with TIC2), which separate the clustered TIC densities cleanly (Supplementary Fig. 4-5). The TIC space was clustered with the k-means algorithm using one cluster center per basin, and analysis of the discretized trajectories revealed high metastability with very few transitions (Supplementary Fig. 6-7).\u003c/p\u003e\n\u003cp\u003eStates fulfilling the conditions shown in Supplementary Fig. 9 were considered as \u003cem\u003eC\u003c/em\u003e- and \u003cem\u003eO\u003c/em\u003e-reactive. 
Specifically, we considered three criteria for \u003cem\u003eC\u003c/em\u003e-glycosylation and two for \u003cem\u003eO\u003c/em\u003e-glycosylation: (1) a “nucleophilic attack” criterion, involving a reactive distance between UDP and the acceptor center (either \u003cem\u003eC\u003c/em\u003e- or \u003cem\u003eO\u003c/em\u003e-), and a proper angle of attack, (2) a “proton transfer” criterion, involving the establishment of a hydrogen bond between the activating hydroxyl and the catalytic histidine, and (3) a “ring orientation” criterion, which ensures the proper orientation of the reactive ring with respect to the activated glucose. A state fulfilling both 1 and 2 is considered as \u003cem\u003eO\u003c/em\u003e-reactive, while a state fulfilling all three conditions is considered as \u003cem\u003eC\u003c/em\u003e-reactive.\u003c/p\u003e\n\u003cp\u003eA snapshot with both gates closed and fulfilling the conditions of a \u003cem\u003eC\u003c/em\u003e-reactive state was selected to explore the reaction mechanism. Quantum mechanics/molecular mechanics (QM/MM) simulations were carried out with CP2K\u003csup\u003e41\u003c/sup\u003e (v 8.2) coupled with Plumed\u003csup\u003e42\u003c/sup\u003e (v 2.7.3). The QM region included 116 atoms for the first reaction step: 3 water molecules surrounding UDP, the side chains of His25, Asp122, Ser286, His366, Asn370, and Ser371 residues, phloretin and UDP-glucose substrates capped through C-C bonds, and 8 capping hydrogens to saturate bonds at the boundary between the QM and MM regions. For the second and the third reaction steps 3 additional water molecules were included, leading to 125 QM atoms in total (Fig. 4a). 
Molecular orbitals were expanded using a combination of atom-centered Gaussian functions (triple zeta valence polarized basis set with Goedecker-Teter-Hutter pseudopotentials\u003csup\u003e43\u003c/sup\u003e) and auxiliary plane-waves with a cutoff of 350 Ry distributed in a 20.8 × 27.9 × 22.5 Å\u003csup\u003e3\u003c/sup\u003e\u0026nbsp;cell. The PBE\u003csup\u003e44\u003c/sup\u003e density functional and DFT-D3\u003csup\u003e45\u003c/sup\u003e van der Waals corrections were used to describe the Hamiltonian of the system. This selection of DFT functional and basis sets represents a good compromise between accuracy and computational cost. All simulations were run with a timestep of 0.5 fs, within the NVT ensemble, at a temperature of 310 K controlled by a CSVR\u003csup\u003e46\u003c/sup\u003e thermostat (Canonical Sampling through Velocity Rescaling). The system was equilibrated for 3 ps, and subsequently metadynamics\u003csup\u003e47\u003c/sup\u003e was used to unveil the reaction mechanism of \u003cem\u003eC\u003c/em\u003e-glycosylation. To this end, three independent and consecutive simulations were carried out, one for each of the following steps: (1) C-C bond formation, (2)\u0026nbsp;σ-complex stabilization, and (3) C-H deprotonation. The collective variables (CVs) used in each simulation are shown in Supplementary Fig. 13, together with the integrated free energy landscapes. Gaussian height, width, and pace were set to 1 kcal·mol\u003csup\u003e-1\u003c/sup\u003e, 0.4 Å, and 50 fs for the first step, 0.5 kcal·mol\u003csup\u003e-1\u003c/sup\u003e, 0.2 Å, and 50 fs for the second step, and 1 kcal·mol\u003csup\u003e-1\u003c/sup\u003e, 0.2 Å (for CV1 and CV2), and 50 fs for the third step. Additionally, for the first step the parameters were changed to 0.5 kcal·mol\u003csup\u003e-1\u003c/sup\u003e, 0.4 Å, and 100 fs after 8 ps to obtain a more accurate barrier. 
Simulations were stopped following a first-crossing criterion\u003csup\u003e48\u003c/sup\u003e, with a total simulation time of 12 ps for the first step, 11 ps for the second, and 32 ps for the third.\u003c/p\u003e\n\u003cp\u003eSmall molecule QM calculations were done with ORCA\u003csup\u003e49\u003c/sup\u003e (v 4.2.1). The PBE\u003csup\u003e44\u003c/sup\u003e functional was used for the Hamiltonian, with a def2-TZVP basis set. The SMD\u003csup\u003e50\u003c/sup\u003e implicit solvation model was used with water as the solvent.\u003c/p\u003e\n\u003cp\u003eVisual inspection of structures and molecular rendering were done with PyMOL (v 2.5.0). MDTraj\u003csup\u003e51\u003c/sup\u003e (v 1.9.7) was used for the computation of different structural parameters, including the solvent accessible surface area of phloretin. Fpocket\u003csup\u003e52\u003c/sup\u003e (v 4.0) was used to compute pocket volumes. All plots were done with Matplotlib (v 3.5.1). The fittings shown in Fig. 2a and 2b were done with SciPy (v 1.7.3), using Gaussian and logistic functions, respectively, and the histogram fitting shown in Fig. 3b with a 1D cubic interpolation.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003e6.\u003c/strong\u003e \u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCrystallographic data for the structures reported in this article have been deposited at the Protein Data Bank, under PDB IDs 8CGQ and 8CHD. Copies of the data can be obtained free of charge via https://www.rcsb.org. All molecular structures, trajectories, and notebooks to reproduce the results of this paper are available on Zenodo at https://doi.org/X/X. \u0026nbsp;All other relevant data generated and analysed during this study are included in this article and its supplementary information.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e7. Acknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eL. R. 
acknowledges funding from the European Union\u0026rsquo;s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement no. 897414, and from the Ministerio de Ciencia e Innovaci\u0026oacute;n / Agencia Estatal de Investigaci\u0026oacute;n through the RYC2021-032530-I grant. This work was funded by the Novo Nordisk Foundation through grants NNF20CC0035580 and NNF18OC0034744. This work was performed on the HoreKa supercomputer, funded by the Ministry of Science, Research and the Arts Baden-W\u0026uuml;rttemberg and by the Federal Ministry of Education and Research (Germany).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e8. Author contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eL. R., D. T., F. N. and D. W. conceptualized the study; D. T., G. B., M. H., and N. P. performed the screening, mutational and pH analyses; D. T., G. B., F. F., and D. W. performed crystallographic experiments; S. K. performed preliminary docking calculations; L. R. performed MD, QM/MM, and QM calculations; L. R. and D. T. wrote the manuscript with contributions from all authors; L. R., F. N. and D. W. provided the funding.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eYang, D., Jang, W. D. \u0026amp; Lee, S. Y. Production of Carminic Acid by Metabolically Engineered Escherichia coli. \u003cem\u003eJ Am Chem Soc\u003c/em\u003e \u003cstrong\u003e143\u003c/strong\u003e, 5364\u0026ndash;5377 (2021).\u003c/li\u003e\n\u003cli\u003eWang, Z. \u003cem\u003eet al.\u003c/em\u003e Effects of aloesin on melanogenesis in pigmented skin equivalents. \u003cem\u003eInt J Cosmet Sci\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 121\u0026ndash;130 (2008).\u003c/li\u003e\n\u003cli\u003eLi, Y. \u003cem\u003eet al.\u003c/em\u003e Discovery of a Potent, Selective Renal Sodium-Dependent Glucose Cotransporter 2 (SGLT2) Inhibitor (HSK0935) for the Treatment of Type 2 Diabetes. 
\u003cem\u003eJ Med Chem\u003c/em\u003e \u003cstrong\u003e60\u003c/strong\u003e, 4173\u0026ndash;4184 (2017).\u003c/li\u003e\n\u003cli\u003eYang, Y. \u0026amp; Yu, B. Recent Advances in the Chemical Synthesis of C ‑ Glycosides. \u003cem\u003eChem Rev\u003c/em\u003e \u003cstrong\u003e117\u003c/strong\u003e, 12281\u0026ndash;12356 (2017).\u003c/li\u003e\n\u003cli\u003eZhang, M. \u003cem\u003eet al.\u003c/em\u003e Functional Characterization and Structural Basis of an Efficient Di-C-glycosyltransferase from Glycyrrhiza glabra. \u003cem\u003eJ Am Chem Soc\u003c/em\u003e \u003cstrong\u003e142\u003c/strong\u003e, 3506\u0026ndash;3512 (2020).\u003c/li\u003e\n\u003cli\u003eHe, J. Bin \u003cem\u003eet al.\u003c/em\u003e Molecular and Structural Characterization of a Promiscuous C-Glycosyltransferase from Trollius chinensis. \u003cem\u003eAngewandte Chemie - International Edition\u003c/em\u003e \u003cstrong\u003e58\u003c/strong\u003e, 11513\u0026ndash;11520 (2019).\u003c/li\u003e\n\u003cli\u003eTeze, D. \u003cem\u003eet al.\u003c/em\u003e O-/ N-/ S-Specificity in Glycosyltransferase Catalysis: From Mechanistic Understanding to Engineering. \u003cem\u003eACS Catal\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 1810\u0026ndash;1815 (2021).\u003c/li\u003e\n\u003cli\u003eGalabov, B., Nalbantova, D., Schleyer, P. V. R. \u0026amp; Schaefer, H. F. Electrophilic Aromatic Substitution: New Insights into an Old Class of Reactions. \u003cem\u003eAcc Chem Res\u003c/em\u003e \u003cstrong\u003e49\u003c/strong\u003e, 1191\u0026ndash;1199 (2016).\u003c/li\u003e\n\u003cli\u003eStamenković, N., Ulrih, N. P. \u0026amp; Cerkovnik, J. An analysis of electrophilic aromatic substitution: a \u0026ldquo;complex approach\u0026rdquo;. \u003cem\u003ePhysical Chemistry Chemical Physics\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 5051\u0026ndash;5068 (2021).\u003c/li\u003e\n\u003cli\u003eWang, Z. L. 
\u003cem\u003eet al.\u003c/em\u003e Dissection of the general two-step di-C-glycosylation pathway for the biosynthesis of (iso)schaftosides in higher plants. \u003cem\u003eProc Natl Acad Sci U S A\u003c/em\u003e \u003cstrong\u003e117\u003c/strong\u003e, 30816\u0026ndash;30823 (2020).\u003c/li\u003e\n\u003cli\u003eFerreyra, M. L. F. \u003cem\u003eet al.\u003c/em\u003e Identification of a bifunctional Maize C- and O-glucosyltransferase. \u003cem\u003eJournal of Biological Chemistry\u003c/em\u003e \u003cstrong\u003e288\u003c/strong\u003e, 31678\u0026ndash;31688 (2013).\u003c/li\u003e\n\u003cli\u003ePutkaradze, N., Gala, V. Della, Vaitkus, D., Teze, D. \u0026amp; Welner, D. H. Sequence mining yields 18 phloretin C-glycosyltransferases from plants for the efficient biocatalytic synthesis of nothofagin and phloretin-di-C-glycoside. \u003cem\u003eBiotechnol J\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 1\u0026ndash;10 (2023).\u003c/li\u003e\n\u003cli\u003eStrichartz, G. R., Oxford, G. S. \u0026amp; Ramon, F. Effects of the dipolar form of phloretin on potassium conductance in squid giant axons. \u003cem\u003eBiophys J\u003c/em\u003e \u003cstrong\u003e31\u003c/strong\u003e, 229\u0026ndash;246 (1980).\u003c/li\u003e\n\u003cli\u003eLohrie, M. \u0026amp; Knoche, W. Dissociation and Keto-Enol Tautomerism of Phloroglucinol and Its Anions in Aqueous Solution. \u003cem\u003eJ Am Chem Soc\u003c/em\u003e \u003cstrong\u003e115\u003c/strong\u003e, 919\u0026ndash;924 (1993).\u003c/li\u003e\n\u003cli\u003ePrinz, J. H. \u003cem\u003eet al.\u003c/em\u003e Markov models of molecular kinetics: Generation and validation. \u003cem\u003eJournal of Chemical Physics\u003c/em\u003e \u003cstrong\u003e134\u003c/strong\u003e, 174105 (2011).\u003c/li\u003e\n\u003cli\u003eP\u0026eacute;rez-Hern\u0026aacute;ndez, G., Paul, F., Giorgino, T., De Fabritiis, G. \u0026amp; No\u0026eacute;, F. Identification of slow molecular order parameters for Markov model construction. \u003cem\u003eJ. Chem. 
Phys.\u003c/em\u003e \u003cstrong\u003e139\u003c/strong\u003e, 015102 (2013).\u003c/li\u003e\n\u003cli\u003eSchwantes, C. R. \u0026amp; Pande, V. S. Improvements in Markov State Model construction reveal many non-native interactions in the folding of NTL9. \u003cem\u003eJ Chem Theory Comput\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, 2000\u0026ndash;2009 (2013).\u003c/li\u003e\n\u003cli\u003ede Boer, R. M. \u003cem\u003eet al.\u003c/em\u003e Regioselective glycosylation of polyphenols by family 1 glycosyltransferases: experiments and simulations. \u003cem\u003eChemRxiv\u003c/em\u003e (2023) doi:10.26434/chemrxiv-2023-35mmv.\u003c/li\u003e\n\u003cli\u003ePutkaradze, N., Teze, D., Fredslund, F. \u0026amp; Welner, D. H. Natural product: C-glycosyltransferases-a scarcely characterised enzymatic activity with biotechnological potential. \u003cem\u003eNat Prod Rep\u003c/em\u003e \u003cstrong\u003e38\u003c/strong\u003e, 432\u0026ndash;443 (2021).\u003c/li\u003e\n\u003cli\u003eHofer, F., Kraml, J., Kahler, U., Kamenik, A. S. \u0026amp; Liedl, K. R. Catalytic Site pKa Values of Aspartic, Cysteine, and Serine Proteases: Constant pH MD Simulations. \u003cem\u003eJ Chem Inf Model\u003c/em\u003e \u003cstrong\u003e60\u003c/strong\u003e, 3030\u0026ndash;3042 (2020).\u003c/li\u003e\n\u003cli\u003eArd\u0026egrave;vol, A. \u0026amp; Rovira, C. The molecular mechanism of enzymatic glycosyl transfer with retention of configuration: Evidence for a short-lived oxocarbenium-like species. \u003cem\u003eAngewandte Chemie - International Edition\u003c/em\u003e \u003cstrong\u003e50\u003c/strong\u003e, 10897\u0026ndash;10901 (2011).\u003c/li\u003e\n\u003cli\u003eArd\u0026egrave;vol, A. \u0026amp; Rovira, C. Reaction mechanisms in carbohydrate-active enzymes: glycoside hydrolases and glycosyltransferases. Insights from ab initio quantum mechanics/molecular mechanics dynamic simulations. \u003cem\u003eJ. Am. Chem. 
Soc.\u003c/em\u003e \u003cstrong\u003e137\u003c/strong\u003e, 7528\u0026ndash;7547 (2015).\u003c/li\u003e\n\u003cli\u003eWang, X., Li, C., Zhou, C., Li, J. \u0026amp; Zhang, Y. Molecular characterization of the C-glucosylation for puerarin biosynthesis in Pueraria lobata. \u003cem\u003ePlant Journal\u003c/em\u003e \u003cstrong\u003e90\u003c/strong\u003e, 535\u0026ndash;546 (2017).\u003c/li\u003e\n\u003cli\u003eXie, K., Zhang, X., Sui, S., Ye, F. \u0026amp; Dai, J. Exploring and applying the substrate promiscuity of a C-glycosyltransferase in the chemo-enzymatic synthesis of bioactive C-glycosides. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 1\u0026ndash;12 (2020).\u003c/li\u003e\n\u003cli\u003eWinter, G. Xia2: An expert system for macromolecular crystallography data reduction. \u003cem\u003eJ Appl Crystallogr\u003c/em\u003e \u003cstrong\u003e43\u003c/strong\u003e, 186\u0026ndash;190 (2010).\u003c/li\u003e\n\u003cli\u003eKabsch, W. \u003cem\u003eet al.\u003c/em\u003e \u003cem\u003eXDS\u003c/em\u003e. \u003cem\u003eActa Crystallogr D Biol Crystallogr\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 125\u0026ndash;132 (2010).\u003c/li\u003e\n\u003cli\u003eBrazier-Hicks, M. \u003cem\u003eet al.\u003c/em\u003e Characterization and engineering of the bifunctional N- and O-glucosyltransferase involved in xenobiotic metabolism in plants. \u003cem\u003eProc Natl Acad Sci U S A\u003c/em\u003e \u003cstrong\u003e104\u003c/strong\u003e, 20238\u0026ndash;20243 (2007).\u003c/li\u003e\n\u003cli\u003eMcCoy, A. J. \u003cem\u003eet al.\u003c/em\u003e Phaser crystallographic software. \u003cem\u003eJ Appl Crystallogr\u003c/em\u003e \u003cstrong\u003e40\u003c/strong\u003e, 658\u0026ndash;674 (2007).\u003c/li\u003e\n\u003cli\u003eAfonine, P. V. \u003cem\u003eet al.\u003c/em\u003e Towards automated crystallographic structure refinement with phenix.refine. 
\u003cem\u003eActa Crystallogr D Biol Crystallogr\u003c/em\u003e \u003cstrong\u003e68\u003c/strong\u003e, 352\u0026ndash;367 (2012).\u003c/li\u003e\n\u003cli\u003eEmsley, P. \u0026amp; Cowtan, K. Coot: Model-building tools for molecular graphics. \u003cem\u003eActa Crystallogr D Biol Crystallogr\u003c/em\u003e \u003cstrong\u003e60\u003c/strong\u003e, 2126\u0026ndash;2132 (2004).\u003c/li\u003e\n\u003cli\u003eChen, V. B. \u003cem\u003eet al.\u003c/em\u003e MolProbity: All-atom structure validation for macromolecular crystallography. \u003cem\u003eActa Crystallogr D Biol Crystallogr\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 12\u0026ndash;21 (2010).\u003c/li\u003e\n\u003cli\u003eOlsson, M. H. M., S\u0026oslash;ndergaard, C. R., Rostkowski, M. \u0026amp; Jensen, J. H. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. \u003cem\u003eJ. Chem. Theor. Comput.\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 525\u0026ndash;537 (2011).\u003c/li\u003e\n\u003cli\u003e\u0026Scaron;ali, A. \u0026amp; Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. \u003cem\u003eJournal of Molecular Biology\u003c/em\u003e vol. 234 779\u0026ndash;815 Preprint at https://doi.org/10.1006/jmbi.1993.1626 (1993).\u003c/li\u003e\n\u003cli\u003eJorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. \u0026amp; Klein, M. L. Comparison of simple potential functions for simulating liquid water. \u003cem\u003eJ. Chem. Phys.\u003c/em\u003e \u003cstrong\u003e79\u003c/strong\u003e, 926\u0026ndash;935 (1983).\u003c/li\u003e\n\u003cli\u003eSalomon-Ferrer, R., Case, D. A. \u0026amp; Walker, R. C. An overview of the Amber biomolecular simulation package. \u003cem\u003eWiley Interdiscip Rev Comput Mol Sci\u003c/em\u003e \u003cstrong\u003e3\u003c/strong\u003e, 198\u0026ndash;210 (2013).\u003c/li\u003e\n\u003cli\u003eMaier, J. A. 
\u003cem\u003eet al.\u003c/em\u003e ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. \u003cem\u003eJ Chem Theory Comput\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 3696\u0026ndash;3713 (2015).\u003c/li\u003e\n\u003cli\u003eKirschner, K. N. \u003cem\u003eet al.\u003c/em\u003e GLYCAM06: a generalizable biomolecular force field. Carbohydrates. \u003cem\u003eJ. Comput. Chem.\u003c/em\u003e \u003cstrong\u003e29\u003c/strong\u003e, 622\u0026ndash;655 (2008).\u003c/li\u003e\n\u003cli\u003eWang, J. M., Wolf, R. M., Caldwell, J. W., Kollman, P. A. \u0026amp; Case, D. A. Development and testing of a general amber force field. \u003cem\u003eJ. Comput. Chem.\u003c/em\u003e \u003cstrong\u003e25\u003c/strong\u003e, 1157\u0026ndash;1174 (2004).\u003c/li\u003e\n\u003cli\u003eEastman, P. \u003cem\u003eet al.\u003c/em\u003e OpenMM 4: A reusable, extensible, hardware independent library for high performance molecular simulation. \u003cem\u003eJ Chem Theory Comput\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, 461\u0026ndash;469 (2013).\u003c/li\u003e\n\u003cli\u003eScherer, M. K. \u003cem\u003eet al.\u003c/em\u003e PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. \u003cem\u003eJ Chem Theory Comput\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 5525\u0026ndash;5542 (2015).\u003c/li\u003e\n\u003cli\u003eK\u0026uuml;hne, T. D. \u003cem\u003eet al.\u003c/em\u003e CP2K: An electronic structure and molecular dynamics software package - Quickstep: Efficient and accurate electronic structure calculations. \u003cem\u003eJ. Chem. Phys.\u003c/em\u003e \u003cstrong\u003e152\u003c/strong\u003e, 194103 (2020).\u003c/li\u003e\n\u003cli\u003eTribello, G. A., Bonomi, M., Branduardi, D., Camilloni, C. \u0026amp; Bussi, G. PLUMED 2: New feathers for an old bird. \u003cem\u003eComp. Phys. 
Commun.\u003c/em\u003e \u003cstrong\u003e185\u003c/strong\u003e, 604\u0026ndash;613 (2014).\u003c/li\u003e\n\u003cli\u003eGoedecker, S., Teter, M. \u0026amp; Hutter, J. Separable dual-space Gaussian pseudopotentials. \u003cem\u003ePhys. Rev. B\u003c/em\u003e \u003cstrong\u003e54\u003c/strong\u003e, 1703\u0026ndash;1710 (1996).\u003c/li\u003e\n\u003cli\u003ePerdew, J. P., Burke, K. \u0026amp; Ernzerhof, M. Generalized gradient approximation made simple. \u003cem\u003ePhys. Rev. Lett.\u003c/em\u003e \u003cstrong\u003e77\u003c/strong\u003e, 3865\u0026ndash;3868 (1996).\u003c/li\u003e\n\u003cli\u003eGrimme, S., Antony, J., Ehrlich, S. \u0026amp; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. \u003cem\u003eJournal of Chemical Physics\u003c/em\u003e \u003cstrong\u003e132\u003c/strong\u003e, (2010).\u003c/li\u003e\n\u003cli\u003eBussi, G., Donadio, D. \u0026amp; Parrinello, M. Canonical sampling through velocity rescaling. \u003cem\u003eJ. Chem. Phys.\u003c/em\u003e \u003cstrong\u003e126\u003c/strong\u003e, 14101 (2007).\u003c/li\u003e\n\u003cli\u003eLaio, A. \u0026amp; Parrinello, M. Escaping free-energy minima. \u003cem\u003eProc. Natl. Acad. Sci. USA\u003c/em\u003e \u003cstrong\u003e99\u003c/strong\u003e, 12562\u0026ndash;12566 (2002).\u003c/li\u003e\n\u003cli\u003eEnsing, B., Laio, A., Parrinello, M. \u0026amp; Klein, M. L. A recipe for the computation of the free energy barrier and the lowest free energy path of concerted reactions. \u003cem\u003eJ Phys Chem B\u003c/em\u003e \u003cstrong\u003e109\u003c/strong\u003e, 6676\u0026ndash;6687 (2005).\u003c/li\u003e\n\u003cli\u003eNeese, F., Wennmohs, F., Becker, U. \u0026amp; Riplinger, C. The ORCA quantum chemistry program package. \u003cem\u003eJournal of Chemical Physics\u003c/em\u003e \u003cstrong\u003e152\u003c/strong\u003e, (2020).\u003c/li\u003e\n\u003cli\u003eMarenich, A. V., Cramer, C. J. \u0026amp; Truhlar, D. G. 
Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions. \u003cem\u003eJournal of Physical Chemistry B\u003c/em\u003e \u003cstrong\u003e113\u003c/strong\u003e, 6378\u0026ndash;6396 (2009).\u003c/li\u003e\n\u003cli\u003eMcGibbon, R. T. \u003cem\u003eet al.\u003c/em\u003e MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. \u003cem\u003eBiophys J\u003c/em\u003e \u003cstrong\u003e109\u003c/strong\u003e, 1528\u0026ndash;1532 (2015).\u003c/li\u003e\n\u003cli\u003eVincent Le Guilloux, P. S. and P. Tuffery. Fpocket: An open source platform for ligand pocket detection. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, (2009).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"nature-portfolio","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Nature Portfolio","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"ejp","reportingPortfolio":"","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-5591657/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5591657/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cem\u003eC\u003c/em\u003e-glycosides are valuable compounds 
containing hydrolytically stable C-C bonds. However, their scarcity in nature and their complex synthesis limit their availability. Enzymes represent an environmentally mild paradigm for the synthesis of \u003cem\u003eC\u003c/em\u003e-glycosides, but only few enzymes with \u003cem\u003eC\u003c/em\u003e-glycosylation activity are known and their catalytic mechanism remains unclear. In this work, we study the intricacies of a \u003cem\u003eC\u003c/em\u003e-glycosyltransferase using X-ray crystallography, biochemical assays, and atomistic simulations. We identify two dynamic gates that control substrate access and reactivity, and investigate the molecular mechanism of \u003cem\u003eC\u003c/em\u003e-glycosylation, identifying an S\u003csub\u003eE\u003c/sub\u003eAr stepwise process along a critical intermediate that stabilizes through a spontaneous water-mediated proton transfer. This stabilization is related to the chemical properties of the substrate, which dictate whether a compound can be \u003cem\u003eC\u003c/em\u003e-glycosylated. 
\u0026nbsp;Our results provide detailed knowledge and enhance our understanding of this class of enzymes, paving the way for their widespread utilization and engineering.\u003c/p\u003e","manuscriptTitle":"A spontaneous proton transfer is key for enzymatic C-glycosylation and restricts the scope of natural C-glycosides","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-12-16 10:08:37","doi":"10.21203/rs.3.rs-5591657/v1","editorialEvents":[],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"nature-communications","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"NCOMMS","sideBox":"Learn more about [Nature Communications](http://www.nature.com/ncomms/)","snPcode":"","submissionUrl":"https://mts-ncomms.nature.com/","title":"Nature Communications","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Nature Communications","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"9b690bd4-8fe3-4072-ae38-6cd220dba6df","owner":[],"postedDate":"December 16th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":41555430,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":41555431,"name":"Biological sciences/Biochemistry"}],"tags":[],"updatedAt":"2026-02-23T17:05:14+00:00","versionOfRecord":[],"versionCreatedAt":"2024-12-16 10:08:37","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-5591657","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5591657","identity":"rs-5591657","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
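The reactive-state definition in the computational methods above (criteria 1 and 2 for O-reactive, all three for C-reactive) can be sketched as a small classifier. This is only an illustration: the class and function names are hypothetical, the per-frame booleans are assumed to have been computed elsewhere from the geometric cutoffs in Supplementary Fig. 9 (not reproduced here), and evaluating criterion 1 separately for the C and O centers is an interpretation of the text rather than the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameCriteria:
    """Hypothetical per-frame outcomes of the three geometric criteria."""
    attack_c: bool          # criterion 1, evaluated for the acceptor C center
    attack_o: bool          # criterion 1, evaluated for the acceptor O center
    proton_transfer: bool   # criterion 2: hydroxyl-histidine hydrogen bond
    ring_orientation: bool  # criterion 3: reactive ring oriented toward glucose


def classify(frame: FrameCriteria) -> set:
    """Return the reactivity labels a frame satisfies (possibly none or both)."""
    labels = set()
    # O-reactive: criteria 1 (for the O center) and 2.
    if frame.attack_o and frame.proton_transfer:
        labels.add("O-reactive")
    # C-reactive: criteria 1 (for the C center), 2 and 3.
    if frame.attack_c and frame.proton_transfer and frame.ring_orientation:
        labels.add("C-reactive")
    return labels


if __name__ == "__main__":
    frame = FrameCriteria(attack_c=True, attack_o=False,
                          proton_transfer=True, ring_orientation=True)
    print(classify(frame))
```

Returning a set rather than a single label keeps the two definitions independent, so a frame that happens to satisfy both sets of conditions is not silently assigned to one class.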

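As a rough worked example of the metadynamics settings quoted in the methods, the sketch below counts how many Gaussians each simulation would deposit and evaluates a one-dimensional bias as a sum of Gaussians. It assumes that the reported "width" is the Gaussian standard deviation, that the 12 ps of the first step includes the initial 8 ps run at the 50 fs pace, and that a single collective variable is enough for illustration; the function names are hypothetical and this is not the CP2K/Plumed setup used in the paper.

```python
import math


def n_gaussians(duration_fs: int, pace_fs: int) -> int:
    """Number of Gaussians deposited when one is added every pace_fs."""
    return duration_fs // pace_fs


def bias(s: float, centers: list, height: float, width: float) -> float:
    """Metadynamics bias at CV value s: a sum of Gaussians of equal height.

    width is treated as the standard deviation (an assumption).
    """
    return sum(height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
               for c in centers)


# Step 1: 8 ps at a 50 fs pace, then the remaining 4 ps at a 100 fs pace.
step1 = n_gaussians(8000, 50) + n_gaussians(4000, 100)
# Steps 2 and 3: 11 ps and 32 ps, both at a 50 fs pace.
step2 = n_gaussians(11000, 50)
step3 = n_gaussians(32000, 50)

if __name__ == "__main__":
    print(step1, step2, step3)
    # A single 1 kcal/mol Gaussian of width 0.4 contributes its full height
    # at its own center and about 61% of it one width away.
    print(bias(1.0, [1.0], height=1.0, width=0.4))
    print(bias(1.4, [1.0], height=1.0, width=0.4))
```

Halving the height and doubling the pace late in the first step, as the methods describe, deposits bias more slowly near the barrier crossing, which is why fewer Gaussians are counted for the last 4 ps.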
Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2024) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00