Discovery of Electron Hole-hopping Redox Mutations in Myoglobin by Deep Mutational Learning

preprint OA: gold CC-BY-NC-ND-4.0

Abstract

In addition to storing molecular oxygen, myoglobin catalyzes peroxidase-like reactions involving high-valency iron(IV)-oxo species that support one-electron oxidations on a range of substrates at an open active site. In select metalloenzymes, long-range electron transfer can be mediated by hole-hopping pathways composed of aromatic residues that act as relay stations for oxidative equivalents. However, it remains unclear how sequence variations could introduce or alter such catalytic mechanisms in myoglobin. Here we used enzyme proximity sequencing (EP-Seq) to measure the peroxidase activity levels of >6,000 human myoglobin variants. The resulting fitness landscape reveals how aromatic substitutions, in particular surface-exposed tryptophans, can enhance peroxidase activity. Using protein language models in tandem with feedforward neural networks, we trained an accurate fitness predictor on the experimental dataset, and applied it to evaluate >4M double mutant variants. The predictions suggested a beneficial role for hole-hopping mutations in improving peroxidase activity. We experimentally tested 20 high-scoring variants in a yeast display assay, all of which outperformed wild type myoglobin. Three selected variants were also tested in soluble format and similarly showed improved performance. A focused combinatorial library yielded a top double tryptophan variant (Q92W/F107W) with 4.9-fold higher catalytic efficiency than wild type. These results show that hole-hopping pathways can be identified and engineered through deep mutational learning, with broad implications for biocatalyst and redox enzyme design.

bioRxiv preprint doi: https://doi.org/10.1101/2025.08.27.672588; this version posted August 31, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Introduction

Myoglobin is a small, globular heme protein found predominantly in the cardiac and skeletal muscle of vertebrates, comprising a single polypeptide chain folded into eight α-helices1. Best known for its role in oxygen storage and transport, myoglobin buffers oxygen levels during high metabolic demand.2 In addition to this classical function, myoglobin has more recently attracted interest for its catalytic activity3,4. This arises from its iron-containing protoporphyrin IX cofactor, which can reversibly cycle oxidation states to catalyze redox reactions. In the presence of hydrogen peroxide (H2O2), the iron center can be oxidized to form high-valent iron(IV)-oxo species referred to as Compound I/II. These reactive intermediates can catalyze one-electron oxidations on a broad range of organic substrates, facilitated by an open and accessible active site geometry.

The peroxidase activity of myoglobin is normally suppressed in myocytes due to the reducing environment of the cell. However, there is evidence that this activity can occur in vivo under pathological conditions5–7. For example, in rhabdomyolysis or upon reperfusion of ischemic tissue, increased levels of reactive oxidative species (ROS) create conditions that support heme-mediated oxidation.8–10 Myoglobin peroxidase activity may contribute to ROS detoxification in these settings, but it can also lead to oxidative damage9,11. When endogenous antioxidants like ascorbate or glutathione are depleted, myoglobin can oxidize lipids and damage proteins and DNA.

Recognizing this expanded catalytic potential, researchers have sought to engineer myoglobin for a range of applications, including dye decolorization and antibiotic degradation.12–16 Multiple studies point to tyrosine and tryptophan substitutions as important for engineering electron transfer pathways in peroxidases. These aromatic residues either directly enhance peroxidase activity, or increase cofactor reduction rates by electron donors.
Their evolutionary appearance has been linked to the oxygenation of Earth’s atmosphere, suggesting an adaptive response to oxidative stress17–20. Prior work by Gray and Winkler has emphasized both the catalytic and protective roles of such residues in natural enzymes, showing that they can extend redox activity beyond the active site and help shuttle oxidative equivalents through protein scaffolds.21–23 Introducing these residues into a stable single-domain protein like myoglobin offers a powerful platform for dissecting protein-based radical chemistry and guiding rational enzyme design.

To systematically understand how mutations could influence the peroxidase-like activity of myoglobin, we sought out a high-throughput method that could provide suitable data for variant discovery with machine learning (ML). Deep mutational scanning (DMS) is one such approach that enables massively parallel analysis of protein sequence-function pairings.24–26 However, building high-throughput platforms to assay enzymatic activity remains challenging.27 Linking genotype to enzymatic phenotype requires the ability to compartmentalize enzymatic reactions, as well as to distinguish signal generated by improved catalytic properties from signal attributable to higher enzyme abundance (i.e. expression level). Although droplet-based systems with colorimetric readouts28–30 and survival-based selections31–33 can be used to enrich functional variants, these approaches suffer from infrastructure and biochemical limitations, and frequently confound improved enzyme activity with increased protein expression levels.

To address this challenge, we recently developed enzyme proximity sequencing (EP-Seq), a high-throughput method based on yeast display that enables parallel assessment of both protein expression levels and enzymatic activity (Figure 1)26,34,35. This platform uses phenoxy radical-based labelling chemistry to couple enzymatic activity to yeast cell fluorescence. Highly active variants can then be separated from inactive variants by fluorescence-activated cell sorting (FACS). In prior work, this strategy relied on exogenous horseradish peroxidase for radical labeling; in the present study, however, we used the intrinsic peroxidase activity of surface-displayed myoglobin34.

Leveraging our high-throughput dataset, we then applied machine learning to model the mutational fitness landscape of human myoglobin and predict the peroxidase activity of all double mutants. Recent advances have established ML as a powerful tool for protein variant prediction37. The fusion of deep mutational scanning with machine learning, termed deep mutational learning (DML), offers an efficient method to gain a deeper understanding of sequence-function relationships38,39. Unlike many ML approaches that yield opaque predictions, our DML framework facilitated interpretable hypothesis generation by recommending substitutions with residues that enable hole hopping. This connection, learned from the training data, illustrates how integrating DMS with ML can yield mechanistic insights rather than serving solely as an uninterpretable prediction tool. This analysis led to the identification of highly active variants, which were validated experimentally. Importantly, we further demonstrated that activity trends observed in the yeast-display format were recapitulated in the corresponding soluble enzyme versions, underscoring the generalizability of our approach.

Figure 1) Experimental scheme.
A barcoded library is transformed into the S. cerevisiae strain EBY100, and the encoded protein variants are displayed on the cell surface. After immunostaining and subsequent reaction with the tyramide Alexa Fluor 594 substrate, which labels the cells, the library is sorted into four bins by fluorescence-activated cell sorting (FACS) based on the expression-normalized activity level. The distribution of variants among the bins is evaluated by next-generation sequencing, transformed into fitness scores, and compared to the score of WT.
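The final step of this scheme, turning per-bin read counts into a WT-referenced score, can be sketched as follows. This is a minimal illustration and not the authors' scoring pipeline, which is described in their Methods: the weighted-mean-bin formula, the function names `mean_bin` and `fitness_score`, and the example counts are all assumptions made for illustration.

```python
def mean_bin(counts):
    """Weighted mean bin index (bins numbered 1..4, low to high activity)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("variant has no reads in any bin")
    return sum(i * c for i, c in enumerate(counts, start=1)) / total


def fitness_score(variant_counts, wt_counts):
    """Hypothetical score: variant mean bin minus WT mean bin, so WT scores 0."""
    return mean_bin(variant_counts) - mean_bin(wt_counts)


wt = [10, 40, 40, 10]        # made-up read counts across the four sort bins
enriched = [5, 15, 40, 40]   # made-up variant shifted toward high-activity bins

print(fitness_score(wt, wt))            # 0.0 by construction
print(fitness_score(enriched, wt) > 0)  # True
```

Referencing every variant to WT in this way is what makes a score of 0 mean "WT-like", with positive values indicating enrichment in the higher-activity bins.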

Results

System setup, library construction and barcoding

We first optimized the functional display of WT hMb fused to the C-terminus of the Aga2 yeast anchor protein. We detected protein display by flow cytometry after staining the C-terminal His6-tag with a primary antibody and an Alexa Fluor 488-conjugated secondary antibody25. We next validated selective tyramide-based cell labeling for expressed, catalytically active variants, using negative controls that either omitted H2O2 or used cells transformed with empty cassette plasmids (Figure S1). To generate the site-saturated library of the coding region of hMb, we employed primers with nested NNK codons and barcoded each variant with a unique 15-nucleotide barcode (see Methods)40. After sequencing the sorted bins, we mapped the reads via a lookup table to their corresponding variants, and converted the distribution of the variants among the bins to activity fitness scores. After applying confidence filters to remove variants with low sorting or sequencing coverage, we obtained a dataset of expression-normalized activity scores for 6,115 variants bearing single and multiple mutations.

Deep mutational scanning elucidates stability-activity trade-offs

The results of the deep mutational scanning (DMS) experiment to quantify peroxidase activity were computationally filtered to consider only the single-site mutants (Figure 2). To assess reproducibility, we calculated activity scores for two biological replicates of the library, and observed a Pearson’s r of 0.85 (n = 2,661; p < 0.0001), indicating strong agreement between replicates (Figure 2A). Data points with higher cell coverage are shown in darker shades, and reflect the improved correlation for variants observed more frequently in the dataset. We categorized the variants based on mutation type, and visualized the fitness scores in a histogram (Figure 2B). By definition, the WT sequence is assigned a score of 0.
Synonymous mutations encoding the same amino acid sequence cluster at 0.00 ± 0.06 (n = 102, 3.83% of the single mutants). Nonsense mutations encoding stop codons (n = 123, 4.62%) exhibit the lowest activity scores of -0.43 ± 0.03, consistent with truncation of the protein chain. The largest and most informative group consists of the missense mutations (n = 2,436; 91.54%), whose activity scores range from -0.47 to 0.3. As observed in other mutational scans, most amino acid substitutions are deleterious26,41,42. In our dataset, ~88% of missense mutations reduce peroxidase activity. This fraction is higher than the ~67% of missense mutations that decrease myoglobin expression levels25. This higher sensitivity to mutation likely indicates that enzymatic activity imposes stricter constraints than folding alone, making the peroxidase phenotype more vulnerable to mutational disruption.

To validate the results of the DMS experiment, we conducted control assays for individual variants. We selected random sequences as well as variants with high activity scores in order to cover the full range of activity. Each variant was expressed individually on the yeast surface, and peroxidase activity was measured using the same tyramide proximity labeling protocol that was applied in the pooled screen. As in the en masse experiment, we calculated activity scores by quantifying expression-normalized fluorescence shifts (see Methods). The resulting values showed strong agreement with the DMS dataset (r = 0.96, n = 20, p < 0.0001; Figure 2C).
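The class fractions quoted above can be re-derived from the reported counts. The short check below uses only numbers stated in the text; the dictionary keys are labels chosen here for readability.

```python
# Single-mutant counts by mutation class, as reported in the text.
counts = {"synonymous": 102, "nonsense": 123, "missense": 2436}

# The total matches n = 2,661 reported for the replicate correlation.
total = sum(counts.values())

# Percentage of single mutants in each class, rounded to two decimals.
fractions = {k: round(100 * v / total, 2) for k, v in counts.items()}

print(total)      # 2661
print(fractions)  # {'synonymous': 3.83, 'nonsense': 4.62, 'missense': 91.54}
```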
Figure 2D presents a heatmap of activity scores for all single-point mutants, where activity is normalized by expression level. Variants with reduced folding stability compared to wild type are shown with a black border. We estimated this stability threshold based on the distribution of synonymous mutants and defined destabilizing variants as those with expression scores < -0.06 (see supplementary info and ref 25). Variants without a border are either stably expressed or were not present in the expression dataset.

Figure 2) Deep mutational scanning of myoglobin peroxidase activity. A) Correlation of activity scores between two biological replicates. Color shading indicates total cell count per variant as indicated in the color bar. B) Distribution of activity scores grouped by mutation type. Missense mutations are shown in yellow, synonymous mutations in teal, and nonsense mutations in blue. Dashed lines indicate ±1 standard deviation from the mean of the synonymous codon scores. C) Validation of the DMS fitness scores using individually expressed monogenic variants. D) Heatmap of expression-normalized peroxidase activity scores for all single mutants. Amino acids are grouped by chemical class. Black borders around the squares indicate variants previously shown to have reduced stability based on a separate expression DMS screen.25

In addition to the destabilizing mutations discussed earlier (Figure 2D, shown in red with black borders), we also identified variants that are stably expressed on the yeast cell surface but lose their catalytic activity. Examples include mutations at critical positions such as the proximal and distal heme-coordinating histidines (His94 and His65, respectively). Other functionally sensitive sites include residues within helix F, a region that is known to support heme binding but is not required for correct folding of the apoprotein. Helix F is structurally flexible and disrupted in apo-myoglobin43, which makes it more tolerant to mutation compared to other helices. However, our data show that mutations in this region can severely impair peroxidase activity. This points to an activity-stability tradeoff, and underscores the importance of heme positioning and axial coordination in maintaining catalytic function, even in mutants that fold and express efficiently.

Identification of novel and highly active variants with machine learning

We next sought to discover myoglobin variants with improved peroxidase activity by combining deep mutational data with machine learning. Using the activity DMS dataset, we trained a supervised regression model to predict peroxidase activity from protein sequences encoded using pre-trained protein language models (Figure 3A). We evaluated both ESM and ProtTrans embeddings given their strong performance across a variety of protein prediction tasks44,45. The supervised regression layer was based on a deep feedforward neural network trained on the DMS activity scores using the protein embeddings as input features. To increase the confidence of model predictions, the training data excluded variants with large variation in DMS scores across replicates (Figure 2A and Methods). We also excluded higher-order mutants (4x, 5x) that were poorly covered in the original DMS screen.
This filtering step led to a training set with N=4,769 myoglobin variants (Figure 3B). After hyperparameter optimization, both ESM and ProtTrans models performed well on held-out test data, and both outperformed models based on one-hot amino acid encoding (Figure 3C, left). We next queried both models with a set of N=20 variants for which we had independently measured monogenic activity scores (Table S6; Methods). Comparison of predicted and measured values indicated that ESM embeddings provided better out-of-distribution accuracy. Based on this result, we selected the ESM embeddings for the final computational screen (Figure 3C, right).

Given that the training data was highly enriched in double mutants (46.8%) and included only a small fraction of triple mutants (8.7%), we restricted the prediction screen to double mutants that were not present in the training set. This increased the reliability of model predictions. Moreover, a two-dimensional projection of both the training and query sequences showed good overlap, which further supports the robustness of the predictions despite the relatively limited coverage of the training data (Figure 3D). We embedded a total of N=4,250,505 double mutants using ESM and queried an ensemble of 20 regressors, each trained using five-fold cross-validation and four random seeds for weight initialization. The predicted activity scores for these unseen double mutants followed a distribution similar to that of the training data (Figure 3E).

For experimental validation, we selected 20 from a set of 65 consensus hits that scored in the top 0.2% (N=10,000 variants) across the model ensemble (Table S7). All 20 of these tested variants showed improved peroxidase activity over the wild type. This perfect success rate demonstrates the effectiveness of our machine learning-guided approach for identifying highly active myoglobin variants.
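The ensemble-and-consensus selection described above can be illustrated with a small sketch. It assumes that a "consensus hit" is a variant ranked in the top fraction by every model in the ensemble; the text does not spell out the exact rule, so this intersection criterion, the helper names `top_fraction` and `consensus_hits`, and the toy predictions are all hypothetical.

```python
import random
from itertools import product

FOLDS, SEEDS = 5, 4
# One regressor per (fold, seed) pair gives the ensemble of 20 models.
models = list(product(range(FOLDS), range(SEEDS)))


def top_fraction(scores, frac):
    """Return the set of variant ids whose score is in the top `frac`."""
    k = max(1, int(len(scores) * frac))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def consensus_hits(per_model_scores, frac):
    """Variants in the top `frac` for every model (assumed consensus rule)."""
    tops = [top_fraction(s, frac) for s in per_model_scores]
    return set.intersection(*tops)


# Toy stand-in for model predictions: a shared signal plus per-model noise.
rng = random.Random(0)
signal = {f"v{i}": rng.gauss(0, 1) for i in range(1000)}
per_model = []
for idx, _ in enumerate(models):
    noise = random.Random(idx + 1)
    per_model.append({v: s + noise.gauss(0, 0.1) for v, s in signal.items()})

hits = consensus_hits(per_model, frac=0.05)
# Mean prediction across the ensemble, analogous to the consensus fitness.
mean_pred = {v: sum(m[v] for m in per_model) / len(per_model) for v in signal}
print(len(models), len(hits))
```

Requiring agreement across all models is a conservative choice: it shrinks the candidate list but favors variants whose high rank does not depend on a particular data split or weight initialization.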
Figure 3) Machine learning workflow, model performance, and experimental validation of predicted high-activity double mutants. (A) Overview of our sequence-to-function prediction pipeline. Phase 1: Activity DMS data were used to train a feedforward neural network (multilayer perceptron) regressor using ESM-3 embeddings as input features. Phase 2: An in silico library of ~4.25 million double mutants was embedded using ESM-3 and scored using an ensemble of 20 models. (B) Characteristics of the training data (N = 4,769 unique variants) after filtering out low confidence variants and higher order mutants. The histogram shows the distribution of experimental DMS scores. The pie chart indicates the proportion of single, double, and triple mutants in the training data. (C) Evaluation of model performance. Left: predictive performance on a held-out test set of two MLP models trained on ESM-3 and ProtTrans embeddings, and a 1D convolutional neural network model trained on amino acid one-hot encodings. Bars are mean R² scores between predicted and ground truth fitness scores, computed across predictions from models trained in 5-fold cross-validation; error bars denote one standard deviation across folds.
Right: Comparison of model predictions and experimentally measured monogenic activity scores for 20 variants not used in training (Table S6). (D) Two-dimensional UMAP projection of ESM-3 embeddings of training myoglobin variants and the double mutants screened with the pretrained model. A random sample of 50,000 variants from the full screening library is shown in orange, highlighting extensive overlap in coverage between both libraries. (E) Distribution of consensus predicted fitness for the ~4.25 million double mutants. The histogram shows the mean predicted fitness scores across all 20 trained ESM-3 MLP model instances. (F) Experimental validation of 20 high-confidence candidate variants selected from the top 0.2% of predicted variants. All tested variants showed activity above wild type.

Analysis of best performing ML predicted sequences as soluble enzymes

After showing that the top machine learning-predicted variants all exhibited higher peroxidase activity than WT myoglobin when displayed on the yeast surface, we further characterized the top three candidates as purified soluble enzymes. These variants, referred to as Var4, Var9, and Var14, were selected based on their top-ranked monogenic activity scores and were expressed in E. coli along with WT myoglobin. The mutational compositions of these variants are shown in Figure 4A, and their positions are mapped onto the 3D structure in Figure 4D. Mutations were modeled using AlphaFold and visualized in PyMOL as sticks. Using reducing and non-reducing SDS-PAGE of the purified protein, we verified that the R32C mutation, known from our prior study25, forms a disulfide bond in the ML-predicted double mutant (Figure 4B). The non-reduced form of the protein migrates faster due to the intramolecular disulfide with C111, while addition of β-mercaptoethanol eliminates this change in electrophoretic mobility.
To test peroxidase activity for the soluble variants, we adapted the tyramide labeling assay to a soluble format. Employing the same reaction mixture as with yeast-displayed proteins, we stained uninduced yeast cells carrying an empty cassette plasmid by administering soluble hMb variant enzymes. The endpoint fluorescence data (Figure 4C) confirmed that all three double mutants produced higher signal than wild type, reproducing the high activity ranking observed in the yeast-displayed format. Negative controls lacking any enzyme showed minimal background fluorescence.

Figure 4) Soluble validation of ML predicted variants. A) Selected variants and featured mutations. B) 15% SDS-PAGE gel of purified soluble variants. Samples were boiled under non-reducing (-) and reducing (+) conditions. C) Endpoint MFI of uninduced cells stained by purified double mutants with tyramide AF594. Error bars show the standard deviation of triplicates; the negative control contains no myoglobin in the reaction mix. D) Structure showing positions of double mutants, according to the color code introduced before. Position Q92, which is mutated in multiple variants, is shown here only as Trp and in blue. E) Michaelis-Menten analysis of soluble enzymes with tyramide. Uninduced yeast cells carrying an empty cassette were stained at different substrate concentrations, and endpoint fluorescence was assayed at different time points in order to obtain reaction velocities for the variants along with WT.

To gain a deeper understanding of the kinetics of these improved variants, we performed a Michaelis-Menten analysis using the same endpoint-based tyramide labeling assay.
Reaction velocities were measured at varying substrate concentrations by stopping the reaction at different timepoints and fitting linear regressions to the endpoint mean fluorescence intensities (MFIs) (see Methods). Figure 4E shows the resulting velocity curves for WT and the three selected double mutants. Due to the high cost of tyramide, we were unable to reach substrate saturation. Therefore, a linear approximation of the Michaelis-Menten model was used in the low-substrate regime to estimate catalytic efficiency. All three double mutants showed slopes more than three times steeper than WT, indicating higher catalytic efficiency at the concentrations tested, including the standard library screening concentration of 1.2 µM.

In all three of these improved variants (Var4, Var9, and Var14), residue Q92 is mutated to either tyrosine or tryptophan, suggesting that the introduction of a redox-active residue at this surface site contributes to enhanced activity. The R32C mutation found in Var4 likely serves a stabilizing role that supports acquisition of secondary activity-enhancing mutations. This stabilizing effect of R32C was presumably the reason it was frequently found among top ML-ranked double variants. To further explore the catalytic behavior of the selected variants, we expanded our kinetic analysis to include alternative peroxidase substrates, specifically reactive blue 19 and guaiacol. The results are presented in Figure S2 and discussed in the Supplementary Note SN1.
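The low-substrate approximation used for the tyramide kinetics follows directly from the Michaelis-Menten rate law. The derivation below is the textbook form; the symbols (v for velocity, [E]_0 for total enzyme, k_cat and K_M for the kinetic constants) are standard notation rather than notation taken from the manuscript, and the fluorescence-based velocities reported here are proportional to, not equal to, molar rates.

```latex
v = \frac{k_\mathrm{cat}\,[E]_0\,[S]}{K_M + [S]}
\quad\xrightarrow{\;[S] \ll K_M\;}\quad
v \approx \frac{k_\mathrm{cat}}{K_M}\,[E]_0\,[S]
\qquad\Longrightarrow\qquad
\frac{\mathrm{slope}_\mathrm{variant}}{\mathrm{slope}_\mathrm{WT}}
= \frac{(k_\mathrm{cat}/K_M)_\mathrm{variant}}{(k_\mathrm{cat}/K_M)_\mathrm{WT}}
\quad \text{at equal } [E]_0 .
```

Under these assumptions, the ratio of the fitted slopes of velocity against substrate concentration estimates the relative catalytic efficiency without requiring substrate saturation.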
Strategically placed Trp wires boost activity towards bulky substrates

Among the machine learning-predicted variants discussed above, we observed a clear enrichment of mutations that introduce tyrosine or tryptophan. Beyond the frequently occurring R32C disulfide variant25 found in many of the top double mutants (Table S7), four of the five most active candidates contain either a tyrosine at position 71 or a tyrosine or tryptophan at position 92. This trend is also observable across the whole DMS dataset, where single-point variants containing these substitutions often show higher peroxidase activity. This can be seen in the two rightmost columns of the heatmap in Figure 2D. Conversely, mutations that remove native tyrosine residues tend to show very deleterious effects on activity (Figure S3). For the two native tryptophan residues, we cannot estimate their role in activity since both are essential for stability and do not tolerate substitution.

When comparing the effects of mutations on protein stability and catalytic activity, we find that aromatic amino acids are particularly important for maintaining function. In many cases, mutations that are tolerated in the expression screen exhibit reduced activity when an aromatic side chain is removed. Wild type tyrosine, histidine and phenylalanine residues tolerate some substitutions, but these almost always come at the expense of catalytic efficiency. These findings support a broader role for redox-active aromatic residues in modulating myoglobin peroxidase activity.

It has been reported that many oxidoreductases possess clusters or chains of tyrosine and/or tryptophan residues that serve as hole-hopping relay stations, providing alternative electron transfer pathways to mitigate oxidative stress and preserve function.21,46 Similarly, dye-decoloring peroxidases use surface-exposed aromatic residues as stepping stones for oxidation of bulky substrates that cannot pass through the heme access tunnel.47,48 Such work has inspired others to use surface tryptophans and tyrosines to enhance dye-decoloring peroxidative activity in myoglobins.12,13 Consistent with this rationale, our data show increased peroxidase activity for variants introducing tyrosine or tryptophan at surface-accessible positions. In many cases, these substitutions are interchangeable, supporting the idea that their redox potential, rather than a specific side-chain orientation, is beneficial to catalysis (e.g., positions 71, 92, 137, 146 or 152).

We visualized this trend in Figure 5A by mapping all single tryptophan substitution scores onto the protein surface. Variants that improve activity are shown in blue, and deleterious substitutions are shown in red. Positions where tryptophan residues increase the peroxidase activity are located around the heme active site on the surface of the protein. This spatial arrangement suggests that these engineered residues facilitate electron transfer for substrates like tyramide, which are too bulky to access the heme directly. We note that the native tyrosine and tryptophan residues are not surface exposed (SASA W8: 4, W15: 2, Y103: 18, Y146: 0, where 0 is buried and 100 equals fully exposed). These observations support the idea that enhanced labeling efficiency arises from the introduction of redox relays close to the substrate interface.

To examine whether the activity-enhancing effects of individual tryptophan mutations could be combined synergistically, we designed a small combinatorial library targeting the four tryptophan substitutions with the highest fitness scores in the single-mutant DMS dataset, namely A72W, Q92W, K97W and F107W. Notably, A72W, which yields one of the most active variants in the library, was the only substitution at position 72 that retained or improved peroxidase activity, while all others were neutral or deleterious. Furthermore, since most mutations at A72 were neither strongly destabilizing nor beneficial in terms of expression stability25, this position seems to be functionally specific to peroxidase activity. The selected variants were modeled using AlphaFold and visualized in PyMOL49 (Figure 5B). Residues were color coded according to the legend in Figure 5C, where native tyrosine and tryptophan residues are shown in cyan.

We synthesized these combinatorial variants using site-directed mutagenesis and tested their peroxidase activity in the yeast surface display system (Figure 5C) using the same WT-normalized scoring method as described previously (see Figure S4). Consistent with the prior DMS data and ML predictions, all single Trp variants outperformed WT myoglobin (Figure 5C). The combinatorial mutants also exhibited improved activity relative to WT, although the effects were more variable. While most combinations did not show improvement upon introduction of additional Trp residues, the impact of F107W depended on which Trp mutation it was paired with. In combination with A72W, activity declined, whereas the Q92W/F107W double mutant was among the very best variants found in this study.
In addition to tyrosine residues, tryptophans possess electron-rich aromatic side chains that can under certain conditions undergo oxidative modification by peroxidase-generated radicals, especially under high peroxide or high substrate conditions50. To ensure that the improved labeling signal observed in our TRP-substituted mutants was not due to the introduction of these additional labeling sites, we performed the decoupled labeling assay with soluble myoglobin variants as described above. We purified WT, Q92W, F107W and Q92W/F107W variants as soluble proteins (Figure 5D) and used them to stain uninduced yeast cells containing only an empty cassette plasmid under otherwise identical reaction conditions. This decoupled labelling assay eliminated any artefacts due to the display format, and allowed direct attribution of labeling signal to peroxidase activity. The same performance trend (Figure 5E) observed in the display assay was found in this decoupled format, confirming that the enhanced signal was the result of increased enzymatic rate rather than TRP -modification on myoglobin itself. As presented in Figure for the ML predicted variants, we quantified reaction velocities at different tyramide concentrations as well for the tryptophan mutants. As shown in Figure 5F, the double mutant Q92W/F107W exhibits a 4.9-fold increase in reaction velocity relative to the wild type at the substrate concentration used in the DMS assay. Interestingly, although the single mutants Q92W and F107W each individually enhance reaction ve locity by factors of 3.9 and 2.4, respectively, their combined effect in the double mutant was not fully additive, indicating some negative epistatic effects. .CC-BY-NC-ND 4.0 International licensemade available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is The copyright holder for this preprintthis version posted August 31, 2025. 
Figure 5) Tryptophan-mediated enhancement of peroxidase activity. A) Myoglobin 3D structure with residues colored by single mutant fitness scores for TRP substitutions. Positions without measurements are shown in white as cartoon loops. The heme cofactor is shown in white. B) Residues selected for the combinatorial Trp library are shown as sticks and colored according to the legend in panel C. C) Bar plot of monogenic activity scores for individual TRP mutations and their combinations. D) SDS-PAGE gel analysis of purified protein variants. E) Decoupled cell labeling assay. Endpoint MFI values are shown for uninduced yeast cells containing an empty cassette plasmid, stained with soluble enzymes and the tyramide fluorophore. F) Reaction velocities for soluble enzyme variants measured using a time-dependent tyramide assay. Data represent the linear region used to approximate catalytic efficiency.
We further attempted to explain the observed trends for the mutants by analyzing potential electron transfer pathways with the published tools eMap and EHPath. eMap is a Python-based web application that predicts possible electron or hole transfer channels from PDB files based on graph theory. Shortest-path algorithms are used to estimate the shortest pathways from a user-specified hole donor (here heme) to the surface of the protein, assigning scores to all pathways51. EHPath is another Python module that estimates and ranks the mean residence times of a transferring charge along such hopping pathways 52. The settings used are described in the method section, and the results are presented in Figure S5 and discussed in supplementary notes SN2. We were interested in studying the additivity of the double mutant Q92W/F107W in more detail and hence used purified variants to test the activity in Michaelis-Menten analysis.
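To make the graph-based pathway search described for eMap more concrete, the following minimal sketch runs Dijkstra's algorithm from a heme node to a surface node. It is not eMap's implementation and does not reproduce its edge scoring; the node names (`aro1`, `aro2`) and all edge weights are hypothetical placeholders rather than distances from any myoglobin structure.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (total_cost, [nodes]) or (inf, [])."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    # Walk back from the target to the source to recover the route.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical undirected graph: nodes stand for the heme, two aromatic side
# chains and the protein surface. Weights are placeholder hop costs.
edges = [
    ("HEME", "aro1", 2.0),
    ("HEME", "aro2", 5.0),
    ("aro1", "aro2", 1.5),
    ("aro2", "SURFACE", 2.0),
    ("aro1", "SURFACE", 6.0),
]
graph = {}
for a, b, w in edges:
    graph.setdefault(a, {})[b] = w
    graph.setdefault(b, {})[a] = w

cost, path = shortest_path(graph, "HEME", "SURFACE")
print(cost, path)
```

In this toy graph the cheapest route passes through both aromatic nodes rather than taking either direct hop. This mirrors the idea that a newly introduced tryptophan can shorten the overall path from the heme to the surface.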
As with the machine learning predicted variants above, we switched to reactive blue 19 (Rb19) as substrate for this experiment. The results, including the fits and extracted kinetic parameters, are shown in Figure S7. The trend observed in the monogenic tyramide scores and in the decoupled cell labeling assay also holds for the Rb19 dye, with both single mutants being more active than WT and the double mutant benefiting from additivity of its constituent mutations. Next, we assayed how the mutations would act on peroxidase activity towards smaller substrates that might fit in the active site crevice. We performed a Michaelis-Menten assay using guaiacol for the selected variants along with WT myoglobin and found that the activity towards this smaller substrate is in fact similar for all variants tested (Figure 4E). We used molecular docking to further validate our hypothesis, which differentiates between bulkier substrates that benefit from a surface radical and smaller ones that do not. We find that for WT, Q92W and F107W the binding site for guaiacol is identical and directly adjacent to the heme cofactor (Figure S7F). In contrast to the machine learning variants, here we study single mutations, allowing observed effects to be attributed directly. Considering that we found improvement towards guaiacol for the machine learning predicted variant 9, which contains the R140I and Q92Y mutations, we suspect that the R140I mutation in particular leads to enhanced guaiacol reactivity, while the tyrosine mutation, like the tryptophan substitutions, is beneficial for oxidation of the bulky tyramide substrate.
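For readers unfamiliar with how kinetic parameters are extracted from such assays, the sketch below fits the Michaelis-Menten equation using the Hanes-Woolf linearization. This is a swapped-in textbook method chosen to keep the example dependency-free; the fitting procedure actually used for Figure S7 is not described in this section. The data are synthetic and noise-free, generated from arbitrary parameters, and are not values from the paper.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) from the linear form [S]/v = [S]/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data from arbitrary parameters (illustration only).
TRUE_VMAX, TRUE_KM = 2.0, 0.5
conc = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in conc]

vmax, km = fit_hanes_woolf(conc, rates)
print(round(vmax, 6), round(km, 6))

# At [S] << Km the curve is nearly linear with slope Vmax/Km, which is
# proportional to the catalytic efficiency kcat/Km.
low_s = 0.001
print(michaelis_menten(low_s, vmax, km) / low_s, vmax / km)
```

The last two printed values are close to each other, which is the reasoning behind using the linear low-substrate region as a proxy for catalytic efficiency. With real, noisy data a nonlinear least-squares fit is usually preferred over linearized forms.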
Lastly, we cross-referenced our deep mutational scanning data with variants reported in gnomAD to identify mutations in myoglobin observed in human clinical populations. We annotated these clinically observed variants with their stability and activity scores derived from our screening assays (Table S8). Although our peroxidase screening assay used a non-physiological substrate, these annotations may still provide useful insights into naturally occurring variants. Notably, several clinically observed variants substitute a wild-type residue with tyrosine or tryptophan, which may enhance oxidation of bulky substrates.
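The cross-referencing step can be pictured as a simple lookup that joins a list of clinically observed variants to the DMS score table and flags substitutions to tyrosine or tryptophan. The sketch below is only an illustration of that idea. The variant labels and scores are invented placeholders, not entries from gnomAD or Table S8, and the helper names are hypothetical.

```python
import re

VARIANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")
AROMATIC_REDOX = {"Y", "W"}


def parse_variant(label):
    """Split a label such as 'Q92W' into (wild_type, position, mutant)."""
    match = VARIANT_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised variant label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def annotate(clinical_variants, dms_scores):
    """Attach DMS scores to clinical variants and flag gains of Tyr or Trp."""
    rows = []
    for label in clinical_variants:
        _, _, mutant = parse_variant(label)
        rows.append({
            "variant": label,
            "scores": dms_scores.get(label),  # None if not measured
            "gains_tyr_or_trp": mutant in AROMATIC_REDOX,
        })
    return rows


# Placeholder inputs: labels and numbers are invented for illustration only.
clinical = ["A10Y", "G20S", "L30W"]
scores = {
    "A10Y": {"activity": 0.4, "stability": -0.1},
    "G20S": {"activity": -0.2, "stability": 0.0},
}

rows = annotate(clinical, scores)
for row in rows:
    print(row)
```

Variants absent from the score table are kept with `scores` set to `None`, so missing measurements stay visible rather than being silently dropped.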

Discussion

In this study we provide a comprehensive map of the peroxidase activity fitness landscape of human myoglobin, utilizing the high-throughput EP-Seq platform. We highlight regions of activity-stability trade-offs, and show global trends for amino acid groups as well as single mutations with enhanced activity. By leveraging the extensive labeled mutant library, we integrate high-throughput DMS with ML to train a predictive model and identify novel double mutants with elevated peroxidase activity. Notably, all 20 ML-predicted sequences were substantially more active than wild type (WT). The most promising sequences were expressed as soluble proteins to assess whether the improved activities observed on the yeast cell surface translated to the soluble format. When tested at the same substrate concentration used in the initial library screening, these variants exhibited over threefold higher catalytic efficiency compared to WT. Analysis of the machine learning dataset pointed toward the introduction of oxidizable amino acids such as tyrosine and tryptophan as a key driver of enhanced activity with the bulky tyramide substrate. We validated this hypothesis by constructing a small focused combinatorial library based on top-performing tryptophan mutations. Among these, tryptophan substitutions at surface-exposed positions such as Q92 and A72 substantially boosted activity, supporting the hypothesis that these residues serve as redox-active relay stations and facilitate long-range electron transfer. In particular, the combination of mutations F107W and Q92W increased tyramide oxidation nearly fivefold in assays with soluble proteins.
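The scale of the double-mutant search space, which the abstract puts at more than 4M variants, can be reproduced with a short count. The sketch assumes 153 mutable positions and 19 alternative amino acids at each; the number of positions is an assumption, since this section does not state which residues were included.

```python
from math import comb

N_AMINO_ACIDS = 20


def count_double_mutants(length):
    """Choose 2 distinct positions, then 19 substitutions at each of them."""
    return comb(length, 2) * (N_AMINO_ACIDS - 1) ** 2


# Assumed number of mutable positions; not stated in this section.
ASSUMED_LENGTH = 153
n_double = count_double_mutants(ASSUMED_LENGTH)
print(n_double)
```

Under this assumption the count is about 4.2 million, consistent in magnitude with the figure quoted in the abstract. A space of this size is impractical to synthesize and screen exhaustively, which is why a trained predictor is used to rank candidates first.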
These findings demonstrate that activity fitness scores derived from the EP-Seq deep mutational scanning platform reliably predict activity trends in soluble enzymes, providing a key validation of this platform. Beyond methodological relevance, the insights reported here may find broad application, ranging from engineering bulky-dye-decolorizing peroxidases for industrial use to providing knowledge about mutations that alter the peroxidase activity of globins, including hemoglobin-based oxygen carriers (HBOCs)53. Surface-exposed tyrosine residues have already been shown to accelerate reduction by physiological reductants such as ascorbate in hemoglobin54. The strategy developed here could guide future engineering efforts to identify mutations that tune redox activity for safe HBOCs. We also establish a combined approach of high-throughput screening and machine learning to expedite enzyme engineering using yeast surface-displayed libraries, with results directly transferable to soluble enzymes. Finally, this work underscores the utility of combining high-throughput experimental fitness landscapes with pretrained protein language models to drive hypothesis generation, prioritize variants, and ultimately expand the functional repertoire of enzymes. The ability to use ML predictions to guide mutational searches across vast sequence space represents a generalizable framework for accelerating biocatalyst development.

Author contributions

C.K. and M.A.N. conceived the study and drafted the manuscript. C.K. carried out the practical work and computational analyses. A.D. carried out the machine learning and variant prediction. R.V. contributed to the conceptualization and optimization of the EP-Seq experimental and computational workflow. D.A.O. designed the machine learning work. M.A.N. secured funding and administered the project.

Acknowledgements

This work was supported by the University of Basel, ETH Zurich, and the Swiss National Science Foundation (200021_191962). AD and DAO were supported by a UKRI Engineering Biology Mission Award CYBER under BBSRC grant BB/Y007638/1.

Competing interests

The authors have no conflicts of interest to disclose.

References

1. Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC (1958) A Three-Dimensional Model of the Myoglobin Molecule Obtained by X-Ray Analysis. Nature [Internet] 181:662–666. Available from: https://www.nature.com/articles/181662a0
2. Wittenberg JB (1970) Myoglobin-facilitated oxygen diffusion: role of myoglobin in oxygen entry into muscle. Physiol Rev 50:559–636.
3. Wan L, Twitchett MB, Eltis LD, Mauk AG, Smith M (1998) In vitro Evolution of Horse Heart Myoglobin to Increase Peroxidase Activity. Proceedings of the National Academy of Sciences of the United States of America [Internet] 95:12825–12831. Available from: https://www.jstor.org/stable/46153
4. Guo C, Chadwick RJ, Foulis A, Bedendi G, Lubskyy A, Rodriguez KJ, Pellizzoni MM, Milton RD, Beveridge R, Bruns N (2022) Peroxidase Activity of Myoglobin Variants Reconstituted with Artificial Cofactors. ChemBioChem [Internet] 23:e202200197. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/cbic.202200197
5. Boutaud O, Roberts LJ (2011) Mechanism-Based Therapeutic Approaches to Rhabdomyolysis-Induced Renal Failure. Free Radic Biol Med [Internet] 51:1062–1067. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3116013/
6. Holt S, Moore K (2000) Pathogenesis of Renal Failure in Rhabdomyolysis: The Role of Myoglobin. Experimental Nephrology [Internet] 8:72–76. Available from: https://doi.org/10.1159/000020651
7. Moore KP, Holt SG, Patel RP, Svistunenko DA, Zackert W, Goodier D, Reeder BJ, Clozel M, Anand R, Cooper CE, et al. (1998) A Causative Role for Redox Cycling of Myoglobin and Its Inhibition by Alkalinization in the Pathogenesis and Treatment of Rhabdomyolysis-induced Renal Failure. Journal of Biological Chemistry [Internet] 273:31731–31737. Available from: https://www.jbc.org/article/S0021-9258(19)59006-8/abstract
8. Alayash AI, Patel RP, Cashon RE (2001) Redox reactions of hemoglobin and myoglobin: biological and toxicological implications. Antioxid Redox Signal 3:313–327.
9. Wilson MT, Reeder BJ (2021) The peroxidatic activities of Myoglobin and Hemoglobin, their pathological consequences and possible medical interventions. Mol Aspects Med:101045.
10. Vlasova I (2018) Peroxidase Activity of Human Hemoproteins: Keeping the Fire under Control. Molecules [Internet] 23:2561. Available from: http://www.mdpi.com/1420-3049/23/10/2561
11. Reeder BJ, Sharpe MA, Kay AD, Kerr M, Moore K, Wilson MT (2002) Toxicity of myoglobin and haemoglobin: oxidative stress in patients with rhabdomyolysis and subarachnoid haemorrhage. Biochemical Society Transactions [Internet] 30:745–748. Available from: https://doi.org/10.1042/bst0300745
12. Li L-L, Yuan H, Liao F, He B, Gao S-Q, Wen G-B, Tan X, Lin Y-W (2017) Rational design of artificial dye-decolorizing peroxidases using myoglobin by engineering Tyr/Trp in the heme center. Dalton Trans. [Internet] 46:11230–11238. Available from: https://pubs.rsc.org/en/content/articlelanding/2017/dt/c7dt02302b
13. Guo W-J, Xu J-K, Wu S-T, Gao S-Q, Wen G-B, Tan X, Lin Y-W (2022) Design and Engineering of an Efficient Peroxidase Using Myoglobin for Dye Decolorization and Lignin Bioconversion. International Journal of Molecular Sciences [Internet] 23:413. Available from: https://www.mdpi.com/1422-0067/23/1/413
14. Wu G-R, Sun L-J, Xu J-K, Gao S-Q, Tan X-S, Lin Y-W (2022) Efficient Degradation of Tetracycline Antibiotics by Engineered Myoglobin with High Peroxidase Activity. Molecules 27:8660.
15. Reeder BJ, Svistunenko DA, Cooper CE, Wilson MT (2012) Engineering Tyrosine-Based Electron Flow Pathways in Proteins: The Case of Aplysia Myoglobin. J. Am. Chem. Soc. [Internet] 134:7741–7749. Available from: https://doi.org/10.1021/ja211745g
16. Pott M, Hayashi T, Mori T, Mittl PRE, Green AP, Hilvert D (2018) A Noncanonical Proximal Heme Ligand Affords an Efficient Peroxidase in a Globin Fold. J. Am. Chem. Soc. [Internet] 140:1535–1543. Available from: https://doi.org/10.1021/jacs.7b12621
17. Ayuso-Fernández I, Emrich-Mills TZ, Haak J, Golten O, Hall KR, Schwaiger L, Moe TS, Stepnov AA, Ludwig R, Cutsail III GE, et al. (2024) Mutational dissection of a hole hopping route in a lytic polysaccharide monooxygenase (LPMO). Nat Commun [Internet] 15:3975. Available from: https://www.nature.com/articles/s41467-024-48245-w
18. Moosmann B (2021) Redox Biochemistry of the Genetic Code. Trends in Biochemical Sciences [Internet] 46:83–86. Available from: https://www.sciencedirect.com/science/article/pii/S0968000420302711
19. Granold M, Hajieva P, Toşa MI, Irimie F-D, Moosmann B (2018) Modern diversification of the amino acid repertoire driven by oxygen. Proceedings of the National Academy of Sciences [Internet] 115:41–46. Available from: https://www.pnas.org/doi/full/10.1073/pnas.1717100115
20. Ravanfar R, Sheng Y, Gray HB, Winkler JR (2023) Tryptophan extends the life of cytochrome P450. Proceedings of the National Academy of Sciences [Internet] 120:e2317372120. Available from: https://www.pnas.org/doi/10.1073/pnas.2317372120
21. Gray HB, Winkler JR (2015) Hole hopping through tyrosine/tryptophan chains protects proteins from oxidative damage. Proceedings of the National Academy of Sciences [Internet] 112:10920–10925. Available from: https://www.pnas.org/doi/10.1073/pnas.1512704112
22. Winkler JR, Gray HB (2015) Could tyrosine and tryptophan serve multiple roles in biological redox processes? Philos Trans A Math Phys Eng Sci [Internet] 373:20140178. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4342971/
23. Gray HB, Winkler JR (2021) Functional and protective hole hopping in metalloenzymes. Chemical Science [Internet] 12:13988–14003. Available from: https://pubs.rsc.org/en/content/articlelanding/2021/sc/d1sc04286f
24. Fowler DM, Fields S (2014) Deep mutational scanning: a new style of protein science. Nat Methods [Internet] 11:801–807. Available from: https://www.nature.com/articles/nmeth.3027
25. Küng C, Protsenko O, Vanella R, Nash MA (2025) Deep mutational scanning reveals a de novo disulfide bond and combinatorial mutations for engineering thermostable myoglobin. Protein Science [Internet] 34:e70112. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/pro.70112
26. Vanella R, Küng C, Schoepfer AA, Doffini V, Ren J, Nash MA (2024) Understanding activity-stability tradeoffs in biocatalysts by enzyme proximity sequencing. Nat Commun [Internet] 15:1807. Available from: https://www.nature.com/articles/s41467-024-45630-3
27. Höllerer S, Desczyk C, Muro RF, Jeschek M (2024) From sequence to function and back – High-throughput sequence-function mapping in synthetic biology. Current Opinion in Systems Biology [Internet] 37:100499. Available from: https://www.sciencedirect.com/science/article/pii/S2452310023000562
28. Agresti JJ, Antipov E, Abate AR, Ahn K, Rowat AC, Baret J-C, Marquez M, Klibanov AM, Griffiths AD, Weitz DA (2010) Ultrahigh-throughput screening in drop-based microfluidics for directed evolution. Proceedings of the National Academy of Sciences [Internet] 107:4004–4009. Available from: https://www.pnas.org/doi/full/10.1073/pnas.0910781107
29. Romero PA, Tran TM, Abate AR (2015) Dissecting enzyme function with microfluidic-based deep mutational scanning. Proceedings of the National Academy of Sciences [Internet] 112:7159–7164. Available from: https://www.pnas.org/doi/full/10.1073/pnas.1422285112
30. Thomas N, Belanger D, Xu C, Lee H, Hirano K, Iwai K, Polic V, Nyberg KD, Hoff KG, Frenz L, et al. (2025) Engineering highly active nuclease enzymes with machine learning and high-throughput screening. Cell Systems [Internet] 16:101236. Available from: https://www.sciencedirect.com/science/article/pii/S2405471225000699
31. Stiffler MA, Hekstra DR, Ranganathan R (2015) Evolvability as a Function of Purifying Selection in TEM-1 β-Lactamase. Cell [Internet] 160:882–892. Available from: https://www.sciencedirect.com/science/article/pii/S0092867415000781
32. Trinidad DD, Macdonald CB, Rosenberg OS, Fraser JS, Coyote-Maestas W (2024) Deep mutational scanning of EccD3 reveals the molecular basis of its essentiality in the mycobacterium ESX secretion system. bioRxiv [Internet]:2024.08.23.609456. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370616/
33. Jansen SC, Mayer C (2024) A Robust Growth-Based Selection Platform to Evolve an Enzyme via Dependency on Noncanonical Tyrosine Analogues. JACS Au [Internet] 4:1583–1590. Available from: https://pubs.acs.org/doi/10.1021/jacsau.4c00070
34. Küng C, Vanella R, Nash MA (2023) Directed evolution of Rhodotorula gracilis D-amino acid oxidase using single-cell hydrogel encapsulation and ultrahigh-throughput screening. React. Chem. Eng. [Internet] 8:1960–1968. Available from: https://pubs.rsc.org/en/content/articlelanding/2023/re/d3re00002h
35. Vanella R, Boult S, Kueng C, Nash M (2025) Decoding Substrate Specificity in a Promiscuous Biocatalyst by Enzyme Proximity Sequencing. bioRxiv [Internet]:2025.07.10.664162. Available from: https://www.biorxiv.org/content/10.1101/2025.07.10.664162v1
36. Hsu C, Nisonoff H, Fannjiang C, Listgarten J (2022) Learning protein fitness models from evolutionary and assay-labeled data. Nat Biotechnol [Internet] 40:1114–1122. Available from: https://www.nature.com/articles/s41587-021-01146-5
37. Yang KK, Wu Z, Arnold FH (2019) Machine-learning-guided directed evolution for protein engineering. Nat Methods [Internet] 16:687–694. Available from: https://www.nature.com/articles/s41592-019-0496-6
38. Frei L, Gao B, Han J, Taft JM, Irvine EB, Weber CR, Kumar RK, Eisinger BN, Ignatov A, Yang Z, et al. (2025) Deep mutational learning for the selection of therapeutic antibodies resistant to the evolution of Omicron variants of SARS-CoV-2. Nat. Biomed. Eng [Internet] 9:552–565. Available from: https://www.nature.com/articles/s41551-025-01353-4
39. Taft JM, Weber CR, Gao B, Ehling RA, Han J, Frei L, Metcalfe SW, Overath MD, Yermanos A, Kelton W, et al. (2022) Deep mutational learning predicts ACE2 binding and antibody escape to combinatorial mutations in the SARS-CoV-2 receptor-binding domain. Cell [Internet] 185:4008-4022.e14. Available from: https://www.sciencedirect.com/science/article/pii/S0092867422011199
40. Wrenbeck EE, Klesmith JR, Stapleton JA, Adeniran A, Tyo KEJ, Whitehead TA (2016) Plasmid-based one-pot saturation mutagenesis. Nat Methods [Internet] 13:928–930. Available from: http://www.nature.com/articles/nmeth.4029
41. Starr TN, Greaney AJ, Hilton SK, Ellis D, Crawford KHD, Dingens AS, Navarro MJ, Bowen JE, Tortorici MA, Walls AC, et al. (2020) Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell [Internet] 182:1295-1310.e20. Available from: https://www.sciencedirect.com/science/article/pii/S0092867420310035
42. Li Y, Arcos S, Sabsay KR, te Velthuis AJW, Lauring AS (2023) Deep mutational scanning reveals the functional constraints and evolutionary potential of the influenza A virus PB1 protein. Journal of Virology [Internet] 97:e01329-23. Available from: https://journals.asm.org/doi/full/10.1128/jvi.01329-23
43. Picotti P, Marabotti A, Negro A, Musi V, Spolaore B, Zambonin M, Fontana A (2004) Modulation of the structural integrity of helix F in apomyoglobin by single amino acid replacements. Protein Science [Internet] 13:1572–1585. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1110/ps.04635304
44. Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, Gibbs T, Feher T, Angerer C, Steinegger M, et al. (2022) ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence [Internet] 44:7112–7127. Available from: https://ieeexplore.ieee.org/document/9477085
45. Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, Verkuil R, Tran VQ, Deaton J, Wiggert M, et al. (2025) Simulating 500 million years of evolution with a language model. Science [Internet] 387:850–858. Available from: https://www.science.org/doi/10.1126/science.ads0018
46. Meng S, Li Z, Ji Y, Ruff AJ, Liu L, Davari MD, Schwaneberg U (2023) Introduction of aromatic amino acids in electron transfer pathways yielded improved catalytic performance of cytochrome P450s. Chinese Journal of Catalysis [Internet] 49:81–90. Available from: https://www.sciencedirect.com/science/article/pii/S1872206723644456
47. Sáez-Jiménez V, Rencoret J, Rodríguez-Carvajal MA, Gutiérrez A, Ruiz-Dueñas FJ, Martínez AT (2016) Role of surface tryptophan for peroxidase oxidation of nonphenolic lignin. Biotechnology for Biofuels [Internet] 9:198. Available from: https://doi.org/10.1186/s13068-016-0615-x
48. Li L, Wang T, Chen T, Huang W, Zhang Y, Jia R, He C (2021) Revealing two important tryptophan residues with completely different roles in a dye-decolorizing peroxidase from Irpex lacteus F17. Biotechnol Biofuels [Internet] 14:128. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8165797/
49. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Žídek A, Potapenko A, et al. (2021) Highly accurate protein structure prediction with AlphaFold. Nature [Internet] 596:583–589. Available from: https://www.nature.com/articles/s41586-021-03819-2
50. Bobrow MN, Harris TD, Shaughnessy KJ, Litt GJ (1989) Catalyzed reporter deposition, a novel method of signal amplification. Application to immunoassays. Journal of Immunological Methods [Internet] 125:279–285. Available from: https://www.sciencedirect.com/science/article/pii/002217598990104X
51. Anon eMap: A Web Application for Identifying and Visualizing Electron or Hole Hopping Pathways in Proteins. The Journal of Physical Chemistry B. Available from: https://pubs.acs.org/doi/10.1021/acs.jpcb.9b04816
52. Teo RD, Wang R, Smithwick ER, Migliore A, Therien MJ, Beratan DN (2019) Mapping hole hopping escape routes in proteins. Proceedings of the National Academy of Sciences [Internet] 116:15811–15816. Available from: https://www.pnas.org/doi/abs/10.1073/pnas.1906394116
53. Silkstone GGA, Silkstone RS, Wilson MT, Simons M, Bülow L, Kallberg K, Ratanasopa K, Ronda L, Mozzarelli A, Reeder BJ, et al. (2016) Engineering tyrosine electron transfer pathways decreases oxidative toxicity in hemoglobin: implications for blood substitute design. Biochem J [Internet] 473:3371–3383. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5095908/
54. Cooper CE, Simons M, Dyson A, Leiva Eriksson N, Silkstone GGA, Syrett N, Allen-Baume V, Bülow L, Ronda L, Mozzarelli A, et al. (2024) Taming hemoglobin chemistry—a new hemoglobin-based oxygen carrier engineered with both decreased rates of nitric oxide scavenging and lipid oxidation. Exp Mol Med [Internet] 56:2260–2270. Available from: https://www.nature.com/articles/s12276-024-01323-x
55. Gietz RD, Woods RA (2002) Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol 350:87–96.
56. Li H (2018) Minimap2: pairwise alignment for nucleotide sequences. Birol I, editor. Bioinformatics [Internet] 34:3094–3100. Available from: https://academic.oup.com/bioinformatics/article/34/18/3094/4994778
57. Bushnell B (2014) BBMap: A Fast, Accurate, Splice-Aware Aligner. Available from: https://www.semanticscholar.org/paper/BBMap%3A-A-Fast%2C-Accurate%2C-Splice-Aware-Aligner-Bushnell/f64dd54444a724574deb7710888091350eebb2b9
58. Bugnon M, Röhrig UF, Goullieux M, Perez MAS, Daina A, Michielin O, Zoete V (2024) SwissDock 2024: major enhancements for small-molecule docking with Attracting Cavities and AutoDock Vina. Nucleic Acids Research [Internet] 52:W324–W332. Available from: https://doi.org/10.1093/nar/gkae300
59. Eberhardt J, Santos-Martins D, Tillack AF, Forli S (2021) AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. J. Chem. Inf. Model. [Internet] 61:3891–3898. Available from: https://doi.org/10.1021/acs.jcim.1c00203
