CycleDesigner: Leveraging RFdiffusion and HighFold to Design Cyclic Peptide Binders for Specific Targets

preprint OA: closed CC-BY-NC-4.0
📄 Open PDF Full text JSON View at publisher
Full text 34,821 characters · extracted from oa-pdf · 8 sections · click to expand

Abstract

Cyclic peptides are potentially therapeutic in clinical applications, due to their great stability and activity. Yet, designing and identifying potential cyclic peptide binders targeting specific targets remains a formidable challenge, entailing significant time and resources. In this study, we modified the powerful RFdiffusion model to allow the cyclic peptide structure identification and integrated it with ProteinMPNN and HighFold to design binders for specific targets. This innovative approach, termed cycledesigner, was followed by a series of scoring functions that efficiently screen. With the combination of effective cyclic peptide design and screening, our study aims to further broaden the scope of cyclic peptide binder design.

Keywords

deep learning, cyclic peptides, peptide design Issue Section: Problem solving protocol

Introduction

.CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint Cyclic peptides represent a unique class of biomolecules characterized by their cy- clic backbone structure. Compared to linear peptides, cyclic peptides exhibit notable advantages, such as enhanced stability against enzymatic degradation, stronger bind- ing affinity, and higher target specificity[1-3]. These features make cyclic peptides promising candidates for therapeutic development, especially in targeting pro- tein-protein interactions and other challenging molecular interfaces. However, the de- sign of cyclic peptides is inherently complex due to their restricted conformational space and the intricate relationship between sequence and structure[4]. Traditional cyclic peptide design approaches often rely on extensive experimental filtering or heuristic computational methods, which can be both time-intensive and resource-demanding[5-7]. Recent advancements in computational modeling, particu- larly deep learning, have paved the way for novel strategies in de novo peptide design. RFdiffusion[8], a generative model based on probabilistic diffusion, has demonstrated remarkable success in designing proteins and protein scaffolds by exploring se- quence-structure relationships in high-dimensional space. However, applying this ap- proach to cyclic peptides remains a significant challenge. Limited availability of cy- clic peptide data constrains large-scale training[9], and existing models often adapt pre-trained frameworks by modifying input features to accommodate specific cyclic peptide tasks[10]. In this study, we extend RFdiffusion to address the unique challenges of cyclic peptide design. By modifying the model‘s input features to reflect the distinct struc- tural characteristics of cyclic peptides, our approach enables the generation of novel .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint cyclic Figure 1. The illustration of modified RFdiffusion. The model requires the target protein structure file and designated hotspot regions as inputs. Additionally, the desired length of the cyclic peptide sequence must be specified. peptide backbones and sequences for specific targets, without requiring extensive re- training on cyclic peptide datasets. We called it CycleDesigner. Through a series of computational experiments, we evaluate the performance of the CycleDesigner model and demonstrate its ability to generate structurally stable cyclic peptides. This work underscores the potential of advanced diffusion models and other deep learning tech- niques to revolutionize cyclic peptide design, bridging the gap between computational predictions and experimental validation.

Methods

AND MATERIALS Cyclic peptide complex dataset The data used in this study were sourced from the Protein Data Bank (PDB)[11], a publicly accessible repository of three-dimensional structural data for biomolecules. Specifically, we utilized the dataset of HighFold[12], focusing exclusively on sin- gle-chain protein structures. This restriction aligns with the requirements of .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint RFdiffusion, which supports only single-chain protein inputs. As a result, multichain assemblies, including dimers, trimers and higher-order oligomers, were excluded from the dataset. For target structures with issues such as missing amino acids, we em- ployed PDBfixer[13] to repair the proteins. This ensures the integrity of the sin- gle-chain protein structures and their compatibility with the input constraints of RFdiffusion. This preprocessing step was critical for maintaining the continuity and accuracy of the input protein chains, thereby enabling effective cyclic peptide binder design. Model Computational Environment and Framework We use Docker containers to establish an isolated and compatible computational environment[14]. Docker enabled seamless execution of RFdiffusion by encapsulating all dependencies within a stand- ardized container, ensuring consistency across different computational setups. Docker also makes full use of GPU cuda acceleration and high-performance computing to achieve efficient modeling of cyclic peptides. This approach provided a robust and flexible framework for executing RFdiffusion without modifying the local server con- figuration, streamlining the computational process for cyclic peptide design. Input Data Preprocessing Before experiments, a series of data preprocessing steps were applied to ensure compatibility of the PDB files with the modified workflow. Firstly, ligands and other non-protein entities were removed from the PDB files, leaving only the target protein structure. This step simplified the input data, focusing solely on the protein chain required for RFdiffusion. Then RFdiffusion gets infor- .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint mation about chain length, sequence identifiers, and residue indices derived from the raw PDB files. This information served as the basis for downstream modeling and sequence-structure Figure 2. The complete workflow for designing cyclic peptides using RFdiffusion. By replacing the default relative position coding, RFdiffusion was able to generate a cyclic peptide backbone for specific targets. The backbone file was input into ProteinMPNN to obtain a new sequence. Finally, HighFold was used to predict the designed cyclic peptide structure. alignment. We also retained relevant metadata of cyclic peptide complexes, such as ligand lengths and computed distances between ligands and the target protein binding site. This information was incorporated into the model as input features for construct- ing the positional encoding matrix, which is critical for guiding the cyclic peptide generation process. Modification of Positional Encoding for Cyclic Peptides The pre-trained RFdiffusion model incorporates positional encoding in its input features to represent spatial relationships among amino acid residues in linear protein chains. However, this linear encoding is unsuitable for cyclic peptides due to their circular topology. To .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint address this limitation, we implemented the following modifications. We constructed a matrix of Cyclic relative position, using the input residue index (idx) from the linear representation and reflecting the circular topology of cyclic peptides. This matrix en- codes the relative distances between each pair of amino acids, with the values repre- senting the number of intervening residues. Additionally, the signs indicate the rela- tive direction within the circular structure (clockwise or counterclockwise). The re- sulting matrix has a size of N×N, where N is the number of residues in the cyclic pep- tide. We then replaced the constructed cyclic relative position matrix replaced the de- fault linear positional encoding in the input feature set of RFdiffusion. This modifica- tion allowed the model to effectively interpret and manipulate the cyclic topology without requiring additional retraining of the pre-trained framework. Structure Output and Downstream Processing The output of RFdiffusion is a cyclic peptide backbone in PDB format, accurately representing the cyclic topology and maintaining realistic spatial constraints. The downstream processing integrates the following steps: sequence generation with ProteinMPNN[15] and structure re- finement with HighFold[12]. We employed the structure-based graph neural network model ProteinMPNN for sequence generation. ProteinMPNN specializes in generating amino acid sequences based on the 3D structural features of proteins, including N, C α , C, O, and virtual C β atoms. The encoder comprises multiple layers with hidden dimensions of 128. For each residue iii, its node features are updated by integrating features from neighboring residues and edge connections through a message-passing mechanism. This iterative .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint process ensures that structural and relational information is effectively captured. The decoder translates the encoded structural information into residue probabilities for each position. For tasks involving fixed-target ligand generation, ProteinMPNN in- corporates known sequence segments as background constraints. These are combined with the structural features of the unknown regions to infer the sequence. The model is trained by minimizing the cross-entropy loss for each residue, quantifying the devi- ation between predicted and original sequences. The newly generated cyclic peptide sequence is fed into HighFold, a deep learn- ing model based on AlphaFold[16, 17] for predicting the 3D structures of cyclic pep- tides and their complexes to further explore the plausibility of designed backbones and sequences. HighFold introduces the Cyclic Position Offset Encoding Matrix (CycPOEM) to adapt AlphaFold's positional encoding for cyclic peptide topology. This enhancement ensures accurate prediction of the cyclic peptide structure and its interaction with target proteins. By combining RFdiffusion, ProteinMPNN, and HighFold, our workflow efficiently generates novel cyclic peptide binders for specific targets and predicts their sequences and structures, offering a powerful pipeline for cyclic peptide binder design and analysis. In this study, we use Rosetta's energy analyzer[18] to evaluate the stability and binding affinity of the generated cyclic peptide-protein complexes. To filter and select high-confidence designs, we apply the energy metric dGseparated/dSASA×100. This value serves as an indicator of the peptide’s binding efficiency, helping us identify structures with favorable energy characteristics for further analysis and experimental .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint validation. Hotspot Identification In the context of ligand-target interactions, hotspots are key residues at the binding site that play a critical role in molecular recognition and binding affinity[18-20].In this study, we defined hotspots as target residues within a 5 Å threshold of the ligand. Two strategies were considered for computing the shortest distance between ligand atoms and target residues: one-to-one and one-to-many (Adopted) Strategy. For the approach of one-to-one, for each ligand residue, all dis- tances to target residues are computed, and the residue with the shortest distance is selected as a hotspot. This approach often results in a limited or empty hotspot set, particularly for ligands with sparse or irregular interaction patterns. Insufficient hotspot guidance can lead to off-target backbone generation during modeling, where the backbone deviates from the intended binding site due to the lack of spatial con- straints. For the one-to-many approach, distances from each ligand residue to all tar- get residues are computed. Any target residue within the 5 Å threshold is considered a hotspot. Reducing the risk of insufficient hotspot identification by ensuring a more comprehensive selection of residues within the interaction interface. Capturing a broader representation of the binding interface offers the model richer spatial con- straints. Aligns generated backbones more closely with the binding site, increasing the reliability and applicability of the model outputs for downstream tasks. Experiment settings All the experiments run on an Ubuntu (version 20.04.2) workstation with hardware configurations as follows: Intel i9 CPU (16 cores), RTX 3090 GPU*2 (24GB*2 Vid- .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint eo. RAM in total), 64GB Memory and 6TB Hard Disk. RFdiffusion is developed and im- plemented based on docker containers. Figure 3. The results of different diffusion steps on cyclic peptide design. a. Comparison of im- portant indicators of complexes with different diffusion steps. b. pLDDT distribution of different diffusion steps. c. ipTM distribution of different diffusion steps. d. RMSD to backbone distribu- tion of different diffusion steps.

Results

AND DISCUSSION Effect of Diffusion Steps (T) on Cyclic Peptide Design The parameter diffuser.T, representing the number of diffusion steps, significantly influences the accuracy of the CycleDesigner outputs. While increasing diffusion steps generally enhances output accuracy, our experiments demonstrated a distinct .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint trend for cyclic peptide backbone generation. We evaluated different diffusion steps by redesigning all targets in the dataset, selecting the top-1 backbone generated by CycleDesigner, generating the top-1 sequence with ProteinMPNN, and predicting the 3D structure with HighFold. Key metrics included pLDDT for local structural confi- dence, RMSD for alignment accuracy, and ipTM for assessing the quality of predicted complexes. The results showed that the pLDDT metric remained relatively consistent across diffusion steps, with T=25 and T=30 yielding better performance. Specifically, pLDDT values at T=25 were more enriched at higher levels. RMSD values showed an average of 2.151 Å at the default T=20 and improved with increasing steps, but per- formance degraded at T ≥ 30 with an average RMSD of 2.212 Å. The scatterplot anal- ysis revealed that RMSD values at T=25 were more concentrated and enriched at lower levels. Similarly, ipTM values peaked at T=25 with an average of 0.541, com- pared to 0.527 at T=20, and decreased to 0.511 at T=35. Notably, CycleDesigner with T=25 also produced backbones with the highest number of ipTM results above 0.8, indicating a higher likelihood of generating structures with binding activity. These

Results

highlight a unique behavior for cyclic peptides, where increasing diffusion steps does not always improve accuracy. This is likely due to the model’s bias toward generating α -helical secondary structures, which are less prominent in cyclic peptides, where loop structures dominate. Higher diffusion steps exacerbate this bias, reducing structural accuracy for cyclic peptides. Therefore, diffuser.T=25 was selected as the optimal diffusion step for cyclic peptide design in our workflow, striking a balance between model performance and structural .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint relevance. Cyclic peptide binders designed using Rfdiffusion for specific targets We performed the design of cyclic peptide ligands for 23 targets. Using CycleDesigner, we generated five cyclic peptide backbones for each target, resulting in a total of 115 backbones. These backbones were then processed through ProteinMPNN, with each backbone producing five novel sequences, yielding 575 new cyclic peptide sequences. The structural prediction of all 575 sequence designs was conducted using HighFold, generating 2,875 unique 3D structures of cyclic pep- tide-target complexes. Among these complexes, the average pLDDT reached 91.263, while pTM, ipTM, and iPAE scores were 0.912, 0.533, and 0.465, respectively. Cyclic peptides with lengths of 7 and 14 residues were most prevalent, accounting for 625 and 1, 500 structures, respectively. These results highlight the capability of the workflow to gen- erate a large number of high-confidence cyclic peptide designs tailored for specific targets. The screening for potential cyclic peptide binders .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint Figure 4. Preliminary filtering results of alphafold related indicators. a. Comparison of results before and after filtering. b. pLDDT distribution after filtering. c. ipTM distribution after filtering. d. iPAE distribution after filtering. E. RMSD to backbone distribution after filtering. We filtered the 2,875 cyclic peptide-target complex structures to identify candi- dates for subsequent energy function calculations. The filtering criteria included iPAE0.5, pLDDT>80, and RMSD to input backbone <3.5. This process yielded 305 high-quality designs. After filtering, the average pLDDT of the selected structures significantly improved, exceeding 95, with most results surpassing 90. The ipTM, a critical indicator of com- plex quality, showed an even greater enhancement, with an average increase of 20%, reaching nearly 0.8. However, ipTM values were distributed between 0.7 and 0.8 be- fore .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint Figure 5. The filtering results using Rosetta energy analysis. a. Top 1 results of all targets that meet the filtering criteria. b. RMSD to backbone distribution of results after screening. c. dGseparated/dSASA×100 distribution of results after screening. d. ipTM distribution of results after screening. e. The correlation between dGseparated/dSASA×100 and ipTM. screening. The iPAE also improved markedly, with an average dropping below 0.3. The RMSD to the backbone, which already performed well, showed slight improve- ment. Its distribution exhibited two dense regions, one below 1 Å and another be- tween 2 and 3 Å. This bimodal distribution may result from the variability in cyclic peptide lengths, as longer sequences tend to present greater challenges in achieving low RMSD after sequence design. These findings underscore the effectiveness of the filtering process in finding high-quality cyclic peptide designs for further analysis.We performed energy function .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint calculations on the filtered structures using Rosetta's energy_analyze tool. Based on the criterion dGseparated/dSASA×100, 245 structures were selected. Figure 5A illus- trates the best-performing designs for each target meeting this criterion. The RMSD distribution of all selected structures shows two prominent clusters around 1 Å and 2.5 Å. Among these, 74 structures exhibited dGseparated/dSASA×100<−1.5. Inter- estingly, there was no linear correlation between ipTM and dGseparated/dSASA×100. However, as ipTM increased, the enrichment region of dGseparated/dSASA×100 shifted slightly downward, suggesting a modest degree of correlation (Figure 5E). These results highlight the correlation between binding affinity metrics and structural quality indicators, offering valuable insights for further refinement and evaluation of cyclic peptide designs. Among the top 1 structures for all selected targets, we observed that the designed cyclic peptide ligands exhibited spatial positions similar to their natural counterparts. However, their 3D structures and sequences were significantly different. In cases such Figure 6. The comparisons of top 1 designed structures and native structures. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint Figure 7. The comparisons of top 1 designed structures and native structures. as 4gux and 6mrq, which involve longer cyclic peptide sequences, the structural simil arity was particularly pronounced in the contact regions. Conversely, other regions displayed notable differences, likely due to the generative tendencies of the model (Figures 6 and 7). These findings highlight the model's ability to retain spatial alignment with native ligands while introducing diversity in sequence and non-contacting structural ele- ments, potentially broadening the functional scope of the designed cyclic peptides. Parameter adjustment for diffrenret targets in cyclic peptide de- sign .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint Figure 7. Comparison of various indicators of target 3zgc under different parameters, and com- parison between the designed structures and natural structures under different parameters. In exploring the impact of diffusion steps, we found that different targets may have distinct optimal parameters. For example, with 3ZGC, the best results were achieved at 30 diffusion steps. At this setting, three identical amino acid sequences were generated, showing high spatial overlap. Other metrics also performed excep- tionally, with pLDDT reaching 98.2, ipTM 0.941, RMSD to backbone 1.204, and dGseparated/dSASA×100 at -1.145. In contrast, at 25 diffusion steps-the parameter used in our batch processing—the best filtered results showed slightly lower metrics, including a pLDDT of 97.6, ipTM 0.825, iPAE 0.218, RMSD to backbone 0.573, and a Rosetta energy score of -0.468. Structurally, the design at 30 diffusion steps was also more similar to the native structure (Figure 8). These findings indicate that cyclic peptide design results exhibit a degree of randomness. Selecting universally applicable parameters, such as 25 diffusion steps, increases the likelihood of obtaining good designs but may not always yield the opti- .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint mal structures for specific targets. For certain targets, adjusting parameters can lead to designs with higher potential, underscoring the importance of flexibility in parameter selection for cyclic peptide design.

Conclusion

This study demonstrates the potential of CycleDesigner for cyclic peptide design. By introducing cyclic positional encoding, we effectively overcome the limitations of traditional generative models based on linear encoding. This approach allows us to design high-accuracy cyclic peptide topologies without the need for retraining or fi- ne-tuning. Through our comprehensive workflow, we generated over 2,800 novel cy- clic peptide-protein complexes. Of these, 305 candiated were retained after passing rigorous structural and energy-based screening criteria, and finally 245 high-confidence cyclic peptide complexes met multiple standards for further explora- tion. In addtion, while using universal parameters can streamline the process, the sto- chastic nature of cyclic peptide design suggests that adjusting parameters for specific targets can significantly improve outcomes. This balance between generality and specificity represents the next key challenge in cyclic peptide computational design. Our work advances the computational toolkit for cyclic peptide design, enabling the efficient generation of novel cyclic peptide binders with favorable structural and energy properties, even in the absence of detailed ligand information. Future research .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint should focus on experimental validation of the best candidates and further refinement of the models to enhance their applicability in a broad range of peptide-protein inter- action environments. CODE AVAILABILITY The code and scripts for running the model will be released at GitHub after publication. Author Biographies Chenhao Zhang is a postgraduate student in the Artificial Intelligence Assisted Drug Discovery Institute and the Department of Pharmacy, Zhejiang University of Technology. His research interests are in bioinformatics, peptides discovery, machine learning and deep learning. Zhenyu Xu is a researcher in the Shenzhen Highslab Therapeutics. Inc. His research in- terests are in bioinformatics, computational biology, machine learning and deep learning. Kang Lin is a postgraduate student in the Artificial Intelligence Assisted Drug Discovery Institute and the Department of Pharmacy, Zhejiang University of Technology. His re- search interests are in bioinformatics, peptides discovery, machine learning and deep learning. Xu Wen is a postgraduate student in the Artificial Intelligence Assisted Drug Discovery Institute and the Department of Pharmacy, Zhejiang University of Technology. Her re- search interests are in pharmacological data analysis, computational biology, machine learning and deep learning. Chengyun Zhang is a researcher in the Artificial Intelligence Assisted Drug Discovery Institute and the Department of Pharmacy, Zhejiang University of Technology. Her re- search interests are in bioinformatics, computational biology, machine learning and deep learning. Hongliang Duan is a professor in the Artificial Intelligence Assisted Drug Discovery In- stitute and the Department of Pharmacy, Zhejiang University of Technology. His research .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint interests are in AI-driven drug discovery, machine learning and deep learning. COMPETING INTERESTS The authors declare no competing interests.

Reference

1. Buckton LK, Rahimi MN, McAlpine SR. Cyclic Peptides as Drugs for Intracel- lular Targets: The Next Frontier in Peptide Therapeutic Development, Chemistry 2021;27:1487-1513. 2. Haberman V A, Fleming SR, Leisner TM et al. Discovery and Development of Cyclic Peptide Inhibitors of CIB1, ACS Med Chem Lett 2021;12:1832-1839. 3. Zhang H, Chen S. Cyclic peptide drugs approved in the last two decades (2001-2021), RSC Chem Biol 2022;3:18-31. 4. Zorzi A, Deyle K, Heinis C. Cyclic peptide therapeutics: past, present and future, Current opinion in chemical biology 2017;38:24-29. 5. Rhodes DI, Peat TS, V andegraaff N et al. Crystal structures of novel allosteric peptide inhibitors of HIV integrase identify new interactions at the LEDGF binding site, Chembiochem 2011;12:2311-2315. 6. Riley BT, Ilyichova O, Costa MG et al. Direct and indirect mechanisms of KLK4 inhibition revealed by structure and dynamics, Sci Rep 2016;6:35385. 7. Mourao CBF, Brand GD, Fernandes JPC et al. Head-to-Tail Cyclization after In- teraction with Trypsin: A Scorpion V enom Peptide that Resembles Plant Cyclotides, J Med Chem 2020;63:9500-9511. 8. Watson JL, Juergens D, Bennett NR et al. De novo design of protein structure and function with RFdiffusion, Nature 2023;620:1089-1100. 9. Rettie SA, Campbell KV , Bera AK et al. Cyclic peptide structure prediction and design using AlphaFold, bioRxiv 2023. 10. Kosugi T, Ohue M. Solubility-Aware Protein Binding Peptide Design Using AlphaFold, Biomedicines 2022;10:1-15. 11. Berman HM, Westbrook J, Feng Z et al. The Protein Data Bank, Nucleic Acids Research 2000;28:235-242. 12. Zhang C, Zhang C, Shang T et al. HighFold: accurately predicting structures of cyclic peptides and complexes with head-to-tail and disulfide bridge constraints, Brief Bioinform 2024;25. 13. Eastman P , Swails J, Chodera JD et al. OpenMM 7: Rapid development of high performance algorithms for molecular dynamics, PLoS computational biology 2017;13:e1005659. 14. Merkel D. Docker: lightweight linux containers for consistent development and .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint deployment, Linux j 2014;239:2. 15. Dauparas J, Anishchenko I, Bennett N et al. Robust deep learning–based protein sequence design using ProteinMPNN, Science 2022;378:49-56. 16. Jumper J, Evans R, Pritzel A et al. Highly accurate protein structure prediction with AlphaFold, Nature 2021;596:583-589. 17. Evans R, O’Neill M, Pritzel A et al. Protein complex prediction with AlphaFold-Multimer, bioRxiv 2021:2021.2010. 2004.463034. 18. Stranges PB, Kuhlman B. A comparison of successful and failed protein interface designs highlights the challenges of designing buried hydrogen bonds, Protein Sci 2013;22:74-82. 19. Bajusz D, Wade WS, Satala G et al. Exploring protein hotspots by optimized fragment pharmacophores, Nat Commun 2021;12:3201. 20. Kozakov D, Hall DR, Jehle S et al. Ligand deconstruction: Why some fragment binding positions are conserved and others are not, Proceedings of the National Academy of Sciences 2015;112:E2585-E2594. .CC-BY-NC 4.0 International licenseavailable under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprintthis version posted December 2, 2024. ; https://doi.org/10.1101/2024.11.27.625581doi: bioRxiv preprint

Text is read by the "Ask this paper" AI Q&A widget below. Extraction quality varies by source — PMC NXML preserves structure cleanly, OA-HTML may include some navigation residue, and OA-PDF can have broken hyphenation. The publisher copy (via DOI) is the canonical version.

My notes (saved in your browser only)

Ask this paper AI returns verbatim quotes from the full text · source: oa-pdf

Answers must be backed by verbatim quotes from this paper's full text. Hallucinated quotes are dropped automatically; if no verbatim passage answers the question, we say so. How this works

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2024) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00
unpaywall
last seen: 2026-05-23T02:00:01.238055+00:00
License: CC-BY-NC-4.0