A federated graph learning method to multi-party collaboration for molecular discovery

Yuen Wu, Liang Zhang, Kong Chen, Jun Jiang, Yanyong Zhang

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5546931/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 10 Feb, 2026 in Nature Machine Intelligence. Version 1 posted; this is the latest preprint version.

Abstract

Making better use of molecular resources for molecular discovery requires collaboration across research institutions. However, because both successful and unsuccessful molecules studied at each institution (or laboratory) have high research value, these findings are typically kept private and confidential until formal publication, and failed ones are rarely disclosed. This confidentiality requirement is a major challenge for most existing methods, which must handle molecular data with heterogeneous distributions under stringent privacy constraints. Here, we propose FedLG, a federated graph learning method that leverages the Lanczos algorithm to facilitate collaborative model training across multiple parties, achieving reliable prediction performance under strict privacy protection conditions. Compared with various traditional federated learning methods, FedLG exhibits excellent model performance on all benchmark datasets. Under different privacy-preserving mechanism settings, FedLG shows high robustness and noise resistance, indicating its potential for practical application. Comparison tests on datasets from each simulated research institution also show that FedLG aggregates data more effectively and gives more promising outcomes than localized model training. In addition, we incorporate the Bayesian optimization algorithm into FedLG to demonstrate its scalability and further enhance model performance. Overall, FedLG is an effective method for realizing multi-party collaboration while protecting sensitive molecular information from potential leakage.

Subject areas
Biological sciences/Drug discovery/Drug safety
Physical sciences/Mathematics and computing/Information technology

Additional Declarations
There is NO Competing Interest.

Supplementary Files
SupplementaryInformationAfederatedgraphlearningmethodtomultipartycollaborationformoleculardiscovery.docx (Supplementary Information)
Author information
Yuen Wu (corresponding author), University of Science and Technology of China, ORCID: https://orcid.org/0000-0001-9524-2843
Liang Zhang, University of Science and Technology of China
Kong Chen, University of Science and Technology of China, ORCID: https://orcid.org/0009-0008-9082-4789
Jun Jiang, University of Science and Technology of China, ORCID: https://orcid.org/0000-0002-6116-5605
Yanyong Zhang, University of Science and Technology of China

Article information
Article type: Article
Posted: December 23rd, 2024 (version 1)
Version of record: Nature Machine Intelligence, published February 10th, 2026, https://doi.org/10.1038/s42256-026-01184-1
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
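The abstract names the Lanczos algorithm but does not describe how FedLG applies it. For orientation only, the following is a minimal sketch of the textbook Lanczos iteration, which reduces a symmetric matrix (such as a graph Laplacian) to a small tridiagonal matrix. The helper names `dot`, `matvec`, and `lanczos`, the full reorthogonalization step, and the 4-node path-graph example are illustrative assumptions and are not taken from the paper.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def matvec(a, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [dot(row, x) for row in a]


def lanczos(a, v0, k):
    """Run up to k Lanczos steps on the symmetric matrix `a`.

    Returns (alphas, betas, basis): the diagonal and off-diagonal of the
    tridiagonal matrix, and the orthonormal Lanczos vectors.
    Hypothetical helper for illustration; not the FedLG implementation.
    """
    norm = math.sqrt(dot(v0, v0))
    q = [x / norm for x in v0]
    q_prev, beta = [0.0] * len(v0), 0.0
    basis, alphas, betas = [q], [], []
    for _ in range(k):
        w = matvec(a, q)
        alpha = dot(w, q)
        # Three-term recurrence: remove components along q and q_prev.
        w = [w_i - alpha * q_i - beta * p_i
             for w_i, q_i, p_i in zip(w, q, q_prev)]
        # Full reorthogonalization against all previous vectors (assumed
        # here for numerical stability; costs extra work per step).
        for b in basis:
            c = dot(w, b)
            w = [w_i - c * b_i for w_i, b_i in zip(w, b)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        # Stop on breakdown (invariant subspace found) or after k steps.
        if beta < 1e-12 or len(alphas) == k:
            break
        betas.append(beta)
        q_prev, q = q, [w_i / beta for w_i in w]
        basis.append(q)
    return alphas, betas, basis


# Example: Laplacian of a path graph on 4 nodes (illustrative data only).
laplacian = [
    [1.0, -1.0, 0.0, 0.0],
    [-1.0, 2.0, -1.0, 0.0],
    [0.0, -1.0, 2.0, -1.0],
    [0.0, 0.0, -1.0, 1.0],
]
alphas, betas, basis = lanczos(laplacian, [1.0, 0.0, 0.0, 0.0], 4)
```

Because this Laplacian is already tridiagonal and the start vector is the first unit vector, the iteration recovers the matrix's own diagonal and the absolute values of its off-diagonal entries, which makes the example easy to check by hand.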