Adaptive and scalable catalyst discovery with composable intelligence | Research Square

Physical Sciences - Article

Xiaonan Wang, Wentao Li, Qilong Cai, Qi Lei, Jinxing Chen, Ruixuan Chen, and 16 more

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-9556645/v1

This work is licensed under a CC BY 4.0 License

Status: Posted (Version 1)

Abstract

Impactful discoveries in catalysis research are increasingly driven by systems that combine properties across multiple subdisciplines. However, state-of-the-art predictive models remain constrained by domain-specific knowledge representations, limiting transferability across catalyst systems and hindering extrapolative discovery.
Herein, we introduce Composable Catalytic Intelligence (CatAI), a modular predictive framework that assembles task-specific models by combining pretrained physicochemical knowledge modules (30 encoders trained on 1.8 M structure-property datapoints), enabling adaptation across diverse catalytic systems. Across six benchmark tasks, the model reduces prediction error by 7.8% relative to baseline models, while achieving over 15% improvement in few-shot accuracies and active learning scenarios. To demonstrate practical utility, we validate CatAI across two distinct discovery campaigns. In homogeneous catalysis, it enables large-scale screening to identify a zirconocene catalyst for polyolefin elastomer synthesis with an activity of 194.4 × 10^4 kg/(mol·h), corresponding to a 9-fold improvement over the commercial benchmark. In heterogeneous catalysis, integration with an autonomous platform reveals a Cu-based geminal-atom photocatalyst for C–O coupling, breaking the Ni-dominated paradigm. At its core, CatAI's modular architecture allows new structure-property modules to be added and selectively combined with existing ones, enabling interpretable predictions. This design establishes an extensible platform for catalyst discovery that translates computational insights into experimental outcomes across diverse catalytic systems.

Subject areas

Physical sciences/Chemistry/Catalysis/Heterogeneous catalysis
Physical sciences/Chemistry/Catalysis/Homogeneous catalysis
Physical sciences/Chemistry/Cheminformatics
Physical sciences/Materials science/Theory and computation/Computational methods

Additional Declarations

There is NO Competing Interest.
Supplementary Files

Ext4.pdf: Extended Data 4
Ext2.pdf: Extended Data 2
Ext1new.pdf: Extended Data 1
CatAISIsubmission.docx: Supplementary Information
automatedpolymerizationevaluationplatform.mp4: Automated polymerization evaluation platform
Ext3.pdf: Extended Data 3

Authors and affiliations

Xiaonan Wang (corresponding author), Tsinghua University, https://orcid.org/0000-0001-9775-2417
Wentao Li, Tsinghua University
Qilong Cai, National University of Singapore, https://orcid.org/0000-0003-3932-702X
Qi Lei, Tsinghua University
Jinxing Chen, National University of Singapore
Ruixuan Chen, Tsinghua University
Jiangjie Qiu, Tsinghua University
Botian Wang, Institute for AI Industry Research (AIR), Tsinghua University
Kai Zhao, Tsinghua University
Zemeng Wang, Tsinghua University
Yijun Li, Tsinghua University
Jun Yin, National University of Singapore, https://orcid.org/0000-0002-8993-3178
Fangfang Li, Wanhua Chemical Group Co., Ltd.
Hongxuan Liu, Massachusetts Institute of Technology
Yingdong Lu, Wanhua Chemical Group Co., Ltd.
Yawen Ouyang, Institute for AI Industry Research (AIR), Tsinghua University
Manu Suvarna, University of Greifswald
Wei-Ying Ma, Institute for AI Industry Research, Tsinghua University
Ya-Qin Zhang, Tsinghua University
Kostya Novoselov, National University of Singapore, https://orcid.org/0000-0003-4972-5371
Hao Zhou, Institute for AI Industry Research (AIR), Tsinghua University
Jiong Lu, National University of Singapore, https://orcid.org/0000-0002-3690-8235

Version 1 posted May 4th, 2026.