An AI system to help scientists write expert-level empirical software | Research Square

Physical Sciences - Article

Michael Brenner, Eser Aygün, Anastasiya Belyaeva, Gheorghe Comanici, and 38 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7610233/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted October 24th, 2025

Abstract

The cycle of scientific discovery is frequently bottlenecked by the slow, manual creation of software to support computational experiments. To address this, we present an AI system that creates expert-level scientific software whose goal is to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS) to systematically improve the quality metric and intelligently navigate the large space of possible solutions. The system achieves expert-level results when it explores and integrates complex research ideas from external sources.
The effectiveness of tree search is demonstrated across a wide range of benchmarks. In bioinformatics, it discovered 40 novel methods for single-cell data analysis that outperformed the top human-developed methods on a public leaderboard. In epidemiology, it generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations. Our method also produced state-of-the-art software for geospatial analysis, neural activity prediction in zebrafish, time series forecasting and numerical solution of integrals. By devising and implementing novel solutions to diverse tasks, the system represents a significant step towards accelerating scientific progress.

Subject areas: Physical sciences/Mathematics and computing/Computational science; Physical sciences/Mathematics and computing/Computer science

Keywords: Tree Search, Generative AI, Scorable Scientific Tasks, Empirical Software

Additional Declarations

Yes there is potential Competing Interest. The authors are employees of Google and own shares in the company.
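As a rough illustration of the idea stated in the abstract (a proposal step combined with tree search to maximize a quality metric), the following minimal Python sketch may help. It is not the authors' implementation. The names `Node`, `tree_search`, `toy_score`, and `toy_propose` are hypothetical, the epsilon-greedy choice of which node to expand is an assumed stand-in for whatever selection rule the system actually uses, and `toy_propose` perturbs numbers where the real system would have an LLM rewrite code.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(eq=False, repr=False)
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    candidate: List[float]
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    root_candidate: List[float],
    score_fn: Callable[[List[float]], float],
    propose_fn: Callable[[List[float], random.Random], List[float]],
    budget: int = 200,
    epsilon: float = 0.2,
    seed: int = 0,
) -> Tuple[Node, List[Node]]:
    """Grow a tree of candidates, returning the best node and all nodes."""
    rng = random.Random(seed)
    root = Node(root_candidate, score_fn(root_candidate))
    nodes = [root]
    for _ in range(budget):
        # Assumed selection rule: usually expand the best-scoring node,
        # occasionally expand a random one to keep exploring.
        if rng.random() < epsilon:
            parent = rng.choice(nodes)
        else:
            parent = max(nodes, key=lambda n: n.score)
        child_candidate = propose_fn(parent.candidate, rng)
        child = Node(child_candidate, score_fn(child_candidate), parent=parent)
        parent.children.append(child)
        nodes.append(child)
    best = max(nodes, key=lambda n: n.score)
    return best, nodes


def toy_score(candidate: List[float]) -> float:
    # Toy quality metric (higher is better); its maximum, 0.0, is at (1.0, -2.0).
    x, y = candidate
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


def toy_propose(candidate: List[float], rng: random.Random) -> List[float]:
    # Stand-in for an LLM rewrite: a small random change to the parent.
    return [v + rng.gauss(0.0, 0.3) for v in candidate]


best, nodes = tree_search([0.0, 0.0], toy_score, toy_propose)
print(len(nodes), best.score)
```

Every scored candidate stays in the tree, so a later expansion can branch from an earlier, lower-scoring node rather than only from the current best. That is the main difference between this sketch and plain hill climbing.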
Authors and affiliations

Michael Brenner (Harvard University; corresponding author; ORCID https://orcid.org/0000-0002-5673-7947)
Eser Aygün (Google Deepmind)
Anastasiya Belyaeva (Google LLC)
Gheorghe Comanici (Google Deepmind)
Marc Coram (Google Research)
Hao Cui (Google Research; ORCID https://orcid.org/0009-0006-2456-083X)
Jake Garrison (Google)
Renee Johnston (Google)
Anton Kast (Google Research; ORCID https://orcid.org/0000-0002-0755-9996)
Cory McLean (Google (United States); ORCID https://orcid.org/0000-0001-9928-8216)
Peter Norgaard (Google Research)
Zahra Shamsi (Google Research)
David Smalling (Google Deepmind)
James Thompson (Google Research)
Subhashini Venugopalan (Google Research at Mountain View; ORCID https://orcid.org/0000-0003-3729-8456)
Brian Williams (Google Research; ORCID https://orcid.org/0000-0002-2839-0106)
Chujun He (MIT)
Sarah Martinson (Google Research; ORCID https://orcid.org/0009-0004-4636-5061)
Martyna Plomecka (Google Inc)
Lai Wei (Google Inc)
Yuchen Zhou (Google Research)
Qian-Ze Zhu (Google Research)
Matthew Abraham (Google Research)
Erica Brand (Google Research)
Anna Bulanova (Google Deepmind)
Jeffrey Cardille (Google Research)
Chris Co (Google Research)
Scott Ellsworth (Google Research)
Grace Joseph (Google Research)
Malcolm Kane (Google Research)
Ryan Krueger (Harvard University)
Johan Kartiwa (Google Research)
Dan Liebling (Google Research)
Jan-Matthis Lueckmann (Google Research)
Paul Raccuglia (Google Research)
Xuefei (Julie) Wang (Google Research)
Katherine Chou (Google, Inc.)
James Manyika (Google)
Yossi Matias (Google; ORCID https://orcid.org/0000-0003-3960-6002)
John Platt
Elizabeth Dorfman (Google Research)
Shibl Mourad (Google Deepmind)

Preprint record

Identity: rs-7610233, version v1
DOI: https://doi.org/10.21203/rs.3.rs-7610233/v1
Posted: October 24th, 2025
Status: under review
Journal: Nature (Nature Portfolio)
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Manuscript PDF: https://assets-eu.researchsquare.com/files/rs-7610233/v1_covered_ad892930-4754-40d7-ac97-29a113f89029.pdf
Platform: Research Square, ISSN 2693-5015 (online)