AdaptMol: Domain Adaptation for Molecular Image Recognition with Limited Supervision

Research Article

Feng Hu, Estrid He, Karin Verspoor

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8365561/v1
This work is licensed under a CC BY 4.0 License

Status: Published
Journal Publication published 02 May, 2026
Read the published version in Journal of Cheminformatics: https://doi.org/10.1186/s13321-026-01209-2
Version 1 posted 19 Feb, 2026

Abstract

Optical Chemical Structure Recognition (OCSR) aims to convert two-dimensional molecular images into machine-readable formats such as SMILES strings. Despite significant progress in deep-learning-based approaches, current OCSR methods trained predominantly on synthetic data often fail to generalize to diverse real-world inputs with varying visual styles and acquisition conditions.
Hand-drawn images represent a particularly challenging domain of molecular diagrams, exhibiting large variations in geometry and drawing styles. In this work, we propose AdaptMol, an image-to-graph model that enables effective transfer from synthetic to real-world data without requiring manual graph annotations in target domains. AdaptMol is an integrated pipeline that starts with training a base model on synthetic data, and then refines model representations through unsupervised domain adaptation and self-training. Our key insight is that bond features are domain-invariant in nature; they encode structural relationships between atoms that are independent of visual variations across domains. Thus, during domain adaptation, we align bond-level feature distributions via class-conditional Maximum Mean Discrepancy (MMD) to enforce cross-domain consistency. We further design a comprehensive data augmentation strategy to enhance the robustness of the base model, facilitating stable self-training on unlabeled target samples. We demonstrate our approach on hand-drawn molecules, achieving 82.6% accuracy, a 10.7-point improvement over the best prior method, while maintaining state-of-the-art performance on four benchmarks comprising molecular images from scientific literature and patent documents. This establishes a practical pipeline for molecular recognition that generalizes effectively across diverse real-world domains.

Scientific contribution

We propose AdaptMol, an image-to-graph model that predicts molecular structures as graphs of atoms and bonds, achieving effective transfer from synthetic to real-world molecular images without requiring target domain graph annotations.
We combine class-conditional Maximum Mean Discrepancy, which aligns bond features across domains, with comprehensive data augmentation, which increases training data variation. Together they raise base model accuracy enough for self-training to be effective, addressing the critical failure mode of prior approaches that begin self-training with insufficient accuracy. We further introduce a dual position representation that supervises atom positions through both discrete coordinate tokens and continuous spatial heatmaps to reduce false positives in atom localization.

Keywords: OCSR, Deep learning, Transformer, Domain Adaptation, Unsupervised Learning

Additional Declarations: No competing interests reported.

Version 1
Editorial decision: Revision requested 04 Mar, 2026
Reviews received at journal 20 Jan, 2026
Reviews received at journal 17 Jan, 2026
Reviewers agreed at journal 11 Jan, 2026
Reviewers agreed at journal 11 Jan, 2026
Reviewers invited by journal 11 Jan, 2026
Editor assigned by journal 11 Jan, 2026
Submission checks completed at journal 26 Dec, 2025
First submitted to journal 25 Dec, 2025
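The class-conditional MMD alignment mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the Gaussian (RBF) kernel, the `gamma` bandwidth, the biased estimator, the use of bond types as classes, and the use of pseudo-labels for the unlabeled target domain are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math
from collections import defaultdict


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples of vectors."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, gamma) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def class_conditional_mmd(src_feats, src_labels, tgt_feats, tgt_labels, gamma=1.0):
    """Average squared MMD over the classes that appear in both domains.

    tgt_labels are assumed to be pseudo-labels (for example, model
    predictions), because the target domain has no graph annotations.
    """
    src_by_class = defaultdict(list)
    tgt_by_class = defaultdict(list)
    for feat, label in zip(src_feats, src_labels):
        src_by_class[label].append(feat)
    for feat, label in zip(tgt_feats, tgt_labels):
        tgt_by_class[label].append(feat)

    shared = set(src_by_class) & set(tgt_by_class)
    if not shared:
        return 0.0  # nothing to align when no class is present in both domains
    per_class = [mmd_squared(src_by_class[c], tgt_by_class[c], gamma) for c in shared]
    return sum(per_class) / len(per_class)


# Toy bond features: two classes, two source samples each (made-up numbers).
src = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
src_y = ["single", "single", "double", "double"]
tgt_y = ["single", "double"]
aligned_tgt = [[0.0, 0.1], [1.0, 1.1]]   # close to the source class clusters
shifted_tgt = [[0.5, 0.5], [1.5, 1.5]]   # displaced from the source class clusters

loss_aligned = class_conditional_mmd(src, src_y, aligned_tgt, tgt_y)
loss_shifted = class_conditional_mmd(src, src_y, shifted_tgt, tgt_y)
```

In this toy setup the shifted target features give a larger loss than the aligned ones, which is the signal a training objective of this kind would minimize. Computing the discrepancy per class, rather than over all bond features at once, keeps features of one bond type from being pulled toward those of another.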