Multimodal Gene Expression Deep Learning for Predicting Sentinel Lymph Node Macro-metastasis in Early Breast Cancer: Development and Validation in the SCAN-B Cohort

Research Square | Research Article

Daqu Zhang, Johan Staaf, Pär-Ola Bendahl, Looket Dihge, Mattias Ohlsson, Martin Sjöström, Johan Vallon-Christersson, Patrik Edén, Lisa Rydén

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9281660/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Purpose: This study evaluates deep learning (DL) using gene expression (GEX) and preoperatively available clinical data (PreopClinic) to predict sentinel lymph node macro-metastasis (SLNM), and explores their potential for guiding axillary surgery de-escalation and supporting prognostic assessment.
Experimental Design: We retrospectively included 6,836 clinically node-negative (cN0) T1-T2 patients with invasive breast cancer from the Swedish SCAN-B cohort who underwent primary surgery. Three DL models (a multilayer perceptron, a pathway-informed sparse neural network, and a transformer) were developed using the development set (n=4,625) and evaluated against XGBoost in the independent test set (n=2,211).

Results: The Transformer outperformed other methods for GEX modeling and minimized the need for prior gene selection. In the independent test set, the combined PreopClinic+GEX model significantly improved SLNM prediction (ROC AUC 0.693, P<0.001) and better identified low-risk patients who might avoid unnecessary SLNB (reduction rate 27.2% at a sensitivity of 92.1%, P=0.02) compared to the PreopClinic model alone. Notably, across-subtype training outperformed within-subtype training, improving nodal prediction, especially in TNBC (ROC AUC 0.734; 95% CI: 0.644-0.837), achieving a substantial SLNB reduction rate of 51.5% (95% CI: 43.2-59.9%). Importantly, the derived SLNM predictor showed prognostic significance (P=0.039) and provided complementary information to the established prognostic factors in the ER+HER2- patients recommended for SLNB under the 2025 ASCO guidelines.

Conclusions: These findings highlight the Transformer's robustness against noise and effectiveness in capturing informative GEX features across scales, suggesting the potential of integrating GEX data and PreopClinic variables to enable further axillary surgical de-escalation, including for patients with tumor characteristics not reflected in current ASCO recommendations.

Keywords: Breast cancer, Axillary lymph node metastasis, Sentinel lymph node metastasis, preoperative lymph node staging, Gene expression, Cancer pathway, Deep learning, Transfer learning, Transformer, Grad-CAM

Additional Declarations

The authors declare no competing interests.
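The Results report an SLNB reduction rate at a fixed sensitivity. As a minimal sketch of how such a figure could be computed, the Python below assumes (this definition is not stated in the abstract) that the reduction rate is the fraction of all patients whose predicted risk falls below a cutoff, with the cutoff chosen as the highest one that still reaches the target sensitivity. The function name `slnb_reduction_at_sensitivity` and the toy scores and labels are hypothetical and are not the authors' code or data.

```python
from typing import Sequence, Tuple


def slnb_reduction_at_sensitivity(
    scores: Sequence[float],
    labels: Sequence[int],
    target_sensitivity: float,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, reduction_rate).

    Assumed convention (hypothetical, for illustration only):
    a patient is called high-risk when score >= threshold, and the
    reduction rate is the share of all patients called low-risk.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive (SLNM) case is required")

    # Try each distinct score as a cutoff, from highest to lowest.
    # The first cutoff that reaches the target sensitivity leaves the
    # largest possible group below it, i.e. the largest reduction rate.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s >= threshold
        )
        sensitivity = true_pos / n_pos
        if sensitivity >= target_sensitivity:
            low_risk = sum(1 for s in scores if s < threshold)
            return threshold, sensitivity, low_risk / len(scores)

    raise ValueError("target sensitivity cannot be reached (must be <= 1.0)")


# Toy data (made up): predicted risks and true SLNM status (1 = macro-metastasis).
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

threshold, sensitivity, reduction = slnb_reduction_at_sensitivity(
    toy_scores, toy_labels, target_sensitivity=0.75
)
print(f"cutoff={threshold:.2f} sensitivity={sensitivity:.2f} reduction={reduction:.2f}")
```

On this toy input, raising the target sensitivity to 1.0 forces a lower cutoff and a smaller low-risk group, which is the trade-off behind quoting a reduction rate together with its sensitivity.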
Supplementary Files: supplementary.pdf

Posted April 2nd, 2026. Institution: Lund University. Corresponding author: Lisa Rydén.

Research Square, ISSN 2693-5015 (online)