Interpretable Multiple Instance Learning for Hematologic Diagnosis from Peripheral Blood Smears

Siddharth Singi, Shenghuan Sun, Zhanghan Yin, Riya Gupta, Dylan Webb, and 36 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6933141/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1

Abstract
Accurate diagnosis of hematologic malignancies from peripheral blood smears (PBSs) requires integrating cellular morphology and composition across hundreds of white blood cells. Existing approaches primarily automate single-cell classification without providing whole-slide diagnostic predictions.
We present a pipeline that pairs a high-performing cell-based encoder (DeepHeme) for feature extraction with our weakly supervised framework based on attention-based multiple instance learning (MIL), which we call CAREMIL (Cell AggRegation, Explainable, Multiple Instance Learning). In an evaluation of popular image encoders and MIL architectures, the combination of DeepHeme and CAREMIL is the best-performing pipeline on our disease classification task. CAREMIL proves to be a robust aggregation function that outperforms the most commonly used slide-level aggregation function (gated multiple instance learning) across several encoder types. The greatest performance gains with CAREMIL are observed when using out-of-domain encoders, including an encoder trained on ImageNet and leading open-source pathology foundation models (UNI2 and Virchow2). CAREMIL plus DeepHeme achieves the highest diagnostic performance across acute myeloid leukemia (AML), myelodysplastic syndromes (MDS), and hairy cell leukemia (HCL) (AUROCs 0.999, 0.891, and 0.945, respectively), and identifies AML even in cases with minimal or absent circulating blasts. Attention values assigned by CAREMIL highlight diagnostically relevant cells and reveal disease-specific morphometric signatures, enabling biological interpretability and case-level insight. CAREMIL remains robust to cell types misclassified by the cell image encoder and does not require explicit cell-level supervision. These findings position CAREMIL as an effective and interpretable multiple instance learning framework for hematologic slide diagnosis, with potential to extend to bone marrow aspirates, cytology, and other liquid biopsy specimens, and to support a broader shift toward quantitative, morphology-informed diagnostics in hematology.
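The abstract names gated multiple instance learning as the common slide-level aggregation baseline. As a rough illustration of that general idea (not of CAREMIL itself, whose architecture is not described in this text), the sketch below pools per-cell embeddings into one slide embedding with gated attention weights. Everything here is an assumption made for illustration: the function names, the dimensions, and the random weights standing in for learned parameters and for encoder outputs.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gated_attention_pool(cell_embeddings, V, U, w):
    """Pool per-cell embeddings into a single slide embedding.

    score_k = w . (tanh(V h_k) * sigmoid(U h_k)); attention = softmax(scores).
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in cell_embeddings:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate_branch = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate_branch)))

    # Numerically stable softmax over the cells of one slide.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Attention-weighted sum of the cell embeddings.
    dim = len(cell_embeddings[0])
    slide_embedding = [
        sum(a * h[d] for a, h in zip(attention, cell_embeddings)) for d in range(dim)
    ]
    return slide_embedding, attention


def demo(num_cells=200, embed_dim=8, hidden_dim=4, seed=0):
    """Run the pooling on random stand-ins for encoder outputs (hypothetical sizes)."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

    cells = rand_matrix(num_cells, embed_dim)   # one row per white blood cell
    V = rand_matrix(hidden_dim, embed_dim)      # tanh branch weights
    U = rand_matrix(hidden_dim, embed_dim)      # sigmoid gate weights
    w = [rng.gauss(0, 1) for _ in range(hidden_dim)]
    return gated_attention_pool(cells, V, U, w)
```

In a trained model of this kind, the slide embedding would feed a classifier head, and the per-cell attention weights are what make it possible to rank cells by their contribution to the slide-level prediction.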
Subject areas
Health sciences/Oncology/Cancer/Haematological cancer
Biological sciences/Cancer/Cancer screening
Health sciences/Biomarkers/Diagnostic markers
Health sciences/Oncology/Cancer/Cancer imaging

Additional Declarations
There is NO Competing Interest.

Supplementary Files
naturecommunicationschecklistsigned.pdf (Nature Communications Checklist)
SupplementalPaper.pdf (Supplementary Info)
Authors and affiliations
Siddharth Singi (corresponding author), Memorial Sloan Kettering Cancer Center, https://orcid.org/0009-0005-6184-5239
Shenghuan Sun, University of California San Francisco, https://orcid.org/0000-0002-4339-2716
Zhanghan Yin, Memorial Sloan Kettering Cancer Center
Riya Gupta, Memorial Sloan Kettering Cancer Center
Dylan Webb, Memorial Sloan Kettering Cancer Center
Khawaja Bilal, Memorial Sloan Kettering Cancer Center
Deepika Dilip, Memorial Sloan Kettering Cancer Center
Linlin Wang, University of California, San Francisco
Neeraj Kumar, University of Alberta, Edmonton, Alberta, Canada
Swaraj Nanda, Memorial Sloan Kettering Cancer Center
Nicholas Sanchez, University of California Berkeley
Jacob Cleaves, Memorial Sloan Kettering Cancer Center
Brenda Fried, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0001-5972-6146
Sean Paulsen, Memorial Sloan Kettering Cancer Center
Ethan Yan, Memorial Sloan Kettering Cancer Center, https://orcid.org/0009-0005-6332-6803
Ali Kamali, Memorial Sloan Kettering Cancer Center
Argho Sarkar, Memorial Sloan Kettering Cancer Center
Allyne Manzo, Memorial Sloan Kettering Cancer Center, https://orcid.org/0009-0004-3056-1180
Jeeyeon Baik, Memorial Sloan Kettering Cancer Center
Irem Isgor, Memorial Sloan Kettering Cancer Center
Cesar Colorado-Jimenez, Memorial Sloan Kettering Cancer Center
Anthony Cardillo, Memorial Sloan Kettering Cancer Center
Leonardo Boiocchi, Memorial Sloan Kettering Cancer Center
Aijazuddin Syed, Memorial Sloan Kettering Cancer Center
David Kim, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0002-7593-6052
Brie Kezlarian-Sachs, Memorial Sloan Kettering Cancer Center
Maly Fenelus, Memorial Sloan Kettering Cancer Center
Alexander Chan, Department of Pathology, https://orcid.org/0000-0001-5884-8424
Mariko Yabe, Memorial Sloan Kettering Cancer Center
Samuel McCash, Memorial Sloan Kettering Cancer Center
Menglei Zhu, Memorial Sloan Kettering Cancer Center
Simon Mantha, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0003-4277-5261
Orly Ardon, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0001-8147-933X
Lauren McVoy, Memorial Sloan Kettering Cancer Center
Wenbin Xiao, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0001-8586-8500
Mikhail Roshal, Memorial Sloan Kettering Cancer Center
Oscar Lin, Memorial Sloan Kettering Cancer Center
Ahmet Dogan, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0001-6576-5256
Iain Carmichael, University of North Carolina at Chapel Hill
Chad Vanderbilt, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0002-8114-0237
Gregory Goldgof, Memorial Sloan Kettering Cancer Center, https://orcid.org/0000-0001-8732-9834

Publication details
Posted: October 31st, 2025 (version 1)
Status: under review at Communications Medicine (Nature Portfolio)
DOI: https://doi.org/10.21203/rs.3.rs-6933141/v1
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Platform: Research Square, ISSN 2693-5015 (online)
