Reliable CNN Evaluation in Medical Imaging via Variance-Aware Cross-Validation

Peter Abban, Mehdi Taassori (Óbuda University)

Research Square | Systematic Review
This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8807781/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted, Version 1 (February 12th, 2026)

Abstract

Reliable evaluation and generalizable hyperparameter selection remain critical challenges in deep learning–based medical image analysis, particularly under limited, imbalanced, and heterogeneous data conditions. This paper proposes a Variance-Aware K-Fold Cross-Validation framework for robust hyperparameter optimization of convolutional neural networks (CNNs).
Unlike conventional single-run or mean-based cross-validation strategies, the proposed framework introduces a variance-regularized objective function that jointly maximizes mean validation performance while explicitly penalizing fold-to-fold variability, thereby promoting stability and generalization. The approach is systematically integrated with Bayesian optimization and Tree-structured Parzen Estimator (TPE) methods and evaluated across multiple optimization libraries, demonstrating its library-agnostic applicability. Extensive experiments under varying K-Fold configurations show that variance-aware optimization consistently mitigates the optimistic bias of single-run evaluations and identifies hyperparameter configurations with superior robustness and reproducibility. A theoretical analysis further establishes variance-aware generalization error bounds and a reliability ordering principle, providing formal justification for the proposed optimization criterion. Empirical validation on a multi-class breast ultrasound imaging dataset confirms improved performance stability and reduced variance across folds. Overall, the proposed framework offers a principled, reproducible, and architecture-independent evaluation strategy that enhances the reliability of CNN-based medical imaging systems and is readily extensible to other data-limited clinical applications.

Subject area: Artificial Intelligence and Machine Learning

Keywords: Medical Imaging, Convolutional Neural Networks (CNNs), Hyperparameter Optimization, Variance-Aware Cross-Validation, Model Reliability

Additional Declarations
The authors declare no competing interests.

Supplementary Files
SupplementaryMaterialChap45.pdf
TableofResultsforMacroandPerClassMetrics.pdf
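The variance-regularized objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form used here (mean fold score minus a weighted standard deviation) is an assumption, as are the names `variance_aware_score`, `k_fold_indices`, `evaluate_config`, the `penalty` weight, and the `train_and_score` callback standing in for CNN training on one fold.

```python
import random
import statistics
from typing import Any, Callable, List, Sequence


def variance_aware_score(fold_scores: Sequence[float], penalty: float = 1.0) -> float:
    """Mean fold score minus a penalty on fold-to-fold spread.

    Assumed form: mean - penalty * std. Both the use of the standard
    deviation and the `penalty` weight are illustrative choices, not
    taken from the paper.
    """
    mean = statistics.fmean(fold_scores)
    spread = statistics.pstdev(fold_scores)  # population std over the K folds
    return mean - penalty * spread


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and split them into k nearly equal validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def evaluate_config(
    config: Any,
    train_and_score: Callable[[Any, List[int], List[int]], float],
    n_samples: int,
    k: int = 5,
    penalty: float = 1.0,
) -> float:
    """Run K-fold validation for one hyperparameter configuration.

    `train_and_score` is a hypothetical user-supplied function that trains
    a model on the training indices and returns a validation metric
    (higher is better) on the validation indices.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Training set is every fold except the held-out one.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(train_and_score(config, train_idx, val_idx))
    return variance_aware_score(scores, penalty)


if __name__ == "__main__":
    # Made-up fold accuracies for two configurations, for illustration only.
    unstable = [0.90, 0.70, 0.95, 0.65, 0.90]  # higher mean, large spread
    stable = [0.81, 0.80, 0.79, 0.80, 0.80]    # slightly lower mean, small spread
    print("mean-based:    ", statistics.fmean(unstable), statistics.fmean(stable))
    print("variance-aware:", variance_aware_score(unstable), variance_aware_score(stable))
```

With these illustrative numbers, a mean-based criterion ranks the unstable configuration first, while the penalized score ranks the stable one first. The scalar returned by `evaluate_config` could be handed to any optimizer (for example a TPE-based one) as the value to maximize, which is one way to read the abstract's claim of library-agnostic applicability.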