Does Class Imbalance Undermine SHAP Explanations? A Multi-Domain Empirical Study of Feature Attribution Stability

Research Article

Minyeong KIM

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9489204/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1, posted April 23, 2026.
Additional Declarations: No competing interests reported.
Supplementary Files: Fig1.pdf, Fig2.pdf, Fig3.pdf, Fig4.pdf, Fig5.pdf

Abstract

SHAP (SHapley Additive exPlanations) is the most widely used post-hoc explanation method for machine learning models, yet its reliability under class imbalance, the dominant condition in real-world classification tasks, remains poorly understood. This study presents a systematic empirical evaluation of how class imbalance affects SHAP explanation stability across 10 benchmark datasets (8 domains), 6 model architectures, 7 imbalance ratios (1:1 to 1:100), and 6 mitigation strategies. It proposes a five-dimensional evaluation framework assessing rank stability, Top-K overlap, magnitude consistency, direction fidelity, and value divergence.

Results reveal three key findings. First, models trained on imbalanced data produce SHAP explanations that diverge significantly from balanced-data references across all five dimensions, with effects detectable at ratios as mild as 1:2 (Wilcoxon p < 0.001, Cohen’s d = 0.813 to 2.134). Second, standard resampling techniques (SMOTE, ADASYN, random undersampling) counterintuitively increase this divergence in 56–68% of cases, while class weighting, the only strategy that does not actively worsen stability, produces a practically negligible improvement (+0.018 in rank stability). Third, explanation dimensions differ markedly in sensitivity: Top-K feature overlap drops to 0.52 at 1:100, while magnitude consistency remains above 0.86.

These findings show that class imbalance alters the feature relationships models learn, and that SHAP faithfully captures these altered relationships, producing explanations that may mislead practitioners who assume invariance across class distributions. Practical guidelines are provided for obtaining more trustworthy explanations from imbalanced datasets.

Keywords: Explainable AI, SHAP, Class imbalance, Feature importance stability, Resampling, Trustworthy AI
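The abstract names five comparison dimensions but this page does not give their formulas. The following is a minimal sketch, assuming one plausible definition for each: Spearman correlation of mean absolute SHAP values for rank stability, set overlap of the K largest features for Top-K overlap, cosine similarity for magnitude consistency, sign agreement of mean signed SHAP values for direction fidelity, and Jensen-Shannon distance between normalized importances for value divergence. The function name `compare_attributions`, the parameter `k`, and all five definitions are illustrative assumptions, not the paper's own method.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import spearmanr


def compare_attributions(ref, alt, k=3):
    """Compare two SHAP matrices of shape (n_samples, n_features).

    `ref` stands for attributions from a balanced-data model and `alt` for
    attributions from an imbalanced-data model. All five metric definitions
    below are assumptions made for illustration.
    """
    ref = np.asarray(ref, dtype=float)
    alt = np.asarray(alt, dtype=float)

    # Global importance per feature: mean absolute SHAP value.
    imp_ref = np.abs(ref).mean(axis=0)
    imp_alt = np.abs(alt).mean(axis=0)

    # Rank stability: Spearman correlation of the two importance vectors.
    rank_stability = float(spearmanr(imp_ref, imp_alt)[0])

    # Top-K overlap: fraction of the K most important features shared.
    top_ref = set(np.argsort(-imp_ref)[:k].tolist())
    top_alt = set(np.argsort(-imp_alt)[:k].tolist())
    topk_overlap = len(top_ref & top_alt) / k

    # Magnitude consistency: cosine similarity of the importance vectors.
    magnitude_consistency = float(
        imp_ref @ imp_alt / (np.linalg.norm(imp_ref) * np.linalg.norm(imp_alt))
    )

    # Direction fidelity: share of features whose mean signed SHAP value
    # has the same sign in both matrices.
    direction_fidelity = float(
        np.mean(np.sign(ref.mean(axis=0)) == np.sign(alt.mean(axis=0)))
    )

    # Value divergence: Jensen-Shannon distance between the importance
    # vectors after normalizing each to sum to 1.
    value_divergence = float(
        jensenshannon(imp_ref / imp_ref.sum(), imp_alt / imp_alt.sum(), base=2)
    )

    return {
        "rank_stability": rank_stability,
        "topk_overlap": topk_overlap,
        "magnitude_consistency": magnitude_consistency,
        "direction_fidelity": direction_fidelity,
        "value_divergence": value_divergence,
    }


# Toy attribution matrices (made-up numbers, two samples, four features).
balanced = np.array([[3.0, -2.0, 1.0, 0.5], [3.0, -2.0, 1.0, 0.5]])
imbalanced = np.array([[0.5, -2.0, 1.0, -3.0], [0.5, -2.0, 1.0, -3.0]])
for name, value in compare_attributions(balanced, imbalanced, k=2).items():
    print(f"{name}: {value:.3f}")
```

In practice the two matrices would come from a SHAP explainer applied to the same evaluation samples under both models, so that differences reflect the models and not the data being explained.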
