Causal Fairness in Black-Box AI: A Counterfactual Auditing Framework for Deep Models

Research Article | Research Square

Nurul Hakim Asif

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7023047/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 (latest preprint version).

Abstract

As artificial intelligence (AI) systems become increasingly embedded in critical areas such as finance, criminal justice, employment, and healthcare, questions of fairness and accountability are no longer theoretical; they are urgent. Many of the most influential models in these domains operate as black boxes, producing decisions that are difficult to interpret and even more difficult to audit.
Traditional fairness metrics, such as Demographic Parity and Equalized Odds, assess disparities across groups but often miss subtler, individual-level biases and fail to consider the causal pathways that link protected attributes to decisions.

This paper introduces a model-agnostic framework for evaluating fairness in black-box AI models using counterfactual reasoning. We propose the Counterfactual Fairness Gap (CFG), a novel metric that quantifies how frequently an individual’s predicted outcome would change if their protected attribute (e.g., race or gender) were counterfactually altered, while maintaining causal consistency through a structural causal model (SCM).

Our framework does not require access to internal model architecture or training data, making it broadly applicable in real-world scenarios where models are proprietary or opaque. We apply this method to two widely studied datasets, COMPAS and UCI Adult Income, using three commonly deployed classifiers: Deep Neural Networks, XGBoost, and Random Forests. Empirical results show that CFG identifies significant fairness violations that remain undetected by traditional statistical metrics.

In addition to its technical utility, the framework provides practical and regulatory benefits. It supports both pre-deployment and post-deployment auditing and aligns with global AI governance initiatives such as the EU AI Act. By combining causal rigor with operational flexibility, our approach offers a powerful tool for identifying and addressing fairness risks in modern AI systems.

Subject areas: Artificial Intelligence and Machine Learning; Theoretical Computer Science

Keywords: Counterfactual Fairness, Black-box AI Models, Structural Causal Model (SCM), Algorithmic Bias Auditing, Counterfactual Fairness Gap (CFG)

Additional Declarations: The authors declare no competing interests.
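The idea behind the Counterfactual Fairness Gap in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes CFG is the fraction of individuals whose prediction flips under a counterfactual change of the protected attribute. Everything concrete here is hypothetical and chosen only for illustration: the toy linear SCM (protected attribute `a` shifts feature `x1` by `EFFECT_OF_A_ON_X1`, while `x2` is unaffected), the two stand-in "black-box" models, and the four data rows. It is not the author's implementation and uses neither COMPAS nor UCI Adult Income.

```python
from typing import Callable, List, Tuple

# One individual: (protected attribute a in {0, 1}, feature x1, feature x2).
Row = Tuple[int, float, float]

# Hypothetical structural equation of the toy SCM: x1 = EFFECT_OF_A_ON_X1 * a + u1.
EFFECT_OF_A_ON_X1 = 2.0


def counterfactual(row: Row) -> Row:
    """Flip the protected attribute and propagate the change through the SCM."""
    a, x1, x2 = row
    u1 = x1 - EFFECT_OF_A_ON_X1 * a          # abduction: recover the noise term
    a_cf = 1 - a                             # action: intervene on a
    x1_cf = EFFECT_OF_A_ON_X1 * a_cf + u1    # prediction: recompute the descendant
    return (a_cf, x1_cf, x2)                 # x2 is not caused by a, so it is kept


def counterfactual_fairness_gap(
    predict: Callable[[Row], int], rows: List[Row]
) -> float:
    """Share of individuals whose predicted label changes under the counterfactual.

    `predict` is treated as a black box: only its outputs are queried.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    flips = sum(predict(r) != predict(counterfactual(r)) for r in rows)
    return flips / len(rows)


def proxy_model(row: Row) -> int:
    """Stand-in model that never reads a directly but relies on its descendant x1."""
    _, x1, x2 = row
    return int(x1 + x2 > 2.5)


def independent_model(row: Row) -> int:
    """Stand-in model that uses only x2, which a does not influence."""
    _, _, x2 = row
    return int(x2 > 0.5)


rows: List[Row] = [(0, 0.5, 1.0), (1, 2.5, 1.0), (0, 1.0, 0.2), (1, 4.5, 0.5)]

print(counterfactual_fairness_gap(proxy_model, rows))        # 0.75
print(counterfactual_fairness_gap(independent_model, rows))  # 0.0
```

In this toy setting the first model never sees the protected attribute, yet three of four predictions flip because the attribute acts through `x1`. This is the kind of pathway-level effect that a counterfactual audit is meant to surface and that simply dropping the protected column would not remove.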
Author information: Nurul Hakim Asif (corresponding author), Nantong University. ORCID: https://orcid.org/0009-0000-8415-0716
Posted on Research Square: July 3rd, 2025. ISSN 2693-5015 (online).