From Fair Graphs to Fair Data: A DAG-Based Approach to Mitigating Bias in AI Systems

doi:10.21203/rs.3.rs-6832455/v1

2025 · doi:10.21203/rs.3.rs-6832455/v1

preprint OA: closed

Research Article

Vivian Wei Jiang (corresponding author), Gustavo Batista, Michael Bain
UNSW Sydney

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-6832455/v1

This work is licensed under a CC BY 4.0 License.

Status: Under Review at Knowledge and Information Systems. Version 1 posted 3 July 2025.

Editorial decision: Revision requested, 24 Jun, 2025
Editor assigned by journal: 23 Jun, 2025
Submission checks completed at journal: 06 Jun, 2025
First submitted to journal: 05 Jun, 2025

Additional Declarations: No competing interests reported.

Abstract

Ensuring fairness when training Machine Learning (ML) models remains a critical challenge, particularly when biases are embedded in the underlying data. This paper presents a fairness-aware graph structure learning framework demonstrating how learning fair graphs leads to fairer data for ML training and, consequently, fairer Artificial Intelligence (AI) decisioning based on such models. Our method incorporates a fairness regularization term into score-based structure learning algorithms, guiding the search towards graph structures that minimize discriminatory pathways while preserving statistical relationships. The learned fair graph structures enable the generation of synthetic datasets with mitigated biases, which can be used to train diverse ML models.

This modification is non-trivial, as structure learning algorithms rely on local search strategies, while fairness is a global property that depends on the entire graph structure. Our framework is highly adaptable, compatible with various structure learning algorithms, and seamlessly incorporates different fairness metrics to meet specific contextual needs. Extensive experiments on both real-world and synthetic datasets demonstrate that our approach significantly improves fairness while maintaining competitive predictive performance, offering an interpretable and versatile solution for mitigating bias in AI systems.

Keywords: ML Fairness, Bias Mitigation, Bayesian Networks, Generative Models, Graph Structure Learning
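The abstract describes adding a fairness regularization term to a score-based structure search but does not give the score, the fairness metric, or the search procedure. The following is a minimal illustrative sketch under assumptions of our own, not the authors' method: binary variables, a BIC-style fit score, a first-improvement hill climb over single-edge additions and removals, and a toy fairness penalty equal to the number of directed paths from a sensitive attribute to the outcome. The function names, the weight `lam`, the synthetic data, and the restriction that the sensitive attribute is a root and the outcome a sink are all hypothetical. It shows why the penalty is global: each move changes only one edge, yet the penalty must be re-evaluated on the whole graph.

```python
import itertools
import math
import random
from collections import Counter

def local_bic(data, child, parents):
    """BIC contribution of one binary node given its parent set."""
    n = len(data)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    n_params = 2 ** len(parents)  # one free parameter per parent configuration
    return loglik - 0.5 * math.log(n) * n_params

def has_path(edges, src, dst):
    """Depth-first search for a directed path from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(v for (u, v) in edges if u == node)
    return False

def count_paths(edges, src, dst):
    """Number of directed paths from src to dst (edges must form a DAG)."""
    if src == dst:
        return 1
    return sum(count_paths(edges, v, dst) for (u, v) in edges if u == src)

def penalized_score(data, nodes, edges, sensitive, outcome, lam):
    """Decomposable fit score minus a global (whole-graph) fairness penalty."""
    fit = sum(local_bic(data, v, sorted(u for (u, w) in edges if w == v))
              for v in nodes)
    return fit - lam * count_paths(edges, sensitive, outcome)

def hill_climb(data, nodes, sensitive, outcome, lam):
    """First-improvement search over single-edge additions and removals."""
    edges = set()
    best = penalized_score(data, nodes, edges, sensitive, outcome, lam)
    improved = True
    while improved:
        improved = False
        for u, v in itertools.permutations(nodes, 2):
            if v == sensitive or u == outcome:
                continue  # assumption: sensitive is a root, outcome is a sink
            if (u, v) in edges:
                candidate = edges - {(u, v)}
            elif has_path(edges, v, u):
                continue  # adding u -> v would create a cycle
            else:
                candidate = edges | {(u, v)}
            score = penalized_score(data, nodes, candidate,
                                    sensitive, outcome, lam)
            if score > best + 1e-9:
                edges, best, improved = candidate, score, True
    return edges

def make_data(n=2000, seed=0):
    """Synthetic rows (S, M, Y): S affects Y directly and through M."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = rng.random() < 0.5
        m = rng.random() < (0.8 if s else 0.2)
        y = rng.random() < (0.2 + 0.3 * s + 0.4 * m)
        rows.append((int(s), int(m), int(y)))
    return rows

data = make_data()
plain = hill_climb(data, [0, 1, 2], sensitive=0, outcome=2, lam=0.0)
fair = hill_climb(data, [0, 1, 2], sensitive=0, outcome=2, lam=1e6)
print("unpenalized edges:", sorted(plain))
print("penalized edges:  ", sorted(fair))
```

With `lam=0.0` the search is free to keep edges that link the sensitive attribute to the outcome; with a very large `lam` any edge that would open such a path is rejected. A path count is a crude stand-in for a fairness metric, and the abstract states that the framework accepts different metrics, so this term is the part one would expect to swap out.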
