Graph-Based Community Detection of Oil Families Using Integrated n-Alkane and Biomarker Fingerprints: Cretaceous Reservoirs in Abadan Plain, SW Iran

Research Square | Article

Ahmad Batvandi, Ali Shekarifard, Golnaz Joozani-Kohan, Asal Naseri

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9534746/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review (Version 1)

Abstract

Resolving molecular heterogeneity among crude oils is essential for reconstructing petroleum system evolution, particularly in structurally complex, multi-charge basins. This study applies a graph-based community detection framework to delineate oil families within Cretaceous reservoirs of the Abadan Plain (Zagros Fold–Thrust Belt, SW Iran) using integrated high-resolution n-alkane and biomarker fingerprints.
Twenty-four reservoired crude oils were characterized using molecular ratios derived from high-resolution chromatography and represented as weighted similarity networks constructed through a Pearson correlation–based k-nearest-neighbor approach (k = 5). Community structure was identified using the Leiden algorithm, which detects emergent molecular populations without imposing predefined geometric clustering constraints.

Independent analyses of the bulk (n-alkane) and biomarker domains revealed internally coherent yet compositionally distinct architectures. The bulk network exhibited strong modular organization (modularity Q ≈ 0.50; silhouette S ≈ 0.50), whereas the biomarker network demonstrated pronounced genetic cohesion (Q ≈ 0.46–0.50; S ≈ 0.52). Integrating both feature domains preserved relational topology and resolved four well-defined oil families in the unified network (Q ≈ 0.40–0.46), despite reduced silhouette separation (S ≈ 0.29) that reflects increased molecular complexity following data fusion. Robustness testing across graph densities (k = 3–10) confirmed that community assignments persist (adjusted Rand index, ARI ≈ 1.0 for k ≥ 5), indicating that the inferred oil families are not artifacts of parameterization.

Cross-domain comparison highlights complementary contributions from bulk compositional variability and genetically diagnostic biomarker parameters. The strong stratigraphic association observed in the biomarker network (Cramér's V = 1.00, p < 0.01) provides independent geological validation of the network-derived classification, whereas the integrated network shows a similarly significant but only moderate association. The results are consistent with a petroleum accumulation history involving multiple charge events rather than a single homogeneous filling episode.
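The network construction and the modularity score Q mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the six toy "oils", their four ratios, the choice k = 2, the rule of dropping non-positive correlations, and the hand-assigned family labels are all assumptions made for illustration. The Leiden step itself is not reproduced here; in practice a Leiden library would propose the labels that this sketch simply takes as given.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length ratio profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def knn_graph(samples, k):
    """Link each sample to its k most correlated peers (undirected, weighted).

    Assumption: edges with non-positive correlation are dropped.
    """
    edges = {}
    n = len(samples)
    for i in range(n):
        sims = sorted(
            ((pearson(samples[i], samples[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for r, j in sims[:k]:
            if r > 0:
                edges[(min(i, j), max(i, j))] = r
    return edges

def modularity(edges, labels):
    """Weighted Newman modularity: Q = sum_c [ w_in(c)/m - (s(c)/(2m))^2 ]."""
    m = sum(edges.values())
    inside, strength = {}, {}
    for (i, j), w in edges.items():
        if labels[i] == labels[j]:
            inside[labels[i]] = inside.get(labels[i], 0.0) + w
        for node in (i, j):
            c = labels[node]
            strength[c] = strength.get(c, 0.0) + w
    return sum(inside.get(c, 0.0) / m - (s / (2 * m)) ** 2
               for c, s in strength.items())

# Hypothetical toy data: two groups of oils with opposite ratio trends.
oils = [
    [1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 3.2, 3.9], [0.9, 2.2, 2.9, 4.1],
    [4.0, 3.0, 2.0, 1.0], [4.2, 2.9, 2.1, 1.0], [3.9, 3.1, 1.8, 1.2],
]
weights = knn_graph(oils, k=2)
q_two_families = modularity(weights, [0, 0, 0, 1, 1, 1])
q_one_family = modularity(weights, [0] * 6)
print(len(weights), round(q_two_families, 3), round(q_one_family, 3))
```

On this toy input the graph separates into two disconnected triangles, so the two-family labelling scores close to the two-community maximum of 0.5, while lumping all oils into one family scores zero. Repeating the construction over a range of k and comparing the resulting partitions is the kind of robustness check the abstract reports with the ARI.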
By modelling oil–oil similarity as relational topology rather than purely geometric distance, this framework captures nonlinear molecular affinities and transitional relationships, offering a reproducible and transferable approach for oil family classification in complex petroleum systems.

Subject areas: Biological sciences/Computational biology and bioinformatics; Earth and environmental sciences/Solid earth sciences

Keywords: Oil families, geochemical fingerprinting, Biomarkers, n-alkanes, Community detection, Graph-based clustering, Cretaceous reservoirs

Additional Declarations: No competing interests reported.

Editorial timeline
Reviews received at journal: 12 May, 2026
Reviews received at journal: 11 May, 2026
Reviewers agreed at journal: 07 May, 2026
Reviewers agreed at journal: 05 May, 2026
Reviewers agreed at journal: 04 May, 2026
Reviewers invited by journal: 04 May, 2026
Editor invited by journal: 04 May, 2026
Editor assigned by journal: 27 Apr, 2026
Submission checks completed at journal: 27 Apr, 2026
First submitted to journal: 26 Apr, 2026

Author affiliations: Ahmad Batvandi (corresponding author), Ali Shekarifard, and Golnaz Joozani-Kohan, University of Tehran; Asal Naseri, Isfahan University of Technology.
Journal: Scientific Reports (under review)
Posted: May 12th, 2026