Identification of cancer associated biomarkers by analysing biologically enriched clusters using MSA^GK_CL

Samir Kumar Sett, Subir Hazra, Jadav Chandra Das, Anupam Ghosh, Sahabul Alam, and Arunangshu Pal

Research Square. This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8817220/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted 12. You are reading this latest preprint version.

Abstract

Fuzzy clustering techniques have recently gained ground in a range of medical applications. Over the last decades, computational tools have considerably helped physicians and biologists find cancer-mediating biomarkers. Building on these advances, we introduce a fuzzy clustering-based methodology to find genes associated with particular cancers. The method (MSA^GK_CL) applies the Gustafson-Kessel (GK) algorithm to gene expression datasets covering normal and carcinogenic states, showing its effectiveness on gene expression profiles. The GK algorithm adapts to non-spherical cluster shapes, which is crucial in gene expression analysis. To choose the number of clusters, we used cluster validity indices: Xie-Beni (XB), Fukuyama-Sugeno (FS), and the Dunn index. These indices give a quantitative basis for assessing clustering quality and help select the best clustering configuration. The second component is a dynamic threshold-based membership score analysis to identify significant genes. This analysis computes the maximum absolute differences in membership scores between the cancerous and normal datasets and uses a percentile of these differences to select the threshold adaptively. The method is validated with precision, recall, and F1-score, demonstrating its usefulness in discovering genes related to cancer. This integrative framework opens a promising path for future research in cancer genomics, helping identify therapeutic targets for personalized medicine.

Subject areas: Health sciences/Biomarkers; Biological sciences/Cancer; Biological sciences/Computational biology and bioinformatics

Keywords: Gustafson-Kessel Algorithm, Fuzzy Clustering, Membership Score Analysis, Validity Indices, Gene Expression Analysis, Adaptive Thresholding

Additional Declarations: No competing interests reported.
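The membership score analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption: the fuzzy membership matrices for the normal and cancerous datasets are taken as already computed (for example by a GK clustering run) with one row per gene and clusters aligned across the two conditions, the percentile value is arbitrary, and the function names `select_genes` and `precision_recall_f1` and the toy data are hypothetical.

```python
import numpy as np


def select_genes(u_normal, u_cancer, percentile=90.0):
    """Flag genes whose fuzzy memberships shift most between conditions.

    u_normal, u_cancer: arrays of shape (n_genes, n_clusters) holding
    membership scores; clusters are ASSUMED to be aligned column by column.
    percentile: ASSUMED cut-off; the abstract does not state the value used.
    """
    u_normal = np.asarray(u_normal, dtype=float)
    u_cancer = np.asarray(u_cancer, dtype=float)
    if u_normal.shape != u_cancer.shape:
        raise ValueError("membership matrices must have the same shape")
    # Per-gene score: maximum absolute membership difference over clusters.
    scores = np.abs(u_cancer - u_normal).max(axis=1)
    # Adaptive threshold: a percentile of the observed scores.
    threshold = np.percentile(scores, percentile)
    selected = np.flatnonzero(scores > threshold)
    return selected, scores, threshold


def precision_recall_f1(selected, known):
    """Compare selected gene indices against a reference set of gene indices."""
    selected = {int(i) for i in selected}
    known = {int(i) for i in known}
    tp = len(selected & known)
    precision = tp / len(selected) if selected else 0.0
    recall = tp / len(known) if known else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 5 genes, 2 clusters.
u_normal = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
u_cancer = [[0.1, 0.9], [0.5, 0.5], [0.25, 0.75], [0.55, 0.45], [0.9, 0.1]]

selected, scores, threshold = select_genes(u_normal, u_cancer, percentile=60.0)
# Hypothetical reference set of cancer-related gene indices.
precision, recall, f1 = precision_recall_f1(selected, known=[0, 2])
```

Using a percentile of the observed differences ties the cut-off to each dataset's own score distribution instead of a fixed constant, which is one plausible reading of "adaptively select the threshold".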
Editorial history
Reviews received at journal: 19 May, 2026
Reviewers agreed at journal: 18 May, 2026
Reviewers agreed at journal: 16 May, 2026
Reviewers agreed at journal: 26 Apr, 2026
Reviewers agreed at journal: 26 Mar, 2026
Reviewers agreed at journal: 21 Mar, 2026
Reviewers agreed at journal: 12 Mar, 2026
Reviewers invited by journal: 09 Mar, 2026
Editor assigned by journal: 09 Mar, 2026
Editor invited by journal: 15 Feb, 2026
Submission checks completed at journal: 10 Feb, 2026
First submitted to journal: 10 Feb, 2026
Authors and affiliations
Samir Kumar Sett, Techno India Salt Lake
Subir Hazra, Meghnad Saha Institute of Technology
Jadav Chandra Das, Maulana Abul Kalam Azad University of Technology
Anupam Ghosh, Netaji Subhash Engineering College
Sahabul Alam, Brainware University
Arunangshu Pal, Manipal University Jaipur (corresponding author)

Journal: Scientific Reports
Posted date: March 13th, 2026
© Research Square 2026 | ISSN 2693-5015 (online)
