Symbolic Recurrence: A Framework for Linguistic Biomarker Discovery in Speech

preprint OA: closed
Kevin Mekulu, Faisal Aqlan, Hui Yang

Posted on Research Square. This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6356840/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

The rising prevalence of neurocognitive disorders, such as Alzheimer's disease (AD), poses a significant global health challenge. Traditional analytical methods, including clinical interviews and paper-based tests like the Mini-Mental State Examination (MMSE), Mini-Cog Test, and Montreal Cognitive Assessment (MoCA), are limited by subjectivity, memory biases, and interviewer variability.

To advance our understanding of cognitive-linguistic patterns, this study introduces an interpretable artificial intelligence (AI) architecture to analyze distinctive speech characteristics between healthy individuals and those with dementia. Initially, each unique character in the speech transcripts is encoded as a distinct number, enabling fine-grained analysis of linguistic patterns. Recurrence Quantification Analysis (RQA) is then applied to generate recurrence plots, capturing the dynamic temporal structures and non-linear patterns in speech production. From these plots, we extract salient features using deep metric learning with Siamese networks, which learn to represent and differentiate essential linguistic characteristics in a meaningful embedding space.

This novel architecture enables the discovery of subtle yet significant differences in language patterns between groups. Our approach reveals distinct linguistic signatures, demonstrating clear separability between healthy and dementia-related speech patterns, as evidenced by quantitative evaluation metrics. This research advances our understanding of the underlying linguistic indicators of cognitive disorders, providing insights into the characteristic patterns of language changes associated with cognitive decline. These findings contribute to the development of more nuanced and interpretable approaches for analyzing cognitive-linguistic patterns in clinical settings.

Subject areas
Biological sciences/Neuroscience/Cognitive neuroscience
Biological sciences/Computational biology and bioinformatics/Computational models
Biological sciences/Psychology
Physical sciences/Mathematics and computing/Computer science

Additional Declarations
No competing interests reported.
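The first two steps described in the abstract (encoding each unique character as a distinct number, then building a recurrence plot) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `encode_characters`, `recurrence_matrix`, and `recurrence_rate` are hypothetical, and the recurrence criterion used here (two positions recur when they hold the same symbol) is an assumption, since the abstract does not state the criterion, embedding, or thresholds the paper uses. The Siamese-network stage is not sketched.

```python
import numpy as np


def encode_characters(text):
    """Map each unique character to a distinct integer.

    Codes are assigned in order of first appearance (an assumption;
    the abstract only says each unique character gets a distinct number).
    """
    codes = {}
    sequence = []
    for ch in text:
        if ch not in codes:
            codes[ch] = len(codes)
        sequence.append(codes[ch])
    return np.array(sequence, dtype=int), codes


def recurrence_matrix(sequence):
    """Binary matrix with R[i, j] = 1 when positions i and j hold the same symbol."""
    seq = np.asarray(sequence)
    return (seq[:, None] == seq[None, :]).astype(np.uint8)


def recurrence_rate(matrix):
    """Fraction of off-diagonal points that are recurrent (a basic RQA measure)."""
    n = matrix.shape[0]
    if n < 2:
        return 0.0
    off_diagonal = int(matrix.sum()) - int(np.trace(matrix))
    return off_diagonal / (n * (n - 1))


# Example usage on a toy transcript fragment.
sequence, codes = encode_characters("the cat and the hat")
plot = recurrence_matrix(sequence)
rate = recurrence_rate(plot)
```

The resulting matrix is symmetric with ones on the main diagonal, so it can be rendered as an image and passed to a downstream feature extractor. Excluding the diagonal from the recurrence rate avoids counting each position's trivial match with itself.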
Author affiliations
Kevin Mekulu (corresponding author), Pennsylvania State University
Faisal Aqlan, University of Louisville
Hui Yang, Pennsylvania State University

Posted: April 29th, 2025
Research Square | ISSN 2693-5015 (online)

