Temporal Elements of Speech in Mania

Jeremiah Joyce, Ivan Ayala, Sanjeev Mishra, George Chatzisofroniou, Erik Clemens, Hang Yu, Zachi Attia, Baihan Lin, and Mark Frye

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7613536/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 posted 12

Abstract

Computational analysis of speech enables precise, objective measurements of a behavioral signal closely linked to manic episodes. In a prospective cohort study conducted in Rochester, MN (ClinicalTrials.gov Identifier: NCT05956340), eighteen English-speaking adults voluntarily hospitalized for a manic episode of Bipolar I Disorder between August 2023 and September 2024 were enrolled. Participants completed semi-structured research interviews during both acute episodes and remission, with manic symptom severity assessed at each session using the Young Mania Rating Scale (YMRS).
Across 57 recorded interviews, temporal prosodic features were extracted using automated computational algorithms. After adjusting for individual differences and moderating effects, participant articulation rate (β = 0.24, p = 0.0038), speech rate variability (β = 0.23, p = 0.0041), floor transfer offset (β = -0.31, p = 0.007), and speaking duration (β = 0.54, p < 0.001) were significantly associated with manic symptom severity. These temporal elements classified manic (CV AUC = 0.85) and remission (CV AUC = 0.83) status, highlighting their potential as monitoring and prognostic biomarkers for manic episodes.

Health sciences/Biomarkers/Prognostic markers
Biological sciences/Psychology/Human behaviour
Health sciences/Diseases/Psychiatric disorders/Bipolar disorder

Introduction

Temporal elements are frequently emphasized in descriptions of severe affective states. Mood episodes are defined not only by the duration and frequency of their symptoms but also by disruptions in circadian timing, the tempo and sustainability of behaviors, the rate and sequencing of thoughts, and behavioral latencies that discriminate mania from depression. Others have suggested that even the subjective experience of time is biased, both during severe mood episodes [1] and more generally in ordinary life [2], which underscores the importance of quantitative measurements.

Much like other human behaviors, the temporal dynamics of speech production are markedly altered during episodes of mental illness. Current and historical diagnostic criteria for Bipolar Disorder (BD) highlight increased rate and quantity of speech as symptoms of manic episodes [3]. In clinical training, mental health professionals are taught to recognize alterations in speech patterns consistent with specific mental illnesses through informal bedside teaching by senior clinicians. However, this education rarely draws on formal training in linguistics or speech-language pathology.
Moreover, clinical assessments of speech are conducted without measurement devices, limiting their precision and reliability. As a result, qualitative descriptions of speech are difficult to communicate consistently in the medical record and are often of limited utility to other clinicians or for longitudinal tracking.

Recent advancements in artificial intelligence and computational linguistics have enabled the automated extraction of lexical content and prosodic parameters from the human voice. The close relationship between spoken language and overall brain function has prompted research into how those parameters relate to specific psychiatric conditions, such as mania. A recent article [4] reviewed the state of the field and commented on the surprising omission of temporal speech elements with regard to BD. The objectives of this study were to capture high-quality audio recordings of human speech during episodes of mania and to develop computational tools that quantify temporal elements of speech.

Methods

Study design and oversight: The Computational Analysis of Spoken Language in Mania (CASLIM) study was a single-site cohort investigation aimed at identifying speech alterations in mania relative to participants' baseline speaking patterns, using audio recordings of semi-structured interviews. Following voluntary written informed consent, as approved by the Mayo Clinic Institutional Review Board (IRB# 22-010487), participants were enrolled between August 2023 and September 2024 and received financial compensation for completing study tasks. The study protocol was conducted in accordance with the Declaration of Helsinki and prospectively registered on ClinicalTrials.gov (NCT05956340). This study adhered to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.
Participants: Patients were recruited from the emergency department or psychiatric inpatient unit of a tertiary care hospital in the Midwestern United States. Adults aged 18–75 years with a Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision (DSM-5-TR) diagnosis of Bipolar I Disorder, current manic episode, as determined by clinical staff, were eligible for inclusion. Eligible patients were identified through review of electronic medical records and consultation with treating clinicians. Research staff approached patients within 48 hours of admission (Hospital Initial) to explain study procedures and obtain written informed consent. Exclusion criteria included inability to provide informed consent, comorbid neurological disorders, paranoid delusions regarding electronic surveillance, aggression, and current substance-induced mania. Follow-up assessments were conducted throughout the hospitalization (Hospital Interval) and at discharge (Hospital Discharge), with an optional post-discharge assessment at 2–4 weeks (Post Discharge).

Procedures:

Equipment

During audio recording sessions, participants wore a C544 L headset microphone (AKG, USA) and interviewers wore an LV4-C lavalier microphone (Movo, USA). The analog audio signal was digitized at 24-bit depth and 48 kHz using a Scarlett 2i2 audio interface (Focusrite, UK). Gain was manually adjusted to avoid clipping or excessively quiet recordings. Audio files were stored in lossless WAV format for analysis.

Linguistic Tasks

To capture speech patterns across a variety of contexts, a semi-structured interview format was used. Interviewers were instructed to follow a brief script and allow participants to respond freely without interruption.

Oral Reading Task: The participant was first instructed to read aloud the “Rainbow Passage” [5], a commonly used passage that contains most of the phonemes, or sounds, of the English language.
Open-Ended Prompts: Next, participants were asked questions to evoke spontaneous speech and explore cognitive processes. The participant was first asked the neutral prompt “What brings you into the hospital?” and then the emotional prompt “Has anything been irritating today?”.

Image Description Task: Lastly, participants were provided with the National Institutes of Health Stroke Scale (NIHSS) Cookie Theft picture and instructed to describe the scene.

For the purposes of the present study, all audio data from these tasks were combined.

Pre-Processing

Using a conversation analysis paradigm [6] and operationalized terminology for labeling temporal elements in social communication [7, 8], we annotated the semi-structured interview audio data into segments with software developed in Python 3.11.9 (Fig. 1). The pyannote-audio [9] library was employed to perform voice activity detection (VAD) and speaker diarization. VAD was configured such that continuous segments of speech were delimited by silences greater than 200 ms. Following VAD and speaker diarization, continuous segments of speaker-specific utterances, or interpausal units (IPUs), were obtained. Audio files were subsequently processed to: (1) classify silences as either pauses (within a speaker's utterances) or gaps (between speakers), with gaps assigned to the next speaker; (2) classify overlapping speech segments as occurring either fully within or between the other speaker's utterances; and (3) identify single-speaker turns.

Within each utterance, syllable nuclei were identified acoustically using a validated computational linguistics algorithm that detects peaks in amplitude [10]. A syllable contains exactly one nucleus. The acoustic method yielded syllable counts similar to, but slightly lower than, those from a dictionary-based method, a difference that influences downstream calculations.
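The silence-classification step listed above (pauses within a speaker's speech versus gaps between speakers) can be illustrated with a minimal Python sketch. It assumes diarization has already produced a list of IPUs with speaker labels and start/end times; the `IPU` class, the `classify_silences` function, and the speaker names are hypothetical and are not taken from the study's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPU:
    """Hypothetical container for one interpausal unit from diarization."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


def classify_silences(ipus):
    """Label each silence as a 'pause' (same speaker resumes) or a 'gap'
    (the other speaker takes over). Returns (kind, start, end) tuples."""
    ordered = sorted(ipus, key=lambda u: (u.start, u.end))
    if not ordered:
        return []
    silences = []
    latest = ordered[0]  # the unit with the latest end time seen so far
    for unit in ordered[1:]:
        if unit.start > latest.end:
            # Nobody is speaking between latest.end and unit.start.
            kind = "pause" if unit.speaker == latest.speaker else "gap"
            silences.append((kind, latest.end, unit.start))
        if unit.end > latest.end:
            latest = unit
    return silences


example = [
    IPU("participant", 0.0, 5.0),
    IPU("interviewer", 2.0, 3.0),   # overlap contained in the participant's IPU
    IPU("participant", 5.5, 7.0),   # same speaker resumes: pause
    IPU("interviewer", 7.4, 8.0),   # speaker change: gap
]
print(classify_silences(example))
```

Tracking the unit with the latest end time, rather than simply the previous unit, keeps a fully contained overlap from being mistaken for the end of the floor holder's speech.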
Acoustically derived syllables were used in subsequent calculations because this method accounts for idiosyncratic pronunciation and dysfluencies and provides timepoints within words. Syllable nuclei detected in overlapping speech segments were excluded from analysis.

Variables

Clinical outcomes were assessed using the YMRS. Before the study, investigators were trained in YMRS administration, and all YMRS scores were reviewed during the study by a board-certified psychiatrist (JBJ). An absolute YMRS score ≥ 20 was the cutoff for mania, a score ≤ 8 defined remission [11], and scores in between were considered hypomania.

Articulation Rate (AR)

AR is defined as the total number of syllables produced divided by the total speaking time, excluding silences and overlapping speech, and represents rate of speech. This contrasts with speech rate, which includes the duration of silences in the denominator.

Speech Rate Variability (SRV)

For each turn, the intervals between consecutive syllable nuclei onsets (Δt) were calculated. The normalized pairwise variability index (nPVI) [12] was then derived as the mean, over successive pairs of intervals, of the absolute difference between the two intervals divided by their mean, multiplied by 100 (Eq. 1). Let Δt_i be the i-th interval and n the total number of intervals:

\( nPVI = \frac{100}{n-1} \sum_{i=1}^{n-1} \frac{\left| \Delta t_{i+1} - \Delta t_{i} \right|}{\left( \Delta t_{i+1} + \Delta t_{i} \right)/2} \)   (Equation 1)

A value of 0 indicates perfectly isochronous timing, whereas higher values reflect greater rhythmic irregularity. Two modifications were applied relative to the standard formulation for rhythm variability: (1) syllable nuclei onset intervals were used in place of syllable durations, and (2) because participants were permitted to speak freely, intervals were allowed to span across pauses but were constrained to exclude segments with overlapping speakers or gaps in the recording.
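As a worked illustration of Eq. 1 and of the AR definition, the following minimal Python sketch computes both from a list of syllable nuclei onset times. The function names are hypothetical, and the sketch omits the study's handling of overlapping speech and recording gaps; it is not the authors' implementation.

```python
def onset_intervals(onsets):
    """Intervals (s) between consecutive syllable nuclei onsets."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


def npvi(intervals):
    """Normalized pairwise variability index over successive intervals (Eq. 1)."""
    if len(intervals) < 2:
        raise ValueError("nPVI needs at least two intervals")
    terms = [
        abs(nxt - cur) / ((nxt + cur) / 2)
        for cur, nxt in zip(intervals, intervals[1:])
    ]
    # len(terms) equals n - 1, the number of successive interval pairs.
    return 100 * sum(terms) / len(terms)


def articulation_rate(n_syllables, speaking_time_s):
    """Syllables per second of speaking time (silences and overlaps excluded)."""
    return n_syllables / speaking_time_s


# Evenly spaced onsets are perfectly isochronous, so nPVI is 0.
print(npvi(onset_intervals([0.0, 0.5, 1.0, 1.5])))
# Irregular spacing gives a larger value.
print(npvi(onset_intervals([0.0, 0.2, 0.7, 0.9, 1.6])))
print(articulation_rate(12, 4.0))
```

Because each term is normalized by the local mean of the two intervals, a uniformly faster or slower speaker with the same relative timing pattern receives the same nPVI.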
Notably, excluding pauses entirely would make the interpretation of rate variability dependent on the threshold chosen to define a pause.

Floor Transfer Offsets

The transition between speakers in a conversation is a highly regulated interactional process that enables the often-seamless continuation of speech. Floor transfer offsets (FTOs) refer to the interval of time between the conclusion, or offset, of one speaker's turn and the onset of the next speaker's turn. Each FTO can be positive (speaker transition with a gap), negative (transition with between-overlap), or zero.

Study Size

As this was a pilot study attempting to quantify speech parameters in a specific mood state, there was limited data to inform sample size calculations. Therefore, assuming an expected effect size of 0.5 (based on clinical experience), 90% power, and two-sided 5% significance, our initial sample size goal was 15 participants, in accordance with previously established statistical guidance [13].

Statistical methods

Continuous variables are presented as median and interquartile range (IQR), and categorical variables are presented as number (%). Continuous variables with skewed distributions (e.g., speaking time, pause counts, syllable interval) were transformed using the natural logarithm. The transformed variables were then standardized (z-scored) by subtracting the sample mean and dividing by the standard deviation prior to use as model covariates. Linear and generalized mixed-effects models were fit with fixed effects and a random intercept for participant to account for repeated measures. Multicollinearity was assessed using the variance inflation factor (VIF) for each covariate, and covariates with a VIF above five were excluded. Standardized effect sizes are reported. Statistical significance was defined as p < 0.05 using two-tailed tests. No data imputation was conducted.
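The covariate preparation described above (natural-log transform followed by z-scoring) can be sketched in a few lines. The study's analysis was conducted in R; this Python version is only an illustration, the function name `log_standardize` is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import math
import statistics


def log_standardize(values):
    """Natural-log transform a skewed positive variable, then z-score it."""
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # assumption: sample SD with n - 1 denominator
    return [(x - mean) / sd for x in logged]


# Illustrative speaking times in seconds (made-up values, not study data).
speaking_time_s = [60.0, 120.0, 240.0, 480.0]
z = log_standardize(speaking_time_s)
print([round(v, 3) for v in z])
```

After this step each covariate has mean 0 and unit standard deviation on the log scale, which is what allows the reported coefficients to be read as standardized effect sizes.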
The mixed-effects models were fit using the lmerTest [14] package, and cross-validation splits were created using the rsample [15] package. The cross-validated area under the ROC curve (CV AUC) and confidence intervals (CI) were calculated using the cvAUC [16] package. Data analysis was conducted in R, version 4.3.3 (R Foundation for Statistical Computing).

Results

Participants

Participants were enrolled between August 2023 and September 2024. Over the study period, 38 patients met inclusion criteria. Of those, 19 patients met exclusion criteria or declined participation: involuntary hospitalization (nine), short length of stay (three), deemed inappropriate to record by staff (three), declined participation (two), inability to consent due to legal guardianship (one), and positive amphetamine urine drug screen (one). One patient was initially enrolled but not recorded, as his clinical diagnosis was later revised. Eighteen patients were enrolled and recorded. The mean age was 37.2 years (SD = 12.4). 44.0% (n = 8) of participants were female. The racial distribution was 83.5% (n = 15) White, 11.0% (n = 2) Black or African American, and 5.5% (n = 1) Other. Regarding ethnicity, 89% (n = 16) self-identified as Not Hispanic or Latino, 5.5% (n = 1) as Hispanic, Latino, or Spanish Origin, and 5.5% (n = 1) were unknown. English was the primary language for 83.5% (n = 15) of patients, while 16.5% (n = 3) reported Spanish (n = 1), Arabic (n = 1), or Somali (n = 1). Baseline characteristics of enrolled participants are listed in Table 1.
Table 1. Sociodemographic Characteristics of Participants

Baseline characteristic                     Value
Age (years)                                 mean 37.2, SD 12.7

Baseline characteristic                     n       %
Sex
  Female                                    10      56
  Male                                      8       44
Race
  Black or African American                 2       11
  White                                     15      83.5
  Other                                     1       5.5
Ethnicity
  Not Hispanic or Latino                    16      89
  Hispanic, Latino, or Spanish Origin       1       5.5
  Unknown                                   1       5.5
Primary language
  English                                   15      83.5
  Spanish                                   1       5.5
  Arabic                                    1       5.5
  Somali                                    1       5.5
Marital status
  Single                                    13      72
  Married/partnered                         4       22
  Divorced/widowed                          1       6
Highest educational level
  Middle school                             1       5.5
  High school/some college                  16      89
  University or postgraduate degree         1       5.5
Employment
  Unemployed or disability                  7       39
  Working part time                         4       22
  Working full time                         7       39
Household income
  Do not know                               3       16.5
  Less than $27,000                         5       28
  $27,000 through $51,999                   5       28
  $52,000 through $84,999                   2       11
  $85,000 through $140,999                  2       11
  $141,000 and greater                      1       5.5

Descriptive Data

A total of 5.6 hours of audio data was recorded across 57 research sessions (see Table 2). Participants spoke for a total of 4.3 hours, or 77% of the total recorded time. Speaking time was not strongly correlated with the other speech parameters; however, a strong negative relationship between AR and SRV was observed (Spearman's ρ = -0.68). Moderate between-participant variability was observed for SRV (ICC = 0.52), pause rate (ICC = 0.502), and AR (ICC = 0.432), as indicated in Fig. 2.
Table 2. Temporal Characteristics of Speech Across Manic States

Measure                                           Mania              Hypomania          Remission
Subjects, n                                       12                 16                 16
YMRS score, median (IQR)                          22.00 (3.00)       14.00 (5.50)       4.00 (4.00)
Recordings, n                                     17                 23                 17
Speaking time, total (s)                          6,828.05           6,117.37           2,526.76
Speaking time per recording (s), median (IQR)     429.49 (255.17)    194.97 (203.51)    109.28 (92.42)
Pauses per minute, median (IQR)                   4.06 (4.46)        2.81 (2.31)        3.06 (2.10)
Within-overlaps per minute, median (IQR)          0.17 (0.39)        0.06 (0.39)        0.00 (0.78)
Articulation rate (syll/s), median (IQR)          2.98 (0.29)        3.01 (0.50)        2.89 (0.52)
Speech rate variability, median (IQR)             61.44 (6.64)       58.86 (6.43)       59.95 (4.78)
Floor transfer offset (s), median (IQR)           0.63 (0.40)        0.64 (0.51)        0.83 (0.89)

During more severe manic symptoms, participants were quicker to start speaking, with a shorter floor transfer offset (β = -0.31; 95% CI = -0.53 to -0.093; p = 0.007, Fig. 3), and they spoke for longer (β = 0.54; 95% CI = 0.32 to 0.75; p < 0.001). After each was adjusted for the other, higher YMRS scores were associated with faster AR (β = 0.24; 95% CI = 0.08 to 0.40; p = 0.0038) and greater SRV (β = 0.23; 95% CI = 0.08 to 0.38; p = 0.0041). A cross-over interaction effect indicated that increased SRV, even at lower AR, was associated with greater symptom burden (β = 0.41; 95% CI = 0.12 to 0.69; p = 0.0049, Fig. 4). Neither pause rate nor within-overlap rate was significantly associated with severity of manic symptoms.

Exploratory analysis indicated that Language-Thought Disorder (Item 6 of the YMRS) was independently associated with SRV. After controlling for total YMRS score, greater thought disorganization was associated with greater SRV, with both a significant linear relationship (β = 0.97; 95% CI, 0.18 to 1.76; p = 0.023) and a U-shaped quadratic relationship (β = 0.55; 95% CI, 0.13 to 0.98; p = 0.017).
Predictive Data

We evaluated the ability of temporal speech features to classify mania (YMRS ≥ 20) using five-fold cross-validation, grouping by participant to preserve the repeated-measures structure. Within each training fold we fit a generalized linear mixed-effects model predicting mania from AR, SRV, their interaction, floor transfer offset, and speaking time. Out-of-fold predictions were generated for the held-out participants, and model performance on these predictions was acceptable (CV AUC = 0.85, 95% CI = 0.72 to 0.99). The same process was conducted for predicting remission status, with similar results (CV AUC = 0.83, 95% CI = 0.73 to 0.94).

Discussion

Key Results

Based on clinical knowledge, we hypothesized that several temporal speech characteristics would differ significantly as a function of manic severity. Our hypothesis was confirmed: participants started speaking sooner, spoke for longer durations, and spoke at a faster rate, although the rate effect was moderated by SRV. Furthermore, these parameters can identify either mania or remission status. These results closely align with clinical intuition and diagnostic criteria regarding the importance of temporal speech characteristics in BD. We suspect that the lack of within-overlap segments is more an artifact of the artificial interview structure than evidence of an absence of interruptions.

What was unexpected were the findings involving SRV, namely its negative association and crossover interaction with AR. It is important to note that our measurement of rate of speech, AR, specifically excludes silences, so the presence of irregularly timed pauses would not lower the rate. Given the strong association between thought disorganization and speech disorganization, a possible explanation is that an inability to sustain cognitive processes slows the formulation of language, its articulation, or both.
This finding is similar to that of a previous study, which found decreased word output during a test of verbal fluency among those with BD who had more racing or ruminative thoughts [17]. As thought disorganization is a symptom shared by many distinct mental illnesses, this speech feature may prove to have transdiagnostic value. There was also evidence of broad between-individual variability in several parameters, which was unexpected. Future research in the field may benefit from considering an individual's speech holistically over time rather than relying on isolated assessments of individual parameters.

Limitations

We acknowledge several limitations to our study. First, our sample was relatively small and homogeneous, and the speech data were collected in a research setting, limiting the generalizability of findings to other patient populations or cultural contexts. Sociodemographic features can substantially influence speech even within the same language [18], and because languages vary considerably not only in their vocabulary but also in their phonemes and prosody, we would expect to find considerable differences in these parameters across languages. Despite our small sample size, several of our findings showed moderate to large effect sizes that reached statistical significance in this English-speaking population. Second, we employed sophisticated audio recording equipment and techniques, guided by audio engineering expertise, to capture speech with high fidelity. While this approach improved data quality, it may complicate replication and limit translation into clinical practice. It remains unclear whether comparable results would be obtained using lower-quality audio typical of everyday or clinical environments. Finally, we relied on automated tools for preprocessing tasks such as speaker diarization and syllable nucleus detection, which may introduce error.
In formal linguistics studies, these steps are generally performed manually on short segments, given that careful annotation of “one minute of conversation take[s] an hour for experienced conversational analyst transcribers” [19]. Such procedures are not feasible for studies involving hours of data, nor are they practical for eventual clinical implementation. To mitigate these potential sources of error, we manually reviewed computer-generated annotations.

Declarations

Conflict of Interest

The authors declare no competing financial interests in relation to the work described.

Acknowledgements

This work is supported by the Thomas and Elizabeth Grainger Family Charitable Fund at The Chicago Community Foundation.

References

1. Martin W, Gergel T, Owen GS. Manic Temporality. Philosophical Psychology. 2018;32(1):72-97. doi:10.1080/09515089.2018.1502873
2. Block RA, Gruber RP. Time perception, attention, and memory: A selective review. Acta Psychologica. 2014;149:129-133. doi:10.1016/j.actpsy.2013.11.003
3. Kendler KS. The origin of our modern concept of mania in texts from 1780 to 1900. Mol Psychiatry. 2020;25(9):1975-1985. doi:10.1038/s41380-020-0657-0
4. Low DM, Bentley KH, Ghosh SS. Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investig Otolaryngol. 2020;5(1):96-116. doi:10.1002/lio2.354
5. Fairbanks G. Voice and Articulation Drillbook. New York, London, Harper & Bros; 1940. Accessed August 22, 2025. http://archive.org/details/voicearticulatio00fair
6. Sacks H, Schegloff EA, Jefferson G. A Simplest Systematics for the Organization of Turn-Taking for Conversation. Language. 1974;50(4):696-735. doi:10.2307/412243
7. Heldner M, Edlund J. Pauses, gaps and overlaps in conversations. Journal of Phonetics. 2010;38(4):555-568. doi:10.1016/j.wocn.2010.08.002
8. Levinson SC, Torreira F. Timing in turn-taking and its implications for processing models of language. Front Psychol. 2015;6. doi:10.3389/fpsyg.2015.00731
9. Bredin H.
pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe. In: INTERSPEECH 2023. ISCA; 2023:1983-1987. doi:10.21437/Interspeech.2023-105
10. de Jong NH, Pacilly J, Heeren W. PRAAT scripts to measure speed fluency and breakdown fluency in speech automatically. Assessment in Education: Principles, Policy & Practice. 2021;28(4):456-476. doi:10.1080/0969594X.2021.1951162
11. Tohen M, Frank E, Bowden CL, et al. The International Society for Bipolar Disorders (ISBD) Task Force report on the nomenclature of course and outcome in bipolar disorders. Bipolar Disord. 2009;11(5):453-473. doi:10.1111/j.1399-5618.2009.00726.x
12. Nolan F, Asu EL. The Pairwise Variability Index and coexisting rhythms in language. Phonetica. 2009;66(1-2):64-77. doi:10.1159/000208931
13. Whitehead AL, Julious SA, Cooper CL, Campbell MJ. Estimating the sample size for a pilot randomised trial to minimise the overall trial sample size for the external pilot and main trial for a continuous outcome variable. Stat Methods Med Res. 2016;25(3):1057-1073. doi:10.1177/0962280215588241
14. Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software. 2017;82:1-26. doi:10.18637/jss.v082.i13
15. Frick H, Chow F, Kuhn M, Mahoney M. rsample: General Resampling Infrastructure. Accessed August 22, 2025. https://rsample.tidymodels.org/authors.html
16. LeDell E, Petersen M, van der Laan M. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. Electron J Stat. 2015;9(1):1583-1607. doi:10.1214/15-EJS1035
17. Weiner L, Doignon-Camus N, Bertschy G, Giersch A. Thought and language disturbance in bipolar disorder quantified via process-oriented verbal fluency measures. Sci Rep. 2019;9(1):14282. doi:10.1038/s41598-019-50818-5
18. Jacewicz E, Fox RA, O’Neill C, Salmons J. Articulation rate across dialect, age, and gender. Lang Var Change. 2009;21(2):233-256.
doi:10.1017/S0954394509990093
19. Point S, Baruch Y. (Re)thinking transcription strategies: Current challenges and future research directions. Scandinavian Journal of Management. 2023;39(2):101272. doi:10.1016/j.scaman.2023.101272

Additional Declarations

The authors have declared there is NO conflict of interest to disclose.

Status: Under Review

Editorial decision: revise, 29 Mar, 2026
Review #1 received at journal, 27 Mar, 2026
Review #4 received at journal, 26 Mar, 2026
Reviewer #4 agreed at journal, 26 Mar, 2026
Reviewer #3 agreed at journal, 12 Mar, 2026
Reviewer #2 agreed at journal, 12 Mar, 2026
Reviewer #1 agreed at journal, 05 Mar, 2026
Reviewers invited by journal, 05 Mar, 2026
Editor assigned by journal, 23 Sep, 2025
Submission checks completed at journal, 23 Sep, 2025
First submitted to journal, 22 Sep, 2025
Unknown event, 15 Sep, 2025
14:09:21","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":78334,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure3.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/a7f2547667ee20885e172c1d.png"},{"id":96918103,"identity":"8a26cc45-9e10-450a-a76b-3ad3e58fdd55","added_by":"auto","created_at":"2025-11-27 14:11:09","extension":"png","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":81279,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure4.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/64faa900e9afc190ed7c7b48.png"},{"id":96805898,"identity":"ce8ba71c-6caf-4989-9cea-af7f42b5834c","added_by":"auto","created_at":"2025-11-26 09:12:45","extension":"xml","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":76995,"visible":true,"origin":"","legend":"","description":"","filename":"2025TP0022160structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/939fff7ac33d823724983342.xml"},{"id":96805897,"identity":"c83c9936-1578-43c4-9b2d-718ce88f00ba","added_by":"auto","created_at":"2025-11-26 09:12:45","extension":"html","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":92171,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/839c95c1228065ba0234d4b9.html"},{"id":97538570,"identity":"a6d5c3da-acb7-475f-a8d4-98c3df93cce6","added_by":"auto","created_at":"2025-12-05 14:50:24","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":199579,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of the audio segmentation process. Speaker specific segments of speech are identified as IPUs. 
Segments of silence and overlapping speech are identified for further annotation.\u003c/p\u003e","description":"","filename":"Figure11.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/aaa277ab472b9a5d7b0399fd.png"},{"id":97538576,"identity":"f8d4f1ab-c91d-459d-acf5-058bb4f8c516","added_by":"auto","created_at":"2025-12-05 14:50:46","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":142781,"visible":true,"origin":"","legend":"\u003cp\u003eVariability between speakers in three linguistic parameters (participants ordered by interview order)\u003c/p\u003e","description":"","filename":"Figure21.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/85dfaccf4c7eddff8375f4fb.png"},{"id":97538682,"identity":"eb891743-7091-4d0b-a8e2-56a031c3b8f3","added_by":"auto","created_at":"2025-12-05 14:51:33","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":185901,"visible":true,"origin":"","legend":"\u003cp\u003eFloor transfer offset by mania state\u003c/p\u003e","description":"","filename":"Figure31.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/b76f4b00bf724939fabf2563.png"},{"id":96805890,"identity":"9261529d-9341-4243-b9b7-090952b2703d","added_by":"auto","created_at":"2025-11-26 09:12:44","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":168340,"visible":true,"origin":"","legend":"\u003cp\u003eInteraction of AR and SRV on mania severity\u003c/p\u003e","description":"","filename":"Figure4.png","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/a063793266e0ffa191bafb77.png"},{"id":98621930,"identity":"ff38bf47-1241-4acc-9e08-fdd49fc4586c","added_by":"auto","created_at":"2025-12-19 
16:34:48","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1347733,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7613536/v1/dba3637e-db82-4576-8ff5-c83678aeaa77.pdf"}],"financialInterests":"The authors have declared there is \u003cb\u003eNO\u003c/b\u003e conflict of interest to disclose","formattedTitle":"Temporal Elements of Speech in Mania","fulltext":[{"header":"Introduction","content":"\u003cp\u003eTemporal elements are frequently emphasized in descriptions of severe affective states. Mood episodes are defined not only by the duration and frequency of their symptoms but also by disruptions in circadian timing, the tempo and sustainability of behaviors, the rate and sequencing of thoughts, and behavioral latencies which discriminate mania from depression. Others have suggested that even the subjective experience of time is biased, both during severe mood episodes\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e and more generally in ordinary life\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e, emphasizing the importance of quantitative measurements. Much like other human behaviors, the temporal dynamics of speech production are markedly altered during episodes of mental illness.\u003c/p\u003e\u003cp\u003eCurrent and historical diagnostic criteria for Bipolar Disorder (BD) highlight increased rate and quantity of speech as symptoms of manic episodes\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. In clinical training, mental health professionals are taught to recognize alterations in speech patterns consistent with specific mental illnesses through informal bedside teaching by senior clinicians. 
However, this education rarely draws on formal training in linguistics or speech-language pathology. Moreover, clinical assessments of speech are conducted without measurement devices, limiting their precision and reliability. As a result, qualitative descriptions of speech are difficult to communicate consistently in the medical record and are often of limited utility to other clinicians or for longitudinal tracking.\u003c/p\u003e\u003cp\u003eRecent advancements in artificial intelligence and computational linguistics have enabled the automated extraction of lexical content and prosodic parameters from the human voice. The close relationship between spoken language and overall brain function has motivated research into the relationship between those parameters and specific psychiatric conditions, such as mania. A recent article\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e reviewed the state of the field and commented on the surprising omission of temporal speech elements in regard to BD. The objectives of this study included both capturing high-quality audio recordings of human speech during episodes of mania and developing computational tools to quantify temporal elements of speech.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cb\u003eStudy design and Oversight\u003c/b\u003e:\u003c/p\u003e\u003cp\u003eThe \u003cem\u003eComputational Analysis of Spoken Language in Mania\u003c/em\u003e (CASLIM) study was a single-site cohort investigation aimed at identifying speech alterations in mania relative to participants' baseline speaking patterns, using audio recordings of semi-structured interviews. Following voluntary written informed consent, as approved by the Mayo Clinic Institutional Review Board (IRB# 22-010487), participants were enrolled between August 2023 and September 2024 and received financial compensation for completing study tasks. 
The study protocol was conducted in accordance with the Declaration of Helsinki and prospectively registered on ClinicalTrials.gov (NCT05956340). This study adhered to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.\u003c/p\u003e\u003cp\u003e\u003cb\u003eParticipants\u003c/b\u003e:\u003c/p\u003e\u003cp\u003ePatients were recruited from the emergency department or psychiatric inpatient unit of a tertiary care hospital in the Midwestern United States. Adults aged 18\u0026ndash;75 years with a \u003cem\u003eDiagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision\u003c/em\u003e (DSM-5-TR) diagnosis of Bipolar I Disorder, current manic episode, as determined by clinical staff, were eligible for inclusion. Eligible patients were identified through review of electronic medical records and consultation with treating clinicians. Research staff approached patients within 48 hours of admission (\u003cem\u003eHospital Initial\u003c/em\u003e) to explain study procedures and obtain written informed consent. Exclusion criteria included inability to provide informed consent, comorbid neurological disorders, paranoid delusions regarding electronic surveillance, aggression, and current substance-induced mania. Follow-up assessments were conducted throughout the hospitalization (\u003cem\u003eHospital Interval\u003c/em\u003e) and at discharge (\u003cem\u003eHospital Discharge\u003c/em\u003e), with an optional post-discharge assessment at 2\u0026ndash;4 weeks (\u003cem\u003ePost Discharge\u003c/em\u003e).\u003c/p\u003e\u003cp\u003e\u003cb\u003eProcedures\u003c/b\u003e:\u003c/p\u003e\u003cp\u003e\u003cem\u003eEquipment\u003c/em\u003e\u003c/p\u003e\u003cp\u003e During audio recording sessions, participants were equipped with a C544 L headset microphone (AKG, USA) and interviewers were equipped with an LV4-C lavalier microphone (Movo, USA). 
The analog audio signal was digitized at 24-bit depth and a 48 kHz sampling rate using a Scarlett 2i2 audio interface (Focusrite, UK). Gain was manually adjusted to avoid clipping or excessively quiet recordings. Audio files were stored in lossless WAV file format for analysis.\u003c/p\u003e\u003cp\u003e\u003cem\u003eLinguistic Tasks\u003c/em\u003e\u003c/p\u003e\u003cp\u003eTo capture speech patterns across a variety of contexts, a semi-structured interview format was used. Interviewers were instructed to follow a brief script and allow participants to respond freely without interruption.\u003c/p\u003e\u003cp\u003e\u003col\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eOral Reading Task\u003c/b\u003e: The participant was first instructed to read aloud the \u0026ldquo;Rainbow Passage\u0026rdquo;\u003csup\u003e5\u003c/sup\u003e, a commonly used linguistic passage that contains most of the phonemes, or sounds, of the English language.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eOpen-Ended Prompts\u003c/b\u003e: Next, participants were asked questions to evoke spontaneous speech and explore cognitive processes. 
The participant was first asked the neutral prompt \u0026ldquo;What brings you into the hospital?\u0026rdquo; and then the emotional prompt \u0026ldquo;Has anything been irritating today?\u0026rdquo;.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003cspan\u003e\u003cli\u003e\u003cp\u003e\u003cb\u003eImage Description Task\u003c/b\u003e: Lastly, participants were provided with the National Institutes of Health Stroke Scale (NIHSS) Cookie Theft picture and instructed to describe the scene.\u003c/p\u003e\u003c/li\u003e\u003c/span\u003e\u003c/ol\u003e\u003c/p\u003e\u003cp\u003eFor the purposes of the present study, all audio data from these tasks were analyzed together.\u003c/p\u003e\u003cp\u003e\u003cem\u003ePre-Processing\u003c/em\u003e\u003c/p\u003e\u003cp\u003eUsing a conversation analysis paradigm\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e and operationalized terminology for labeling temporal elements in social communication\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e,\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e, we annotated the semi-structured interview audio data into segments with software developed in Python 3.11.9 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The \u003cem\u003epyannote-audio\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e library was employed to perform voice activity detection (VAD) and speaker diarization. VAD was configured such that continuous segments of speech were delimited by silences longer than 200 ms.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e Following VAD and speaker diarization, continuous segments of speaker-specific utterances, or interpausal units (IPUs), were obtained. 
Audio files were subsequently processed to:\u003c/p\u003e\u003cp\u003e\u003cul\u003e\u003cli\u003e\u003cp\u003eClassify silences as either pauses (within a speaker's utterances) or gaps (between speakers), with gaps assigned to the next speaker.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eClassify overlapping speech segments as either fully contained \u003cem\u003ewithin\u003c/em\u003e or falling \u003cem\u003ebetween\u003c/em\u003e the other speaker's utterances.\u003c/p\u003e\u003c/li\u003e\u003cli\u003e\u003cp\u003eIdentify single-speaker turns.\u003c/p\u003e\u003c/li\u003e\u003c/ul\u003e\u003c/p\u003e\u003cp\u003eWithin each utterance, syllable nuclei were identified acoustically using a validated computational linguistics algorithm which detects peaks in amplitude\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. A syllable contains exactly one nucleus. The count of syllables detected via the acoustic method was similar to, but slightly lower than, the count of syllables obtained using a dictionary-based method, a difference which influences downstream calculations. Acoustically derived syllables were used in subsequent calculations because this method accounts for the idiosyncratic pronunciation of words and dysfluencies and provides timepoints within words. Syllable nuclei detected in overlapping speech segments were excluded from analysis.\u003c/p\u003e\u003cp\u003e\u003cem\u003eVariables\u003c/em\u003e\u003c/p\u003e\u003cp\u003eClinical outcomes were assessed using the YMRS. Before the study, investigators were trained in YMRS administration, and all YMRS scores were reviewed during the study by a board-certified psychiatrist (JBJ). 
An absolute endpoint YMRS score\u0026thinsp;\u0026ge;\u0026thinsp;20 was the cutoff for mania, a score\u0026thinsp;\u0026le;\u0026thinsp;8 was the cutoff defining remission\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e, and scores in between were considered hypomania.\u003c/p\u003e\u003cp\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eArticulation Rate (AR)\u003c/span\u003e\u003c/p\u003e\u003cp\u003eAR is defined as the total number of syllables produced divided by the total speaking time, excluding silences and overlapping speech, and represents the rate of speech. This contrasts with speech rate, which includes the duration of silences in the denominator.\u003c/p\u003e\u003cp\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eSpeech Rate Variability (SRV)\u003c/span\u003e\u003c/p\u003e\u003cp\u003eFor each turn, consecutive syllable nuclei onset intervals (\u003cem\u003eΔt\u003c/em\u003e) were calculated. The normalized pairwise variability index (nPVI)\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e was then derived as the mean of the absolute differences between successive intervals, each divided by the mean duration of the two intervals, multiplied by 100 (Eq.\u0026nbsp;1). 
Let \u003cem\u003eΔt\u003c/em\u003e\u003csub\u003e\u003cem\u003ei\u003c/em\u003e\u003c/sub\u003e be the \u003cem\u003ei\u003c/em\u003eth interval and \u003cem\u003en\u003c/em\u003e the total number of intervals:\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e\u003ccolgroup cols=\"2\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\:nPVI\\:=\\:\\frac{100}{n\\:-\\:1}\\sum\\:_{i\\:=1}^{n-1}\\frac{\\left|{\\varDelta\\:t}_{i+1}-{\\varDelta\\:t}_{i}\\right|}{\\frac{\\left({\\varDelta\\:t}_{i+1}+{\\varDelta\\:t}_{i}\\right)}{2}}\\)\u003c/span\u003e\u003c/span\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eEquation 1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eValues closer to 0 indicate perfectly isochronous timing, whereas higher values reflect greater rhythmic irregularity.\u003c/p\u003e\u003cp\u003e Two modifications were applied relative to the standard formulation for rhythm variability: (1) syllable nuclei onset intervals were used in place of syllable durations, and (2) because participants were permitted to speak freely, intervals were allowed to span across pauses but were constrained to exclude segments with overlapping speakers or gaps in the recording. 
Notably, excluding pauses entirely would make the interpretation of rate variability dependent on the threshold chosen to define a pause.\u003c/p\u003e\u003cp\u003e\u003cspan type=\"Underline\" class=\"Underline\" name=\"Emphasis\"\u003eFloor Transfer Offsets\u003c/span\u003e\u003c/p\u003e\u003cp\u003eThe transition between speakers in a conversation is a highly regulated interactional process enabling the often-seamless continuation of speech. A floor transfer offset (FTO) is the interval of time between the conclusion, or offset, of a speaker\u0026rsquo;s turn and the onset of the next speaker\u0026rsquo;s turn. Each FTO can be positive (speaker transition with a gap), negative (transition with between-overlap), or zero.\u003c/p\u003e\u003cp\u003e\u003cem\u003eStudy Size\u003c/em\u003e\u003c/p\u003e\u003cp\u003eAs this was a pilot study attempting to quantify speech parameters in a specific mood state, there were limited data to inform sample size calculations. Therefore, with an assumed expected effect size of 0.5 (based on clinical experience), 90% power, and two-sided 5% significance, our initial sample size goal was 15 participants, in accordance with previously established statistical guidance\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003e\u003cem\u003eStatistical methods\u003c/em\u003e\u003c/p\u003e\u003cp\u003eContinuous variables are presented as median and inter-quartile range (IQR) and categorical variables are presented as number (%). Continuous variables with skewed distributions (e.g., speaking time, pause counts, syllable interval) were transformed using the natural logarithm. The transformed variables were then standardized (z-scored) by subtracting the sample mean and dividing by the standard deviation prior to use as model covariates. 
Linear and generalized mixed-effects models were fit with fixed effects and included a random intercept for participant to account for repeated measures. Multicollinearity was assessed using the variance inflation factor (VIF) for each covariate, and covariates with a VIF above a threshold of five were excluded. Standardized effect sizes are reported. Statistical significance was defined as p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 using two-tailed tests. No data imputation was conducted.\u003c/p\u003e\u003cp\u003eLinear and generalized mixed-effects models were fit with only fixed-effects using the \u003cem\u003elmerTest\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e package, and cross-validation splits were created using the \u003cem\u003ersample\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e package. Cross-validated Area Under the ROC Curve (CV AUC) and confidence intervals (CI) were calculated using the \u003cem\u003ecvAUC\u003c/em\u003e\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e package. Data analysis was conducted in R, version 4.3.3 (R Foundation for Statistical Computing).\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cb\u003eParticipants\u003c/b\u003e\u003c/p\u003e\u003cp\u003eParticipants were enrolled between August 2023 and September 2024. Over the study period, 38 patients met inclusion criteria. Of those, 19 patients met exclusion criteria or declined participation: involuntary hospitalization (nine), short lengths of stay (three), deemed inappropriate to record by staff (three), declined participation (two), inability to consent due to legal guardian (one), and positive amphetamine urine drug screen (one). One patient was initially enrolled but not recorded as his clinical diagnosis was later revised. 
Eighteen patients were enrolled and recorded.\u003c/p\u003e\u003cp\u003eThe mean age was 37.2 years (SD\u0026thinsp;=\u0026thinsp;12.4). 44.0% (n\u0026thinsp;=\u0026thinsp;8) of participants were Female. The racial distribution was 83.5% (n\u0026thinsp;=\u0026thinsp;15) White, 11.0% (n\u0026thinsp;=\u0026thinsp;2) Black or African American, and 5.5% (n\u0026thinsp;=\u0026thinsp;1) Other. Regarding ethnicity, 89% (n\u0026thinsp;=\u0026thinsp;16) self-identified as Not Hispanic or Latino, 5.5% (n\u0026thinsp;=\u0026thinsp;1) as Hispanic, Latino, or Spanish Origin, and 5.5% (n\u0026thinsp;=\u0026thinsp;1) were unknown. English was the primary language for 83.5% (n\u0026thinsp;=\u0026thinsp;15) of patients, while 16.5% (n\u0026thinsp;=\u0026thinsp;3) reported Spanish (n\u0026thinsp;=\u0026thinsp;1), Arabic (n\u0026thinsp;=\u0026thinsp;1), or Somali (n\u0026thinsp;=\u0026thinsp;1). Baseline characteristics of enrolled participants are listed in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSociodemographic Characteristics of Participants\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBaseline Characteristics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c3\"\u003e\u0026nbsp;\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eAge\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003emean\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003eSD\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eYears\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e37.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e12.7\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eSex\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFemale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e10\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e56\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e44\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eRace\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBlack or African American\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWhite\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e83.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOther\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEthnicity\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNot Hispanic or Latino\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e89\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHispanic, Latino, or Spanish Origin\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUnknown\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003ePrimary language\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEnglish\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e83.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSpanish\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eArabic\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSomali\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eMarital status\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSingle\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e13\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e72\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMarried/partnered\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e22\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDivorced/widowed\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eHighest educational level\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMiddle school\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHigh school/some college\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e89\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUniversity or postgraduate degree\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEmployment\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eUnemployed or disability\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e39\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWorking part time\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e22\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWorking full time\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e39\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eHousehold income\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cem\u003en\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cem\u003e%\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDo not know\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e16.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLess than \u003cspan\u003e$\u003c/span\u003e27,000\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e28\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cspan\u003e$\u003c/span\u003e27,000 through \u003cspan\u003e$\u003c/span\u003e51,999\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e28\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cspan\u003e$\u003c/span\u003e52,000 through \u003cspan\u003e$\u003c/span\u003e84,999\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cspan\u003e$\u003c/span\u003e85,000 through 
\u003cspan\u003e$\u003c/span\u003e140,999\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cspan\u003e$\u003c/span\u003e141,000 and greater\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e5.5\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eDescriptive Data\u003c/b\u003e\u003c/p\u003e\u003cp\u003eA total of 5.6 hours of audio data was recorded across 57 research sessions (see Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Participants spoke for a total of 4.3 hours, or 77% of the total recorded time. Speaking time was not strongly correlated with the other speech parameters; however, a strong negative relationship between AR and SRV was observed (Spearman\u0026rsquo;s ρ = -0.68). 
Moderate between-participant variability was observed for SRV (ICC\u0026thinsp;=\u0026thinsp;0.52), pause rate (ICC\u0026thinsp;=\u0026thinsp;0.502), and AR (ICC\u0026thinsp;=\u0026thinsp;0.432) as indicated in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eTemporal Characteristics of Speech Across Manic States\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"4\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eMania\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eHypomania\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eRemission\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSubjects, n\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e12\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e16\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eYMRS score, Median 
(IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e22.00 (3.00)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e14.00 (5.50)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e4.00 (4.00)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRecordings, n\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e23\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e17\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSpeaking time, total (s)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e6,828.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e6,117.37\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e2,526.76\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSpeaking time per recording (s), Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e429.49 (255.17)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e194.97 (203.51)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e109.28 (92.42)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003ePauses per minute, Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e4.06 (4.46)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2.81 (2.31)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e3.06 (2.10)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWithin-overlaps per minute, Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.17 (0.39)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.06 (0.39)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.00 (0.78)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eArticulation rate (syll/sec), Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e2.98 (0.29)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.01 (0.50)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e2.89 (0.52)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSpeech Rate Variability, Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e61.44 (6.64)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e58.86 (6.43)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e59.95 (4.78)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFloor transfer offset (s), Median (IQR)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.63 (0.40)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.64 (0.51)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.83 
(0.89)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e During more severe manic symptoms, participants were quicker to start speaking, with a shorter floor transfer offset (β = -0.31; 95% CI = -0.53 to -0.093; p\u0026thinsp;=\u0026thinsp;0.007, Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e), and they spoke for longer (β\u0026thinsp;=\u0026thinsp;0.54; 95% CI\u0026thinsp;=\u0026thinsp;0.32 to 0.75; \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.001). With AR and SRV each adjusted for the other, higher YMRS scores were associated with faster AR (β\u0026thinsp;=\u0026thinsp;0.24; 95% CI\u0026thinsp;=\u0026thinsp;0.08 to 0.40; p\u0026thinsp;=\u0026thinsp;0.0038) and greater SRV (β\u0026thinsp;=\u0026thinsp;0.23; 95% CI\u0026thinsp;=\u0026thinsp;0.08 to 0.38; p\u0026thinsp;=\u0026thinsp;0.0041). A crossover interaction effect indicated that increased SRV, even at lower AR, was associated with greater symptom burden (β\u0026thinsp;=\u0026thinsp;0.41; 95% CI\u0026thinsp;=\u0026thinsp;0.12 to 0.69; p\u0026thinsp;=\u0026thinsp;0.0049, Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Neither pause rate nor within-overlap rate was significantly associated with severity of manic symptoms.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eExploratory analysis indicated that Language-Thought Disorder (Item 6 of the YMRS) was independently associated with SRV. 
After controlling for total YMRS score, greater thought disorganization was associated with greater SRV, with both a significant linear (β\u0026thinsp;=\u0026thinsp;0.97; 95% CI, 0.18 to 1.76; p\u0026thinsp;=\u0026thinsp;0.023) and a U-shaped quadratic relationship (β\u0026thinsp;=\u0026thinsp;0.55; 95% CI, 0.13 to 0.98; p\u0026thinsp;=\u0026thinsp;0.017).\u003c/p\u003e\u003cp\u003e\u003cb\u003ePredictive Data\u003c/b\u003e\u003c/p\u003e\u003cp\u003e We evaluated the ability of temporal speech features to classify mania (YMRS\u0026thinsp;\u0026ge;\u0026thinsp;20) using five-fold cross-validation, grouping by participant to preserve the repeated-measures structure. Within each training fold we fit a generalized linear mixed-effects model, predicting mania from AR, SRV, their interaction, floor-transfer offset, and speaking time. Out-of-fold predictions were generated for the held-out participants, and model performance was acceptable (CV AUC\u0026thinsp;=\u0026thinsp;0.85, 95% CI 0.72 to 0.99). The same process was conducted for predicting remission status with similar results (CV AUC\u0026thinsp;=\u0026thinsp;0.83, 95% CI\u0026thinsp;=\u0026thinsp;0.73 to 0.94).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003e\u003cb\u003eKey Results\u003c/b\u003e\u003c/p\u003e\u003cp\u003eBased on clinical knowledge, we hypothesized that several temporal speech characteristics would differ significantly as a function of manic severity. Our hypothesis was supported: participants started speaking sooner, spoke for longer durations, and spoke at a quicker rate, although the association with rate was moderated by SRV. Furthermore, these parameters can identify either mania or remission status. These results closely align with clinical intuition and diagnostic criteria regarding the importance of temporal speech characteristics in BD. 
We suspect that the lack of within-overlap segments is an artifact of the artificial interview structure rather than evidence of an absence of interruptions.\u003c/p\u003e\u003cp\u003eWhat was unexpected were the findings involving our assessment of SRV and both its negative association and crossover interaction with AR. It is important to note that our measurement of rate of speech, AR, specifically excludes silences, so the presence of irregularly timed pauses would not lower the rate. Given the strong association between thought disorganization and speech disorganization, a possible explanation is that an inability to sustain cognitive processes slows the formulation of language, its articulation, or both. This finding is similar to a previous study which found a decreased word output during a test of verbal fluency among those with BD with more racing or ruminative thoughts\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. As cognitive thought disorganization is a symptom shared by many distinct mental illnesses, this speech feature may prove to have value in a transdiagnostic sense.\u003c/p\u003e\u003cp\u003eThere was also unexpected evidence of broad between-individual variability in several parameters. Future research in the field may benefit from considering an individual's speech holistically over time rather than through isolated assessments of individual parameters.\u003c/p\u003e\u003cp\u003e\u003cb\u003eLimitations\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe acknowledge several limitations to our study. First, our sample was relatively small and homogeneous, and the speech data was collected in a research setting, limiting the generalizability of findings to other patient populations or cultural contexts. 
Sociodemographic features can substantially influence speech even within the same language\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e, and as different languages vary considerably not only in their vocabulary, but also in their phonemes and prosody, we would expect these parameters to differ considerably across languages. Despite our small sample size, several of our findings included moderate to large effect sizes which reached statistical significance in this English-speaking population.\u003c/p\u003e\u003cp\u003e Second, we employed sophisticated audio recording equipment and techniques, guided by audio engineering expertise, to capture speech with high fidelity. While this approach improved data quality, it may complicate replication and limit translation into clinical practice. It remains unclear whether comparable results would be obtained using lower-quality audio typical of everyday or clinical environments.\u003c/p\u003e\u003cp\u003eFinally, we relied on automated tools for preprocessing tasks such as speaker diarization and syllable nucleus detection, which may introduce error. In formal linguistics studies, these steps are generally performed manually on short segments given that careful annotation of \u0026ldquo;one minute of conversation take[s] an hour for experienced conversational analyst transcribers\u0026rdquo;\u003csup\u003e19\u003c/sup\u003e. Such procedures are not feasible for studies involving hours of data, nor are they practical for eventual clinical implementation. 
To mitigate these potential sources of error, we manually reviewed computer-generated annotations.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eConflict of Interest\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing financial interests in relation to the work described.\u003c/p\u003e\n\u003cp\u003eAcknowledgements\u003c/p\u003e\n\u003cp\u003eThis work is supported by the Thomas and Elizabeth Grainger Family Charitable Fund at The Chicago Community Foundation.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eMartin W, Gergel T, Owen GS. Manic Temporality. \u003cem\u003ePhilosophical Psychology\u003c/em\u003e. 2018;32(1):72-97. doi:10.1080/09515089.2018.1502873\u003c/li\u003e\n\u003cli\u003eBlock RA, Gruber RP. Time perception, attention, and memory: A selective review. \u003cem\u003eActa Psychologica\u003c/em\u003e. 2014;149:129-133. doi:10.1016/j.actpsy.2013.11.003\u003c/li\u003e\n\u003cli\u003eKendler KS. The origin of our modern concept of mania in texts from 1780 to 1900. \u003cem\u003eMol Psychiatry\u003c/em\u003e. 2020;25(9):1975-1985. doi:10.1038/s41380-020-0657-0\u003c/li\u003e\n\u003cli\u003eLow DM, Bentley KH, Ghosh SS. Automated assessment of psychiatric disorders using speech: A systematic review. \u003cem\u003eLaryngoscope Investig Otolaryngol\u003c/em\u003e. 2020;5(1):96-116. doi:10.1002/lio2.354\u003c/li\u003e\n\u003cli\u003eFairbanks G. \u003cem\u003eVoice and Articulation Drillbook\u003c/em\u003e. New York, London, Harper \u0026amp; Bros; 1940. Accessed August 22, 2025. http://archive.org/details/voicearticulatio00fair\u003c/li\u003e\n\u003cli\u003eSacks H, Schegloff EA, Jefferson G. A Simplest Systematics for the Organization of Turn-Taking for Conversation. \u003cem\u003eLanguage\u003c/em\u003e. 1974;50(4):696-735. doi:10.2307/412243\u003c/li\u003e\n\u003cli\u003eHeldner M, Edlund J. Pauses, gaps and overlaps in conversations. \u003cem\u003eJournal of Phonetics\u003c/em\u003e. 
2010;38(4):555-568. doi:10.1016/j.wocn.2010.08.002\u003c/li\u003e\n\u003cli\u003eLevinson SC, Torreira F. Timing in turn-taking and its implications for processing models of language. \u003cem\u003eFront Psychol\u003c/em\u003e. 2015;6. doi:10.3389/fpsyg.2015.00731\u003c/li\u003e\n\u003cli\u003eBredin H. pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe. In: \u003cem\u003eINTERSPEECH 2023\u003c/em\u003e. ISCA; 2023:1983-1987. doi:10.21437/Interspeech.2023-105\u003c/li\u003e\n\u003cli\u003ede Jong NH, Pacilly J, Heeren W. PRAAT scripts to measure speed fluency and breakdown fluency in speech automatically. \u003cem\u003eAssessment in Education: Principles, Policy \u0026amp; Practice\u003c/em\u003e. 2021;28(4):456-476. doi:10.1080/0969594X.2021.1951162\u003c/li\u003e\n\u003cli\u003eTohen M, Frank E, Bowden CL, et al. The International Society for Bipolar Disorders (ISBD) Task Force report on the nomenclature of course and outcome in bipolar disorders. \u003cem\u003eBipolar Disord\u003c/em\u003e. 2009;11(5):453-473. doi:10.1111/j.1399-5618.2009.00726.x\u003c/li\u003e\n\u003cli\u003eNolan F, Asu EL. The Pairwise Variability Index and coexisting rhythms in language. \u003cem\u003ePhonetica\u003c/em\u003e. 2009;66(1-2):64-77. doi:10.1159/000208931\u003c/li\u003e\n\u003cli\u003eWhitehead AL, Julious SA, Cooper CL, Campbell MJ. Estimating the sample size for a pilot randomised trial to minimise the overall trial sample size for the external pilot and main trial for a continuous outcome variable. \u003cem\u003eStat Methods Med Res\u003c/em\u003e. 2016;25(3):1057-1073. doi:10.1177/0962280215588241\u003c/li\u003e\n\u003cli\u003eKuznetsova A, Brockhoff PB, Christensen RHB. lmerTest Package: Tests in Linear Mixed Effects Models. \u003cem\u003eJournal of Statistical Software\u003c/em\u003e. 2017;82:1-26. doi:10.18637/jss.v082.i13\u003c/li\u003e\n\u003cli\u003eFrick H, Chow F, Kuhn M, Mahoney M. rsample: General Resampling Infrastructure. 
rsample: General Resampling Infrastructure. Accessed August 22, 2025. https://rsample.tidymodels.org/authors.html\u003c/li\u003e\n\u003cli\u003eLeDell E, Petersen M, van der Laan M. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. \u003cem\u003eElectron J Stat\u003c/em\u003e. 2015;9(1):1583-1607. doi:10.1214/15-EJS1035\u003c/li\u003e\n\u003cli\u003eWeiner L, Doignon-Camus N, Bertschy G, Giersch A. Thought and language disturbance in bipolar disorder quantified via process-oriented verbal fluency measures. \u003cem\u003eSci Rep\u003c/em\u003e. 2019;9(1):14282. doi:10.1038/s41598-019-50818-5\u003c/li\u003e\n\u003cli\u003eJacewicz E, Fox RA, O\u0026rsquo;Neill C, Salmons J. Articulation rate across dialect, age, and gender. \u003cem\u003eLang Var Change\u003c/em\u003e. 2009;21(2):233-256. doi:10.1017/S0954394509990093\u003c/li\u003e\n\u003cli\u003ePoint S, Baruch Y. (Re)thinking transcription strategies: Current challenges and future research directions. \u003cem\u003eScandinavian Journal of Management\u003c/em\u003e. 2023;39(2):101272. doi:10.1016/j.scaman.2023.101272\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"translational-psychiatry","isNatureJournal":false,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"tp","sideBox":"Learn more about [Translational Psychiatry](http://www.nature.com/tp/)","snPcode":"41398","submissionUrl":"https://mts-tp.nature.com/cgi-bin/main.plex","title":"Translational Psychiatry","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Nature AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-7613536/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7613536/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eComputational analysis of speech enables precise, objective measurements of a behavioral signal closely linked to manic episodes. In a prospective cohort study conducted in Rochester, MN \u003cb\u003e(\u003c/b\u003eClinicalTrials.gov Identifier: NCT05956340\u003cb\u003e)\u003c/b\u003e, eighteen English-speaking adults voluntarily hospitalized for a manic episode of Bipolar I Disorder between August 2023 and September 2024 were enrolled. Participants completed semi-structured research interviews during both acute episodes and remission, with manic symptom severity assessed at each session using the Young Mania Rating Scale (YMRS). Across 57 recorded interviews, temporal prosodic features were extracted using automated computational algorithms. 
After adjusting for individual differences and moderating effects, participant articulation rate (β\u0026thinsp;=\u0026thinsp;0.24, p\u0026thinsp;=\u0026thinsp;0.0038), speech rate variability (β\u0026thinsp;=\u0026thinsp;0.23, p\u0026thinsp;=\u0026thinsp;0.0041), floor transfer offset (β = -0.31, p\u0026thinsp;=\u0026thinsp;0.007), and speaking duration (β\u0026thinsp;=\u0026thinsp;0.54, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) were significantly associated with manic symptom severity. These temporal elements classified manic (CV AUC\u0026thinsp;=\u0026thinsp;0.85) and remission (CV AUC\u0026thinsp;=\u0026thinsp;0.83) status, highlighting their potential as monitoring and prognostic biomarkers for manic episodes.\u003c/p\u003e","manuscriptTitle":"Temporal Elements of Speech in Mania","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-11-26 09:12:40","doi":"10.21203/rs.3.rs-7613536/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"revise","date":"2026-03-29T14:30:16+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"This content is not available.","date":"2026-03-27T14:02:37+00:00","index":1,"fulltext":"This content is not available."},{"type":"editorInvitedReview","content":"This content is not available.","date":"2026-03-27T00:51:48+00:00","index":4,"fulltext":"This content is not available."},{"type":"reviewerAgreed","content":"This content is not available.","date":"2026-03-27T00:39:58+00:00","index":4,"fulltext":"This content is not available."},{"type":"reviewerAgreed","content":"This content is not available.","date":"2026-03-12T15:00:53+00:00","index":3,"fulltext":"This content is not available."},{"type":"reviewerAgreed","content":"This content is not available.","date":"2026-03-12T14:48:14+00:00","index":2,"fulltext":"This content is not available."},{"type":"reviewerAgreed","content":"This content is not 
available.","date":"2026-03-06T01:53:26+00:00","index":1,"fulltext":"This content is not available."},{"type":"reviewersInvited","content":"","date":"2026-03-05T22:14:23+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-09-23T10:46:46+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-09-23T10:46:37+00:00","index":"","fulltext":""},{"type":"submitted","content":"Translational Psychiatry","date":"2025-09-22T21:02:06+00:00","index":"","fulltext":""},{"type":"checksFailed","content":"","date":"2025-09-15T14:24:30+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"translational-psychiatry","isNatureJournal":false,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"tp","sideBox":"Learn more about [Translational Psychiatry](http://www.nature.com/tp/)","snPcode":"41398","submissionUrl":"https://mts-tp.nature.com/cgi-bin/main.plex","title":"Translational Psychiatry","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Nature AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a93974a6-4516-47d0-bd05-5fea0402abaa","owner":[],"postedDate":"November 26th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":55186939,"name":"Health sciences/Biomarkers/Prognostic markers"},{"id":55186940,"name":"Biological sciences/Psychology/Human behaviour"},{"id":55186941,"name":"Health sciences/Diseases/Psychiatric disorders/Bipolar disorder"}],"tags":[],"updatedAt":"2026-05-11T16:41:32+00:00","versionOfRecord":[],"versionCreatedAt":"2025-11-26 09:12:40","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7613536","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7613536","identity":"rs-7613536","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}