Co-speech gestures influence the magnitude and stability of articulatory movements: Evidence for coupling-based enhancement

doi:10.21203/rs.3.rs-5073434/v1

2024 · doi:10.21203/rs.3.rs-5073434/v1

preprint OA: closed

Karee Garvin, Eliana Spradling, Kathryn Franich

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5073434/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 02 Jan, 2025 in Scientific Reports. Version 1 posted 10

Abstract

Humans rarely speak without producing co-speech gestures of the hands, head, and other parts of the body. Co-speech gestures are also highly restricted in how they are timed with speech, typically synchronizing with prosodically-prominent syllables. What functional principles underlie this relationship?
Here, we examine how the production of co-speech manual gestures influences spatiotemporal patterns of the oral articulators during speech production. We provide novel evidence that co-speech gestures induce more extreme tongue and jaw displacement and that they contribute to greater temporal stability of oral articulatory movements. This effect, which we term coupling enhancement, differs from stress-based hyperarticulation in that differences in articulatory magnitude are not vowel-specific in their patterning. Speech and gesture synergies therefore constitute an independent variable to consider when modeling the effects of prosodic prominence on articulatory patterns. Our results are consistent with work in language acquisition and speech-motor control suggesting that synchronizing speech to gesture can entrain acoustic prominence.

Subject areas: Biological sciences/Developmental biology; Biological sciences/Evolution/Evolutionary developmental biology; Biological sciences/Neuroscience/Motor control; Biological sciences/Psychology/Human behaviour

Keywords: Speech; Co-speech gestures; Articulation; Prosody; Speech-motor coupling

1. Introduction

Co-speech gestures (movements of the hands, arms, head, eyebrows, and other parts of the body that accompany speech) are ubiquitous in human language use. Although gestures have been shown to have a facilitative effect on speech perception and language processing [1, 2, 3], gestures occur even when communication is not face-to-face [4], suggesting their functional role is not limited to aiding the perceiver. Indeed, evidence indicates that even congenitally blind speakers utilize co-speech gestures, suggesting that visual input may not even be a precursor to gesture acquisition [5].

A body of recent research has suggested that co-speech gestures may play a more integral role in speech production.
For example, co-speech gestures tend to occur synchronously with pitch-accented syllables in speech, suggesting that the planning of gesture and speech is closely linked [6, 7, 8, 9, 10]. Furthermore, speech tends to be more fluent when accompanied by co-speech gestures in individuals who stutter [11] as well as individuals with aphasia of speech [12]. Speaking while gesturing appears to bear some similarities to speaking with an external timekeeper (like a metronome), the key similarity being the act of synchronizing speech with another system [13, 14]. Despite the intriguing links between gesture and stability of speech, little work has sought to directly investigate how synchronization between speech and co-speech gesture may influence speech production. Insights from motor control research provide some hints as to the nature of this relationship. A long line of research into movement dynamics in both humans and non-human animals has shown that movements that are coupled in-phase, or synchronously, with other movements are relatively larger and more stable than asynchronous or uncoupled movements [15, 16, 17, 18, 19, 20]. For example, in non-speech motor tasks such as interlimb coordination, synchronous coordination has been shown to result in higher movement amplitude and greater timing stability of arm movements [21]. Increased amplitude and stability of synchronized movement is observed even when coordinating movement of the limbs to an external stimulus, such as a metronome [22]. In these studies, the positive relationship between movement amplitude and stability is striking given that larger movements are generally found to involve greater variability in timing [23, 24]. These studies provide evidence that coupled limb movements are subject to greater stability under a limited set of coordinative regimes: most notably, in-phase coupling [19, 17]. 
In the speech domain, there is also evidence that synchronization leads to greater temporal stability of movements. For example, the Coupled Oscillator Model of Articulatory Phonology proposes that the more stable timing patterns observed between syllable onsets and vowels (vs. syllable codas and vowels) are due to synchronous planning of articulatory movements [25, 26, 27]. Coordinative stability between onsets and vowels has been demonstrated for a number of languages [28, 29, 30]. In a study of metronome-timed speech [31], speakers of languages with different prosodic profiles demonstrated increased duration of metronome-synchronized syllables (as opposed to those produced on the offbeat of the metronome), as well as reduced variability in durations for synchronized syllables. Research has also shown that speech synchronized between speakers (‘joint speech’) is relatively less temporally variable than speech spoken alone [32]. Taken together, these findings suggest that speech movement is conditioned by the same principles of coupling enhancement shown in limb movement.

There has been comparatively little investigation of the effects of co-speech gestures on speech itself. One study [33] had participants utter short sentences, manipulating where participants were to produce co-speech gestures within the sentence, and where the nuclear pitch accent was to be produced in the sentence. Thus, in a Dutch sentence like Amanda gaat naar Malta (‘Amanda goes to Malta’), the relevant pitch accent would be produced on either the word Amanda or the word Malta (aligning with the lexically stressed syllable, -man- or Mal-), and a beat gesture was to be produced either concurrently with the pitch accent (‘congruent’ conditions), or on the opposite word from the pitch accent (‘incongruent’ conditions).
The study found that participants produced longer acoustic durations and lower second formant frequencies on syllables where a gesture was present, even in incongruent conditions, where there was no accompanying pitch accent. Although these results suggest a clear effect of gesture on speech, the authors note that the act of pairing a gesture with an unaccented syllable in the incongruent condition may have been unnatural, leading participants to produce a prominence where one was not intended. Thus, this study leaves open two possibilities for the source of gesture’s effect on speech: (a) prosodic enhancement of unaccented syllables, or (b) a biomechanical effect of gestural-speech coupling on syllable duration.

1.2. Research questions and hypotheses

In this paper, we investigate how the presence vs. absence of a temporally-synchronized co-speech gesture (CSG) can influence the articulation of speech. We utilize electromagnetic articulography (EMA) to measure the position and velocities of the oral articulators of speakers of English to examine how the presence of a co-speech gesture influences both the magnitude and timing of articulatory movements. We likewise explore how increased magnitude and temporal stability manifest in the context of synchronization to tease apart the role of gesture from speech prosody.

Literature on enhancement effects in speech production has proposed two main mechanisms through which increased articulatory magnitude could be realized: (i) hyperarticulation, whereby articulatory movements are made more extreme in vowel-specific ways: front vowels become fronter, back vowels become backer, low vowels become lower, and high vowels become higher [34, 35]; and (ii) sonority expansion, whereby articulators like the tongue and jaw will be realized with more extreme downward positions in the presence of a gesture [34].
In short, hyperarticulation serves to increase the vowel space (in all directions) through the enhancement of articulatory movements, whereas sonority expansion serves to increase the overall openness of the oral cavity, and thus does not predict any vowel-specific effects of enhancement on articulatory movement direction. We investigate how coupling impacts speech magnitude by comparing these two dimensions of increased gestural magnitude. There is overwhelming evidence from both speech production and perception that accentual prominence in English is associated with at least some level of hyperarticulation of vowels [35, 36, 37, 38]. Thus, if gestures were found to induce articulatory enhancement without hyperarticulation, this would suggest that gesture-based enhancement effects are independent of those related to prosodic prominence more generally.

To provide a baseline for our hypotheses, we first investigate two primary aspects of our data to confirm that our results replicate the essential findings from key prior studies: (i) co-speech gestures are timed to stressed/pitch-accented syllables (h1), and (ii) stressed syllables exhibit effects of hyperarticulation when compared to unstressed syllables (h2). We formalize these and our main hypotheses (h3-h4) as follows:

h1: Co-speech gestures (CSGs) will be timed to the pitch peak of the stressed syllable.

h2: Tongue gestures will have more extreme target achievements in stressed syllables compared to unstressed syllables, with high vowels realized with higher tongue positions, low vowels with lower tongue positions, back vowels with more back tongue positions, and front vowels with more front tongue positions (consistent with hyperarticulation).

h3: Coupling with CSG will lead to increased magnitude of movement of oral articulators.
h3a (hyperarticulation): The direction of the effect will differ between vowels, with high vowels realized with higher tongue positions, low vowels with lower tongue positions, back vowels with more back tongue positions, and front vowels with more front tongue positions.

h3b (sonority expansion): The direction of the effect is consistent across vowel types, where the oral articulators have a more open posture for /i, o, a/, regardless of vowel quality.

h4: Coupling with CSG leads to increased temporal stability of movement between oral articulators.

To test these predictions, we analyze productions of CVCV tokens with alternating stress on either the first syllable (initial stress) or the second syllable (final stress), where each stress condition was produced with (CSG condition) and without (no-CSG condition) an accompanying CSG. To investigate the coordinative and temporal properties of these utterances, we analyze the movement of the tongue tip (TT), tongue body (TB), and jaw (JW) for speech articulation and the CSG apex in co-speech gesture articulation, where target achievement for both oral articulations and CSGs was defined as the point of minimum speed/maximum displacement of the articulator (details on this method are provided in §4.4.2 and §4.4.3, respectively). All methods were performed in accordance with relevant guidelines and regulations.

2. Results

2.1 Replication of previous findings

Prior to presenting results of our study, we establish that overall patterns for co-speech gesture timing and stress-based enhancement are similar in our study to those that have been found in previous work. First, co-speech gesture apexes have been shown to correlate in time with the pitch peak of stressed/pitch-accented syllables [8, 39, 40]. In our own study, we find that the apex is timed to the target word (e.g. ‘speebee’) in 100% of utterances, and timed to the stressed syllable within this word in 80% of tokens.
Furthermore, we conducted a Pearson correlation test between the timing of gesture apex and pitch peak, which shows a strong correlation between stressed syllable peak f0 and apex timing, as shown in Fig. 1 (r(1751) = .89, p < 0.001). This result confirms h1, replicating the finding of existing literature that co-speech gestures are generally timed to pitch extrema in stressed syllables.

Second, in English, stressed syllables have been shown to have more extreme tongue displacement compared to unstressed syllables. To test for these effects, we compared the position of the TB during target achievement of stressed and unstressed syllables in both the vertical (y) and horizontal (x) dimensions. A linear mixed effects model indicated a significant effect of both stress (vertical direction: β = 3.32, t = 12.50, p < 0.001; horizontal direction: β = 1.48, t = 6.78, p < 0.001) and vowel quality (vertical direction, /i/ vs. /a/: β = 14.06, t = 77.65, p < .001; horizontal direction, /i/ vs. /a/: β = 8.34, t = 52.97, p < .001) in predicting target achievement values, as expected. The model also revealed a significant interaction between these two predictors (vertical direction: β = -3.63, t = -13.01, p < 0.001; horizontal direction: β = -1.77, t = -7.28, p < 0.001), consistent with vowel-specific hyperarticulation. As seen in Fig. 2, stressed vowels were realized with both lower and backer tongue positions for /a/ and /o/, but higher and fronter tongue positions for /i/, consistent with hyperarticulation, per h2.

2.2 Effect of co-speech gesture on oral articulator magnitude

2.2.1 Hyperarticulation vs sonority expansion

We now examine how the presence of a co-speech gesture may influence speech articulations, considering first the magnitude of speech gestures from the standpoint of both hyperarticulation and sonority expansion.
To evaluate these effects, we used a series of GAMM analyses of TB vertical and horizontal displacement across each of the vowel contrasts included in the study, /i, o, a/. The use of GAMMs allows us to examine not only the overall displacement of the tongue across conditions, but also how these effects may vary over time in the articulation of the target word. Turning first to vertical displacement, Fig. 3a,b demonstrates that for all vowels in the final stress condition and vowels /i/ and /o/ in the initial stress condition, the TB is significantly lower across vowel types in the presence of a CSG. Furthermore, differences between gesture conditions appear to be roughly localized to the site of the CSG. Although we focus on the TB results here because TB is most closely associated with vowel production, we likewise tested TT and TD (tongue dorsum) to ensure that the hyperarticulation effect was not localized to the front or back of the tongue. Our results for the TT and TD are similar to those provided here for the TB. In other words, the direction of the effect was the same for all vowels, consistent with an effect of sonority expansion across the board (hypothesis h3b).

We next analyzed TB horizontal displacement across vowels in both stress conditions. As demonstrated in Fig. 3c,d, there was no significant difference in horizontal displacement for any vowel type for either stress condition, though displacement values were numerically in the same direction across vowels. Results were comparable for TT and TD. Together, these results support h3b, consistent with sonority expansion, and we find no support for h3a, the hyperarticulation hypothesis.
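The logic separating the two hypotheses can be made concrete with a small sketch. It assumes per-vowel vertical TB shifts coded as CSG minus no-CSG, with negative values meaning a lower tongue; the function name, the sign table, and the numbers are illustrative assumptions, not values or code from the study.

```python
from typing import Dict

# Expected direction of vertical tongue-body change under hyperarticulation:
# +1 = higher tongue, -1 = lower tongue. Assumed coding: /i/ raises, while
# /o/ and /a/ lower, mirroring the stressed-vs-unstressed pattern in 2.1.
HYPERARTICULATION_SIGN = {"i": +1, "o": -1, "a": -1}


def classify_pattern(delta_y: Dict[str, float]) -> str:
    """Classify per-vowel vertical shifts (CSG minus no-CSG).

    Negative values mean the tongue body is lower when a gesture is present.
    """
    # Sonority expansion: every vowel shifts downward (more open posture).
    if all(d < 0 for d in delta_y.values()):
        return "sonority expansion"
    # Hyperarticulation: each vowel shifts in its own vowel-specific direction.
    if all(d * HYPERARTICULATION_SIGN[v] > 0 for v, d in delta_y.items()):
        return "hyperarticulation"
    return "mixed"


# Made-up shifts (arbitrary units), for illustration only.
uniform_lowering = {"i": -1.2, "o": -0.8, "a": -1.5}
vowel_specific = {"i": 0.9, "o": -0.7, "a": -1.1}

print(classify_pattern(uniform_lowering))
print(classify_pattern(vowel_specific))
```

The key diagnostic is /i/: under sonority expansion it lowers along with the other vowels, whereas under hyperarticulation it would be expected to rise.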
2.2.2 Effects of co-speech gesture on gesture magnitude across oral articulators

In line with prior work on coupling-based enhancement, results from our own data presented in §2.2.1 suggest that the vertical movements of the tongue are enhanced for target words when produced in time with a co-speech gesture, regardless of vowel quality. To further examine the effect of CSG coupling on articulatory gesture enhancement, we analyzed both the acoustic duration of stressed vowels (pooled across all vowel qualities) in target words, as well as the magnitude of movement of various oral articulators in the presence vs. absence of a CSG.

Our results reveal increased acoustic duration of stressed syllables when coordinated with a co-speech gesture. A linear mixed effects model of acoustic duration illustrates that the effect of gesture was significant (β = -2.72e-02, t = -4.57, p < 0.001) as was the interaction between gesture and stress, with final-stress tokens showing a larger effect of gesture presence (β = -1.972e-02, t = 43.691, p < 0.001). These results demonstrate acoustic enhancement of vowels, where stressed vowels are longer when coordinated with a CSG in both stress conditions, consistent with [33].

We also analyzed the kinematic enhancement of tongue sensor trajectories using a series of GAMM analyses of vertical displacement and velocity. For final stress tokens (Fig. 4a,b), the results reveal greater displacement of both the TT and TB, with the significant difference in gesture magnitude between CSG and no-CSG conditions extending throughout the target word for the TT, and timed at the locus of the stressed syllable and the point of maximum displacement in the TB. In other words, maximum displacement of both the TT and TB during the stressed vowel was greater when the target word was coordinated with a CSG.
As expected based on prior work [41], increases in gesture magnitude for stressed syllables were accompanied by increases in articulatory velocity in the CSG condition. For the TB, downward movement in preparation for the stressed vowel target and the closure movement during the stressed syllable both occur at higher velocities when accompanied by a CSG. In the TT gesture, closure movement of the tongue following maximum displacement of the stressed vowel occurs later in the CSG condition.

Results in the initial stress condition were more modest, yet still exhibit several effects of coupling enhancement. Namely, Fig. 4c,d demonstrates that TB vertical displacement at the onset of the stressed syllable was significantly greater and the TT gesture was significantly longer when speech was accompanied by a CSG. Nevertheless, there was no significant difference in TT vertical displacement, and the TB velocity patterned opposite to the predicted direction of effect; we return to a discussion of these effects and differences between stress conditions in §4. Together, these results demonstrate that coupling of speech with CSG leads to enhancement effects on oral gestures, consistent with studies on synchronized limb movement [21, 22].

Despite significant differences in the magnitude of the TT and TB gestures across subjects, GAMM analyses of JW gestures across subjects revealed no significant difference in JW displacement or velocity in either stress condition. We return to this finding in light of individual variability in JW displacement in §2.2.3.

2.2.3 Individual differences in jaw magnitude enhancement

The asymmetry in results between the tongue and the jaw was surprising; however, prior work has demonstrated individual differences in the extent to which speakers move their jaw while speaking due to anatomical differences between speakers [42]. Accordingly, we analyzed individual differences in JW vertical displacement.
The analyses revealed that some subjects do indeed show a significant difference in vertical JW displacement, with greater displacement in the CSG condition (Table 1). Importantly, there was an implicational relationship between vertical tongue movement and JW movement: if a participant had greater JW displacement in the CSG condition, they also had greater tongue displacement in this condition; yet the reverse was not true. These results are consistent with individual differences found in [42], suggesting that some speakers make greater use of the tongue than the jaw in vowel articulations.

Table 1. By-subject comparison of magnitude effect of CSG on tongue (TT and/or TB) and jaw displacement in analyses. Checkmark indicates a significantly lower articulator position (vertical displacement) during the stressed syllable in the gesture condition compared to the no gesture condition; n.s. indicates no significant difference; X indicates a significant difference in the opposite direction, where the articulator was significantly higher during the stressed syllable in the gesture condition versus the no gesture condition.

2.3 Effect of co-speech gesture on temporal stability

In addition to greater movement magnitude, studies on coupling enhancement have shown greater temporal stability of synchronized vs. unsynchronized movement [15, 21, 22, 31]. Specifically for speech, the Coupled Oscillator Model of speech production proposes that temporal stability in CV sequences in English is the result of synchronized planning between the C and V gestures [e.g., 25, 26, 27, 28, 29]. Thus, to analyze the effects of temporal stability in the presence of a CSG, we begin by analyzing the temporal stability in target achievement in the stressed syllable CV sequence (CV stability), where the articulator for target achievement depends on the quality of the segment, as detailed in §4.4.2.
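CV stability in this section is indexed by relative standard deviation (RSD). As a minimal sketch, assuming RSD is the sample standard deviation of the consonant-to-vowel target lag divided by its mean (the exact computation is not spelled out in this section), and using made-up lag values rather than data from the study:

```python
import statistics
from typing import Sequence


def relative_sd(lags_ms: Sequence[float]) -> float:
    """Relative standard deviation: sample SD divided by the absolute mean.

    Lower values indicate more stable (less variable) timing.
    """
    mean = statistics.mean(lags_ms)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean lag is zero")
    return statistics.stdev(lags_ms) / abs(mean)


# Hypothetical C-to-V target lags in milliseconds, for illustration only.
lags_with_csg = [95.0, 100.0, 105.0]
lags_without_csg = [90.0, 100.0, 110.0]

print(relative_sd(lags_with_csg))
print(relative_sd(lags_without_csg))
```

Because RSD scales variability by the mean, it allows timing stability to be compared across conditions whose average lags differ, which matters here given that syllables are longer in the gesture condition.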
The results reveal greater stability, as indexed by relative standard deviation (RSD), in the temporal coordination of targets in the CV sequence when the target word is produced with a CSG compared to targets not produced with a CSG, as demonstrated in Fig. 5. A linear mixed effects model demonstrates that this effect of gesture is significant (β = 4.174e-02, t = -2.84, p < 0.05), as was the interaction between gesture and stress position (β = -3.994e-02, t = -4.659, p < 0.001). These results demonstrate that syllables synchronized with a CSG are more temporally stable than those unsynchronized to a gesture.

We likewise analyzed temporal stability in synchronization between articulators in producing the stressed vowel (within-segment stability). Both the tongue and the jaw move downward during the production of a stressed vowel; thus, within-segment stability can be understood as how closely the timing of maximum displacement is synchronized between the TB and JW during stressed vowel production. Fig. 6 demonstrates that synchronization between TB and JW is greater in the presence of a CSG compared to tokens produced without a CSG. A linear mixed effects model of TB-JW lag reveals that this effect is significant across stress conditions (β = 2.842e-02, t = 6.782, p < 0.001).

As mentioned in §1, studies suggest a positive relationship between the strength of temporal coupling of movements and the magnitude of movement [15, 21, 22, 31]. We test this hypothesis in our data by analyzing the correlation between the temporal lag, i.e., synchronization, between TB and JW movements (within-segment stability), and the magnitude of TB vertical displacement. We also examined the correlation between the lag in timing between gesture apex and TB and the magnitude of TB vertical displacement. A Pearson correlation test demonstrates a significant positive correlation between TB-JW lag and TB displacement (Fig.
7a) in both stress conditions (Final Stress: r(660) = .22, p < 0.001; Initial Stress: r(768) = .19, p < 0.001). A stronger correlation was observed between TB-Apex lag and TB displacement (Fig. 7b) in the final stress condition (Final Stress: r(660) = .48, p 0.05). Overall, these results demonstrate that as stability increases, magnitude also increases. We return to differences across stress conditions and explore possible sources of these differences in §4.

3. Discussion

Co-speech gestures are ubiquitous in naturalistic communication. Although they are known to aid in perception of speech and in conveying semantic and pragmatic information, our study provides evidence that the facilitative role of gestures extends to speech production. Specifically, our results suggest that speech movements timed to co-speech gestures show enhanced magnitude and temporal stability. Consistent with h3, across all subjects in our sample, vowel durations were longer and tongue displacement was lower for speech produced synchronously with a manual co-speech gesture than speech produced without a gesture. Most subjects likewise achieved lower JW positions for target words in utterances containing a gesture.

Complementing the increase in magnitude found in the gesture condition, we also find evidence that articulatory movements were less variable when produced with co-speech gestures, consistent with h4. First, we found relatively lower variability in consonant-vowel timing in the presence of a gesture, particularly in the final stress condition. Second, within stressed vowels (the syllables to which gestures were most closely timed in our data), the lag between target achievement of TB and JW movements was found to be lower overall in the CSG condition. Furthermore, we find a direct link between stability of timing to the CSG and magnitude of oral articulatory gestures, as tighter timing between TB and CSG correlated positively with greater movement of the TB in words with final stress.
Taken together, our results display the trademarks of coupling-based movement enhancement and stabilization that have been described within several domains of motor control [15, 18, 21, 22, 25, 26, 31]. Our results suggest that by producing speech with gestures, speakers are, in effect, stabilizing their own articulatory movements and enhancing coordination between those movements. This facilitative effect is likely a motivating factor behind the pervasive use of gestures even when they do not play a direct role in facilitating speech perception. Furthermore, the kinds of facilitative effects observed in co-speech gesture coupling suggest that gestures may play an active role in enhancing fluency in a variety of speech and language disorders [11, 12, 43] as well as in L2 learning [44].

Our findings on the effects of CSG on the magnitude of speech gestures also shed light on the relationship between co-speech gestures and speech prosody. While our results demonstrate an effect of hyperarticulation in comparing stressed and unstressed syllables (§2.1), the effects of gesture on speech are distinct from those of stress: we find a uniformly lower tongue position across vowel types in the presence of a co-speech gesture, consistent only with sonority expansion effects. Meanwhile, we find no evidence to support an added hyperarticulation effect in the presence of a co-speech gesture. Thus, it does not appear that gesture presence leads to an increase in the magnitude of stress correlates in speech. Our results also provide counterevidence to the idea that speech and gesture serve as redundant and complementary cues to prosodic prominence in the way that, e.g., voice onset time, vowel duration, and fundamental frequency complement one another in cuing voicing contrasts in some languages [45]; if this were the case, we would hypothesize that oral gestures should be less extreme in the context of a CSG.
However, we find no evidence of a trading relation between speech and co-speech gesture in marking prominence in our data. Instead, it appears that gestures are inducing a biomechanical effect on articulatory gestural magnitude, which does not implicate phonology or prosody directly [cf. 31]. We term this biomechanical effect coupling enhancement.

Our findings also point to new avenues for future research on the relationship between speech articulation, gesture coordination, and rhythm in language. Our findings indicated that the effects of co-speech gestures on speech dynamics were much stronger for words with final stress than for words with initial stress. Though we cannot say definitively why this difference arose across stress conditions, we look to research on the timing of articulatory gestures at syllable boundaries for some clues. Disyllabic words with initial stress have been shown to display greater variability of timing in tongue movements for medial consonants, as medial consonantal gestures (which would normally be expected to serve as onsets to the final syllable) tend to show an attraction to the preceding stressed syllable [46, 47]. Variability of this magnitude is not observed with disyllabic final stress words. We hypothesize that this variability in coordination reduces the effect of CSG on magnitude and stability for initial stress tokens in our study.

Finally, we believe our findings can help to bridge a gap between theories of speech-motor coordination and prosody. Prior work has proposed that speech and CSG are jointly controlled by the same prosodic planning mechanism [48, 49]. For example, [49] find that both speech articulatory gestures and co-speech gestures display differences in duration as a function of prosodic prominence. They propose that the same clock-slowing mechanism for inducing longer durations on speech articulations under prominence may also operate over co-speech gestures.
Our own results suggest that coupling speech to gesture has effects that are independent of prosodic effects. However, the effects we observe in our data could act as a precursor to such a prosodic control mechanism. Specifically, the duration-enhancing effect of gesture on speech coupling could be grammaticalized and applied predictably to different prosodic environments, similar to the grammaticalization of lexical tone based on f0 perturbations over time [50]. Indeed, the idea that gesture can entrain articulatory and acoustic cues to prosodic prominence has been postulated on an even shorter timeline, as young children have been shown to employ prosodically appropriate co-speech gestures before they learn to produce the acoustic cues to prominence that accompany such gestures in adult speech [51]. If this hypothesis proves to be correct, it will have important implications for the study of prosody and language typology. Concretely, given cross-linguistic variability in the timing between co-speech gestures and speech, we may expect to see correlations between gesture timing strategies and prosodic patterns across the world’s languages. A better understanding of gesture as a potential phonetic precursor may help to shed light on patterns of sound change and the evolutionary relationship between speech and co-speech gesture.

4. Methods

4.1 Participants

Prior approval from the Institutional Review Board (IRB) was obtained for this study and appropriate protocols were followed for data collection. All methods were performed in accordance with relevant guidelines and regulations. This included obtaining participants' written informed consent to participate in the study. Ten subjects (1 male, 8 female, 1 unspecified) participated in the study. Participants consisted of students and young professionals recruited in the greater Boston, Massachusetts area; ages ranged from 20 to 40.
All participants reported US English as their native language and none of the participants reported any history of speech/language or vision impairment.

4.2 Materials

Stimuli consisted of nonce words of the shape CVbV, controlling for vowel quality (/i, o, a~ə/), initial consonant (/s, p, l/), and stress (initial vs. final, e.g. /sóbo/ or /sobó/). Vowels were chosen to mimic stressed and unstressed vowels in English while sampling a range of the total vowel space. As will be detailed below, target words were produced in one of two conditions: with or without a concurrent co-speech gesture. Examples of the target words are provided in Table 2. All target words were produced in the carrier phrase "I saw the ____ today", to situate the target word in the environment V#CVCV#C and to maximize clarity in articulatory data parsing. The task was blocked by stress condition and by co-speech gesture condition, with the order of blocks randomized by participant. Half of participants produced the gesture block followed by the non-gesture block and the other half produced the non-gesture block followed by the gesture block. Each token was repeated 6 times throughout the task block and the order of tokens was randomized within the task block. A total of 216 tokens per subject were produced (9 word shapes x 2 stress conditions x 2 gesture conditions x 6 repetitions).

Table 2: Target word shapes included in the study, where IPA transcripts are provided on the first line and orthographic transcripts seen by participants are included on the second line.

| | /s/ | /p/ | /l/ |
| /i/ | /síbi, sibí/ | /píbi, pibí/ | /líbi, libí/ |
| /o/ | /sóbo, sobó/ | /póbo, pobó/ | /lóbo, lobó/ |
| /a~ə/ | /sábə, səbá/ | /pábə, pəbá/ | /lábə, ləbá/ |

4.3 Procedure

Stimuli were displayed using OpenSesame software [52] on a computer monitor at a comfortable distance from each participant.
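The token count given in §4.2 can be checked with a short enumeration. The following is a minimal Python sketch, not the authors' OpenSesame configuration: the names `build_block` and `build_session` are hypothetical, the label "a" stands in for the /a~ə/ pair, and the per-participant randomization of block order is left out.

```python
from itertools import product
import random

ONSETS = ["s", "p", "l"]
VOWELS = ["i", "o", "a"]          # "a" stands in for the /a~ə/ pair (assumption)
STRESS = ["initial", "final"]
GESTURE = ["gesture", "no_gesture"]
REPETITIONS = 6


def build_block(stress, gesture, seed=0):
    """Return the shuffled token list for one stress x gesture block."""
    tokens = [
        {"onset": c, "vowel": v, "stress": stress, "gesture": gesture, "rep": r}
        for c, v, r in product(ONSETS, VOWELS, range(1, REPETITIONS + 1))
    ]
    # Token order is randomized within each block, as described in the text.
    random.Random(seed).shuffle(tokens)
    return tokens


def build_session(seed=0):
    """Concatenate the four blocks (block order is NOT randomized here)."""
    session = []
    for i, (stress, gesture) in enumerate(product(STRESS, GESTURE)):
        session.extend(build_block(stress, gesture, seed=seed + i))
    return session


if __name__ == "__main__":
    # 9 word shapes x 2 stress x 2 gesture x 6 repetitions
    print(len(build_session()))
```

Each block holds 9 word shapes x 6 repetitions = 54 tokens, and the four stress-by-gesture blocks together give the 216 tokens per subject reported above.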
A member of the research team manually progressed through each experimental item to ensure that each token was produced correctly before moving on to the next item. If the participant made a speech error, the researcher prompted the participant to try again and the target utterance was immediately repeated. Each task block contained a short set of instructions that told the participant what to expect during the task and provided an example sentence to illustrate stress production. The instructions for each task block were read aloud by a researcher. Sample sentences on instruction screens contained words that replicated the same stress construction, but with phones that were not present in the stimulus set. In order to ensure that the prosodic context was similar between gesture and non-gesture conditions, for all items, participants were given the same prompt, shown in Ex. 1.

(1) Imagine you’re traveling in a foreign land and see a famous landmark while out exploring. You run into a friend in your travels and excitedly tell them about your experience.

Prior to the start of the gesture condition experiment block, participants were presented with a single demonstration video of a researcher naturally producing a sample sentence with a bimanual co-speech gesture produced synchronously with the target word. Participants were asked to model their gesture after the video, and to use both hands for the gesture, but were not explicitly told to copy all aspects of the model gesture. Participants were also not explicitly told when they should begin or end the gesturing within the spoken sentence. Participants were only instructed to produce a manual gesture as they read each sentence aloud.

4.4 Data collection

4.4.1 Acoustic data collection

The primary acoustic data used for acoustic analyses in this study were collected using a Rode NTG2 shotgun microphone. The microphone was attached to a boom arm mounted to the desk and positioned above the participant’s head.
This data is time-aligned with the EMA data files; for both the primary acoustic data and the EMA data, each utterance was saved individually. However, the video data was not time-aligned to the primary acoustic data and EMA recordings; thus, secondary acoustic data was also recorded for the video data in order to time-align the two data sets, as discussed in §4.5.1. Secondary acoustic data time-aligned to the video was recorded using a Zoom Q8 Handy Video Recorder microphone.

4.4.2 EMA data collection

EMA data was collected using the NDI Wave system, a point-tracking system with accuracy within approximately 0.5 mm [53] and a sampling rate of 100 Hz. NDI Wave 5 Degree of Freedom (5DoF) sensors were attached to the face and mouth to capture movement of the articulators. Reference sensors were used to track the position and movement of the head in order to correct articulatory data for head movement, as discussed in more detail in §4.5.2. Reference sensors consisted of five NDI Wave 5DoF sensors. Three 5DoF reference sensors were placed on the right mastoid (RMA), left mastoid (LMA) and on the bridge of the nose (NAS), as illustrated in Fig. (8a); these sensors remained in place for the entirety of the experiment. Two additional sensors were used to record the occlusal plane of each participant. Sensors were attached along the sagittal midline to a wax bite plate with one sensor aligned with the front incisor (OS) and the other aligned with the back molar (MS), as illustrated in Fig. (8b). Prior to the presentation of the stimuli, participants were asked to hold still with the bite plate between their teeth for five seconds while a recording was made of their position. The bite plate was then removed and set aside for the remainder of the experiment. Movement of the oral articulators was tracked using six 5DoF sensors attached to the incisors, lips, and tongue, as shown in Fig. 9.
Sensors were attached to the upper and lower lips near the vermilion border, lower jaw to the gums below the lower incisor (JW), and three sensors were attached along the sagittal midline of the tongue at the tongue tip (TT), blade (TB), and dorsum (TD). The front-most sensor (TT) was attached less than 1 cm from the tip of the tongue, the back-most sensor (TD) was attached as far back as was comfortable for the participant, typically 4-6 cm from the tongue tip, and the third sensor (TB) was placed midway between the TT and TD sensors.

The participant was seated in front of the experiment monitor and beside the EMA magnet. A researcher then created a mold of the participant’s bite and reference sensors were attached before collecting the bite plate recording. Next, positions for the oral sensors were marked before gluing to ensure consistency in placement in the case of re-gluing and then attached accordingly. Tongue and jaw sensors were attached using glue and lip sensors were attached using medical-grade tape. Before beginning stimuli collection, the researcher engaged the participant in conversation for several minutes to help them adjust to speaking with the attached sensors.

Throughout the study, one researcher was assigned the task of controlling the experiment program while another was assigned the task of monitoring the sensor tracking to ensure that all sensors were actively tracking. The researcher monitoring the sensor tracking would interrupt the experiment to notify the other researcher of a sensor that began to behave irregularly. In such cases, sensor stability was assessed and sensors were secured as needed.

4.4.3 Video data collection

Video data was collected using a Zoom Q8 Handy Recorder with a 160° wide-angle lens and a 30 fps framerate.
The camera was mounted directly above the participant’s monitor and angled to ensure it captured the entirety of a participant's manual gestures, from resting position in the participant's lap to full extension during the manual beat gesture. Examples of still images from the video recording are provided in Fig. 10, where (a) demonstrates a no-gesture condition where the hands were kept at rest in the lap and (b) demonstrates a gesture condition at the point of maximum extension during the beat gesture production. Each task block comprised its own video recording. Informed consent was obtained from all subjects and/or their legal guardian(s) for publication of identifying information/images in an online open-access publication; however, out of consideration for the participant, we obscured the eyes to protect their privacy.

4.5 Data processing

4.5.1 Acoustic data processing

Primary acoustic data was force-aligned using the Montreal Forced Aligner (MFA) [54]. Pitch maxima were extracted from each vowel in target words using a script for Praat [55]. Primary acoustic data, which was time-aligned to the EMA recordings, and secondary acoustic data, which was time-aligned to the video data, were merged using the Python tool audalign, which identifies similarities across audio files in order to allow for time alignment of signals [56]. The alignment between primary and secondary acoustic data was reviewed by the researchers to identify and correct any misalignments between the two data sets. Corrections were made by identifying a unique acoustic landmark within the utterance that could be used to synchronize the two recordings.

4.5.2 EMA data processing

The EMA data was rotated along the mid-sagittal plane and head movements were corrected for in Python [57] using the reference and bite plate sensors [58], such that the origin of the spatial coordinates corresponds to the front teeth.
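To make the coordinate convention concrete, the sketch below shows one way such a re-referencing could look in two dimensions. It is a simplified illustration and not the procedure of [58]: it assumes the data have already been reduced to the mid-sagittal plane, it uses a single static bite-plate recording rather than frame-by-frame reference sensors, and the function name `to_occlusal_frame` is hypothetical.

```python
import numpy as np


def to_occlusal_frame(points, os_xy, ms_xy):
    """Re-express 2D mid-sagittal points in a bite-plane coordinate frame.

    points : (n, 2) array-like of sensor positions (x, y)
    os_xy  : position of the front-incisor bite-plate sensor (new origin)
    ms_xy  : position of the molar bite-plate sensor (fixes the x axis)

    After the transform the incisor sensor sits at (0, 0) and the molar
    sensor lies on the negative x axis, so x increases toward the front.
    """
    points = np.asarray(points, dtype=float)
    os_xy = np.asarray(os_xy, dtype=float)
    ms_xy = np.asarray(ms_xy, dtype=float)

    # Direction of the occlusal plane, pointing from molar to incisor.
    dx, dy = os_xy - ms_xy
    angle = np.arctan2(dy, dx)

    # Rotation by -angle maps the occlusal direction onto the +x axis.
    c, s = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[c, -s], [s, c]])

    # Translate so the incisor sensor is the origin, then rotate each row.
    return (points - os_xy) @ rotation.T


if __name__ == "__main__":
    demo = to_occlusal_frame([[0.0, 0.0], [1.0, 1.0]], os_xy=(1.0, 1.0), ms_xy=(0.0, 0.0))
    print(np.round(demo, 3))
```

In a full pipeline the same idea would be applied per sample in three dimensions, with the mastoid and nasion reference sensors supplying the head pose for each frame.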
All articulatory trajectories were smoothed using Garcia’s robust smoothing algorithm [59]. All EMA data was visually inspected by a research assistant to ensure that no recordings containing major tracking errors or detached sensors were included in the data analysis. Major errors in the recording were identified by checking that all movements were consistent with possible movement of the articulators. Recordings that included any major errors in tracking were removed from the dataset prior to analysis. Following a similar procedure to that implemented in [60], relevant gestural targets, including the point of minimum and maximum displacement in the horizontal and vertical plane, as well as local x and y speed minima, were automatically identified in Python using a window determined by the acoustic signal. The articulator used to determine target achievement differs depending on the segment quality and the kinematic profile of the utterance. Target identification proceeded hierarchically based on the articulatory parameters of the segment, where if a target could not be detected on the basis of the first articulator, a target was assessed using the second and finally third articulators, as detailed in Table 3. Typically, the secondary and tertiary levels were only needed in the case of unstressed vowels, which are not the focus of this study.

Table 3: Articulators used in determining target achievement for a given segment. The secondary tier of the table was only used if no target could be identified from the primary level, and the tertiary tier was only used if no target could be identified from either the primary or the secondary level.
segment: b p l s a~ə i o
primary: LL/UL LL/UL TT TT JW JW JW
secondary: – TB TB TB TB TB
tertiary: TT LL/UL

4.5.3 Video data processing

Co-speech gestures were coded by a team of researchers trained in gesture coding using ELAN [61] following the MIT Gesture Studies Coding Manual [62], which outlines several phases of the gesture including preparations, strokes, holds, and recoveries based on [7]. The apex of the gesture was automatically extracted based on manual annotations of gesture strokes using MultiPose [63], which uses MediaPipe [64] to track pixel movement in the video recording to extract a set of xy coordinates for a given articulator. In this study, we identified the right wrist as the most stable articulator for identifying the apex of the co-speech gesture (though there was one left-handed participant in the sample, he produced all gestures with both hands). The MultiPose workflow can identify several key kinematic landmarks for a given articulator. In this study, we defined the CSG apex as the xy speed minimum, which closely corresponds to the point of maximum extension.

4.6 Statistical analysis

The data was analyzed in R [65] using a combination of Pearson Correlation Testing, Linear Mixed Effects Models (lmer) [66], and Generalized Additive Mixed Models (GAMM) [67].

4.6.1 Linear mixed effects models

Linear mixed effects (lme) models were used to evaluate differences in displacement of oral articulators at specified timepoints and duration of vowels in target words. We used an lme model to assess the interaction of stress and vowel quality on TB vertical and horizontal displacement at the point of achievement during production of stressed and unstressed vowels (§2.1). We likewise used lme models to predict the interactional effect of stress and gesture on a number of variables including stressed vowel duration (§2.2.2), CV stability (§2.3) and synchronization between TB and JW within the stressed vowel (§2.3).
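The xy speed minimum used for the CSG apex in §4.5.3 (and, per §4.5.2, for articulatory target achievement) can be illustrated with a few lines of Python. This is a minimal sketch rather than the MultiPose or authors' implementation: the function name `speed_minimum_time` is hypothetical, the search window is assumed to come from the stroke annotation or the acoustic segmentation, and the global minimum inside that window is taken as a stand-in for the local minimum.

```python
import numpy as np


def speed_minimum_time(x, y, fs, t_start, t_end):
    """Return the time (in seconds) of minimum xy speed within a window.

    x, y    : equally sampled position traces of one tracked point
    fs      : sampling rate in Hz (e.g. 100 for EMA, 30 for video)
    t_start : start of the search window in seconds
    t_end   : end of the search window in seconds
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t = np.arange(len(x)) / fs

    # Tangential speed from finite-difference velocities in x and y.
    vx = np.gradient(x, 1.0 / fs)
    vy = np.gradient(y, 1.0 / fs)
    speed = np.hypot(vx, vy)

    # Restrict the search to samples inside the annotated window.
    idx = np.flatnonzero((t >= t_start) & (t <= t_end))
    if idx.size == 0:
        raise ValueError("search window contains no samples")
    return t[idx[np.argmin(speed[idx])]]


if __name__ == "__main__":
    fs = 100
    t = np.arange(101) / fs
    # Synthetic out-and-back movement that reverses direction at t = 0.5 s.
    x = (t - 0.5) ** 2
    y = np.zeros_like(t)
    print(speed_minimum_time(x, y, fs, t_start=0.2, t_end=0.8))
```

Restricting the search to a window matters because speed is also near zero while the hand or articulator is at rest before and after the movement.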
Subject was included as a random intercept for all lme models. Following [68], models were initially fit with maximal random effects structures, and random slope parameters were removed from a model only if doing so eliminated singularity. The resulting models are illustrated in (2), where X is determined by the analysis as outlined above.

(2) Linear mixed effects model structures
lmer(X ~ stress * vowel_quality + (1 + stress | Subject))
lmer(X ~ stress * gesture + (1 + gesture | Subject))

4.6.2 Generalized Additive Mixed Models

Generalized Additive Mixed Model (GAMM) analyses were implemented using the bam function from the mgcv package to assess differences in displacement and velocity of the oral articulators over time during target word production between the gesture and the no-gesture conditions. All GAMM analyses used the basic model formula provided in (3), where X is either the vertical displacement, horizontal displacement, or vertical velocity of the relevant sensor, and time is normalized such that the onset of the target word corresponds with 0 and the offset of the target word corresponds with 1. These models predict a given articulatory trajectory with gesture presence, smoothed time, and time smoothed by gesture as fixed effects, the random intercepts of subject and gesture, and the random smooths of subject and time. This model gives both the nonlinear and the constant difference between the gesture tasks.

(3) Generalized additive mixed model structure
bam(X ~ gesture + s(time) + s(time, by = gesture, bs = "tp", k = 10) + s(subject, gesture) + s(time, subject))

Across all data in the GAMM analyses, time was normalized to the production of the target word, and articulatory trajectories and velocities were compared in the presence of a CSG (gesture condition) and the absence of a CSG (no-gesture condition).

4.7 Defined measures of analysis

In this section, we define each of the measures used in the analysis of this study.
Apex (AX): the point of minimum speed of the right wrist during the execution of a gesture.
Time of max f0: the time of the pitch peak for the phone coinciding with the gesture apex.
TB vertical displacement: the vertical (y) position of the TB sensor during the target achievement of the vowel.
TB horizontal displacement: the horizontal (x) position of the TB sensor during the target achievement of the vowel.
Stressed vowel duration: acoustic duration of the stressed vowel as defined by the start point and end point of the parsed segment in the forced alignment process.
CV lag Relative Standard Deviation (RSD): standard deviation of the lag divided by the mean of the lag, where lag is the target achievement of the consonant subtracted from the target achievement of the vowel, and target achievement is the local xy speed minimum of a given gesture (as defined in §4.5.2).
JW to TB lag: absolute value of the lag between maximum extension of the TB and JW during stressed vowel production.

Declarations

6. Data accessibility

Data files and scripts for statistical analysis are included with the supplementary files and will be made available in an open-access repository upon acceptance of the paper.

7. Acknowledgements

Withheld for anonymity. To be added upon acceptance.

8. Author contributions

Author 1 and Author 3 conceived of the study and wrote the main manuscript text; all authors contributed to the data collection, processing, and writing the methods section. Author 1 prepared all figures. All authors reviewed the manuscript.

9. Additional information

9.1 Competing interests statement

None declared.

10. Ethics declarations

This research was granted ethics approval by the Institutional Review Board of [WITHHELD FOR ANONYMITY] (Protocol IRB 22-1097).

References

Sueyoshi, A., & Hardison, D. M. The role of gestures and facial cues in second language listening comprehension. Language Learning 55, 661–99 (2005).
Hostetter, Autumn B. When do gestures communicate? A meta-analysis. Psychological Bulletin 137, 297–315 (2011). Goldin-Meadow, S. & Alibali, M. W. Gesture’s role in speaking, learning, and creating language. Annual Review of Psychology 64, 257–83 (2013). Bavelas, J. Gerwing, J. Sutton, Ch. & Prevost, D. Gesturing on the telephone: Independent effects of dialogue and visibility. Journal of Memory and Language 58. 495–520 (2008). Özçalışkan, Ş., Adamson, L.B., Dimitrova, N., & Baumann, S. Early gesture provides a helping hand to spoken vocabulary development for children with autism, down syndrome, and typical development. Journal of Cognition and Development 18, 325–37 (2017). Esteve-Gibert, N., Borràs-Comes, J., Asor, E., Swerts, M., & Prieto, P. The timing of head movements: The role of prosodic heads and edges. The Journal of the Acoustical Society of America. 141, 4727–4739 (2017). Kendon, A. Gesticulation and speech: Two Aspects of the process of utterance in The Relationship of Verbal and Nonverbal Communication. (ed. Key, M. R.) 207–228 (De Gruyter Mouton, 1980). Leonard, T., & Cummins, F. The temporal relation between beat gestures and speech. Language and Cognitive Processes. 26, 1457–1471 (2011). Loehr, D. P. Temporal, structural, and pragmatic synchrony between intonation and gesture. Laboratory Phonology. 3, 71-89 (2012). Rochet-Capellan, A., Laboissière, R., Galván, A., & Schwartz, J.-L. The speech focus position effect on jaw–finger coordination in a pointing task. Journal of speech, language, and hearing research. 51, 1507–1521 (2008). Mayberry, R. I., & Jaques, J. Gesture production during stuttered speech: Insights into the nature of gesture–speech integration in Language and Gesture, (ed. McNeill D.) 199–214 (Cambridge University Press, 2000). Devanga, S. R., & Mathew, M. Exploring the use of co-speech hand gestures as treatment outcome measures for aphasia. Aphasiology. Advance https://doi.org/10.1080/02687038.2024.2356287 (2024). Brady, J. P. 
Studies on the metronome effect on stuttering. Behaviour Research and Therapy. 7, 197–204 (1969). Toyomura, A., Fujii, T., & Kuriki, S. Effect of external auditory pacing on the neural activity of stuttering speakers. NeuroImage. 57, 1507–16 (2011). von Holst, E. The behavioural physiology of animals and man in The collected papers of Eric von Holst. (University of Miami Press, 1973) Hoyt, D. F., & C. Taylor, R. Gait and the energetics of locomotion in horses. Nature. 292, 239–40 (1981). Haken, H., Kelso, J. A. S., & Bunz, H. A theoretical model of phase transitions in human hand movements. Biological Cybernetics. 51, 347–356 (1985). Beek, P. J., Peper, C. E., & Stegeman, D. F. Dynamical models of movement coordination. Human Movement Science. 14, 573–608 (1995). Kelso, J. A. S. Dynamic patterns: The self-organization of brain and behavior. (MIT Press, 1995). De Poel, H. J., Roerdink, M., Peper, C. (L.) E., & Beek, P. J. A re-appraisal of the effect of amplitude on the stability of interlimb coordination based on tightened normalization procedures. Brain Sciences. 10 https://doi.org/10.3390/brainsci10100724 (2020). Schwartz, M., Amazeen, E. L., & Turvey, M. T. Superimposition in interlimb coordination. Human Movement Science. 14, 681–694 (1995). Kudo, K., Park, H., Kay, B. A., & Turvey, M. T. Environmental coupling modulates the attractors of rhythmic coordination. Journal of Experimental Psychology: Human Perception and Performance. 32, 599–609 (2006). Fitts, P. M. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology. 47, 381–391 (1954). Messier, J, & Kalaska, J. F. Differential effect of task conditions on errors of direction and extent of reaching movements. Experimental Brain Research. 115, 469–78 (1997). Kozhevnikov, V. A. & Chistovich, L. A. Speech: articulation and perception. (Joint Publications Research Service, 1966). Löfqvist, A., & Gracco, V. L. 
Interarticulator programming in VCV sequences: Lip and tongue movements. The Journal of the Acoustical Society of America. 105, 1864–1876 (1999). Nam, H., Goldstein, L.M., & Saltzman, E. Self-organization of syllable structure: A Coupled Oscillator Model in Approaches to phonological complexity. 299–328. (2009). Browman, C. P., & Goldstein, L. M. Some notes on syllable structure in articulatory phonology. Phonetica. 45, 140-155 (1988). Marin, S., & Pouplier, M. Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control. 14, 380-407 (2010). Tilsen, S. et al. A cross-linguistic investigation of articulatory coordination in word-initial consonant clusters. Cornell Working Papers in Phonetics and Phonology. 51-81 (2012). Franich, K. How we speak when we speak to a beat: The influence of temporal coupling on phonetic enhancement. Laboratory Phonology 13, https://doi.org/10.16995/labphon.6452 (2022). Cummins, F. On synchronous speech. Acoustic Research Letters Online. 3, 7–11 (2002). Swerts, M. G. J., & Krahmer, E. J. Facial expressions and prosodic prominence: Effects of modality and facial area. Journal of Phonetics. 36, 219-238 (2008). de Jong, K. J., Beckman, M.E., & Edwards, J. The interplay between prosodic structure and coarticulation. Language and speech. 36, 197–212 (1993). de Jong, K. J. The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. Journal of the Acoustical Society of America. 97, 491–504 (1995). Erickson, D. Articulation of extreme formant patterns for emphasized vowels. Phonetica. 59, 134–149 (2002). Cho, T. Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /ɑ, i/ in English. The Journal of the Acoustical Society of America. 117, 3867–3878 (2005). Steffman, J. 
Contextual prominence in vowel perception: Testing listener sensitivity to sonority expansion and hyperarticulation. JASA Express Letters 1, 045203. https://doi.org/10.1121/10.0003984 (2021). Esteve-Gibert, N. & Prieto, P. Prosodic structure shapes the temporal realization of intonation and manual gesture movements. Journal of speech, language, and hearing research. 56, 850-864 (2013). Krivokapic, J., Tiede, M. K., Tyrone, M. E., & Goldenberg, D. Speech and manual gesture coordination in a pointing task in Proceedings of Speech Prosody. 1240-1244 (2016). Munhall, K. G., Ostry, D. J., & Parush, A. Characteristics of velocity profiles of speech movements. Journal of experimental psychology: Human perception and performance, 11, 457-474 (1985). Johnson, K. Speech production patterns in producing linguistic contrasts are partly determined by individual differences in anatomy. UC Berkeley Phonetics and Phonology Lab Annual Report. http://dx.doi.org/10.5070/P7141042483 (2018). Kong, A. P.-H., Law, S.-P., Wat, W. K.-C. & Lai, C. Co-verbal gestures among speakers with aphasia: Influence of aphasia severity, linguistic and semantic skills, and hemiplegia on gesture employment in oral discourse. Journal of Communication Disorders 56, 88–102 (2015). Cavicchio, F. & Grazia Busà, M. Lending a hand to speech: Gestures help fluency and increase pitch in second language speakers. LIA 14, 218–246 (2023). Byrd, D., Tobin, S., Bresch, E., & Narayanan, S. Timing effects of syllable structure and stress on nasals: A real-time MRI examination. Journal of Phonetics. 37, 97–110 (2009). Garvin, K. Word-medial syllabification and gestural coordination. Doctoral Dissertation, University of California, Berkeley. (2021). Parrell, B., Goldstein, L., Lee, S., & Byrd, D. Spatiotemporal coupling between speech and manual motor actions. Journal of Phonetics. 42, 1–11 (2014). Krivokapić, J., Tiede, M. K., & Tyrone, M. E. 
A kinematic study of prosodic structure in articulatory and manual gestures: Results from a novel method of data collection. Laboratory Phonology. 8, https://doi.org/10.5334/labphon.75 (2017). Matisoff, J. A. Tibeto-Burman tonology in an areal context in Proceedings of the symposium: Cross-linguistic studies of tonal phenomena: Tonogenesis, typology and related topics (ed. Kaji, S.) 3–32 (ILCAA, 1999). Esteve-Gibert, N., Lœvenbruck, H., Dohen, M. & D’Imperio, M. Pre-schoolers use head gestures rather than prosodic cues to highlight important information in speech. Developmental Science 25, e13154; https://doi.org/10.1111/desc.13154 (2022). Mathôt, S., Schreij, D., & Theeuwes, J. OpenSesame: An open-source, graphical experiment builder for the social sciences. (2012). Berry, J. J. Accuracy of the NDI Wave speech research system. Journal of speech, language, and hearing research. 54, 1295-1301 (2011). McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., & Sonderegger, M. Montreal Forced Aligner: trainable text-speech alignment using Kaldi. In Proceedings of the 18th Conference of the International Speech Communication Association. (2017). Boersma, P., & Weenink, D. Praat: Doing phonetics by computer. 6.2.23 http://www.praat.org/ (2022). Miller, B. Audalign 1.2.4. https://pypi.org/project/audalign/ (2024). Van Rossum, G., & Drake Jr, F. L. Python reference manual. 3.10.12 Centrum voor wiskunde en informatica Amsterdam. (1995). Johnson, K., & Sprouse, R. L. Head correction of point tracking data. UC Berkeley PhonLab Annual Report. 15, https://doi.org/10.5070/P7151050341 (2019). Garcia, D. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics and Data Analysis. 54, 1167–1178 (2010). Tiede, M. MVIEW: Multi-channel visualization application for displaying dynamic sensor movements. (2010). ELAN 6.4. https://archive.mpi.nl/tla/elan (2022). MIT speech communication group gesture coding manual.
http://scg.mit.edu/gesture/coding-manual.html Dych, W., Garvin, K., & Franich, K. Creating multimodal corpora for co-speech gesture research. CorpusPhon. abstr. (2024). Lugaresi et al. MediaPipe: A Framework for Building Perception Pipelines. (2019). R Core Team. R: A language and environment for statistical computing. (2013). Bates, D., Mächler, M., Bolker, B., & Walker, S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 67, 1–48 (2015). Wood, S. Generalized additive models: an introduction with R. (CRC Press, 2006). Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language. 68, 255–278 (2013).

Additional Declarations

No competing interests reported.

Supplementary Files

ALLSUBtargetcvcv.csv
SNALLARTGESTENGstabilitydf.csv
supplementalmaterial.zip
Figure 1: Correlation of timing between F0 maximum and co-speech gesture apex, where both the time of gesture apex and time of max f0 are relative to the target word onset.

Figure 2: TB vertical (2a) and horizontal (2b) displacement during target achievement for /i, o, a/. Lower values in horizontal displacement correspond with a backer tongue posture.

Figure 3: GAMM models of TB vertical and horizontal displacement for /i, o, a/ across stress conditions. In (a) and (b) the top of the plot corresponds with higher tongue posture and the bottom corresponds with lower tongue posture. In (c) and (d) the top of the plot corresponds with a fronter tongue posture and the bottom corresponds with a backer posture. Shading indicates portions of trajectory that are significantly different. x-axis time is normalized by target word.

Figure 4: GAMMs of vertical displacement and velocity of the TT (left) and TB (right) across stress conditions. Shading indicates portions of trajectory that are significantly different. x-axis time is normalized by target word. In (a) and (c) the top of the plot corresponds to higher positions of the articulator and the bottom of the plot corresponds to lower positions of the articulator. In the velocity plots (b) and (d), values above the zero line indicate upward vertical movement and values below the zero line indicate downward vertical movement. Values near zero correspond with low velocity. Where values cross the zero line, this indicates a change in the direction of the articulator.

Figure 5: Relative standard deviation of lag between target achievement in consonantal and vocalic gestures occurring in sequence.

Figure 6: Synchronization (lag) between maximum displacement of TB and JW in stressed syllables.

Figure 7: Positive correlation between articulatory magnitude and stability. Lower values in vertical displacement indicate a lower posture of the tongue. The x-axis shows TB vertical displacement.
The y-axis shows the absolute value of the lag between TB and JW maximum displacement (7a) and TB maximum displacement and CSG apex timing (7b) during the stressed vowel, where the values are z-scored by subject.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-5073434/v1/e53ca671389c00666b25f5d7.png"},{"id":73093389,"identity":"b6ea2fd5-e14e-4210-aea9-4da6675c1629","added_by":"auto","created_at":"2025-01-06 16:16:30","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1099265,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5073434/v1/c5afa2d5-c8ee-4f3d-957e-063c105d878d.pdf"},{"id":69361963,"identity":"3653faba-68d1-45c6-9444-01d6a8b4bdc3","added_by":"auto","created_at":"2024-11-19 14:26:40","extension":"csv","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":591632794,"visible":true,"origin":"","legend":"","description":"","filename":"ALLSUBtargetcvcv.csv","url":"https://assets-eu.researchsquare.com/files/rs-5073434/v1/25b156699db1968b819b710b.csv"},{"id":69361949,"identity":"63d2deb8-96c9-4760-b8ca-8eee04cde0db","added_by":"auto","created_at":"2024-11-19 14:26:22","extension":"csv","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":17615321,"visible":true,"origin":"","legend":"","description":"","filename":"SNALLARTGESTENGstabilitydf.csv","url":"https://assets-eu.researchsquare.com/files/rs-5073434/v1/098f7c5fb6a021cce672279e.csv"},{"id":69361956,"identity":"db97ca9f-8d27-4461-baec-a74708d4478e","added_by":"auto","created_at":"2024-11-19 
14:26:35","extension":"zip","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":174426923,"visible":true,"origin":"","legend":"","description":"","filename":"supplementalmaterial.zip","url":"https://assets-eu.researchsquare.com/files/rs-5073434/v1/491c24a4343ec4c175c4d406.zip"}],"financialInterests":"No competing interests reported.","formattedTitle":"Co-speech gestures influence the magnitude and stability of articulatory movements: Evidence for coupling-based enhancement","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eCo-speech gestures–the term for movements of the hands, arms, head, eyebrows, and other parts of the body that accompany speech–are ubiquitous in human language use. Although gestures have been shown to have a facilitative effect on speech perception and language processing [1, 2, 3], gestures occur even when communication is not face-to-face [4], suggesting their functional role is not limited to aiding the perceiver. Indeed, evidence indicates that even congenitally blind speakers utilize co-speech gestures, suggesting that visual input may not even be a precursor to gesture acquisition\u0026nbsp;[5]. A body of recent research has suggested that co-speech gestures may play a more integral role in speech production. For example, co-speech gestures tend to occur synchronously with pitch-accented syllables in speech, suggesting that the planning of gesture and speech is closely linked [6, 7, 8, 9, 10].\u0026nbsp;Furthermore, speech tends to be more fluent when accompanied by co-speech gestures in individuals who stutter [11] as well as individuals with aphasia of speech [12]. 
Speaking while gesturing appears to bear some similarities to speaking with an external timekeeper (like a metronome), the key similarity being the act of synchronizing speech with another system [13, 14].\u003c/p\u003e\n\u003cp\u003eDespite the intriguing links between gesture and stability of speech, little work has sought to directly investigate how synchronization between speech and co-speech gesture may influence speech production. Insights from motor control research provide some hints as to the nature of this relationship.\u0026nbsp;A long line of research into movement dynamics in both humans and non-human animals has shown that movements that are coupled in-phase, or synchronously, with other movements are relatively larger and more stable than asynchronous or uncoupled movements [15, 16, 17, 18, 19, 20]. For example, in non-speech motor tasks such as interlimb coordination, synchronous coordination has been shown to result in higher movement amplitude and greater timing stability of arm movements [21]. Increased amplitude and stability of synchronized movement is observed even when coordinating movement of the limbs to an external stimulus, such as a metronome [22]. In these studies, the positive relationship between movement amplitude and stability is striking given that larger movements are generally found to involve greater variability in timing [23, 24]. These studies provide evidence that coupled limb movements are subject to greater stability under a limited set of coordinative regimes: most notably, in-phase coupling [19, 17].\u003c/p\u003e\n\u003cp\u003eIn the speech domain, there is also evidence that synchronization leads to greater temporal stability of movements. For example, the Coupled Oscillator Model of Articulatory Phonology proposes that more stable timing patterns observed between syllable onsets and vowels (vs. 
syllable codas and vowels) are due to synchronous planning of articulatory movements\u0026nbsp;[25, 26, 27]. Coordinative stability between onsets and vowels has been demonstrated for a number of languages [28, 29, 30]. In a study of metronome-timed speech [31], speakers of languages with different prosodic profiles demonstrated increased duration of metronome-synchronized syllables (as opposed to those produced on the offbeat of the metronome), as well as reduced variability in durations for synchronized syllables. Research has also shown that speech synchronized between speakers (‘joint speech’) is relatively less temporally variable than speech spoken alone [32]. Taken together, these findings suggest that speech movement is conditioned by the same principles of coupling enhancement shown in limb movement.\u003c/p\u003e\n\u003cp\u003eThere has been comparatively little investigation of the effects of co-speech gestures on speech itself. One study\u0026nbsp;[33] had participants utter short sentences, manipulating where participants were to produce co-speech gestures within the sentence, and where the nuclear pitch accent was to be produced in the sentence. Thus, in a Dutch sentence like \u003cem\u003eA\u003cu\u003eman\u003c/u\u003eda gaat naar \u003cu\u003eMal\u003c/u\u003eta\u0026nbsp;\u003c/em\u003e(‘Amanda goes to Malta’), the relevant pitch accent would either be produced on the word \u003cem\u003eAmanda\u0026nbsp;\u003c/em\u003eor \u003cem\u003eMalta\u0026nbsp;\u003c/em\u003e(aligning with the underlined lexically-stressed syllable), and a beat gesture was to be produced either concurrently with the pitch accent (‘congruent’ conditions), or on the opposite word from the pitch accent (‘incongruent’ conditions). The study found that participants produced longer acoustic durations and lower second formant frequencies on syllables where a gesture was present, even in incongruent conditions, where there was no accompanying pitch accent. 
Although these results suggest a clear effect of gesture on speech, the authors note that the act of pairing a gesture with an unaccented syllable in the incongruent condition may have been unnatural, leading participants to produce a prominence where one was not intended. Thus, this study leaves open two possibilities for the source of gesture’s effect on speech: (a) prosodic enhancement of unaccented syllables, or (b) a biomechanical effect of gestural-speech coupling on syllable duration.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003e1.2. Research questions and hypotheses\u0026nbsp;\u003c/h2\u003e\n\u003cp\u003eIn this paper, we investigate how the presence vs. absence of a temporally-synchronized co-speech gesture (CSG) can influence the articulation of speech.\u0026nbsp;We utilize electromagnetic articulography (EMA) to measure the position and velocities of the oral articulators of speakers of English to examine how the presence of a co-speech gesture influences both the magnitude and timing of articulatory movements. We likewise explore how increased magnitude and temporal stability manifest in the context of synchronization to tease apart the role of gesture from speech prosody.\u003c/p\u003e\n\u003cp\u003eLiterature on enhancement effects in speech production has proposed two main mechanisms through\u0026nbsp;which increased articulatory magnitude could be realized: (i) \u003cem\u003ehyperarticulation,\u0026nbsp;\u003c/em\u003ewhereby articulatory movements are made more extreme in vowel-specific ways: front vowels become fronter, back vowels become backer, low vowels become lower, and high vowels become higher [34, 35]; and (ii) \u003cem\u003esonority expansion,\u0026nbsp;\u003c/em\u003ewhereby articulators like the tongue and jaw will be realized with more extreme downward positions in the presence of a gesture [34]. 
In short, hyperarticulation serves to increase the vowel space (in all directions) through the enhancement of articulatory movements, whereas sonority expansion serves to increase the overall openness of the oral cavity, and thus does not predict any vowel-specific effects of enhancement on articulatory movement direction. We investigate how coupling impacts speech magnitude by comparing these two dimensions of increased gestural magnitude. There is overwhelming evidence from both speech production and perception that accentual prominence in English is associated with at least some level of hyperarticulation of vowels [35, 36, 37, 38]. Thus, if gestures were found to induce articulatory enhancement without hyperarticulation, this would suggest that gesture-based enhancement effects were independent from those related to prosodic prominence more generally.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTo provide a baseline for our hypotheses, we first investigate two primary aspects of our data to confirm that our results replicate the essential findings from key prior studies: (i) co-speech gestures are timed to stressed/pitch-accented syllables (h1), and (ii) stressed syllables exhibit effects of hyperarticulation when compared to unstressed syllables (h2). 
We formalize these and our main hypotheses (h3-h4) as follows:\u003c/p\u003e\n\u003cp\u003eh1: Co-speech gestures (CSGs) will be timed to the pitch peak of the stressed syllable.\u003c/p\u003e\n\u003cp\u003eh2:\u0026nbsp;Tongue gestures will have more extreme target achievements in stressed syllables compared to unstressed syllables, with high vowels realized with higher tongue positions, low vowels with lower tongue positions, back vowels with more back tongue positions, and front vowels with more front tongue positions (consistent with hyperarticulation).\u003c/p\u003e\n\u003cp\u003eh3:\u0026nbsp;Coupling with CSG will lead to increased magnitude of movement of oral articulators.\u003c/p\u003e\n\u003cp\u003eh3a hyperarticulation:\u0026nbsp;The direction of the effect will differ between vowels, with high vowels realized with higher tongue positions, low vowels with lower tongue positions, back vowels with more back tongue positions, and front vowels with more front tongue positions.\u003c/p\u003e\n\u003cp\u003eh3b sonority-expansion:\u0026nbsp;The direction of the effect will be consistent across vowel types, with the oral articulators taking a more open posture for /i, o, a/, regardless of vowel quality.\u003c/p\u003e\n\u003cp\u003eh4:\u0026nbsp;Coupling with CSG will lead to increased temporal stability of movement between oral articulators.\u003c/p\u003e\n\u003cp\u003eTo test these predictions, we analyze productions of CVCV tokens with alternating stress on either the first syllable (initial stress) or the second syllable (final stress), where each stress condition was produced with (CSG condition) and without (no-CSG condition) an accompanying CSG. 
To investigate the coordinative and temporal properties of these utterances, we analyze the movement of the tongue tip (TT), tongue body (TB), and jaw (JW) for speech articulation and the CSG apex in co-speech gesture articulation, where target achievement for both oral articulations and CSGs was defined as the point of minimum speed/maximum displacement of the articulator (details on this method are provided in §4.4.2 and §4.4.3, respectively). All methods were performed in accordance with relevant guidelines and regulations.\u003c/p\u003e"},{"header":"2. Results","content":"\u003ch2\u003e2.1 Replication of previous findings\u003c/h2\u003e\n\u003cp\u003ePrior to presenting results of our study, we establish that overall patterns for co-speech gesture timing and stress-based enhancement are similar in our study to those that have been found in previous work. First off, co-speech gesture apexes have been shown to correlate in time with pitch peak of stressed/pitch accented syllables [8, 39, 40]. In our own study, we find that the apex is timed to the target word (e.g. \u0026lsquo;speebee\u0026rsquo;) in 100% of utterances, and timed to the stressed syllable within this word in 80% of tokens. Furthermore, we conducted a Pearson correlation test between the timing of gesture apex and pitch peak, which shows a strong correlation between stressed syllable peak f0 and apex timing, as shown in Fig. 1 (\u003cem\u003er\u003c/em\u003e(1751)=.89, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001). This result confirms h1, replicating the finding of existing literature that co-speech gestures are generally timed to pitch extrema in stressed syllables.\u003c/p\u003e\n\u003cp\u003eSecond, in English, stressed syllables have been shown to have more extreme tongue displacement compared to unstressed syllables. 
To test for these effects, we compared the position of the TB during target achievement of stressed and unstressed syllables in both the vertical (y) and horizontal (x) dimensions. A linear mixed effects model\u0026nbsp;indicated a significant effect of both stress (vertical direction:\u0026nbsp;\u003cem\u003e\u0026beta;\u003c/em\u003e=3.32, \u003cem\u003et\u003c/em\u003e=12.50, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001; horizontal direction: \u003cem\u003e\u0026beta;\u003c/em\u003e=1.48, \u003cem\u003et\u003c/em\u003e=6.78, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001) and vowel quality (vertical direction, /i/ vs. /a/:\u0026nbsp;\u003cem\u003e\u0026beta;\u003c/em\u003e=14.06, \u003cem\u003et\u003c/em\u003e=77.65, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001; horizontal direction, /i/ vs. /a/: \u003cem\u003e\u0026beta;\u003c/em\u003e=8.34, \u003cem\u003et\u003c/em\u003e=52.97, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001) in predicting target achievement values, as expected. The model also revealed a significant interaction between these two predictors (vertical direction:\u0026nbsp;\u003cem\u003e\u0026beta;\u003c/em\u003e=-3.63, \u003cem\u003et\u003c/em\u003e=-13.01, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001; horizontal direction:\u0026nbsp;\u003cem\u003e\u0026beta;\u003c/em\u003e=-1.77, \u003cem\u003et\u003c/em\u003e=-7.28,\u0026nbsp;\u003cem\u003ep\u003c/em\u003e\u0026lt;0.001), consistent with stress-based enhancement taking the form of vowel-specific hyperarticulation. As seen in Fig. 
2, stressed vowels were realized with both lower and backer tongue positions for /a/ and /o/, but higher and fronter tongue positions for /i/, consistent with hyperarticulation, per h2.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003e2.2 Effect of co-speech gesture on oral articulator magnitude\u0026nbsp;\u003c/h2\u003e\n\u003ch3\u003e2.2.1 Hyperarticulation vs sonority expansion\u003c/h3\u003e\n\u003cp\u003eWe now examine how the presence of a co-speech gesture may influence speech articulations, considering first the magnitude of speech gestures from the standpoint of both hyperarticulation and sonority expansion. To evaluate these effects, we used a series of GAMM analyses of TB vertical and horizontal displacement across each of the vowel contrasts included in the study, /i, o, a/. The use of GAMMs allows us to examine not only the overall displacement of the tongue across conditions, but also how these effects may vary over time in the articulation of the target word.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eTurning first to vertical displacement, Fig. 3(a,b) demonstrates that for all vowels in the final stress condition and vowels /i/ and /o/ in the initial stress condition, the TB is significantly lower in the presence of a CSG. Furthermore, differences across conditions appear to be roughly localized to the site of the CSG. Although we focus on the TB results here because TB is most closely associated with vowel production, we likewise tested TT and TD (tongue dorsum) to ensure that the hyperarticulation effect wasn\u0026rsquo;t localized to the front or back of the tongue. Our results for the TT and TD are similar to those provided here for the TB. 
In other words, the direction of the effect was the same for all vowels, consistent with an effect of sonority expansion across the board (hypothesis h3b).\u003c/p\u003e\n\u003cp\u003eWe next analyzed TB horizontal displacement across vowels in both stress conditions. As demonstrated in Fig. 3(c,d), there was no significant difference in horizontal displacement for any vowel type in either stress condition, though displacement values were numerically in the same direction across vowels. Results were comparable for TT and TD. Together, these results support h3b, consistent with sonority expansion, and we find no support for h3a, the hyperarticulation hypothesis.\u003c/p\u003e\n\u003ch3\u003e2.2.2 Effects of co-speech gesture on gesture magnitude across oral articulators\u003c/h3\u003e\n\u003cp\u003eIn line with prior work on coupling-based enhancement, results from our own data presented in \u0026sect;2.2.1 suggest that the vertical movements of the tongue are enhanced for target words when produced in time with a co-speech gesture, regardless of vowel quality. To examine the effect of CSG coupling on articulatory gesture enhancement further, we analyzed both the acoustic duration of stressed vowels (pooled across all vowel qualities) in target words and the magnitude of movement of various oral articulators in the presence vs. absence of a CSG.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOur results reveal increased acoustic duration of stressed syllables when coordinated with a co-speech gesture. 
\u0026nbsp;A linear mixed effects model of acoustic duration illustrates that the effect of gesture was significant (\u003cem\u003e\u0026beta;\u003c/em\u003e=-2.72e-02, \u003cem\u003et\u003c/em\u003e=-4.57, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001) as was the interaction between gesture and stress, with final-stress tokens showing a larger effect of gesture presence (\u003cem\u003e\u0026beta;\u003c/em\u003e=-1.972e-02, \u003cem\u003et\u003c/em\u003e=43.691, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001). These results demonstrate acoustic enhancement of vowels, where stressed vowels are longer when coordinated with a CSG in both stress conditions, consistent with [33].\u003c/p\u003e\n\u003cp\u003eWe also analyzed the kinematic enhancement of tongue sensor trajectories using a series of GAMM analyses of vertical displacement and velocity. For final stress tokens (Fig. 4a,b), the results reveal greater displacement of both the TT and TB, with the significant difference in gesture magnitude between CSG and no-CSG conditions extending throughout the target word for the TT, and timed at the locus of the stressed syllable and the point of maximum displacement in the TB. In other words, maximum displacement of both the TT and TB during the stressed vowel was greater when the target word was coordinated with a CSG.\u003c/p\u003e\n\u003cp\u003eAs expected based on prior work [41], increases in gesture magnitude for stressed syllables were accompanied by increases in articulatory velocity in the CSG condition. \u0026nbsp;For the TB, downward movement in preparation for the stressed vowel target and the closure movement during the stressed syllable both occur at higher velocities when accompanied by a CSG. 
In the TT gesture, closure movement of the tongue following maximum displacement of the stressed vowel occurs later in the CSG condition.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eResults in the initial stress condition were more modest, yet still exhibited several effects of coupling enhancement. Namely, Fig. 4c,d demonstrates that TB vertical displacement at the onset of the stressed syllable was significantly greater and the TT gesture was significantly longer when speech was accompanied by a CSG. Nevertheless, there was no significant difference in TT vertical displacement, and the TB velocity patterned opposite to the predicted direction of effect; we return to a discussion of these effects and differences between stress conditions in \u0026sect;4. Together, these results demonstrate that coupling of speech with CSG leads to enhancement effects on oral gestures, consistent with studies on synchronized limb movement [21, 22].\u003c/p\u003e\n\u003cp\u003eDespite significant differences in the magnitude of the TT and TB gestures across subjects, GAMM analyses of JW gestures across subjects revealed no significant difference in JW displacement or velocity in either stress condition. We return to this finding in light of individual variability in JW displacement in \u0026sect;2.2.3.\u003c/p\u003e\n\u003ch3\u003e2.2.3 Individual differences in jaw magnitude enhancement\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eThe asymmetry in results between the tongue and the jaw was surprising; however, prior work has demonstrated individual differences in the extent to which speakers move their jaw while speaking due to anatomical differences between speakers [42]. Accordingly, we analyzed individual differences in JW vertical displacement. The analyses revealed that some subjects do indeed show a significant difference in vertical JW displacement, with greater displacement in the CSG condition (Table 1). 
Importantly, there was an implicational relationship between vertical tongue movement and JW movement: if a participant had greater JW displacement in the CSG condition, they also had greater tongue displacement in this condition, but the reverse was not true. These results are consistent with the individual differences found in [42], which suggest that some speakers make greater use of the tongue than the jaw in vowel articulations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 1.\u003c/strong\u003e By-subject comparison of magnitude effect of CSG on tongue (TT and/or TB) and jaw displacement in analyses. A checkmark indicates a significantly lower position in articulator vertical displacement during the stressed syllable in the gesture condition compared to the no gesture condition; n.s. indicates no significant difference; X indicates a significant difference in the opposite direction, where the articulator was significantly higher during the stressed syllable in the gesture condition versus the no gesture condition. 
\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cimg src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAnkAAADdCAYAAADHG2RwAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAACkxSURBVHhe7d0JuCZVeeDxrzOZSBKFQQ1Cg4CiY9xwhNCNDTSICyjKJgyCgCgGaBC3IGAA5QFcwB5FRUDUBAFpCaiA4oALQoMtNAFHYnAcRdl6gbCMSxI1menc/7n1fpxbfPf27dt3qXP6/3ueulVf7afW955zqmrWqiG9xsVfWdF0ScMO2XeT4o+LGtIA09EttaQDpqVbTEM31JCGP2jakiRJqohBniRJUoUM8iRJkipkkCdJklQhgzxJkqQKGeRJkiRVyCBPkiSpQgZ5kiRJFTLIkyRJqpBBniRJUoUM8iRJkipkkCdJklQhgzxJkqQKGeRJkiRVyCBPkiSpQgZ5kiRJFSo6yHvowXt7h75hdr/h9xcueF8zdPVi+v/z46VNnyc67ph5vY+eflDzS5PlNbs8vXfIvpv0mzkvWb8ZUpY/3+pPR6SjZHk6SFfpYt9s9PQ/avqU5eUv23DEPqGhX8nYF5EWrgEl2evVfzZiX0RT2rXrv7/uGSPWv8RzPb9/sF+6bjz3O/rF8Mk8N4oO8o47+mW9Aw45uXfRl5en5qzTDux957ovNEPHRoDH9Kuz8NNLeu895dLmlybL/7zh4d7FX1nR/CoTF8ftWicrF9AScWHJtdNVotLT8GdPe2Jw+t3vP9Z0lYdzY7f5T+v97vf/P537XANKsv6T/7DpGmnpD3/VdHUfAdGT/ugPej+5+5/TPnj40X9L50lJ/wgRAD39qf+5d9vQdqdhv3Q90Fvd/Y4A73lD9xPSw3ikb7ICvWKDvFtuviq199j76NQGAdmL/9suza+xbfSMLXoLz/1+82t6mCNYl+dt9Se96xY/kk7KB1b8NvXjAlraf8bkDsXFhXYoOTdvMv8Tnglc9O+5/1/TPsmbUhHgRXDxd19/sOlbDs6FCIyi4Vz51W/+vRmj+wjkIlC9Z9nw9eqRx36f2ltuul5qdx1pIADC/x7aHzT800C6Sr5ebfnMP05t0gOOK9I5GcF3sUHeU582nPNwzZXnpnboaq4b6/ngynuaX6rB9+/4Ze+hh4cvknkOy6O//Lemqwz/+NPhiyWijby7JARIT3nyf2p+lWmTZzwp/WdPDmupucOBfyII8Mg1KinXK8e50F73Zw3dmFc8+LvmV/fFtQov22aD1P7TPxk+TyLo67qnbjAc4OUI8rDJRmVWyyCQ4/wYZDKC72KDvP/6/Dkp1+6yi8/o18fLUc8u6upFnbv43Q4MEcOog5cj962dA8c4o82LcWNYTEc9QdbzoaEgb9A0KlN+0QxccAb177J8fSMHjFyLEnHB5L/iO3/8m6ZPefIcF3ADINgrtT5eFDuTMxF1jgbVSSoN/0iUEhyFKHHg+GI/bLbJeqk0orRrFiYjl6sLBgWuIYLwtVF0nTxy7aiTB+rXEUAFgsB2cSz19jbaeMvm1+POOHnvfr0+ArEIzmj/w/+6IXUHArzj378ojcv8Cd6i6JjxKQaOYUzLsDcf8eG0niybYXkRs+oQRQUlBxdc9KMohFykEpFDUfI+CHmxYOCGXFqRFOsbuRQEE1EdgOOr5Js06SrxHzpKHNpFzC98bjnHFDmqkXNX0npP1HpPWseDPBAwEThFXTwCvXau3uqcfMaVTVevd/S7z0vBGfMgiMzr+JEjSBAYAWU8uPGTu25Jw5iOgA4R7G2/417pt+pG/TyKo0ot4gQBRfynjxKeWsuR0/WrX/970fsA7cAh3y+lFkmBdNVyk2Y/lFRUm/vJ3f+SjqkI9kr75yHqdLLe/GMaud4rHiovN3J1fvu7/9d0TVyxQR5BVeSggYAsgrXbvn
9Nak9E1PUb5NFHVqSgj+AtbwjsGKZ1UxSjlfa04CD8p58/fFESigXjwp8/WcsTnaUX7ZT8VG1bBHkl41grragW1O+kLiGu+uY/9fdFaf88RC53XKsIWEv9526sOtz//C/rcJAHctByUU/v4Yfub/qsOYI1ilXJiWsjAGwX34YIDtc0F1Fl4z/g9Z/yh+mCGUqtOxXiYvn7369KbXUDN+TScivyG2870J6MG9hMKLWoNorO89yhX/+mzH0Q4p85HoIrFcfRaP/4TMY/EkUHebwTL3+IIYpMX7bTPk2fYZHLRs4fxa3Uo8un++lP/r7p6vXO/fiC3n4HntD8GokgkgAw6uwFHqyIYbyrLzfo5cw+eFEPLjJRiTma0rTXO4puSsuZpBin/R8+SqtYTjAU+yQCI/bJPz0yXNxZmihqjuJZzhduaqU+aVtqUW3kGOXvX4w6uDxhXxIeEItrVqkPjuR4XRLi2ss5QvWfyUhX0UFeFM9SP46GByjoR8AFcuNesdubU+DGcHL+CMR4CIK6fPGuvHhCl4bpx6pHx7v4CCRjfJqoh8cwDBr23Of9Rf/p2u1etkfqty6Lt8cHKmKX9qqIfP1zpeW2RN2cCCyoX0igpJmR/2dPUXPsk1KLbFlvAr0oSidtJb4rL5RaVMtxRUAUT2vToKQgKb4KQXAa7y0sYd1Xd7/jHx7SQ6YB4xHgTdY/2bNWDWm6vbAPEE/MRrC2ruGAK/24qCENMB3dUks6YFq6xTR0Qw1pKP7p2unwvBds33RJkiSVwSBvAHLvorj1BS/e0degSJKk4hjkDcDrWOL1KL64WJIklcggT5IkqUIGeZIkSRUyyJMkSaqQQZ4kSVKFDPIkSZIqZJAnSZJUIYM8SZKkCo34rJkkSZLq4LdrNSa/P9gdpqNbakkHTEu3mIZuqCENFtdKkiRVyCBPkiSpQgZ5kiRJFTLIkyRJqpBP10qSJFXInDxJkqQKGeRJkiRVyCBPkiSpQiPq5C2+9bGmSxo2f+6GxR8XNaQBpqNbakkHTEu3mIZuqCEN5uRJkiRVyCBPkiSpQgZ5kiRJFTLIkyRJqpBBniRJUoUM8iRJkipkkCdJklQhgzxJkqQKGeRJkiRVyCBPkiSpQtMS5B2037a9nbd/ar/htyRJkqbOtAR5l15xe+/9p38udS/68h3ptyRJkqaOxbWr8Z1vfSU1kiRJJTHIW43TTnlb0yVJklSOTgZ5ef29j591XOq3fNk9T6jTl9f1+4cf3jKiX+S+Lbr4k0+YDoxPvxjericYywOBXnQjnycN4waGMa98fSMN4b3v2m/E9DRMw3jRDdIQw/NlxHg0zEuSJKmtc0Eegcs5n/lG78ZbHk319678yt+koGb2plumfptu9qze4UeelMalbt/e+761N2f7XXsvfsn2qR/DmP4Vr9o3BVxgupiWoIgA7+1HvjYNW7nivjSsXU8wlgfqE0Y387zj9sX9ebL8A9+wTQrCGHb+p0/tLXvgF73j3vmGNJxpSUMEaaSF4TE968T6s/x3H7+wd9Qxp6bxQBpIS47pt37pvP70zIt+kiRJuU4FeQRJecBGoNUOkrab8/LetddcmrrD0luu7w9/aOUD/em/dtUXUtAVuV6MR8PwCJ4OeNPbU3u8mN+hb3k8qCIwI1C77Ivn9A485B0pSON3BI0bbTQ7tcPy5fekNITX7/XmFKiNF9sichZpmPa2pd9thkqSJA3rVJBHrlpbBEmPPLwytV+5234jgrqNN9k8BVU3Xn91+p3PgwCI3MDI9YpmomKZbSyf4G08dt/joBSoBXIF86BvLLH8dnp8WlmSJLV1IsjLixsJ4AZ52tM3Tm1y4SKoI+dv5133TLlh5NpRDEsQGBjvx3fd0fyaPD+6c2nT9bjZs7dsusZGESy5lXlOHLmB40HOJqL+oSRJ0mhmPMiLenOIotP8YYJvX3dFqvcWAQ4I6sgBA/0J9AiWPnz6Mf2iWjAeRZt5DtxE66/FgxysC0
W2EWgxbwLT8Rb7kl5y88bKhcuLby/62+EAMOr9sfyoTxiskydJktqmJcjjadF4FQnBSuRi0RAw8SABCNgoXiVoiuFo53S9aOs5aRzaYDpyx9rFntSRiwcjYn4si4AtAiWGjZUzxnxZ9zt/sCQth3WJQIv5MT3rzDCCrXjwgmGDlkNxcl6nLpoIdllnRP9ttp2ffscyWH6eE0iT515KkiRh1qohTXdv8a2PNV2aKgRzEcjlRus/0+bP3bD446KGNMB0dEst6YBp6RbT0A01pKFTD17Ujpw8cvrarGMnSZImm0HeNKK+IK9YyYtaaah318VcPEmSVC6DvGlGMJe//oTmo2df0QyVJEmaHAZ5kiRJFTLIkyRJqpBBniRJUoUM8tQptyz5VnoZdv6S7NL86pePpncmko7RPoVXAtPRPTWcH6GWtNSQDtPQDVORhqqCPF4+zNOqY13I2XiM0xWrWx/SwvB14TUrpPGE9xyQXnS93fa7Nn3Lc/oHjkjfJ37ykzdIL7AulenollrOD9SSlhrSYRq6YarSUFWQx3dheVp1rAt5PN3aFWOtDwEeX8pYV3zq7JNSm9fMPOe5L0rdpeE/MU7Spzzlv/Te/d6PNn3LYzq6p4bzI9SSlhrSYRq6YarSUH1xLVmfpYl1Jljlc2brgqu/emHvJz/+QW/TzZ5V7DsDKRb85MdOTN18fm79DbqTY7wmTEf31HB+hFrSUkM6TEM3TGUaqg7yKL7lv/iSUHzLt2/XJdyMLzj3tNT9vlM+ndoluubqS9K+49vC5CqXynR0Sy3nB2pJSw3pMA3dMNVpmPYgL69jdtB+26budp20GCeadiXEmI4gjgrVIa/fRvdpp7wtddMvxqPN9LmYLpq8Th/DGD9fp3yZOfrHOKxb1BGkidy56BfrkK8P3Xz2jBsT47TTzTxifjXhZvzrX//fdDPmqyAl4kT94kVnp+5D3zL4+CiB6eieGs6PUEtaakiHaeiGqU7DtAZ5eR2ztx/52t7CT3y5Xx8tgqAYh2JKhp3zmW+kwCcCHtqHH3lSGnbnD5b0li8fDsjon38XlizP95/+udTNuBTXEERRATvHdHfcvjiNQ7P3vm9Ny2c9Yp4EXce98w1pOPNkHnkgGFgGw8lyJeeAJopb40ZDP5Zx6RW3P2F9mP6oY05N07OsPNuW7cU86A+CxRL97Kc/GrHt8pvxkUNpLwHrzD8puThRd33lPsVcbExH99RwfoRa0lJDOkxDN8xEGqY1yMvrmBG8xQMSBEYUq5L4y754TgqCYhgXaAKfPICLHDqCovgkGAER442F8Zl3jvnm/+kzDkEW6xHz5DdBGTbaaHZqj+b5L9gmBYXtm863rxteT9K49Uvnpe5B6zMatlfcrFifh1Y+kLpLQmB6+CHze0cctmv/QL9t6Q3pZsx2KKXC7FGHvyoF3Z89/4ymT6/3tau+kNp/ueCU1C6B6eiWWs4P1JKWGtJhGrphptLQiTp5eeAUOXO5jTbeLLXZMJG7FcWWbLiJig3dRhA1aD3Gg+CUbNcI6m68/up+7l/8JhBcF203Z5fe857/0nRQn3ry4anfZZcO10F45W7lPCBDTjIuufBj6elNmqj7Ff+clMB0dEst5wdqSUsN6TAN3TBTaehEkPfQQ8tTOy7Ity39bmq3xfAoWiUYI1evnWu2pn5059Km63GzZ0/85rD7Hgf1g7qVK+7rV/qOgLSkG89k4unGhWdfnvYbTxJRRE97quoiTBX251+d8LHUfcYHjuz9zWc/krpLq/tlOrqllvMDtaSlhnSYhm6YqTTMWJAXOV34/Gc+2C9q5cLMf+HUVwvXXnNpfzj15CJYogiVDTYeTNdGsEU2KUW2ESiSu0fR8QFvenv6PRER1JGGmA/LIZ1rYtA6l44D/bQPDxelxZPPBMWl2XOfw1J9L/4r40TlOCzp5hVMR7fUcn6glrTUkA7T0A0zkYYZzcmLItft5ry8XwzLhZn6Z+SExfBttp0/4i
EEHriIYa/f681pmnhIAvRHFIvy+0Vbz+k/6EAQGU+0Rr046vQwXjz0QQDI+PHgBcMILhkPjDdWDiLROUW+kWtHdmyaz657pt8YtD6sZyyPYur8QRWWT3+Gs155IFwS6h5E0M5LaiMoLg0v12X98cY3HZvaJTId3VLL+YFa0lJDOkxDN0x3GmatGtJ09xbf+ljTNXVSvbqhwCV/kEDdNX/uhlN2XBAkP+3pG0958fVUpoHj+ZGHV07LsWw6Vq+WdGC6zg+YlvGp4ZplGsavhjR0ok6e1k3ciKfjoj+VWP8a/lkxHd1Tw/kRaklLDekwDd0wXWmY1iCPyLVd/ChJkqTJN61BHpFrPBlLU2r9DEmSpK6zuFaSJKlCBnmSJEkVMsiTJEmq0DoR5PHAB++Xi4Y3TSPaIR+HaXj3Hq9kaGO6fFweIGF8HySRJEldUX2QxwuDeZKX9/LFAx98VYPgLMfLiHkpcoxz0d8u7L9cOcd0vIw4xqPh5cws46GVDzRjSZIkzayqgzxy1viiRPvFy3TTj2AN5MLRnX/K7KNnX5GCvhyBIJ9L4nNqOb6awfwkSZK6ouogj2/FjvYNS/rxOTXw1mlc9sVzUjsQvMXLCgkYCQQPP/Kk9LuN+eWfXpMkSZpJ1QZ51KUjKItAbhCCOBDIkWsX38slZ6+NIlnE93AlSZK6rNogj+9XrgkCvvef/rnUTf06gr38oYvly5/4AAbaD3VQpCtJkjTTqn/wYk3wBQ4epIi6eHyCbVCuXo5iWqaZs/2uqWnX15MkSZoJ1QZ5UQ9vtBy4HDl2vC4lkKu36Mt3pO5vX3dFam+z7fzUvvH6q1NbkiSpy6rOyTvqmFN7S2+5ftTcOF6vEkWyd9y+OLVD1NOLIDEequC1KoPenSdJktQlVQd5BGYUoVLHrv2iYl5ovPVL5/WfniUYJOgLBHI8iLH7Hgc1fXr93L3xFONKkiTNpOrr5PG+O3L0TjvlbSMekOCFyNTBCzx0QdAXwwnk6JePQ0AYdfbi4YxoeJKXZUmSJHXBrFVDmu7e4lsfa7qkYfPnblj8cVFDGmA6uqWWdMC0dItp6IYa0uDTtZIkSRUyyJMkSaqQQZ4kSVKFDPIkSZIqZJAnSZJUIYM8SZKkChnkSZIkVcggT5IkqUIGeZIkSRUa8cULSZIk1WFEkHfvst82XdKwLTZdr/jjooY0wHR0Sy3pgGnpFtPQDTWkweJaSZKkChnkSZIkVcggT5IkqUIGeZIkSRUyyJMkSaqQQZ4kSVKFDPIkSZIqZJAnSZJUoU4FeSf/9Tt7W272xyOa++77RTO0284/72NFra8kSapbp4K8Mz70id7iJXel7k99+qLePQ/8a2/zzZ+VfnfdUQveU9T6SpKkullcK0mSVCGDPEmSpAoVHeQdevCe/bp7O+/wwqZvL3Xndfryun7UnUP0o43bli7pj0MTdetox3QxbJCvXXX5iOkQ4+fTxXg0LDvmn4+T98vnJ0mSNF7FBnkEePN22CXVg6NBBHo3fu8fewcfekRqqCNHXT/q+IG6czjiqHf1Tjzpg2kYAd6nPvmR/rzoP3/eC1KARRtLvnfDiGXlCNyOPebQ5tcwArj5u7xqxDQEiq/fa/9+vUPWgfVj+BZbPDstF/Rj3elvHT9JkjQRRQZ5BGWLb/hWP2DDxYu+3rv33p+ngAtz5+7Yu+SiC1J3Lob/8Ad/33vtHvuk7quuvCzNL3LPPvLBk1L/B1eu6Adkx77jxNQehMAtgshA8HjRJVf3c+VyBG4Edd+45qtNn2GLLvl809XrbfbMLZouSZKkNVdkkLdyxbKm63GR47Vs+f2pTeCFCOroT+7Y5ZdfnH7feuvN/WkIxOJp3rzZbs68NHwiIrg75MDXpXkR1OUOPPjwflBH0Hrc8aemIJVumk1nPzMNkyRJmoiigjwCNgKgkHeHPDiKoC6Cpr32PiDl2BGA5TllBHsEfZOJ4I7iV4qOByEXMYI6chIJSinepfv222
/pB6mSJEkTUUyQRzC08KxTU+4aARA5Y/vv+4pm6HAASL88OIqgLoImpmUcArBtt92+GWt4PIp2I9cP7YByIh64/97UJqgkoMsRWEZQR9Ey9t//kLQeMZ0kSdJEdSrISw8rNA868CBD1JGjIaDbaedXpmEgh4yALYYTALZzzQYVt1JMinwY3RTX5suMnL1YH5afB4GrQ/ErARvzIqgkoKOuXzzNCx4cYZwITKMdQZ8kSdJEzVo1pOnu3bvst02X1hQBIEEi9e9qssWm6xV/XNSQBpiObqklHTAt3WIauqGGNBT54EVXkVsnSZLUBQZ5a4Hcu7y4mFemSJIkdYFB3lqgDl28bmW0p2glSZJmgkGeJElShQzyJEmSKmSQJ0mSVCGDPHXKd6+/tnfowXv2zj/vY02f8jz22KPpfYikgxdha2a5P6Sx1XDdrSENU6GqIC+edh3rQs4BwDhdsbr1IS0MX9uvb5SANL7l0H3SV0p2zl58XZp3HntYesn1+utv0P8+smZOKfsjf1o/mnh5entY/lJ1TZ11YZ/UcN2t5d4xFXwZcocR4MUXNy7/yncGfsFjqk3nyyD33GOH3p0/vCN98/eoBe9p+q696UwD/01ysdlggw17N9z8o96GGz61GbL2angxJ9wfY9t5hxemzyC2zwNuZHx5Z/GSuyY9WK3l2MJUpGW698l07o8arrs1pGGqVF9cS/ZtaWKduWhw8VgXXHrJ59JJyqfqJvMknU4UC556yl+l7jM+9IlJDSi05krdH/HJRj6DmJdKEEzwz565w9Ov1n1Sw3W3hjRMpaqDPLLTyb4tCcW39/7i7ubXuoGb8Zkffn/qXvjxz6Z2iS770oXpv32+fPL65jvEmjkl74+LF309taMIkH/8+L72TOTma1ht+6SG624t946pNO1BXl7HjCzwqM+Qi3GiaVekjOkI4vJ6EHn9Nrr5lizoF+PRZvpcTBdN/p8awxg/X6d8mTn6xzisW15nI3Lnol+sQ74+dPOfIjcmxmmnm3nE/GrCzfiXv3ws3YxLvmCed87C1H3sO05M7S7hfOO44TiO86d9HrTFsca0ox3zXdX1/bE65AwRQKQ6RkP7ad4Ou3Q6UOXaxDGSX/PoHk1cc5mO7vya21Wl7ZPVqeG62+U0dOWcmNYgj5WOOmZkc/OfUXzQn4QhxqGYkmFkhRP4RMBD+7jjT03Dbr315v6GoD/jBbJtOSHBuBTXsMGpgJ1juiXfuyGNQ3PwoUek5TPfmCdB1yEHvi4NZ57MY9AOYBkMJ9uYk58milvjRkM/lkH2f3t9mJ46BUzPsvKsZ7YX86A/xjpYuuzHd905Yts9lt2M3/fXZ6R217HOBD65uNi8bs/9OnexiXpD4NjmvOO45Lge7Tji2OTcS8f8Jz/S9O2m0vbHeMW1gv302j32afp2D9duAh+uZVyT4zrKpx4H4fyPay7XtPy63XWl7JO2Gq67JaWhS+fEtAZ5eR2zvB5D/HdEQi84/+y0MWIYF2gCnzzRkUNHUBTfiyUgYryxMD7zzjHf/D99xiHIYj1invyOz5ZtvMmmqT2al7z0L9IFoH3TuerKy1KbNM6du2PqHrQ+o2F7xc2K9Vm2/P7UXRICite8em7v9a/ZoX+y3rz4O+lmzHZ4/gu2Tv26bu/X7ZSCpo+e+YGmT6+36JLPp/bxJ56W2l3CccPxA84/zq04v8Zy3Lv/MrU5xzhWu6q0/bEm4p9Agu6u4vggJ4VzOI6TzZ65RWqPhus91wOOTW5s4zkeu6KEfZKr4bpbWhq6dE50ok5eHjjFDsxtOvuZqc2wyN0aT/bn6gxaFrZ41lajDlsddgw7N4K6b1zz1X7uX/wmEFwX7Tj/Fb2tX7JNOjHfvuDg1O+zF5yd2nvtfUBql4CcZHz6U2elpzdpCOzZ7yXdrMZyxFHvSmmK86z9T0uX1Lo/yA0gOI9/gqM0o3TsE25+/LMexVOlKHGf1HDdreXeMZqpPCc6EeStXLEste
OCfNON307tthhOlEtDjhYbZW1vQLfffkvT9bi1uTnsv/8h/aDugfvvTVn8iIB0beZdMp5u/MIlX0v7jaehOJBpczOOXMoSsD8/9JFPpe53HfvW3sf/x+mpO88RLh3HaJxnIKdsov/4TLUa9wfBA3W+OC9IHzcASh26ug/WFLkbHFuUlBAslZArVuo+qeG6W8u9YyxTdU7MWJAXOV2gnDqKWrkw8194nsDLL7+4P5wTLYKleKx9PAb9x8WNLE7UCBQ5YdnA5GRMVAR1pCHmw3JGK48fTQn/Ja4pTtbzP7sodbOdQVBcmoMOfluq78V/llxsOA5rudiAYzfOiQj0uqym/RH/xed1crkBcEOjTmWXc1XHg2tsnsa4tndZ6fukhutuLfeOQabynJjRnLwoCtpp51f2Tx4uzGSHkxMWw/nvKT+5qMgYww48+PA0DQFR1NujP6JYlN/bbrt9unExX4LIeLKQE5UAjJwKxouHPggAGZ95RrEVwWVegX2sE5uTnx0XuXZkKTOfvLLuoPVhPWN5FFOzHLBclk9/hrNeTF8i6k/EQcxLaiMoLs3pH/xEWn8cueDdqd1F7eM2jiOQEx7/OMUxDuqP8MAF/WgonuJY5pjmdxePvVL2x2jYD2xbbmCc37EvwA2gf2Mb2pdxQ+gCrl2sG9cy1qt93Yxx4hoHjqU4tqg/yXUYTEs/jrMuKHWfDFLDdbeUNHTpnJg13V+8YEW50eQPEqi7pvKN3wTJz9h4k34gPFWmMg0czw+uXDEtx3INb1+H+6N7TEu3TGUaarju1pCG6dKJOnlaN3EjnuqTdKqx/v6z0h3uD2lsNVx3a0jDdJnWII/ou138KEmSpMk3rUEe0Xc8sUdTal0sSZKkrrO4VpIkqUIGeZIkSRUyyJMkSarQOhHk8cBHvH+GJt5n1H6vUT4O0/COpEHvpmG6fFweIGF8HySRJEldUX2Qx4sEeZKX9/LFAx98VYPgLMdLCXkpcozDi2B5eWEb0937i7v749HwcmaWsWz5/c1YkiRJM6vqII+cNd443X7xMt30I1gDuXC8iTr/lNlFl1ydgr4cgSCfS+JzajneTM38JEmSuqLqII9vxY72DUv68Tk18OZsXHD+2akdCN7ihYsEjASCxx0/+PuzzC//9JokSdJMqjbIoy4dQVkEcoPEt+EI5Mi1I9eP4lhy9tookkV8D1eSJKnLqg3y+H7lmiDg4yPsoH4dwV7+0MWgBzDQfqgj/+CwJEnSTKn+wYs1wRc4eJAi6uLxCbZBuXo5immZZv4ur0pNu76eJEnSTKg2yIt6eKPlwOUYh9elBHL1Fi+5K3VfdeVlqT1vh11S+xvXfDW1JUmSuqzqnLwTT/pgb/EN3xo1N47Xq0QQuOR7N6R2iHp6MTwequC1KuMJHCVJkmZS1UEegRlFqNSxa7+omBcaz527Y//pWYJBgr5AIMeDGPvvf0jTZ2icJndvPMW4kiRJM6n6Onm8744cvWOPOXTEAxK8EJk6eIGHLgj6YjiBHP3ycQgIo85ePJwRDe/cY1mSJEldMGvVkKa7d++y3zZd0rAtNl2v+OOihjTAdHRLLemAaekW09ANNaTBp2slSZIqZJAnSZJUIYM8SZKkChnkSZIkVcggT5IkqUIGeZIkSRUyyJMkSaqQQZ4kSVKFRrwMWZIkSXUwJ0+SJKlCBnmSJEkVMsiTJEmqkEGeJElShQzyJEmSKmSQJ0mSVCGDPEmSpAoZ5EmSJFXIIE+SJKlCBnmSJEkVMsiTJEmqkEGeJElShQzyJEmSKmSQJ0mSVCGDPEmSpAoZ5EmSJFXIIE+SJKlCBnmq1qxZs57QHH300c3QNZPP40tf+lLTt9d7znOek/qN11lnnfWEeYSf//znI5Zz8803N0PWDvOJee6+++5N37FFumhY5zAove31jrSxrOg33uWWKNK/tmlkuzGffHuvq+LYYdtOlqk6v6QuM8hTtVatWtU788wzm1+93t13390799xzm1
9rhnkNwjwx3pvRPffck9r33Xdfauee/exn9+c3mXbccccR22E8fvazn/W22mqr5tfjBqWX8Whuuumm9Ju0EUxfd911qR/DmF+tli9fntprm8Y4JuIYWZfFtoxtG9YmAJ6q80vqMoM8aS0Q/NFwAxkPgkzGP/7445s+ZWmnN88NIZiMtH3zm99s+g7fsGsO8iLda5tGthvzmeg/IjVhW7It2LaBnM7rr7+++SVpPAzyJEmdxj8TBx54YPNL0ngZ5GmdQ1Fi1MuJelA01DfL5cMG1eXL661RfJnXQaNBLIthUR+PJi92yvsvXLiw6TusXY8I7d9oj0ezpnWOVld3r51ets9OO+2UhlEMRn/QjmIxhjNeiOljHmhvl/idr0O+7EhXe//k2z/mHfLp2/s5ny5f1/w4oRm0PfPtHvMd7/GVa6cl5P2jCfl6051vRxrWLdKdpyvk22TQvo/fNLn2ctrpGrTONKSrva/zdQh5ulgvmjjOqAJA//Yy8rTG75Avs31+hXx+kX6pCqukip155plUpkvNUODR9F21aquttur3z38vWrQo/b7pppvSb/qD/jF+jMP8ol/Me7fddhsxDhYsWNB0PT6c9ULMl/7I15d1QL5sxLrFb8R8GRbjx7oj5hvLacvTgnwZsa6D0tveTiG2Z6Qhpo3x2Cb8DrH+NCyP37Gu0R+Rjlh+zIeGZcXvfJvHuiCGx/6JYcwvthvziXTF+jNddLcN2t4xX5r8dyx3kEHrns9j0PD2fGOc2Has82jLzPdnrHv8pkHsl/YxEOO3h7e3ReyvfJ1jGhrm154HYjjrzzgxn0gXYh/RME6eHroR6xPTxXxoYn9Gv1g+3fn6SiUzJ0/rtKGbQNM10qWXXpraRxxxRGq/8Y1vTO3VOfnkk0e0ySE46KCDUvcgF154YWofdthhqb3ffvul9pq69tpruSuPqMO0Jq644orUHrq5pTbzGbpRp+7JEPN/9atfndpbbrllardzmFg+ddNID00MH7pJp/bmm2+e2jG/wHSD0s70Qzf8/vRRJ5L9SQ4Rw0gndQw322yzNA77fvbs2ambHCTGY7qJbNvRjq/xYl1pRhPHT7Rju5LjBdIynmM3r0OJ0dab7cT6jFb/MB4eidy92F/t+WMoqBpXXdbx1ncdZDzn1wUXXJDa8+bNS22Oh/POOy91S6UzyJMGaBf3jVcERwQPBAfc9MYKDib7gQSKxKLuEuswXlP9RGfMn5snRWInnHBC+t1+yjiClBDDo5gu0jbe9W3PP/fAAw+kNtuJeUeRIPuewCK2H/3bRZLTjeVH4JEfm3FssX043ghUIqAlwG1vz8nENotgMvZHBHVxXMf2j+B+uo3n/Mr3M2mK3xO9BkhdYpAnDRC5BxMJfiL3L3IDxxLBw1jBSOQwjYUbOjcobuoTyT2KYGCqbmwxfwKQyJ2iWd1TxhE0EDjn0433CdSYPoKRXL5d83mTg4jItWJ7cuOfibpaUb+PHKnIZW2L/meccUYK+iLXitzkyJ2aTFFnjm0SAWUg1/Cmm25KwxiHYJ71m4onhiO3dSzjOb8ix5r9nB8Ha5ODKHWFQZ40wPz581M7ipnIJRmvKBIi52V1Qcyuu+6a2vFqiKVLl6b2WJYsWdJ0PS5yuLipj3VDG00EAxEMEexFjsZY8tywXPyO4fn8Y1sSmEZx7GjmzJmT2swvxmV6KtOPR15UGQ80xPR5DmvMj3QzHu0IEJhHO5iZDqxn5N6xrqMF4HGsRlASaWabjZWLPBHsA/ZhFHEPyikjyGTZESxNdoDHOgw6HwedO+M5vyKXMapYwIcvVI2hk1CqEod3u1mwYEG/cno0QzfwEb+HblBp+ryS9qBxhm50/d905xif5eTy+dFExe98fdrLCXn/dnd7eHt+eQV1mqhg3raoqaROQ3ry9DGPdnrz8aNBux/job0esX1G2y6BbZ0PjzS3l59vA5pY7mjTh3xY7EemaR8ng7TnzTTt6drrxTSDxHSxXZ
BPl8+3vQ/pl283lpnPZ5D2/myvZ/s38x+U3uhmew86JqJBe18PWka7X4jfebry5efTxX7EaOPQhHwcmvYxKJVqFn+GDmpJ0gwiB5Gcu6GAY0qKN6cLOaBDwWDz63Glp0sqkcW1kjSDKBokwIt6i1P5sMRUi3ffkXeQN7vttlvR6ZJKZU6eJM0gAqN42phgKB78KBUPXLSdeeaZq62fKmnyGeRJkiRVyOJaSZKkChnkSZIkVcggT5IkqTq93n8AHztV5lgfEVAAAAAASUVORK5CYII=\" width=\"633\" height=\"221\"\u003e\u003c/p\u003e\n\u003ch2\u003e2.3 Effect of co-speech gesture on temporal stability\u003c/h2\u003e\n\u003cp\u003eIn addition to greater movement magnitude, studies on coupling enhancement have shown greater temporal stability of synchronized vs. unsynchronized movement [15, 21, 22, 31]. Specifically for speech, the Coupled Oscillator Model of speech production proposes that temporal stability in CV sequences in English is the result of synchronized planning between the C and V gestures [e.g., 25, 28, 26, 27, 29]. Thus, to analyze the effects of temporal stability in the presence of a CSG, we begin by analyzing the temporal stability in target achievement in the stressed syllable CV sequence (CV stability), where the articulator for target achievement depends on the quality of the segment, as detailed in \u0026sect;4.4.2. The results reveal greater stability, as indexed by relative standard deviation (RSD), in the temporal coordination of targets in the CV sequence when the target word is produced with a CSG compared to targets not produced with a CSG, as demonstrated in Fig. 5. A linear mixed effects model demonstrates that this effect of stress is significant (\u003cem\u003e\u0026beta;\u003c/em\u003e= 4.174e-02, \u003cem\u003et\u003c/em\u003e= -2.84, \u003cem\u003ep\u003c/em\u003e\u0026lt; 0.05), as was the interaction between gesture and stress position \u0026nbsp;(\u003cem\u003e\u0026beta;\u003c/em\u003e= -3.994e-02, \u003cem\u003et\u003c/em\u003e= -4.659, \u003cem\u003ep\u003c/em\u003e\u0026lt; 0.001). 
These results demonstrate that syllables synchronized with a CSG are more temporally stable than those not synchronized with a gesture.\u003c/p\u003e\n\u003cp\u003eWe likewise analyzed temporal stability in synchronization between articulators in producing the stressed vowel (within-segment stability). Both the tongue and the jaw move downward during the production of a stressed vowel; thus, within-segment stability can be understood as how closely the timing of maximum displacement is synchronized between the TB and JW during stressed vowel production. Figure 6 demonstrates that synchronization between TB and JW is greater in the presence of a CSG compared to tokens produced without a CSG. A linear mixed-effects model of TB-JW lag reveals that this effect is significant across stress conditions (\u003cem\u003e\u0026beta;\u003c/em\u003e=2.842e-02,\u003cem\u003e\u0026nbsp;t\u003c/em\u003e=6.782,\u003cem\u003e\u0026nbsp;p\u003c/em\u003e\u0026lt; 0.001).\u003c/p\u003e\n\u003cp\u003eAs mentioned in \u0026sect;1, studies suggest a positive relationship between the strength of temporal coupling of movements and the magnitude of movement [15, 21, 22, 31]. We tested this hypothesis in our data by analyzing the correlation between the temporal lag, i.e., synchronization, between TB and JW movements (within-segment stability) and the magnitude of TB vertical displacement. We also examined the correlation between the lag in timing between gesture apex and TB and the magnitude of TB vertical displacement. A Pearson correlation test demonstrates a significant positive correlation between TB-JW lag and TB displacement (Fig. 7a) in both stress conditions (Final Stress: \u003cem\u003er\u003c/em\u003e(660)=.22, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001; Initial Stress: \u003cem\u003er\u003c/em\u003e(768)=.19, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001). A stronger correlation was observed between TB-Apex lag and TB displacement (Fig. 
7b), but only in the final stress condition (Final Stress: \u003cem\u003er\u003c/em\u003e(660)=.48, \u003cem\u003ep\u003c/em\u003e\u0026lt;0.001; Initial Stress: \u003cem\u003er\u003c/em\u003e(768)=.014,\u0026nbsp;\u003cem\u003ep\u003c/em\u003e\u0026gt;0.05). Overall, these results demonstrate that as stability increases, magnitude also increases. We return to differences across stress conditions and explore possible sources of these differences in \u0026sect;3.\u003c/p\u003e"},{"header":"3. Discussion","content":"\u003cp\u003eCo-speech gestures are ubiquitous in naturalistic communication. Although they are known to aid in perception of speech and in conveying semantic and pragmatic information, our study provides evidence that the facilitative role of gestures extends to speech production. Specifically, our results suggest that speech movements timed to co-speech gestures show enhanced magnitude and temporal stability. Consistent with \u003cspan type=\"SmallCaps\" class=\"SmallCaps\" name=\"Emphasis\"\u003eh3\u003c/span\u003e, across all subjects in our sample, vowel durations were longer and tongue position was lower for speech produced synchronously with a manual co-speech gesture than for speech produced without a gesture. Most subjects likewise achieved lower JW positions for target words in utterances containing a gesture.\u003c/p\u003e \u003cp\u003eComplementing the increase in magnitude found in the gesture condition, we also find evidence that articulatory movements were less variable when produced with co-speech gestures, consistent with \u003cspan type=\"SmallCaps\" class=\"SmallCaps\" name=\"Emphasis\"\u003eh4\u003c/span\u003e. First, we found relatively lower variability in consonant-vowel timing in the presence of a gesture, particularly in the final stress condition. 
Second, within stressed vowels (the syllables to which gestures were most closely timed in our data), the lag between target achievement of TB and JW movements was found to be lower overall in the CSG condition. Furthermore, we find a direct link between stability of timing to the CSG and magnitude of oral articulatory gestures, as tighter timing between TB and CSG correlated positively with greater movement of the TB in words with final stress. Taken together, our results display the trademarks of coupling-based movement enhancement and stabilization that have been described within several domains of motor control [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e, \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eOur results suggest that by producing speech with gestures, speakers are, in effect, stabilizing their own articulatory movements and enhancing coordination between those movements. This facilitative effect is likely a motivating factor behind the pervasive use of gestures even when they do not play a direct role in facilitating speech perception. 
Furthermore, the kinds of facilitative effects observed in co-speech gesture coupling suggest that gestures may play an active role in enhancing fluency in a variety of speech and language disorders [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e] as well as in L2 learning [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eOur findings on the effects of CSG on the magnitude of speech gestures also shed light on the relationship between co-speech gestures and speech prosody. While our results demonstrate an effect of hyperarticulation in comparing stressed and unstressed syllables (\u0026sect;\u0026nbsp;3.1), the effects of gesture on speech are distinct from those of stress: we find a uniformly lower tongue position across vowel types in the presence of a co-speech gesture, consistent only with sonority expansion effects. Meanwhile, we find no evidence to support an added hyperarticulation effect in the presence of a co-speech gesture. Thus, it does not appear that gesture presence leads to an increase in the magnitude of stress correlates in speech. Our results also provide counterevidence to the idea that speech and gesture serve as redundant and complementary cues to prosodic prominence in the way that e.g. voice onset time, vowel duration, and fundamental frequency complement one another in cuing voicing contrasts in some languages [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]; if this were the case, we would hypothesize that oral gestures should be less extreme in the context of a CSG. However, we find no evidence of a trading relation between speech and co-speech gesture in marking prominence in our data. 
Instead, it appears that gestures are inducing a biomechanical effect on articulatory gestural magnitude, which does not implicate phonology or prosody directly [cf. 31]. We term this biomechanical effect \u003cem\u003ecoupling enhancement\u003c/em\u003e.\u003c/p\u003e \u003cp\u003eOur findings also point to new avenues for future research on the relationship between speech articulation, gesture coordination, and rhythm in language. Our findings indicate that the effects of co-speech gestures on speech dynamics were much stronger for words with final stress than for words with initial stress. Though we cannot say definitively why this difference arose across stress conditions, we look to research on the timing of articulatory gestures at syllable boundaries for some clues. Disyllabic words with initial stress have been shown to display greater variability of timing in tongue movements for medial consonants, as medial consonantal gestures (which would normally be expected to serve as onsets to the final syllable) tend to show an attraction to the preceding stressed syllable [46, 47]. Variability of this magnitude is not observed with disyllabic final stress words. We hypothesize that this variability in coordination reduces the effect of CSG on magnitude and stability for initial stress tokens in our study.\u003c/p\u003e \u003cp\u003eFinally, we believe our findings can help to bridge a gap between theories of speech-motor coordination and prosody. Prior work has proposed that speech and CSG are jointly controlled by the same prosodic planning mechanism [48, 49]. For example, [49] find that both speech articulatory gestures and co-speech gestures display differences in duration as a function of prosodic prominence. They propose that the same clock-slowing mechanism for inducing longer durations on speech articulations under prominence may also operate over co-speech gestures. 
Our own results suggest that coupling speech to gesture has effects that are independent of prosodic effects. However, the effects we observe in our data could act as a precursor to such a prosodic control mechanism. Specifically, the duration-enhancing effect of gesture on speech coupling could be grammaticalized and applied predictably to different prosodic environments, similar to the grammaticalization of lexical tone based on f0 perturbations over time [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. Indeed, the idea that gesture can entrain articulatory and acoustic cues to prosodic prominence has been postulated on an even shorter timeline, as young children have been shown to employ prosodically appropriate co-speech gestures before they learn to produce the acoustic cues to prominence that accompany such gestures in adult speech [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e]. If this hypothesis proves to be correct, it will have important implications for the study of prosody and language typology. Concretely, given cross-linguistic variability in the timing between co-speech gestures and speech, we may expect to see correlations between gesture timing strategies and prosodic patterns across the world\u0026rsquo;s languages. A better understanding of gesture as a potential phonetic precursor may help to shed light on patterns of sound change and the evolutionary relationship between speech and co-speech gesture.\u003c/p\u003e"},{"header":"4. Methods","content":"\u003ch2\u003e4.1 Participants\u003c/h2\u003e\n\u003cp\u003ePrior approval from the Institutional Review Board (IRB) was obtained for this study and appropriate protocols were followed for data collection. All methods were performed in accordance with relevant guidelines and regulations. This included obtaining participants\u0026apos; written informed consent to participate in the study. 
Ten subjects (1 male, 8 female, 1 unspecified) participated in the study. Participants consisted of students and young professionals recruited in the greater Boston, Massachusetts area; ages ranged from 20 to 40. All participants reported US English as their native language and none of the participants reported any history of speech/language or vision impairment.\u003c/p\u003e\n\u003ch2\u003e4.2 Materials\u003c/h2\u003e\n\u003cp\u003eStimuli consisted of nonce words of the shape CVbV, controlling for vowel quality (/i, o, a~ə/), initial consonant (/s, p, l/), and stress position (initial vs. final, e.g., /sóbo/ vs. /sobó/). Vowels were chosen to mimic stressed and unstressed vowels in English while sampling a range of the total vowel space. As will be detailed below, target words were produced in one of two conditions: with or without a concurrent co-speech gesture. Examples of the target words are provided in Table 2. All target words were produced in the carrier phrase \u003cem\u003eI saw the ____ today\u003c/em\u003e, which situates the target word in the environment V#CVCV#C and thereby maximizes clarity in articulatory data parsing. The task was blocked by stress condition and by co-speech gesture condition, with the order of blocks randomized by participant. Half of participants produced the gesture block followed by the non-gesture block and the other half produced the non-gesture block followed by the gesture block. Each token was repeated 6 times throughout the task block and the order of tokens was randomized within the task block. 
A total of 216 tokens per subject were produced (9 word shapes x 2 stress conditions x 2 gesture conditions x 6 repetitions).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 2:\u0026nbsp;\u003c/strong\u003eTarget word shapes included in the study, where IPA transcripts are provided on the first line and orthographic transcripts seen by participants are included on the second line.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"467\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/s/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/p/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/l/\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e/i /\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/síbi, sibí/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/píbi, pibí/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/líbi, libí/\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;seeybee, seebeey\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;peeybee, peebeey\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;leeybee, leebeey\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e/o /\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/sóbo, sobó/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/póbo, pobó/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/lóbo, 
lobó/\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;sohbo, soboh\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;pohbo, poboh\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;lohbo, loboh\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e/a~ə/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/sábə, səbá/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/pábə, pəbá/\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e/lábə, ləbá/\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;sahbuh, suhbah\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;pahbuh, puhbah\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u0026lt;lahbuh, luhbah\u0026gt;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n\u003ch2\u003e4.3 Procedure\u003c/h2\u003e\n\u003cp\u003eStimuli were displayed using OpenSesame software [52] on a computer monitor at a comfortable distance from each participant. A member of the research team manually progressed through each experimental item to ensure that each token was produced correctly before moving on to the next item. 
If the participant made a speech error, the researcher prompted the participant to try again and the target utterance was immediately repeated.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eEach task block contained a short set of instructions that told the participant what to expect during the task and provided an example sentence to illustrate stress production. The instructions for each task block were read aloud by a researcher. Sample sentences on instruction screens contained words that replicated the same stress construction, but with phones that were not present in the stimulus set. In order to ensure that the prosodic context was similar between gesture and non-gesture conditions, for all items, participants were given the same prompt, shown in Ex 1.\u0026nbsp;\u003c/p\u003e\n\u003col\u003e\n \u003cli\u003e\u003cem\u003eImagine you\u0026rsquo;re traveling in a foreign land and see a famous landmark while out exploring. You run into a friend in your travels and excitedly tell them about your experience.\u0026nbsp;\u003c/em\u003e\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003ePrior to the start of the gesture condition experiment block, participants were presented with a single demonstration video of a researcher naturally producing a sample sentence with a bimanual co-speech gesture produced synchronously with the target word. Participants were asked to model their gesture after the video, and to use both hands for the gesture, but were not explicitly told to copy all aspects of the model gesture. Participants were also not explicitly told when they should begin or end the gesturing within the spoken sentence. 
Participants were only instructed to produce a manual gesture as they read each sentence aloud.\u003c/p\u003e\n\u003ch2\u003e4.4 Data collection\u003c/h2\u003e\n\u003ch3\u003e4.4.1 Acoustic data collection\u003c/h3\u003e\n\u003cp\u003eThe primary acoustic data used for acoustic analyses in this study were collected using a Rode NTG2 shotgun microphone. The microphone was attached to a boom arm mounted to the desk and positioned above the participant\u0026rsquo;s head. These data were time-aligned with the EMA data files; for both the primary acoustic data and the EMA data, each utterance was saved individually. However, the video data was not time-aligned to the primary acoustic data and EMA recordings; thus, secondary acoustic data was also recorded for the video data in order to time-align the two data sets, as discussed in \u0026sect;4.5.1. Secondary acoustic data time-aligned to the video was recorded using a Zoom Q8 Handy Video Recorder Microphone.\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003e4.4.2 EMA data collection\u003c/h3\u003e\n\u003cp\u003eEMA data was collected using the NDI Wave system, a point-tracking system with accuracy within approximately 0.5 mm [53] and a sampling rate of 100 Hz. NDI Wave 5 Degree of Freedom (5DoF) sensors were attached to the face and mouth to capture movement of the articulators. Reference sensors were used to track the position and movement of the head in order to correct articulatory data for head movement, as discussed in more detail in \u0026sect;4.5.2. Reference sensors consisted of five NDI Wave 5DoF sensors. Three 5DoF reference sensors were placed on the right mastoid (RMA), left mastoid (LMA) and on the bridge of the nose (NAS), as illustrated in Fig. 8a; these sensors remained in place for the entirety of the experiment. Two additional sensors were used to record the occlusal plane of each participant. 
Sensors were attached along the sagittal midline to a wax bite plate with one sensor aligned with the front incisor (OS) and the other aligned with the back molar (MS), as illustrated in Fig. 8b. Prior to the presentation of the stimuli, participants were asked to hold still with the bite plate between their teeth for five seconds while a recording was made of their position. The bite plate was then removed and set aside for the remainder of the experiment.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eMovement of the oral articulators was tracked using six 5DoF sensors attached to the incisors, lips, and tongue, as shown in Fig. 9. Sensors were attached to the upper and lower lips near the vermilion border and to the gums below the lower incisor to track the lower jaw (JW), and three sensors were attached along the sagittal midline of the tongue at the tongue tip (TT), blade (TB), and dorsum (TD). The front-most sensor (TT) was attached less than 1 cm from the tip of the tongue, the back-most sensor (TD) was attached as far back as was comfortable for the participant, typically 4-6 cm from the tongue tip, and the third sensor (TB) was placed midway between the TT and TD sensors.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe participant was seated in front of the experiment monitor and beside the EMA magnet. A researcher then created a mold of the participant\u0026rsquo;s bite and reference sensors were attached before collecting the bite plate recording. Next, positions for the oral sensors were marked before gluing to ensure consistency in placement in the case of re-gluing, and the sensors were then attached accordingly. Tongue and jaw sensors were attached using glue and lip sensors were attached using medical-grade tape. 
Before beginning stimuli collection, the researcher engaged the participant in conversation for several minutes to help them adjust to speaking with the attached sensors.\u003c/p\u003e\n\u003cp\u003eThroughout the study, one researcher was assigned the task of controlling the experiment program while another was assigned the task of monitoring the sensor tracking to ensure that all sensors were actively tracking. The researcher monitoring the sensor tracking would interrupt the experiment to notify the other researcher of a sensor that began to behave irregularly. In such cases, sensor stability was assessed and sensors were secured as needed.\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003e4.4.3 Video data collection\u003c/h3\u003e\n\u003cp\u003eVideo data was collected using a Zoom Q8 Handy Recorder with a 160\u0026deg; wide-angle lens and a 30 fps framerate. The camera was mounted directly above the participant\u0026rsquo;s monitor and angled to ensure it captured the entirety of a participant\u0026apos;s manual gestures, from resting position in the participant\u0026apos;s lap to full extension during the manual beat gesture. Examples of still images from the video recording are provided in Fig. 10, where (a) demonstrates a no-gesture condition where the hands were kept at rest in the lap and (b) demonstrates a gesture condition at the point of maximum extension during the beat gesture production. 
Each task block comprised its own video recording.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eInformed consent was obtained from all subjects and/or their legal guardian(s) for publication of identifying information/images in an online open-access publication; however, out of consideration for the participant, we obscured the eyes to protect their privacy.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003e4.5 Data processing\u003c/h2\u003e\n\u003ch3\u003e4.5.1 Acoustic data processing\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003ePrimary acoustic data was force-aligned using the Montreal Forced Aligner (MFA) [54]. Pitch maxima were extracted from each vowel in target words using a script for Praat [55].\u003c/p\u003e\n\u003cp\u003ePrimary acoustic data, which was time-aligned to the EMA recordings, and secondary acoustic data, which was time-aligned to the video data, were merged using the Python tool \u003cem\u003eaudalign\u003c/em\u003e, which identifies similarities across audio files in order to allow for time alignment of signals [56]. The alignment between primary and secondary acoustic data was reviewed by the researchers to identify and correct any misalignments between the two data sets. Corrections were made by identifying a unique acoustic landmark within the utterance that could be used to synchronize the two recordings.\u003c/p\u003e\n\u003ch3\u003e4.5.2 EMA data processing\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eThe EMA data was rotated along the mid-sagittal plane and head movements were corrected for in Python [57] using the reference and bite plate sensors [58], such that the origin of the spatial coordinates corresponds to the front teeth. All articulatory trajectories were smoothed using Garcia\u0026rsquo;s robust smoothing algorithm [59]. All EMA data was visually inspected by a research assistant to ensure that no recordings containing major tracking errors or detached sensors were included in the data analysis. 
Major errors in the recording were identified by checking whether all movements were consistent with physically possible movement of the articulators. Recordings that included any major errors in tracking were removed from the dataset prior to analysis.\u003c/p\u003e\n\u003cp\u003eFollowing a similar procedure to that implemented in [60], relevant gestural targets, including the points of minimum and maximum displacement in the horizontal and vertical planes, as well as local x and y speed minima, were automatically identified in Python using a window determined by the acoustic signal. The articulator used to determine target achievement differs depending on the segment quality and the kinematic profile of the utterance. Target identification proceeded hierarchically based on the articulatory parameters of the segment: if a target could not be detected on the basis of the first articulator, a target was assessed using the second and, finally, the third articulator, as detailed in Table 3. Typically, the secondary and tertiary levels were only needed in the case of unstressed vowels, which are not the focus of this study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 3:\u003c/strong\u003e Articulators used in determining target achievement for a given segment. 
The secondary tier of the table was only used if no target could be identified from the primary level, and the tertiary tier was only used if no target could be identified from either the primary or the secondary level.\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"624\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eb\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003ep\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003el\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003es\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003ea ~ ə\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003ei\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eo\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eprimary\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eLL/UL\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eLL/UL\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eJW\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eJW\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eJW\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003esecondary\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026ndash;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTB\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTB\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTB\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTB\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTB\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003etertiary\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eTT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 12.5%;\"\u003e\n \u003cp\u003eLL/UL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n 
\u003c/tbody\u003e\n\u003c/table\u003e\n\u003ch3\u003e4.5.3 Video data processing\u0026nbsp;\u003c/h3\u003e\n\u003cp\u003eCo-speech gestures were coded by a team of researchers trained in gesture coding using ELAN [61] following the MIT Gesture Studies Coding Manual [62], which outlines several phases of the gesture including preparations, strokes, holds, and recoveries based on [7].\u003c/p\u003e\n\u003cp\u003eThe apex of the gesture was automatically extracted based on manual annotations of gesture strokes using MultiPose [63], which uses MediaPipe [64] to track pixel movement in the video recording to extract a set of xy coordinates for a given articulator. In this study, we identified the right wrist as the most stable articulator for identifying the apex of the co-speech gesture (though there was one left-handed participant in the sample, he produced all gestures with both hands). The MultiPose workflow can identify several key kinematic landmarks for a given articulator. In this study, we defined the CSG apex as the xy speed minimum, which closely corresponds to the point of maximum extension.\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003e4.6 Statistical analysis\u003c/h2\u003e\n\u003cp\u003eThe data was analyzed in R [65] using a combination of Pearson Correlation Testing, Linear Mixed Effects Models (lmer) [66], and Generalized Additive Mixed Models (GAMM) [67].\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e4.6.1 Linear mixed effects models\u003c/p\u003e\n\u003cp\u003eLinear mixed effects (lme) models were used to evaluate differences in displacement of oral articulators at specified timepoints and duration of vowels in target words. We use an lme model to model the interaction of stress and vowel quality on TB vertical and horizontal displacement at the point of achievement during production of stressed and unstressed vowels (\u0026sect;2.1). 
We likewise used lme models to predict the interactional effect of stress and gesture on a number of variables including stressed vowel duration (\u0026sect;2.2.2), CV stability (\u0026sect;2.3) and synchronization between TB and JW within the stressed vowel (\u0026sect;2.3).\u003c/p\u003e\n\u003cp\u003eSubject was included as a random intercept for all lme models. Following [68], models were initially fit with maximal random effects structures, and random slope parameters were removed from the model only when doing so eliminated a singular fit. The resulting models are illustrated in (2), where X is determined by the analysis as outlined above.\u0026nbsp;\u003c/p\u003e\n\u003col start=\"2\" type=\"1\"\u003e\n \u003cli\u003eLinear mixed effects model structures\u003col start=\"1\" type=\"a\"\u003e\n \u003cli\u003elmer(X~stress*vowel_quality+(1+stress|Subject))\u003c/li\u003e\n \u003cli\u003elmer(X~stress*gesture+(1+gesture|Subject))\u003c/li\u003e\n \u003c/ol\u003e\n \u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003e4.6.2 Generalized Additive Mixed Models\u003c/p\u003e\n\u003cp\u003eGeneralized Additive Mixed Model (GAMM) analyses were implemented using the \u003cem\u003ebam\u003c/em\u003e function of the mgcv package to assess differences in displacement and velocity of the oral articulators over time during target word production between the gesture and the no-gesture conditions. All GAMM analyses used the basic model formula provided in (3), where X is either the vertical displacement, horizontal displacement, or vertical velocity of the relevant sensor, and time is normalized such that the onset of the target word corresponds to 0 and the offset of the target word corresponds to 1. These models predict a given articulatory trajectory with gesture presence, smoothed time, and time smoothed by gesture as fixed effects, the random intercepts of subject and gesture, and the random smooths of subject and time. 
This model gives both the nonlinear and the constant difference between the gesture and no-gesture conditions.\u003c/p\u003e\n\u003col start=\"3\"\u003e\n \u003cli\u003eGeneralized additive mixed model structure\u003col start=\"1\" type=\"a\"\u003e\n \u003cli\u003ebam(X~gesture + s(time) + s(time, by=gesture, bs=\u0026quot;tp\u0026quot;, k=10) + s(subject, gesture) + s(time, subject))\u003c/li\u003e\n \u003c/ol\u003e\n \u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eAcross all data in the GAMM analyses, time was normalized to the production of the target word, and articulatory trajectories and velocities were compared in the presence of a CSG (gesture condition) and the absence of a CSG (no-gesture condition).\u0026nbsp;\u003c/p\u003e\n\u003ch2\u003e4.7 Defined measures of analysis\u0026nbsp;\u003c/h2\u003e\n\u003cp\u003eIn this section, we define each of the measures used in the analysis of this study.\u003c/p\u003e\n\u003col start=\"4\" type=\"1\"\u003e\n \u003cli\u003e\u003cstrong\u003eApex (AX)\u003c/strong\u003e is defined as the point of minimum speed of the right wrist during the execution of a gesture.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eTime of max f0\u003c/strong\u003e is defined as the time of the pitch peak for the phone coinciding with the gesture apex.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eTB vertical displacement\u003c/strong\u003e is defined as the vertical (y) position of the TB sensor at the target achievement of the vowel.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eTB horizontal displacement\u003c/strong\u003e is defined as the horizontal (x) position of the TB sensor at the target achievement of the vowel.\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eStressed vowel duration\u003c/strong\u003e: Acoustic duration of the stressed vowel as defined by the start point and end point of the parsed segment in the forced alignment process.\u003c/li\u003e\n 
\u003cli\u003e\u003cstrong\u003eCV lag Relative Standard Deviation (RSD)\u003c/strong\u003e: standard deviation of the lag divided by the mean of the lag, where lag is the target achievement of the consonant subtracted from the target achievement of the vowel, and target achievement is the local xy speed minimum of a given gesture (as defined in \u0026sect;4.4.2).\u003c/li\u003e\n \u003cli\u003e\u003cstrong\u003eJW to TB lag\u003c/strong\u003e: Absolute value of the lag between maximum extension of the TB and JW during stressed vowel production.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Declarations","content":"\u003ch3\u003e6. Data accessibility\u003c/h3\u003e\n\u003cp\u003eData files and scripts for statistical analysis are included with the supplementary files and will be made available in an open-access repository upon acceptance of the paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e7. Acknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWithheld for anonymity. To be added upon acceptance.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e8. Author contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAuthor 1 and Author 3 conceived of the study and wrote the main manuscript text; all authors contributed to the data collection, processing, and writing the methods section. Author 1 prepared all figures. All authors reviewed the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e9. Additional information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e9.1 Competing interests statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNone declared.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e10. Ethics declarations\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was granted ethics approval by the Institutional Review Board of [WITHHELD FOR ANONYMITY] (Protocol IRB 22-1097).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSueyoshi, A, \u0026amp; Hardison, D. M. 
The role of gestures and facial cues in second language listening comprehension. Language Learning 55, 661\u0026ndash;99 (2005).\u003c/li\u003e\n\u003cli\u003eHostetter, Autumn B. When do gestures communicate? A meta-analysis. Psychological Bulletin 137, 297\u0026ndash;315 (2011).\u003c/li\u003e\n\u003cli\u003eGoldin-Meadow, S. \u0026amp; Alibali, M. W. Gesture\u0026rsquo;s role in speaking, learning, and creating language. Annual Review of Psychology 64, 257\u0026ndash;83 (2013).\u003c/li\u003e\n\u003cli\u003eBavelas, J. Gerwing, J. Sutton, Ch. \u0026amp; Prevost, D. Gesturing on the telephone: Independent effects of dialogue and visibility. Journal of Memory and Language 58. 495\u0026ndash;520 (2008).\u003c/li\u003e\n\u003cli\u003e\u0026Ouml;z\u0026ccedil;alışkan, Ş., Adamson, L.B., Dimitrova, N., \u0026amp; Baumann, S. Early gesture provides a helping hand to spoken vocabulary development for children with autism, down syndrome, and typical development. Journal of Cognition and Development 18, 325\u0026ndash;37 (2017).\u003c/li\u003e\n\u003cli\u003eEsteve-Gibert, N., Borr\u0026agrave;s-Comes, J., Asor, E., Swerts, M., \u0026amp; Prieto, P. The timing of head movements: The role of prosodic heads and edges. The Journal of the Acoustical Society of America. 141, 4727\u0026ndash;4739 (2017). \u003c/li\u003e\n\u003cli\u003eKendon, A. Gesticulation and speech: Two Aspects of the process of utterance in The Relationship of Verbal and Nonverbal Communication. (ed. Key, M. R.) 207\u0026ndash;228 (De Gruyter Mouton, 1980).\u003c/li\u003e\n\u003cli\u003eLeonard, T., \u0026amp; Cummins, F. The temporal relation between beat gestures and speech. Language and Cognitive Processes. 26, 1457\u0026ndash;1471 (2011). \u003c/li\u003e\n\u003cli\u003eLoehr, D. P. Temporal, structural, and pragmatic synchrony between intonation and gesture. Laboratory Phonology. 
3, 71-89 (2012).\u003c/li\u003e\n\u003cli\u003eRochet-Capellan, A., Laboissi\u0026egrave;re, R., Galv\u0026aacute;n, A., \u0026amp; Schwartz, J.-L. The speech focus position effect on jaw\u0026ndash;finger coordination in a pointing task. Journal of speech, language, and hearing research. 51, 1507\u0026ndash;1521 (2008). \u003c/li\u003e\n\u003cli\u003eMayberry, R. I., \u0026amp; Jaques, J. Gesture production during stuttered speech: Insights into the nature of gesture\u0026ndash;speech integration in Language and Gesture, (ed. McNeill D.) 199\u0026ndash;214 (Cambridge University Press, 2000). \u003c/li\u003e\n\u003cli\u003eDevanga, S. R., \u0026amp; Mathew, M. Exploring the use of co-speech hand gestures as treatment outcome measures for aphasia. Aphasiology. Advance https://doi.org/10.1080/02687038.2024.2356287 (2024). \u003c/li\u003e\n\u003cli\u003eBrady, J. P. Studies on the metronome effect on stuttering. Behaviour Research and Therapy. 7, 197\u0026ndash;204 (1969).\u003c/li\u003e\n\u003cli\u003eToyomura, A., Fujii, T., \u0026amp; Kuriki, S. Effect of external auditory pacing on the neural activity of stuttering speakers. NeuroImage. 57, 1507\u0026ndash;16 (2011).\u003c/li\u003e\n\u003cli\u003evon Holst, E. The behavioural physiology of animals and man in The collected papers of Eric von Holst. (University of Miami Press, 1973)\u003c/li\u003e\n\u003cli\u003eHoyt, D. F., \u0026amp; C. Taylor, R. Gait and the energetics of locomotion in horses. Nature. 292, 239\u0026ndash;40 (1981).\u003c/li\u003e\n\u003cli\u003eHaken, H., Kelso, J. A. S., \u0026amp; Bunz, H. A theoretical model of phase transitions in human hand movements. Biological Cybernetics. 51, 347\u0026ndash;356 (1985). \u003c/li\u003e\n\u003cli\u003eBeek, P. J., Peper, C. E., \u0026amp; Stegeman, D. F. Dynamical models of movement coordination. Human Movement Science. 14, 573\u0026ndash;608 (1995).\u003c/li\u003e\n\u003cli\u003eKelso, J. A. S. 
Dynamic patterns: The self-organization of brain and behavior. (MIT Press, 1995).\u003c/li\u003e\n\u003cli\u003eDe Poel, H. J., Roerdink, M., Peper, C. (L.) E., \u0026amp; Beek, P. J. A re-appraisal of the effect of amplitude on the stability of interlimb coordination based on tightened normalization procedures. Brain Sciences. 10 https://doi.org/10.3390/brainsci10100724 (2020).\u003c/li\u003e\n\u003cli\u003eSchwartz, M., Amazeen, E. L., \u0026amp; Turvey, M. T. Superimposition in interlimb coordination. Human Movement Science. 14, 681\u0026ndash;694 (1995).\u003c/li\u003e\n\u003cli\u003eKudo, K., Park, H., Kay, B. A., \u0026amp; Turvey, M. T. Environmental coupling modulates the attractors of rhythmic coordination. Journal of Experimental Psychology: Human Perception and Performance. 32, 599\u0026ndash;609 (2006). \u003c/li\u003e\n\u003cli\u003eFitts, P. M. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology. 47, 381\u0026ndash;391 (1954). \u003c/li\u003e\n\u003cli\u003eMessier, J, \u0026amp; Kalaska, J. F. Differential effect of task conditions on errors of direction and extent of reaching movements. Experimental Brain Research. 115, 469\u0026ndash;78 (1997).\u003c/li\u003e\n\u003cli\u003eKozhevnikov, V. A. \u0026amp; Chistovich, L. A. Speech: articulation and perception. (Joint Publications Research Service, 1966). \u003c/li\u003e\n\u003cli\u003eL\u0026ouml;fqvist, A., \u0026amp; Gracco, V. L. Interarticulator programming in VCV sequences: Lip and tongue movements. The Journal of the Acoustical Society of America. 105, 1864\u0026ndash;1876 (1999).\u003c/li\u003e\n\u003cli\u003eNam, H., Goldstein, L.M., \u0026amp; Saltzman, E. Self-organization of syllable structure: A Coupled Oscillator Model in Approaches to phonological complexity. 299\u0026ndash;328. (2009).\u003c/li\u003e\n\u003cli\u003eBrowman, C. P., \u0026amp; Goldstein, L. M. 
Some notes on syllable structure in articulatory phonology. Phonetica. 45, 140-155 (1988).\u003c/li\u003e\n\u003cli\u003eMarin, S., \u0026amp; Pouplier, M. Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control. 14, 380-407 (2010). \u003c/li\u003e\n\u003cli\u003eTilsen, S. et al. A cross-linguistic investigation of articulatory coordination in word-initial consonant clusters. Cornell Working Papers in Phonetics and Phonology. 51-81 (2012). \u003c/li\u003e\n\u003cli\u003eFranich, K. How we speak when we speak to a beat: The influence of temporal coupling on phonetic enhancement. Laboratory Phonology 13, https://doi.org/10.16995/labphon.6452 (2022).\u003c/li\u003e\n\u003cli\u003eCummins, F. On synchronous speech. Acoustic Research Letters Online. 3, 7\u0026ndash;11 (2002). \u003c/li\u003e\n\u003cli\u003eSwerts, M. G. J., \u0026amp; Krahmer, E. J. Facial expressions and prosodic prominence: Effects of modality and facial area. Journal of Phonetics. 36, 219-238 (2008). \u003c/li\u003e\n\u003cli\u003ede Jong, K. J., Beckman, M.E., \u0026amp; Edwards, J. The interplay between prosodic structure and coarticulation. Language and speech. 36, 197\u0026ndash;212 (1993).\u003c/li\u003e\n\u003cli\u003ede Jong, K. J. The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. Journal of the Acoustical Society of America. 97, 491\u0026ndash;504 (1995). \u003c/li\u003e\n\u003cli\u003eErickson, D. Articulation of extreme formant patterns for emphasized vowels. Phonetica. 59, 134\u0026ndash;149 (2002). \u003c/li\u003e\n\u003cli\u003eCho, T. Prosodic strengthening and featural enhancement: Evidence from acoustic and articulatory realizations of /ɑ, i/ in English. The Journal of the Acoustical Society of America. 117, 3867\u0026ndash;3878 (2005). \u003c/li\u003e\n\u003cli\u003eSteffman, J. 
Contextual prominence in vowel perception: Testing listener sensitivity to sonority expansion and hyperarticulation. JASA Express Letters 1, 045203. https://doi.org/10.1121/10.0003984 (2021). \u003c/li\u003e\n\u003cli\u003eEsteve-Gibert, N. \u0026amp; Prieto, P. Prosodic structure shapes the temporal realization of intonation and manual gesture movements. Journal of speech, language, and hearing research. 56, 850-864 (2013).\u003c/li\u003e\n\u003cli\u003eKrivokapic, J., Tiede, M. K., Tyrone, M. E., \u0026amp; Goldenberg, D. Speech and manual gesture coordination in a pointing task in Proceedings of Speech Prosody. 1240-1244 (2016).\u003c/li\u003e\n\u003cli\u003eMunhall, K. G., Ostry, D. J., \u0026amp; Parush, A. Characteristics of velocity profiles of speech movements. Journal of experimental psychology: Human perception and performance, 11, 457-474 (1985). \u003c/li\u003e\n\u003cli\u003eJohnson, K. Speech production patterns in producing linguistic contrasts are partly determined by individual differences in anatomy. UC Berkeley Phonetics and Phonology Lab Annual Report. http://dx.doi.org/10.5070/P7141042483 (2018).\u003c/li\u003e\n\u003cli\u003eKong, A. P.-H., Law, S.-P., Wat, W. K.-C. \u0026amp; Lai, C. Co-verbal gestures among speakers with aphasia: Influence of aphasia severity, linguistic and semantic skills, and hemiplegia on gesture employment in oral discourse. Journal of Communication Disorders 56, 88\u0026ndash;102 (2015). \u003c/li\u003e\n\u003cli\u003eCavicchio, F. \u0026amp; Grazia Bus\u0026agrave;, M. Lending a hand to speech: Gestures help fluency and increase pitch in second language speakers. LIA 14, 218\u0026ndash;246 (2023). \u003c/li\u003e\n\u003cli\u003eByrd, D., Tobin, S., Bresch, E., \u0026amp; Narayanan, S. Timing effects of syllable structure and stress on nasals: A real-time MRI examination. Journal of Phonetics. 37, 97\u0026ndash;110 (2009). \u003c/li\u003e\n\u003cli\u003eGarvin, K. Word-medial syllabification and gestural coordination. 
Doctoral Dissertation, University of California, Berkeley. (2021).\u003c/li\u003e\n\u003cli\u003eParrell, B., Goldstein, L., Lee, S., \u0026amp; Byrd, D. Spatiotemporal coupling between speech and manual motor actions. Journal of Phonetics. 42, 1\u0026ndash;11 (2014). \u003c/li\u003e\n\u003cli\u003eKrivokapić, J., Tiede, M. K., \u0026amp; Tyrone, M. E. A kinematic study of prosodic structure in articulatory and manual gestures: Results from a novel method of data collection. Laboratory Phonology. 8, https://doi.org/10.5334/labphon.75 (2017). \u003c/li\u003e\n\u003cli\u003eMatisoff, J. A. Tibeto-Burman tonology in an areal context in Procedings of the symposium: Cross-linguistic studies of tonal phenomena: Tonogenesis, typology and related topics (ed. Kaji, S.) 3\u0026ndash;32 (ILCAA, 1999). \u003c/li\u003e\n\u003cli\u003eEsteve‐Gibert, N., L\u0026oelig;venbruck, H., Dohen, M. \u0026amp; D\u0026rsquo;Imperio, M. Pre‐schoolers use head gestures rather than prosodic cues to highlight important information in speech. Developmental Science 25, e13154; https://doi.org/10.1111/desc.13154 (2022).\u003c/li\u003e\n\u003cli\u003eMath\u0026ocirc;t, S., Schreij, D. and Theeuwes, J. (2012). OpenSesame: An open-source, graphical experiment builder for the social sciences.\u003c/li\u003e\n\u003cli\u003eBerry, J. J. Accuracy of the NDI Wave speech research system. Journal of speech, language, and hearing research. 54, 1295-1301 (2011). \u003c/li\u003e\n\u003cli\u003eMcAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., and Sonderegger, M. Montreal Forced Aligner: trainable text-speech alignment using Kaldi. In Proceedings of the 18th Conference of the International Speech Communication Association. (2017). \u003c/li\u003e\n\u003cli\u003eBoersma, P., and Weenink, D. Praat: Doing phonetics by computer. 6.2.23 http://www.praat.org/ (2022).\u003c/li\u003e\n\u003cli\u003eMiller, B. Audalign 1.2.4. https://pypi.org/project/audalign/ (2024). 
\u003c/li\u003e\n\u003cli\u003eVan Rossum, G., \u0026amp; Drake Jr, F. L. Python reference manual. 3.10.12 Centrum voor wiskunde en informatica Amsterdam. (1995). \u003c/li\u003e\n\u003cli\u003eJohnson, K., \u0026amp; Sprouse, R. L. Head correction of point tracking data. UC Berkeley PhonLab Annual Report. 15, https://doi.org/10.5070/P7151050341 (2019).\u003c/li\u003e\n\u003cli\u003eGarcia, D. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics and Data Analysis. 54, 1167\u0026ndash;1178 (2010). \u003c/li\u003e\n\u003cli\u003eTiede, M. MVIEW: Multi-channel visualization application for displaying dynamic sensor movements. (2010) \u003c/li\u003e\n\u003cli\u003eELAN 6.4 https://archive.mpi.nl/tla/elan (2022)\u003c/li\u003e\n\u003cli\u003eMIT speech communication group gesture coding manual. http://scg.mit.edu/gesture/coding-manual.html\u003c/li\u003e\n\u003cli\u003eDych, W., Garvin, K., \u0026amp; Franich, K. Creating multimodal corpora for co-speech gesture research. CorpusPhon. abstr. (2024). \u003c/li\u003e\n\u003cli\u003eLugaresi et al. MediaPipe: A Framework for Building Perception Pipelines. (2019). \u003c/li\u003e\n\u003cli\u003eR Core Team. R: A language and environment for statistical computing. (2013). \u003c/li\u003e\n\u003cli\u003eBates D, M\u0026auml;chler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. 67, 1\u0026ndash;48 (2015).\u003c/li\u003e\n\u003cli\u003eWood, S. Generalized additive models: an introduction with R. (CRC Press, 2006).\u003c/li\u003e\n\u003cli\u003eBarr, D. J., Levy, R., Scheepers, C., \u0026amp; Tily, H. J. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language. 
68, 255–278 (2013).

Keywords: Speech, Co-speech gestures, Articulation, Prosody, Speech-motor coupling

License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Preprint DOI: https://doi.org/10.21203/rs.3.rs-5073434/v1 (version 1, posted November 19th, 2024)

Submitted to Scientific Reports on 2024-09-11; revision requested on 2024-10-31.

Version of record: Scientific Reports, published January 2nd, 2025, https://doi.org/10.1038/s41598-024-84097-6

Subject areas: Developmental biology; Evolutionary developmental biology; Motor control; Human behaviour
