Mapping internal representations of timbre and movement in metaphorical descriptions of violin performance

Aviel Sulem, Ehud Bodner, Noam Amir

Research Square preprint. This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7436859/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Expressive performance in Western classical music relies on performers’ interpretive choices and on expressive musical terms (EMTs) such as amoroso or risoluto. These terms often draw on metaphorical language, yet it remains unclear how EMTs relate to listeners’ cross-modal descriptions of sound and movement.
We investigated this question in violin performance, asking seven professional string players to evaluate short excerpts performed according to four EMT categories (Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso) and a neutral condition. Participants rated each excerpt on 26 timbre and eight movement descriptors, the latter derived from Laban Movement Analysis. Multidimensional scaling revealed three perceptual dimensions for timbre (Smoothness, Aliveness, Pureness) and a lower-dimensional structure for movement descriptors aligned with Laban effort factors. Correlation and clustering analyses showed that EMTs consistently mapped onto distinct timbre–movement profiles, with clearer associations in less ambiguous performances. These findings demonstrate that EMTs are perceptually grounded in embodied cross-modal mappings, linking auditory qualities to imagined movement. The study provides a systematic framework for understanding expressive intention in performance and offers applications in pedagogy and computational modeling.

Subject areas: Physical sciences/Mathematics and computing; Biological sciences/Neuroscience; Biological sciences/Psychology; Social science/Psychology

Keywords: perception of expression, violin performance, expressive musical terms, timbre, fictional movement, Laban movement analysis

Introduction

In Western classical music, expressive performance emerges from the interplay between performers’ interpretive choices and the expressive musical terms (EMTs) notated by composers. Performers shape expression by manipulating timing, dynamics, articulation, timbre, intonation, and vibrato—much like actors interpreting a script—while composers often guide interpretation through EMTs such as amoroso, giocoso, or risoluto. Rather than prescribing technical execution, these terms aim to evoke character and affect.
Previous work has mapped EMTs onto Russell’s (1980) circumplex model of affect, linking musical expression to the dimensions of valence and arousal (Sulem et al., 2019). Listeners and musicians also describe performances with adjectives that convey timbre (e.g., warm, brilliant, rough) (Holmes, 2012) or fictional movement—motion-like qualities metaphorically attributed to sound (e.g., free, sustained, flexible) (Johnson & Larson, 2003). Such descriptors are cross-modal, linking auditory perception with tactile, visual, and kinesthetic imagery (Haverkamp, 2009). Although these terms differ from EMTs in their typical use, both may draw from overlapping conceptual structures grounded in shared sensory–motor metaphors. Theories of conceptual metaphor (Lakoff & Johnson, 1980) propose that abstract expressive and emotional concepts are grounded in embodied sensory–motor schemas (Johnson, 1987; Gibbs, 2006), raising the possibility that EMTs and cross-modal descriptors share common conceptual foundations. Previous research has shown that timbre contributes to the communication of affect (Eerola, Ferrer, & Alluri, 2012; Hailstone et al., 2009; McAdams et al., 1995). Still, most studies have examined it using isolated tones or cross-instrument comparisons, rather than stylistically coherent performances on a single instrument. Studies of fictional movement have primarily focused on timing and phrasing, especially in keyboard performance (Repp, 1996; Repp, 1998), and in wind instruments such as the clarinet, where timing, timbre, and dynamics variations across phrases mediate expression (Barthet et al., 2010). The violin offers a particularly suitable context for studying these phenomena, as it can produce a wide range of timbres (Fritz et al., 2012) and movement-like qualities (Leman, 2010; Shove & Repp, 1995) using continuous pitch control, varied bowing techniques, and subtle changes in articulation. 
Yet, to the best of our knowledge, no previous studies have examined how musicians’ and listeners’ metaphorical descriptions of timbre and movement align with EMT-based evaluations in violin performance.

Here, we investigate whether EMT categories can be understood through patterns of cross-modal metaphors—descriptors of timbre and fictional movement grounded in sensory–motor imagery. To establish the descriptor set, we first conducted a pre-experiment in which professional string players selected timbre adjectives from a large candidate pool, yielding 26 descriptors most relevant to expressive violin performance (Fig. 1; see Methods). Movement descriptors were drawn from the four Effort factors of Laban Movement Analysis (LMA): Weight, Space, Time, and Flow. This yielded eight terms forming four bipolar pairs (strong–light, straight–flexible, sudden–sustained, free–bound) (Laban & Ullmann, 1971). In the main experiment, professional string players listened to short violin excerpts performed according to four EMT categories (Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso) and a neutral condition, and rated them using a graphical interface designed for timbre and movement evaluations (Fig. 2). Ratings by the same participants were collected in two separate sessions—one for perceived EMT categories (Sulem et al., 2023), the other for cross-modal metaphors—ensuring that any correspondences would arise from shared perceptual representations of the performance rather than direct comparison during the task.

We analyzed the structure of metaphor ratings using multidimensional scaling (MDS) and cluster analysis, and examined how the resulting perceptual dimensions aligned with EMT categories, affective models such as valence and arousal, and LMA effort factors. LMA offers a framework for describing physical movement qualities (Bartenieff & Lewis, 2013; Newlove & Dalby, 2019); here, we adapted it to characterize perceived dynamic qualities of sound.
We hypothesized that EMTs and cross-modal metaphors would occupy a shared perceptual–conceptual space; that movement descriptors would form dimensions aligned with affective axes such as valence and arousal; and that specific combinations of timbre and movement descriptors would reliably distinguish the four EMT categories. By linking EMTs to cross-modal metaphors in a single-instrument context, this study seeks to clarify how expressive intentions in violin performance are internally represented, and how abstract musical character is grounded in embodied auditory imagery (Godøy & Leman, 2010; Leman, 2007).

Results

To examine the relationship between EMT categories and the perceptual representations of timbre and movement descriptors in violin performance, we conducted four analyses. First, we characterized the global descriptor structure (Step 1). Second, we assessed how descriptors differentiated EMT categories and whether these patterns varied across performances previously rated as clearly, moderately, or ambiguously conveying their intended EMT (Sulem et al., 2023; Step 2). Third, we tested the correspondence between descriptor ratings and EMT ratings from our previous study (Step 3). Finally, we mapped EMT categories within the perceptual spaces defined by timbre and movement (Step 4).

Step 1: Perceptual spaces of timbre and movement descriptors

We first examined the perceptual structure of the 34 descriptors (26 timbre, 8 movement) using multidimensional scaling (MDS). Inspection of the stress function S (goodness of fit) indicated that a three-dimensional solution provided an excellent fit (S = 0.03), whereas two dimensions were insufficient (S = 0.10; Kruskal & Wish, 1978). The resulting space and its coordinate projections are shown in Fig. 3. The three dimensions were interpreted as Smoothness (rough–sharp vs. warm–mellow), Aliveness (dead–cold vs. alive–brilliant), and Pureness (dark vs. clean).
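As a reading aid, the stress value S used above to compare dimensionalities can be illustrated with Kruskal's stress-1 formula. The Python sketch below is a minimal illustration and not the authors' MATLAB analysis: the function name `stress1`, the toy coordinates, and the use of raw dissimilarities in place of the monotonically transformed disparities of non-metric MDS are all assumptions made for brevity.

```python
import math
from itertools import combinations


def stress1(points, dissim):
    """Kruskal stress-1 between a configuration and target dissimilarities.

    points: list of coordinate tuples, one per item.
    dissim: dict mapping index pairs (i, j) with i < j to target values.
    Assumption: targets are used directly; non-metric MDS would first
    replace them by monotone disparities.
    """
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])  # distance in the fitted space
        num += (d - dissim[(i, j)]) ** 2
        den += d ** 2
    return math.sqrt(num / den)


# Hypothetical toy data: three items whose dissimilarities form a 3-4-5 triangle.
target = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}

# A 2D configuration can reproduce the triangle exactly.
perfect_2d = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]

# A 1D configuration cannot, so its stress is larger.
squeezed_1d = [(0.0,), (3.0,), (4.0,)]

stress_2d = stress1(perfect_2d, target)
stress_1d = stress1(squeezed_1d, target)
```

In this toy case the two-dimensional layout has zero stress while the one-dimensional layout does not, which mirrors the logic of preferring the lowest dimensionality whose stress is acceptably small.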
A partition into five clusters, chosen on the basis of the spatial distribution of descriptors and intended to align with the five EMT categories from our previous study (Sulem et al., 2023), is depicted in Fig. 3 by distinct colors and symbols (cluster centroids marked by ⊗). The magenta square cluster was characterized by high Smoothness, the green downward triangle by low Smoothness, the red left triangle by high Aliveness, the blue circle by low Pureness, and the black hexagram by low Smoothness and low Aliveness. Repeating the analysis separately for performances previously rated as least, moderately, or most clearly conveying their intended EMT produced very similar layouts, indicating that inter-descriptor correlations were largely insensitive to the degree of match with intended EMTs and that participants rated descriptors consistently.

Strong and highly significant correlations (|r| > 0.7, p < 10⁻⁸) emerged between timbre and movement descriptors (Table 1). A first group of timbre descriptors—sweet, soft, airy, warm, mellow, and singing—was strongly associated with the movement qualities free and light. A partially overlapping set—soft, warm, mellow, singing, and calm—also correlated with flexible and sustained. Conversely, rough and sharp correlated with bound and strong; sharp and rough also with straight; and brash, brilliant, and sharp with sudden. Notably, sharp correlated strongly with both straight and sudden.

To characterize the movement space more explicitly, we observed in Fig. 4 (left) that all eight movement descriptors lay close to an oblique plane M (best fit: 0.21x + 0.45y – z – 0.01 = 0, RMSE = 0.19). An MDS restricted to the movement descriptors alone yielded a two-dimensional configuration (Fig. 4, right) that closely matched the projection of the movement points onto plane M.

Step 2: Descriptor ratings across EMT categories and matching levels

We next examined how suitable each descriptor was for distinguishing the intended EMT categories.
Figure 5 shows mean ratings (± SD) for timbre and movement descriptors, computed separately for performances at three confusion levels (least, moderate, most) in perceived EMT categories, as well as for all performances combined (descriptors ordered by overall means). Within each EMT, timbre and movement ratings decreased progressively, allowing us to distinguish strongly, moderately, and weakly appropriate descriptors (Table 2). To facilitate comparison across EMTs, Figure 6 presents an alternative view (all performances combined), with timbre and movement descriptors ordered separately within each EMT. EMTs with the highest mean descriptor ratings (“dominant EMTs”) generally corresponded to the five descriptor clusters identified in Step 1 (cluster colors shown on the x-axis), linking each cluster to a corresponding dominant EMT. Exceptions included brash, sudden, sustained, straight, ringing, and resonant, which showed comparable ratings across adjacent EMT categories and/or occupied boundary positions in the perceptual space.

A closer look reveals shared and distinctive descriptors across EMTs. For example, Amoroso/Affettuoso and Tristamente/Lagrimoso both include calm, singing, mellow, warm, and sweet, yet they diverge in that soft and airy are more typical of Amoroso/Affettuoso, while rich, deep, and dark are more typical of Tristamente/Lagrimoso. For movement, both are described as flexible and sustained, but Amoroso/Affettuoso tends more toward free and light. Finally, standard deviations across descriptors were generally smaller for the least-confused performances than for moderately or most-confused performances, indicating greater consistency among participants when the intended EMT was clearly conveyed.
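The summary statistics described in this step amount to grouping ratings by intended EMT category, confusion level, and descriptor, then taking means and standard deviations. The Python sketch below illustrates that grouping under stated assumptions: the long-format record layout, the function name `summarize`, and the toy ratings are all hypothetical, and the use of the sample standard deviation is an assumption because the text does not specify which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy records: (intended EMT, confusion level, descriptor, rating).
# Ratings follow the 1-4 appropriateness scale described in the Methods.
ratings = [
    ("A", "least", "warm", 4), ("A", "least", "warm", 3), ("A", "least", "warm", 4),
    ("A", "most", "warm", 1), ("A", "most", "warm", 4), ("A", "most", "warm", 2),
    ("R", "least", "warm", 1), ("R", "least", "warm", 1), ("R", "least", "warm", 2),
]


def summarize(records):
    """Return {(emt, level, descriptor): (mean, sample SD)} for the records."""
    groups = defaultdict(list)
    for emt, level, descriptor, rating in records:
        groups[(emt, level, descriptor)].append(rating)
    # stdev is the sample standard deviation (assumption, see lead-in).
    return {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}


summary = summarize(ratings)
mean_least, sd_least = summary[("A", "least", "warm")]
mean_most, sd_most = summary[("A", "most", "warm")]
```

In these invented data the least-confused group has both a higher mean and a smaller spread for "warm" than the most-confused group, which is the kind of pattern the step describes.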
Step 3: Correspondence between descriptor and EMT dimensional structures

To examine how EMT ratings relate to timbre and movement descriptor ratings, we computed Pearson correlations across performances, relating participants’ ratings of each EMT category (Sulem et al., 2023) to their ratings of each descriptor in the present study (Fig. 7). Descriptors are ordered so that, for each EMT, the strongest positive correlations appear adjacent. The y-axis is divided into three equal ranges, indicating positive, near-zero, and negative correlations. All reported correlations and anti-correlations were significant (p values between 10⁻²³ and 10⁻²). Strong positive associations (r ≥ 0.7) included: Amoroso/Affettuoso with timbre soft, sweet, warm, airy, singing, mellow and movement free, flexible, light; Giocoso/Animato with timbre alive, brilliant; Risoluto/Feroce with timbre brash, rough, sharp and movement strong, sudden; Tristamente/Lagrimoso with timbre soft, warm, mellow, calm and movement sustained; Neutral with timbre dead, monotonous, cold, plain, even. In most cases, descriptors most correlated with a perceived EMT were also those most characteristic of the same intended EMT (cf. Fig. 6), supporting alignment between the two rating sessions.

Step 4: EMT categories within timbre and movement descriptor perceptual spaces

Figure 8 (left) displays the EMT categories as points in the 3D descriptor space (coordinates from all performances). Coordinates were obtained as the linear combination of descriptor coordinates (Fig. 3), weighted by their corresponding ratings (Fig. 5). Categories are denoted A, G, R, T, and N, for Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso, and Neutral, respectively, with indices 1–3 for confusion levels and 0 for “all performances” (see Methods). The four expressive EMT categories lay close to a common plane P (RMSE = 0.05), described by 0.25x + 0.59y – z – 0.024 = 0.
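The reported plane equations allow a quick geometric check of how closely plane P aligns with the movement plane M from Step 1 (0.21x + 0.45y – z – 0.01 = 0). The Python sketch below is offered as an illustration rather than as the authors' procedure. It computes the angle between the two planes from their normal vectors and each plane's distance from the origin; the helper names `plane_angle_deg` and `origin_distance` are hypothetical, and the coefficients are the rounded values given in the text.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))


def plane_angle_deg(n1, n2):
    """Angle in degrees between two planes, from their normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    cos_theta = abs(dot) / (norm(n1) * norm(n2))
    return math.degrees(math.acos(min(1.0, cos_theta)))


def origin_distance(normal, offset):
    """Distance from the origin to the plane normal . (x, y, z) + offset = 0."""
    return abs(offset) / norm(normal)


# Planes written as a*x + b*y + c*z + d = 0, with rounded published coefficients.
normal_m, offset_m = (0.21, 0.45, -1.0), -0.01   # movement plane M (Step 1)
normal_p, offset_p = (0.25, 0.59, -1.0), -0.024  # EMT plane P (Step 4)

angle = plane_angle_deg(normal_m, normal_p)
dist_m = origin_distance(normal_m, offset_m)
dist_p = origin_distance(normal_p, offset_p)
print(f"angle = {angle:.1f} deg, dist M = {dist_m:.3f}, dist P = {dist_p:.3f}")
```

With these rounded coefficients the angle comes out at roughly 6 degrees, and both planes pass within about 0.02 units of the origin.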
Neutral performances lay along a straight line intersecting this plane, given by x = –0.45 – 25.20 (z – 0.10); y = –0.13 – 18.27 (z – 0.10). Notably, plane P almost coincided with plane M from the movement analysis: both pass within a few hundredths of the origin and form an angle of only ~6°. A corresponding analysis restricted to movement descriptors alone (Fig. 8, right) yielded EMT placements closely matching their positions relative to plane P. This reinforces the correspondence between dimensional structures derived from movement descriptors and those derived from affect (valence and arousal). Specifically, one movement dimension can be interpreted as wringing vs. dabbing, reminiscent of valence, while the other corresponds to floating vs. thrusting, reminiscent of arousal. Arousal emerged as the primary dimension, ahead of valence, which suggests that listeners were more sensitive to performance expression in terms of arousal than in terms of valence. These findings suggest an inherent relationship between mental representations of expressive violin performance in terms of movement, affect, and cross-modal descriptors.

Discussion

This study investigated how expressive musical terms (EMTs) in violin performance are perceptually grounded in timbre and fictional movement, expressed through cross-modal metaphors. Multidimensional scaling revealed that the full set of timbre and movement descriptors formed a three-dimensional perceptual space (Smoothness, Aliveness, Pureness), while an analysis restricted to the eight movement descriptors alone, derived from the Effort factors of Laban Movement Analysis, yielded a lower-dimensional plane closely aligned with the plane of the EMT categories. EMT categories mapped consistently within these spaces, arranged primarily along an arousal axis and secondarily along a valence axis, with arousal emerging as the dominant dimension.
Together, these findings demonstrate that EMTs are systematically linked to perceptual representations of timbre and movement, supporting the view that expressive musical performance is grounded in embodied auditory–motor associations.

Distinct perceptual structures for timbre and movement descriptors

The organization of timbre descriptors into three orthogonal dimensions is consistent with prior work showing that timbre perception is inherently multidimensional (Grey, 1977; McAdams, 2013; McAdams et al., 1995). Our dimensions—Smoothness, Aliveness, and Pureness—echo timbre qualities such as roughness, brilliance, and clarity that have been identified in listener-derived adjectives for violin and other instruments (Fritz et al., 2012). This supports the idea that listeners rely on multiple, partly independent acoustic cues when characterizing violin timbre.

Movement descriptors, by contrast, collapsed onto a lower-dimensional plane. This result dovetails with research on embodied music cognition (Godøy & Leman, 2010), where expressive qualities are often captured by holistic motor imagery rather than independent perceptual axes. The plane’s alignment with Laban’s Effort factors suggests that listeners interpret sound using embodied categories of weight, space, time, and flow. The two principal dimensions within this plane can be interpreted as effort actions such as wringing vs. dabbing and floating vs. thrusting, consistent with findings that listeners spontaneously map sound to motion metaphors (Eitan & Timmers, 2010; Sievers et al., 2013).

Mapping EMT categories within perceptual spaces

When EMT categories were mapped onto the 3D descriptor space, they occupied distinct locations, reminiscent of their location in Russell’s (1980) circumplex model of affect (Sulem et al., 2019), supporting consistency between conveyed and perceived expressions guided by EMTs.
Within this perceptual plane, the arrangement of the four expressive EMTs corresponded primarily to an arousal axis (e.g., Giocoso/Animato and Risoluto/Feroce at the high-arousal end; Amoroso/Affettuoso and Tristamente/Lagrimoso at the low-arousal end) and secondarily to a valence axis (e.g., Giocoso/Animato and Amoroso/Affettuoso more positive; Risoluto/Feroce and Tristamente/Lagrimoso more negative). This structure aligns with emotion research showing that arousal is often the most salient and discriminable dimension, with valence emerging as a secondary contrast (Posner et al., 2005; Scherer, 2005).

In contrast, Neutral performances were not located on the expressive EMT plane but along a separate axis, suggesting that expressiveness itself differentiates expressive from neutral interpretations (Sulem et al., 2023). Performances most clearly perceived as Neutral were positioned furthest from the expressive plane, whereas those perceived as less Neutral lay closer to it. This axis was not perfectly orthogonal to the expressive plane, indicating that Neutral performances were not entirely affectless but conveyed aspects of low arousal and negative valence.

Our confusion-level analyses further show how clarity of conveyed expression modulates EMT–descriptor mappings. Least-confused performances elicited consistent profiles across listeners, whereas most-confused performances produced more heterogeneous ratings. This pattern aligns with studies suggesting that expressive communication is categorical when cues converge, but becomes graded or ambiguous when cues are partially conflicting (Gabrielsson & Juslin, 1996; Palmer, 1997). Importantly, even under conditions of ambiguity, EMT categories retained distinct positions in the perceptual plane, pointing to robust perceptual anchors for each expressive intention.
Cross-modal correspondences in expressive communication

Correlation analyses between perceived EMT ratings and descriptor ratings showed that the descriptors most strongly associated with a given perceived EMT were generally those most characteristic of the same EMT when intended by the performer. This reinforces the view that EMTs are grounded in robust cross-modal associations between timbral qualities and imagined movement. Such correspondences may function as a shared expressive vocabulary among musicians, facilitating communication in rehearsal, pedagogy, and performance (Juslin, 2003; Palmer, 1997).

At the same time, overlap between categories, particularly in the movement domain, suggests that EMTs often draw on partially shared gestural schemata. This echoes findings that listeners associate similar motion metaphors with multiple emotions (Sievers et al., 2013) and that musical expression is perceived along both categorical and continuous dimensions (Cespedes-Guevara & Eerola, 2018). Thus, EMTs appear to function both as discrete labels and as pointers to underlying continuous perceptual–conceptual dimensions.

Methodological contributions and applications

The present approach—selecting empirically relevant descriptors, applying MDS to reveal perceptual structures, and mapping EMT categories onto these structures—provides a systematic framework for linking abstract musical terminology to measurable perceptual dimensions. While our participant pool was relatively small and highly expert, the ratings were consistent across participants, particularly for the least-confused EMT performances. Furthermore, the method is adaptable to other instruments, genres, and listener populations, including non-musicians.
Potential applications include: (1) Performance pedagogy: identifying combinations of timbre and movement cues that reliably convey specific expressive intentions (Palmer, 1997); (2) Cross-cultural research: testing whether similar perceptual structures and EMT mappings arise in different musical traditions; (3) Computational modeling: supplying empirically grounded dimensions for machine-learning models of expressive performance (Widmer & Goebl, 2004).

Limitations and future directions

The primary limitation of this study is the small, expert-only sample, which constrains generalization. Previous work shows that musical expertise influences both timbre categorization (McAdams et al., 1995) and metaphorical mappings of expression (Eitan & Timmers, 2010). Future research should therefore assess whether similar structures also hold among less-trained listeners and across different instruments. Another promising direction is to investigate the temporal dynamics of descriptors within performances, extending studies of expressive timing and phrasing (Repp, 1990; Palmer, 1997). Finally, focusing on Western classical violin raises the question of generalizability. Comparative work across instruments and cultures could help identify which aspects of timbre–movement mappings are universal and which are shaped by specific stylistic conventions (Balkwill & Thompson, 1999; Fritz et al., 2009).

Conclusion

By linking EMTs to both timbre and fictional movement descriptors in a controlled, single-instrument context, this study clarifies how listeners internally represent expressive intentions in violin performance. The findings highlight the complementary contributions of timbre and movement to expressive communication, supporting an embodied view of musical meaning, in which abstract expressive categories are grounded in both auditory–sensory and motor-imagery representations.
Methods

Participants

The experiment was conducted in the Computer Laboratory of the Music Department at Bar-Ilan University with seven professional string players serving as listeners. The group included four violinists, one violinist/violist, one violist, and one cellist, all specializing in classical music and graduates of renowned music academies in Brazil, England, France, Israel, the Netherlands, and Russia. Participants’ ages ranged from 35 to 72 years (M = 46, SD = 12), with an average of 23 years of professional experience (SD = 16) as orchestra, chamber music, and solo performers. Their varied training and professional backgrounds exposed them to a wide range of interpretative approaches, enhancing the representativeness of the sample. All participants were fluent in English, the language in which the rating scales were presented.

Materials

To examine the relationships between the dimensional structures underlying the perception of expressive musical terms (EMTs) and cross-modal metaphors, we conducted an experiment in which seven professional string players listened to and evaluated short violin excerpts in terms of timbre and fictional movement descriptors. The excerpts were performed by a professional violinist (the first author) according to five EMT categories: Amoroso/Affettuoso [A], Giocoso/Animato [G], Risoluto/Feroce [R], Tristamente/Lagrimoso [T], and Neutral [N]. For consistency, we used recordings from an audio corpus constructed in our previous study (Sulem et al., 2023). In that study, the violinist performed 20 excerpts twice for each of the four expressive EMT categories and once in a neutral manner. The list of excerpts, their main compositional characteristics, and corresponding scores are provided in Appendix A. All recordings had previously been rated by the same participants for the five EMT categories several weeks prior to the current experiment.
From this corpus, 10 performances were selected for each EMT category based on their perceived degree of match to the intended category, as reported in Experiment 1 of Sulem et al. (2023). The match score ranged from 0 (no match) to 5 (perfect match). For each category, we selected three performances [a, b, c] perceived as least confused, four [a, b, c, d] as moderately confused, and three [a, b, c] as most confused (i.e., involving perceptual ambiguity with other EMT categories). The complete list of 50 performances, along with their mean ratings and standard deviations for the intended EMT category, is provided in Appendix B.

Procedure

The procedure consisted of evaluating each performance in terms of timbre and fictional movement cross-modal metaphors. The timbre descriptors were selected through a preliminary pre-experiment, whereas the movement descriptors were chosen based on the effort factors of Laban Movement Analysis (LMA).

Selection of timbre descriptors. To identify a set of adjectives related to expressive violin performance from a broad pool of tone-quality descriptors, we first compiled 68 candidate adjectives (Appendix C): 55 from Fritz et al. (2012), six from Huang and Akagi (2008), and seven from Fischer (2004, 2013). Participants listened to 25 performances (five per EMT category, randomly chosen from the audio corpus) and, for each, freely selected any adjectives they felt were appropriate to describe the sound quality (no limit on number). This task lasted about 30 minutes. Figure 1 shows the selection frequency of each adjective in descending order. To create a manageable set for the subsequent rating phase, we reduced the list while maintaining coverage of the range of violin timbres. Adjectives with selection frequencies below the median value (16.0) were excluded. We then removed near-synonyms, retaining the most frequently selected term (e.g., focused → clear, lively → alive, sonorous → ringing).
Finally, we removed light, free, and strong, which were already included as movement descriptors, but retained rough, which, despite falling just below the median, had no equivalent among higher-frequency terms. This process resulted in 26 timbre adjectives (Figure 2). To validate the selection, we compared the selection frequencies of the included versus excluded adjectives using a Mann–Whitney U test. The retained adjectives were chosen significantly more often than the excluded ones (p < 0.0001).

Selection of movement descriptors. Movement descriptors were chosen to capture the fictional movement qualities of the sound, drawing on the four effort factors in LMA. For each factor, we used a pair of opposite adjectives: Strong–Light (Weight), Straight–Flexible (Space), Sudden–Sustained (Time), and Bound–Free (Flow). Originally developed to describe human motion, these terms are also commonly used by musicians to describe expressive performance.

Evaluation of performances. The 50 performances listed in Appendix B were presented in different random orders several weeks after the timbre descriptor selection phase. For each performance, participants rated the 26 timbre descriptors and the eight movement descriptors (Figure 2) according to how well they matched the performance, using a 4-point Likert scale: 1 = not appropriate, 2 = slightly appropriate, 3 = moderately appropriate, 4 = strongly appropriate. For each LMA effort factor, only one of the two opposite movement descriptors could be selected, with the option to assign a score of 0 to both if neither was appropriate. The experiment lasted approximately 1.5 hours.

The ethics committee of the Music Department at Bar-Ilan University approved the study. Each participant received ILS 250 for their time. All methods were performed in accordance with relevant guidelines and regulations. Written informed consent was obtained from all participants.
Data analysis

The analysis of listeners’ ratings was conducted in three main steps.

Step 1: Perceptual space of timbre and movement descriptors. We constructed a geometric representation of the perceptual space (Figure 3) using non-metric multidimensional scaling (MDS) in MATLAB. Each descriptor $d_i$ was represented as a 50-component vector $\mathbf{v}_i$ of mean ratings for the 50 performances. The descriptor–descriptor correlation matrix $C$ was computed with entries $C_{ij}$ given by the Pearson correlation between $\mathbf{v}_i$ and $\mathbf{v}_j$. A dissimilarity matrix $D$ was then defined in terms of $C$ and the identity matrix $I$. Applying the mdscale function to $D$ yielded the descriptor coordinates in the optimal dimensionality. We also performed hierarchical cluster analysis, partitioning the 34 descriptors into five clusters. In addition, a separate MDS analysis was conducted on the subset of movement descriptors alone (Figure 4, right panel).

Step 2: Description of EMT categories in terms of descriptors. For each intended EMT category, we computed the mean and standard deviation of each descriptor rating, averaged across participants and performances (Table 1, Figure 6). We repeated this analysis separately for performances classified as least, moderately, and most confused (Figure 5). Pearson correlations were also computed between descriptor ratings and the perceived EMT ratings from Experiment 1 in Sulem et al. (2023) (Figure 7).

Step 3: Relation between descriptor and EMT dimensional structures. Each EMT category was represented as a linear combination of descriptor coordinates (from Step 1), weighted by the mean descriptor ratings (from Step 2). These EMT representations were then projected into the descriptor perceptual space to visualize their spatial relationships (Figure 8).

Declarations

Author contributions

A.S., E.B., and N.A. conceived and designed the study. A.S. conducted the experiments and analyzed the data. A.S. wrote the first draft of the manuscript, and all authors contributed to writing, reviewing, and editing. E.B. and N.A.
supervised the project.
Funding Declaration
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data availability statement
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
References
Balkwill, L. L., & Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. Music Perception, 17(1), 43-64.
Bartenieff, I., & Lewis, D. (2013). Body movement: Coping with the environment. Routledge.
Barthet, M., Depalle, P., Kronland-Martinet, R., & Ystad, S. (2010). Acoustical correlates of timbre and expressiveness in clarinet performance. Music Perception, 28(2), 135-154.
Cespedes-Guevara, J., & Eerola, T. (2018). Music communicates affects, not basic emotions: A constructionist account of attribution of emotional meanings to music. Frontiers in Psychology, 9, 215.
Eerola, T., Ferrer, R., & Alluri, V. (2012). Timbre and affect dimensions: Evidence from affect and similarity ratings and acoustic correlates of isolated instrument sounds. Music Perception: An Interdisciplinary Journal, 30(1), 49-70.
Eitan, Z., & Timmers, R. (2010). Beethoven’s last piano sonata and those who follow crocodiles: Cross-domain mappings of auditory pitch in a musical context. Cognition, 114(3), 405-422.
Fischer, S. (2004). Practice: 250 step-by-step practice methods for the violin. Edition Peters.
Fischer, S. (2013). The violin lesson. Peters.
Fritz, C., Blackwell, A. F., Cross, I., Woodhouse, J., & Moore, B. C. (2012). Exploring violin sound quality: Investigating English timbre descriptors and correlating resynthesized acoustical modifications with perceptual properties. The Journal of the Acoustical Society of America, 131(1), 783-794.
Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., ... & Koelsch, S. (2009). Universal recognition of three basic emotions in music. Current Biology, 19(7), 573-576.
Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer's intention and the listener's experience. Psychology of Music, 24(1), 68-91.
Gibbs Jr, R. W. (2005). Embodiment and cognitive science. Cambridge University Press.
Godøy, R. I., & Leman, M. (Eds.). (2010). Musical gestures: Sound, movement, and meaning. Routledge.
Grey, J. M. (1977). Multidimensional perceptual scaling of musical timbres. The Journal of the Acoustical Society of America, 61(5), 1270-1277.
Hailstone, J. C., Omar, R., Henley, S. M., Frost, C., Kenward, M. G., & Warren, J. D. (2009). It's not what you play, it's how you play it: Timbre affects perception of emotion in music. Quarterly Journal of Experimental Psychology, 62(11), 2141-2155.
Hajda, J. M., Kendall, R. A., Carterette, E. C., & Harshberger, M. L. (1997). Methodological issues in timbre research.
Haverkamp, M. (2009). Look at that sound! Visual aspects of auditory perception. In III Congreso Internacional de Sinestesia, Granada.
Holmes, P. A. (2012). An exploration of musical communication through expressive use of timbre: The performer’s perspective. Psychology of Music, 40(3), 301-323.
Huang, C. F., & Akagi, M. (2008). A three-layered model for expressive speech perception. Speech Communication, 50(10), 810-828.
Johnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason. University of Chicago Press.
Johnson, M. L., & Larson, S. (2003). "Something in the way she moves": Metaphors of musical motion. Metaphor and Symbol, 18(2), 63-84.
Juslin, P. N. (2003). Five facets of musical expression: A psychologist's perspective on music performance. Psychology of Music, 31(3), 273-302.
Kruskal, J. B., & Wish, M. (1978). Multidimensional scaling (No. 11). Sage.
Laban, R., & Ullmann, L. (1971). The mastery of movement.
Leman, M. (2007). Embodied music cognition and mediation technology. MIT Press.
Leman, M. (2010). Music, gesture, and the formation of embodied meaning. In Musical gestures (pp. 138-165). Routledge.
McAdams, S. (2013). Musical timbre perception. The psychology of music, 3.
McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., & Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological Research, 58(3), 177-192.
Newlove, J., & Dalby, J. (2019). Laban for all. Routledge.
Palmer, C. (1997). Music performance. Annual Review of Psychology, 48(1), 115-138.
Posner, J., Russell, J. A., & Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 17(3), 715-734.
Repp, B. H. (1990). Patterns of expressive timing in performances of a Beethoven minuet by nineteen famous pianists. The Journal of the Acoustical Society of America, 88(2), 622-641.
Repp, B. H. (1996). Pedal timing and tempo in expressive piano performance: A preliminary investigation. Psychology of Music, 24(2), 199-221.
Repp, B. H. (1998). Variations on a theme by Chopin: Relations between perception and production of timing in music. Journal of Experimental Psychology: Human Perception and Performance, 24(3), 791.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161.
Scherer, K. R. (2005). What are emotions? And how can they be measured? Social Science Information, 44(4), 695-729.
Shove, P., & Repp, B. H. (1995). Musical motion and performance: Theoretical and empirical perspectives. The practice of performance, 55-83.
Sievers, B., Polansky, L., Casey, M., & Wheatley, T. (2013). Music and movement share a dynamic structure that supports universal expressions of emotion. Proceedings of the National Academy of Sciences, 110(1), 70-75.
Sulem, A., Bodner, E., & Amir, N. (2019). Perception-based classification of expressive musical terms: Toward a parameterization of musical expressiveness. Music Perception: An Interdisciplinary Journal, 37(2), 147-164.
Sulem, A., Bodner, E., & Amir, N. (2023). Perception of violin performance expression through expressive musical terms. Musicae Scientiae, 27(2), 442-470.
Widmer, G., & Goebl, W. (2004). Computational models of expressive music performance: The state of the art. Journal of New Music Research, 33(3), 203-216.
Tables
Tables 1 and 2 are available in the Supplementary Files section.
Additional Declarations
No competing interests reported.
Supplementary Files
Tables.docx
Appendix.docx
© Research Square 2026 | ISSN 2693-5015 (online) {"props":{"pageProps":{"initialData":{"identity":"rs-7436859","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":517136657,"identity":"f7ef9a73-5a84-4f66-ba90-d7c68e57c688","order_by":0,"name":"Aviel Sulem","email":"","orcid":"","institution":"The Hebrew University of Jerusalem","correspondingAuthor":false,"prefix":"","firstName":"Aviel","middleName":"","lastName":"Sulem","suffix":""},{"id":517136658,"identity":"611bd813-757b-44ef-bf99-1ca1390e990e","order_by":1,"name":"Ehud Bodner","email":"","orcid":"","institution":"Bar-Ilan University","correspondingAuthor":false,"prefix":"","firstName":"Ehud","middleName":"","lastName":"Bodner","suffix":""},{"id":517136659,"identity":"9068291e-b952-4d47-a316-5b1297787b56","order_by":2,"name":"Noam Amir","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA5ElEQVRIiWNgGAWjYNCCCjT+gQcEtZwxgCqFaUkgpIOxDU0LAz4t5uzNTzf8nPdH3uB4jwHzh5rDDPztBxjx2mLZc8zsZu82A8MNZ44lMBw4dphB4kwCfocZ3Egwu8G7zYBxw43kAwwH2A4zMNwg4BeDG+nfbv6dY2C/4f7DBoYD/w4zyBPWkmN2m7fBIHHDDeYDDAfbDgNFCGk5c6bstswx4+SZZ9ISDpztS+cxPJPYgF/L8fZtN9/UyNn2HT9j+KDim7Wc3PHDhz98wKMFBRwAYh5gNDUQq2EUjIJRMApGAQ4AAHmQWeo3k9iSAAAAAElFTkSuQmCC","orcid":"","institution":"Tel-Aviv University","correspondingAuthor":true,"prefix":"","firstName":"Noam","middleName":"","lastName":"Amir","suffix":""}],"badges":[],"createdAt":"2025-08-22
18:38:08","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7436859/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7436859/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":91937075,"identity":"40dcdd31-5cb8-4bc5-8533-452da372f598","added_by":"auto","created_at":"2025-09-23 02:53:33","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":85771,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePre-selection of timbre descriptors.\u003c/strong\u003e Bars show the selection frequency of 68 candidate adjectives in a pre-experiment with seven professional string players who listened to 25 violin performances (five per EMT category) and freely chose any timbre descriptors that fit each excerpt. Adjectives are ordered by frequency (descending). For the main experiment, we retained terms with frequencies above the median (16), removed near-synonyms (keeping the more frequent term), and excluded adjectives overlapping with the Laban movement set (light, free, strong); rough was retained despite being just below the median to ensure coverage of the timbre space. This procedure yielded 26 timbre descriptors used in the rating task (see Methods and Appendix C).\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/01ed1c6699cd8ffabafa2478.png"},{"id":91937076,"identity":"0272d780-d175-4394-865c-4b4825eea2c4","added_by":"auto","created_at":"2025-09-23 02:53:33","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":142508,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRating interface for timbre and movement descriptors.\u003c/strong\u003e Example screen from the main experiment. For each violin excerpt, participants rated all timbre descriptors (26 terms pre-selected in the pre-experiment; see Fig.
1) and all movement descriptors (eight bipolar terms derived from the Effort factors of Laban Movement Analysis). Ratings were given by selecting the most appropriate value on a four-point Likert scale from 0 (“not appropriate”) to 3 (“strongly appropriate”).\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/4c09e36c6441270d37879472.png"},{"id":91935451,"identity":"bd826f07-24d5-4eae-9cf0-d09e20f0039f","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":200291,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eThree-dimensional perceptual space of timbre and movement descriptors.\u003c/strong\u003e Multidimensional scaling of the 34 descriptors (26 timbre, 8 movement) yielded a three-dimensional solution with excellent fit (stress = 0.03). The three dimensions were interpreted as Smoothness (rough–sharp vs. warm–mellow), Aliveness (dead–cold vs. alive–brilliant), and Pureness (dark vs. clean). Descriptors are shown with their coordinates and grouped into five clusters (symbols and colors; cluster centroids marked by ⊗). Clusters were identified based on spatial distribution and chosen to align with the five EMT categories tested in Sulem et al. (2023). 
Projections onto the three coordinate planes are shown.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/1ac9498fe1c9c9c8d5e5fac5.png"},{"id":91935454,"identity":"906d4194-a061-4098-a71d-5f931d310b7c","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":78868,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePerceptual plane of movement descriptors.\u003c/strong\u003e (Left) Position of the eight movement descriptors within the three-dimensional descriptor space (cf. Fig. 3). All descriptors lie close to an oblique plane M (best-fit equation: 0.21x + 0.45y – z – 0.01 = 0; RMSE = 0.19). (Right) Multidimensional scaling restricted to movement descriptors alone produced a two-dimensional configuration that closely matched the projection of the movement points onto plane M. Together, these results indicate that movement descriptors are organized within a lower-dimensional subspace.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/fea57e681df8727848cc619d.png"},{"id":91935457,"identity":"652a34fd-bae9-430e-99af-ab4340a54321","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":264113,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDescriptor ratings across EMT categories and confusion levels.\u003c/strong\u003e Mean ratings (± s.d.) for all timbre (left) and movement (right) descriptors, shown separately for each intended EMT category (Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso, Neutral). Within each EMT, ratings are plotted for three confusion levels of perceived EMT categorization (least, moderate, most) and for all performances combined. 
Descriptors are ordered by their overall mean ratings across performances. This visualization highlights which descriptors were judged most strongly, moderately, or weakly appropriate for each EMT and how consistency varied with confusion level.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/7267c350fa4260f756e859de.png"},{"id":91935459,"identity":"91347be5-7438-4e5d-993f-8d4fe5a65c84","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":82752,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eDominant descriptors for each EMT category.\u003c/strong\u003e Mean ratings (all performances combined) for timbre (left) and movement (right) descriptors, ordered separately within each EMT category. Descriptor cluster membership from the 3D perceptual space (Fig. 3) is indicated by color on the x-axis. This ordering highlights the “dominant descriptors” most strongly associated with each EMT and illustrates correspondences between EMT categories and the five descriptor clusters.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/7c5c1527fbf873ef7055a08d.png"},{"id":91937077,"identity":"7e03c2be-d00b-4b2b-8da5-629ea09d163c","added_by":"auto","created_at":"2025-09-23 02:53:33","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":115584,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCorrelations between EMT ratings and descriptor ratings.\u003c/strong\u003e Pearson correlations across performances between participants’ ratings of each EMT category (Sulem et al., 2023) and their ratings of individual descriptors in the present study. Descriptors are ordered so that, for each EMT, the strongest positive correlations appear adjacent. 
The vertical axis is divided into three ranges, indicating positive, near-zero, and negative correlations. All reported correlations and anti-correlations were significant (p values between 10⁻²³ and 10⁻²). The pattern shows that descriptors most correlated with a perceived EMT were also those most characteristic of its intended EMT, supporting alignment between intended and perceived expression.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/c8d1c68382d7d28e6c79e49b.png"},{"id":91937078,"identity":"c3d71fd4-05d3-49e1-a82c-68d76807566b","added_by":"auto","created_at":"2025-09-23 02:53:33","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":92967,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEMT categories within descriptor perceptual spaces.\u003c/strong\u003e (Left) Positions of EMT categories (A = Amoroso/Affettuoso, G = Giocoso/Animato, R = Risoluto/Feroce, T = Tristamente/Lagrimoso, N = Neutral) in the three-dimensional descriptor space (from Fig. 3). Coordinates were computed as the weighted average of descriptor positions, with weights given by mean ratings. Indices 1–3 denote confusion levels of perceived EMT categorization (least, moderate, most), and 0 denotes all performances combined. The four expressive EMTs lay close to a common plane P (best-fit equation: 0.25x + 0.59y – z – 0.024 = 0; RMSE = 0.05), which was almost identical to plane M defined by the movement descriptors (see Fig. 4). Neutral performances lay along a separate axis intersecting this expressive plane: performances most clearly perceived as Neutral were positioned furthest from the plane, while less-Neutral performances lay closer to it. (Right) Analysis restricted to movement descriptors alone yielded EMT placements closely matching their positions relative to plane P. 
The orientation of EMTs within this movement plane suggests that the axes of wringing–dabbing and floating–thrusting correspond to affective dimensions of valence and arousal, respectively.\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/d7eea1790b2ec67f9c79df25.png"},{"id":95529654,"identity":"83d8347f-8304-4ce1-b375-8316219bda05","added_by":"auto","created_at":"2025-11-10 10:17:21","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1867892,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/e76b162e-5f0a-44f2-867b-d56205404dcf.pdf"},{"id":91935449,"identity":"83f2f624-7c1b-41ee-9547-846e4f6bc96a","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":18378,"visible":true,"origin":"","legend":"","description":"","filename":"Tables.docx","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/f48167b1d9da3440a94b268a.docx"},{"id":91935456,"identity":"828d3156-8cbb-4fba-9389-1124f9325e9a","added_by":"auto","created_at":"2025-09-23 02:45:33","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":556483,"visible":true,"origin":"","legend":"","description":"","filename":"Appendix.docx","url":"https://assets-eu.researchsquare.com/files/rs-7436859/v1/3d3ca7c18749ad4987d3c190.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Mapping internal representations of timbre and movement in metaphorical descriptions of violin performance","fulltext":[{"header":"Introduction","content":"\u003cp\u003eIn Western classical music, expressive performance emerges from the interplay between performers’ interpretive choices and the expressive musical terms (EMTs) notated by composers. 
Performers shape expression by manipulating timing, dynamics, articulation, timbre, intonation, and vibrato—much like actors interpreting a script—while composers often guide interpretation through EMTs such as \u003cem\u003eamoroso\u003c/em\u003e, \u003cem\u003egiocoso\u003c/em\u003e, or \u003cem\u003erisoluto\u003c/em\u003e. Rather than prescribing technical execution, these terms aim to evoke character and affect. Previous work has mapped EMTs onto Russell’s (1980) circumplex model of affect, linking musical expression to the dimensions of valence and arousal (Sulem et al., 2019).\u003c/p\u003e\n\u003cp\u003eListeners and musicians also describe performances with adjectives that convey timbre (e.g., warm, brilliant, rough) (Holmes, 2012) or fictional movement—motion-like qualities metaphorically attributed to sound (e.g., free, sustained, flexible) (Johnson \u0026amp; Larson, 2003). Such descriptors are cross-modal, linking auditory perception with tactile, visual, and kinesthetic imagery (Haverkamp, 2009). Although these terms differ from EMTs in their typical use, both may draw from overlapping conceptual structures grounded in shared sensory–motor metaphors. Theories of conceptual metaphor (Lakoff \u0026amp; Johnson, 1980) propose that abstract expressive and emotional concepts are grounded in embodied sensory–motor schemas (Johnson, 1987; Gibbs, 2006), raising the possibility that EMTs and cross-modal descriptors share common conceptual foundations.\u003c/p\u003e\n\u003cp\u003ePrevious research has shown that timbre contributes to the communication of affect (Eerola, Ferrer, \u0026amp; Alluri, 2012; Hailstone et al., 2009; McAdams et al., 1995). Still, most studies have examined it using isolated tones or cross-instrument comparisons, rather than stylistically coherent performances on a single instrument. 
Studies of fictional movement have primarily focused on timing and phrasing, especially in keyboard performance (Repp, 1996; Repp, 1998), and in wind instruments such as the clarinet, where timing, timbre, and dynamics variations across phrases mediate expression (Barthet et al., 2010). The violin offers a particularly suitable context for studying these phenomena, as it can produce a wide range of timbres (Fritz et al., 2012) and movement-like qualities (Leman, 2010; Shove \u0026amp; Repp, 1995) using continuous pitch control, varied bowing techniques, and subtle changes in articulation. Yet, to the best of our knowledge, no previous studies have examined how musicians’ and listeners’ metaphorical descriptions of timbre and movement align with EMT-based evaluations in violin performance.\u003c/p\u003e\n\u003cp\u003eHere, we investigate whether EMT categories can be understood through patterns of cross-modal metaphors—descriptors of timbre and fictional movement grounded in sensory–motor imagery. To establish the descriptor set, we first conducted a pre-experiment in which professional string players selected timbre adjectives from a large candidate pool, yielding 26 descriptors most relevant to expressive violin performance (Fig. 1; see Methods). Movement descriptors were drawn from the four Effort factors of Laban Movement Analysis (LMA)—\u003cem\u003eWeight\u003c/em\u003e, \u003cem\u003eSpace\u003c/em\u003e, \u003cem\u003eTime\u003c/em\u003e, \u003cem\u003eFlow\u003c/em\u003e—resulting in eight bipolar terms (\u003cem\u003estrong–light\u003c/em\u003e, \u003cem\u003estraight-flexible\u003c/em\u003e, \u003cem\u003esudden–sustained, free-bound\u003c/em\u003e) (Laban \u0026amp; Ullmann 1971). 
In the main experiment, professional string players listened to short violin excerpts performed according to four EMT categories (\u003cem\u003eAmoroso/Affettuoso\u003c/em\u003e, \u003cem\u003eGiocoso/Animato\u003c/em\u003e, \u003cem\u003eRisoluto/Feroce\u003c/em\u003e, \u003cem\u003eTristamente/Lagrimoso\u003c/em\u003e) and a neutral condition, and rated them using a graphical interface designed for timbre and movement evaluations (Fig. 2). Ratings by the same participants were collected in two separate sessions—one for perceived EMT categories (Sulem et al., 2023), the other for cross-modal metaphors—ensuring that any correspondences would arise from shared perceptual representations of the performance rather than direct comparison during the task.\u003c/p\u003e\n\u003cp\u003eWe analyzed the structure of metaphor ratings using multidimensional scaling (MDS) and cluster analysis, and examined how the resulting perceptual dimensions aligned with EMT categories, affective models such as valence and arousal, and LMA effort factors. LMA offers a framework for describing physical movement qualities (Bartenieff \u0026amp; Lewis, 2013; Newlove \u0026amp; Dalby, 2019); here, we adapted it to characterize perceived dynamic qualities of sound. We hypothesized that EMTs and cross-modal metaphors would occupy a shared perceptual–conceptual space; that movement descriptors would form dimensions aligned with affective axes such as valence and arousal; and that specific combinations of timbre and movement descriptors would reliably distinguish the four EMT categories. 
By linking EMTs to cross-modal metaphors in a single-instrument context, this study seeks to clarify how expressive intentions in violin performance are internally represented, and how abstract musical character is grounded in embodied auditory imagery (Godøy \u0026amp; Leman, 2010; Leman, 2007).\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eTo examine the relationship between EMT categories and the perceptual representations of timbre and movement descriptors in violin performance, we conducted four analyses. First, we characterized the global descriptor structure (Step 1). Second, we assessed how descriptors differentiated EMT categories and whether these patterns varied across performances previously rated as clearly, moderately, or ambiguously conveying their intended EMT (Sulem et al., 2023; Step 2). Third, we tested the correspondence between descriptor ratings and EMT ratings from our previous study (Step 3). Finally, we mapped EMT categories within the perceptual spaces defined by timbre and movement (Step 4).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eStep 1: Perceptual spaces of timbre and movement descriptors\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe first examined the perceptual structure of the 34 descriptors (26 timbre, 8 movement) using multidimensional scaling (MDS). Inspection of the stress function S (goodness of fit) indicated that a three-dimensional solution provided an excellent fit (S = 0.03), whereas two dimensions were insufficient (S = 0.10; Kruskal \u0026amp; Wish, 1978). The resulting space and its coordinate projections are shown in Fig. 3. The three dimensions were interpreted as \u003cem\u003eSmoothness\u003c/em\u003e (rough\u0026ndash;sharp vs. warm\u0026ndash;mellow), \u003cem\u003eAliveness\u003c/em\u003e (dead\u0026ndash;cold vs. alive\u0026ndash;brilliant), and \u003cem\u003ePureness\u003c/em\u003e (dark vs. 
clean).\u003c/p\u003e\n\u003cp\u003eA partition into five clusters, chosen based on the spatial distribution of descriptors and aimed to align with the five EMT categories from our previous study (Sulem et al., 2023), is depicted in Fig. 3 by distinct colors and symbols (cluster centroids marked by\u0026nbsp;\u0026otimes;). The magenta square cluster was characterized by high Smoothness, the green downward triangle by low Smoothness, the red left triangle by high Aliveness, the blue circle by low Pureness, and the black hexagram by low Smoothness and low Aliveness.\u003c/p\u003e\n\u003cp\u003eRepeating the analysis separately for performances previously rated as least, moderately, or most clearly conveying their intended EMT produced very similar layouts, indicating that inter-descriptor correlations were largely insensitive to the degree of match with intended EMTs and that participants rated descriptors consistently.\u003c/p\u003e\n\u003cp\u003eStrong and highly significant correlations (|r| \u0026gt; 0.7, p \u0026lt; 10⁻⁸) emerged between timbre and movement descriptors (Table 1). A first group of timbre descriptors\u0026mdash;sweet, soft, airy, warm, mellow, and singing\u0026mdash;was strongly associated with the movement qualities free and light. A partially overlapping set\u0026mdash;soft, warm, mellow, singing, and calm\u0026mdash;also correlated with flexible and sustained. Conversely, rough and sharp correlated with bound and strong; sharp and rough also with straight; and brash, brilliant, and sharp with sudden. Notably, sharp correlated strongly with both straight and sudden.\u003c/p\u003e\n\u003cp\u003eTo characterize the movement space more explicitly, we observed in Fig. 4 (left) that all eight movement descriptors lay close to an oblique plane M (best fit: 0.21x + 0.45y \u0026ndash; z \u0026ndash; 0.01 = 0, RMSE = 0.19). An MDS restricted to the movement descriptors alone yielded a two-dimensional configuration (Fig. 
4, right) that closely matched the projection of the movement points onto plane M.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eStep 2: Descriptor ratings across EMT categories and matching levels\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe next examined how suitable each descriptor was for distinguishing the intended EMT categories. Figure 5 shows mean ratings (\u0026plusmn; SD) for timbre and movement descriptors, computed separately for performances at three confusion levels (\u003cem\u003eleast\u003c/em\u003e, \u003cem\u003emoderate\u003c/em\u003e, \u003cem\u003emost\u003c/em\u003e) in perceived EMT categories, as well as for \u003cem\u003eall\u003c/em\u003e \u003cem\u003eperformances\u003c/em\u003e combined (descriptors ordered by overall means). Within each EMT, timbre and movement ratings decayed progressively, allowing us to distinguish strongly, moderately, and weakly appropriate descriptors (Table 2).\u003c/p\u003e\n\u003cp\u003eTo facilitate comparison across EMTs, Figure 6 presents an alternative view (all performances combined), with timbre and movement descriptors ordered separately within each EMT. EMTs with the highest mean descriptor ratings (\u0026ldquo;dominant EMTs\u0026rdquo;) generally corresponded to the five descriptor clusters identified in Step 1 (cluster colors shown on the x-axis), linking each cluster to a corresponding dominant EMT. Exceptions included brash, sudden, sustained, straight, ringing, and resonant, which showed comparable ratings across adjacent EMT categories and/or occupied boundary positions in the perceptual space.\u003c/p\u003e\n\u003cp\u003eA closer look reveals shared and distinctive descriptors across EMTs. For example, Amoroso/Affettuoso and Tristamente/Lagrimoso both include calm, singing, mellow, warm, and sweet, yet they diverge in that soft and airy are more typical of Amoroso/Affettuoso, while rich, deep, and dark are more typical of Tristamente/Lagrimoso. 
For movement, both are described as flexible and sustained, but Amoroso/Affettuoso tends more toward free and light.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFinally, standard deviations across descriptors were generally smaller for the least-confused performances compared to moderately or most-confused performances, indicating greater consistency among participants when the intended EMT was clearly conveyed.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eStep 3: Correspondence between descriptor and EMT dimensional structures\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo examine how EMT ratings relate to timbre and movement descriptor ratings, we computed Pearson correlations across performances, relating participants\u0026rsquo; ratings of each EMT category (Sulem et al., 2023) to their ratings of each descriptor in the present study (Fig. 7). Descriptors are ordered so that, for each EMT, the strongest positive correlations appear adjacent. The y-axis is divided into three equal ranges, indicating positive, near-zero, and negative correlations. All reported correlations and anti-correlations were significant (p values between 10⁻\u0026sup2;\u0026sup3; and 10⁻\u0026sup2;).\u003c/p\u003e\n\u003cp\u003eStrong positive associations (r \u0026ge; 0.7) included: Amoroso/Affettuoso with timbre soft, sweet, warm, airy, singing, mellow and movement free, flexible, light; Giocoso/Animato with timbre alive, brilliant; Risoluto/Feroce with timbre brash, rough, sharp and movement strong, sudden; Tristamente/Lagrimoso with timbre soft, warm, mellow, calm and movement sustained; Neutral with timbre dead, monotonous, cold, plain, even. In most cases, descriptors most correlated with a perceived EMT were also those most characteristic of the same intended EMT (cf. Fig. 
6), supporting alignment between the two rating sessions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eStep 4: EMT categories within timbre and movement descriptor perceptual spaces\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFigure 8 (left) displays the EMT categories as points in the 3D descriptor space (coordinates from all performances). Coordinates were obtained as the linear combination of descriptor coordinates (Fig. 3), weighted by their corresponding ratings (Fig. 5). Categories are denoted A, G, R, T, and N, for Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso, and Neutral, respectively, with indices 1\u0026ndash;3 for confusion levels and 0 for \u0026ldquo;all performances\u0026rdquo; (see Methods). The four expressive EMT categories lay close to a common plane P (RMSE = 0.05), described by 0.25x + 0.59y \u0026ndash; z \u0026ndash; 0.024 = 0. Neutral performances lay along a straight line intersecting this plane, given by x = \u0026ndash;0.45 \u0026ndash; 25.20 (z \u0026ndash; 0.10); y = \u0026ndash;0.13 \u0026ndash; 18.27 (z \u0026ndash; 0.10). Notably, plane P almost coincided with plane M from the movement analysis: both pass within a few hundredths of the origin and form an angle of only ~6\u0026deg;.\u003c/p\u003e\n\u003cp\u003eA corresponding analysis restricted solely to movement descriptors (Fig. 8, right) yielded EMT placements closely matching their positions relative to plane P. This reinforces the correspondence between dimensional structures derived from movement descriptors and those derived from affect (valence and arousal). Specifically, one movement dimension can be interpreted as wringing vs. dabbing, reminiscent of valence, while the other corresponds to floating vs. thrusting, reminiscent of arousal. Interestingly, arousal emerged as the primary dimension, ahead of valence. 
This suggests that listeners were more sensitive to performance expression in terms of arousal than in terms of valence. These findings suggest an inherent relationship between mental representations of expressive violin performance in terms of movement, affect, and cross-modal descriptors.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThis study investigated how expressive musical terms (EMTs) in violin performance are perceptually grounded in timbre and fictional movement, expressed through cross-modal metaphors. Multidimensional scaling revealed that the full set of timbre and movement descriptors formed a three-dimensional perceptual space (Smoothness, Aliveness, Pureness), while an analysis restricted to the eight movement descriptors alone, derived from the Effort factors of Laban Movement Analysis, yielded a lower-dimensional plane closely aligned with the plane of the EMT categories. EMT categories mapped consistently within these spaces, arranged primarily along an arousal axis and secondarily along a valence axis, with arousal emerging as the dominant dimension. Together, these findings demonstrate that EMTs are systematically linked to perceptual representations of timbre and movement, supporting the view that expressive musical performance is grounded in embodied auditory–motor associations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eDistinct perceptual structures for timbre and movement descriptors\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe organization of timbre descriptors into three orthogonal dimensions is consistent with prior work showing that timbre perception is inherently multidimensional (Grey, 1977; McAdams, 2013; McAdams et al., 1995). Our dimensions (Smoothness, Aliveness, and Pureness) echo timbre qualities such as roughness, brilliance, and clarity that have been identified in listener-derived adjectives for violin and other instruments (Fritz et al., 2012). 
This supports the idea that listeners rely on multiple, partly independent acoustic cues when characterizing violin timbre.\u003c/p\u003e\n\u003cp\u003eMovement descriptors, by contrast, collapsed onto a lower-dimensional plane. This result dovetails with research on embodied music cognition (Godøy \u0026amp; Leman, 2010), where expressive qualities are often captured by holistic motor imagery rather than independent perceptual axes. The plane’s alignment with Laban’s Effort factors suggests that listeners interpret sound using embodied categories of weight, space, time, and flow. The two principal dimensions within this plane can be interpreted as effort actions such as wringing vs. dabbing and floating vs. thrusting, consistent with findings that listeners spontaneously map sound to motion metaphors (Eitan \u0026amp; Timmers, 2010; Sievers et al., 2013).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eMapping EMT categories within perceptual spaces\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWhen EMT categories were mapped onto the 3D descriptor space, they occupied distinct locations, reminiscent of their location in Russell’s (1980) circumplex model of affect (Sulem et al., 2019), supporting consistency between conveyed and perceived expressions guided by EMTs. Within this perceptual plane, the arrangement of the four expressive EMTs corresponded primarily to an arousal axis (e.g., Giocoso/Animato and Risoluto/Feroce at the high-arousal end; Amoroso/Affettuoso and Tristamente/Lagrimoso at the low-arousal end) and secondarily to a valence axis (e.g., Giocoso/Animato and Amoroso/Affettuoso more positive; Risoluto/Feroce and Tristamente/Lagrimoso more negative). 
This structure aligns with emotion research showing that arousal is often the most salient and discriminable dimension, with valence emerging as a secondary contrast (Posner et al., 2005; Scherer, 2005).\u003c/p\u003e\n\u003cp\u003eIn contrast, Neutral performances were not located on the expressive EMT plane but along a separate axis, suggesting that expressiveness itself differentiates expressive from neutral interpretations (Sulem et al., 2023). Performances most clearly perceived as Neutral were positioned furthest from the expressive plane, whereas those perceived as less Neutral lay closer to it. This axis was not perfectly orthogonal to the expressive plane, indicating that Neutral performances were not entirely affectless but conveyed aspects of low arousal and negative valence.\u003c/p\u003e\n\u003cp\u003eOur confusion-level analyses further show how clarity of conveyed expression modulates EMT–descriptor mappings. Least-confused performances elicited consistent profiles across listeners, whereas most-confused performances produced more heterogeneous ratings. This pattern aligns with studies suggesting that expressive communication is categorical when cues converge, but becomes graded or ambiguous when cues are partially conflicting (Gabrielsson \u0026amp; Juslin, 1996; Palmer, 1997). 
Importantly, even under conditions of ambiguity, EMT categories retained distinct positions in the perceptual plane, pointing to robust perceptual anchors for each expressive intention.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eCross-modal correspondences in expressive communication\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCorrelation analyses between perceived EMT ratings and descriptor ratings showed that the descriptors most strongly associated with a given perceived EMT were generally those most characteristic of the same EMT when intended by the performer. This reinforces the view that EMTs are grounded in robust cross-modal associations between timbral qualities and imagined movement. Such correspondences may function as a shared expressive vocabulary among musicians, facilitating communication in rehearsal, pedagogy, and performance (Juslin, 2003; Palmer, 1997).\u003c/p\u003e\n\u003cp\u003eAt the same time, overlap between categories, particularly in the movement domain, suggests that EMTs often draw on partially shared gestural schemata. This echoes findings that listeners associate similar motion metaphors with multiple emotions (Sievers et al., 2013) and that musical expression is perceived along both categorical and continuous dimensions (Cespedes-Guevara \u0026amp; Eerola, 2018). Thus, EMTs appear to function both as discrete labels and as pointers to underlying continuous perceptual–conceptual dimensions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eMethodological contributions and applications\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe present approach (selecting empirically relevant descriptors, applying MDS to reveal perceptual structures, and mapping EMT categories onto these structures) provides a systematic framework for linking abstract musical terminology to measurable perceptual dimensions. 
While our participant pool was relatively small and highly expert, ratings were consistent across participants, particularly for the least-confused EMT performances. Furthermore, the method is adaptable to other instruments, genres, and listener populations, including non-musicians. Potential applications include: (1) Performance pedagogy: identifying combinations of timbre and movement cues that reliably convey specific expressive intentions (Palmer, 1997); (2) Cross-cultural research: testing whether similar perceptual structures and EMT mappings arise in different musical traditions; (3) Computational modeling: supplying empirically grounded dimensions for machine-learning models of expressive performance (Widmer \u0026amp; Goebl, 2004).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eLimitations and future directions\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe primary limitation of this study is the small, expert-only sample, which constrains generalization. Previous work shows that musical expertise influences both timbre categorization (McAdams et al., 1995) and metaphorical mappings of expression (Eitan \u0026amp; Timmers, 2010). Future research should therefore assess whether similar structures also hold among less-trained listeners and across different instruments. Another promising direction is to investigate the temporal dynamics of descriptors within performances, extending studies of expressive timing and phrasing (Repp, 1990; Palmer, 1997).\u003c/p\u003e\n\u003cp\u003eFinally, focusing on Western classical violin raises the question of generalizability. 
Comparative work across instruments and cultures could help identify which aspects of timbre–movement mappings are universal and which are shaped by specific stylistic conventions (Balkwill \u0026amp; Thompson, 1999; Fritz et al., 2009).\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eBy linking EMTs to both timbre and fictional movement descriptors in a controlled, single-instrument context, this study clarifies how listeners internally represent expressive intentions in violin performance. The findings highlight the complementary contributions of timbre and movement to expressive communication, supporting an embodied view of musical meaning, in which abstract expressive categories are grounded in both auditory–sensory and motor-imagery representations.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cstrong\u003eParticipants\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe experiment was conducted in the Computer Laboratory of the Music Department at Bar-Ilan University with seven professional string players serving as listeners. The group included four violinists, one violinist/violist, one violist, and one cellist, all specializing in classical music and graduates of renowned music academies in Brazil, England, France, Israel, the Netherlands, and Russia. Participants\u0026rsquo; ages ranged from 35 to 72 years (M = 46, SD = 12), with an average of 23 years of professional experience (SD = 16) as orchestra, chamber music, and solo performers. Their varied training and professional backgrounds exposed them to a wide range of interpretative approaches, enhancing the representativeness of the sample. 
All participants were fluent in English, the language in which the rating scales were presented.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMaterials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo examine the relationships between the dimensional structures underlying the perception of expressive musical terms (EMTs) and cross-modal metaphors, we conducted an experiment in which seven professional string players listened to and evaluated short violin excerpts in terms of timbre and fictional movement descriptors. The excerpts were performed by a professional violinist (the first author) according to five EMT categories: Amoroso/Affettuoso [A], Giocoso/Animato [G], Risoluto/Feroce [R], Tristamente/Lagrimoso [T], and Neutral [N].\u003c/p\u003e\n\u003cp\u003eFor consistency, we used recordings from an audio corpus constructed in our previous study (Sulem et al., 2023). In that study, the violinist performed 20 excerpts twice for each of the four expressive EMT categories and once in a neutral manner. The list of excerpts, their main compositional characteristics, and corresponding scores are provided in Appendix A. All recordings had previously been rated by the same participants for the five EMT categories several weeks prior to the current experiment.\u003c/p\u003e\n\u003cp\u003eFrom this corpus, 10 performances were selected for each EMT category based on their perceived degree of match to the intended category, as reported in Experiment 1 of Sulem et al. (2023). The match score ranged from 0 (no match) to 5 (perfect match). For each category, we selected three performances [a, b, c] perceived as least confused, four [a, b, c, d] as moderately confused, and three [a, b, c] as most confused (i.e., involving perceptual ambiguity with other EMT categories). 
The complete list of 50 performances, along with their mean ratings and standard deviations for the intended EMT category, is provided in Appendix B.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eProcedure\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe procedure consisted of evaluating each performance in terms of timbre and fictional movement cross-modal metaphors. The timbre descriptors were selected through a preliminary pre-experiment, whereas the movement descriptors were chosen based on the effort factors of Laban Movement Analysis (LMA).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eSelection of timbre descriptors.\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo identify a set of adjectives related to expressive violin performance from a broad pool of tone-quality descriptors, we first compiled 68 candidate adjectives (Appendix C): 55 from Fritz et al. (2012), six from Huang and Akagi (2008), and seven from Fischer (2004, 2013). Participants listened to 25 performances (five per EMT category, randomly chosen from the audio corpus) and, for each, freely selected any adjectives they felt were appropriate to describe the sound quality (no limit on number). This task lasted about 30 minutes. Figure 1 shows the selection frequency of each adjective in descending order.\u003c/p\u003e\n\u003cp\u003eTo create a manageable set for the subsequent rating phase, we reduced the list while maintaining coverage of the range of violin timbres. Adjectives with selection frequencies below the median value (16.0) were excluded. We then removed near-synonyms, retaining the most frequently selected term (e.g., focused \u0026rarr; clear, lively \u0026rarr; alive, sonorous \u0026rarr; ringing). Finally, we removed light, free, and strong, which were already included as movement descriptors, but retained rough, which\u0026mdash;despite falling just below the median\u0026mdash;had no equivalent among higher-frequency terms. 
This process resulted in 26 timbre adjectives (Figure 2).\u003c/p\u003e\n\u003cp\u003eTo validate the selection, we compared the selection frequencies of the included versus excluded adjectives using a Mann\u0026ndash;Whitney U test. The retained adjectives were chosen significantly more often than the excluded ones (p \u0026lt; 0.0001).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eSelection of movement descriptors.\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMovement descriptors were chosen to capture the fictional movement qualities of the sound, drawing on the four effort factors in LMA. For each factor, we used a pair of opposite adjectives: Strong\u0026ndash;Light (Weight), Straight\u0026ndash;Flexible (Space), Sudden\u0026ndash;Sustained (Time), and Bound\u0026ndash;Free (Flow). Originally developed to describe human motion, these terms are also commonly used by musicians to describe expressive performance.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEvaluation of performances.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe 50 performances listed in Appendix B were presented in different random orders several weeks after the timbre descriptor selection phase. For each performance, participants rated the 26 timbre descriptors and the eight movement descriptors (Figure 2) according to how well they matched the performance, using a 4-point Likert scale: 1 = not appropriate, 2 = slightly appropriate, 3 = moderately appropriate, 4 = strongly appropriate. For each LMA effort factor, only one of the two opposite movement descriptors could be selected, with the option to assign a score of 0 to both if neither was appropriate. The experiment lasted approximately 1.5 hours.\u003c/p\u003e\n\u003cp\u003eThe ethical committee of the Music Department at Bar-Ilan University approved the study. Each participant received ILS 250 for their time. All methods were performed in accordance with relevant guidelines and regulations. 
Written informed consent was obtained from all participants.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe analysis of listeners\u0026rsquo; ratings was conducted in three main steps.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStep 1: Perceptual space of timbre and movement descriptors.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe constructed a geometric representation of the perceptual space (Figure 3) using non-metric multidimensional scaling (MDS) in MATLAB. Each descriptor was represented as a 50-component vector of mean ratings for the 50 performances. The descriptor\u0026ndash;descriptor correlation matrix\u0026nbsp;\u003cimg width=\"19\" height=\"21\" src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABMAAAAVCAMAAACT1yXjAAAAAXNSR0IArs4c6QAAAGZQTFRFAAAAAAAAAAA6AABmADo6ADpmADqQAGa2OgAAOjpmOpDbZgAAZgBmZrb/kDoAkDo6kGYAkGY6kNv/tmYAtmY6tpBmtrb/ttu2ttv/tv//25A627Zm2////7Zm/9uQ/9vb//+2///bTyyWTAAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAAAe0lEQVQoU8WPXRKCMAyEN/yJFKuCIFSB5P6XtNHRMdMDsA952PmS3QD7SYaaqOj+C4gvJzwy4/X5DHAbx09rfUpqB8WsxB8SjF2TeOK/npw1Ouhan0/YLhaWKxEdO7B7E7ZnUNb2xE3PLaYFt/cxRpjD7KonPtlWiz71Aj8EBb4B56fvAAAAAElFTkSuQmCC\" alt=\"image\"\u003e\u0026nbsp;was computed as the Pearson correlation between\u0026nbsp;\u003cimg width=\"80\" height=\"19\" 
src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFAAAAATCAMAAAAwA3pdAAAAAXNSR0IArs4c6QAAAK5QTFRFAAAAAAAAAAA6AABmADo6ADpmADqQAGa2OgAAOgA6OgBmOjqQOmZmOmaQOma2OpCQOpC2OpDbZgAAZgA6ZgBmZjoAZjo6ZmYAZmZmZpC2ZpDbZrbbZrb/kDoAkDo6kDpmkGY6kJBmkJC2kLb/kNv/tmYAtmY6tpA6tpBmtpCQtrb/ttu2ttv/tv/btv//25A625Bm27Zm2/+22////7Zm/9uQ/9u2/9vb//+2///bzV45XgAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAABi0lEQVQ4T+1T0VaCQBCdxRTMIiWtFCgtF60IowXa/f8fa2YWTc9J4cHH5gF2du69e3cGAP7jTB2Qwj2TUi1j4qhJUAeNkD0JfbtsEiz7+5DyMjtJKAeqSfCgbpLecUI1Ec6CWmjmQhBu5QnhK6gw7XzMRVRNnCVIMSIAJrjQgcBaBtWdENfIsAVEUZSer3QwAkQNFd2r6KSQu5j6qhwsUnk1/RpnmEawTuXNs9qgku156Q2ViZGKBYuikGiKWyhd+E4QXPbD7T7Zju38uWkm9gEKdGIbSlSi/aJwycMjNF3jYkqdyb1uavdJqJ5FQWROyCFnfC92uENtl3TK3hQNHr0TQjobJh7psGW2xRDyi0+LMo9o3ovM671bznQQgkkiKIZooodtDaF6yiyVjIR5BLKbGeoK2jLrF3JYBVyvUQxdiW6a01Q/PeHM6MICd4BSfNkbEQVrJn7wxDXWYMPQAhncpBqlrXr7OPy2/+Bt6tPbSvIoTkXS+L8dsqXDH9TR0OP3t7bmWuF00FM/Lj4vly4DTCsAAAAASUVORK5CYII=\" alt=\"image\"\u003e\u0026nbsp;and\u0026nbsp;\u003cimg width=\"80\" height=\"21\" 
src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFAAAAAVCAMAAADmWplAAAAAAXNSR0IArs4c6QAAAKhQTFRFAAAAAAAAAAA6AABmADo6ADpmADqQAGa2OgAAOgA6OgBmOjqQOmZmOmaQOma2OpCQOpC2OpDbZgAAZgA6ZgBmZjoAZjo6ZmYAZmZmZpC2ZpDbZrbbZrb/kDoAkDo6kDpmkGY6kJBmkJC2kLb/kNv/tmYAtmY6tpA6tpBmtpCQttv/tv/btv//25A625Bm27Zm2/+22////7Zm/9uQ/9u2/9vb//+2///bBFRMtAAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAABkUlEQVQ4T+1U0VLCQAxMitAiglBFhbYKygHW1tpe7f3/n5nkCsI4UJjxwQfzAJfLbm5vc1OA//irDih0f1eaicKmhpXfCNlpUd0smhrq7i5EXyZHCbqXNzXcq5tl5zChHKMzZwvNDJFxKw9xkENJaetthmE5dhagcMQASmhR+Ui1BMo7xD4xbIFQHNob5JU/AkINc75X0Yohcykd5Lo3j9XV5OM2oTSEdayun/OUOlnPtTfMTURUKlgUhyJRYqFy4XNJYN0NNvssO7LzF9NMNAAoSIk1lKlM+0bRUobHaL7GxYSdybx2bPe5UT2LgsmSsELJ5F6icIvaLPmUnSkaOnrbiOgimHncRySLLIGwXvq1qJT7eKF5vXf1tPIDMMsQiiEVOmRrAOVTYqksJMhCUO3EsCsky6xfWGHpS71GCXSF7Tjjqb576Ez5wkg7wCn92RsxhWomevCwTzVIBVoQQ0yqUaqes2WcEPtv+yehqud8QisLkVEcieLcD4Jy5EEdjLS252SFDUDz2PhBOO8oeTtfh+YuURpg/pwAAAAASUVORK5CYII=\" alt=\"image\"\u003e. 
A dissimilarity matrix\u0026nbsp;\u003cimg width=\"12\" height=\"19\" src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAwAAAATCAMAAACTKxybAAAAAXNSR0IArs4c6QAAAFFQTFRFAAAAAAAAAABmADo6ADpmADqQAGa2OgAAOgA6OpDbZgAAZjoAZpC2ZrbbZrb/kDoAkNv/tmYAtpA6ttv/tv//25A629u22////9uQ//+2///bM2XAbQAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAAAVklEQVQYV2NgoByIMgIBMy/EICl+NgYpIUZOMEeSiw9ICrKKgzgSHAJAUpRJBMQRBYtBOYJsICGIMogWCXawAWAtUvwQ/SAtYtwswmAJLpCVPGBxcgAAhroDagRgs7EAAAAASUVORK5CYII=\" alt=\"image\"\u003e\u0026nbsp;was then defined as\u0026nbsp;\u003cimg width=\"100\" height=\"21\" src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAGQAAAAVCAMAAACHQrsCAAAAAXNSR0IArs4c6QAAAHhQTFRFAAAAAAAAAAA6AABmADo6ADpmADqQAGa2OgAAOgA6OjpmOpDbZgAAZgBmZjoAZpC2ZrbbZrb/kDoAkDo6kGYAkGY6kNv/tmYAtmY6tpA6tpBmtrb/ttu2ttv/tv//25A627Zm29u22////7Zm/9uQ/9vb//+2///bWG6zrwAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAABEklEQVRIS+1UyxKCMAysT1Drqyr1jQo2//+HJjBKi6HUGb25By5kd5NsQIg//hv44QbSDqK/CHdAwsRbDfsYFRO7BlQkYNfCcwnDm88E1OAkLl3HxMglUrSfaIuWhGbo3lkIM8VHhXxEnim9qQEUbZLgdJC5TdZZecz0kBYKnElDsyWhEaySjqg+fF2U4RPMrPbrV1254TxmDoZfV0skRjJKRSSgaAWwKtKpGmVXkjHpWYWgniaWHG34OsOjC4UevJ+IzdW9k7ivnXFySZ/inKI0kmbQ/tsR+Cn6R4UNKo6TJrmU7GsHHjofU8fLbSmSrCWScFdWzkyPB8zd/18K9+DljMQbKE/iG/DItRzop+6F3ANGHhKAvas3dAAAAABJRU5ErkJggg==\" alt=\"image\"\u003e, where\u0026nbsp;\u003cimg width=\"7\" height=\"19\" src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAcAAAATCAMAAABry+dsAAAAAXNSR0IArs4c6QAAADlQTFRFAAAAAAAAAAA6AABmADpmADqQAGa2OpDbZgAAZrb/kDoAkNv/tmYAtv//25A62////7Zm//+2///bfnLChQAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAAAN0lEQVQYV2NgIBrwMTKygxQLcbHyg2hBDk6wXgEmbjDNBxEW4mIDc+HSzLxgPg8LmAaaApHHDwBdMgFAz1QCAwAAAABJRU5ErkJggg==\" alt=\"image\"\u003e\u0026nbsp;is the identity matrix. 
Applying the mdscale function to\u0026nbsp;\u003cimg width=\"12\" height=\"19\" src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAwAAAATCAMAAACTKxybAAAAAXNSR0IArs4c6QAAAFFQTFRFAAAAAAAAAABmADo6ADpmADqQAGa2OgAAOgA6OpDbZgAAZjoAZpC2ZrbbZrb/kDoAkNv/tmYAtpA6ttv/tv//25A629u22////9uQ//+2///bM2XAbQAAAAF0Uk5TAEDm2GYAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAAZdEVYdFNvZnR3YXJlAE1pY3Jvc29mdCBPZmZpY2V/7TVxAAAAVklEQVQYV2NgoByIMgIBMy/EICl+NgYpIUZOMEeSiw9ICrKKgzgSHAJAUpRJBMQRBYtBOYJsICGIMogWCXawAWAtUvwQ/SAtYtwswmAJLpCVPGBxcgAAhroDagRgs7EAAAAASUVORK5CYII=\" alt=\"image\"\u003e\u0026nbsp;yielded the coordinates in the optimal dimensionality.\u003c/p\u003e\n\u003cp\u003eWe also performed hierarchical cluster analysis, partitioning the 34 descriptors into five clusters. In addition, a separate MDS analysis was conducted on the subset of movement descriptors alone (Figure 4, right panel).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStep 2: Description of EMT categories in terms of descriptors.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFor each intended EMT category, we computed the mean and standard deviation of each descriptor rating, averaged across participants and performances (Table 1, Figure 6). We repeated this analysis separately for performances classified as least, moderately, and most confused (Figure 5). Pearson correlations were also computed between descriptor ratings and the perceived EMT ratings from Experiment 1 in Sulem et al. (2023) (Figure 7).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStep 3: Relation between descriptor and EMT dimensional structures.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eEach EMT category was represented as a linear combination of descriptor coordinates (from Step 1), weighted by the mean descriptor ratings (from Step 2). 
These EMT representations were then projected into the descriptor perceptual space to visualize their spatial relationships (Figure 8).\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA.S., E.B., and N.A. conceived and designed the study. A.S. conducted the experiments and analyzed the data. A.S. wrote the first draft of the manuscript, and all authors contributed to writing, reviewing, and editing. E.B. and N.A. supervised the project.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding Declaration\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eBalkwill, L. L., \u0026amp; Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. \u003cem\u003eMusic perception\u003c/em\u003e, 17(1), 43-64.\u003c/li\u003e\n \u003cli\u003eBartenieff, I., \u0026amp; Lewis, D. (2013). \u003cem\u003eBody movement: Coping with the environment\u003c/em\u003e. Routledge.\u003c/li\u003e\n \u003cli\u003eBarthet, M., Depalle, P., Kronland-Martinet, R., \u0026amp; Ystad, S. (2010). Acoustical correlates of timbre and expressiveness in clarinet performance. \u003cem\u003eMusic Perception\u003c/em\u003e, 28(2), 135-154.\u003c/li\u003e\n \u003cli\u003eCespedes-Guevara, J., \u0026amp; Eerola, T. (2018). Music communicates affects, not basic emotions\u0026ndash;A constructionist account of attribution of emotional meanings to music. 
\u003cem\u003eFrontiers in psychology\u003c/em\u003e, 9, 215.\u003c/li\u003e\n \u003cli\u003eEerola, T., Ferrer, R., \u0026amp; Alluri, V. (2012). Timbre and affect dimensions: Evidence from affect and similarity ratings and acoustic correlates of isolated instrument sounds. \u003cem\u003eMusic Perception: An Interdisciplinary Journal\u003c/em\u003e, 30(1), 49-70.\u003c/li\u003e\n \u003cli\u003eEitan, Z., \u0026amp; Timmers, R. (2010). Beethoven\u0026rsquo;s last piano sonata and those who follow crocodiles: Cross-domain mappings of auditory pitch in a musical context. Cognition, 114(3), 405-422.\u003c/li\u003e\n \u003cli\u003eFischer, S. (2004). Practice: 250 step-by-step practice methods for the violin. Edition Peters.\u003c/li\u003e\n \u003cli\u003eFischer, S. (2013). The violin lesson. Peters.\u003c/li\u003e\n \u003cli\u003eFritz, C., Blackwell, A. F., Cross, I., Woodhouse, J., \u0026amp; Moore, B. C. (2012). Exploring violin sound quality: Investigating English timbre descriptors and correlating resynthesized acoustical modifications with perceptual properties. The Journal of the Acoustical Society of America, 131(1), 783-794.\u003c/li\u003e\n \u003cli\u003eFritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., ... \u0026amp; Koelsch, S. (2009). Universal recognition of three basic emotions in music. \u003cem\u003eCurrent biology\u003c/em\u003e, 19(7), 573-576.\u003c/li\u003e\n \u003cli\u003eGabrielsson, A., \u0026amp; Juslin, P. N. (1996). Emotional expression in music performance: Between the performer\u0026apos;s intention and the listener\u0026apos;s experience. \u003cem\u003ePsychology of music\u003c/em\u003e, 24(1), 68-91.\u003c/li\u003e\n \u003cli\u003eGibbs Jr, R. W. (2005). \u003cem\u003eEmbodiment and cognitive science\u003c/em\u003e. Cambridge University Press.\u003c/li\u003e\n \u003cli\u003eGod\u0026oslash;y, R. I., \u0026amp; Leman, M. (Eds.). (2010). Musical gestures: Sound, movement, and meaning. 
Routledge.\u003c/li\u003e\n \u003cli\u003eGrey, J. M. (1977). Multidimensional perceptual scaling of musical timbres. \u003cem\u003eThe Journal of the Acoustical Society of America\u003c/em\u003e, 61(5), 1270-1277.\u003c/li\u003e\n \u003cli\u003eHailstone, J. C., Omar, R., Henley, S. M., Frost, C., Kenward, M. G., \u0026amp; Warren, J. D. (2009). It\u0026apos;s not what you play, it\u0026apos;s how you play it: Timbre affects perception of emotion in music. \u003cem\u003eQuarterly journal of experimental psychology\u003c/em\u003e, 62(11), 2141-2155.\u003c/li\u003e\n \u003cli\u003eHajda, J. M., Kendall, R. A., Carterette, E. C., \u0026amp; Harshberger, M. L. (1997). Methodological issues in timbre research.\u003c/li\u003e\n \u003cli\u003eHaverkamp, M. (2009). Look at that sound! Visual aspects of auditory perception. In \u003cem\u003eIII Congreso Internacional de Sinestesia\u003c/em\u003e, Granada.\u003c/li\u003e\n \u003cli\u003eHolmes, P. A. (2012). An exploration of musical communication through expressive use of timbre: The performer\u0026rsquo;s perspective. \u003cem\u003ePsychology of Music\u003c/em\u003e, 40(3), 301-323.\u003c/li\u003e\n \u003cli\u003eHuang, C. F., \u0026amp; Akagi, M. (2008). A three-layered model for expressive speech perception. \u003cem\u003eSpeech Communication\u003c/em\u003e, 50(10), 810-828.\u003c/li\u003e\n \u003cli\u003eJohnson, M. (1987). The body in the mind: The bodily basis of meaning, imagination, and reason. University of Chicago Press.\u003c/li\u003e\n \u003cli\u003eJohnson, M. L., \u0026amp; Larson, S. (2003). \u0026quot;Something in the way she moves\u0026quot;: Metaphors of musical motion. \u003cem\u003eMetaphor and symbol\u003c/em\u003e, 18(2), 63-84.\u003c/li\u003e\n \u003cli\u003eJuslin, P. N. (2003). Five facets of musical expression: A psychologist\u0026apos;s perspective on music performance. \u003cem\u003ePsychology of music\u003c/em\u003e, 31(3), 273-302.\u003c/li\u003e\n \u003cli\u003eKruskal, J. 
B., \u0026amp; Wish, M. (1978). \u003cem\u003eMultidimensional scaling\u003c/em\u003e (No. 11). Sage.\u003c/li\u003e\n \u003cli\u003eLaban, R., \u0026amp; Ullmann, L. (1971). \u003cem\u003eThe mastery of movement\u003c/em\u003e.\u003c/li\u003e\n \u003cli\u003eLeman, M. (2007). Embodied music cognition and mediation technology. MIT press.\u003c/li\u003e\n \u003cli\u003eLeman, M. (2010). Music, gesture, and the formation of embodied meaning. \u003cem\u003eIn Musical gestures\u003c/em\u003e (pp. 138-165). Routledge.\u003c/li\u003e\n \u003cli\u003eMcAdams, S. (2013). Musical timbre perception. \u003cem\u003eThe psychology of music\u003c/em\u003e, 3.\u003c/li\u003e\n \u003cli\u003eMcAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., \u0026amp; Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. \u003cem\u003ePsychological research\u003c/em\u003e, 58(3), 177-192.\u003c/li\u003e\n \u003cli\u003eNewlove, J., \u0026amp; Dalby, J. (2019). Laban for all. Routledge.\u003c/li\u003e\n \u003cli\u003ePalmer, C. (1997). Music performance. Annual review of psychology, 48(1), 115-138.\u003c/li\u003e\n \u003cli\u003ePosner, J., Russell, J. A., \u0026amp; Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. \u003cem\u003eDevelopment and psychopathology\u003c/em\u003e, 17(3), 715-734.\u003c/li\u003e\n \u003cli\u003eRepp, B. H. (1990). Patterns of expressive timing in performances of a Beethoven minuet by nineteen famous pianists. \u003cem\u003eThe Journal of the Acoustical Society of America\u003c/em\u003e, 88(2), 622-641.\u003c/li\u003e\n \u003cli\u003eRepp, B. H. (1996). Pedal timing and tempo in expressive piano performance: A preliminary investigation. \u003cem\u003ePsychology of Music\u003c/em\u003e, 24(2), 199-221.\u003c/li\u003e\n \u003cli\u003eRepp, B. H. (1998). 
Variations on a theme by Chopin: Relations between perception and production of timing in music. \u003cem\u003eJournal of Experimental Psychology: Human Perception and Performance\u003c/em\u003e, 24(3), 791.\u003c/li\u003e\n \u003cli\u003eRussell, J. A. (1980). A circumplex model of affect. \u003cem\u003eJournal of personality and social psychology\u003c/em\u003e, 39(6), 1161.\u003c/li\u003e\n \u003cli\u003eScherer, K. R. (2005). What are emotions? And how can they be measured? \u003cem\u003eSocial science information\u003c/em\u003e, 44(4), 695-729.\u003c/li\u003e\n \u003cli\u003eShove, P., \u0026amp; Repp, B. H. (1995). Musical motion and performance: Theoretical and empirical perspectives. \u003cem\u003eThe practice of performance\u003c/em\u003e, 55-83.\u003c/li\u003e\n \u003cli\u003eSievers, B., Polansky, L., Casey, M., \u0026amp; Wheatley, T. (2013). Music and movement share a dynamic structure that supports universal expressions of emotion. \u003cem\u003eProceedings of the National Academy of Sciences\u003c/em\u003e, 110(1), 70-75.\u003c/li\u003e\n \u003cli\u003eSulem, A., Bodner, E., \u0026amp; Amir, N. (2019). Perception-based classification of expressive musical terms: toward a parameterization of musical expressiveness. \u003cem\u003eMusic Perception: An Interdisciplinary Journal\u003c/em\u003e, 37(2), 147-164.\u003c/li\u003e\n \u003cli\u003eSulem, A., Bodner, E., \u0026amp; Amir, N. (2023). Perception of violin performance expression through expressive musical terms. \u003cem\u003eMusicae Scientiae\u003c/em\u003e, 27(2), 442-470.\u003c/li\u003e\n \u003cli\u003eWidmer, G., \u0026amp; Goebl, W. (2004). Computational models of expressive music performance: The state of the art. 
\u003cem\u003eJournal of new music research\u003c/em\u003e, 33(3), 203-216.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTables 1 and 2 are available in the Supplementary Files section\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"perception of expression, violin performance, expressive musical terms, timbre, fictional movement, Laban movement analysis","lastPublishedDoi":"10.21203/rs.3.rs-7436859/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7436859/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"Expressive performance in Western classical music relies on performers’ interpretive choices and on expressive musical terms (EMTs) such as amoroso or risoluto. These terms often draw on metaphorical language, yet it remains unclear how EMTs relate to listeners’ cross-modal descriptions of sound and movement. We investigated this question in violin performance, asking seven professional string players to evaluate short excerpts performed according to four EMT categories (Amoroso/Affettuoso, Giocoso/Animato, Risoluto/Feroce, Tristamente/Lagrimoso) and a neutral condition. Participants rated each excerpt on 26 timbre and eight movement descriptors, the latter derived from Laban Movement Analysis. Multidimensional scaling revealed three perceptual dimensions for timbre (Smoothness, Aliveness, Pureness) and a lower-dimensional structure for movement descriptors aligned with Laban effort factors. Correlation and clustering analyses showed that EMTs consistently mapped onto distinct timbre–movement profiles, with clearer associations in less ambiguous performances. These findings demonstrate that EMTs are perceptually grounded in embodied cross-modal mappings, linking auditory qualities to imagined movement. 
The study provides a systematic framework for understanding expressive intention in performance and offers applications in pedagogy and computational modeling.","manuscriptTitle":"Mapping internal representations of timbre and movement in metaphorical descriptions of violin performance","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-23 02:45:28","doi":"10.21203/rs.3.rs-7436859/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ad95aa15-901b-4aba-88f9-d04444cbc814","owner":[],"postedDate":"September 23rd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":54952360,"name":"Physical sciences/Mathematics and computing"},{"id":54952361,"name":"Biological sciences/Neuroscience"},{"id":54952362,"name":"Biological sciences/Psychology"},{"id":54952363,"name":"Social science/Psychology"}],"tags":[],"updatedAt":"2025-11-10T08:09:18+00:00","versionOfRecord":[],"versionCreatedAt":"2025-09-23 02:45:28","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7436859","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7436859","identity":"rs-7436859","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}