The NeWMe Corpus: A gold standard corpus for the study of Word Meaning Negotiation

2025 · doi:10.21203/rs.3.rs-5975927/v1

preprint OA: closed


Research Article

Aina Garí Soler, Jenny Myrendal, Chloé Clavel, Staffan Larsson

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-5975927/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 10 Apr, 2026. Read the published version in Language Resources and Evaluation. You are reading the latest preprint version (Version 1).

Abstract

Word Meaning Negotiation (WMN) sequences occur when participants focus on clarifying or negotiating the meaning of a word or phrase, often prompted by questions or challenges. These interactions temporarily shift the conversation to explore nuances of meaning - sometimes resulting in quick clarification when due to insufficient understanding of word meaning, and other times leading to extended debates, such as disagreements on what a word can or should mean.

This paper presents the largest and freely available manually annotated corpus of WMNs to date, encompassing spoken dyadic and multiparty conversations as well as online discussions. Our methodology combines searching for WMNs using regular expressions with a detailed annotation scheme that categorizes WMNs into types triggered by non-understanding (NONs: Non-understanding WMN) or disagreement (DINs: Disagreement WMN), and distinguishes between negotiations of situated and potential meanings. We also annotate incomplete negotiations and related phenomena, and analyze inter-annotator agreement to evaluate the reliability of the annotation schema.

Preliminary investigations of WMNs in the corpus reveal distinct patterns in WMNs across contexts, with NONs prevalent in spoken interactions and DINs dominating online debates. This resource lays a foundation for studying semantic alignment, developing automated WMN detection, and creating adaptive dialogue systems. Our findings highlight the complexity of WMNs and provide practical insights for their identification and analysis.

Keywords: Word meaning negotiation, WMN, semantic alignment, semantic coordination, interactional linguistics, misunderstanding

Additional Declarations

Competing interest reported. This work was supported by the Swedish Research Council (VR) grant 2022-02125 Not Just Semantics: Word Meaning Negotiation in Social Media and Spoken Interaction, and by state funding managed by the Agence Nationale de la Recherche under the France 2030 program, with reference “ANR-23-IACL-0008”. The authors have no financial or proprietary interests in any material discussed in this article.
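The abstract describes a first pass that searches for WMN candidates with regular expressions before manual annotation. The sketch below illustrates that general idea only. The patterns, the `Candidate` type, and the `find_wmn_candidates` function are hypothetical and are not taken from the paper; the authors' actual expressions are not given in this record.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple

# Hypothetical trigger patterns for illustration; NOT the paper's expressions.
CANDIDATE_PATTERNS = [
    re.compile(r"\bwhat do you mean by\b", re.IGNORECASE),
    re.compile(r"\bwhat does \w+ mean\b", re.IGNORECASE),
    re.compile(r"\bthat is not what \w+ means\b", re.IGNORECASE),
]


class Candidate(NamedTuple):
    """A turn flagged for manual inspection (hypothetical structure)."""
    turn_index: int
    speaker: str
    matched_text: str


def find_wmn_candidates(turns: Iterable[Tuple[str, str]]) -> Iterator[Candidate]:
    """Yield turns whose text matches any trigger pattern.

    `turns` is a sequence of (speaker, text) pairs. A match only marks a
    candidate; deciding whether it is a WMN still requires annotation.
    """
    for index, (speaker, text) in enumerate(turns):
        for pattern in CANDIDATE_PATTERNS:
            match = pattern.search(text)
            if match:
                yield Candidate(index, speaker, match.group(0))
                break  # one hit per turn is enough to flag it
```

A search of this kind trades precision for recall: it surfaces turns worth reading, and the annotation scheme (NON versus DIN, situated versus potential meaning) is then applied by hand to the surrounding sequence.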
Editorial timeline (Language Resources and Evaluation)

12 May, 2025: Editorial decision: Revision requested
06 May, 2025: Reviews received at journal
23 Apr, 2025: Version 1 posted
08 Apr, 2025: Reviews received at journal
02 Apr, 2025: Reviews received at journal
13 Mar, 2025: Reviewers agreed at journal
11 Mar, 2025: Reviewers agreed at journal
10 Mar, 2025: Reviewers agreed at journal
10 Mar, 2025: Reviewers invited by journal
10 Mar, 2025: Editor assigned by journal
08 Feb, 2025: Submission checks completed at journal
06 Feb, 2025: First submitted to journal

Author affiliations

Aina Garí Soler (corresponding author): INRIA Paris
Jenny Myrendal: Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg
Chloé Clavel: INRIA Paris
Staffan Larsson: Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg

Version of record

https://doi.org/10.1007/s10579-026-09907-x (Language Resources and Evaluation, published 10 Apr, 2026)


Source provenance

europepmc: last seen: 2026-05-20T01:45:00.602351+00:00