A Hybrid-Metric and Layer-Freezing Framework for Coreference Resolution

Article

Yu Wang, Dong Ding, Shu Xu, Zenghui Ding, Jianqing Gao, Min Yang, and Xianjun Yang

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8322598/v1
This work is licensed under a CC BY 4.0 License.
Status: Under Review. Version 1 (latest preprint version), posted on Research Square.

Abstract

Coreference resolution based on pretrained language models remains constrained in long-span contexts. This paper first gives a formal, model-based explanation of why performance degrades in long-span scenarios: as cross-sentence distance increases, the attention distribution exhibits magnitude contraction and accumulates directional noise, which leads to attention dilution.
Building on this analysis, the paper proposes three improvements within the BERT framework: (1) a hybrid-similarity self-attention mechanism that balances magnitude sensitivity and directional stability between query-key vectors to suppress long-distance attention dilution; (2) a contrastive reformulation of cross-entropy which, in line with the task's binary nature and the modified representations, introduces a positive-negative information separation term to enhance inter-class separability and robustness to hard negatives; and (3) a layer-stable optimization strategy that, motivated by the semantic heterogeneity of attention heads, employs layer freezing and a three-stage pipeline (pretraining, fine-tuning, refining) to preserve lower-layer lexical and syntactic cues while progressively strengthening discourse-level semantics and stabilizing higher-layer representations. Experiments on the Chinese and English portions of OntoNotes 5.0 show consistent gains over strong baselines, with F1 improved by 0.9 and 1.1 points, respectively, providing an interpretable and extensible solution for cross-lingual coreference modeling.

Subject areas: Physical sciences/Engineering; Physical sciences/Mathematics and computing

Additional Declarations: No competing interests reported.

Editorial timeline
Reviewers invited by journal: 07 Jan, 2026
Editor assigned by journal: 07 Jan, 2026
Editor invited by journal: 07 Jan, 2026
Submission checks completed at journal: 04 Jan, 2026
First submitted to journal: 04 Jan, 2026
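The abstract describes the hybrid-similarity attention only in words, so the following is a minimal sketch of the general idea rather than the authors' formulation. It assumes the hybrid score is a convex combination of the standard scaled dot product (magnitude-sensitive) and cosine similarity (direction-only). The function names and the parameters `alpha`, `tau`, and `eps` are hypothetical and do not come from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hybrid_scores(q, keys, alpha=0.5, tau=1.0, eps=1e-12):
    """Score one query against every key.

    ASSUMPTION: score = alpha * scaled_dot + (1 - alpha) * cosine / tau.
    alpha=1 recovers ordinary scaled dot-product scores; alpha=0 uses
    direction only, so a key's norm no longer affects its score.
    """
    d = len(q)
    q_norm = math.sqrt(dot(q, q))
    scores = []
    for k in keys:
        qk = dot(q, k)
        scaled_dot = qk / math.sqrt(d)
        cosine = qk / (q_norm * math.sqrt(dot(k, k)) + eps)
        scores.append(alpha * scaled_dot + (1 - alpha) * cosine / tau)
    return scores


def hybrid_attention(q, keys, values, alpha=0.5, tau=1.0):
    """Return the attended value vector and the attention weights."""
    weights = softmax(hybrid_scores(q, keys, alpha, tau))
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights


if __name__ == "__main__":
    q = [1.0, 0.0]
    # keys[2] points the same way as keys[0] but has a much smaller norm,
    # mimicking a "contracted" long-distance key.
    keys = [[1.0, 0.0], [0.0, 1.0], [0.01, 0.0]]
    values = [[1.0], [2.0], [3.0]]
    for a in (1.0, 0.5, 0.0):
        _, w = hybrid_attention(q, keys, values, alpha=a)
        print(a, [round(x, 3) for x in w])
```

In this toy setting, the small-norm key receives nearly the same weight as the orthogonal key under the pure dot product, while the cosine term gives it the same weight as its full-norm counterpart. That is one way to read the "magnitude sensitivity versus directional stability" trade-off named in the abstract.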
Author affiliations
Yu Wang: Hefei Institutes of Physical Science, Chinese Academy of Sciences
Dong Ding: Hefei Institutes of Physical Science, Chinese Academy of Sciences
Shu Xu: University of Science and Technology of China
Zenghui Ding (corresponding author): Hefei Institutes of Physical Science, Chinese Academy of Sciences
Jianqing Gao: iFLYTEK CO.LTD
Min Yang: The Second Affiliated Hospital of Anhui University of Chinese Medicine
Xianjun Yang: Hefei Institutes of Physical Science, Chinese Academy of Sciences

Submitted to: Scientific Reports
Posted: January 9th, 2026
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)