Fine-Grained Sentiment Mining, at Document Level on Big Data, using a state-of-the-art Representation-based Transformer: ModernBERT

Bonaventure Chidube Molokwu, Audrey Rah, Reginald Chukwuka Molokwu

Research Article | Research Square
This is a preprint; it has not been peer reviewed by a journal.
DOI: https://doi.org/10.21203/rs.3.rs-7595618/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

As our active and passive digital footprints continuously aggregate into Big Data (bData), the field of Artificial Intelligence (AI) has continually provided methodologies and tools for exploiting this data. Given the prevalence of Big Data, Sentiment Mining (SM) and Opinion Mining (OM), or Aspect-based Sentiment Analysis (ABSA), have become increasingly interesting and relevant topics within the Social Network Analysis (SNA) subfield of Artificial Intelligence. “Fine-grained” Sentiment Mining or Opinion Mining focuses on determining deeper intensities of users’ emotions or viewpoints with respect to a given topic. Our work aims to examine and exploit Big Data collections, with the goal of extracting fine-grained sentiment on a given topic, using state-of-the-art representation-based transformer architectures. Several existing studies have exploited Structured Data for Sentiment Mining using Recurrent Neural Network (RNN) and Recursive Neural Network (RvNN) architectures. However, mainly due to computational constraints, only a few existing studies have exploited Big Data for Sentiment Mining using transformer-based architectures. To this end, our research contributes to the latter body of literature and helps fill this gap by employing a dedicated, high-end, enterprise-grade data center Graphics Processing Unit (GPU) to overcome the common computational constraints associated with harnessing Big Data and post-training transformer architectures. Our proposed framework leverages the fundamental architecture of encoder-only transformers, irrespective of noisy data, by building on a pre-trained Modern Bidirectional Encoder Representations from Transformers (ModernBERT) architecture. The results of our experiments have been very promising with respect to the objective functions employed in our research.

Keywords: Transformers, Encoder-only Transformer, ModernBERT, Big Data, Sentiment Analysis, Sentiment Mining, Opinion Mining, Natural Language Processing

Additional Declarations: No competing interests reported.
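
As a rough illustration of what document-level, fine-grained sentiment classification on top of encoder representations can look like, here is a minimal sketch in plain Python. It is not the authors' implementation: the five-point label set, the mean-pooling step, the linear classification head, and all names (`LABELS`, `mean_pool`, `classify`) are assumptions made for illustration, and the toy vectors stand in for the per-token outputs that an encoder such as ModernBERT would produce.

```python
import math

# Hypothetical five-point label scheme; the abstract does not state the label set.
LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]


def mean_pool(token_vectors):
    """Average per-token vectors into a single document-level vector."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(token_vectors, weights, bias):
    """Pool token vectors, apply a linear head, and return (label, probabilities)."""
    doc_vector = mean_pool(token_vectors)
    logits = [
        sum(w * x for w, x in zip(row, doc_vector)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy stand-ins for encoder outputs (2 tokens, 2 dimensions) and a 5-class head.
toy_tokens = [[2.0, 0.0], [0.0, 0.0]]
toy_weights = [[-2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
toy_bias = [0.0, 0.0, 0.0, 0.0, 0.0]

label, probs = classify(toy_tokens, toy_weights, toy_bias)
print(label, [round(p, 3) for p in probs])
```

In practice the token vectors would come from the pre-trained encoder and the head weights would be learned during post-training; the point of the sketch is only that "fine-grained" means predicting one of several intensity levels per document rather than a binary positive/negative label.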
Author affiliations: Bonaventure Chidube Molokwu (California State University, corresponding author); Audrey Rah (University of Houston); Reginald Chukwuka Molokwu (Vyrux Group Inc)
Posted: September 19th, 2025