Improving Social Media Sentiment Analysis with Swarm Intelligence Feature Selection and Deep Learning Techniques

doi:10.21203/rs.3.rs-5320308/v1

2024 · doi:10.21203/rs.3.rs-5320308/v1

Research Article

Parminder Singh, Saurabh Dhyani

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5320308/v1. This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).

Abstract

In the rapidly evolving digital age, sentiment analysis is crucial for understanding consumer behavior on social media platforms. Advanced sentiment analysis techniques that integrate a swarm-based feature selection strategy with deep learning approaches enhance emotion classification accuracy and contribute to Sustainable Development Goal (SDG) 9: Industry, Innovation and Infrastructure.
To evaluate social media posts and movie reviews, the proposed ensemble model combines an advanced feature selection strategy with a deep neural network architecture, using swarm-based feature selection and a Long Short-Term Memory (LSTM) network. Using Particle Swarm Optimization (PSO) for feature selection substantially increases the accuracy of emotion prediction. Evaluations of the hybrid model show improvements over traditional methods, with a reported accuracy of 93.5%. This suggests robustness to data challenges such as sarcasm and ambiguity. The implementation advances sentiment analysis and supports economic and industrial growth, making it a useful tool for data-driven decision-making.

Keywords: Natural Language Processing, Sentiment Analysis, Swarm Intelligence, Deep Learning, LSTM

1. Introduction

The emergence of Web 2.0 has profoundly changed how people communicate and share information online, resulting in exponential growth of user-generated content on social platforms such as WhatsApp, Twitter, and Facebook, which build on Web 2.0 concepts and technical principles that facilitate the creation and editing of user-generated content. [1], [2], [3] Building resilient infrastructure, promoting inclusive and sustainable industrialization, and stimulating innovation [4] are all important aspects of achieving Sustainable Development Goal (SDG) 9 (UN, 2015). [5] In the context of sentiment analysis, achieving SDG 9 requires improving analytics and data processing technologies to support industrial and economic growth.
[6] Blogs, forum posts, and other online communities [7] allow users to share opinions about anything. [8] Sentiment analysis mines Internet-based information for attitudes, opinions, and feelings by using computational algorithms to determine the emotional tone of text. [9] The large-scale use of social media has put abundant user-generated content into circulation, making it very useful for gauging public opinion. [10], [11] For researchers, businesses, and policymakers, understanding trends in public opinion can help adapt strategies and inform decisions. [12] More sophisticated sentiment analysis can be a powerful tool for understanding consumer demands and market trends, supporting better decisions and creative solutions in public policy, marketing, and finance. [13]

Although significant progress has been made, a gap still exists in the effective integration of traditional feature selection strategies with deep learning techniques for sentiment analysis. Traditional methods have difficulty with subtleties of human language such as sarcasm, ambiguity, and context dependencies, leading to suboptimal sentiment predictions. [14], [15] Deep learning, although effective, requires substantial resources and does not necessarily capture higher-level features optimally. Addressing these challenges requires a hybrid approach that draws on the strengths of both methodologies.

This paper aims to bridge this research gap by proposing a hybrid approach that combines advanced feature selection with a deep neural network architecture. The contributions of this study are:
(1) We propose an ensemble model that combines a swarm-based feature selection strategy with Long Short-Term Memory (LSTM) networks, trained on thematically different datasets including social media posts and movie reviews.
(2) We incorporate Particle Swarm Optimization (PSO) for feature selection to enhance sentiment prediction accuracy.
(3) We conduct rigorous evaluations to validate the model's performance, demonstrating significant improvements over traditional methods.

The manuscript is organized as follows: Section 2 discusses related work and the theoretical background; Section 3 details the proposed hybrid model and its implementation; and Section 4 presents the experimental results and analysis, highlighting the model's effectiveness and potential applications.

2. Literature Review

Data sources for sentiment analysis [7] are primarily drawn from online social media platforms, where users continuously generate an exponentially growing volume of information. Consequently, these data sources must be evaluated within a big data framework, addressing challenges related to data quality, storage, accessibility, resource availability, and monitoring to ensure that results are reliable. [16] Automated sentiment analysis is a growing research area and an essential task with many applications, even though it is complex and faces numerous natural language processing (NLP) challenges. [7], [17], [18], [19] Social media platforms, as key sources of sentiment analysis data, are expanding continuously, producing increasingly complex and interconnected content.
In this regard, Neveen Ghali recommended a shift away from focusing solely on data structure and correlations, advocating a lifelong understanding of data presentation, analysis, inference, visualization, search and navigation, and decision-making in complex networks. [20] Numerous studies have developed robust models to manage the growing volume of big data and have extended sentiment analysis to a broad range of applications, including financial forecasting [21], [22], market strategies, medical research [23], [24], and various other industries, thereby providing practical evidence of their performance. [25], [26]

Evaluations of Convolutional Neural Networks (CNN) [27] and Recurrent Neural Networks (RNN) [28] on specific datasets in specific domains report relatively high overall accuracy, and have consistently shown that CNN and RNN models can overcome minor textual deficiencies. Qian et al. [29] showed that Long Short-Term Memory (LSTM) is more efficient when weather, mood, and emotion tweets are processed in different contexts. Li et al. [24] investigated the effect of data quality on sentiment classification performance. Three factors (feedback, readability, and objectivity) were considered to assess the quality of online reviews. Two factors were highlighted as affecting the accuracy of sentiment analysis: readability and length. Higher readability and smaller text datasets result in higher sentiment classification accuracy. However, the reliability of the proposed method is questionable when the size or scope of the data varies.
Most comparative studies ignore processing time and concentrate on reliability indicators such as overall accuracy or F-score. Moreover, models are frequently evaluated on small data samples. This paper addresses that gap by comparing sentiment analysis techniques from experimental studies and the literature, assessing the effectiveness of deep learning models and related techniques on datasets spanning a range of topics. The research question is whether the best approaches can be identified for datasets of various sizes and types. By assessing results on three factors (overall accuracy, F-score, and processing time), the study builds on earlier work to improve sentiment analysis (SA) performance. This comparison aims to give an objective assessment of the approaches that can guide research toward the best possible outcome.

Sentiment analysis based on deep learning techniques, including convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and deep neural networks (DNN), has been the subject of numerous studies in recent years. Our approach uses LSTM to preserve context when handling sarcasm (e.g., "Oh fantastic! Another traffic jam!"), ambiguity (e.g., "I like spicy food about"), and context-dependent statements (e.g., "The movie is too long"). It also accounts for slang, metaphors, emoticons, domain-specific attitudes, and negation (e.g., "I hate that"). To perform feature selection efficiently and handle social media data and nuanced expressions, the proposed hybrid model combines deep learning with swarm intelligence, using swarm-based feature selection techniques such as particle swarm optimization (PSO), which results in a comprehensive sentiment analysis solution.
Fig. 1 shows that deep learning uses a multi-layered approach to extract and learn features, improving accuracy and performance over traditional machine learning, where features are manually defined [30]. Deep learning learns feature representations automatically, unlike traditional methods such as SVM, Bayesian networks, or decision trees. It effectively tackles challenges in image and speech recognition and in NLP, with LSTM networks addressing complex issues such as sarcasm and domain-specific sentiment. Integrating deep learning with swarm-based feature selection methods creates robust sentiment analysis models.

Fig. 2 shows deep neural networks (DNNs), which consist of multiple layers, including more than two hidden layers, and use complex algorithms to process data from an input layer to an output layer [31]. Convolutional Neural Networks (CNNs), a deep learning framework used in computer vision and NLP, include convolutional, pooling, and fully connected layers. These layers apply filters to extract features and reduce complexity, enhancing robustness [31], [32]. Recurrent Neural Networks (RNNs), designed for sequential data, form feedback loops that allow them to retain information from previous computations [32], [33]. Long Short-Term Memory networks (LSTMs), a type of RNN, efficiently capture long-term dependencies in sequences, making them suitable for tasks such as time series analysis and NLP [34].

Sentiment analysis evaluates information to determine whether the sentiment is positive, negative, or neutral at the feature, sentence, and document levels [35], [36], [37]. Social media platforms such as Facebook and Twitter have amplified user-generated content, making sentiment analysis crucial for understanding public opinion, despite challenges such as sarcasm, irony, ambiguity, site-specific sensitivity, multilingual nuances, data imbalance, and emoji interpretation [38].
Lexicon-based techniques, such as SentiWordNet, use predefined dictionaries to categorize emotions but often lack context sensitivity [39]. Corpus-based methods, such as k-nearest neighbors (k-NN) and hidden Markov models (HMM), use statistical analysis to estimate sentiment and capture subtle emotional content better [39], [40]. Machine learning techniques for sentiment analysis include traditional models such as Naive Bayes, Maximum Entropy, and Support Vector Machines (SVMs), as well as deep learning models such as CNN, DNN, and RNN [40].

3. Proposed Hybrid Model

Figure 3 shows both the lexicon-based and the machine-learning-based approaches to the sentiment analysis process. The proposed model follows the path marked by the yellow arrows: text data enter as input and are processed through tokenization and cleaning. The processed text can then be analyzed by either lexicon-based or deep learning methods. Concretely, we use deep learning within the machine-learning-based approach, specifically Long Short-Term Memory (LSTM) networks. PSO performs feature selection, which helps the sentiment model handle complex tasks. Finally, after PSO-based feature selection, the LSTM network classifies sentiment as positive, negative, or neutral.

3.1 Methodological Analysis

Data Collection

For the proposed hybrid method, we start by acquiring heterogeneous user data from YouTube and other platforms through the YouTube Data API, enabled via the Google Cloud console (Fig. 4), which provides the input content needed for the subsequent analysis stages. Once the data are collected, we first preprocess them carefully.
This includes tokenization, which segments text into logical units, followed by cleaning steps to remove noise such as URLs, emoticons, and special characters. Furthermore, removing stopwords increases the focus on terms that carry information for sentiment classification. Alternatively, we use lexicon-based methods to extract initial sentiment features using resources such as SentiWordNet. This step provides an initial view of the sentiment in the text. Then, using traditional machine learning algorithms such as Naive Bayes or SVM, we extract the most relevant features from the preprocessed data. These features include lexical features, sentiment-bearing words, part-of-speech features, and other language characteristics that increase the accuracy of sentiment analysis, as shown in Figure 5.

Data Preprocessing

Careful text cleaning is carried out throughout the preparation phase to improve the quality of the data. Regular expressions help standardize the input by making it easier to remove URLs, emoticons, and special characters. Tokenization, performed with natural language processing tools, breaks the text into discrete tokens and lays the foundation for further analysis. Sentiment analysis also becomes more focused when stopwords, or non-informative words such as "the" and "and", are removed. Table 1, which lists influential terms with their frequencies, helps identify the terms that matter most in expressing sentiment in film reviews. By quantifying word occurrences, the table contributes to a more nuanced understanding of user sentiment in movie discussions and helps train a robust sentiment analysis model on key terms.
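The cleaning steps described above (URL removal, stripping emoticons and special characters, tokenization, stopword removal, and word counting) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the tiny `STOPWORDS` set, the regular expressions, and the two example comments are all hypothetical and chosen only to keep the example self-contained.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "and", "a", "an", "is", "of", "to", "in", "it", "this", "was"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")  # anything that is not a lowercase letter or whitespace


def preprocess(comment):
    """Lowercase, strip URLs and non-letter symbols, tokenize, drop stopwords."""
    text = comment.lower()
    text = URL_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)  # removes digits, punctuation, and emoji
    tokens = text.split()               # whitespace tokenization
    return [tok for tok in tokens if tok not in STOPWORDS]


def word_frequencies(comments):
    """Count token occurrences across all comments (the idea behind a word-frequency table)."""
    counts = Counter()
    for comment in comments:
        counts.update(preprocess(comment))
    return counts


# Invented example comments, for illustration only.
comments = [
    "The movie was EXCELLENT!!! 😍 https://example.com/clip",
    "Boring movie, and the plot is bad :(",
]
freqs = word_frequencies(comments)
print(freqs.most_common(3))
```

Whitespace tokenization is the simplest possible choice here; a dedicated NLP tokenizer would handle contractions and multi-word expressions better.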
Table 1: Word Frequency

| Word | Frequency |
| --- | --- |
| movie | 691 |
| indian | 215 |
| hindustani | 222 |
| excellent | 125 |
| super | 110 |
| love | 91 |
| best | 83 |
| bad | 16 |
| boring | 9 |

Feature Extraction

To refine feature selection and improve model performance, we incorporate swarm-based feature selection techniques, specifically Particle Swarm Optimization (PSO) (Fig. 6). PSO helps select the more informative features that contribute most to sentiment classification. PSO maintains a swarm of particles, where each particle represents a candidate solution. Initially, the positions and velocities of the particles are set randomly within defined limits. For feature selection in text data, each dimension of a particle's position vector corresponds to a feature in the dataset.

In the PSO process for feature selection, each particle's fitness is evaluated by training a classifier, such as an LSTM for sequential data, and measuring the classifier's accuracy on a validation set. This fitness evaluation steers the search toward features that improve classifier performance. If a particle's current fitness is better than its previous best, its personal best position (pBest) is updated so that promising feature subsets are not forgotten. Similarly, the global best position (gBest) is updated whenever a particle's current fitness exceeds that of the previous gBest, meaning it has found the best feature set seen so far across all particles. The particles then update their velocities to cover the search space, balancing pBest (personal experience) against gBest (global knowledge). The velocity update formula is:

$$v_i(t+1) = \omega\, v_i(t) + c_1 r_1 \left(pBest_i - x_i(t)\right) + c_2 r_2 \left(gBest - x_i(t)\right) \tag{1}$$

where $v_i(t)$ is the velocity of particle $i$ at iteration $t$, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the cognitive and social coefficients, $r_1$ and $r_2$ are random numbers between 0 and 1, $pBest_i$ is the personal best position of particle $i$, $gBest$ is the global best position among all particles, and $x_i(t)$ is the position of particle $i$ at iteration $t$. Particles update their positions using their updated velocities to explore new potential feature subsets:

$$x_i(t+1) = x_i(t) + v_i(t+1) \tag{2}$$

Figure 7 shows how PSO refines feature selection by tracking feature positions across iterations. The plot reflects complex, varying relationships between feature subsets and classification accuracy. Observing this search helps in understanding feature importance and algorithm performance while refining feature subsets for better classification results.

Long Short-Term Memory (LSTM) networks, which are recurrent neural networks (RNNs) known to capture sequential dependencies and context, are then used to process the selected features of the textual data. After LSTM processing, a fully connected layer produces the final sentiment classification based on the learned features. This layer consolidates the selected information and outputs sentiment predictions categorized as positive, negative, or neutral. Model performance is measured with metrics such as accuracy, F1 score, precision, and recall on independent test datasets to ensure robustness and reliability (Equation 3).

$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$

Model Evaluation

Independent test data are used to evaluate the accuracy of the model.
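As an illustration of the PSO update rules in Equations (1) and (2), the sketch below runs a small swarm over feature subsets. It is a simplified example under stated assumptions, not the authors' implementation: the names `pso_feature_selection`, `toy_fitness`, and `INFORMATIVE` are hypothetical; positions are clipped to [0, 1] and a feature counts as selected when its coordinate exceeds 0.5 (one common convention, assumed here); the parameter values are arbitrary; and the toy fitness function stands in for training an LSTM and measuring validation accuracy.

```python
import random


def pso_feature_selection(fitness, n_features, n_particles=10, n_iters=40,
                          w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over boolean feature masks using Eq. (1) and (2)."""
    rng = random.Random(seed)

    def to_mask(x):
        # Assumed convention: a feature is selected when its coordinate is above 0.5.
        return [xi > 0.5 for xi in x]

    # Random initial positions and velocities within fixed limits.
    pos = [[rng.random() for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(to_mask(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): inertia + cognitive pull (pBest) + social pull (gBest).
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (2), followed by clipping to [0, 1] (an added assumption).
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(to_mask(pos[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return to_mask(gbest), gbest_fit, history


# Hypothetical stand-in for "train a classifier, return validation accuracy":
# rewards selecting the informative features and penalizes noise features.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for d in INFORMATIVE if mask[d])
    noise = sum(1 for d, on in enumerate(mask) if on and d not in INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.1 * noise


mask, score, history = pso_feature_selection(toy_fitness, n_features=8)
print(mask, score)
```

In a real setting the fitness call dominates the cost, since each evaluation trains a classifier; caching the fitness of previously seen masks is a common way to reduce that cost.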
In addition to accuracy, metrics such as precision, recall, and F1 score are used to give a complete picture of model performance, which is especially relevant in sentiment analysis. This detailed analysis helps ensure that the sentiment analysis model accurately represents the nuances of user comments on YouTube.

The Comment Clustering graph (Figure 8), created using K-means clustering and displayed using PCA, provides information about the natural patterns in YouTube comments. Each cluster has a different color, and the red line marks one specific cluster. This visualization can help reveal underlying themes or patterns in the data. The clusters provide a nuanced view of user opinions in the movie reviews, alongside sentiment distribution and model accuracy. Although not analyzed in this article, the cluster marked by the red line has characteristics or patterns that could be examined further to give a more detailed description of emotional dynamics and user engagement.

4. Result Analysis

The analysis of the results reveals the views expressed in the YouTube comments using several methodological approaches. Pie charts (Fig. 9) and word clouds (Fig. 10) are among the visualizations used to represent emotion classification. A word cloud, a visual representation of text data, displays words at sizes that reflect their frequency. This visualization provides qualitative insight into the dominant emotions by highlighting common words such as "excellent", "super", "love", "Hindustani", "awesome", "blockbuster", and other positive statements, as well as specific references to artists such as "Kamal Hasan". Fig. 9 showcases the word cloud for YouTube comments, illustrating these frequent terms. In addition, the pie chart (Figure 9) provides a numerical view by breaking the sentiments down into percentages: 31.3% neutral, 15.9% negative, and 52.8% positive. Together, these graphics offer both qualitative and quantitative insight into the sentiments expressed in YouTube comments.

The performance of the proposed methodology is presented in a comparison table, which reports an accuracy of 93.5%. This result highlights the superior performance of our model over conventional techniques such as Naive Bayes, SVM, and Random Forest in reliably classifying sentiment. The comparison, shown in Table II, supports the greater effectiveness and dependability of the proposed sentiment analysis approach.

Table II: Comparison of Sentiment Analysis Models

| Model | Accuracy | F1 Score |
| --- | --- | --- |
| Naive Bayes | 87.3% | 0.83 |
| SVM | 91.3% | 0.82 |
| CNN | 86.7% | 0.86 |
| Proposed Model | 93.5% | 0.94 |

Conclusion

Our research applies sentiment analysis to YouTube movie reviews, drawing on user-generated comments to uncover diverse perspectives and emotions. Using advanced natural language processing techniques, the proposed machine learning model achieves 93.5% accuracy in classifying sentiment. Through a stepwise methodology covering data collection, preprocessing, model selection and training, evaluation, and results analysis, we address the challenges of extracting meaningful insights from YouTube comments. This research contributes to the evolving field of sentiment analysis, offering a reliable tool for content creators, movie studios, and researchers to make sense of the range of opinions in online discussions.
As we examine the nuances of sentiment expressed in YouTube movie reviews, our study not only improves understanding but also holds promise for informing marketing strategies and decision-making processes in the digital age.

Declarations

Funding: No funding.

Author Contribution

Parminder Singh (first author) was responsible for creating the entire manuscript, including the research, model development, and writing. Saurabh Dhyani (co-author) provided guidance and valuable insights throughout the research process.

References

[1] Dang NC, Moreno-García MN, De la Prieta F (2020) Sentiment Analysis Based on Deep Learning: A Comparative Study. Electronics (Basel) 9(3):483. doi:10.3390/electronics9030483
[2] Araque O, Corcuera-Platas I, Sánchez-Rada JF, Iglesias CA (2017) Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst Appl 77:236–246. doi:10.1016/j.eswa.2017.02.002
[3] Denoncourt J (2020) Companies and UN 2030 Sustainable Development Goal 9 Industry, Innovation and Infrastructure. J Corp Law Stud 20(1):199–235. doi:10.1080/14735970.2019.1652027
[4] Hu H (2024) Digitalization and Dependence: Evaluating the Impact of the Belt and Road Initiative on Achieving Sustainable Development Goals 8 and 9 and Shaping Digital Autonomy. Journal of Economic Integration. doi:10.11130/jei.2024024
[5] Denoncourt J (2020) Companies and UN 2030 Sustainable Development Goal 9 Industry, Innovation and Infrastructure. J Corp Law Stud 20(1):199–235. doi:10.1080/14735970.2019.1652027
[6] Hajikhani A, Suominen A (2022) Mapping the sustainable development goals (SDGs) in science, technology and innovation: application of machine learning in SDG-oriented artefact detection. Scientometrics 127(11):6661–6693. doi:10.1007/s11192-022-04358-x
[7] Dang NC, Moreno-García MN, De la Prieta F (2020) Sentiment Analysis Based on Deep Learning: A Comparative Study. Electronics (Basel) 9(3):483. doi:10.3390/electronics9030483
[8] Cha M, Pérez JAN, Haddadi H (2012) The spread of media content through blogs. Soc Netw Anal Min 2(3):249–264. doi:10.1007/s13278-011-0040-x
[9] Birjali M, Kasri M, Beni-Hssane A (2021) A comprehensive survey on sentiment analysis: Approaches, challenges and trends. Knowl Based Syst 226:107134. doi:10.1016/j.knosys.2021.107134
[10] Saura JR, Reyes-Menendez A, Thomas SB (2020) Gaining a deeper understanding of nutrition using social networks and user-generated content. Internet Interv 20:100312. doi:10.1016/j.invent.2020.100312
[11] Narangajavana Kaosiri Y, Callarisa Fiol LJ, Moliner Tena MÁ, Rodríguez Artola RM, Sánchez García J (2019) User-Generated Content Sources in Social Media: A New Approach to Explore Tourist Satisfaction. J Travel Res 58(2):253–265. doi:10.1177/0047287517746014
[12] Rodrigues AP et al (2022) Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning and Deep Learning Techniques. Comput Intell Neurosci 2022:1–14. doi:10.1155/2022/5211949
[13] Gunter B, Koteyko N, Atanasova D (2014) Sentiment Analysis: A Market-Relevant and Reliable Measure of Public Feeling? International Journal of Market Research 56(2):231–247. doi:10.2501/IJMR-2014-014
[14] Al-Qablan TA, Mohd Noor MH, Al-Betar MA, Khader AT (2023) A survey on sentiment analysis and its applications. Neural Comput Appl 35(29):21567–21601. doi:10.1007/s00521-023-08941-y
[15] Sharma NA, Ali ABMS, Kabir MA (2024) A review of sentiment analysis: tasks, applications, and deep learning techniques. Int J Data Sci Anal. doi:10.1007/s41060-024-00594-x
[16] Lücking A, Driller C, Stoeckel M, Abrami G, Pachzelt A, Mehler A (2022) Multiple annotation for biodiversity: developing an annotation framework among biology, linguistics and text technology. Lang Resour Eval 56(3):807–855. doi:10.1007/s10579-021-09553-5
[17] Cambria E, Das D, Bandyopadhyay S, Feraco A (2017) Affective Computing and Sentiment Analysis. pp 1–10. doi:10.1007/978-3-319-55394-8_1
[18] Wang Y et al (2022) A systematic review on affective computing: emotion models, databases, and recent advances. Information Fusion 83–84:19–52. doi:10.1016/j.inffus.2022.03.009
[19] Hussein DME-DM (2018) A survey on sentiment analysis challenges. Journal of King Saud University - Engineering Sciences 30(4):330–338. doi:10.1016/j.jksues.2016.04.002
[20] Ghali N, Panda M, Hassanien AE, Abraham A, Snasel V (2012) Social Networks Analysis: Tools, Measures and Visualization. In: Computational Social Networks. Springer London, London, pp 3–23. doi:10.1007/978-1-4471-4054-2_1
[21] Jangid H, Singhal S, Shah RR, Zimmermann R (2018) Aspect-Based Financial Sentiment Analysis using Deep Learning. In: Companion of The Web Conference 2018 (WWW ’18). ACM Press, New York, New York, USA, pp 1961–1966. doi:10.1145/3184558.3191827
[22] Sohangir S, Wang D, Pomeranets A, Khoshgoftaar TM (2018) Big Data: Deep Learning for financial sentiment analysis. J Big Data 5(1). doi:10.1186/s40537-017-0111-6
[23] Alharbi ASM, de Doncker E (2019) Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information. Cogn Syst Res 54:50–61. doi:10.1016/j.cogsys.2018.10.001
[24] Li L, Goh T-T, Jin D (2020) How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis. Neural Comput Appl 32(9):4387–4415. doi:10.1007/s00521-018-3865-7
[25] Li L, Goh T-T, Jin D (2020) How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis. Neural Comput Appl 32(9):4387–4415. doi:10.1007/s00521-018-3865-7
[26] Alharbi ASM, de Doncker E (2019) Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information. Cogn Syst Res 54:50–61. doi:10.1016/j.cogsys.2018.10.001
[27] Abid F, Alam M, Yasir M, Li C (2019) Sentiment analysis through recurrent variants latterly on convolutional neural network of Twitter. Future Generation Comput Syst 95:292–308. doi:10.1016/j.future.2018.12.018
[28] Kiritchenko S, Zhu X, Mohammad SM (2014) Sentiment Analysis of Short Informal Texts. Journal of Artificial Intelligence Research 50:723–762. doi:10.1613/jair.4272
[29] Qian J, Niu Z, Shi C (2018) Sentiment Analysis Model on Weather Related Tweets with Deep Neural Network. In: Proceedings of the 10th International Conference on Machine Learning and Computing. ACM, New York, NY, USA, pp 31–35. doi:10.1145/3195106.3195111
[30] LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444. doi:10.1038/nature14539
[31] Agarwal P, Alam M (2020) A Lightweight Deep Learning Model for Human Activity Recognition on Edge Devices. Procedia Comput Sci 167:2364–2373. doi:10.1016/j.procs.2020.03.289
[32] Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: A survey. WIREs Data Min Knowl Discov 8(4). doi:10.1002/widm.1253
[33] Basiri ME, Nemati S, Abdar M, Cambria E, Acharya UR (2021) ABCDM: An Attention-based Bidirectional CNN-RNN Deep Model for sentiment analysis. Future Generation Comput Syst 115:279–294. doi:10.1016/j.future.2020.08.005
[34] Hochreiter S, Schmidhuber J. Long Short-Term Memory.
[35] Choi G, Oh S, Kim H (2020) Improving document-level sentiment classification using importance of sentences. Entropy 22(12):1–11. doi:10.3390/e22121336
[36] Farra N, Challita E, Assi RA, Hajj H (2010) Sentence-level and document-level sentiment mining for Arabic texts. In: Proceedings - IEEE International Conference on Data Mining, ICDM, pp 1114–1119. doi:10.1109/ICDMW.2010.95
[37] Rao G, Huang W, Feng Z, Cong Q (2018) LSTM with sentence representations for document-level sentiment classification. Neurocomputing 308:49–57. doi:10.1016/j.neucom.2018.04.045
[38] Cambria E, Das D, Bandyopadhyay S, Feraco A (2017) Affective Computing and Sentiment Analysis. pp 1–10. doi:10.1007/978-3-319-55394-8_1
[39] Rice DR, Zorn C (2021) Corpus-based dictionaries for sentiment analysis of specialized vocabularies. Political Sci Res Methods 9(1):20–35. doi:10.1017/psrm.2019.10
[40] Troussas C, Virvou M, Espinosa KJ, Llaguno K, Caro J. Sentiment analysis of Facebook statuses using Naive Bayes classifier for language learning.

Additional Declarations

No competing interests reported.
{"props":{"pageProps":{"initialData":{"identity":"rs-5320308","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":372951028,"identity":"76b89d12-da55-4df4-8153-08a6c27157ef","order_by":0,"name":"Parminder Singh","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA9UlEQVRIiWNgGAWjYDCCwwwMB4AUDwPzAcYHIAYf8VrYEpgNQAw2gloOwBhsCWwSYJqQDr7j7BcP/PhlI2POxnys8muOnQwbA/PDRzfwaJE8zFNwsLcvjceyjS3ttuy2ZKDD2IyNc/BoMTjMk3CAt+cwj8H9HrPbktuYgVp42KQJaTn4t+c/j8ExHrNiyW31xGhhP3CY58cBsBbGj9sOE9YC9AvDYdmGZKAWtmRpxm3HediYCfiF7/zxxx/f/LGzNzjGfPDjz23V9vzszQ8f49MCjDsDBsY2CJOZB0ziVQ4C7A8YGP5AmIw/CKoeBaNgFIyCkQgAAglJ3dNvXDkAAAAASUVORK5CYII=","orcid":"","institution":"Uttaranchal University","correspondingAuthor":true,"prefix":"","firstName":"Parminder","middleName":"","lastName":"Singh","suffix":""},{"id":372951029,"identity":"5a15bb5a-700c-4666-9ad8-dca703c41d24","order_by":1,"name":"Saurabh Dhyani","email":"","orcid":"","institution":"Uttaranchal University","correspondingAuthor":false,"prefix":"","firstName":"Saurabh","middleName":"","lastName":"Dhyani","suffix":""}],"badges":[],"createdAt":"2024-10-23 15:53:14","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-5320308/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-5320308/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":68904121,"identity":"9b523b08-0941-43fe-a80f-3cc9ec198037","added_by":"auto","created_at":"2024-11-13 10:20:58","extension":"png","order_by":1,"title":"Figure 
1","display":"","copyAsset":false,"role":"figure","size":149370,"visible":true,"origin":"","legend":"\u003cp\u003eHighlights key components and processes in sentiment polarity classification for traditional machine learning versus deep learning approaches.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/445897e691ce2f4c10f7299e.png"},{"id":68903659,"identity":"8aea0413-33c9-47cc-ab94-1c7f976bdaa5","added_by":"auto","created_at":"2024-11-13 10:12:58","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":248352,"visible":true,"origin":"","legend":"\u003cp\u003eDeep Neural Network (DNN), Convolutional Neural Network (CNN), Long-Short-Memory Network (LSTM)\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/1d127a1d12fc9d107d4cb8eb.png"},{"id":68902354,"identity":"79921d81-b8b2-46d9-b710-847278aa5df5","added_by":"auto","created_at":"2024-11-13 09:56:58","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":191134,"visible":true,"origin":"","legend":"\u003cp\u003eProposed Hybrid Model\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/27fc96fe9d82034a4ff4be98.png"},{"id":68902633,"identity":"45688b1a-9a49-4cef-a7cb-095dc60f1a60","added_by":"auto","created_at":"2024-11-13 10:04:58","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":176827,"visible":true,"origin":"","legend":"\u003cp\u003eConsole.cloud.google.com for API\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/1ec8f290543b5021cf2c6612.png"},{"id":68902347,"identity":"9f1016f7-9c17-4f8e-be28-351e40ba9a2f","added_by":"auto","created_at":"2024-11-13 
09:56:58","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":147866,"visible":true,"origin":"","legend":"\u003cp\u003eMethodology Components\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/27b9e147670a36b47d302d76.png"},{"id":68902348,"identity":"248a8e29-8d1d-4e9b-951f-2c01cd0c9647","added_by":"auto","created_at":"2024-11-13 09:56:58","extension":"jpeg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":88767,"visible":true,"origin":"","legend":"\u003cp\u003ePSO Swarm Based Feature Selection Architecture\u003c/p\u003e","description":"","filename":"floatimage6.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/dd91ebb12f8478785e3c8ef7.jpeg"},{"id":68902636,"identity":"9a422fa4-c27a-48a1-9b8a-19f239f89e1b","added_by":"auto","created_at":"2024-11-13 10:04:58","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":258162,"visible":true,"origin":"","legend":"\u003cp\u003ePSO Feature Selection\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/0ff2dacee3f56822d94a7f67.png"},{"id":68902352,"identity":"b54155de-ea8d-4f91-b510-c4aad050067c","added_by":"auto","created_at":"2024-11-13 09:56:58","extension":"jpeg","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":39125,"visible":true,"origin":"","legend":"\u003cp\u003eComment Clusters\u003c/p\u003e","description":"","filename":"floatimage8.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/1ca83fce3ae2548d3d880c6e.jpeg"},{"id":68902355,"identity":"61e914a0-238a-4f54-8d88-3e12ce744a14","added_by":"auto","created_at":"2024-11-13 09:56:58","extension":"jpeg","order_by":9,"title":"Figure 
9","display":"","copyAsset":false,"role":"figure","size":496665,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFigure 9-10:\u003c/strong\u003e Sentiment Distribution Chart and Word Cloud Map\u003c/p\u003e","description":"","filename":"floatimage9.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/4a9259f6df3739a0de206a70.jpeg"},{"id":69033754,"identity":"9105008d-1306-4899-9fdf-5717747511ff","added_by":"auto","created_at":"2024-11-14 20:31:31","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2200918,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5320308/v1/838018b5-6ffb-4e95-9331-5faee8d07b79.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Improving Social Media Sentiment Analysis with Swarm Intelligence Feature Selection and Deep Learning Techniques","fulltext":[{"header":"1. Introduction ","content":"\u003cp\u003eThe emergence of Web 2.0 has profoundly changed how people communicate and share information online, resulting in exponential growth of user-generated content on social platforms such as WhatsApp, Twitter, and Facebook. These platforms build on Web 2.0 concepts and technical principles that facilitate the creation and editing of user-generated content. [1], [2], [3]\u0026nbsp;Building resilient infrastructure, promoting inclusive and sustainable industrialization, and stimulating innovation [4] are all important aspects of achieving Sustainable Development Goal (SDG) 9 (UN, 2015). [5]\u0026nbsp; In the context of sentiment analysis, achieving SDG 9 requires improving analytics and data processing technologies to support industrial and economic growth. 
[6] Blogs, forum posts, and other online communities [7] allow users to share opinions about anything.[8] Sentiment analysis mines Internet-based information for attitudes, opinions, and feelings by using computational algorithms to determine the emotional tone of words.[9] The use of social media on a large scale has put abundant user-generated content in circulation, making it very useful for gauging public opinion. [10], [11] For researchers, businesses, and policymakers, reaching an understanding of public opinion trends based on this study could help adapt strategies and make informed decisions.[12] More sophisticated sentiment analysis can be a powerful tool for understanding consumer demands and market trends, which will help us make more optimized decisions and develop creative solutions in public policy, marketing, and finance.[13]\u003c/p\u003e\n\u003cp\u003eAlthough significant progress has been made, a gap still exists in the effective integration of traditional feature selection strategies with deep learning techniques for sentiment analysis. Traditional methods have difficulty with subtleties in human language like sarcasm, ambiguity, and context dependencies, leading to suboptimal sentiment predictions.[14], [15]\u0026nbsp;Deep learning, although effective, requires substantial resources and may not necessarily capture higher-level features optimally. Addressing these challenges requires a hybrid approach that leverages the power of both methodologies. This paper aims to bridge this research gap by proposing a novel hybrid approach with advanced features combined with deep neural network architecture. 
This study makes the following contributions:\u003c/p\u003e\n\u003cp\u003e(1) An ensemble model that combines a swarm-based feature selection strategy with Long Short-Term Memory (LSTM) networks, trained on thematically different datasets that include social media posts as well as movie reviews.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e(2) The incorporation of particle swarm optimization (PSO) for feature selection to enhance sentiment prediction accuracy.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e(3) Rigorous evaluations that validate the model\u0026rsquo;s performance and demonstrate significant improvements over traditional methods.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe manuscript is organized as follows: \u003cstrong\u003eSection 2\u003c/strong\u003e discusses related work and the theoretical background; \u003cstrong\u003eSection 3\u003c/strong\u003e details the proposed hybrid model and its implementation; and \u003cstrong\u003eSection 4\u003c/strong\u003e presents the experimental results and analysis, highlighting the model\u0026rsquo;s effectiveness and potential applications.\u003c/p\u003e"},{"header":"2. Literature Review","content":"\u003cp\u003eData sources for sentiment analysis[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e] are primarily drawn from online social media platforms, where users continuously generate an exponentially growing volume of information. Consequently, these data sources must be evaluated within a big data framework, addressing challenges related to data quality, storage, accessibility, resource availability, and monitoring to ensure that results are reliable.[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] Automated sentiment analysis is a growing research area and an essential multi-application task, even though it is complex and faces numerous natural language processing (NLP) challenges.[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e], [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e], [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e], [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e] 
Social media platforms, as key sources of sentiment analysis data, are expanding continuously, producing increasingly complex and interconnected content. In this regard, \u003cem\u003eNeveen Ghali\u003c/em\u003e recommended a shift away from focusing solely on data structure and correlations, advocating the development of a lifelong understanding of data presentation, analysis, inference, visualization, search and navigation, and decision-making in complex networks.[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] Numerous studies have developed robust models to manage the growing volume of big data and have extended sentiment analysis to a broad range of applications, including financial forecasting [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e], [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e], market strategies, medical research[\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e], [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], and various other industries, thereby providing practical evidence of their performance.[\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e], [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eWhen evaluated on a specific dataset in a specific domain, Convolutional Neural Networks (CNN)[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] and Recurrent Neural Networks (RNN)[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] achieve relatively high overall accuracy, and studies have consistently shown that CNN and RNN models can overcome minor textual deficiencies. \u003cem\u003eQian et al.\u003c/em\u003e [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] showed that long short-term memory (LSTM) is more efficient when weather, mood, and emotion tweets are processed in different contexts. 
\u003cem\u003eLi et al.\u003c/em\u003e [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] investigated the effect of data quality on sentiment classification performance. Three factors (feedback, readability, and objectivity) were considered to assess online reproduction quality. The study highlighted two factors that affect the accuracy of sentiment analysis: readability and duration. Higher readability and smaller text datasets result in higher sentiment classification accuracy. However, the reliability of the proposed method is questionable when the size or scope of the data varies.\u003c/p\u003e \u003cp\u003eMost comparative studies ignore processing time and concentrate on reliability indicators such as overall accuracy or F-score. Moreover, models are frequently evaluated on small data samples. This paper fills the gap by offering a thorough comparison of sentiment analysis techniques from experimental studies and the literature, assessing the effectiveness of deep learning models and similar techniques on datasets that span a range of topics. The research question is whether it is possible to identify the best approaches for datasets of various sizes and types. 
By assessing results on three factors (total accuracy, F-score, and processing time), the study builds on earlier work to improve sentiment analysis (SA) performance. This comparison aims to give an objective assessment of the approaches, one that can direct research toward the best possible outcome. Sentiment analysis based on deep learning techniques, including convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and deep neural networks (DNN), has been the subject of numerous studies in recent years. Our strategy employs LSTM to preserve context and thereby address issues such as sarcasm (e.g., \"Oh fantastic! Another traffic jam!\"), ambiguity (as in \"I like spicy food about\"), and contextual awareness (as in \"The movie is too long\"). It also takes into account slang, metaphors, emoticons, domain-specific attitudes, and negative expressions (such as \"I hate that\"). To perform feature selection efficiently and to handle social media data and nuanced expressions, the proposed hybrid model combines deep learning with swarm intelligence. It does this by using swarm-based feature selection techniques such as particle swarm optimization (PSO), which results in a comprehensive sentiment analysis solution.\u003c/p\u003e \u003cp\u003e \u003cb\u003eFig-1\u003c/b\u003e shows that deep learning uses a multi-layered approach to extract and learn real-world features, enhancing accuracy and performance over traditional machine learning, where features are manually defined [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Deep learning learns its feature representations automatically, unlike traditional methods such as SVM, Bayesian networks, or decision trees, which rely on manually defined features. It effectively tackles challenges in image and speech recognition and in NLP, with LSTM networks addressing complex issues like sarcasm and domain-specific sentiments. 
Integrating deep learning with swarm-based feature selection methods creates robust sentiment analysis models.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eFig-2\u003c/b\u003e shows the three architectures. Deep neural networks (DNNs) consist of multiple layers, including more than two hidden layers, and use complex algorithms to process data from an input layer to an output layer [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Convolutional Neural Networks (CNNs), a deep learning framework used in computer vision and NLP, include convolutional, pooling, and fully connected layers. These layers apply filters to extract features and reduce complexity, enhancing robustness [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e], [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Recurrent Neural Networks (RNNs), designed for sequential data, form feedback loops that allow them to retain the results of previous computations [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Long Short-Term Memory networks (LSTMs), a type of RNN, efficiently capture long-term dependencies in sequences, making them suitable for tasks like time series analysis and NLP [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e].\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSentiment analysis evaluates information to determine whether the sentiment is positive, negative, or neutral at the feature, sentence, and document levels [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e], [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. 
Social media platforms like Facebook and Twitter have amplified user-generated content, making sentiment analysis crucial for understanding public opinion, despite challenges like sarcasm, irony, ambiguity, site-specific sensitivity, multilingual nuances, data imbalance, and emoji interpretation [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Lexicon-based techniques, such as SentiWordNet, use predefined dictionaries to categorize emotions but often lack context sensitivity [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. Corpus-based methods, like k-nearest neighbors (k-NN) and hidden Markov models (HMM), use statistical analysis to estimate sentiment, capturing subtle emotional content better [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e], [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Machine learning techniques for sentiment analysis include traditional models like Naive Bayes, Maximum Entropy, and Support Vector Machines (SVMs), as well as deep learning models like CNN, DNN, and RNN [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e].\u003c/p\u003e"},{"header":"3. Proposed Hybrid Model","content":"\u003cp\u003eFigure \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows both the lexicon-based and the machine-learning-based approaches to the sentiment analysis process. The proposed model follows the path marked by the yellow arrows: text data enters as input and is processed through tokenization and cleaning. The text is then processed further and analyzed by lexicon-based or deep learning methods. More concretely, we use deep learning techniques, namely LSTM (Long Short-Term Memory) networks, in a machine-learning-based approach to NLP. 
Here, PSO performs feature selection and can make the sentiment models better suited to complex tasks. Finally, after PSO-based feature selection, the LSTM network classifies the sentiment as positive, negative, or neutral.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Methodological Analysis\u003c/h2\u003e \u003cp\u003e \u003cb\u003eData Collection\u003c/b\u003e For the proposed hybrid method, we start by acquiring real, heterogeneous user data from YouTube and other platforms through the YouTube Data API, accessed via the Google Cloud console (Fig-4), which provides the high-quality input needed for the subsequent analysis stages. Once the data are collected, we first prepare them carefully. This includes tokenization, which segments the text into logical units, followed by cleaning steps that remove noise such as URLs, emoticons, and special characters. Furthermore, eliminating function words (stopwords) increases the focus on terms that are informative for sentiment classification.\u003c/p\u003e\u003cp\u003eAlternatively, we use lexicon-based methods to extract preliminary sentiment features using resources such as SentiWordNet. This step provides an initial emotional signal from the text. Then, using traditional machine learning algorithms such as Naive Bayes, SVM, or others, we extract the most relevant features from the preprocessed data. These algorithms use lexical features, emotion-bearing words, part-of-speech features, and other characteristics of language to increase the accuracy of emotion analysis, as shown in \u003cb\u003eFigure-5.\u003c/b\u003e\u003c/p\u003e \u003cp\u003e \u003cb\u003eData Preprocessing\u003c/b\u003e Careful text cleaning is carried out throughout the preparation phase to improve the quality of the data. 
Regular expressions help to standardize input by making it easier to remove URLs, emoticons, and special characters. Natural language processing tools drive a process called tokenization, which breaks down the text into discrete tokens and lays the foundation for further research. Simultaneously, sentiment analysis becomes more focused when stopwords, or non-informative words like \"the\" and \"and,\" are removed. Table-1, which presents influential terms together with their frequency, is designed to help find important terms that are important in expressing feelings in film reviews. This table contributes to a more nuanced understanding of user feelings in the context of movie conversations by quantifying word occurrences, which helps train a robust sentiment analysis model on key terms.\u003c/p\u003e \u003cp\u003e \u003cb\u003eTable \u0026minus;\u003c/b\u003e\u0026thinsp;1 : Word Frequency\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWord\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFrequency\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003emovie\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e691\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eindian\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e215\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003ehindustani\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e222\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eexcellent\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e125\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003esuper\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e110\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003elove\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e91\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ebest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e83\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ebad\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e16\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eboring\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e---\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e---\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eFeature Extraction\u003c/b\u003e step refines feature selection and improve model performance, we incorporate Swarm-Based Feature Selection techniques, especially Particle Swarm 
Optimization (PSO) (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). PSO helps to select the more informative features, those that contribute most to sentiment classification. PSO maintains a swarm of particles, where each particle represents a candidate feature subset. Initially, the positions and velocities of the particles are set randomly within defined limits. For feature selection in text data, each dimension of the particle's position vector corresponds to a feature in the data set.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the Particle Swarm Optimization (PSO) process for feature selection, each particle's fitness is evaluated by training a classifier, such as an LSTM for sequential data, on the features the particle selects and measuring the classifier's accuracy on a validation set. This fitness evaluation steers the search toward features that improve how well the classifier works. If a particle's current fitness is better than its previous best, its individual best position (pBest) is updated so that promising feature subsets are not forgotten. Analogously, the global best position (gBest) is updated each time a particle's current fitness exceeds that of the previous gBest, meaning it has reached the best feature set found so far across all particles. 
Then particles update their velocities to cover the search space thoroughly, balancing pBest (personal experience) against gBest (global knowledge) to optimize feature selection.\u003c/p\u003e \u003cp\u003eThe velocity update formula is:\u003c/p\u003e \u003cp\u003e \u003cb\u003ev\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t\u0026thinsp;+\u0026thinsp;1)\u0026thinsp;=\u0026thinsp;ω\u0026middot;\u003cb\u003ev\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t)\u0026thinsp;+\u0026thinsp;c\u003csub\u003e1\u003c/sub\u003e\u0026middot;r\u003csub\u003e1\u003c/sub\u003e\u0026middot;(\u003cb\u003epBest\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e\u0026thinsp;\u0026minus;\u0026thinsp;\u003cb\u003ex\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t))\u0026thinsp;+\u0026thinsp;c\u003csub\u003e2\u003c/sub\u003e\u0026middot;r\u003csub\u003e2\u003c/sub\u003e\u0026middot;(\u003cb\u003egBest\u003c/b\u003e\u0026thinsp;\u0026minus;\u0026thinsp;\u003cb\u003ex\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t)) (1)\u003c/p\u003e \u003cp\u003eWhere:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e\u003cb\u003ev\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t) is the velocity of particle i at iteration t,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eω is the inertia weight,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003ec\u003csub\u003e1\u003c/sub\u003e and c\u003csub\u003e2\u003c/sub\u003e are the cognitive and social coefficients,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003er\u003csub\u003e1\u003c/sub\u003e and r\u003csub\u003e2\u003c/sub\u003e are random numbers between 0 and 1,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e\u003cb\u003epBest\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e is the personal best position of particle i,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e\u003cb\u003egBest\u003c/b\u003e is the global best position among all particles,\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e\u003cb\u003ex\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t) is the position of particle i at iteration 
t.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eParticles then update their positions using the updated velocities to explore new potential feature subsets:\u003c/p\u003e \u003cp\u003e\u003cb\u003ex\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t\u0026thinsp;+\u0026thinsp;1)\u0026thinsp;=\u0026thinsp;\u003cb\u003ex\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t)\u0026thinsp;+\u0026thinsp;\u003cb\u003ev\u003c/b\u003e\u003csub\u003ei\u003c/sub\u003e(t\u0026thinsp;+\u0026thinsp;1) (2)\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe graph (Figure-7) shows how PSO improves feature selection by tracking the feature positions across iterations. It accounts for complex or variable factors with respect to classification accuracy. While refining feature subsets for better classification results, the search also helps in understanding feature importance and algorithm performance. Long short-term memory (LSTM) networks, which are recurrent neural networks (RNNs) known to capture sequential dependencies and context, are then used to process the selected features of the textual data. After LSTM processing, the fully connected layer produces the final sentiment classification based on the learned features. This layer consolidates the selected information and predicts the sentiment as positive, negative, or neutral. 
Model performance is assessed with metrics such as accuracy, F1 score, precision, and recall on independent test data sets to ensure robustness and reliability (formula \u0026minus;\u0026thinsp;3).\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$F1=2\\times\\frac{\\text{Precision}\\times\\text{Recall}}{\\text{Precision}+\\text{Recall}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e \u003cb\u003eModel Evaluation\u003c/b\u003e In this step, independent test data are used to measure the accuracy of the model. In addition to accuracy, metrics such as precision, recall, and F1 score are used to provide a complete picture of model performance, which is especially relevant in sentiment analysis. This detailed analysis helps ensure that the sentiment analysis model can accurately represent the nuances of user comments on YouTube.\u003c/p\u003e \u003cp\u003eThe Comment Clustering graph (Figure-8), created using K-means clustering and displayed using PCA, provides insightful information about the natural patterns of YouTube content. Each group has a different color, and the red line indicates a specific group of content. By giving an overview of the groups, this visualization can help reveal underlying themes or patterns in the data. 
Studied together with the sentiment distribution and model accuracy, the clusters provide a nuanced view of user opinions in the movie reviews. Although not described further in this article, the red lines indicate a specific group of comments whose characteristics or patterns can be examined further, contributing to a more detailed description of emotional dynamics and user engagement.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4. Result Analysis","content":"\u003cp\u003eThe analysis of the results reveals the views expressed in the YouTube comments using several complementary approaches. A pie chart (Fig.\u0026nbsp;9) and a word cloud (Fig.\u0026nbsp;10) are among the visualizations used to represent the sentiment classification. A word cloud is a visual representation of text data that displays words at sizes proportional to their frequency. This visualization provides qualitative insight into the dominant sentiments by highlighting common words such as \u0026ldquo;excellent\u0026rdquo;, \u0026ldquo;Super\u0026rdquo;, \u0026ldquo;love\u0026rdquo;, \u0026ldquo;Hindustani\u0026rdquo;, \u0026ldquo;awesome\u0026rdquo;, \u0026ldquo;blockbuster\u0026rdquo; and other positive statements, as well as specific references to artists like \u0026ldquo;Kamal Hasan\u0026rdquo;. Fig.\u0026nbsp;10 shows the word cloud for YouTube comments, illustrating these frequent terms.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn addition, the pie chart (Fig.\u0026nbsp;9) gives a numerical view by breaking the sentiments down into percentages: 31.3% neutral, 15.9% negative, and 52.8% positive. Taken together, these graphics offer a thorough understanding of the sentiments conveyed in YouTube comments, with both qualitative and quantitative insight into user viewpoints. The performance of the proposed methodology is presented in a comparison table, which reports an accuracy of 93.5%. 
This result shows that our model outperforms conventional techniques such as Naive Bayes, SVM, and Random Forest in reliably classifying sentiments. The comparison, shown in Table-II, supports the greater effectiveness and dependability of the proposed sentiment analysis approach.\u003c/p\u003e \u003cp\u003eTable \u0026ndash; II (Comparison of Sentiment Analysis Models)\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabb\" border=\"1\"\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eF1 Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNaive Bayes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e87.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.83\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSVM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e91.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCNN\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e86.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.86\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eProposed Model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e93.5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.94\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e "},{"header":"Conclusion","content":"\u003cp\u003eOur research applies sentiment analysis to YouTube movie reviews, drawing on user-generated comments to uncover diverse perspectives and emotions. Using natural language processing techniques, the proposed machine learning model achieves 93.5% accuracy in deciphering sentiments. Through a stepwise methodology involving data collection, preprocessing, model selection and training, evaluation, and results analysis, we address the challenges of extracting meaningful insights from YouTube comments. This research contributes to the evolving field of sentiment analysis, offering a reliable tool for content creators, movie studios, and researchers to make sense of the range of opinions in online discussions. 
As we examine the nuances of sentiment expressed in YouTube movie reviews, our study not only enhances understanding but also holds promise for informing marketing strategies and decision-making processes in the digital age.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eNo funding.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eParminder Singh (First Author) was responsible for creating the entire manuscript, including the research, model development, and writing. Saurabh Dhyani (Co-Author) provided guidance and valuable insights throughout the research process.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eDang NC, Moreno-Garc\u0026iacute;a MN, De la Prieta F (2020) Sentiment Analysis Based on Deep Learning: A Comparative Study, \u003cem\u003eElectronics (Basel)\u003c/em\u003e, vol. 9, no. 3, p. 483, Mar. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/electronics9030483\u003c/span\u003e\u003cspan address=\"10.3390/electronics9030483\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAraque O, Corcuera-Platas I, S\u0026aacute;nchez-Rada JF, Iglesias CA (Jul. 2017) Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst Appl 77:236\u0026ndash;246. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.eswa.2017.02.002\u003c/span\u003e\u003cspan address=\"10.1016/j.eswa.2017.02.002\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDenoncourt J, Companies (Jan. 2020) UN 2030 Sustainable Development Goal 9 Industry, Innovation and Infrastructure. J Corp Law Stud 20(1):199\u0026ndash;235. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1080/14735970.2019.1652027\u003c/span\u003e\u003cspan address=\"10.1080/14735970.2019.1652027\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHu H (2024) Digitalization and Dependence: Evaluating the Impact of the Belt and Road Initiative on Achieving Sustainable Development Goals 8 and 9 and Shaping Digital Autonomy, \u003cem\u003eJournal of Economic Integration\u003c/em\u003e, Jun. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.11130/jei.2024024\u003c/span\u003e\u003cspan address=\"10.11130/jei.2024024\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDenoncourt J, Companies (Jan. 2020) UN 2030 Sustainable Development Goal 9 Industry, Innovation and Infrastructure. J Corp Law Stud 20(1):199\u0026ndash;235. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1080/14735970.2019.1652027\u003c/span\u003e\u003cspan address=\"10.1080/14735970.2019.1652027\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHajikhani A, Suominen A (2022) Mapping the sustainable development goals (SDGs) in science, technology and innovation: application of machine learning in SDG-oriented artefact detection, \u003cem\u003eScientometrics\u003c/em\u003e, vol. 127, no. 11, pp. 6661\u0026ndash;6693, Nov. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s11192-022-04358-x\u003c/span\u003e\u003cspan address=\"10.1007/s11192-022-04358-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDang NC, Moreno-Garc\u0026iacute;a MN, De la Prieta F (2020) Sentiment Analysis Based on Deep Learning: A Comparative Study, \u003cem\u003eElectronics (Basel)\u003c/em\u003e, vol. 9, no. 3, p. 483, Mar. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/electronics9030483\u003c/span\u003e\u003cspan address=\"10.3390/electronics9030483\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCha M, P\u0026eacute;rez JAN, Haddadi H (Sep. 2012) The spread of media content through blogs. Soc Netw Anal Min 2(3):249\u0026ndash;264. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s13278-011-0040-x\u003c/span\u003e\u003cspan address=\"10.1007/s13278-011-0040-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBirjali M, Kasri M, Beni-Hssane A (Aug. 2021) A comprehensive survey on sentiment analysis: Approaches, challenges and trends. Knowl Based Syst 226:107134. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.knosys.2021.107134\u003c/span\u003e\u003cspan address=\"10.1016/j.knosys.2021.107134\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaura JR, Reyes-Menendez A, Thomas SB (Apr. 2020) Gaining a deeper understanding of nutrition using social networks and user-generated content. Internet Interv 20:100312. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.invent.2020.100312\u003c/span\u003e\u003cspan address=\"10.1016/j.invent.2020.100312\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNarangajavana Kaosiri Y, Callarisa Fiol LJ, Moliner Tena M\u0026Aacute;, Rodr\u0026iacute;guez RM, Artola, S\u0026aacute;nchez Garc\u0026iacute;a J (2019) User-Generated Content Sources in Social Media: A New Approach to Explore Tourist Satisfaction, \u003cem\u003eJ Travel Res\u003c/em\u003e, vol. 58, no. 2, pp. 253\u0026ndash;265, Feb. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/0047287517746014\u003c/span\u003e\u003cspan address=\"10.1177/0047287517746014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRodrigues AP et al (2022) Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning and Deep Learning Techniques, \u003cem\u003eComput Intell Neurosci\u003c/em\u003e, vol. pp. 1\u0026ndash;14, Apr. 2022, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1155/2022/5211949\u003c/span\u003e\u003cspan address=\"10.1155/2022/5211949\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGunter B, Koteyko N, Atanasova D (2014) Sentiment Analysis: A Market-Relevant and Reliable Measure of Public Feeling? \u003cem\u003eInternational Journal of Market Research\u003c/em\u003e, vol. 56, no. 2, pp. 231\u0026ndash;247, Mar. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.2501/IJMR-2014-014\u003c/span\u003e\u003cspan address=\"10.2501/IJMR-2014-014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAl-Qablan TA, Mohd Noor MH, Al-Betar MA, Khader AT (2023) A survey on sentiment analysis and its applications, \u003cem\u003eNeural Comput Appl\u003c/em\u003e, vol. 35, no. 29, pp. 21567\u0026ndash;21601, Oct. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00521-023-08941-y\u003c/span\u003e\u003cspan address=\"10.1007/s00521-023-08941-y\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma NA, Ali ABMS, Kabir MA (2024) A review of sentiment analysis: tasks, applications, and deep learning techniques, \u003cem\u003eInt J Data Sci Anal\u003c/em\u003e, Jul. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s41060-024-00594-x\u003c/span\u003e\u003cspan address=\"10.1007/s41060-024-00594-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eL\u0026uuml;cking A, Driller C, Stoeckel M, Abrami G, Pachzelt A, Mehler A (2022) Multiple annotation for biodiversity: developing an annotation framework among biology, linguistics and text technology, \u003cem\u003eLang Resour Eval\u003c/em\u003e, vol. 56, no. 3, pp. 807\u0026ndash;855, Sep. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10579-021-09553-5\u003c/span\u003e\u003cspan address=\"10.1007/s10579-021-09553-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCambria E, Das D, Bandyopadhyay S, Feraco A (2017) Affective Computing and Sentiment Analysis. 
1\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-3-319-55394-8_1\u003c/span\u003e\u003cspan address=\"10.1007/978-3-319-55394-8_1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang Y et al (2022) Jul., A systematic review on affective computing: emotion models, databases, and recent advances, \u003cem\u003eInformation Fusion\u003c/em\u003e, vol. 83\u0026ndash;84, pp. 19\u0026ndash;52, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.inffus.2022.03.009\u003c/span\u003e\u003cspan address=\"10.1016/j.inffus.2022.03.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHussein DME-DM (2018) A survey on sentiment analysis challenges, \u003cem\u003eJournal of King Saud University - Engineering Sciences\u003c/em\u003e, vol. 30, no. 4, pp. 330\u0026ndash;338, Oct. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jksues.2016.04.002\u003c/span\u003e\u003cspan address=\"10.1016/j.jksues.2016.04.002\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGhali N, Panda M, Hassanien AE, Abraham A, Snasel V (2012) Social Networks Analysis: Tools, Measures and Visualization, in \u003cem\u003eComputational Social Networks\u003c/em\u003e. Springer London, London, pp 3\u0026ndash;23. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-1-4471-4054-2_1\u003c/span\u003e\u003cspan address=\"10.1007/978-1-4471-4054-2_1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJangid H, Singhal S, Shah RR, Zimmermann R (2018) Aspect-Based Financial Sentiment Analysis using Deep Learning, in \u003cem\u003eCompanion of the The Web Conference on The Web Conference 2018 - WWW \u0026rsquo;18\u003c/em\u003e, New York, New York, USA: ACM Press, 2018, pp. 1961\u0026ndash;1966. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3184558.3191827\u003c/span\u003e\u003cspan address=\"10.1145/3184558.3191827\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSohangir S, Wang D, Pomeranets A, Khoshgoftaar TM (Dec. 2018) Big Data: Deep Learning for financial sentiment analysis. J Big Data 5(1). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s40537-017-0111-6\u003c/span\u003e\u003cspan address=\"10.1186/s40537-017-0111-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlharbi ASM, de Doncker E (May 2019) Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information. Cogn Syst Res 54:50\u0026ndash;61. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cogsys.2018.10.001\u003c/span\u003e\u003cspan address=\"10.1016/j.cogsys.2018.10.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi L, Goh T-T, Jin D (May 2020) How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis. 
Neural Comput Appl 32(9):4387\u0026ndash;4415. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00521-018-3865-7\u003c/span\u003e\u003cspan address=\"10.1007/s00521-018-3865-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi L, Goh T-T, Jin D (May 2020) How textual quality of online reviews affect classification performance: a case of deep learning sentiment analysis. Neural Comput Appl 32(9):4387\u0026ndash;4415. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s00521-018-3865-7\u003c/span\u003e\u003cspan address=\"10.1007/s00521-018-3865-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlharbi ASM, de Doncker E (May 2019) Twitter sentiment analysis with a deep neural network: An enhanced approach using user behavioral information. Cogn Syst Res 54:50\u0026ndash;61. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cogsys.2018.10.001\u003c/span\u003e\u003cspan address=\"10.1016/j.cogsys.2018.10.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbid F, Alam M, Yasir M, Li C (Jun. 2019) Sentiment analysis through recurrent variants latterly on convolutional neural network of Twitter. Future Generation Comput Syst 95:292\u0026ndash;308. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.future.2018.12.018\u003c/span\u003e\u003cspan address=\"10.1016/j.future.2018.12.018\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKiritchenko S, Zhu X, Mohammad SM (2014) Sentiment Analysis of Short Informal Texts, \u003cem\u003eJournal of Artificial Intelligence Research\u003c/em\u003e, vol. 50, pp. 723\u0026ndash;762, Aug. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1613/jair.4272\u003c/span\u003e\u003cspan address=\"10.1613/jair.4272\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQian J, Niu Z, Shi C (2018) Sentiment Analysis Model on Weather Related Tweets with Deep Neural Network, in \u003cem\u003eProceedings of the 10th International Conference on Machine Learning and Computing\u003c/em\u003e, New York, NY, USA: ACM, Feb. 2018, pp. 31\u0026ndash;35. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3195106.3195111\u003c/span\u003e\u003cspan address=\"10.1145/3195106.3195111\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeCun Y, Bengio Y, Hinton G (May 2015) Deep learning. Nature 521(7553):436\u0026ndash;444. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nature14539\u003c/span\u003e\u003cspan address=\"10.1038/nature14539\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAgarwal P, Alam M (2020) A Lightweight Deep Learning Model for Human Activity Recognition on Edge Devices. Procedia Comput Sci 167:2364\u0026ndash;2373. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.procs.2020.03.289\u003c/span\u003e\u003cspan address=\"10.1016/j.procs.2020.03.289\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang L, Wang S, Liu B (Jul. 2018) Deep learning for sentiment analysis: A survey. WIREs Data Min Knowl Discov 8(4). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/widm.1253\u003c/span\u003e\u003cspan address=\"10.1002/widm.1253\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBasiri ME, Nemati S, Abdar M, Cambria E, Acharya UR (Feb. 2021) ABCDM: An Attention-based Bidirectional CNN-RNN Deep Model for sentiment analysis. Future Generation Comput Syst 115:279\u0026ndash;294. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.future.2020.08.005\u003c/span\u003e\u003cspan address=\"10.1016/j.future.2020.08.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHochreiter S, \u0026uml; J, Schmidhuber U Long Short-Term Memory.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChoi G, Oh S, Kim H (2020) Improving document-level sentiment classification using importance of sentences, \u003cem\u003eEntropy\u003c/em\u003e, vol. 22, no. 12, pp. 1\u0026ndash;11, Dec. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/e22121336\u003c/span\u003e\u003cspan address=\"10.3390/e22121336\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFarra N, Challita E, Assi RA, Hajj H (2010) Sentence-level and document-level sentiment mining for arabic texts, in \u003cem\u003eProceedings - IEEE International Conference on Data Mining, ICDM\u003c/em\u003e, pp. 1114\u0026ndash;1119. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICDMW.2010.95\u003c/span\u003e\u003cspan address=\"10.1109/ICDMW.2010.95\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRao G, Huang W, Feng Z, Cong Q (2018) LSTM with sentence representations for document-level sentiment classification, \u003cem\u003eNeurocomputing\u003c/em\u003e, vol. 308, pp. 49\u0026ndash;57, Sep. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.neucom.2018.04.045\u003c/span\u003e\u003cspan address=\"10.1016/j.neucom.2018.04.045\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCambria E, Das D, Bandyopadhyay S, Feraco A (2017) Affective Computing and Sentiment Analysis. 1\u0026ndash;10. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-3-319-55394-8_1\u003c/span\u003e\u003cspan address=\"10.1007/978-3-319-55394-8_1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRice DR, Zorn C (Jan. 2021) Corpus-based dictionaries for sentiment analysis of specialized vocabularies. Political Sci Res Methods 9(1):20\u0026ndash;35. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1017/psrm.2019.10\u003c/span\u003e\u003cspan address=\"10.1017/psrm.2019.10\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTroussas C, Virvou M, Espinosa KJ, Llaguno K, Caro J Sentiment analysis of Facebook statuses using Naive Bayes classifier for language learning.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Natural Language Processing, Sentiment Analysis, Swarm Intelligence, Deep Learning, LSTM","lastPublishedDoi":"10.21203/rs.3.rs-5320308/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5320308/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eIn the rapidly evolving digital age, sentiment analysis is crucial for understanding consumer behavior on social media platforms. 
Advanced sentiment analysis techniques integrate swarm based feature selection strategy with deep learning approaches, enhancing emotion classification accuracy and contributing to Sustainable Development Goal (SDG) 9: Infrastructure Innovation. In order to evaluate social media postings and movie reviews, the suggested ensemble model integrates advance strategy of feature selection with deep neural network architecture, making use of swarm-based feature selection and Long-Short Term memory Network (LSTM). Particle Swarm Optimization (PSO) greatly increases the accuracy of emotion prediction by using it for feature selection. Rigorous evaluations validate the hybrid model, demonstrating significant improvements over traditional methods and achieving an impressive accuracy of 93.5%. This highlights its robustness in handling data challenges like sarcasm and ambiguity. The implementation advances sentiment analysis, offering comprehensive solutions that support economic and industrial growth, making it a valuable tool for modern data-driven decision-making.\u003c/p\u003e","manuscriptTitle":"Improving Social Media Sentiment Analysis with Swarm Intelligence Feature Selection and Deep Learning Techniques","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-11-13 09:56:53","doi":"10.21203/rs.3.rs-5320308/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"0791aa30-fa6b-4a7c-8809-74537584ced0","owner":[],"postedDate":"November 13th, 
2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-11-14T20:23:22+00:00","versionOfRecord":[],"versionCreatedAt":"2024-11-13 09:56:53","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-5320308","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5320308","identity":"rs-5320308","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
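The position update in Eq. (2) and the F1 score in formula 3 of the methodology can be illustrated with a minimal Python sketch. It is not the authors' implementation. The velocity update is the standard PSO form, and the inertia weight `w`, the coefficients `c1` and `c2`, the clipping of positions to [0, 1], the 0.5 selection threshold, and the `toy_fitness` function are all assumptions made for illustration. In the pipeline described in the paper, the fitness of a feature subset would presumably be the validation performance of the LSTM classifier trained on that subset.

```python
import random


def f1_score(precision: float, recall: float) -> float:
    """Formula 3: F1 = 2 * P * R / (P + R); returns 0.0 when P + R is zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def update_position(x, v):
    """Eq. (2): x_i(t+1) = x_i(t) + v_i(t+1). Clipping to [0, 1] is an assumption."""
    return [min(1.0, max(0.0, xi + vi)) for xi, vi in zip(x, v)]


def to_mask(x, threshold=0.5):
    """Assumed decoding: a feature is selected when its coordinate exceeds the threshold."""
    return [xi > threshold for xi in x]


def pso_select(fitness, n_features, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (best feature mask, its fitness) found by a small PSO run."""
    rng = random.Random(seed)
    xs = [[rng.random() for _ in range(n_features)] for _ in range(n_particles)]
    vs = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(to_mask(x)) for x in xs]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            # Standard PSO velocity update (assumed form; w, c1, c2 are illustrative).
            vs[i] = [w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
                     for v, x, pb, gb in zip(vs[i], xs[i], pbest[i], gbest)]
            xs[i] = update_position(xs[i], vs[i])
            fit = fitness(to_mask(xs[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = xs[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = xs[i][:], fit
    return to_mask(gbest), gbest_fit


def toy_fitness(mask):
    """Hypothetical stand-in for classifier quality: features 0 and 2 are useful."""
    useful = {0, 2}
    hits = sum(1 for i, m in enumerate(mask) if m and i in useful)
    extras = sum(1 for i, m in enumerate(mask) if m and i not in useful)
    return hits - 0.5 * extras


if __name__ == "__main__":
    mask, score = pso_select(toy_fitness, n_features=5)
    print("selected mask:", mask, "fitness:", score)
    print("F1 for precision 0.9, recall 0.8:", round(f1_score(0.9, 0.8), 4))
```

Thresholding a continuous position into a Boolean mask is only one common way to adapt PSO to feature selection. A sigmoid-based binary PSO is another, and the paper does not state which variant was used.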
