Research Article (Research Square preprint)

Optimizing Fake News Detection in Low-Resource Languages: A Comparative Study of Deep Learning Models Using Sentence-Level FastText Vectors in Kurdish and English

Azad M. Karim, Bryar A. Hassan

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9409057/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1.

Abstract

The rapid dissemination of misinformation on social media is a growing concern. The problem is especially acute for low-resource languages such as Kurdish, where the scarcity of datasets and tools makes misinformation harder to identify and understand. This work applies deep learning (DL) techniques, namely Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Convolutional Neural Networks (CNN), to misinformation detection in Kurdish and English.
These models were evaluated on the Kurdish Dataset for Fake News Detection (KDFND). Among the three models, CNN achieved the highest accuracy (97.40% for Kurdish). We also examined how FastText embeddings affect performance by comparing models with and without embedding layers, with the aim of identifying the most accurate and fastest configuration. Our tests indicate that models fed sentence-level FastText vectors (without embedding layers) perform much better, reaching about 97% accuracy for Kurdish and about 96% for English. In contrast, models with pretrained embeddings attain only about 50% accuracy. The results demonstrate the limitations of static embeddings in low-resource settings and show that simple, flexible models can detect fake news without extensive pretraining. This research contributes to the development of NLP techniques for low-resource languages and has practical implications for multilingual fake news detection systems.

Keywords: fake news detection, low-resource languages, Kurdish NLP, deep learning, FastText embeddings

1. Introduction

The spread of false information on social media is a serious concern because it erodes public trust and influences people's decisions. Identifying false information is particularly difficult in low-resource languages such as Kurdish, because datasets and natural language processing (NLP) tools are scarce. DL models such as LSTM, BiLSTM, and CNN have been effective at identifying misinformation in resource-rich languages; however, their efficacy in resource-poor languages remains largely unexamined. Recent studies have used multimodal DL methods to improve the detection of fake information [1, 2] and adversarial training to make models more robust [3]. However, fine-tuning such models for languages with limited linguistic resources remains difficult.
The Kurdish Dataset for Fake News Detection (KDFND) [4] is a valuable resource for addressing this problem, but few studies have successfully leveraged it. FastText, a word embedding technique, has been widely used for text preprocessing [5], [6], yet it has seen little use in Kurdish fake news detection. The present work explores the performance of LSTM, BiLSTM, and CNN models for detecting fake news in Kurdish and English, based on the KDFND dataset and FastText embeddings. It aims to improve model effectiveness in low-resource settings while maintaining computational efficiency. Previous studies have tried ensemble techniques [7], explainable AI [5], and hybrid models, but a comprehensive comparison of these DL models specifically for the Kurdish language is still missing. To bridge these gaps, this study:

- compares the performance of LSTM, BiLSTM, and CNN models for English and Kurdish fake news detection on the KDFND dataset;
- compares their performance against English-language benchmarks to assess cross-lingual generalizability;
- implements enhancements to improve accuracy and efficiency in environments with limited resources.

Building on earlier studies [4, 8, 9], this paper contributes to the advancement of fake news detection techniques for low-resource languages and discusses transferable strategies for multilingual applications.

2. Related work

Numerous studies have attempted to detect misinformation with machine learning (ML) and DL. Early work on classification tasks such as fake news detection [10, 11, 12] relied extensively on classical models like Support Vector Machines (SVM), Decision Trees (DT), and Naïve Bayes (NB). However, these models cannot capture the deeper linguistic and contextual features of deception. Recent advances have led to the adoption of DL models, which perform better on NLP tasks.
Models such as LSTM and BiLSTM are particularly effective due to their ability to learn long-term dependencies in textual data [5, 13]. Similarly, CNNs have been shown to extract position-invariant local features from text data [7, 14]. Hybrid approaches that combine embeddings such as FastText or Word2Vec with neural networks have also improved classification performance [5, 6]. According to Nasser et al. [2], multimodal approaches combine text with other modalities (such as images and metadata) to assist detection. More recent developments in deepfake generation and misinformation detection [15, 16, 17] have further increased interest in domain-adapted and multimodal approaches. Karim and Hassan [1], Salh and Nabi [4], and Ahmed [11] have conducted extensive research on Kurdish by constructing datasets and employing classical machine learning models. DL approaches to fake news detection for Kurdish are still scarce, which calls for robust architectures for low-resource languages. Prior research efforts are summarized in Table 1.

Table 1: Summary of Previous Works on Fake News Detection

| Ref. | Language | Model | Embedding | Dataset | Key Contribution |
|------|----------|-------|-----------|---------|------------------|
| [1] | Kurdish & English | FastText + LSTM | FastText | KDFND | Proposed a hybrid AI model combining FastText and LSTM for Kurdish and English fake news detection |
| [4] | Kurdish | SVM, DT, NB, CNN | TF-IDF | KDFND | Introduced ML-based Kurdish fake news detection |
| [5] | English | CNN + FastText + XAI | FastText | Twitter, News | Hybrid DL with explainability |
| [7] | English | Ensemble CNN/LSTM | Word2Vec | LIAR | Improved accuracy via ensemble DL |
| [13] | English | Regularized LSTM | Word embeddings | News articles | Enhanced LSTM robustness |
| [11] | Kurdish | Hard Voting (SVM, NB, DT) | TF-IDF | KDFND | Ensemble voting approach for Kurdish |
| [18] | English | ScrutNet (Deep Ensemble) | GloVe | Kaggle dataset | Ensemble learning with attention mechanism |

3. The proposed model and experimental setup

3.1. Dataset Overview

The dataset used in this study is the Kurdish Dataset for Fake News Detection (KDFND) [19]. It contains articles annotated as either real or fake, and is therefore suited to binary classification for fake news detection. The data were gathered from various internet sources and then filtered and cleaned to remove irrelevant and low-quality entries. The key properties of the data are:

- Number of Entries: 100,966
- Key Columns:
  - ID: Unique identifier for each article.
  - Text: The original text of the article.
  - Text_Translate_to_English: The English translation of the text.
  - URL: The source URL of the article.
  - Date: The publication date of the article.
  - Source: The source platform of the article.
  - Label: Binary label indicating whether the news is "Real" (0) or "Fake" (1).

The Kurdish and English texts share the same labels, so the two languages can be compared fairly. We applied the usual preprocessing (normalization, tokenization, padding) and split the data separately for each language to keep the setup consistent.

3.2. Dataset Cleaning and Preprocessing

3.2.1. Handling Missing Data

Missing values affected a small part of the dataset, namely the "Text" column (Kurdish) and the "Text_Translate_to_English" column (English). About 2.3% of the values in the text columns were missing; these entries were removed during preprocessing.

3.2.2. Feature Selection

We removed the columns we judged not relevant (ID, URL, Date, and Source) in order to concentrate on the textual features. The analysis is based on the text column (or its English translation), which was renamed "article", and the label column.

3.2.3. Balancing the Dataset

The dataset showed a minor skew: the number of real news articles was somewhat larger than the number of fake news articles.
The dataset was balanced by random undersampling. The majority class (real news) was randomly undersampled to match the minority class (fake news), and the balanced dataset was shuffled to reduce the risk of ordering bias during training, as recommended in [1]. This strategy avoids favoring the majority class while retaining enough data for meaningful deep learning training. Although undersampling slightly reduces the dataset size, it does not change the distribution of linguistic features needed to train a model, and the resulting dataset (50,210 samples per class) is large enough for DL models.

Table 2 Balanced class distribution

| Label | Original dataset count | Count after undersampling |
|-------|------------------------|---------------------------|
| 0 (Real) | 50,751 | 50,210 |
| 1 (Fake) | 50,210 | 50,210 |

The original sample comprised 100,961 articles with a slight class imbalance (Real: 50,751; Fake: 50,210; ratio 1.01:1). To obtain a perfect balance, random undersampling removed 541 samples from the majority class (about 0.5% of the total data). This option was selected because it loses very little data and is computationally cheap. The final distribution is shown in Table 2.

3.2.4. Text Preprocessing

Text preprocessing is important for preparing text data for ML. The following steps were applied:

- Removed URLs, mentions, hashtags, punctuation, and numbers.
- Converted all text to lowercase.
- Tokenized the text and removed stopwords (both Kurdish and English).

For Kurdish text, additional normalization was applied to handle variation in spelling and character encoding. To keep the two languages comparable, the same preprocessing steps were used for both.

3.2.5. Exploratory Data Analysis (EDA)

To gain a clearer picture of the dataset, EDA was conducted to study the distribution of text lengths in real and fake news articles.
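As a rough illustration of this kind of length analysis, the sketch below summarizes article lengths per class. It assumes that "length" means the number of whitespace-separated tokens, which the study does not specify; the function name `length_summary` and the toy articles are hypothetical and are not taken from KDFND.

```python
from statistics import mean, median


def length_summary(articles, labels):
    """Summarize word counts per class (0 = real, 1 = fake).

    Assumption: length is the number of whitespace-separated tokens.
    """
    lengths = {0: [], 1: []}
    for text, label in zip(articles, labels):
        lengths[label].append(len(text.split()))
    # Only report classes that actually occur in the input.
    return {
        label: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for label, vals in lengths.items()
        if vals
    }


# Toy, made-up examples (not from the dataset).
toy_articles = [
    "officials confirm the budget today",
    "shocking secret cure",
    "council approves new road",
    "you will not believe this trick now",
]
toy_labels = [0, 1, 0, 1]
summary = length_summary(toy_articles, toy_labels)
print(summary)
```

Per-class summaries like these are only a first look; a histogram of the same counts would show the full shape of each distribution.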
Distribution plots of the average word length per article are shown in Figs. 1 and 2. They give an approximate picture of how writing style may vary between the two classes.

[Figures 1 and 2: length distributions for real news articles and fake news articles]

3.3. Feature Extraction

3.3.1. Feature Representation

This study investigated two feature-representation strategies. The first uses pretrained FastText embeddings to initialize embedding layers. The second uses sentence-level FastText vectors produced by a supervised FastText model. FastText was chosen because it captures subword information, which is particularly useful for low-resource languages such as Kurdish, where morphological variation is common. The entire pipeline is presented in Fig. 3. For the sentence-level vectors, we averaged the word embeddings, which is simple and fast: each document is represented as a single dense vector, the average of the word embeddings produced by FastText. This representation reduces dimensionality and computational cost while retaining semantic information.

3.3.2. FastText Embeddings

In this work, we used FastText embeddings to represent the text of the KDFND dataset, comprising both the Kurdish news articles and their English translations [1]. FastText represents each word as a bag of character n-grams, which lets it capture subword information and handle out-of-vocabulary words efficiently, a key benefit for low-resource languages like Kurdish [5], [6]. For the models using pretrained embeddings, we used the publicly available 300-dimensional FastText vectors trained on Common Crawl for English, following the approach in [5], and the corresponding Kurdish vectors from [21].
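The character n-gram idea mentioned above can be illustrated with a short sketch. It wraps a word in the boundary markers `<` and `>` and lists its character n-grams, showing how an unseen word form still shares subword units with a known one. The n-gram range used here is an assumption for illustration; the study does not report its subword settings, and `char_ngrams` is a hypothetical helper, not part of the FastText library.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word`, with boundary markers.

    Assumption: n_min/n_max are illustrative defaults, not the
    settings used in the study.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


# Trigrams of a single word.
trigrams = char_ngrams("where", 3, 3)
print(trigrams)

# An inflected form shares most trigrams with its base form, so a
# subword model can relate the two even if one was never seen.
shared = set(char_ngrams("detect", 3, 3)) & set(char_ngrams("detects", 3, 3))
print(sorted(shared))
```

In a subword model, a word vector is built from the vectors of such n-grams, which is why morphologically related forms end up with related representations.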
To generate sentence-level representations, we trained a supervised FastText model on labeled examples of fake and real news using the "fasttext.train_supervised" function, with the following hyperparameters:

- Learning Rate: 0.5
- Epochs: 25
- Word n-grams: 2
- Embedding Dimension: 300

Training yielded a vocabulary of 69,428 words and two class labels. After training, a sentence vector was generated for each article in the dataset using the "get_sentence_vector()" function, which averages the word embeddings contained in a sentence. This method has been shown to be effective in handling noisy text [2], [5]. A custom helper function ("fasttext_vector") wraps this step so that text is preprocessed consistently and edge cases such as missing or empty input are handled. The resulting matrix, "X_fasttext", was used as the feature set for classification. Although FastText tends to be useful in low-resource and multilingual contexts [5], [9], the experiments in this paper show a major drop in performance when pretrained embeddings are used instead of sentence-level representations. The result suggests that, despite the theoretical advantages of FastText, pretrained (default) word representations are unable to capture the contextual nuances of Kurdish fake news, an observation in line with Daneshfar [9] on the suitability of embeddings for low-resource sentiment analysis.

3.3.3. Tokenization and Padding

Besides sentence-level representations, we also tested token-level representations. The raw news articles were tokenized and converted into padded sequences with the Keras Tokenizer and pad_sequences utilities. A 10,000-word vocabulary was kept, and sequences were padded or truncated to 200 tokens so that all inputs have the same dimensions.

3.3.4. Embedding Matrix

A FastText-based embedding matrix was built for the tokenized vocabulary by loading the FastText vector of each word. It was used to initialize the embedding layer of the LSTM, BiLSTM, and CNN models. The performance gap between pretrained embeddings (approximately 50% accuracy) and sentence-level vectors (approximately 97% accuracy) can be attributed to the following factors:

- Domain mismatch: Embeddings trained on general corpora (Wikipedia, Common Crawl) can fail to capture the linguistic patterns of fake news.
- Out-of-vocabulary (OOV) words: The OOV rate was 18% for Kurdish and 12% for English, which indicates the use of domain-specific terms that the pretrained models do not know.
- Static vs. dynamic representations: The sentence-level FastText vectors learn task-specific semantics during training, whereas pretrained vectors are fixed.

3.4. Dataset Splitting

The processed data were divided into training and test sets in an 80:20 ratio. The split was performed randomly to ensure balanced representation. Table 3 gives an overview of the FastText-based data preparation process.

Table 3 FastText-Based Data Preparation

| Step | Purpose |
|------|---------|
| Train FastText model | Supervised training on labeled dataset to learn domain-specific embeddings |
| Generate sentence vectors | Apply "get_sentence_vector()" for each article |
| Prepare train/test split | Use "train_test_split" to create 80/20 split |
| Tokenize articles | Convert raw text into integer sequences |
| Pad sequences | Normalize input length to fixed size for deep models |
| Build embedding matrix | Map FastText word vectors to vocabulary tokens |

4. Proposed Model Architectures and Performance Optimization

4.1 Model Hyperparameters

Table 4 presents the detailed hyperparameter configurations for the LSTM, BiLSTM, and CNN models. All models use pretrained 300-dimensional FastText embeddings with a vocabulary size of 10,000 and a maximum sequence length of 200 tokens. The embedding layer is trainable in all architectures.
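The embedding setup described above (a fixed vocabulary whose rows are filled from pretrained vectors) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: `build_embedding_matrix`, the toy vocabulary, and the 3-dimensional vectors are hypothetical, and a real pipeline would use 300-dimensional vectors and an array library. Index 0 is treated as the padding row, following the Keras Tokenizer convention of starting word indices at 1.

```python
def build_embedding_matrix(word_index, pretrained, dim, vocab_size):
    """Fill a vocab_size x dim matrix from a word -> vector lookup.

    Words without a pretrained vector keep an all-zero row and are
    counted as out-of-vocabulary (OOV). Row 0 is left for padding.
    """
    matrix = [[0.0] * dim for _ in range(vocab_size)]
    kept, missing = 0, 0
    for word, idx in word_index.items():
        if idx >= vocab_size:
            continue  # outside the retained vocabulary
        kept += 1
        vector = pretrained.get(word)
        if vector is None:
            missing += 1  # no pretrained vector: row stays zero
        else:
            matrix[idx] = list(vector)
    oov_rate = missing / kept if kept else 0.0
    return matrix, oov_rate


# Toy inputs (hypothetical words and vectors, for illustration only).
toy_word_index = {"news": 1, "report": 2, "claim": 3, "zzqx": 4}
toy_pretrained = {
    "news": [0.1, 0.2, 0.3],
    "report": [0.0, 0.5, 0.1],
    "claim": [0.4, 0.4, 0.0],
}
matrix, oov_rate = build_embedding_matrix(
    toy_word_index, toy_pretrained, dim=3, vocab_size=5
)
print(oov_rate)
```

Counting the rows left at zero in this way is one simple method for estimating an OOV rate; words with zero rows give the downstream model no information until the embedding layer is updated during training.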
Table 4 Experimental Configuration of Deep Learning Models

| Model | Architecture | Dropout | Optimizer | Batch Size | Epochs |
|-------|--------------|---------|-----------|------------|--------|
| LSTM | LSTM (128) → Dense (32) → Dropout (0.2) → Dense (1) | 0.2 | Adam | 64 | 5 |
| BiLSTM | BiLSTM (64) → Dense (16) → Dropout (0.3) → Dense (1) | 0.2 (recurrent) + 0.3 (dense) | Adam | 64 | 5 |
| CNN | Conv1D (64) → MaxPool1D → GlobalMaxPool1D → Dense (16) → Dropout (0.3) → Dense (1) | 0.3 | Adam | 64 | 5 |

In this work, several DL models were investigated to improve both classification accuracy and computational speed when detecting fake news in multilingual datasets. The experiments contrasted different combinations of embedding techniques and neural network models (LSTM, BiLSTM, and CNN), with both pretrained FastText embeddings and non-pretrained variants.

4.1.1 LSTM with Pretrained FastText Embedding

The first model used pretrained FastText embeddings in an LSTM with the following configuration: an input layer followed by an LSTM layer with 128 units, a dropout layer with a rate of 0.5, and a dense output layer with sigmoid activation. The model was trained using the Adam optimizer with a learning rate of 0.001. Despite the conceptual benefits of semantically dense pretrained embeddings, this model reached only 49.98% accuracy for Kurdish and 49.66% for English. The zero precision and recall values for Kurdish indicate that the model predicted only one class (the majority class) for all test samples, likely because static embeddings cannot capture domain-specific linguistic patterns. The lack of improvement also suggests that FastText embeddings trained independently and used as fixed vectors may not fit the domain semantics of the fake news corpus well. In addition, the fixed sequence length and potentially sparse vector representations could have led the model to underfit.
Table 5 shows the results of the LSTM model with pretrained FastText embeddings. The LSTM model without the embedding layer, trained directly on the sentence vectors produced by FastText, performed much better, with an accuracy of 97.24% on Kurdish and 96.85% on English. This architecture omits the embedding layer and passes the dense FastText sentence vectors directly to the LSTM. Table 6 shows the performance of the LSTM model without embedding layers. By bypassing the embedding matrix and relying on FastText for sentence-level vectorization, the model received contextually aware, semantically complete input representations, which greatly improved convergence and classification performance.

4.1.2 Bidirectional LSTM (BiLSTM)

Likewise, the BiLSTM model has a 128-unit bidirectional LSTM layer, dropout, and a dense output layer. The BiLSTM was used to exploit context in both the forward and backward directions. However, combined with pretrained FastText embeddings, the model did not improve (Kurdish accuracy: 50.02%; English accuracy: 49.66%), presumably because the embeddings do not align with the dataset. Table 7 shows the results of the BiLSTM model using pretrained embeddings. In contrast, the BiLSTM without an embedding layer achieved 97.37% accuracy on Kurdish and 97.01% on English, demonstrating that trainable representations derived directly from input vectors allowed the model to fit the structure and meaning of the data more effectively. This model combined dropout regularization with a smaller dense layer to mitigate overfitting. Table 8 presents the results of the BiLSTM model without embedding layers.

4.1.3 Convolutional Neural Networks (CNN)

The CNN model consists of a convolutional layer with 128 filters and a kernel size of 5, followed by a max-pooling layer, dropout (rate = 0.5), and a fully connected dense layer. CNN-based models were explored to capture local n-gram features through 1D convolutions.
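To make the notion of "local n-gram features through 1D convolutions" concrete, the sketch below applies a single convolutional filter to a short sequence of token vectors and then takes the global maximum. It is a didactic illustration only: the function names, the width-2 kernel, and the toy 2-dimensional vectors are assumptions chosen for readability, whereas the models in this study use many filters inside a deep learning framework.

```python
def conv1d_feature_map(sequence, kernel, bias=0.0):
    """Slide one filter over a sequence of token vectors.

    `sequence` is a list of token vectors and `kernel` is a list of k
    weight vectors of the same width; each output value scores one
    window of k consecutive tokens (ReLU applied).
    """
    k = len(kernel)
    outputs = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        total = bias + sum(
            x * w
            for token, weights in zip(window, kernel)
            for x, w in zip(token, weights)
        )
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


def global_max_pool(feature_map):
    """Keep the strongest response, regardless of its position."""
    return max(feature_map, default=0.0)


# Toy sequence of four 2-dimensional token vectors and one width-2 filter.
toy_sequence = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_kernel = [[1, 0], [0, 1]]
feature_map = conv1d_feature_map(toy_sequence, toy_kernel)
pooled = global_max_pool(feature_map)
print(feature_map, pooled)
```

Because the pooled value is the same wherever the best-matching window occurs, the filter behaves like a position-invariant detector for one local pattern.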
In combination with pretrained FastText embeddings, performance saturated at 49.98% accuracy for Kurdish and 49.66% for English, mirroring the limitation seen with static embeddings in the other models. The CNN model's results using pretrained embeddings are detailed in Table 9. An embedding-free version of the CNN, fed directly with reshaped sentence vectors, performed much better, reaching 97.40% accuracy on Kurdish and 96.69% on English. This shows that CNNs can learn discriminative features from sentence-level vectors without word representations trained on massive text collections. The performance of the CNN model without embedding layers is shown in Table 10.

Models without an embedding layer outperformed models with embedding layers in every experimental setting. Using the FastText sentence vectors directly improves classification performance. In addition, small models such as LSTM and CNN converge quickly and generalize better without the overhead of pretrained embeddings. The findings also highlight that feature-representation strategies need to be adapted to the data to be used successfully, particularly in multilingual or domain-specific settings.

4.2. Evaluation Strategy

The data were divided into training and test sets in an 80:20 ratio. For robustness, 5-fold cross-validation was also used. Accuracy, precision, recall, and F1-score were used to assess model performance. To give more insight into model behavior, confusion matrices and ROC curves were plotted, as displayed in Figs. 4 and 5.

5. Discussion and Results

This section compares the performance of the LSTM, BiLSTM, and CNN models for detecting fake news in Kurdish and English on the KDFND dataset, with particular attention to the influence of FastText-based representations.
The findings indicate a significant performance difference between the embedding-based and sentence-level representation approaches, with considerable implications for low-resource language processing. Models based on pretrained FastText embeddings did not perform well, with almost random accuracy (~50%) on both Kurdish and English. Table 5, Table 7, and Table 9 show the detailed results of the LSTM, BiLSTM, and CNN models that included embedding layers. Conversely, models trained directly on sentence-level FastText vectors scored much higher than the embedding-based models. Table 6 shows that the LSTM model was 97.24% accurate on Kurdish and 96.85% accurate on English. Similarly, the BiLSTM model reached 97.37% and 97.01% accuracy on Kurdish and English respectively (Table 8), and the CNN model reached 97.40% and 96.69% (Table 10). This large improvement suggests that sentence-level representations better reflect the semantic and contextual patterns that are useful in detecting fake news.
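The near-random rows reported for the embedding-based models have a characteristic signature: precision equal to accuracy with recall of 1, or precision, recall, and F1 all equal to 0. Both patterns arise when a classifier always predicts a single class. The sketch below works through the arithmetic; the confusion-matrix counts are hypothetical, chosen only so that the class proportions match the reported accuracies, and `binary_metrics` is an illustrative helper rather than the authors' evaluation code.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive (fake) class.

    Undefined ratios (zero denominators) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return accuracy, precision, recall, f1


# Hypothetical test set of 10,000 articles, 49.66% of them fake,
# with a model that labels every article as fake.
always_fake = binary_metrics(tp=4966, fp=5034, fn=0, tn=0)

# Hypothetical test set of 10,000 articles, 49.98% of them real,
# with a model that labels every article as real.
always_real = binary_metrics(tp=0, fp=0, fn=5002, tn=4998)

print([round(v, 4) for v in always_fake])
print([round(v, 4) for v in always_real])
```

Under these assumptions the first case reproduces the pattern 0.4966 / 0.4966 / 1.0 / 0.6636 and the second the pattern 0.4998 / 0 / 0 / 0, which is consistent with reading those results as single-class predictions rather than as weak but informative classifiers.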
Table 5 LSTM with pretrained embedding layer results

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.4998 | 0.0000 | 0.0000 | 0.0000 |
| English | 0.4966 | 0.4966 | 1.000 | 0.6636 |

Table 6 LSTM without an Embedding Layer

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.9724 | 0.9766 | 0.9680 | 0.9723 |
| English | 0.9685 | 0.9702 | 0.9663 | 0.9683 |

Table 7 BiLSTM with pretrained embedding layer results

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.5002 | 0.5002 | 1.0000 | 0.6669 |
| English | 0.4966 | 0.4966 | 1.000 | 0.6636 |

Table 8 BiLSTM without an Embedding Layer

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.9737 | 0.9717 | 0.9759 | 0.9738 |
| English | 0.9701 | 0.9691 | 0.9707 | 0.9699 |

Table 9 CNN with pretrained embedding layer results

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.4998 | 0.0000 | 0.0000 | 0.0000 |
| English | 0.4966 | 0.4966 | 1.000 | 0.6636 |

Table 10 CNN without an Embedding Layer

| Languages | Accuracy | Precision | Recall | F1-Score |
|-----------|----------|-----------|--------|----------|
| Kurdish | 0.9740 | 0.9690 | 0.9793 | 0.9741 |
| English | 0.9669 | 0.9630 | 0.9707 | 0.9668 |

5.1 Key Findings

1. Impact of FastText Embeddings

Models built on pretrained FastText embeddings fared poorly, with approximately 50% accuracy on both English and Kurdish, which is no better than random guessing. This indicates that fixed embeddings do not capture the subtle linguistic and contextual patterns that characterize fake news, especially in low-resource languages. By contrast, models trained on dynamically learned sentence-level FastText vectors (without embedding layers) did significantly better, reaching about 97% accuracy on Kurdish and about 96% on English. These findings emphasize the importance of domain-adapted feature representations. There is a wide gap between the general-domain corpora (e.g., Wikipedia, Common Crawl) on which pretrained embeddings are trained and the social media and news style, vocabulary, and neologisms of KDFND.
This mismatch limits the ability of pretrained embeddings to capture misleading patterns and contextual peculiarities. Another pressing problem is the presence of out-of-vocabulary (OOV) words, especially in Kurdish. Comparing the tokens in our vocabulary with the pretrained models shows an OOV rate of around 18% for Kurdish and around 12% for English. This means that important domain-specific words may receive inadequate or no vector representations, which harms model learning. Finally, such embeddings are static: they cannot adapt or disambiguate word meaning by context during training. Sentence-level representations, on the other hand, let models learn task-specific semantics, which yields much better results.

2. Model Architecture Comparisons

LSTM and BiLSTM: Both models perform best when trained on sentence-level vectors. BiLSTM slightly outperforms LSTM (97.37% vs. 97.24% on Kurdish), likely because it learns context in both directions.

CNN: Achieved similar performance (97.40% for Kurdish) by effectively capturing local n-gram features, illustrating its suitability for text classification tasks.

3. Cross-Lingual Performance

The performance differences between English and Kurdish were negligible (less than 1%), which indicates that the proposed methods are robust across the two languages. This result challenges the common belief that low-resource languages inherently limit model performance.

4. Computational Efficiency

The models without an embedding layer were also faster to train and consumed fewer resources, because they avoid computations on large embedding matrices. This makes them particularly suitable for resource-limited settings.

5.2. Comparison with Previous Research

Table 11 Comparative performance on the Kurdish fake news detection task compared with past studies
| Study | Model | Language | Accuracy |
|-------|-------|----------|----------|
| Salh and Nabi [4] | SVM + TF-IDF | Kurdish | ~92% |
| Ahmed [11] | Hard Voting (SVM, NB, DT) | Kurdish | ~93% |
| Karim and Hassan [1] | LSTM + FastText | Kurdish | ~95% |
| This Study | CNN (no embedding) | Kurdish | 97.40% |
| This Study | BiLSTM (no embedding) | Kurdish | 97.37% |

The results in Table 11 run contrary to previous research that reported high performance with hybrid embedding and deep learning methods [5], [7]. This study shows that pretrained embeddings do not generalize to domain-specific tasks such as Kurdish fake news detection, and it supports the reservations of Daneshfar [9] regarding the efficiency of embeddings in low-resource settings. In addition, our results with smaller models support the recommendation of Salh and Nabi [4] about non-standardized solutions in Kurdish NLP.

5.3 Limitations

- Single train-test split: Results are based on a single 80/20 split. Future work should employ k-fold cross-validation to ensure robustness.
- Single dataset: Evaluation is limited to KDFND. External validation on additional Kurdish datasets is needed.
- Statistical significance: The reported results do not include confidence intervals. Runs should be repeated with various random seeds.

Future work should include tests of statistical significance and multiple runs of each experiment to confirm reproducibility.

5.4. Implications

The strong performance of sentence-level FastText vectors points to the potential of lightweight, flexible feature-extraction approaches for low-resource languages. Possible future directions are:

- Multimodal integration (e.g., images, metadata), reflecting progress made for high-resource languages [2], [20].
- Hybrid models combining the strengths of CNNs and BiLSTMs to improve feature extraction.
- Explainability methods (e.g., SHAP, LIME) [5] to enhance the transparency of the model for users.

6. Conclusion

The spread of fake news calls for effective detection mechanisms, particularly for low-resource languages that lack adequate linguistic tools. This research explored the effectiveness of LSTM, BiLSTM, and CNN models for fake news detection in Kurdish and English using the KDFND dataset and FastText embeddings. Our main result is that models trained with sentence-level FastText vectors (without embedding layers) perform close to the state of the art (~97% for Kurdish, ~96% for English), whereas models with pretrained embeddings perform far worse (~50% accuracy). This implies that static embeddings do not generalize well to domain-specific low-resource language tasks, while dynamically trained representations adapt better. The performance of the BiLSTM and CNN models further highlights the value of context-aware, computationally feasible feature extraction for fake news detection. The findings align with the existing body of knowledge on the limitations of embedding-based methods in low-resource NLP and propose a potentially productive alternative based on sentence vectorization. Future studies should consider multimodal and explainable AI [5] as well as larger, dialectally diverse datasets to improve model robustness. Lastly, this paper provides a practical roadmap for improving fake news detection in low-resource languages by advocating task-oriented optimization instead of task-agnostic pretrained embeddings. Our approach paves the way for more equitable and scalable solutions to the world's misinformation crisis by narrowing the gap between high-resource and low-resource NLP.

6.1 Key Takeaways:

- Sentence-level FastText vectors outperform pretrained embeddings in low-resource fake news detection.
- BiLSTM and CNN models reach ~97% accuracy for Kurdish, demonstrating their efficiency.
- Static embeddings can fail to generalize, which argues for flexible feature extraction.
Future work involves multimodal integration and explainability improvements.

This study pushes the boundary of NLP for low-resource languages and provides practical recommendations for building multilingual fake news detection systems.

Declarations

Author Contribution
Azad M. Karim: Conceptualization, methodology, software implementation, formal analysis, investigation, writing (original draft preparation), visualization, project administration.
Bryar A. Hassan: Validation, data curation, resources, writing (review and editing), supervision.

References

1. Karim AM, Hassan BA (2025) A Hybrid AI Model for Fake News Detection: Leveraging FastText and LSTM for Kurdish and English. cjnst 0(1):1–11. https://doi.org/10.31530/cjnst.2025.1.1
2. Nasser M et al (2024) A systematic review of multimodal fake news detection on social media using deep learning models. Results Eng 26(December):104752, 2025. https://doi.org/10.1016/j.rineng.2025.104752
3. Maham S, Tariq A, Khan MUG, Alamri FS, Rehman A, Saba T (2024) ANN: adversarial news net for robust fake news classification. Sci Rep 14(1):1–20. https://doi.org/10.1038/s41598-024-56567-4
4. Salh DA, Nabi RM (2023) Kurdish Fake News Detection Based on Machine Learning Approaches. Passer J Basic Appl Sci 5(2):262–271. https://doi.org/10.24271/PSR.2023.380132.1226
5. Hashmi E, Yayilgan SY, Yamin MM, Ali S, Abomhara M (2024) Advancing Fake News Detection: Hybrid Deep Learning With FastText and Explainable AI. IEEE Access 12(March):44462–44480. https://doi.org/10.1109/ACCESS.2024.3381038
6. Yan K (2024) Optimizing an English text reading recommendation model by integrating collaborative filtering algorithm and FastText classification method. Heliyon 10(9):e30413. https://doi.org/10.1016/j.heliyon.2024.e30413
7. Almandouh ME, Alrahmawy MF, Eisa M, Elhoseny M, Tolba AS (2024) Ensemble based high performance deep learning models for fake news detection. Sci Rep 14(1):26591. https://doi.org/10.1038/s41598-024-76286-0
8. Armeen I, Niswanger R, Tian C (2024) Combating Fake News Using Implementation Intentions. Inf Syst Front (June). https://doi.org/10.1007/s10796-024-10502-0
9. Daneshfar F (2024) Enhancing Low-Resource Sentiment Analysis: A Transfer Learning Approach. Passer J Basic Appl Sci 6(2):265–274. https://doi.org/10.24271/PSR.2024.440793.1484
10. Bussa S, Bodhankar A, Patil VH, Pal H, Bunkar SK, Qureshi ARK (2023) An Implementation of Machine Learning Algorithm for Fake News Detection. Int J Recent Innov Trends Comput Commun 11(March):392–401. https://doi.org/10.17762/ijritcc.v11i9s.7435
11. San Ahmed RAM (2023) Hard Voting Approach using SVM, Naïve Bays and Decision Tree for Kurdish Fake News Detection. Iraqi J Comput Sci Math 4(3):25–33. https://doi.org/10.52866/ijcsm.2023.02.03.003
12. Abdul H, Al H, Abdul H, Al H, Jabardi M (2024) Detecting Fake News Using Machine Learning: A Comparative Study of Techniques. J Kufa Math Comput 11:113–120
13. Camelia TS, Fahim FR (2024) A Regularized LSTM Method for Detecting Fake News Articles. IEEE, pp 1–2
14. Tama FR, Sibaroni Y (2023) Fake News (Hoaxes) Detection on Twitter Social Media Content Through Convolutional Neural Network (CNN) Method. JINAV 4(1)
15. Bhadana J, Kouritzin MA, Park S, Zhang I (2024) Markov Processes for Enhanced Deepfake Generation and Detection. arXiv, pp 1–17
16. Science I, Don E, Engineering S, Bosco D, Science SI (2024) Deceptive Content Detection Using Machine Learning. IJSREM 1–5. https://doi.org/10.55041/IJSREM34830
17. Arowolo MO, Misra S, Ogundokun RO (2023) A Machine Learning Technique for Detection of Social Media Fake News. IJSWIS 19(1):1–25. https://doi.org/10.4018/IJSWIS.326120
18. Verma A et al (2025) ScrutNet: a deep ensemble network for detecting fake news in online text. Soc Netw Anal Min 15(1). https://doi.org/10.1007/s13278-025-01412-3
19. Abubakr Salh D, Nabi R (2022) Mendeley Data V1. https://doi.org/10.17632/3zx9vpw3wh.1.
Kurdish Dataset for Fake News Detection (KDFND) Nguyen D, Nguyen TT, Nguyen CV (2025) Fake advertisements detection using automated multimodal learning: a case study for Vietnamese real estate data. Appl Intell 55(6). 10.1007/s10489-025-06238-2 FastText crawl-vectors @ fasttext.cc. [Online]. Available: https://fasttext.cc/docs/en/crawl-vectors.html Additional Declarations No competing interests reported. Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9409057","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":622622179,"identity":"f66dd87c-517f-43f2-8320-1a766228ed7b","order_by":0,"name":"Azad M. 
Karim¹","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA8UlEQVRIiWNgGAWjYJACCTDJfIDxAWMDiJUARgS1SDCwJTAbkKyFTQKuBR/g7z/88AZjTl0dfxvzs4qfOw4z8LPnGDA8KMNjw4FjxhaM2w5LSBxjM7vZe+Ywg2TPGwOGhHN4rDnYYCbBuO2ABMP9BrPbjG2HGQxuAG1JbMOtQ/4w+zegljoJ+WPs34pBWuwJaTE4xgOyhVkCxGAG2yJBQIvhGZ5ii8RthyU3HuMpluxtS+eROPOs4AA+v8idP77xxsdtdfxyx9g3fvjZZi3H35688eEPPCEGBglIbB4QcYCBjYAWLIAMLaNgFIyCUTBsAQCO7k/nrIwsigAAAABJRU5ErkJggg==","orcid":"","institution":"Charmo University","correspondingAuthor":true,"prefix":"","firstName":"Azad","middleName":"M.","lastName":"Karim¹","suffix":""},{"id":622622183,"identity":"1e1237cb-da2f-44df-8abf-07ee76693a98","order_by":1,"name":"Bryar A.Hassn","email":"","orcid":"","institution":"Charmo University","correspondingAuthor":false,"prefix":"","firstName":"Bryar","middleName":"","lastName":"A.Hassn","suffix":""}],"badges":[],"createdAt":"2026-04-14 01:08:29","currentVersionCode":1,"declarations":{"humanSubjects":false,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":false,"humanSubjectConsent":false,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-9409057/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9409057/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":106937774,"identity":"c702709c-c90e-4936-93ff-8c0afe569133","added_by":"auto","created_at":"2026-04-15 03:56:06","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":145741,"visible":true,"origin":"","legend":"\u003cp\u003eDistribution of Text Length for True 
News.\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/f561475b354e8e09b9b62522.png"},{"id":106937796,"identity":"2ef70e52-aef7-4f74-9506-262c767df03b","added_by":"auto","created_at":"2026-04-15 03:56:09","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":139415,"visible":true,"origin":"","legend":"\u003cp\u003eDistribution of Text Length for Fake News\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/6298420643425b819b7f2790.png"},{"id":106937827,"identity":"34fc42da-d99d-4a6f-b3ea-0ac35537bbc1","added_by":"auto","created_at":"2026-04-15 03:56:11","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":253184,"visible":true,"origin":"","legend":"\u003cp\u003eArchitecture of the proposed fake news detection framework\u003c/p\u003e","description":"","filename":"image3.png","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/245584134c4b08d6955c136d.png"},{"id":106937791,"identity":"b91722d6-8e48-4309-8bd1-e25f3db29f3d","added_by":"auto","created_at":"2026-04-15 03:56:09","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":134860,"visible":true,"origin":"","legend":"\u003cp\u003eConfusion matrix illustrating classification performance\u003c/p\u003e","description":"","filename":"image4.png","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/daa64ff2ab670818f2ce9208.png"},{"id":106937777,"identity":"e056d190-a27c-4ff4-9a8d-683f53780d11","added_by":"auto","created_at":"2026-04-15 03:56:06","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":110486,"visible":true,"origin":"","legend":"\u003cp\u003eROC curve illustrating classification performance (AUC = 
0.990)\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/0f22b356ab048289dc265b3e.png"},{"id":107707994,"identity":"339af50f-0a2d-425f-81a9-953af0244146","added_by":"auto","created_at":"2026-04-24 09:21:35","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1149229,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9409057/v1/62b00050-4cff-44e1-8d29-291b7ae65d06.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Optimizing Fake News Detection in Low-Resource Languages: A Comparative Study of Deep Learning Models Using Sentence-Level FastText Vectors in Kurdish and English","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eThe spread of false information on social media is a big worry because it breaks people's trust and affects their choices. It's really tough to identify false information in languages with fewer resources, like Kurdish, because there aren't enough datasets and natural language processing (NLP) tools available. Some DL models, such as LSTM, BiLSTM, and CNN, have been effective in identifying misinformation in resource-rich languages; however, their efficacy in resource-poor languages remains largely unexamined.\u003c/p\u003e \u003cp\u003eRecent studies have used multimodal DL methods to better detect fake information [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e] and have employed adversarial training to make models stronger [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. However, there is a difficulty in fine-tuning the models for languages with limited linguistic resources. 
The Kurdish Fake News Detection (KDFND) dataset [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e] is a valuable resource in addressing this problem, but few studies have successfully leveraged it. FastText, a word embedding technique, has been widely used for text preprocessing [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], yet it remains less used for Kurdish fake news detection.\u003c/p\u003e \u003cp\u003eThe present work explores the performance of LSTM, BiLSTM, and CNN models for detecting fake news in the Kurdish and English languages based on the KDFND dataset and FastText embeddings. It aims to improve model effectiveness in low-resource settings while maintaining computational efficiency. Previous studies have tried using ensemble techniques [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e], explainable AI [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], and hybrid models, but a comprehensive comparison of these DL models specifically for the Kurdish language yet.\u003c/p\u003e \u003cp\u003eTo achieve these gaps, this study bridges these gaps by\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eThis paper compares the performance of LSTM, BiLSTM, and CNN models for English and Kurdish fake news detection on the KDFND dataset.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eThey are comparing their performance to English-language standards for assessing cross-lingual generalizability.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eThey are implementing enhancements to boost precision and effectiveness in environments with limited resources.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e 
\u003cp\u003eBuilding on earlier studies [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], this paper contributes to the advancement of fake news detection techniques for low-resource languages and discusses transferable strategies for multilingual applications.\u003c/p\u003e"},{"header":"2. Related work","content":"\u003cp\u003eNumerous research works attempted to detect misinformation with ML and DL. The early works about classification, like fake news detection [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], have extensively used classical models like Support Vector Machines (SVM), Decision Trees (DT), and Na\u0026iuml;ve Bayes (NB). However, these models are still unable to capture the deeper linguistic and contextual features of deception.\u003c/p\u003e \u003cp\u003eRecent advances have led to the adaption of DL models, which show better performance in natural language processing (NLP) tasks. The Models such as LSTM and BiLSTM are particularly effective due to their ability to learn long-term dependencies in textual data [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. 
Similarly, CNNs have been shown to position-invariant local features from text data [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eHybrid approaches that combine embeddings such as FastText or Word2Vec with neural networks has also improved classification performance [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. According to Nasser et al. [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], multimodal approaches enable the incorporation of text with other modalities (such as images and metadata) to assist with detection. More recent developments in deepfake generation and misinformation detection [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e] have further increased interest in domain-adapted and multimodal approaches.\u003c/p\u003e \u003cp\u003eKarim et al. [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e], Salh and Nabi [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e], and Ahmed [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e] have conducted extensive research on Kurdish by constructing datasets and employing classical machine learning models. DL approaches to fake news detection for Kurdish are still scarce. This calls for robust architectures for low-resource languages. 
The summary of prior research efforts is presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e\u003cstrong\u003eTable 1:\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eSummary of Previous Works on Fake News Detection\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eRef.\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eLanguage\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eModel\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eEmbedding\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eDataset\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eKey Contribution\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e[1]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKurdish \u0026amp; English\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003eFastText + LSTM\u003cp\u003e\u003cbr\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eFastText\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKDFND\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eProposed a hybrid AI model combining FastText and LSTM for Kurdish and English fake news detection\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003e\u003cstrong\u003e[4]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKurdish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eSVM, DT, NB, CNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eTF-IDF\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKDFND\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eIntroduced ML-based Kurdish fake news detection\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e[5]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnglish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eCNN + FastText + XAI\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eFastText\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eTwitter, News\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHybrid DL with explainability\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e[7]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnglish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnsemble CNN/LSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eWord2Vec\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eLIAR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eImproved accuracy via ensemble DL\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e[13]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003eEnglish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eRegularized LSTM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eWord embeddings\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNews articles\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnhanced LSTM robustness\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e[11]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKurdish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eHard Voting (SVM, NB, DT)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eTF-IDF\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKDFND\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnsemble voting approach for Kurdish\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e[18]\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnglish\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eScrutNet (Deep Ensemble)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eGloVe\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eKaggle dataset\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eEnsemble learning with attention mechanism\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e"},{"header":"3. The proposed model and experimental setup","content":"\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e3.1. 
Dataset Overview\u003c/h2\u003e \u003cp\u003eThe dataset employed in the present study is the Kurdish Dataset for Fake News Detection (KDFND) [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. The said dataset contains articles that are annotated as either real or fake and hence can be used for binary classification tasks of fake news detection. The data was gathered in various internet resources and the data was then filtered and cleaned off initially to remove the irrelevant and poor data. The important information about the data can be summarized as follows:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eNumber of Entries\u003c/b\u003e: 100,966\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eKey Columns\u003c/b\u003e:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eID\u003c/b\u003e: Unique identifier for each article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eText\u003c/b\u003e: The original text of the article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eText_Translate_to_English\u003c/b\u003e: The English translation of the text.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eURL\u003c/b\u003e: The source URL of the article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eDate\u003c/b\u003e: The publication date of the article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eSource\u003c/b\u003e: The source platform of the article.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eLabel\u003c/b\u003e: Binary label indicating whether the news is \"Real\" (0) or \"Fake\" (1).\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eThe dataset also contains labels determining whether the news is real or not. 
Both the Kurdish and English texts share the same labels, so we could compare them fairly. We did the usual preprocessing\u0026mdash;normalization, tokenization, padding\u0026mdash;and split the data separately for each language to keep things consistent.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e3.2. Dataset Cleaning and Preprocessing\u003c/h2\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003e3.2.1. Handling Missing Data\u003c/h2\u003e \u003cp\u003eThe missing values were in a small part of the dataset, namely, the \"Text\" column of Kurdish and the column of Text Translate to English of English. The text columns had about 2.3% missing values, which were eliminated in preprocessing.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section3\"\u003e \u003ch2\u003e3.2.2. Feature Selection\u003c/h2\u003e \u003cp\u003eWe removed some of the columns which we thought were not important such as ID, URL, Date and Source in order to concentrate on the important textual features. The analysis is mainly based on the column with the text or the translation to English which was renamed to the article and the label column.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section3\"\u003e \u003ch2\u003e3.2.3. Balancing the Dataset\u003c/h2\u003e \u003cp\u003eThere was a minor skew in the dataset, as the number of real news articles was much larger than the number of fake news. The dataset was balanced by an undersampling method. The majority class (real news) was randomly undersampled to equal the minority class (fake news) and the balanced dataset was mixed to reduce the possibility of ordering bias during training, as recommended in [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The strategy serves to avoid favoring the majority group and still has enough data to conduct meaningful deep learning training. 
Though the method of undersampling decreases the dataset size slightly, it does not change the distribution of linguistic features needed to train a model, and the resulting dataset is large enough to be used by DL models (50,210 samples per class). The last dataset consists of 50,210 samples each of the categories.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eBalanced class distribution\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLabel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOriginal Dataset\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eundersampling\u0026nbsp;Count\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e0 (Real)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e50751\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e50210\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1 (Fake)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e50210\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e \u003cp\u003e50210\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe initial sample was slightly unbalanced and included 100,961 articles that had slight class imbalance (Real: 50,751, Fake: 50,210; ratio 1.01:1). To obtain an utter balance, random undersampling was used to eliminate 541 samples in the majority class (0.5 percent of the total data). This option has been selected because of its insignificant loss of data and calculational efficiency. The final distribution is seen in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section3\"\u003e \u003ch2\u003e3.2.4. Text Preprocessing\u003c/h2\u003e \u003cp\u003ePreprocessing of text is significant in making text data ready to be used in ML. The following processes were used:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eRemoved URLs, mentions, hashtags, punctuation, and numbers.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eConverted everything to lowercase.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eTokenized and removed stopwords (both Kurdish and English).\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003eIn the case of Kurdish text, further normalization measures were taken to deal with spelling and character encoding variation. To be able to compare the two languages, these preprocessing steps were used in both languages.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003e3.2.5. 
Exploratory Data Analysis (EDA)\u003c/h2\u003e \u003cp\u003eTo gain a clearer picture of the dataset EDA was conducted to study the distribution of text lengths in real and fake news articles. Distribution plots were used to plot the average word length per article as indicated in Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e and \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. They provide us with a very approximate picture of how the style of writing may vary between the two classes.\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eReal News Articles\u003c/b\u003e:\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003e \u003cb\u003eFake News Articles\u003c/b\u003e:\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e3.3. Feature Extraction\u003c/h2\u003e \u003cdiv id=\"Sec12\" class=\"Section3\"\u003e \u003ch2\u003e3.3.1. Feature Representation\u003c/h2\u003e \u003cp\u003eThis study investigated two strategies of representation of features. The former approach applies pretrained FastText embeddings to initialize embedding layers. The second strategy employs sentence level FastText vectors which have been trained through a supervised training model. FastText was chosen because it has the capability of capturing subword information, as it is particularly useful in low-resource languages, like Kurdish, where morphological variation is typical.\u003c/p\u003e \u003cp\u003eThe entire pipeline is presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. In case of sentence level vectors, we simply averaged the word embeddings. That makes it easy and quick. 
The sentence-level representation method has a document representation as a dense vector, which is an average of word embeddings produced by FastText. The representation minimizes the dimensionality and computational expense and maintains semantic information.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003ch2\u003e3.3.2. FastText Embeddings\u003c/h2\u003e \u003cp\u003eIn the present work, we employed FastText embeddings to represent textual data harvested from the KDFND dataset, comprising both Kurdish and their English-translated news articles [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Which is a representation of each word as a bag of character n-grams, making it capable of capturing \u0026ldquo;subword\u0026rdquo; information and dealing with out-of-vocabulary words efficiently, a key benefit for low-resource languages like Kurdish [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. For the models utilizing pretrained embeddings, we used the publicly available 300-dimensional FastText vectors trained on Common Crawl for English, following the approach in [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], and the \u003cb\u003ec\u003c/b\u003eorresponding Kurdish vectors from [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eTo generate sentence-level representations, we trained a supervised FastText model on labeled examples of fake and real news by utilizing the \"\u003cem\u003efasttext.train_supervised\u003c/em\u003e\" function. 
The supervised FastText model was trained using the following hyperparameters:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eLearning Rate: 0.5\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eEpochs: 25\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eWord n-grams: 2\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eEmbedding Dimension: 300\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eThe training process yielded a vocabulary of 69,428 words along with two categorical labels. After training, a sentence vector was generated for each article in the dataset using the \"\u003cem\u003eget_sentence_vector()\u003c/em\u003e\" function, which averages the word embeddings contained in a sentence. This method has been shown to be effective in handling noisy text [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eA custom helper function (\"fasttext_vector\") wrapped this step so that preprocessing is applied consistently and edge cases such as missing or empty input are handled. The resulting FastText feature matrix \"X_fasttext\" was used as the feature set for classification.\u003c/p\u003e \u003cp\u003eAlthough FastText is generally useful in low-resource and multilingual contexts [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], the experimental findings in this paper show a major drop in performance when pretrained embeddings are used instead of sentence-level representations. 
The result suggests that, despite the theoretical advantages of FastText, default (pretrained) representations are unable to capture contextual nuances in Kurdish fake news, an observation consistent with Daneshfar [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] on the suitability of embeddings for low-resource sentiment analysis.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section3\"\u003e \u003ch2\u003e3.3.3. Tokenization and Padding\u003c/h2\u003e \u003cp\u003eBesides sentence-level representations, we also tested token-level representations. The raw news articles were tokenized and transformed into padded sequences with the Keras Tokenizer and pad_sequences. A 10,000-word vocabulary was kept, and sequences were padded or truncated to 200 tokens so that all inputs have the same dimensions.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section3\"\u003e \u003ch2\u003e3.3.4. Embedding Matrix\u003c/h2\u003e \u003cp\u003eA FastText-based embedding matrix was built over this vocabulary by loading the FastText vector of each word in the tokenized vocabulary. It was used to initialize the embedding layer of the LSTM, BiLSTM, and CNN models.\u003c/p\u003e \u003cp\u003eThe observed performance gap between pretrained embeddings (approximately 50% accuracy) and sentence-level vectors (approximately 97% accuracy) can be attributed to the following factors:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eDomain Mismatch: Embeddings trained on general corpora (Wikipedia, Common Crawl) may fail to capture the linguistic patterns of fake news.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eOut-of-Vocabulary (OOV) Rate: 18% for Kurdish and 12% for English, which indicates the use of domain-specific terms that the pretrained models do not cover.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eStatic vs. 
Dynamic Representations: The sentence-level FastText vectors learn task-specific semantics during supervised training, whereas the pretrained vectors are fixed.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003e3.4. Dataset Splitting\u003c/h2\u003e \u003cp\u003eThe processed data was divided into training and test sets in an 80:20 ratio. Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows an overview of the FastText-based data preparation process. The split was performed randomly to keep both classes represented in each subset.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eFastText-Based Data Preparation\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStep\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePurpose\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTrain FastText model\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSupervised training on labeled dataset to learn domain-specific embeddings\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGenerate sentence vectors\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003eApply \u0026ldquo;get_sentence_vector()\u0026rdquo; for each article\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePrepare train/test split\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eUse \u0026ldquo;train_test_split\u0026rdquo; to create 80/20 split\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTokenize articles\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eConvert raw text into integer sequences\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePad sequences\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNormalize input length to fixed size for deep models\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBuild embedding matrix\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMap FastText word vectors to vocabulary tokens\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"4. Proposed Model Architectures and Performance Optimization","content":"\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Model Hyperparameters\u003c/h2\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e presents the detailed hyperparameter configurations for the (LSTM, BiLSTM, and CNN) models. All models utilize pretrained FastText (300-dimensional) embeddings with a vocabulary size of 10,000 and a maximum sequence length of 200 tokens. 
The embedding layer is trainable across all architectures.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eExperimental Configuration of Deep Learning Models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eArchitecture\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eDropout\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eOptimizer\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eBatch Size\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eEpochs\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLSTM\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLSTM (128) \u0026rarr; Dense (32) \u0026rarr; Dropout (0.2) \u0026rarr; Dense 
(1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAdam\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e64\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eBiLSTM\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eBiLSTM (64) \u0026rarr; Dense (16) \u0026rarr; Dropout (0.3) \u0026rarr; Dense (1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e0.2 (recurrent)\u0026thinsp;+\u0026thinsp;0.3 (dense)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAdam\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e64\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eCNN\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eConv1D (64) \u0026rarr; MaxPool1D \u0026rarr; GlobalMaxPool1D \u0026rarr; Dense (16) \u0026rarr; Dropout (0.3) \u0026rarr; Dense (1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAdam\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e64\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eIn the present work, several DL models were investigated to improve both classification accuracy and computational speed when detecting fake news on multilingual datasets. The experiments contrasted different combinations of embedding techniques and neural network models (LSTM, BiLSTM, and CNN), each with and without pretrained FastText embeddings.\u003c/p\u003e \u003cdiv id=\"Sec19\" class=\"Section3\"\u003e \u003ch2\u003e4.1.1 LSTM with Pretrained FastText Embedding\u003c/h2\u003e \u003cp\u003eThe first model used pretrained FastText embeddings in an LSTM model with the following configuration. The LSTM model consists of an input layer followed by an LSTM layer with 128 units, a dropout layer with a rate of 0.5, and a dense output layer with sigmoid activation. The model was trained using the Adam optimizer with a learning rate of 0.001.\u003c/p\u003e \u003cp\u003eIn spite of the conceptual benefits of semantically dense pretrained embeddings, this model reached only 49.98% accuracy for Kurdish and 49.66% accuracy for English. The zero precision and recall values for Kurdish indicate that the model predicted a single class for all test samples, likely because static embeddings could not capture domain-specific linguistic patterns. The lack of improvement also suggests that FastText embeddings trained independently and used as fixed vectors may not fit the domain semantics of the fake news corpus. In addition, the fixed sequence length and potentially sparse vector representations could have caused the model to underfit. 
Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e shows the results of the LSTM model with pretrained FastText embeddings.\u003c/p\u003e \u003cp\u003eThe LSTM model without the embedding layer, trained directly on the sentence vectors provided by FastText, performed much better, with an accuracy of 97.24% on Kurdish and 96.85% on English. This architecture simply passed the dense FastText sentence vectors to the LSTM. Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e shows the performance of the LSTM model without embedding layers.\u003c/p\u003e \u003cp\u003eBy bypassing the embedding matrix and relying on FastText for sentence-level vectorization, the model received semantically rich, task-specific input representations, which greatly improved convergence and classification performance.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section3\"\u003e \u003ch2\u003e\u003cb\u003e4.1.2 Bidirectional LSTM (BiLSTM)\u003c/b\u003e\u003c/h2\u003e \u003cp\u003eLikewise, the BiLSTM model has a 128-unit bidirectional LSTM layer, dropout, and a dense output layer. The BiLSTM was used to exploit context in both the forward and backward directions. However, the model did not improve in combination with pretrained FastText embeddings (Kurdish accuracy: 50.02%, English accuracy: 49.66%), presumably because the embeddings do not align with the dataset. Table\u0026nbsp;\u003cspan refid=\"Tab7\" class=\"InternalRef\"\u003e7\u003c/span\u003e shows the result of the BiLSTM model using pretrained embeddings.\u003c/p\u003e \u003cp\u003eBy contrast, the BiLSTM without an embedding layer achieved 97.37% accuracy on Kurdish and 97.01% accuracy on English, suggesting that representations derived directly from the input vectors allowed the model to fit the data's structure and meaning more effectively. 
This model used dropout regularization combined with a smaller dense layer to mitigate overfitting. Table\u0026nbsp;\u003cspan refid=\"Tab8\" class=\"InternalRef\"\u003e8\u003c/span\u003e presents the results of the BiLSTM model without embedding layers.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section3\"\u003e \u003ch2\u003e4.1.3 Convolutional Neural Networks (CNN)\u003c/h2\u003e \u003cp\u003eThe CNN model consists of a convolutional layer with 128 filters and a kernel size of 5, followed by a max-pooling layer, dropout (rate\u0026thinsp;=\u0026thinsp;0.5), and a fully connected dense layer. CNN-based models were explored to capture local n-gram features through 1D convolutions. In combination with pretrained FastText embeddings, performance plateaued at 49.98% accuracy for Kurdish and 49.66% accuracy for English, mirroring the limitation of static embeddings seen in the recurrent models. The CNN model\u0026rsquo;s results using pretrained embeddings are detailed in Table\u0026nbsp;\u003cspan refid=\"Tab9\" class=\"InternalRef\"\u003e9\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eAn embedding-free version of the CNN, fed directly with reshaped sentence vectors, performed much better, reaching an accuracy of 97.40% on Kurdish and 96.69% on English. This shows that CNNs can learn discriminative features from sentence-level vectors without word representations trained on massive text collections. The performance of the CNN model without embedding layers is shown in Table\u0026nbsp;\u003cspan refid=\"Tab10\" class=\"InternalRef\"\u003e10\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eModels without an embedding layer outperformed models with embedding layers in every experimental setting. Feeding the FastText sentence vectors directly to the model improved classification performance. 
In addition, compact models such as LSTM and CNN converged quickly and generalized better without the overhead of pretrained embeddings. The findings also highlight that feature representation strategies must be adapted to the data to be effective, particularly in multilingual or domain-specific settings.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e4.2. Evaluation Strategy\u003c/h2\u003e \u003cp\u003eThe data was split into training and test sets in an 80:20 ratio. For robustness, 5-fold cross-validation was also used. Accuracy, precision, recall, and F1-score were used to assess model performance. To give more insight into model behavior, confusion matrices and ROC curves were plotted, as displayed in Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e and \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"5. Discussion and Results","content":"\u003cp\u003eThis section compares the performance of the LSTM, BiLSTM, and CNN models for detecting fake news in Kurdish and English on the KDFND dataset, with particular attention to the influence of FastText-based representations. The findings indicate a large performance difference between the embedding-based and sentence-level representation approaches, with considerable implications for low-resource language processing.\u003c/p\u003e \u003cp\u003eThe experimental results show a clear difference between the embedding-based and embedding-free models. Models based on pretrained FastText embeddings performed poorly, with near-chance accuracy (~\u0026thinsp;50%) on both Kurdish and English. 
Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, Table\u0026nbsp;\u003cspan refid=\"Tab7\" class=\"InternalRef\"\u003e7\u003c/span\u003e, and Table\u0026nbsp;\u003cspan refid=\"Tab9\" class=\"InternalRef\"\u003e9\u003c/span\u003e show the detailed results of the LSTM, BiLSTM, and CNN models that included embedding layers.\u003c/p\u003e \u003cp\u003eConversely, models trained directly on sentence-level FastText vectors performed much better than the embedding-based models. Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e shows that the LSTM model reached 97.24% accuracy on Kurdish and 96.85% on English. Similarly, the BiLSTM model reached 97.37% and 97.01% accuracy on Kurdish and English, respectively (Table\u0026nbsp;\u003cspan refid=\"Tab8\" class=\"InternalRef\"\u003e8\u003c/span\u003e), whereas the CNN model reached 97.40% and 96.69% (Table\u0026nbsp;\u003cspan refid=\"Tab10\" class=\"InternalRef\"\u003e10\u003c/span\u003e). 
This substantial improvement suggests that sentence-level representations better capture the semantic and contextual patterns that are useful in detecting fake news.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eLSTM with pretrained embedding layer results\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.4998\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.6636\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eLSTM without an Embedding Layer\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9724\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9766\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9680\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9723\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9685\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9702\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9663\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9683\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab7\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 7\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eBiLSTM with pretrained embedding layer results\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e 
\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.5002\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.5002\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.6669\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.6636\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab8\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 8\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eBiLSTM without an Embedding Layer\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9737\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9717\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9759\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9738\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9701\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9691\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9707\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9699\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab9\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 9\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eCNN with pretrained embedding layer results\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.4998\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.4966\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.6636\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab10\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 10\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eCNN without an Embedding Layer\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv 
align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLanguages\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9740\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9690\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9793\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9741\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eEnglish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.9669\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9630\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9707\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e \u003cp\u003e0.9668\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003e5.1 Key Findings\u003c/b\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003e1. Impact of FastText Embeddings\u003c/b\u003e \u003c/p\u003e \u003cp\u003eModels that used an embedding layer initialized with pretrained FastText embeddings performed poorly, reaching approximately 50 percent accuracy on both English and Kurdish, which is no better than random guessing on a two-class task. The precision and recall values in Table\u0026nbsp;\u003cspan refid=\"Tab9\" class=\"InternalRef\"\u003e9\u003c/span\u003e (0.0000 and 0.0000 for Kurdish, 0.4966 and 1.000 for English) show that the CNN with pretrained embeddings assigned every test item to a single class. This indicates that fixed embeddings do not capture the subtle linguistic and contextual cues that characterize fake news, especially in low-resource languages. By contrast, models trained on dynamically trained sentence-level FastText vectors (without embedding layers) did substantially better, reaching about 97% accuracy on Kurdish and about 96% on English. These findings emphasize the importance of domain-adapted feature representations.\u003c/p\u003e \u003cp\u003eA wide domain gap exists between the general-domain corpora (e.g., Wikipedia, Common Crawl) on which pretrained embeddings are trained and the social media and news style, vocabulary, and neologisms of KDFND. This mismatch limits the ability of pretrained embeddings to capture misleading patterns and contextual peculiarities.\u003c/p\u003e \u003cp\u003eAnother pressing problem is the presence of out-of-vocabulary (OOV) words, especially in Kurdish. Comparing the tokens in our vocabulary with the vocabulary of the pretrained model reveals an OOV rate of around 18% for Kurdish and around 12% for English. 
This implies that important domain-specific words may receive inadequate vector representations or none at all, which harms model learning.\u003c/p\u003e \u003cp\u003eFinally, such embeddings are not adaptable: they cannot learn or disambiguate word meanings from context during training. Sentence-level representations, on the other hand, let models capture task-specific semantics, which yields much better results.\u003c/p\u003e\u003cp\u003e \u003cb\u003e2. Model Architecture Comparisons\u003c/b\u003e \u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eLSTM and BiLSTM: Both perform best when trained on sentence-level vectors. BiLSTM slightly outperforms LSTM (97.37% vs. 97.24% on Kurdish), which may reflect its ability to learn context in both directions, although the 0.13-point gap is small.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eCNN: Achieved similar performance (97.40% for Kurdish) by effectively capturing local n-gram features, illustrating its suitability for text classification tasks.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e\n\u003ch3\u003e3. Cross-Lingual Performance\u003c/h3\u003e\n\u003cp\u003eThe difference in accuracy between English and Kurdish was below one percentage point for the models without embedding layers, which suggests that the proposed approach works comparably well in both languages. This result challenges the common assumption that low-resource languages inherently limit model performance.\u003c/p\u003e\n\u003ch3\u003e4. Computational Efficiency\u003c/h3\u003e\n\u003cp\u003eThe models without embedding layers were also faster to train and consumed fewer resources because they did not involve computations on large embedding matrices. This makes them well suited to resource-limited settings.\u003c/p\u003e \u003cdiv id=\"Sec26\" class=\"Section2\"\u003e \u003ch2\u003e5.2. 
Comparison with Previous Research\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab11\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 11\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePerformance on the Kurdish Fake News Detection Task Compared with Past Studies\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eStudy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eLanguage\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSalh and Nabi [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSVM\u0026thinsp;+\u0026thinsp;TF-IDF\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e~\u0026thinsp;92%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAhmed [\u003cspan citationid=\"CR11\" 
class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHard Voting (SVM, NB, DT)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e~\u0026thinsp;93%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eKarim and Hassan [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLSTM\u0026thinsp;+\u0026thinsp;FastText\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eKurdish\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e~\u0026thinsp;95%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eThis Study\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eCNN (no embedding)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003eKurdish\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e97.40%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eThis Study\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eBiLSTM (no embedding)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003eKurdish\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e97.37%\u003c/b\u003e\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eThe results in Table\u0026nbsp;\u003cspan refid=\"Tab11\" class=\"InternalRef\"\u003e11\u003c/span\u003e contrast with previous research that reported high performance with hybrid methods combining embeddings and deep learning [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. This study suggests that pretrained embeddings do not generalize well to domain-specific tasks such as Kurdish fake news detection, and it echoes the reservations of Daneshfar [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] regarding the effectiveness of embeddings in low-resource settings. At the same time, our results with the lightweight models support the recommendation of Salh and Nabi [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e] regarding non-standardized solutions in Kurdish NLP.\u003c/p\u003e \n\u003ch2\u003e\u003cspan\u003e5.3 Limitations\u003c/span\u003e\u003c/h2\u003e\n\u003col\u003e\n \u003cli\u003e\u003cspan\u003eSingle Train-Test Split: Results are based on a single 80/20 split. Future work should employ k-fold cross-validation to ensure robustness.\u003c/span\u003e\u003c/li\u003e\n \u003cli\u003e\u003cspan\u003eSingle Dataset: Evaluation is limited to KDFND. External validation on additional Kurdish datasets is needed.\u003c/span\u003e\u003c/li\u003e\n \u003cli\u003e\u003cspan\u003eStatistical Significance: The reported results do not include confidence intervals. The runs should be repeated with different random seeds. 
Future work should include statistical significance tests and multiple runs of each experiment to confirm reproducibility.\u003cbr\u003e\u003c/span\u003e\u003c/li\u003e\n\u003c/ol\u003e\n\u003cdiv id=\"Sec27\" class=\"Section2\"\u003e\n \u003ch2\u003e5.4. Implications\u003c/h2\u003e\n \u003cp\u003eThe strong performance of sentence-level FastText vectors indicates that lightweight and flexible feature-extraction approaches can succeed in low-resource languages. Possible future directions are:\u003c/p\u003e\n \u003cul\u003e\n \u003cli\u003e\n \u003cp\u003eMultimodal integration (e.g., images, metadata) to reflect progress made for high-resource languages [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eHybrid models combining the strengths of CNNs and BiLSTMs to improve feature extraction.\u003c/p\u003e\n \u003c/li\u003e\n \u003cli\u003e\n \u003cp\u003eExplainability methods (e.g., SHAP, LIME) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e] to enhance the transparency of the model for users.\u003c/p\u003e\n \u003c/li\u003e\n \u003c/ul\u003e\n\u003c/div\u003e"},{"header":"6. Conclusion","content":"\u003cp\u003eThe spread of fake news calls for effective detection mechanisms, particularly for low-resource languages that lack adequate linguistic tools. This research explored the effectiveness of LSTM, BiLSTM, and CNN models for fake news detection in Kurdish and English using the KDFND dataset and FastText embeddings. 
Our main finding is that models trained with sentence-level FastText vectors (without embedding layers) achieve high accuracy (~\u0026thinsp;97% for Kurdish, ~\u0026thinsp;96% for English), whereas models with pretrained embeddings perform at chance level (~\u0026thinsp;50% accuracy). This implies that static embeddings do not generalize well to domain-specific low-resource language tasks, while dynamically trained representations adapt better.\u003c/p\u003e \u003cp\u003eThe performance of the BiLSTM and CNN models further highlights the value of context-aware feature extraction and computational feasibility for fake news detection. The findings are consistent with existing work on the limitations of embedding-based methods in low-resource NLP and point to a potentially productive alternative based on sentence vectorization. Future studies should consider multimodal and explainable AI [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e] as well as larger, dialectally diverse datasets to improve model robustness.\u003c/p\u003e \u003cp\u003eLastly, this paper provides a practical roadmap for improving the detection of fake news in low-resource languages by advocating task-oriented optimization instead of task-agnostic pretrained embeddings.\u003c/p\u003e \u003cp\u003eOur approach supports more equitable and scalable responses to the world's misinformation crisis by narrowing the gap between high-resource and low-resource NLP.\u003c/p\u003e \u003cp\u003e \u003cb\u003e6.1 Key Takeaways\u003c/b\u003e:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eSentence-level FastText vectors outperform pretrained embeddings in low-resource fake news detection.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eBiLSTM and CNN models reach ~\u0026thinsp;97% accuracy for Kurdish, demonstrating their effectiveness.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e 
\u003cp\u003eStatic embeddings can fail to generalize, which motivates flexible feature extraction.\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eFuture work involves multimodal integration and explainability improvements.\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003cp\u003eThis study advances NLP for low-resource languages and provides practical recommendations for building multilingual fake news detection systems.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eAzad M. Karim: Conceptualization, methodology, software implementation, formal analysis, investigation, writing\u0026mdash;original draft preparation, visualization, project administration. Bryar A. Hassan: Validation, data curation, resources, writing\u0026mdash;review and editing, supervision.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eKarim AM, Hassan BA (2025) Research Article: A Hybrid AI Model for Fake News Detection: Leveraging FastText and LSTM for Kurdish and English, \u003cem\u003ecjnst\u003c/em\u003e, vol. 0, no. 1, pp. 1\u0026ndash;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.31530/cjnst.2025.1.1\u003c/span\u003e\u003cspan address=\"10.31530/cjnst.2025.1.1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNasser M et al (2024) A systematic review of multimodal fake news detection on social media using deep learning models, \u003cem\u003eResults Eng.\u003c/em\u003e, vol. 26, no. December p. 104752, 2025. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.rineng.2025.104752\u003c/span\u003e\u003cspan address=\"10.1016/j.rineng.2025.104752\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaham S, Tariq A, Khan MUG, Alamri FS, Rehman A, Saba T (2024) ANN: adversarial news net for robust fake news classification. Sci Rep 14(1):1\u0026ndash;20. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41598-024-56567-4\u003c/span\u003e\u003cspan address=\"10.1038/s41598-024-56567-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalh DA, Nabi RM (2023) Kurdish Fake News Detection Based on Machine Learning Approaches. Passer J Basic Appl Sci 5(2):262\u0026ndash;271. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.24271/PSR.2023.380132.1226\u003c/span\u003e\u003cspan address=\"10.24271/PSR.2023.380132.1226\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHashmi E, Yayilgan SY, Yamin MM, Ali S, Abomhara M (2024) Advancing Fake News Detection: Hybrid Deep Learning With FastText and Explainable AI, \u003cem\u003eIEEE Access\u003c/em\u003e, vol. 12, no. March, pp. 44462\u0026ndash;44480. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ACCESS.2024.3381038\u003c/span\u003e\u003cspan address=\"10.1109/ACCESS.2024.3381038\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYan K (2024) Optimizing an English text reading recommendation model by integrating collaborative filtering algorithm and FastText classification method. Heliyon 10(9):e30413. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.heliyon.2024.e30413\u003c/span\u003e\u003cspan address=\"10.1016/j.heliyon.2024.e30413\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlmandouh ME, Alrahmawy MF, Eisa M, Elhoseny M, Tolba AS (2024) Ensemble based high performance deep learning models for fake news detection. Sci Rep 14(1):26591. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41598-024-76286-0\u003c/span\u003e\u003cspan address=\"10.1038/s41598-024-76286-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArmeen I, Niswanger R, Tian C (2024) Combating Fake News Using Implementation Intentions. Inf Syst Front no June. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10796-024-10502-0\u003c/span\u003e\u003cspan address=\"10.1007/s10796-024-10502-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDaneshfar F (2024) Enhancing Low-Resource Sentiment Analysis: A Transfer Learning Approach. Passer J Basic Appl Sci 6(2):265\u0026ndash;274. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.24271/PSR.2024.440793.1484\u003c/span\u003e\u003cspan address=\"10.24271/PSR.2024.440793.1484\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBussa S, Bodhankar A, Patil VH, Pal H, Bunkar SK, Qureshi ARK (2023) An Implementation of Machine Learning Algorithm for Fake News Detection, \u003cem\u003eInt. J. Recent Innov. Trends Comput. Commun.\u003c/em\u003e, vol. 11, no. March, pp. 392\u0026ndash;401. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.17762/ijritcc.v11i9s.7435\u003c/span\u003e\u003cspan address=\"10.17762/ijritcc.v11i9s.7435\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSan Ahmed RAM (2023) Hard Voting Approach using SVM, Na\u0026iuml;ve Bays and Decision Tree for Kurdish Fake News Detection. Iraqi J Comput Sci Math 4(3):25\u0026ndash;33. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.52866/ijcsm.2023.02.03.003\u003c/span\u003e\u003cspan address=\"10.52866/ijcsm.2023.02.03.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbdul H, Al H, Abdul H, Al H, Jabardi M (2024) Detecting Fake News Using Machine Learning: A Comparative Study of Techniques. J Kufa Math Comput 11:113\u0026ndash;120\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCamelia TS, Fahim FR (2024) A Regularized LSTM Method for Detecting Fake News Articles. IEEE, pp. 1\u0026ndash;2\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTama FR, Sibaroni Y (2023) Fake News (Hoaxes) Detection on Twitter Social Media Content Through Convolutional Neural Network (CNN) Method. JINAV, 4, 1\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBhadana J, Kouritzin MA, Park S, Zhang I (2024) Markov Processes for Enhanced Deepfake Generation and Detection. arXiv, pp. 1\u0026ndash;17\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eScience I, Don E, Engineering S, Bosco D, Science SI (2024) Deceptive Content Detection Using Machine Learning. IJSREM 1\u0026ndash;5. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.55041/IJSREM34830\u003c/span\u003e\u003cspan address=\"10.55041/IJSREM34830\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArowolo MO, Misra S, Ogundokun RO (2023) A Machine Learning Technique for Detection of Social Media Fake News. IJSWIS 19(1):1\u0026ndash;25. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.4018/IJSWIS.326120\u003c/span\u003e\u003cspan address=\"10.4018/IJSWIS.326120\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVerma A et al (2025) ScrutNet: a deep ensemble network for detecting fake news in online text. Soc Netw Anal Min 15(1). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s13278-025-01412-3\u003c/span\u003e\u003cspan address=\"10.1007/s13278-025-01412-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbubakr Salh D, Nabi R (2022) Mendeley Data V1. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.17632/3zx9vpw3wh.1\u003c/span\u003e\u003cspan address=\"10.17632/3zx9vpw3wh.1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Kurdish Dataset for Fake News Detection (KDFND)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNguyen D, Nguyen TT, Nguyen CV (2025) Fake advertisements detection using automated multimodal learning: a case study for Vietnamese real estate data. Appl Intell 55(6). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/s10489-025-06238-2\u003c/span\u003e\u003cspan address=\"10.1007/s10489-025-06238-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFastText crawl-vectors @ fasttext.cc. [Online]. Available: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://fasttext.cc/docs/en/crawl-vectors.html\u003c/span\u003e\u003cspan address=\"https://fasttext.cc/docs/en/crawl-vectors.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":true,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"fake news detection, low-resource languages, Kurdish NLP, deep learning, FastText embeddings","lastPublishedDoi":"10.21203/rs.3.rs-9409057/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9409057/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThe rapid dissemination of misinformation on social media is a growing concern. This concern hit languages like Kurdish hard, whose fewer resources created problems in not identifying and understanding the issue. This work made use of deep learning (DL) techniques\u0026mdash;Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Convolutional Neural Networks (CNN)\u0026mdash;for misinformation detection in Kurdish and English languages. These DL techniques were tested on the Kurdish Dataset for Fake News Detection (KDFND). Among the three models, CNN achieved the highest accuracy (97.40% for Kurdish). We also examined how FastText embeddings affect performance by comparing models with and without embedding layers. Aim for the highest accuracy and the fastest model. Our tests indicate that FastText models for sentence-level vectors (without embedding layers) perform much better, with almost 97 percent accuracy for the Kurdish and 96 percent for the English. On the other hand, pretrained-embedding models only attain about 50 percent accuracy. The results demonstrate the limitations of static embeddings in low-resource settings and show that flexible, simple models are able to detect fake news without much pretraining. 
The present research contributes toward further development of NLP techniques for low-resource languages while having practical implications for multilingual fake news detection systems.\u003c/p\u003e","manuscriptTitle":"Optimizing Fake News Detection in Low-Resource Languages: A Comparative Study of Deep Learning Models Using Sentence-Level FastText Vectors in Kurdish and English","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-15 03:55:52","doi":"10.21203/rs.3.rs-9409057/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"1f01a9b2-4d3f-426d-9d35-db155fe9e6d2","owner":[],"postedDate":"April 15th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-04-23T16:39:58+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-15 03:55:52","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9409057","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9409057","identity":"rs-9409057","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
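The out-of-vocabulary comparison described in Section 5.1 (about 18% for Kurdish and 12% for English) can be illustrated with a small sketch. The paper does not say whether its OOV rate is counted over distinct words (types) or over all occurrences (tokens), so the function below reports both. The function name `oov_rates`, the toy sentence, and the toy vocabulary are hypothetical and only for illustration; they are not taken from KDFND or from the authors' code.

```python
from collections import Counter


def oov_rates(corpus_tokens, pretrained_vocab):
    """Return (type-level, token-level) OOV rates of a tokenized corpus.

    corpus_tokens: iterable of word strings from the dataset.
    pretrained_vocab: set of words that have a pretrained vector.
    """
    counts = Counter(corpus_tokens)
    if not counts:
        raise ValueError("corpus_tokens is empty")
    # Distinct words that have no entry in the pretrained vocabulary.
    missing = [tok for tok in counts if tok not in pretrained_vocab]
    type_rate = len(missing) / len(counts)
    token_rate = sum(counts[tok] for tok in missing) / sum(counts.values())
    return type_rate, token_rate


# Toy data, invented for illustration only (not drawn from KDFND).
tokens = "breaking news the minister denied the viral claim".split()
vocab = {"news", "the", "minister", "denied", "claim"}

type_rate, token_rate = oov_rates(tokens, vocab)
print(f"type-level OOV: {type_rate:.2%}, token-level OOV: {token_rate:.2%}")
```

In practice the vocabulary set would be read from the word list of the pretrained vector file, and the tokens from the preprocessed dataset. Reporting both rates makes clear whether the missing words are rare or frequent.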