C-PsyD: A Chinese text classification model for detecting psychological problems

Research Article

Chaoqun Zhang, Yunheng Yi

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-5337854/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

The COVID-19 epidemic has had significant direct and psychological impacts. This study introduces a Chinese text classification model, C-PsyD, which combines BiGRU, Attention, Self-Attention, and convolutional neural network (CNN) techniques. The model utilizes TextCNN and BiGRU outputs in the Attention module, generating result A. Furthermore, the outputs of Self-Attention and BiGRU are used in the Attention mechanism, producing result B. By averaging the results of A and B, a final text feature vector is obtained and passed through a dropout layer.
A fully connected neural network layer processes the text feature vector to obtain the classification result. Experimental evaluations were conducted using a Chinese psychological text dataset from GitHub. The results, including loss function value, classification accuracy, recall, false positive rate, and confusion matrix, indicate that C-PsyD outperforms six competing models. Notably, C-PsyD achieves a classification accuracy of 79.5%, surpassing TextCNN (78.2%), BiLSTM (76.4%), LSTM (74.9%), Simple-RNN (55.7%), FastText (50.1%), and ST_MFLC (44.8%). These findings confirm the feasibility and effectiveness of the proposed psychological text classification model. Its implementation can enhance doctors' ability to classify patients, promptly detect psychological problems, and facilitate effective treatment, thus optimizing the utilization of medical resources.

Keywords: Psychological Problem, Text Classification, BiGRU, Self-Attention, Attention, Convolutional Neural Network

1. Introduction

People in today's society are increasingly likely to have mental health problems. In daily life, poor eating habits can lead to mental health problems [1]. Physical diseases, such as dementia, can also cause psychological problems [2]. As a global epidemic, COVID-19 has inevitably affected the mental health of people in many countries and regions since its outbreak [3]. Literature [3] also shows that students, single people and people who live with a large number of other people have a higher risk of poor mental health.
In other words, students are more likely to have psychological problems related to COVID-19. Many people have psychological problems. However, the general public in China currently lacks sufficient knowledge about psychological problems, and medical resources related to psychology are relatively scarce. As a result, ordinary people often avoid seeing a psychologist for fear of embarrassment. Especially now that medical resources across the country are strained to varying degrees, we need a way to save medical resources and to use them more efficiently for psychological problems.

Now that Internet of Things (IoT) [4] technology is advanced, there is a new trend of integrating healthcare systems with the IoT, and the potential of IoT healthcare applications is enormous [5]. As described in the literature [6], with the increasing use of online media, Internet-delivered psychotherapies are becoming an important tool for improving mental health conditions. Against this background, this paper combines several existing technologies to design a model that can play a role in an IoT-based medical system.

Deep learning, as one of the most popular artificial intelligence technologies, has found numerous applications in the medical field. For instance, Reference [7] utilized deep learning to detect depression based on EEG data of patients, while Reference [8] employed it to distinguish between anxiety and depression. Reference [9] developed a system that automatically detects psychological diseases by analyzing patient data through text classification technology. Similarly, Reference [10] identified mental illnesses by examining user data on social media platforms using text classification. Despite its potential, the integration of deep learning with psychotherapy in online diagnosis and treatment is rare, and research combining these two elements to improve medical resource utilization is scarce.
Thus, we aim to apply deep learning to psychotherapy.

Diagnosing psychological diseases using deep learning technology is essentially a text classification task, which is common in natural language processing (NLP). Traditional machine learning algorithms, such as support vector machine (SVM) [11] and K-nearest neighbor (K-NN) [12], can also perform text classification. However, these algorithms rely heavily on manually generated features, resulting in high labor costs and low efficiency [13]. In contrast, deep learning-based algorithms continually update text vectors during training, which avoids these drawbacks.

Various neural networks are used for text classification, including convolutional neural network (CNN) [14], recurrent neural network (RNN) [15], recursive neural network (ReNN) [16], and graph neural network (GNN) [17]. CNN-based models, such as MobyDeep proposed in Reference [14], excel at long text classification. TextCNN, as described in Reference [18], improved long text classification by treating long text as an image and applying convolution operations. However, TextCNN is less suitable for short text with high character correlation, whereas an RNN can extract semantic features from the entire text for classification. Traditional RNNs suffer from vanishing and exploding gradients. Long short-term memory (LSTM) is a special RNN model that addresses these problems and outperforms traditional RNNs. The gated recurrent unit (GRU), an LSTM variant introduced by Cho et al. in 2014 [19], is often superior to LSTM in specific classification tasks. Numerous RNN models have been proposed, such as the bidirectional long short-term memory (BiLSTM) model for power dispatching classification in Reference [20], which demonstrated excellent classification capabilities. With the emergence of the Attention mechanism [21] and the Transformer model [22], an increasing number of text classification models have been developed and applied.
Reference [23] suggested acquiring dynamic character vectors [24] of a text using BERT [25], a pre-trained model, and constructing text representation vectors with BiLSTM, weighted convolution [26], and the Attention mechanism. However, BERT's excessive parameters lead to high training costs, and the model only supports texts with fewer than 512 characters.

In view of the background described above, this paper proposes a novel Chinese text classification model named C-PsyD for detecting psychological problems. We obtained a Chinese psychological text dataset shared on GitHub and used the established text classification model to perform inference on the text data, giving a preliminary psychological diagnosis of a user. The contributions of this paper are summarized as follows:

a. We apply text classification technology to the field of psychological disease detection.
b. We propose a Chinese text classification model named C-PsyD for detecting psychological problems. C-PsyD uses text features obtained from CNN and Self-Attention to guide the Attention module in generating Attention parameters. These Attention parameters are combined with the text features extracted by BiGRU to obtain the final text features.
c. The experimental results verify that C-PsyD significantly outperforms other models such as FastText [28], TextCNN [28], Simple-RNN [29], LSTM [29], BiLSTM [30] and ST_MFLC [31].

This suggests that the proposed model for psychological text classification is not only feasible but also highly effective. By assisting doctors in classifying patients, detecting psychological problems at an early stage, and taking timely measures to treat them, this model can significantly improve the efficiency of medical resource utilization. Furthermore, even though these models were initially designed for specific domains, they can still be applied in other areas after being trained on datasets from different fields.

The rest of this paper is organized as follows.
First, six text classification models, namely FastText, TextCNN, Simple-RNN, LSTM, BiLSTM and ST_MFLC, are introduced. Next, the proposed model C-PsyD is explained in detail. Then, the experiments are described and the experimental results are analysed. Finally, conclusions and future work are presented.

2. Text Classification Models

2.1 Overall Framework of Text Classification

The main function of a text classification model is to classify text. Many models are available for text classification tasks, such as FastText, TextCNN, Simple-RNN, LSTM, BiLSTM and ST_MFLC. In this paper, these models are used to extract the features of a text, and a fully connected layer outputs the result according to these features. The overall framework of the text classification task is shown in Fig. 1.

As can be seen from Fig. 1, after the input text is transformed into word vectors, the vectors can be fed into a model. The model extracts the text features from the input text vectors. The text features are also represented as vectors. The text feature vectors are fed into the fully connected layer, which computes the final classification result. In this paper, the result is also a vector, which gives the probability that the input text belongs to each category. The input of every text classification model in this paper is a text vector, and the output is a text feature. Figure 1 is intended to illustrate the role that each model mentioned in the text plays in the whole text classification task.

2.2 FastText Model

FastText is a fast and efficient text classification model. Its core idea is to first obtain the character vectors of a sentence, then aggregate these vectors, and finally classify the sentence. The classification performance of FastText is significantly affected by the vector dimension.
To ensure consistency, FastText obtains character vectors through embedding layers, as the other models in this paper do. The specific calculation is

$$m_i=\frac{\sum\limits_{t=0}^{n-1} w_{(t,i)}}{n} \tag{1}$$

where w denotes the original text vectors, w_(t,i) denotes the i-th value in the character vector of the t-th character in the original text vectors, n is the number of characters in the sentence, m_i denotes the i-th value in the processed text vector, and m is the text feature extracted by the model.

In the literature [28], FastText used word embedding, that is, it calculated the vector of the central word based on the central word and the n-gram algorithm [32], and output the word vector of each word. According to the description in the literature [28], the structure of the original FastText is shown in Fig. 2. Because this paper uses character embedding, we modify the original FastText model by using character embedding instead of word embedding. The structure of our FastText is shown in Fig. 3.

2.3 TextCNN Model

TextCNN essentially uses a CNN to classify texts. As textual data can be considered one-dimensional data, TextCNN uses a one-dimensional CNN [32]. Its one-dimensional convolution operation is shown in Fig. 4.

Traditional TextCNN includes pooling layers, which can be divided into max-pooling layers [34] and average-pooling layers [35]. The max-pooling layer is shown in Fig. 5; its main idea is to select the maximum value in a range. Because a max-pooling layer discards a large amount of information, it is suitable for texts that contain a lot of meaningless information. Figure 6 shows the average-pooling layer, whose main idea is to take the mean value in a range. Since an average-pooling layer does not directly discard information, it is appropriate for texts with less feature information.
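As a rough illustration of the two pooling types described above, the following sketch applies max-pooling and average-pooling to a made-up one-dimensional feature sequence. The function names, the window size of 2, and the input values are illustrative assumptions and are not taken from the paper's implementation.

```python
def max_pool_1d(values, window):
    """Keep the maximum of each non-overlapping window."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]


def avg_pool_1d(values, window):
    """Keep the mean of each non-overlapping window."""
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, window)]


# Hypothetical convolution output for one channel (illustrative values only).
features = [1.0, 3.0, 2.0, 4.0, 0.0, 6.0]

max_pooled = max_pool_1d(features, 2)  # [3.0, 4.0, 6.0]
avg_pooled = avg_pool_1d(features, 2)  # [2.0, 3.0, 3.0]

# A window equal to the sequence length reduces the channel to a single value.
global_max = max_pool_1d(features, len(features))
global_avg = avg_pool_1d(features, len(features))

print(max_pooled, avg_pooled, global_max, global_avg)
```

Max-pooling keeps only the strongest response in each window, while average-pooling lets every value contribute, which matches the trade-off described in the text.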
TextCNN from the literature [28] uses word vectors to make the experimental results more reliable. However, Chinese has too many common words. If word vectors were used, the model would have too many parameters and it might be difficult to complete the experiments with ordinary equipment. Therefore, we replace word vectors with character vectors and do not take n-gram information into account. The structure of TextCNN is shown in Fig. 7.

2.4 Simple-RNN Model

Simple-RNN is the simplest RNN model, with only tanh operations in its cells. The structure of the Simple-RNN cell was described in detail in the literature [29]. The structure of the Simple-RNN model used in this paper is shown in Fig. 8.

As shown in Fig. 8, the input to the model is a text vector. At each time step, the model accepts one text vector as input together with the output of the hidden layer from the previous time step. The text vector is propagated forward through two cells and a dropout layer to get the final output. Each Simple-RNN cell is just a simple tanh operation, which is represented as tanh in Fig. 8.

2.5 LSTM Model

LSTM is a unidirectional RNN model and a classical model in the field of natural language processing. LSTM introduces a long-term memory mechanism, which makes the model perform better when processing long text. In other words, LSTM models are often used for processing long texts. The structure of the LSTM cell was presented in detail in the literature [29]. The structure of LSTM in this paper is similar to that of Simple-RNN shown in Fig. 8. The essential difference between LSTM and Simple-RNN is the cell structure: each LSTM cell is an LSTM-based cell, while each Simple-RNN cell is a simple tanh operation.

2.6 BiLSTM Model

BiLSTM is a bidirectional LSTM. It differs from the traditional bidirectional RNN in its cell structure. The structure of BiLSTM in this paper is derived from the literature [25] and shown in Fig. 9.
In short, BiLSTM is a bidirectional RNN model with LSTM cells. BiLSTM processes the sequence in both the forward and the backward direction, so the result for each word is affected by the words on both sides of it, and this bidirectional mechanism can extract semantic features more accurately than LSTM. In BiLSTM, each layer contains a dropout layer whose parameter was set to 0.2. In the literature [25], the model took the output of the last layer as the extracted text features and then classified texts through a fully connected layer.

2.7 ST_MFLC Model

The literature [26] integrated TextCNN, BiGRU and Self-Attention into one model named ST_MFLC. In ST_MFLC, each module extracts different text features from the input data; the features extracted by the three modules are then spliced, and the spliced output is taken as the final feature. The literature [26] gives the framework of ST_MFLC but no specific implementation details, so we reproduced the model and added some implementation details according to the application of this paper. The structure of ST_MFLC is shown in Fig. 10.

In Fig. 10, Conv stands for a convolution layer. In the CNN part of the model, the input dimension of the first convolutional layer equals the dimension of the word vector, and the final output dimension is 1. MaxPool represents the max-pooling layer, whose size equals the sequence length. BiGRU also has two hidden layers. Dropout stands for the dropout operation, and the parameter of the dropout layer was set to 0.2. The final output is the feature vector of the input data.

In Fig. 10, Self-Attention represents the Self-Attention layer [27]. The calculation of this layer is shown in Equations (2)-(5). Note that Eq. (2) is the general calculation formula, where Q, K and V represent the three variables in the Self-Attention mechanism, and d_k is the dimension of the character vector, that is, the vector dimension of K. D stands for the text vector.
QW, KW and VW are three different matrices, whose values are the parameters of fully connected layers of the neural network in the actual code. In the Self-Attention mechanism, the text vector D is multiplied by these three matrices to give three different features, Q, K and V. Applying the operation in Eq. (2) to these three features yields the self-attention value.

$$\operatorname{SelfAtt}(Q,K,V)=\operatorname{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{2}$$

$$Q=D \times QW^{T} \tag{3}$$

$$K=D \times KW^{T} \tag{4}$$

$$V=D \times VW^{T} \tag{5}$$

3. The Proposed Model C-PsyD

3.1 Framework of C-PsyD Model

Based on the research and application of text classification techniques at home and abroad, C-PsyD is constructed in this paper from existing techniques. C-PsyD integrates four modules, namely TextCNN, BiGRU, Self-Attention and Attention. The structure of C-PsyD is shown in Fig. 11.

In Fig. 11, the convolution layer Conv is used to extract the crucial features from the input text. The crucial features and the text semantic features extracted by BiGRU are input into the Attention layer together to obtain the new text features A. Similarly, the outputs of the Self-Attention layer and the BiGRU layer are input into the Attention layer together to obtain the new text features B. Finally, the text features A and B are combined by averaging (see Section 3.5) to obtain the final text features. It is worth noting that the two Attention layers share their parameters.

3.2 TextCNN Module

In this model, the TextCNN module comprises three sub-modules, each containing a convolutional layer, a max-pooling layer, and an average-pooling layer. Based on the dimension of the character vectors, the numbers of input and output channels of the convolutional layers are both set to 128, and the size of each pooling layer is equal to the sequence length.
The input vector dimensions for each sub-module are [batch_size, seq_len, word_size]. From the description of one-dimensional convolution operations and pooling layers in Section 2.3, we can deduce that the output vector dimensions are [batch_size, 2, word_size].

3.3 BiGRU Module

BiGRU is a bidirectional RNN. In this module, the number of hidden layers was fixed at 2, and each hidden layer is followed by a dropout layer whose parameter was set to 0.2. The purpose of the dropout layers is to prevent the model from overfitting. The dimension of the output vector of the BiGRU layer is also [batch_size, seq_len, word_size].

3.4 Self-Attention Module

In the C-PsyD model, Self-Attention is a module that uses the Self-Attention mechanism, whose detailed calculation is shown in Equations (2) to (5). C-PsyD applies this mechanism to the input text to obtain the Self-Attention feature of the text, which is very important for the subsequent calculation.

3.5 Attention Module

The Attention mechanism [28] is used twice in our model, and its calculation is given by

$$\operatorname{Score}(H,S)=\tanh\left(\operatorname{cat}(H,S)\cdot Wa^{T}\right)\cdot Wv^{T} \tag{6}$$

$$attW=\operatorname{softmax}(\operatorname{Score}(H,S)) \tag{7}$$

$$att=attW\cdot H^{T} \tag{8}$$

Here attW represents the attention weight, and H is the vector to be attended to, which is the output of the BiGRU module in the C-PsyD model. S is the parameter vector input into the attention layer, tanh is the activation function, and cat(H, S) is the concatenation of H and S. Wa and Wv are matrices, each implemented in the code as one fully connected layer. The purpose of Wa and Wv is to transform the high-dimensional cat(H, S) into a low-dimensional space. Formula (6) is used to calculate the attention score.
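Equations (6) to (8), together with the averaging of the two attention results with shared parameters, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the guide S is assumed to be a single vector that is concatenated to every time step of H, and the helper names, toy dimensions, and all numeric values are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(H, S, Wa, Wv):
    """H: seq_len x d_h, S: guide vector (d_s), Wa: d_a x (d_h + d_s), Wv: d_a."""
    scores = []
    for h in H:
        # Eq. (6): tanh(cat(H, S) . Wa^T) . Wv^T, evaluated per time step.
        hidden = [math.tanh(z) for z in matvec(Wa, h + S)]
        scores.append(sum(v * z for v, z in zip(Wv, hidden)))
    att_w = softmax(scores)  # Eq. (7): attention weights over time steps
    # Eq. (8): weighted sum of the rows of H.
    att = [sum(w * h[j] for w, h in zip(att_w, H)) for j in range(len(H[0]))]
    return att_w, att


# Toy inputs (assumed values): 3 time steps, hidden size 2, guide size 1.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # stands in for the BiGRU output
S_cnn = [0.5]                              # stands in for the TextCNN feature
S_self = [-0.3]                            # stands in for the Self-Attention feature
Wa = [[0.2, -0.1, 0.3], [0.0, 0.4, -0.2]]  # shared by both attention calls
Wv = [1.0, -1.0]

att_w, A = attention(H, S_cnn, Wa, Wv)
_, B = attention(H, S_self, Wa, Wv)
final = [(a + b) / 2 for a, b in zip(A, B)]  # average of results A and B
print(att_w, final)
```

Because the weights from Eq. (7) sum to 1, each attention result is a weighted average of the rows of H, so the guide vector changes only how strongly each time step contributes.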
Compared with reference [28], there is an additional Wv here for a linear transformation of the result, and "." represents the dot product. The first step is to concatenate H and S. Since H and S may have different shapes, a linear transformation is first applied to S. We then apply a linear transformation to cat(H, S) followed by tanh to obtain the score feature information, and apply a second linear transformation to obtain the attention score. Applying softmax to the attention score gives the attention weight attW, as in Eq. (7). Finally, the attention weight and H are multiplied using the dot product to obtain the attention result, as in Eq. (8).

Attention is applied at two positions in C-PsyD. First, the module receives the outputs of TextCNN and BiGRU and outputs a result through the attention mechanism. Second, it receives the outputs of Self-Attention and BiGRU and again outputs a result through the attention mechanism. The final attention result is the average of the two outputs.

4. Experiments

4.1 Experimental Environment

All the experiments were performed on an i5-11400F processor. The Linux server used an RTX 3060 GPU with 12 GB of video memory and had 32 GB of RAM. The operating system of the computer is Windows 11, the Python version is 3.11, and PyTorch is the deep learning framework used in our experiments.

4.2 Dataset Preprocessing

The dataset for our experiments is a collection of questions and answers about psychological knowledge, which can be downloaded from https://github.com/chatopera/efaqa-corpus-zh. In this public dataset, each row is in JSON format. The labels and their meanings are described in detail on the website. Because some labels in the public dataset have too few samples, some labels are sorted and merged. The merged labels are shown in Table 1. The dataset can be divided into six main categories, identified as 0, 1, 2, 3, 4 and 5.
Table 1. Some examples of the original dataset labels

| Category | Label | Original Content |
|---|---|---|
| 0 | 未知 (Unknown) | 未知、其它 (Unknown, Others) |
| 1 | 学业与事业烦恼 (Academic and career worries) | 学业烦恼、对未来规划的迷茫、事业和工作烦恼 (Academic worries, Confusion about future planning, Career and work worries) |
| 2 | 家庭问题 (Family problems) | 家庭问题和矛盾 (Family problems and contradictions) |
| 3 | 情感问题 (Emotional problems) | 男同性恋、女同性恋、双性恋与跨性别、情感关系问题、离婚、分手 (Gay, Lesbian, Bisexual and transgender, Emotional relationship issues, Divorce, Separation) |
| 4 | 自我生理问题 (Self physical problems) | 物质滥用、悲恸、失眠、压力、强迫症 (Substance abuse, Grief, Insomnia, Stress, Obsessive-compulsive disorder) |
| 5 | 性格与关系问题 (Personality and relationship problems) | 自我探索、低自尊、青春期问题、人际关系 (Self exploration, Low self-esteem, Adolescent problems, Interpersonal relationships) |

Because labels such as "Unknown" and "Others" in the original dataset are ambiguous, we deleted this part of the data. Therefore, the number of samples in Category 0 is 0 (see Table 3). The processed dataset was imported into Excel and can be downloaded from https://pan.baidu.com/s/1OL5xbfWDndJa75C2PRKbUA?pwd=k44a. Table 2 presents some examples of the preprocessed dataset.

Table 2. Some examples of the preprocessed dataset labels

| Preprocessed Content | Label Number |
|---|---|
| 女: 听过别人最多的议论就是干啥啥不行不长心眼没有脑子 (Girl: What I hear most from others is that I can't do anything right, that I'm thoughtless and have no brains) | 5 |
| 女: 我整宿整宿睡不着经常头疼不知道自己怎么了 (Girl: I can't sleep all night long. I often have headaches. I don't know what's wrong with me) | 4 |
| 放不下女朋友的过去, 又不想放手, 我该怎么办 (What should I do if I can't let go of my girlfriend's past and don't want to let go) | 3 |
| 老婆与异性经常聊天是否正常？聊天语气很轻浮？ (Is it normal for a wife to chat with the opposite sex frequently? The tone of the conversation is rather frivolous.) | 3 |
| 女: 我暗恋了一个男孩, 可是不知道怎样接触他？ (Girl: I secretly love a boy, but I don't know how to approach him?) | 5 |

Because the original dataset is small, one-tenth of the dataset was randomly selected as the test set, and the rest was used as the training set. Table 3 shows the sizes of the training set and test set in the six categories of the dataset.

Table 3. Quantity distribution of the dataset

| Division of the dataset | Category 0 | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 |
|---|---|---|---|---|---|---|
| Test set | 0 | 101 | 218 | 663 | 156 | 186 |
| Training set | 0 | 852 | 1848 | 5560 | 1478 | 1781 |

4.3 Experimental Settings and Evaluation Indicators

To verify the performance of our proposed model, C-PsyD is compared with FastText, TextCNN, ST_MFLC, BiLSTM, Simple-RNN and LSTM. Because the Chinese vocabulary is large, using word vectors would lead to too many model parameters and thus greatly increase the computational effort. Therefore, we use character vectors instead of word vectors, that is, each Chinese character or symbol is treated as a word. In Chinese, each individual character carries semantic meaning, which means that a single character can be considered a word. This is significantly different from English. The character vector has a feature dimension of 128 and is computed by the word embedding [28] module. The number of hidden layers for all RNN modules was set to 2, the batch size (Batch_size) of all models in the experiments was fixed at 256, and the number of iterations (epoch) of all models was set to 10.

We chose the loss function value (Loss) on the training set, the overall accuracy (ACC), the recall (recall) and the false positive rate (Far) as the main indicators to evaluate the performance of each model. These four evaluation indicators, which are widely used in the literature, were recorded during the experiments. Loss indicates how fast the model converges; it guides the updating of the model parameters and reflects the cost of training the model.
ACC illustrates the training effect of the model more intuitively. As shown in Eq. (9), Loss is calculated by the cross-entropy function, where p is the real distribution, q is the predicted distribution and x indexes the elements over which the two distributions are defined.

$$Loss=\sum\limits_{x} p(x)\cdot\log\left(\frac{1}{q(x)}\right) \tag{9}$$

Loss measures the error between the predicted value and the actual value. The smaller Loss is, the better the model has converged. Loss is obtained by the cross-entropy function because it describes the difference between the probability distributions of the predicted and actual values well.

The classification in our experiments is not a binary classification problem, so the formula for ACC is slightly different. Let A_i denote the amount of data labeled as Category i and T_i the amount of data correctly classified as Category i. ACC is then calculated by

$$ACC=\frac{\sum\limits_{i=1}^{5} T_i}{\sum\limits_{i=1}^{5} A_i} \tag{10}$$

We take each category as the positive class of itself, and the others as negative. For example, when predicting Category 1, samples labeled as Category 1 are positive samples (represented by P), and samples labeled as Categories 2, 3, 4 and 5 are negative samples (represented by N). A positive sample that a model predicts to be positive is called a true positive (TP). A positive sample that a model predicts to be negative is called a false negative (FN). Similarly, a negative sample that a model predicts to be negative is a true negative (TN), and a negative sample that a model predicts to be positive is a false positive (FP). The recall (recall) of each model is expressed by Eq. (11). Recall indicates the proportion of positive samples that are predicted to be positive.
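The one-vs-rest counting described above can be sketched as follows. The helper names and the toy label lists are illustrative assumptions; recall is computed as TP/P and the false positive rate Far as FP/N, following the definitions of Eqs. (11) and (12).

```python
def one_vs_rest_counts(y_true, y_pred, category):
    """Return (TP, FN, FP, TN) when `category` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    tn = sum(1 for t, p in pairs if t != category and p != category)
    return tp, fn, fp, tn


def recall_and_far(y_true, y_pred, category):
    """Recall = TP / P and Far = FP / N for one category."""
    tp, fn, fp, tn = one_vs_rest_counts(y_true, y_pred, category)
    positives = tp + fn  # P: samples labeled as this category
    negatives = fp + tn  # N: samples labeled as any other category
    recall = tp / positives if positives else 0.0
    far = fp / negatives if negatives else 0.0
    return recall, far


# Made-up labels for eight samples (not taken from the dataset).
y_true = [1, 1, 2, 3, 3, 3, 4, 5]
y_pred = [1, 3, 2, 3, 3, 3, 3, 3]

counts_cat3 = one_vs_rest_counts(y_true, y_pred, 3)  # (3, 0, 3, 2)
recall_cat3, far_cat3 = recall_and_far(y_true, y_pred, 3)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(counts_cat3, recall_cat3, far_cat3, accuracy)
```

In this toy example, Category 3 has a recall of 1.0 but a false positive rate of 0.6, because three of the five samples from other categories are also predicted as Category 3.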
The higher the recall is, the better the classification effect of the model.

$$recall=\frac{TP}{P} \tag{11}$$

In fact, a high recall for a certain category does not necessarily mean that the model classifies this category well. It is possible that the model assigns samples of all categories to the same category. Therefore, the false positive rate (Far) obtained by each model for each category also needs to be analysed. Far refers to the rate at which texts that do not belong to a certain category are predicted to be that category. The calculation formula is shown in Eq. (12). The lower the value of this indicator is, the better the classification effect of the model.

$$Far=\frac{FP}{N} \tag{12}$$

For example, Far for Category 1 is the proportion of samples in Categories 2, 3, 4 and 5 that are predicted as Category 1.

5. Experimental Results Analysis

5.1 Training Results Analysis

The change of Loss of each model during training shows the convergence of the model intuitively. The lower Loss is, the better the model has converged. A plot drawn directly from the original data would be messy. Therefore, the Loss curves shown in Fig. 12 are plotted from the original data after smoothing.

From the comparison in Fig. 12, we can clearly observe that the loss values of all seven models gradually decrease. However, the loss curve of FastText fluctuates greatly, indicating its weak learning ability. Although the loss of ST_MFLC decreases more than that of FastText, its decrease is still smaller than that of the other models, indicating slightly inferior performance. Simple-RNN performs significantly better than ST_MFLC in reducing the loss. The loss reductions of BiLSTM and LSTM are quite similar. TextCNN and C-PsyD both reduce the loss much more than all other models.
In particular, C-PsyD exhibits the most significant advantage, with the largest and fastest reduction in loss.

As illustrated in Fig. 13, the accuracy (ACC) of each model increases gradually with the number of steps. The slow growth of FastText's accuracy indicates its poor performance, while Simple-RNN performs better than FastText but is still unsatisfactory. The ACC curves of LSTM and BiLSTM are very similar, with BiLSTM performing slightly better. ST_MFLC shows a larger increase in ACC than BiLSTM, but is still inferior to TextCNN and C-PsyD. Overall, C-PsyD shows the fastest and largest increase in ACC.

Figure 14 shows the performance of the models on the test set after the training of each epoch. According to the data in the figure, the performance of C-PsyD is consistently the best, particularly in the 3rd epoch, where its advantage over the other models is largest. Although the effectiveness of TextCNN is close to that of C-PsyD, it still falls short. This is because, in the psychological classification dataset, most samples can be classified based solely on keywords, while a small portion requires semantic information for classification, and TextCNN is weak at extracting semantic information. In contrast, RNN models such as BiLSTM can extract semantic information well, but their training time is relatively long, and they typically require more iterations to reach satisfactory performance.
The accuracy of ST-MFLC also improves progressively. The semantic vectors extracted by its self-attention mechanism carry information that is important for classification, but the model cannot show this potential advantage here, because this experiment uses character vectorization and the dataset rarely requires context-based semantic classification. FastText cannot function properly under character vectorization, which results in relatively poor performance throughout the experiment. Taking all factors into consideration, C-PsyD is the best-performing model during training.

5.2 Test Results Analysis

After training, each model is evaluated on the test dataset. The test metrics are ACC, Recall and Far, whose meaning and calculation have been described above. According to Eq. (10), the overall accuracies obtained by the seven models for the psychological distress type test are shown in Table 4. The higher the ACC, the better the classification. The best ACC is marked in bold in Table 4.

Table 4 Overall accuracies obtained by the seven models

| Model | ACC |
| --- | --- |
| C-PsyD | **79.5%** |
| FastText | 50.1% |
| TextCNN | 78.2% |
| ST-MFLC | 44.8% |
| BiLSTM | 76.4% |
| LSTM | 74.9% |
| Simple-RNN | 55.7% |

As Table 4 shows, C-PsyD achieves the highest accuracy, 79.5%. TextCNN is close behind at 78.2% but still falls short of C-PsyD. The remaining neural models, ST-MFLC, BiLSTM, LSTM, and Simple-RNN, reach accuracies between 44.8% and 76.4%.
FastText reaches only 50.1%, and only ST-MFLC (44.8%) scores lower. In summary, Table 4 shows that C-PsyD attains the highest overall accuracy of the seven models in this experiment.

According to Eq. (11), the recall results obtained by the seven models for the five categories are shown in Table 5.

Table 5 Recall results obtained by the seven models for the five categories

| Model | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 |
| --- | --- | --- | --- | --- | --- |
| C-PsyD | 79.2% | 64.2% | 91.9% | 73.1% | 59.1% |
| FastText | 0.0% | 0.0% | 100.0% | 0.0% | 0.0% |
| TextCNN | 68.3% | 56.4% | 90.6% | 73.7% | 68.8% |
| ST-MFLC | 72.3% | 50.9% | 59.4% | 4.5% | 4.3% |
| BiLSTM | 66.3% | 46.8% | 93.2% | 78.8% | 54.8% |
| LSTM | 55.4% | 49.1% | 89.0% | 80.8% | 60.8% |
| Simple-RNN | 5.9% | 1.8% | 91.6% | 48.7% | 23.7% |

As Table 5 shows, C-PsyD attains the highest recall for Category 1 (79.2%) and Category 2 (64.2%), and its recall across the five categories ranges from 59.1% to 91.9%. Leaving FastText aside, the highest recall for the remaining categories belongs to BiLSTM for Category 3 (93.2%), LSTM for Category 4 (80.8%) and TextCNN for Category 5 (68.8%); the recall of C-PsyD in these categories is 91.9%, 73.1% and 59.1%, respectively. The recall of TextCNN lies between 56.4% and 90.6%. ST-MFLC and Simple-RNN are very uneven, with recall below 6% in two categories each. FastText is an extreme case, with a recall of 100.0% for Category 3 but 0.0% for all other categories. In summary, C-PsyD has the highest recall in two of the five categories, and its lowest per-category recall (59.1%) is higher than the lowest per-category recall of any other model, which supports its overall advantage over the other six models.

According to Eq.
(12), Table 6 shows the Far values obtained by the seven models for the five categories. Since a Far of 0% is not informative, the bold values mark the best result in each category other than 0%.

Table 6 False positive rate results obtained by the seven models for the five categories

| Model | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 |
| --- | --- | --- | --- | --- | --- |
| C-PsyD | 2.3% | 5.1% | 16.2% | 4.1% | 2.8% |
| FastText | 0.0% | 0.0% | 100.0% | 0.0% | 0.0% |
| TextCNN | 1.6% | 4.6% | **14.4%** | 5.6% | 5.1% |
| ST-MFLC | 18.2% | 21.8% | 28.1% | **1.5%** | **0.3%** |
| BiLSTM | 2.1% | 3.9% | 20.7% | 6.3% | 2.8% |
| LSTM | 1.6% | 5.2% | 17.7% | 6.3% | 5.7% |
| Simple-RNN | **1.4%** | **1.2%** | 46.3% | 15.4% | 6.2% |

As Table 6 shows, C-PsyD keeps its false positive rate low in all five categories, between 2.3% and 16.2%. TextCNN is comparable, with values between 1.6% and 5.6% in every category except Category 3 (14.4%); it is lower than C-PsyD in Categories 1, 2 and 3 but higher in Categories 4 and 5. The lowest non-zero Far values belong to Simple-RNN in Categories 1 and 2 and to ST-MFLC in Categories 4 and 5, but these models also have very low recall in those categories (Table 5), so their low Far reflects that they rarely predict those categories rather than good classification. FastText is the extreme case, with a false positive rate of 100.0% for Category 3 but 0.0% for all other categories. In summary, C-PsyD does not attain the lowest Far in any single category, but it combines a low Far in every category with high recall. Among the four models with an overall accuracy above 70%, it has the lowest Far in Category 4 and ties with BiLSTM for the lowest in Category 5.
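Eqs. (11) and (12) can be evaluated for every category from a table of true versus predicted counts. The following is a minimal sketch, not the authors' evaluation code: the function name `recall_and_far` and all counts are illustrative assumptions. The second matrix describes a classifier that assigns every text to one category; that category receives a recall of 100% together with a Far of 100%, which is the pattern of the FastText rows in Tables 5 and 6.

```python
def recall_and_far(cm):
    """Return a (recall, Far) pair per category.

    cm[i][j] is the number of texts whose true category is i and whose
    predicted category is j (an assumed layout, rows = true labels).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(k):
        tp = cm[c][c]                               # TP: correctly predicted as c
        p = sum(cm[c])                              # P: texts that truly belong to c
        fp = sum(cm[r][c] for r in range(k)) - tp   # FP: predicted c, belong elsewhere
        n = total - p                               # N: texts that do not belong to c
        recall = tp / p if p else 0.0               # Eq. (11)
        far = fp / n if n else 0.0                  # Eq. (12)
        results.append((recall, far))
    return results


# Hypothetical 3-category counts, not taken from the paper.
toy_cm = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 1, 9],
]
toy_metrics = recall_and_far(toy_cm)

# Hypothetical degenerate classifier: every text is predicted as the last category.
degenerate_cm = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
degenerate_metrics = recall_and_far(degenerate_cm)

for name, metrics in (("toy", toy_metrics), ("degenerate", degenerate_metrics)):
    for index, (recall, far) in enumerate(metrics, start=1):
        print(f"{name} category {index}: recall={recall:.1%} Far={far:.1%}")
```

Reading recall and Far together, as the text does, is what exposes the degenerate case: neither number alone separates a good classifier from one that has collapsed onto a single category.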
5.3 Confusion Matrices

A confusion matrix is an analytical chart in machine learning that summarizes, in matrix form, the categories predicted by a model against the true categories of the data. By observing the confusion matrix of a model, we can easily see how it classifies each category.

Figure 15 presents the confusion matrix for the C-PsyD model. It shows that the model tends to confuse Category 2 and Category 3 on this dataset. Table 1 helps explain why C-PsyD confuses these two categories, which cover family and emotional issues. Empirically, family problems often co-occur with emotional issues, and emotional issues may in turn be related to family situations; this correlation may lead C-PsyD to misclassify instances from these categories. In addition, C-PsyD is prone to misclassifying Category 5 as Categories 1, 2, 3, and 4. As seen in Table 1, Category 5 represents personality and relationship issues, which are easily intertwined with other problems in life. For example, low self-esteem, a personality issue, could be rooted in family problems. The overlap of these issues in real life might therefore cause C-PsyD to misclassify instances from Category 5. In summary, the confusion matrix shows that C-PsyD has difficulty distinguishing categories that are intrinsically related, such as family and emotional issues, as well as personality and relationship issues that are often entangled with other life problems.

Figure 16 presents the confusion matrix for the FastText model, which shows poor classification capability.
The FastText model assigns all instances to Category 3, which suggests that it has failed to learn the patterns or features needed to distinguish between the categories in the dataset. Its inability to correctly classify any instance from Categories 1, 2, 4, and 5 is a significant limitation on this task. Overall, the confusion matrix shows that the classification ability of FastText is severely impaired.

Figure 17 presents the confusion matrix for the TextCNN model. It shows that the model classifies relatively well with a simple structure, but a few issues are worth mentioning. The model tends to confuse Category 2 with Category 3, as shown by the number of misclassified instances between these two categories. In addition, its performance on Category 1 is somewhat weaker than on other categories. One possible explanation for its weaker performance on Categories 1, 2, and 3 compared with C-PsyD is that these categories may require contextual semantic information for accurate classification. TextCNN cannot extract such contextual information, which may contribute to its lower performance in these categories.
In summary, TextCNN classifies relatively well with a simple structure and a low level of misclassification, but it struggles with categories that may require contextual semantic information, which leads to weaker performance than C-PsyD on Categories 1, 2, and 3.

Figure 18 presents the confusion matrix for the ST-MFLC model. It shows that the model has some classification ability, but its performance is subpar. Misclassification is high across all categories, particularly between Categories 1, 2, and 3, which suggests that ST-MFLC struggles to distinguish instances from these categories. The model is also considerably weaker in classifying Categories 1 and 3 than other categories, which may indicate that it fails to capture the characteristic features of these categories. In summary, ST-MFLC exhibits some classification capability, but its performance is notably inferior to that of the other models.

Figure 19 presents the confusion matrix for the BiLSTM model. It shows that the model has relatively good overall performance, although it does not quite match C-PsyD.
BiLSTM performs well in some respects but falls short in classifying Categories 1 and 5, which suggests room for improvement in its ability to distinguish between categories.

Figure 20 presents the confusion matrix for the LSTM model. It shows that the model has a certain degree of classification ability, with some limitations. Its overall performance is inferior to that of BiLSTM, as shown by misclassified instances across several categories. In particular, the model has difficulty classifying Categories 2 and 5, and it also struggles to distinguish between other categories.

Figure 21 presents the confusion matrix for the Simple-RNN model. It shows that the model's classification capability is limited: although it has some classification ability, it has not learned to classify instances of the various categories accurately. One potential explanation lies in the inherent limitations of the Simple-RNN architecture, which often struggles to handle long sequences and to retain information over long spans. This may prevent the model from capturing the complex relationships and patterns in the data.
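The category pairs that a model confuses most often can be read off a confusion matrix by ranking its off-diagonal cells. The following is a minimal sketch, assuming rows are true categories and columns are predicted categories; the counts are hypothetical and are not taken from Figs. 15 to 21, and `most_confused_pairs` is an illustrative name.

```python
def most_confused_pairs(cm, labels, top=3):
    """Rank off-diagonal cells of a confusion matrix by count.

    Returns up to `top` tuples (count, true_label, predicted_label),
    largest count first.
    """
    pairs = []
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            if i != j and count > 0:   # skip correct predictions and empty cells
                pairs.append((count, labels[i], labels[j]))
    pairs.sort(key=lambda item: item[0], reverse=True)
    return pairs[:top]


labels = [f"Category {k}" for k in range(1, 6)]

# Hypothetical counts chosen so that Categories 2 and 3 are confused most often.
example_cm = [
    [40, 3, 5, 1, 1],
    [2, 35, 14, 2, 2],
    [1, 9, 80, 3, 2],
    [0, 2, 6, 30, 2],
    [4, 5, 7, 4, 25],
]

top_pairs = most_confused_pairs(example_cm, labels)
for count, true_label, predicted_label in top_pairs:
    print(f"{true_label} predicted as {predicted_label}: {count} texts")
```

Ranking the cells in this way gives a quick, model-independent check of statements such as "the model tends to confuse Category 2 and Category 3" before interpreting them with the category definitions.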
In short, the confusion matrix obtained by C-PsyD reflects the most acceptable classification of the seven models. Taken together, the above experimental results show that C-PsyD is generally better than the comparative models.

6. Conclusions

Considering the increasing number of people suffering from mental illnesses and the scarcity of medical resources, a novel Chinese text classification model for detecting psychological issues, C-PsyD, has been proposed to improve the efficiency of doctors and the utilization of medical resources. C-PsyD uses text features obtained from TextCNN and Self-Attention to guide the Attention module in generating attention weights. These attention weights are combined with the text features produced by BiGRU to obtain the final text features output by C-PsyD. Experiments were conducted on a Chinese psychological text dataset shared on GitHub. In these experiments C-PsyD outperformed all six competitors in overall accuracy: its accuracy is 79.5%, higher than TextCNN (78.2%), BiLSTM (76.4%), LSTM (74.9%), Simple-RNN (55.7%), FastText (50.1%), and ST-MFLC (44.8%). These results indicate that the proposed psychological text classification model is feasible and effective.

C-PsyD as a whole is relatively complex and requires high computational capability. Graphics cards or AI chips are commonly used for acceleration, but this increases the cost of using C-PsyD and may make the model difficult to deploy on lower-capacity mobile devices. In the future, we will continue to simplify the model to facilitate its deployment on mobile devices such as smartphones. Moreover, C-PsyD lacks the ability to predict complex psychological problems; in practice, many individuals have more than one type of psychological issue. To address this challenge, we will continue to collect datasets and improve C-PsyD.
C-PsyD can also be integrated with other systems, which would make it more useful in practice. For instance, C-PsyD can be extended to diagnose psychological disorders based on users' descriptive texts, then recommend suitable doctors and use chatbots to guide users, enhancing the efficiency of medical resources. Making the model accessible to anyone is also a major focus of our future work.

Declarations

Conflict of Interest statement

All authors disclosed no relevant relationships.

Author Contribution

C was responsible for proofreading and polishing the document, as well as writing the content for the experimental section, while Y mainly conducted the experiments and drafted an initial version.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 62062011, by the Guangxi Natural Science Foundation under Grant No. 2019GXNSFAA185017 and by the Autonomous Region Level College Students' Innovation and Entrepreneurship Practice Project under Grant No. 202110608211. The authors would like to thank the editors and the anonymous reviewers for their kind assistance, constructive comments and recommendations, which have significantly improved the presentation of this paper. We would like to express our appreciation to those who share the psychological dataset used in this paper.

References

Haddad C et al (2021) Variation of psychological and anthropometrics measures before and after dieting and factors associated with body dissatisfaction and quality of life in a Lebanese clinical sample. BMC Psychol 9(1):1–13
Yunusa I, El Helou ML (2020) The use of risperidone in behavioral and psychological symptoms of dementia: a review of pharmacology, clinical evidence, regulatory approvals, and off-label use. Front Pharmacol 11:596
Eisenbeck N et al (2022) An international study on psychological coping during COVID-19: Towards a meaning-centered coping style. Int J Clin Health Psychol 22(1):100256
Belhadi A et al (2023) Fast and Accurate Framework for Ontology Matching in Web of Things. ACM Trans Asian Low-Resource Lang Inform Process
Hasan MK et al (2021) Fischer linear discrimination and quadratic discrimination analysis–based data mining technique for internet of things framework for Healthcare. Front Public Health: 1354
Ahmed U et al (2022) Explainable deep Attention active learning for sentimental analytics of mental disorder. ACM Trans Asian Low-Resource Lang Inform Process
Sarkar A, Singh A, Chakraborty R (2022) A deep learning-based comparative study to track mental depression from EEG data. Neurosci Inf: 100039
Thakre TP et al (2022) Polysomnographic identification of anxiety and depression using deep learning. J Psychiatr Res 150:54–63
Madan S et al (2022) Deep Learning-based detection of psychiatric attributes from German mental health records. Int J Med Informatics 161:104724
Burdisso SG, Errecalde M, Montes-y-Gómez M (2019) A text classification framework for simple and effective early depression detection over social media streams. Expert Syst Appl 133:182–197
Sabri T, Beggar OE, Kissi M (2022) Comparative study of Arabic text classification using feature vectorization methods. Procedia Comput Sci 198:269–275
Lu J et al (2022) Photocatalytic H2 evolution properties of K0.5Na0.5NbO3 (KNN) with halloysite nanotubes. Opt Mater 129:112516
Yang Y et al (2022) Contrastive Graph Convolutional Networks with adaptive augmentation for text classification. Inf Process Manag 59(4):102946
Romero R et al (2022) MobyDeep: A lightweight CNN architecture to configure models for text classification. Knowl Based Syst 257:109914
Banerjee I et al (2019) Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification. Artif Intell Med 97:79–88
Zaporojets K et al (2021) Solving arithmetic word problems by scoring equations with recursive neural networks. Expert Syst Appl 174:114704
Shi M et al (2022) Genetic-gnn: evolutionary architecture search for graph neural networks. Knowl Based Syst 247:108752
Guo B et al (2019) Improving text classification with weighted word embeddings via a multi-channel TextCNN model. Neurocomputing 363:366–374
Cho K et al (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078
Wang M et al (2022) Chinese power dispatching text entity recognition based on a double-layer BiLSTM and multi-feature fusion. Energy Rep 8:980–987
Zhou Q, Liu X, Wang Q (2021) Interpretable duplicate question detection models based on Attention mechanism. Inf Sci 543:259–272
Peer D et al (2022) Greedy-layer pruning: Speeding up transformer models for natural language processing. Pattern Recognit Lett 157:76–82
Zhang X et al (2022) Pre-hospital emergency text classification model based on label confusion. J Comput Appl: 0
Orhan U, Tulu CN (2021) A novel embedding approach to learn word vectors by weighting semantic relations: SemSpace. Expert Syst Appl 180:115146
Jia K (2022) Sentiment classification of microblog: A framework based on BERT and CNN with Attention mechanism. Comput Electr Eng 101:108032
Ghorbanali A, Sohrabi MK, Yaghmaee F (2022) Ensemble transfer learning-based multimodal sentiment analysis using weighted convolutional neural networks. Inf Process Manag 59(3):102929
Vaswani A et al (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473
Khasanah IN (2021) Sentiment classification using fasttext embedding and deep learning model. Procedia Comput Sci 189:343–350
Amalou I, Mouhni N, Abdali A (2022) Multivariate time series prediction by RNN architectures for energy consumption forecasting. Energy Rep 8:1084–1091
Arbane M et al (2023) Social media-based COVID-19 sentiment classification model using Bi-LSTM. Expert Syst Appl 212:118710
Cao N et al (2022) A deceptive reviews detection model: Separated training of multi-feature learning and classification. Expert Syst Appl 187:115977
Zhu E et al (2022) N-gram MalGAN: Evading machine learning detection via feature n-gram. Digit Commun Networks 8(4):485–491
Wang Q (2022) Malicious code classification based on opcode sequences and textCNN network. J Inform Secur Appl 67:103151
Souquet L et al (2023) Convolutional neural network architecture search based on fractal decomposition optimization algorithm. Expert Syst Appl 213:118947

Additional Declarations

No competing interests reported.
Simple-RNN\u003c/p\u003e","description":"","filename":"21.png","url":"https://assets-eu.researchsquare.com/files/rs-5337854/v1/2216620614418ce4367a7008.png"},{"id":74066789,"identity":"c39245fa-5051-4354-9823-c28ec53cb41e","added_by":"auto","created_at":"2025-01-17 12:17:25","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1367097,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-5337854/v1/310abd99-1943-4c45-a55f-7e5d9b736b2b.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"C-PsyD: A Chinese text classification model for detecting psychological problems","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003ePeople are more likely to have mental health problems in today's society. People's poor eating habits can lead to mental health problems from the perspective of daily life\u003csup\u003e[1]\u003c/sup\u003e. Many diseases, such as dementia, can cause psychological problems from the perspective of physical diseases[2]. As a global epidemic disease, COVID-19 has inevitably had an impact on people's mental health in various countries and regions since the outbreak of the epidemic. The global crisis caused by COVID-19 has also affected the mental health of many people[3]. Literature[3] also shows that students, single people and people who live with a large number of other people have a higher risk of poor mental health. In other words, students \u0026nbsp;are more likely to have psychological problems related to COVID-19. There are many people with psychological problems. However, at present, the general public in China lacks sufficient knowledge about psychological problems, and there are relatively few medical resources related to psychology. Therefore, ordinary people often don't go to see a psychologist to avoid getting embarrassed. 
Especially now that medical resources across the country are strained to varying degrees, we need to find a way to save medical resources and improve the efficiency of using medical resources for psychological problems.\u0026nbsp;Now that the Internet of Things (IoT)\u003csup\u003e[4]\u003c/sup\u003etechnology is advanced, there is a new trend to integrate healthcare systems with the IoT and the potential for IoT healthcare applications is enormous\u003csup\u003e[5]\u003c/sup\u003e. As described in the literature[6], with the increasing use of online media, Internet-delivered psychotherapies \u0026nbsp;are becoming an important tool for improving mental health conditions.\u0026nbsp;Based on the above background, this paper will combine some existing technologies to design a model and make it play a role in the medical system of the Internet of Things.\u003c/p\u003e\n\u003cp\u003eDeep learning, as one of the most popular artificial intelligence technologies, has found numerous applications in the medical field. For instance, Reference [7] utilized deep learning to detect depression based on EEG data of patients, while Reference [8] employed it to distinguish between anxiety and depression. Reference [9] developed a system that automatically detects psychological diseases by analyzing patient data through text classification technology. Similarly, Reference [10] identified mental illnesses by examining user data on social media platforms using text classification.\u003c/p\u003e\n\u003cp\u003eDespite its potential, the integration of deep learning with psychotherapy in online diagnosis and treatment is rare, and research combining these two elements to improve medical resource utilization is scarce. Thus, we aim to apply deep learning to psychotherapy. Diagnosing psychological diseases using deep learning technology essentially involves a text classification task, which is common in natural language processing (NLP). 
Traditional machine learning algorithms, such as support vector machine (SVM) [11] and K-nearest neighbor (K-NN) [12], can also perform text classification. However, these algorithms rely heavily on manually generated features, resulting in high labor costs and low efficiency [13]. In contrast, deep learning-based algorithms can continually update text vectors through computer learning, eliminating such drawbacks.\u003c/p\u003e\n\u003cp\u003eVarious neural networks are used for text classification, including convolutional neural network (CNN) [14], recurrent neural network (RNN) [15], recursive neural network (ReNN) [16], and graph neural network (GNN) [17]. CNN-based models, such as MobyDeep proposed in Reference [14], excel at long text classification. TextCNN, as described in Reference [18], improved long text classification by treating long text as an image and applying convolution operations. However, TextCNN is less suitable for short text with high character correlation, whereas RNN can extract semantic features from the entire text for classification.\u003c/p\u003e\n\u003cp\u003eTraditional RNNs suffer from gradient vanishing and gradient explosion issues. Long short-term memory (LSTM) is a special RNN model that addresses these problems and outperforms traditional RNNs. Gated recurrent unit (GRU), a LSTM variant introduced by Cho in 2014 [19], is often superior to LSTM in specific classification tasks. Numerous RNN models have been proposed, such as the bidirectional long short-term memory (BiLSTM) model for power dispatching classification in Reference [20], which demonstrated excellent classification capabilities.\u003c/p\u003e\n\u003cp\u003eWith the emergence of the Attention mechanism [21] and Transformer model [22], an increasing number of text classification models have been developed and applied. 
Reference [23] suggested acquiring dynamic character vectors [24] of a text using BERT [25], a pre-trained model, and constructing text representation vectors with BiLSTM, weighted convolution [26], and Attention mechanism. However, BERT's excessive parameters lead to high training costs, and the model only supports texts with fewer than 512 characters.\u003c/p\u003e\n\u003cp\u003eIn view of the background described above, a\u0026nbsp;novel\u0026nbsp;Chinese text\u0026nbsp;classification model\u0026nbsp;named\u0026nbsp;C-PsyD\u0026nbsp;is proposed\u0026nbsp;for detecting\u0026nbsp;psychological problems in\u0026nbsp;this\u0026nbsp;paper.\u0026nbsp;We\u0026nbsp;obtained\u0026nbsp;Chinese\u0026nbsp;psychological\u0026nbsp;text dataset\u0026nbsp;shared on\u0026nbsp;GitHub, and used\u0026nbsp;the established\u0026nbsp;text classification model to\u0026nbsp;calculate\u0026nbsp;and infer\u0026nbsp;the text data to\u0026nbsp;achieve\u0026nbsp;preliminary\u0026nbsp;psychological diagnosis of\u0026nbsp;a\u0026nbsp;user.\u003c/p\u003e\n\u003cp\u003eThe contributions of this\u0026nbsp;paper\u0026nbsp;are\u0026nbsp;summarized\u0026nbsp;as follows:\u003c/p\u003e\n\u003cp\u003ea.\u0026nbsp;We apply the text classification technology to the field of psychological disease detection.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eb. We propose\u0026nbsp;a\u0026nbsp;Chinese text classification model\u0026nbsp;named\u0026nbsp;C-PsyD\u0026nbsp;for detecting psychological problems.\u0026nbsp;C-PsyD\u0026nbsp;uses text features obtained from CNN and Self-Attention to guide the\u0026nbsp;Attention\u0026nbsp;module to generate\u0026nbsp;Attention\u0026nbsp;parameters. 
These\u0026nbsp;Attention\u0026nbsp;parameters\u0026nbsp;are\u0026nbsp;combined with the text features extracted by BiGRU to obtain the final text features.\u003c/p\u003e\n\u003cp\u003ec.\u0026nbsp;The experimental results verify that C-PsyD\u0026nbsp;significantly outperforms\u0026nbsp;other models such as FastText\u003csup\u003e[28]\u003c/sup\u003e, TextCNN\u003csup\u003e[28]\u003c/sup\u003e,\u0026nbsp;Simple-RNN\u003csup\u003e[29]\u003c/sup\u003e, LSTM\u003csup\u003e[29]\u003c/sup\u003e,\u0026nbsp;BiLSTM\u003csup\u003e[30]\u003c/sup\u003e and ST_MFLC\u003csup\u003e[31]\u003c/sup\u003e. This suggests that the proposed model for psychological text classification is not only feasible but also highly effective. By assisting doctors in classifying patients, detecting psychological problems at an early stage, and taking timely measures to treat them, this model can significantly improve the efficiency of medical resource utilization. Furthermore, even though these models were initially designed for specific domains, they can still be applied in other areas after being trained using datasets from different fields.\u003c/p\u003e\n\u003cp\u003eThe rest of this paper is organized as follows. First, six text classification models including FastText, TextCNN, Simple-RNN, LSTM, BiLSTM and ST_MFLC are introduced. Next, the proposed model C-PsyD is explained in detail. Then, the experiments are described and the experimental results are analysed. Finally, conclusions and future work are presented.\u003c/p\u003e"},{"header":"2. Text Classification Models","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Overall Framework of Text Classification\u003c/h2\u003e \u003cp\u003eThe main function of a text classification model is to classify text. 
There are many models available for text classification tasks, such as FastText, TextCNN, Simple-RNN, LSTM, BiLSTM and ST_MFLC. In this paper, these models are used to extract features from the text, and a fully connected layer outputs the result according to the text features. The overall framework of the text classification task is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAs can be seen from Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, after the input text is transformed into word vectors, the vectors can be fed into a model. The model extracts the text features from the input text vectors. The text features are also represented as vectors. The text feature vectors are fed into the fully connected layer, which computes the final classification result from them. In this paper, the result is also a vector, which represents the probability that each input text vector belongs to each category. In this paper, the input of every text classification model is a text vector, and the output is a text feature. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e is intended to illustrate the role of the models mentioned in the text within the whole text classification task.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2 FastText Model\u003c/h2\u003e \u003cp\u003eFastText is a fast and efficient text classification model. Its core idea is to first obtain the character vectors of sentences, then aggregate these vectors, and finally classify the sentences. The classification performance of FastText is significantly affected by the vector dimension. 
To ensure consistency with the other models in the paper, FastText obtains character vectors through embedding layers. The specific calculation is shown as\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$${m_i}=\\frac{{\\sum\\limits_{{t=0}}^{{n - 1}} {{w_{(t,i)}}} }}{n}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ewhere \u003cem\u003ew\u003c/em\u003e denotes the original text vectors, \u003cem\u003ew\u003c/em\u003e\u003csub\u003e\u003cem\u003e(t,i)\u003c/em\u003e\u003c/sub\u003e denotes the \u003cem\u003ei\u003c/em\u003e-th value in the character vector of the \u003cem\u003et\u003c/em\u003e-th character in the original text vectors, \u003cem\u003en\u003c/em\u003e represents the number of characters in the sentence, \u003cem\u003em\u003c/em\u003e\u003csub\u003e\u003cem\u003ei\u003c/em\u003e\u003c/sub\u003e denotes the \u003cem\u003ei\u003c/em\u003e-th value in the processed text vector, and \u003cem\u003em\u003c/em\u003e is the text feature extracted by the model.\u003c/p\u003e \u003cp\u003eIn the literature [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], FastText used word embedding, that is, it calculated the vector of the central word based on the central word and the n-gram algorithm\u003csup\u003e[32]\u003c/sup\u003e, and output the word vector of each word.\u003c/p\u003e \u003cp\u003eAccording to the description in the literature [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], the structure of the original FastText CNN is supplemented as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eBecause this paper uses character embedding, some changes are made to the original FastText model; that is to say, we use character 
embedding instead of word embedding in FastText. The structure of our designed FastText is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3 TextCNN Model\u003c/h2\u003e \u003cp\u003eTextCNN essentially uses CNN to classify texts. As textual data can be considered as one-dimensional data, TextCNN uses a one-dimensional CNN [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Its one-dimensional convolution operation is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTraditional TextCNN includes pooling layers. Pooling layers can be divided into max-pooling layers\u003csup\u003e[34]\u003c/sup\u003e and average-pooling (mean-pooling) layers\u003csup\u003e[35]\u003c/sup\u003e. The max-pooling layer is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, and its main idea is to select the maximum value in a range. Because the max-pooling layer discards much of the less informative content, it is suitable for texts with a lot of meaningless information.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e shows the mean-pooling layer, whose main idea is to select the mean value in a range. Since the mean-pooling layer does not directly lose information, it is appropriate for texts with less feature information.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTextCNN from the literature [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] uses word vectors to make the experimental results more reliable. However, there are too many common words in Chinese. 
If word vectors are used, there would be too many model parameters and it may be difficult to complete the experiments with ordinary equipment. Therefore, we replace word vectors with character vectors and take no account of n-gram information. The structure of TextCNN is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e2.4 Simple-RNN Model\u003c/h2\u003e \u003cp\u003eSimple-RNN is the simplest RNN model, which has only tanh operations in its cells. The structure of the Simple-RNN cell was described in detail in the literature [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. In this paper, the structure of the Simple-RNN model is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAs shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e, the input to the model is a text vector. At each time step, the model accepts a text vector together with the output of the hidden layer from the previous time step. The text vector is propagated forward through two cells and a dropout layer to get the final output. Each Simple-RNN cell is just a simple tanh operation, so it is represented only as tanh in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e2.5 LSTM Model\u003c/h2\u003e \u003cp\u003eLSTM is a unidirectional RNN model and a classical model in the field of natural language processing. LSTM introduces a long-term memory mechanism, which makes the model perform better when processing long text. In other words, LSTM models are often used for processing long texts. 
The structure of the LSTM cell was presented in detail in the literature [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]. The structure of LSTM in this paper is similar to that of Simple-RNN shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e. The essential difference between LSTM and Simple-RNN is the cell structure, that is, each LSTM cell is an LSTM-based cell, while each Simple-RNN cell is a simple tanh operation.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e2.6 BiLSTM Model\u003c/h2\u003e \u003cp\u003eBiLSTM is a bidirectional LSTM. It differs from the traditional bidirectional RNN in its cell structure. The structure of BiLSTM in this paper is derived from the literature [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e] and shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e. In short, BiLSTM is a bidirectional RNN model with LSTM cells. BiLSTM processes the sequence in both the forward and backward directions, so the result for each word is affected by the words on both sides of it, and this bidirectional mechanism can extract semantic features more accurately than LSTM.\u003c/p\u003e \u003cp\u003eIn BiLSTM, each layer contains a dropout layer, and the parameter of the dropout layer was set to 0.2. In the literature [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e], the model took the output of the last layer as the extracted text features, and then classified texts through a fully connected layer.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.7 ST_MFLC Model\u003c/h2\u003e \u003cp\u003eThe literature [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] integrated TextCNN, BiGRU and Self-Attention into one model named ST_MFLC. 
In ST_MFLC, each module extracts different text features from the input data; the features extracted by the three modules are then spliced, and the spliced output is taken as the final feature. The literature [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] describes the framework of ST_MFLC but gives no specific implementation details, so we reproduced the model and added some implementation details according to the application of this paper. The structure of ST_MFLC is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eIn Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e, Conv stands for a convolution layer. In the CNN part of the model, the input dimension of the convolutional layer is the same as the dimension of the word vector, and the final output dimension is 1. MaxPool represents the maximum pooling layer with a size equal to the sequence length. BiGRU also has two hidden layers. Dropout stands for the dropout operation, and the parameter of the dropout layer was set to 0.2. The final output consists of the features of the input data.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e, Self-Attention represents the Self-Attention layer\u003csup\u003e[27]\u003c/sup\u003e. The calculation of this layer is shown in Equations (\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e)-(\u003cspan refid=\"Equ5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). 
Note that Eq.\u0026nbsp;(\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) is the general calculation formula, where \u003cem\u003eQ\u003c/em\u003e, \u003cem\u003eK\u003c/em\u003e and \u003cem\u003eV\u003c/em\u003e represent the query, key and value in the Self-Attention mechanism, \u003cem\u003ed\u003c/em\u003e\u003csub\u003e\u003cem\u003ek\u003c/em\u003e\u003c/sub\u003e is the dimension of the character vector, that is, the vector dimension of \u003cem\u003eK\u003c/em\u003e, and \u003cem\u003eD\u003c/em\u003e stands for the text vector. \u003cem\u003eQW\u003c/em\u003e, \u003cem\u003eKW\u003c/em\u003e and \u003cem\u003eVW\u003c/em\u003e are three different matrices, whose values are the parameters of fully connected layers of the neural network in the actual code. In the Self-Attention mechanism, the text vector \u003cem\u003eD\u003c/em\u003e is multiplied by these three matrices to give three different features, \u003cem\u003eQ\u003c/em\u003e, \u003cem\u003eK\u003c/em\u003e and \u003cem\u003eV\u003c/em\u003e. Applying the operation given in Eq.\u0026nbsp;(\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) to these three features yields what is called the \"self-attention value\".\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$${\\text{SelfAtt}}(Q,K,V)={\\text{softmax}}\\left( {\\frac{{Q{K^T}}}{{\\sqrt {{d_k}} }}} \\right)V$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$Q=D \\times Q{W^{^{T}}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ4\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ4\" name=\"EquationSource\"\u003e\n$$K=D \\times K{W^{^{T}}}$$\u003c/div\u003e\u003cdiv 
class=\"EquationNumber\"\u003e4\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ5\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ5\" name=\"EquationSource\"\u003e\n$$V=D \\times V{W^{^{T}}}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e5\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003c/div\u003e"},{"header":"3. The Proposed Model C-PsyD","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003e3.1 Framework of C-PsyD Model\u003c/h2\u003e \u003cp\u003eBased on the research and application of text classification techniques at home and abroad, C-PsyD is constructed in this paper using existing techniques. C-PsyD is designed by integrating four modules, namely TextCNN, BiGRU, Self-Attention and Attention. The structure of C-PsyD is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e, the convolution layer Conv is used to extract the crucial features from the input text. The crucial features and the text semantic features extracted by BiGRU are input into the Attention layer together to obtain the new text features A. Similarly, the outputs from the Self-Attention layer and the BiGRU layer are input into the Attention layer together to get the new text features B. Finally, the text features A and B are averaged to obtain the final text features. It is worth noting that the parameters of the two Attention layers above are shared.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.2 TextCNN Module\u003c/h2\u003e \u003cp\u003eIn this model, the TextCNN module is comprised of three sub-modules, each containing a convolutional layer, a max-pooling layer, and an average-pooling layer. 
Based on the dimensions of character vectors, the input and output channel numbers of the convolutional layers are both set to 128, and the size of each pooling layer is equal to the sequence length. The input vector dimensions for each sub-module are [batch_size, seq_len, word_size]. From the description of one-dimensional convolution operations and pooling layers in Section \u003cspan refid=\"Sec5\" class=\"InternalRef\"\u003e2.3\u003c/span\u003e, we can deduce that the output vector dimensions are [batch_size, 2, word_size], since each of the two pooling layers reduces the sequence dimension to 1.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.3 BiGRU Module\u003c/h2\u003e \u003cp\u003eBiGRU is a bidirectional RNN. In this module, the number of hidden layers was fixed at 2, each hidden layer is followed by a dropout layer, and the parameters of the dropout layers were set to 0.2. The purpose of the dropout layers is to prevent the model from overfitting. The dimension of the output vector from the BiGRU layer is also [\u003cem\u003ebatch_size\u003c/em\u003e, \u003cem\u003eseq_len\u003c/em\u003e, \u003cem\u003eword_size\u003c/em\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.4 Self-Attention Module\u003c/h2\u003e \u003cp\u003eIn the C-PsyD model, Self-Attention is a module that uses the Self-Attention mechanism. The detailed calculation of the Self-Attention mechanism is shown in Equations (\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e) to (\u003cspan refid=\"Equ5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). C-PsyD uses the Self-Attention mechanism to process the input text and obtain the Self-Attention feature of the text. 
The Self-Attention feature is very important for subsequent calculation.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003e3.5 Attention Module\u003c/h2\u003e \u003cp\u003eThe Attention mechanism\u003csup\u003e[28]\u003c/sup\u003e is used more than once in our model, and its calculation is given by\u003cdiv id=\"Equ6\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ6\" name=\"EquationSource\"\u003e\n$$\\operatorname{Score} (H,S)=\\tanh ((\\operatorname{cat} (H,S).W{a^T})).W{v^T}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e6\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ7\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ7\" name=\"EquationSource\"\u003e\n$$attW=\\operatorname{softmax} (\\operatorname{Score} (H,S))$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e7\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ8\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ8\" name=\"EquationSource\"\u003e\n$$att=attW.{H^T}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e8\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eHere attW represents the attention weight, and H is the vector to be attended to, namely the output of the BiGRU module in the C-PsyD model. S is the parameter vector input into the attention layer. tanh is the activation function, and cat(H, S) is the concatenation of H and S. Wa and Wv are matrices, each implemented as a fully connected layer in the code. The purpose of Wa and Wv is to transform the high-dimensional cat(H, S) into a low-dimensional space. Equation (6) is used to calculate the attention score. Compared with reference [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], there is an additional Wv here for linear transformation of the result. 
\".\" represents the dot product. The first step is to concatenate H and S. Since H and S could have different shapes, we need to perform a linear transformation on S. We perform a linear transformation on cat(H, S) and apply the tanh operation to obtain score feature information, then perform a linear transformation again to obtain the attention score. Applying softmax to the attention score gives the attention weight attW. The attention weight and H are multiplied using the dot product to obtain the attention result. Attention is applied at two positions in C-PsyD. At the first position, the module receives the outputs of TextCNN and BiGRU and outputs a result through the attention mechanism. At the second position, it receives the outputs of Self-Attention and BiGRU and again outputs a result through the attention mechanism. The final attention result is obtained by taking the average of the two output results.\u003c/p\u003e \u003c/div\u003e"},{"header":"4. Experiments","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Experimental Environment\u003c/h2\u003e \u003cp\u003eAll the experiments were performed on an i5-11400F processor. The Linux server used an RTX3060 GPU with 12G of memory, and had 32G of RAM. The operating system of the computer is Windows 11, the version of Python is Python3.11, and the deep learning framework used in our experiments is PyTorch.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Dataset Preprocessing\u003c/h2\u003e \u003cp\u003eThe dataset for our experiments is a collection of questions and answers about psychological knowledge, which can be downloaded from \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/chatopera/efaqa-corpus-zh\u003c/span\u003e\u003cspan address=\"https://github.com/chatopera/efaqa-corpus-zh\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. It is a public dataset, and each row of it is in JSON format. 
Labels and their corresponding meanings related to the dataset are described in detail on the website. Because the number of partial label data in the public dataset is too small, some labels are sorted and merged. The merged labels are shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. The dataset can be divided into six main categories identified as 0, 1, 2, 3, 4, 5, respectively.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSome examples of the original dataset labels\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCategory\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLabel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eOriginal Content\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e未知\u003c/p\u003e \u003cp\u003e(Unknown)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e未知、其它\u003c/p\u003e \u003cp\u003e(Unknown, Others)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003e学业与事业烦恼\u003c/p\u003e \u003cp\u003e(Academic and career worries)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e学业烦恼、对未来规划的迷茫、事业和工作烦恼\u003c/p\u003e \u003cp\u003e(Academic worries, Confusion about future planning, Career and work worries)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e家庭问题\u003c/p\u003e \u003cp\u003e(Family problems)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e家庭问题和矛盾\u003c/p\u003e \u003cp\u003e(Family problems and contradictions)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e情感问题\u003c/p\u003e \u003cp\u003e(Emotional problems)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e男同性恋、女同性恋、双性恋与跨性别、情感关系问题、离婚、分手\u003c/p\u003e \u003cp\u003e(Gay, Lesbian, Bisexual and transgender, Emotional relationship issues, Divorce, Separation)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e自我生理问题\u003c/p\u003e \u003cp\u003e(Self physical problems)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e物质滥用、悲恸、失眠、压力、强迫症\u003c/p\u003e \u003cp\u003e(Substance abuse, Grief, Insomnia, Stress, Obsessive-compulsive disorder)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e性格与关系问题\u003c/p\u003e \u003cp\u003e(Personality and relationship 
problems)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e自我探索、低自尊、青春期问题、人际关系\u003c/p\u003e \u003cp\u003e(Self exploration, Low self-esteem, Adolescent problems, Interpersonal relationships)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eBecause the \u0026ldquo;Unknown\u0026rdquo; and \u0026ldquo;Others\u0026rdquo; labels in the original dataset are ambiguous, we deleted the data carrying them. Therefore, Category 0 in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e contains no samples. The processed dataset was imported into Excel and can be downloaded from \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pan.baidu.com/s/1OL5xbfWDndJa75C2PRKbUA?pwd=k44a\u003c/span\u003e\u003cspan address=\"https://pan.baidu.com/s/1OL5xbfWDndJa75C2PRKbUA?pwd=k44a\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. 
Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e presents some examples of the preprocessed dataset.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSome examples of the preprocessed dataset labels\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePreprocessed Content\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLabel Number\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e女: 听过别人最多的议论就是干啥啥不行不长心眼没有脑子\u003c/p\u003e \u003cp\u003e(Girl: The comment I hear most often from others is that I am no good at anything, careless, and brainless)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e女: 我整宿整宿睡不着经常头疼不知道自己怎么了\u003c/p\u003e \u003cp\u003e(Girl: I can't sleep all night long. I often have headaches. 
I don't know what's wrong with me)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e放不下女朋友的过去, 又不想放手, 我该怎么办\u003c/p\u003e \u003cp\u003e(What should I do if I can't get over my girlfriend's past but don't want to let her go)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e老婆与异性经常聊天是否正常？聊天语气很轻浮？\u003c/p\u003e \u003cp\u003e(Is it normal for my wife to chat frequently with the opposite sex? The tone of the chats is quite frivolous?)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e女: 我暗恋了一个男孩, 可是不知道怎样接触他？\u003c/p\u003e \u003cp\u003e(Girl: I have a secret crush on a boy, but I don't know how to approach him?)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eBecause the original dataset is small, one-tenth of the data was randomly selected as the test set, and the rest was used as the training set. 
Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e shows the sizes of the training set and test set in the six categories of the dataset.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eQuantity distribution of the dataset\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"7\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDivision of the dataset\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCategory 0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCategory 1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCategory 2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eCategory 3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eCategory 4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eCategory 
5\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNumber of test samples\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e101\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e218\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e663\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e156\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e186\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNumber of training samples\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e852\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1848\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5560\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e1478\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e1781\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Experimental Settings and Evaluation Indicators\u003c/h2\u003e \u003cp\u003eTo evaluate the proposed model, C-PsyD is compared with FastText, TextCNN, ST-MFLC, BiLSTM, Simple-RNN and LSTM. 
Note that the Chinese vocabulary is large, so using word vectors would lead to too many model parameters and greatly increase the computational cost. Therefore, we use character vectors instead of word vectors; that is, each Chinese character or symbol is treated as a word. In Chinese, each individual character carries semantic meaning, so a single character can be treated as a word, which is significantly different from English. The character vector has a feature dimension of 128 and is computed by the word embedding [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] module. The number of hidden layers for all RNN modules was set to 2, the batch size (\u003cem\u003eBatch_size\u003c/em\u003e) of all models in the experiments was fixed at 256, and the number of training epochs (\u003cem\u003eepoch\u003c/em\u003e) of all models was set to 10.\u003c/p\u003e \u003cp\u003eWe chose the loss function value (\u003cem\u003eLoss\u003c/em\u003e) on the training set, the overall accuracy (\u003cem\u003eACC\u003c/em\u003e), the recall (\u003cem\u003erecall\u003c/em\u003e) and the false positive rate (\u003cem\u003eFar\u003c/em\u003e) as the main indicators for evaluating each model. These four indicators are widely used in the literature and were recorded during the experiments.\u003c/p\u003e \u003cp\u003e \u003cem\u003eLoss\u003c/em\u003e indicates how fast the model converges; it guides the updating of the model parameters and reflects the cost of training the model. 
\u003cem\u003eACC\u003c/em\u003e illustrates the training effect of the model more intuitively.\u003c/p\u003e \u003cp\u003eAs shown in Eq.\u0026nbsp;(\u003cspan refid=\"Equ9\" class=\"InternalRef\"\u003e9\u003c/span\u003e), \u003cem\u003eLoss\u003c/em\u003e is calculated with the cross-entropy function, where \u003cem\u003ep\u003c/em\u003e is the true distribution, \u003cem\u003eq\u003c/em\u003e is the predicted distribution and \u003cem\u003ex\u003c/em\u003e is the index over which the sum runs.\u003cdiv id=\"Equ9\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ9\" name=\"EquationSource\"\u003e\n$$Loss=\\sum\\limits_{x} {p(x).\\log (\\frac{1}{{q(x)}})}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e9\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e \u003cem\u003eLoss\u003c/em\u003e measures the error between the predicted values and the actual values. The smaller \u003cem\u003eLoss\u003c/em\u003e is, the better the model has converged. 
\u003cem\u003eLoss\u003c/em\u003e is obtained with the cross-entropy function, which better describes the difference between the probability distributions of the predicted and actual values.\u003c/p\u003e \u003cp\u003eBecause the task in our experiments is multi-class rather than binary classification, the accuracy formula is slightly different: \u003cem\u003eA\u003c/em\u003e\u003csub\u003e\u003cem\u003ei\u003c/em\u003e\u003c/sub\u003e denotes the number of samples labeled as Category \u003cem\u003ei\u003c/em\u003e, \u003cem\u003eT\u003c/em\u003e\u003csub\u003e\u003cem\u003ei\u003c/em\u003e\u003c/sub\u003e refers to the number of samples correctly classified as Category \u003cem\u003ei\u003c/em\u003e and \u003cem\u003eACC\u003c/em\u003e is calculated by\u003cdiv id=\"Equ10\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ10\" name=\"EquationSource\"\u003e\n$$ACC=\\frac{{\\sum\\limits_{{i=1}}^{5} {{T_i}} }}{{\\sum\\limits_{{i=1}}^{5} {{A_i}} }}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e10\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eFor each category, we treat its own samples as positive samples and the samples of all other categories as negative samples. For example, when predicting Category 1, samples labeled as Category 1 are positive samples (represented by \u003cem\u003eP\u003c/em\u003e), and samples labeled as Categories 2, 3, 4 and 5 are negative samples (represented by \u003cem\u003eN\u003c/em\u003e). A positive sample that the model predicts as positive is a true positive (\u003cem\u003eTP\u003c/em\u003e). A positive sample that the model predicts as negative is a false negative (\u003cem\u003eFN\u003c/em\u003e). Similarly, a negative sample that the model predicts as negative is a true negative (\u003cem\u003eTN\u003c/em\u003e). 
A negative sample that the model predicts as positive is a false positive (\u003cem\u003eFP\u003c/em\u003e).\u003c/p\u003e \u003cp\u003eThe recall (\u003cem\u003erecall\u003c/em\u003e) of each model is given by Eq.\u0026nbsp;(\u003cspan refid=\"Equ11\" class=\"InternalRef\"\u003e11\u003c/span\u003e). Recall is the proportion of positive samples that are predicted as positive. The higher the recall, the better the model classifies that category.\u003cdiv id=\"Equ11\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ11\" name=\"EquationSource\"\u003e\n$$recall=\\frac{{TP}}{P}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e11\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eHowever, a high \u003cem\u003erecall\u003c/em\u003e for a certain category does not by itself mean that the model classifies this category well: the model may simply assign all samples to that one category. Therefore, the false positive rate (\u003cem\u003eFar\u003c/em\u003e) of each model for each category also needs to be analysed. \u003cem\u003eFar\u003c/em\u003e is the proportion of texts not belonging to a certain category that are predicted as that category. The calculation formula is shown in Eq.\u0026nbsp;(\u003cspan refid=\"Equ12\" class=\"InternalRef\"\u003e12\u003c/span\u003e). 
The lower the value of this indicator, the better the model classifies.\u003cdiv id=\"Equ12\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ12\" name=\"EquationSource\"\u003e\n$$Far=\\frac{{FP}}{N}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e12\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eFor example, \u003cem\u003eFar\u003c/em\u003e for Category 1 is the proportion of samples from Categories 2, 3, 4 and 5 that are predicted as Category 1.\u003c/p\u003e \u003c/div\u003e"},{"header":"5. Experimental Results Analysis","content":"\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e5.1 Training Results Analysis\u003c/h2\u003e \u003cp\u003eThe change in \u003cem\u003eLoss\u003c/em\u003e of each model during training shows its convergence more intuitively. The lower \u003cem\u003eLoss\u003c/em\u003e is, the better the model has converged. Plotting the raw data directly produces a cluttered image. Therefore, the \u003cem\u003eLoss\u003c/em\u003e curves shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003e are plotted from the smoothed data.\u003c/p\u003e \u003cp\u003eFig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003e shows that the loss values of all seven models decrease gradually. However, the loss curve of FastText fluctuates greatly, indicating its weak learning ability. Although the loss of ST-MFLC decreases more than that of FastText, it decreases less than that of the other models, indicating slightly inferior performance. Simple-RNN reduces the loss significantly more than ST-MFLC. The loss reductions of BiLSTM and LSTM are quite similar. TextCNN and C-PsyD both reduce the loss much more than all other models. 
In particular, C-PsyD shows the largest and fastest loss reduction of all the models.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAs illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig13\" class=\"InternalRef\"\u003e13\u003c/span\u003e, the accuracy (ACC) of each model increases gradually as the number of training steps grows. The slow growth of FastText's accuracy indicates its poor performance; Simple-RNN performs better than FastText but is still unsatisfactory. The ACC curves of LSTM and BiLSTM are very similar, with BiLSTM performing slightly better. ST-MFLC shows a larger increase in ACC than BiLSTM, but is still inferior to TextCNN and C-PsyD. Overall, C-PsyD shows the fastest and largest increase in ACC.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig14\" class=\"InternalRef\"\u003e14\u003c/span\u003e shows the performance of each model on the test set after each epoch of training. C-PsyD performs best throughout, and its advantage over the other models is largest in the 3rd epoch. Although TextCNN comes close to C-PsyD, it still falls short. This is because, in the psychological classification dataset, most samples can be classified from keywords alone, while a small portion requires semantic information for classification. 
TextCNN, however, has limited capability for extracting semantic information.\u003c/p\u003e \u003cp\u003eIn contrast, RNN models such as BiLSTM can extract semantic information well, but their training time is relatively long, and they typically require more iterations to reach satisfactory performance. The accuracy of ST-MFLC also improves progressively, and the semantic vectors extracted by its self-attention mechanism are important for classification. However, the model cannot show its potential advantages here, because this experiment uses character vectors and the dataset has limited need for context-based semantic classification. FastText does not work well with character vectors, which results in relatively poor performance throughout the experiment. Taking all factors into consideration, C-PsyD is the best-performing model in these training experiments.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e5.2 Test Results Analysis\u003c/h2\u003e \u003cp\u003eAfter training, each model was evaluated on the test set. The test metrics are \u003cem\u003eACC\u003c/em\u003e, \u003cem\u003eRecall\u003c/em\u003e and \u003cem\u003eFar\u003c/em\u003e, whose meaning and calculation have been described above.\u003c/p\u003e \u003cp\u003eAccording to Eq.\u0026nbsp;(\u003cspan refid=\"Equ10\" class=\"InternalRef\"\u003e10\u003c/span\u003e), the overall accuracies obtained by the seven models on the psychological distress type test set are shown in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. The higher the value of \u003cem\u003eACC\u003c/em\u003e, the better the model classifies. 
The best \u003cem\u003eACC\u003c/em\u003e is marked in bold in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eOverall accuracies obtained by the seven models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cem\u003eACC\u003c/em\u003e\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eC-PsyD\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003e79.5%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFastText\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e50.1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTextCNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e78.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eST-MFLC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e44.8%\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBiLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e76.4%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e74.9%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSimple-RNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e55.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eAs Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e shows, C-PsyD outperforms all other models, achieving the highest accuracy of 79.5%. TextCNN also reaches a relatively high accuracy of 78.2%, but still falls short of C-PsyD.\u003c/p\u003e \u003cp\u003eThe other models, ST-MFLC, BiLSTM, LSTM, and Simple-RNN, reach lower accuracies, ranging from 44.8% to 76.4%. FastText reaches only 50.1%, and ST-MFLC has the lowest accuracy of the seven models at 44.8%. 
In summary, the results outlined in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e unequivocally showcase the superior performance of the C-PsyD model in comparison to the other six models, solidifying its status as the most outstanding model in this experiment.\u003c/p\u003e \u003cp\u003eAccording to Eq.\u0026nbsp;(\u003cspan refid=\"Equ11\" class=\"InternalRef\"\u003e11\u003c/span\u003e), the recall results obtained by the seven models for the five categories are shown in Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eRecall results obtained by the seven models for the five categories\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCategory 1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCategory 2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c4\"\u003e \u003cp\u003eCategory 3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eCategory 4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eCategory 5\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eC-PsyD\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003e79.2%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e64.2%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003e91.9%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e73.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e59.1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFastText\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e100.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTextCNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e68.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e56.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e90.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e73.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e68.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eST-MFLC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e72.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e50.9%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e59.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e4.5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e4.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBiLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e66.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e46.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e93.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e78.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e54.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e55.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e49.1%\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e89.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e80.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e60.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSimple-RNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5.9%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e91.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e48.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e23.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eAs Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e shows, C-PsyD achieves the highest recall in Categories 1 and 2 (79.2% and 64.2%), and its recall across the five categories ranges from 59.1% to 91.9%. In Category 3 it is exceeded by BiLSTM (93.2%) and by the degenerate 100.0% of FastText, in Category 4 by LSTM (80.8%), BiLSTM (78.8%) and TextCNN (73.7%), and in Category 5 by TextCNN (68.8%) and LSTM (60.8%).\u003c/p\u003e \u003cp\u003eTextCNN exhibits relatively high recall for most categories, with values between 56.4% and 90.6%, but falls short of C-PsyD in Categories 1, 2 and 3. The other models, ST-MFLC, BiLSTM, LSTM, and Simple-RNN, show widely varying recall across categories. 
FastText, in particular, exhibits an extreme imbalance in performance, with a recall rate of 100.0% for Category 3 but 0.0% for all other categories.\u003c/p\u003e \u003cp\u003eIn summary, the results in Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e provide evidence of the strong performance of the C-PsyD model in comparison to the other six models: its lowest per-category recall (59.1%) is higher than the lowest per-category recall of any competing model.\u003c/p\u003e \u003cp\u003eAccording to Eq.\u0026nbsp;(\u003cspan refid=\"Equ11\" class=\"InternalRef\"\u003e11\u003c/span\u003e), Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e shows the \u003cem\u003eFar\u003c/em\u003e (false positive rate) values obtained by the seven models for the five categories. Because a rate of 0% is not informative here, the bold values mark the best result in each category excluding 0%.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eFalse positive rate results obtained by the seven models for the five categories\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e 
\u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCategory 1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCategory 2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCategory 3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eCategory 4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eCategory 5\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eC-PsyD\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e2.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e16.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e4.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e2.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFastText\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e100.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTextCNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e14.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e5.1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eST-MFLC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e18.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e21.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e28.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1.5%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBiLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e2.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.9%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e20.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e6.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e2.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eLSTM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.6%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e17.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e6.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e5.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSimple-RNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e46.3%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e15.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e6.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eAs shown in Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e, the C-PsyD model maintains relatively low false positive rates across all five categories, ranging from 2.3% to 16.2%, which is competitive with the other models. This result reflects the ability of the C-PsyD model to limit classification errors.\u003c/p\u003e \u003cp\u003eWhile TextCNN exhibits relatively low false positive rates for most categories, with values between 1.6% and 5.6%, it is still surpassed by C-PsyD in Categories 4 and 5 (4.1% versus 5.6%, and 2.8% versus 5.1%). 
Other models, such as ST-MFLC, BiLSTM, LSTM, and Simple-RNN, show false positive rates that vary widely across categories; for example, ST-MFLC reaches 18.2%, 21.8%, and 28.1% for Categories 1 to 3, and Simple-RNN reaches 46.3% for Category 3. FastText, in particular, exhibits an extreme imbalance in performance, with a false positive rate of 100.0% for Category 3 but 0.0% for all other categories.\u003c/p\u003e \u003cp\u003eIn summary, the results in Table\u0026nbsp;\u003cspan refid=\"Tab6\" class=\"InternalRef\"\u003e6\u003c/span\u003e indicate that the C-PsyD model keeps false positive rates comparatively low in every category, although it does not achieve the lowest rate in each one; for example, TextCNN is lower in Categories 1 to 3.\u003c/p\u003e \u003cp\u003eTherefore, it can be concluded that the C-PsyD model performs competitively in terms of false positive rate compared to the other models.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec23\" class=\"Section2\"\u003e \u003ch2\u003e5.3 Confusion Matrices\u003c/h2\u003e \u003cp\u003eA confusion matrix is a table used in machine learning that summarizes, in matrix form, the categories predicted by a model against the true categories of the data. By inspecting the confusion matrix of a model, we can readily see which categories it classifies correctly and which it confuses.\u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig15\" class=\"InternalRef\"\u003e15\u003c/span\u003e presents the confusion matrix for the C-PsyD model. The matrix shows that the model tends to confuse Category 2 and Category 3 when classifying the given dataset. Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, which defines these two categories as family and emotional issues, helps explain why C-PsyD confuses them. 
Based on empirical knowledge, family problems often co-occur with emotional issues, and emotional issues may also be related to family situations. This correlation may lead the C-PsyD model to misclassify instances from these categories.\u003c/p\u003e \u003cp\u003eAdditionally, C-PsyD is prone to misclassifying Category 5 as Categories 1, 2, 3, and 4. As seen in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, Category 5 represents personality and relationship issues, which can easily be intertwined with other problems in life. For example, if an individual experiences low self-esteem, a personality issue, this problem could be rooted in family problems. Thus, the overlap and interconnectedness of these issues in real life might cause the C-PsyD model to misclassify instances from Category 5.\u003c/p\u003e \u003cp\u003eIn summary, the confusion matrix analysis reveals that the C-PsyD model has difficulty distinguishing between categories that are intrinsically related, such as family and emotional issues, as well as personality and relationship issues that are often entangled with other life problems.\u003c/p\u003e\u003cp\u003eFigure \u003cspan refid=\"Fig16\" class=\"InternalRef\"\u003e16\u003c/span\u003e presents the confusion matrix for the FastText model. The matrix makes clear that the model demonstrates poor classification capabilities. Remarkably, the FastText model has assigned all instances to Category 3, suggesting that the model has failed to learn how to classify instances across the various categories. This finding indicates that FastText has not grasped the underlying patterns or features needed to distinguish between the different categories present in the dataset. 
The model's inability to correctly classify instances from Categories 1, 2, 4, and 5 highlights a significant limitation in the FastText model's performance on this particular task. Overall, the confusion matrix analysis reveals that the FastText model's classification ability is severely impaired, as it assigns all instances to a single category.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig17\" class=\"InternalRef\"\u003e17\u003c/span\u003e presents the confusion matrix for the TextCNN model. The matrix shows that the model demonstrates relatively good classification capabilities with a simple structure. Despite its good overall performance, there are a few issues worth mentioning. The model tends to confuse Category 2 with Category 3, as evidenced by the number of misclassified instances between these two categories. Additionally, the classification performance for Category 1 is somewhat weaker than for other categories, suggesting that the model struggles to accurately differentiate instances from Category 1. One possible explanation for the weaker performance on Categories 1, 2, and 3 compared to the C-PsyD model is that these categories may require contextual semantic information for accurate classification. The TextCNN model, however, has limited ability to extract such contextual information, which may contribute to its lower performance in these categories.\u003c/p\u003e \u003cp\u003eIn summary, while the TextCNN model exhibits relatively good classification capabilities with a simple structure and low levels of misclassification, it struggles with certain categories that may require contextual semantic information for accurate classification. 
This limitation leads to a comparatively weaker performance for Categories 1, 2, and 3 when compared to the C-PsyD model.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig18\" class=\"InternalRef\"\u003e18\u003c/span\u003e presents the confusion matrix for the ST-MFLC model. The matrix shows that the model possesses a certain level of classification ability, albeit with subpar performance. The model exhibits several issues, including a high degree of misclassification across all categories, particularly between Categories 1, 2, and 3. This suggests that the ST-MFLC model struggles to accurately distinguish between instances from these categories, resulting in poor overall performance. Additionally, the model demonstrates considerably weaker performance in classifying Categories 1 and 3 compared to other categories, which may indicate that the model fails to capture the distinguishing features of these categories, leading to a higher number of misclassifications.\u003c/p\u003e \u003cp\u003eIn summary, while the ST-MFLC model exhibits some classification capabilities, its performance is notably inferior to other models. The high degree of misclassification across all categories, as well as the model's weaker performance in classifying Categories 1 and 3, highlights its limitations in distinguishing between different categories.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig19\" class=\"InternalRef\"\u003e19\u003c/span\u003e presents the confusion matrix for the BiLSTM model. The matrix shows that the model possesses a certain level of classification ability, with relatively good overall performance. However, the effectiveness of the BiLSTM model does not quite match that of the C-PsyD model. 
Although the BiLSTM model demonstrates commendable performance in certain aspects, it falls short in accurately classifying Categories 1 and 5. This suggests that there is room for improvement in the model's ability to distinguish between different categories.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig20\" class=\"InternalRef\"\u003e20\u003c/span\u003e presents the confusion matrix for the LSTM model. The matrix shows that the model possesses a certain degree of classification ability, albeit with some limitations. In comparison to the BiLSTM model, the overall performance of the LSTM model is inferior, as evidenced by the misclassification of instances across several categories. More specifically, the model encounters difficulties in accurately classifying Categories 2 and 5, and also faces challenges in distinguishing between other categories. The model's performance highlights the need for further investigation into potential improvements and alternative approaches that could enhance classification accuracy.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig21\" class=\"InternalRef\"\u003e21\u003c/span\u003e illustrates the confusion matrix for the Simple-RNN model. The matrix shows that the model's classification capability is limited, with overall poor performance in distinguishing between different categories. While the model demonstrates a certain degree of classification ability, it has not learned to accurately classify instances of the various categories. One potential explanation for this suboptimal performance lies in the inherent limitations of the Simple-RNN architecture, which often struggles with handling long sequences and retaining information over extended durations. 
This issue may result in the model's inability to capture the complex relationships and patterns present within the data, thus necessitating further exploration of alternative approaches to improve classification performance.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn short, the classification reflected by the confusion matrix obtained by C-PsyD is the most acceptable among the seven models.\u003c/p\u003e \u003cp\u003eFrom the above experimental results, it is clear that C-PsyD is generally better than the other comparative models.\u003c/p\u003e \u003c/div\u003e"},{"header":"6. Conclusions","content":"\u003cp\u003eConsidering the increasing number of people suffering from mental illnesses and the scarcity of medical resources, a novel Chinese text classification model for detecting psychological issues, C-PsyD, has been proposed to improve the efficiency of doctors and the utilization of medical resources. C-PsyD employs text features obtained from TextCNN and Self-Attention to guide the Attention module in generating attention weights. These attention weights are combined with the text features extracted by BiGRU to obtain the final text features output by C-PsyD. Experiments were conducted using a Chinese psychological text dataset shared on GitHub. The experiments show that C-PsyD outperforms the six competing models in classification accuracy: C-PsyD reaches 79.5%, higher than TextCNN (78.2%), BiLSTM (76.4%), LSTM (74.9%), Simple-RNN (55.7%), FastText (50.1%), and ST-MFLC (44.8%). These results indicate that the newly proposed psychological text classification model is feasible and effective.\u003c/p\u003e \u003cp\u003eThe overall C-PsyD model is relatively complex and requires high computational capabilities. Graphics cards or AI chips are commonly used for acceleration, but this increases the cost of using C-PsyD and may make the model difficult to deploy on lower-capacity mobile devices. 
In the future, we will continue to simplify the model to facilitate its deployment on mobile devices such as smartphones. Moreover, C-PsyD cannot yet predict compound psychological problems, even though many individuals have more than one type of psychological issue. To address this challenge, we will continue to collect datasets and improve C-PsyD. C-PsyD can also be integrated with other systems, which would make the proposed model more useful in modern society. For instance, C-PsyD can be extended to diagnose psychological disorders based on users' descriptive texts, then recommend suitable doctors and use chatbots to guide users, enhancing the efficiency of medical resources. Making the proposed model more accessible to anyone is also a major focus of our future work.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eConflict of Interest statement\u003c/h2\u003e \u003cp\u003eAll authors disclosed no relevant relationships.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eC was responsible for proofreading and polishing the document, as well as writing the content for the experimental section, while Y mainly conducted the experiments and drafted an initial version.\u003c/p\u003e\u003ch2\u003eAcknowledgments\u003c/h2\u003e \u003cp\u003eThis work is supported by the National Natural Science Foundation of China under Grant No. 62062011, by the Guangxi Natural Science Foundation under Grant No. 2019GXNSFAA185017 and by the Autonomous Region Level College Students\u0026rsquo; Innovation and Entrepreneurship Practice Project under Grant No. 202110608211. The authors would like to thank the editors and the anonymous reviewers for their kind assistance, constructive comments and recommendations, which have significantly improved the presentation of this paper. 
We would like to express our appreciation to those who shared the psychological dataset used in this paper.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eHaddad C et al (2021) Variation of psychological and anthropometrics measures before and after dieting and factors associated with body dissatisfaction and quality of life in a Lebanese clinical sample. BMC Psychol 9(1):1\u0026ndash;13\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYunusa I, El Helou ML (2020) The use of risperidone in behavioral and psychological symptoms of dementia: a review of pharmacology, clinical evidence, regulatory approvals, and off-label use. Front Pharmacol 11:596\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEisenbeck N et al (2022) An international study on psychological coping during COVID-19: Towards a meaning-centered coping style. Int J Clin Health Psychol 22(1):100256\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBelhadi A et al (2023) Fast and Accurate Framework for Ontology Matching in Web of Things. ACM Trans Asian Low-Resource Lang Inform Process\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHasan MK et al (2021) Fischer linear discrimination and quadratic discrimination analysis\u0026ndash;based data mining technique for internet of things framework for Healthcare. Front Public Health : 1354\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAhmed U et al (2022) Explainable deep Attention active learning for sentimental analytics of mental disorder. ACM Trans Asian Low-Resource Lang Inform Process\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSarkar A, Singh A, Chakraborty R (2022) A deep learning-based comparative study to track mental depression from EEG data. 
Neurosci Inf : 100039\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThakre TP et al (2022) Polysomnographic identification of anxiety and depression using deep learning. J Psychiatr Res 150:54\u0026ndash;63\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMadan S et al (2022) Deep Learning-based detection of psychiatric attributes from German mental health records. Int J Med Informatics 161:104724\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBurdisso SG, Errecalde M, Montes-y-G\u0026oacute;mez M (2019) A text classification framework for simple and effective early depression detection over social media streams. Expert Syst Appl 133:182\u0026ndash;197\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSabri T, Beggar OE, Kissi M (2022) Comparative study of Arabic text classification using feature vectorization methods. Procedia Comput Sci 198:269\u0026ndash;275\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLu J et al (2022) Photocatalytic H2 evolution properties of K0.5Na0.5NbO3 (KNN) with halloysite nanotubes. Opt Mater 129:112516\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang Y et al (2022) Contrastive Graph Convolutional Networks with adaptive augmentation for text classification. Inf Process Manag 59(4):102946\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRomero R et al (2022) MobyDeep: A lightweight CNN architecture to configure models for text classification. Knowl Based Syst 257:109914\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBanerjee I et al (2019) Comparative effectiveness of convolutional neural network (CNN) and recurrent neural network (RNN) architectures for radiology text report classification. Artif Intell Med 97:79\u0026ndash;88\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZaporojets K et al (2021) Solving arithmetic word problems by scoring equations with recursive neural networks. 
Expert Syst Appl 174:114704\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShi M et al (2022) Genetic-gnn: evolutionary architecture search for graph neural networks. Knowl Based Syst 247:108752\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuo B et al (2019) Improving text classification with weighted word embeddings via a multi-channel TextCNN model. Neurocomputing 363:366\u0026ndash;374\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCho K et al (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. \u003cem\u003earXiv preprint arXiv:1406.1078\u003c/em\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang M et al (2022) Chinese power dispatching text entity recognition based on a double-layer BiLSTM and multi-feature fusion. Energy Rep 8:980\u0026ndash;987\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou Q, Liu X, Wang Q (2021) Interpretable duplicate question detection models based on Attention mechanism. Inf Sci 543:259\u0026ndash;272\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeer D et al (2022) Greedy-layer pruning: Speeding up transformer models for natural language processing. Pattern Recognit Lett 157:76\u0026ndash;82\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang X et al (2022) Pre-hospital emergency text classification model based on label confusion. J Comput Appl : 0\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOrhan U, Tulu CN (2021) A novel embedding approach to learn word vectors by weighting semantic relations: SemSpace. Expert Syst Appl 180:115146\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJia K (2022) Sentiment classification of microblog: A framework based on BERT and CNN with Attention mechanism. 
Comput Electr Eng 101:108032\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGhorbanali A, Sohrabi MK, Yaghmaee F (2022) Ensemble transfer learning-based multimodal sentiment analysis using weighted convolutional neural networks. Inf Process Manag 59(3):102929\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVaswani A et al (2017) Attention is all you need. Adv Neural Inf Process Syst 30\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. \u003cem\u003earXiv preprint arXiv:1409.0473\u003c/em\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhasanah IN (2021) Sentiment classification using fasttext embedding and deep learning model. Procedia Comput Sci 189:343\u0026ndash;350\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAmalou I, Mouhni N, Abdali A (2022) Multivariate time series prediction by RNN architectures for energy consumption forecasting. Energy Rep 8:1084\u0026ndash;1091\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArbane M et al (2023) Social media-based COVID-19 sentiment classification model using Bi-LSTM. Expert Syst Appl 212:118710\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCao N et al (2022) A deceptive reviews detection model: Separated training of multi-feature learning and classification. Expert Syst Appl 187:115977\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhu E et al (2022) N-gram MalGAN: Evading machine learning detection via feature n-gram. Digit Commun Networks 8(4):485\u0026ndash;491\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang Q (2022) Malicious code classification based on opcode sequences and textCNN network. 
J Inform Secur Appl 67:103151\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSouquet L\u0026eacute;o et al (2023) Convolutional neural network architecture search based on fractal decomposition optimization algorithm. Expert Syst Appl 213:118947\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Psychological Problem, Text Classification, BiGRU, Self-Attention, Attention, Convolutional Neural Network","lastPublishedDoi":"10.21203/rs.3.rs-5337854/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-5337854/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThe COVID-19 epidemic has had significant direct and psychological impacts. This study introduces a Chinese text classification model, C-PsyD, which combines BiGRU, Attention, Self-Attention, and convolutional neural network (CNN) techniques. The model utilizes TextCNN and BiGRU outputs in the Attention module, generating result A. Furthermore, the outputs of Self-Attention and BiGRU are used in the Attention mechanism, producing result B. 
By averaging the results of A and B, a final text feature vector is obtained and passed through a dropout layer. A fully connected neural network layer processes the text feature vector to obtain the classification result. Experimental evaluations were conducted using a Chinese psychological text dataset from GitHub. The results, including loss function value, classification accuracy, recall result, false positive rate, and confusion matrix, indicate that C-PsyD outperforms six competing models. Notably, C-PsyD achieves a classification accuracy of 79.5%, surpassing TextCNN (78.2%), BiLSTM (76.4%), LSTM (74.9%), Simple-RNN (55.7%), FastText (50.1%), and ST_MFLC (44.8%), as well as FastText (50%). These findings confirm the feasibility and effectiveness of the proposed psychological text classification model. Its implementation can enhance doctors' ability to classify patients, promptly detect psychological problems, and facilitate effective treatment, thus optimizing the utilization of medical resources.\u003c/p\u003e","manuscriptTitle":"C-PsyD: A Chinese text classification model for detecting psychological problems","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-11-18 11:14:26","doi":"10.21203/rs.3.rs-5337854/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"51a8da5f-7753-4add-8174-f27b3b01e632","owner":[],"postedDate":"November 18th, 
2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-01-17T12:09:11+00:00","versionOfRecord":[],"versionCreatedAt":"2024-11-18 11:14:26","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-5337854","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-5337854","identity":"rs-5337854","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
