Explainable Machine Learning Models as Screening tool for Anxiety and Depression in Medical Students Based on Non-Stigmatizing Lifestyle and Sociodemographic Factors- Pakistan. A Validation Study

preprint OA: closed
AI-generated deep summary by claude@2026-06, 2026-06-24 · read from full text

This cross-sectional study of 1,630 first- and second-year undergraduate medical students in Islamabad, Pakistan, developed and validated explainable Random Forest screening models for anxiety and depression using non-stigmatizing sociodemographic and lifestyle variables (e.g., sleep, physical activity, academic workload, and social context) without asking directly stigmatizing mental-health questions. Anxiety and depression were measured with GAD-7 and PHQ-9, and model performance on a held-out test set was reported using accuracy, sensitivity, specificity, and AUC-ROC, with SHAP used to identify key predictors and their (non-linear) effects. Depression prevalence was 57.8% and anxiety 46.4%, while the anxiety and depression models achieved 76.69% and 77.30% accuracy, respectively; SHAP highlighted academic performance, sleep patterns, and physical activity as important predictors. A major caveat explicitly inherent to the design is that it is a single time-point validation study (not a longitudinal or external validation study). Relevance to endometriosis: The paper does not explicitly discuss endometriosis or adenomyosis; it was included in the corpus via a keyword match in the upstream search index.

Read from the paper's body, not the abstract. Not a substitute for reading the paper. No clinical advice.

Research Article, Research Square

Farah Rashid, Ahmed Waqas, Rafay Rashed Siddiqui, Talha Ahmed, and 2 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-9558928/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1 posted.

Abstract

Background: Anxiety and depression are highly prevalent among medical students, particularly in low- and middle-income countries such as Pakistan, where stigma limits help-seeking.
Research shows that machine learning models based on lifestyle and sociodemographic data can be effective in screening for anxiety and depression among medical students. This study aimed to develop an explainable machine learning–based screening tool using non-stigmatizing lifestyle and sociodemographic factors. We hypothesized that such models could effectively identify anxiety and depression in academic settings.

Method: A cross-sectional study was conducted among 1,630 undergraduate medical students from various medical colleges in Islamabad, Pakistan. Data collection was informed by stakeholder engagement to ensure contextual relevance. The variables included sociodemographic characteristics and lifestyle factors such as sleep, physical activity, academic workload, and social context. Anxiety and depression were measured using GAD-7 and PHQ-9. After preprocessing, data were split into training (80%) and testing (20%) sets. Random Forest classifiers were developed separately for anxiety and depression, with hyperparameters optimized via cross-validation. Performance was evaluated using accuracy, sensitivity, specificity, and AUC-ROC. Model interpretability was achieved using SHAP.

Results: The prevalence of depression and anxiety was 57.8% and 46.4%, respectively. The anxiety model achieved 76.69% accuracy, while the depression model achieved 77.30%. SHAP analysis identified academic performance, sleep patterns, and physical activity as key predictors, exhibiting non-linear effects. These results demonstrate that effective and interpretable screening can be achieved without sensitive disclosures related to mental health. Practically, such strategies can help universities identify at-risk students early, refer them to support services, and provide timely, focused interventions.
Conclusion: The findings of this study support the use of scalable and ethically sound explainable machine learning screening instruments that use non-stigmatizing data for student mental health in academic environments. Future studies should aim to validate and implement such instruments, and policymakers might want to incorporate explainable, data-driven strategies into student mental health frameworks in resource-limited educational settings.

Keywords: Psychiatry, Anxiety, Artificial Intelligence, Depression, Machine Learning Models, Medical Students

Figures: Figure 1 to Figure 7

Introduction

Students are the most vulnerable population group facing stressful events in life, especially those pursuing higher professional education (1). Increasing levels of stress, anxiety and depression among medical undergraduates have been recognized as a significant and under-reported health problem in the community (2). Medical graduates have also been observed to have higher levels of mental distress than their peers and the general population (3). This alarming situation has been identified in both developed and developing countries (4). Global evidence shows that medical students carry a disproportionately large share of psychological distress due to academic demands, competitive contexts, and clinical stressors (5, 6). Additional stressors have been attributed to inflexible teaching schedules, an intensive curriculum, and stringent administrative frameworks (7). This burden is further aggravated in low- and middle-income countries (LMICs) such as Pakistan by the added challenges of political instability, sociocultural stigma towards mental health, and reduced access to professional mental health services and institutional support systems (8). Although the problem is substantial, effective screening of medical students for mental health issues is still lacking.
Multiple tools are available for assessing psychological distress in medical students. However, there is no consensus on which tools should be used to detect these symptoms (9). Traditional screening methods are based on self-reported psychological symptoms, which tend to be affected by stigma, fear of discrimination, and worry about academic or professional repercussions (10). Such barriers are especially acute in collectivist and stigmatizing settings, in which sharing mental health issues can be discouraged. Developments in artificial intelligence (AI) and machine learning (ML) present new opportunities to conduct mental health screening by detecting multidimensional trends in data. Nevertheless, the use of ML in mental health has been constrained by concerns about transparency, interpretability, and ethical accountability in high-stakes decision-making settings (11). These limitations are mitigated by explainable machine learning (XAI) methods such as SHapley Additive exPlanations (SHAP), which give understandable insights into how a model makes predictions, thus increasing trust and making the model easier to use in practice (12). In particular, some lifestyle and contextual variables, including sleeping patterns, physical activity, academic achievement, and social support, have been repeatedly related to mental health outcomes and are non-stigmatizing proxies of psychological distress (13, 14). Using these variables enables the creation of scalable and culturally acceptable screening instruments that reduce stigma and allow early identification. Against this background, the purpose of the current research was to design and test explainable machine learning models that screen for anxiety and depression among medical students based on non-stigmatizing lifestyle and sociodemographic variables.
Our hypothesis was that these models would achieve reasonable predictive accuracy while remaining interpretable, and would therefore provide a scalable, ethically acceptable, and context-specific solution for mental health screening in resource-constrained learning environments.

Methods

This was a cross-sectional study conducted in Islamabad, the capital of Pakistan. The study recruited 1,630 first- and second-year medical students through a universal sampling approach from nine medical colleges in Islamabad, which were selected by simple random selection. Data were collected from the selected medical colleges through an on-site survey. Prior to administering the questionnaire, the researchers conducted an introductory session with the students, during which the study objectives, procedures, and ethical considerations were explained. The questionnaire was then introduced, and its structure briefly explained to ensure understanding. Subsequently, all the students completed the questionnaire in the presence of the researchers, who were available to address any queries. This approach facilitated high participation, with a response rate of 91%. The inclusion criterion was enrollment as a medical student in a medical college in Islamabad. Students who declined to participate, did not give consent, or were absent during the data collection period were excluded, as were incomplete questionnaires. Participation was voluntary; informed consent was obtained and confidentiality was assured. The study protocol was reviewed and approved by the institutional ethics committee (Approval No. 00009 IHSA/P\D-2022). Development of the data collection tools was guided by extensive stakeholder engagement involving students, faculty, and mental health professionals to ensure the contextual relevance and acceptability of the variables to be included. Sociodemographic variables (e.g., age, gender, year of study, socioeconomic indicators) and lifestyle factors (e.g., sleep patterns, physical activity, academic workload, and social factors) were collected.
Anxiety and depression were assessed using validated self-report instruments (GAD-7 and PHQ-9).

Data Preprocessing and MLM Development

Following data cleaning, imputation, and encoding, the dataset was split into training (80%) and testing (20%) subsets. Separate Random Forest classification models were developed for anxiety and depression because of their capacity to model complex, non-linear relationships. Hyperparameters were optimized using cross-validation. Performance on the held-out test dataset was evaluated using the standard classification measures of accuracy, sensitivity, specificity and area under the receiver operating characteristic curve (AUC-ROC). These measures were selected because together they give a balanced assessment of predictive performance, which matters especially in mental health screening. The MLM code is available as a supplementary file.

Explainability of Models with SHAP Analysis

SHapley Additive exPlanations (SHAP) were used to quantify the contribution of each predictor to model predictions in order to increase interpretability and facilitate practical application. SHAP values were computed per observation and aggregated to establish each variable's importance on a global scale. Furthermore, a class-specific SHAP analysis was performed to investigate how the individual classes within the sociodemographic and lifestyle variables affect the probability of experiencing anxiety and depression.

Statistical Software

Analysis was done using established statistical and machine learning software packages. The first part of the data analysis was performed using SPSS statistical software version 26.0 (IBM). Multiple regression was performed to assess the effect of several factors on the likelihood that respondents have anxiety and depression according to GAD-7 and PHQ-9 scale scores.
Analysis was conducted using anxiety and depression as the dependent variables and several sociodemographic and behavioral factors as the independent variables, including age, sex, place of residence, type of family setup, physical activity, family and social support, sleep pattern, and academic performance. The results were presented with statistical significance (p < 0.05), regression coefficients, and 95% confidence intervals for the beta coefficient of each predictor. Second, open-source libraries were used to develop the models and to perform validation and SHAP analysis, supporting the reproducibility and transparency of the results (code files are attached as a supplementary file).

Results

Cross-Sectional study findings

A total of 1,630 participants were surveyed, and the overall prevalence of depression and anxiety was found to be 57.8% and 46.4%, respectively. By age, 1,274 (78.2%) participants were aged 18–20 years, 314 (19.3%) were aged 21–23 years, 28 (1.7%) were younger than 18 years, and 14 (0.9%) were aged 24–26 years. Of the participants, 1,100 (67.5%) were female and 530 (32.5%) were male. By academic year, 800 (49.1%) participants were first-year medical students, while 830 (50.9%) were in their second year. Socioeconomic backgrounds varied, with 1,302 (79.9%) from upper middle-income, 272 (16.7%) from high-income, and 56 (3.4%) from low middle-income backgrounds. In terms of residency, 694 (42.6%) participants were hostellers, while 936 (57.4%) lived with their families. Most participants, 1,216 (74.6%), came from nuclear families, while 414 (25.4%) were from joint families. Regarding parental status, 1,524 (93.5%) participants had both parents living together and 106 (6.5%) had one deceased parent. Multivariable ordinal logistic regression models were used to test the hypothesis and to estimate the adjusted odds ratio (aOR) for anxiety and depression.
Before the analysis, we checked the model assumptions: the proportional odds assumption was tested using the Brant test, and variance inflation factors (VIF) were used to detect multicollinearity (a VIF below 5 was considered acceptable). Variables with potential clinical or statistical relevance were included in the model to adjust for confounding. Adjusted odds ratios (aORs) with 95% confidence intervals (CIs) were reported, and statistical significance was set at p < 0.05. This analysis showed that female gender was associated with higher odds of both anxiety (aOR = 2.13, 95% CI: 1.59–2.86) and depression (aOR = 1.64, 95% CI: 1.22–2.22). Significant risk factors for both outcomes included irregular sleep (anxiety: aOR = 1.96, 95% CI: 1.39–2.78; depression: aOR = 2.27, 95% CI: 1.64–3.23), family history of mental illness (anxiety: aOR = 2.38, 95% CI: 1.72–3.33; depression: aOR = 3.23, 95% CI: 2.33–4.76), history of harassment (anxiety: aOR = 2.04, 95% CI: 1.52–2.70; depression: aOR = 2.56, 95% CI: 1.85–3.45), and peer pressure (anxiety: aOR = 2.78, 95% CI: 2.04–3.70; depression: aOR = 3.45, 95% CI: 2.50–4.76). Protective factors included strong social support (anxiety: aOR = 0.63, 95% CI: 0.48–0.84; depression: aOR = 0.49, 95% CI: 0.37–0.66), strong family support (anxiety: aOR = 0.49, 95% CI: 0.32–0.75; depression: aOR = 0.47, 95% CI: 0.31–0.71), and positive coping style (anxiety: aOR = 0.53, 95% CI: 0.40–0.70; depression: aOR = 0.49, 95% CI: 0.37–0.64). Additionally, higher screen time (≥ 5 h vs < 2 h) increased the odds of anxiety (aOR = 1.79, 95% CI: 1.12–2.78) and depression (aOR = 2.33, 95% CI: 1.45–3.70), while excellent academic performance was associated with lower odds of both outcomes (anxiety: aOR = 0.49, 95% CI: 0.33–0.74; depression: aOR = 0.35, 95% CI: 0.22–0.53). The full detailed table is included in the supplementary file.
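As a reading aid for the adjusted odds ratios above, the following minimal sketch shows how a reported 95% CI relates to the point estimate on the log-odds scale. It assumes the intervals are Wald-type (symmetric around the log odds ratio, with z = 1.96), which the text does not state; the function name `log_scale_summary` is hypothetical and is not taken from the authors' supplementary code.

```python
import math

def log_scale_summary(lower, upper, z=1.96):
    """Recover the implied odds ratio and SE(log OR) from a CI.

    Assumes a Wald-type interval: exp(log_or +/- z * se).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    log_or = (log_lower + log_upper) / 2      # midpoint on the log scale
    se = (log_upper - log_lower) / (2 * z)    # half-width divided by z
    return math.exp(log_or), se

# Values quoted in the text for female gender: (aOR, CI lower, CI upper).
reported = {
    "anxiety": (2.13, 1.59, 2.86),
    "depression": (1.64, 1.22, 2.22),
}

results = {}
for outcome, (aor, lower, upper) in reported.items():
    implied_or, se = log_scale_summary(lower, upper)
    results[outcome] = (implied_or, se)
    print(f"{outcome}: reported aOR={aor}, "
          f"implied aOR={implied_or:.2f}, SE(log OR)={se:.3f}")
```

Under this assumption, the geometric midpoint of each interval should land close to the reported aOR, which gives a quick consistency check on any row of the regression table.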
Machine Learning Models Analysis

This report presents the findings of a machine learning analysis aimed at predicting mental health outcomes among students. Specifically, the study focuses on predicting Anxiety (measured using the GAD-7 scale) and Depression (measured using the PHQ-9 scale) based on various demographic and lifestyle risk factors. The analysis employs the Random Forest classification algorithm to identify patterns in the data and determine which factors contribute most significantly to mental health outcomes. Additionally, SHAP (SHapley Additive exPlanations) values are used to understand how specific classes within each variable influence predictions.

Dataset Overview

The dataset comprises survey responses from 1,630 student participants. The data were collected in SPSS format and contain information about various demographic characteristics, lifestyle factors, and mental health assessments. The analysis utilized 11 independent variables (risk factors) to predict 2 dependent variables (anxiety and depression).

Independent Variables (Predictors):
- Gender: Gender of the participant (Male/Female)
- YOS: Year of Study (1st Year/2nd Year)
- Residential Status: Living arrangement (Hostel/Family)
- Screen Time: Daily screen time usage (5hrs)
- Sleep pattern: Quality of sleep patterns (< 6hrs/7-8hrs/irregular)
- Peer Pressure: Level of peer pressure experienced (No/Yes)
- Social Media Usage: Extent of social media engagement (rarely/occasionally/daily)
- Family Support: Level of family support received (No/Yes)
- Lack of Social Support: Lack of strong social support network (No/Yes)
- Physical Activity: Level of physical activity (sedentary/Moderate (1–2 days/week)/High (+ 3 days/week))
- Academic Performance: Academic performance level (Failing performance/poor performance/moderate performance/good performance/excellent performance)

Dependent Variables (Outcomes):
- Anxiety (GAD-7): Measured on a scale of 0–3, representing severity levels of anxiety symptoms
- Depression (PHQ-9): Measured on a scale of 0–4, representing severity levels of depressive symptoms

Data Quality

The dataset was examined for missing values and data quality issues. The analysis confirmed that there were no missing values across any of the 11 variables used in this study, ensuring complete data for all 1,630 participants.

Methodology - Machine Learning Models

Random Forest Classifier

The Random Forest algorithm was selected for this classification task due to its robustness, ability to handle non-linear relationships, and built-in feature importance estimation. Random Forest is an ensemble learning method that constructs multiple decision trees during training and outputs the class that is the mode of the classes predicted by individual trees.

Model Configuration

The following configuration was used for both the Anxiety and Depression prediction models:

| Parameter | Value |
| --- | --- |
| Number of Trees (n-estimators) | 100 |
| Maximum Depth | None (unlimited) |
| Minimum Samples to Split | 2 |
| Minimum Samples per Leaf | 1 |
| Random State | 42 (for reproducibility) |

Data Splitting Strategy

The dataset was divided into training (80%) and testing (20%) sets using stratified sampling. Stratification ensures that the proportion of each class in the target variable is maintained in both the training and testing sets, which is particularly important when dealing with imbalanced classes.

SHAP Analysis

SHAP (SHapley Additive exPlanations) values were computed using TreeExplainer for Random Forest models. SHAP provides both the magnitude and direction of each feature's impact on predictions, allowing us to identify which specific classes within each variable increase or decrease the risk of anxiety and depression.
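To make the idea of "magnitude and direction" concrete, here is a minimal sketch of exact Shapley values for a single observation. It is a brute-force enumeration over feature orderings against a fixed baseline, not the TreeExplainer algorithm the authors used, and the scoring function `toy_risk`, its coefficients, and the three indicator features are hypothetical illustrations rather than anything fitted to the study data.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation.

    Averages each feature's marginal contribution over all orderings.
    Features not yet added in an ordering stay at their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of i
            prev = new
    return [p / len(orderings) for p in phi]

def toy_risk(features):
    """Hypothetical risk score over three binary indicators."""
    irregular_sleep, sedentary, family_support = features
    score = 0.30 + 0.25 * irregular_sleep + 0.15 * sedentary
    score -= 0.10 * family_support
    score += 0.10 * irregular_sleep * sedentary   # small interaction term
    return score

baseline = [0, 0, 0]   # reference student with none of the indicators
student = [1, 1, 1]    # student with all three indicators present
phi = shapley_values(toy_risk, student, baseline)

for name, value in zip(["irregular_sleep", "sedentary", "family_support"], phi):
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.3f} ({direction} predicted risk)")
```

The attributions sum to the difference between the student's score and the baseline score, which is the additivity property that lets per-observation values be aggregated into global summaries.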
Key metrics include:
- Mean |SHAP|: Average absolute impact (higher = more influential)
- Mean SHAP (signed): Direction of impact (positive = increases risk, negative = decreases risk)

Results of SHAP-Model Performance

Model Performance

| Model | Accuracy | Correct Predictions | Total Test Samples |
| --- | --- | --- | --- |
| Anxiety (GAD-7) | 76.69% | 275 | 326 |
| Depression (PHQ-9) | 77.30% | 264 | 326 |

Detailed Classification Metrics - Anxiety Model

The following table presents precision, recall, and F1-score for each anxiety severity class:

| Class | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| 0 (Minimal) | 0.82 | 0.82 | 0.82 | 175 |
| 1 (Mild) | 0.72 | 0.76 | 0.74 | 128 |
| 2 (Moderate) | 0.50 | 0.37 | 0.42 | 19 |
| 3 (Severe) | 1.00 | 0.50 | 0.67 | 4 |

Academic and lifestyle-related factors are the main drivers of the anxiety model. Academic performance makes the largest contribution (≈ 15%), followed by physical activity, screen time, and sleep pattern, all of which are above or close to the mean contribution (~ 9%). This implies that the most influential predictors of anxiety risk are performance pressure and daily behavioral habits. Residential status and year of study have mid-level effects that capture contextual and transitional influences, whereas social and demographic factors such as inadequate social support, social media use, peer pressure, gender, and family support contribute less and fall below the average level. In general, the model suggests that changeable behaviors and academic stressors are more predictive of anxiety than fixed personal or social characteristics. The confusion matrix shows that the model has moderate performance (accuracy ≈ 76.69%), with stronger and more confident predictions at the lower levels of anxiety (classes 0 and 1), where most cases are correctly recognized (158 and 103), which means the model is reliable for the majority of cases.
Nevertheless, performance decreases for the higher severity classes (2 and 3), with clear misclassification into the next lower classes, indicating that moderate-to-severe anxiety is difficult to differentiate. Mistakes occur mainly between adjacent classes (e.g., 0↔1 and 1↔2), which means that the model represents overall severity patterns well but cannot locate the boundaries between classes accurately. Overall, the model is trustworthy in identifying minimal to mild anxiety but less effective at identifying the higher levels of anxiety.

Detailed Classification Metrics - Depression Model

The following table presents precision, recall, and F1-score for each depression severity class:

| Class | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| 0 (Minimal) | 0.79 | 0.80 | 0.79 | 138 |
| 1 (Mild) | 0.77 | 0.81 | 0.79 | 150 |
| 2 (Moderate) | 0.65 | 0.53 | 0.59 | 32 |
| 3 (Moderately Severe) | 1.00 | 0.60 | 0.75 | 5 |
| 4 (Severe) | 0.00 | 0.00 | 0.00 | 1 |

Sleep pattern, academic performance, and physical activity are the key factors in the depression model, with nearly equal contributions of about 13–14% each, above the average level of importance, indicating the key role of behavioral and functional factors. Screen time also makes a notable contribution, making it a relevant lifestyle risk factor. Residential status and year of study are mid-level influences, whereas psychosocial and demographic factors (peer pressure, gender, use of social media, absence of social support, family support) are less important and below average. In general, the model suggests that the risk of depression is mainly determined by modifiable everyday behaviors, especially sleep, rather than by immutable personal or social traits. The confusion matrix shows slightly higher accuracy for the depression model (≈ 77.30%) than for the anxiety model, with strong performance in identifying minimal and mild cases, which make up most of the data.
Most individuals in these groups are correctly classified (110 and 130), which indicates good reliability for the common, lower-severity groups. However, performance drops for moderate cases, with observable misclassification into the neighboring lower categories, which may indicate underestimation of severity. In the moderately severe and severe classes, the model performs poorly because the support is very low: it makes only a limited number of correct predictions and tends to misclassify toward the lower levels. Most errors are between adjacent classes (0↔1 and 1↔2), which means that the model captures the overall severity progression but is not specific at the higher-risk levels.

Feature Importance Analysis

Feature importance analysis reveals which risk factors have the greatest influence on predicting mental health outcomes. The Random Forest algorithm calculates importance based on how much each feature contributes to reducing impurity across all trees.
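The impurity-based importance described above can be illustrated on a single tree node. The sketch below assumes Gini impurity as the splitting criterion, which the text does not specify, and uses a hypothetical node of ten binary labels; the split names are illustrative only. In a full forest the decreases are additionally weighted by the share of samples reaching each node, summed over all nodes and trees, and normalized to sum to 1.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# Hypothetical node: 10 students, label 1 = screened positive.
parent = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Splitting on an informative feature separates the classes fairly well.
dec_informative = impurity_decrease(parent, [0, 0, 0, 0, 1], [0, 1, 1, 1, 1])
# Splitting on a weak feature leaves both children almost as mixed.
dec_weak = impurity_decrease(parent, [0, 0, 0, 1, 1], [0, 0, 1, 1, 1])

total = dec_informative + dec_weak
shares = {
    "informative_feature": dec_informative / total,
    "weak_feature": dec_weak / total,
}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of total impurity decrease")
```

Normalizing the decreases in this way is why the contribution percentages reported for a model sum to roughly 100%.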
Top Risk Factors for Anxiety (GAD-7):

| Rank | Risk Factor | Contribution (%) |
| --- | --- | --- |
| 1 | Academic Performance | 15.19% |
| 2 | Physical Activity | 13.08% |
| 3 | Screen Time | 11.78% |
| 4 | Sleep Pattern | 11.26% |
| 5 | Residential Status | 9.08% |
| 6 | YOS | 8.33% |
| 7 | Lack of Social Support | 7.10% |
| 8 | Social Media Usage | 7.08% |
| 9 | Peer Pressure | 6.14% |
| 10 | Gender | 5.91% |
| 11 | Family Support | 5.05% |

Top Risk Factors for Depression (PHQ-9):

| Rank | Risk Factor | Contribution (%) |
| --- | --- | --- |
| 1 | Sleep Pattern | 13.74% |
| 2 | Academic Performance | 13.54% |
| 3 | Physical Activity | 13.53% |
| 4 | Screen Time | 11.49% |
| 5 | Residential Status | 8.81% |
| 6 | YOS | 8.06% |
| 7 | Peer Pressure | 7.08% |
| 8 | Gender | 6.77% |
| 9 | Social Media Usage | 5.96% |
| 10 | Lack of Social Support | 5.83% |
| 11 | Family Support | 5.20% |

The comparison shows that both models are driven by behavioral and academic factors, but with different priorities. Anxiety is driven most strongly by academic performance, which points to performance-related stress as the key factor, whereas depression is driven most strongly by sleep pattern, which points to a stronger physiological and regulatory component. Physical activity and screen time rank consistently high in both models, which supports the importance of lifestyle behaviors across conditions. Residential status and year of study are contextual factors with moderate influence in both models, while psychosocial and demographic variables play a lesser role overall. Notably, peer pressure has relatively greater importance for depression than for anxiety, whereas lack of social support is slightly more relevant for anxiety. In general, anxiety seems to be influenced more by external performance and activity pressures, whereas depression is influenced more by internal regulation, especially sleep, as well as by shared lifestyle risks.

Class-Level Impact Analysis

This section examines which specific classes within each categorical variable have the highest and lowest association with anxiety and depression outcomes.
This analysis helps identify specific risk groups that may benefit most from targeted interventions (the whole analysis is available as a separate supplementary file).

SHAP Values Analysis

SHAP (SHapley Additive exPlanations) values provide insight into how each class within a variable influences the model's predictions. Positive SHAP values indicate that the class increases the predicted risk of anxiety/depression, while negative values indicate a protective effect. The SHAP interaction plot shows that the displayed features (gender, year of study, residential status and screen time) have relatively small and tightly concentrated interaction effects on the prediction of anxiety. Most values cluster around zero, implying that these features have weak pairwise effects in contrast to their stronger main effects. The dispersion for gender and year of study is slightly wider, suggesting some variability in their interactions, especially with residential status, although without a clear direction. The interactions of screen time are weak and highly concentrated around zero, which means that its impact on anxiety is largely independent of these demographic variables. Put differently, the plot indicates that the prediction of anxiety depends more on the contributions of individual features than on interactions between them. This is consistent with the wider model results, in which the most significant drivers are academic performance, physical activity, sleep pattern, and screen time acting as main effects: the prediction of anxiety is influenced much more by these individual lifestyle or performance-related factors than by interactions between demographic or contextual factors.
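The statement that interaction values cluster around zero can be made concrete with a small worked example of the pairwise Shapley interaction index, using the convention in which the pairwise effect is split evenly between (i, j) and (j, i). This is a brute-force sketch over feature subsets and not the tree-based routine the authors ran; the value function `toy_value`, its coefficients, and the feature indices are hypothetical.

```python
from itertools import combinations
from math import factorial

def interaction_value(value, n_features, i, j):
    """Pairwise Shapley interaction index for features i != j.

    `value(S)` returns the model output when only the features in the
    frozenset S are "present". The result is half of the total pairwise
    effect, so that (i, j) and (j, i) together carry the full effect.
    """
    others = [k for k in range(n_features) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        weight = (factorial(size) * factorial(n_features - size - 2)
                  / (2 * factorial(n_features - 1)))
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Effect of adding i and j together beyond adding each alone.
            delta = (value(s | {i, j}) - value(s | {i})
                     - value(s | {j}) + value(s))
            total += weight * delta
    return total

def toy_value(present):
    """Hypothetical score: 0 = sleep, 1 = screen time, 2 = gender."""
    score = 0.0
    if 0 in present:
        score += 0.25
    if 1 in present:
        score += 0.15
    if 2 in present:
        score += 0.05
    if 0 in present and 1 in present:
        score += 0.08   # only sleep and screen time interact
    return score

sleep_screen = interaction_value(toy_value, 3, 0, 1)
sleep_gender = interaction_value(toy_value, 3, 0, 2)
print(f"sleep x screen time: {sleep_screen:.3f}")
print(f"sleep x gender:      {sleep_gender:.3f}")
```

A pair whose joint effect is just the sum of its separate effects yields an interaction of zero, which is the pattern the plot describes for most feature pairs.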
Key SHAP Findings - Depression:

The interaction plot indicates that most feature interactions in the depression model are weak, with most values centered around zero, which means that the prediction of depression is primarily influenced by independent (main) effects and not by strong pairwise interactions. Year of study (YOS) has the widest dispersion, indicating that it has the most prominent, though still modest, interactions with gender and residential status, probably reflecting differences in stress or adaptation between academic levels. The gender interactions are also mildly spread, but without a consistent directional effect, meaning that gender has little influence on the effects of other variables. Residential status exhibits very little interaction, with values highly concentrated around zero, which means that its impact is largely independent. Screen time and sleep pattern show the least interaction overall: their values are very tightly clustered around zero, meaning that their effects on depression are direct behavioral effects rather than the result of interactions with demographic or contextual variables. This is notable because sleep pattern is a leading predictor in the model: its effect is large but mostly independent. Overall, the plot confirms that key behavioral drivers such as sleep, physical activity, and screen time act independently, while demographic/contextual variables (gender, YOS, residential status) contribute weakly and do not meaningfully interact, reinforcing that depression risk is primarily shaped by individual lifestyle factors rather than complex interplay.

Importance of Features and SHAP-Based Model Explainability

The predictors that contributed most to the risk of anxiety and depression were determined using the feature importance analysis of the Random Forest models.
The strongest predictors of anxiety were academic performance, physical activity, screen time, and sleep patterns, followed by residential status, year of study, and social support indicators. A similar pattern was found for depression, where sleep patterns, academic performance, physical activity, and screen time were the most important predictors. To further explain model predictions, SHAP (SHapley Additive exPlanations) values were calculated using the TreeExplainer algorithm. SHAP analysis measured the magnitude and direction of each variable's contribution, which was important for identifying risk-enhancing and protective effects. The mean absolute SHAP values of the lifestyle-related variables were consistently higher than those of the sociodemographic variables, reinforcing their dominance in the stratification of mental health risk. For both outcomes, poor sleeping habits, poor academic achievement, low physical activity, and more screen time were linked to higher predicted risk, whereas positive social and familial conditions were protective. SHAP visualizations also identified non-linear and class-specific relationships, meaning that the impact of some factors differed across severity categories. The results illustrate the added value of explainable machine learning in revealing multifaceted and understandable risk patterns that may not be identified through more conventional regression-based models.

Discussion

This study investigated the prevalence and associated factors of anxiety and depression among undergraduate medical students, revealing that depression and anxiety were experienced by 57.8% and 46.4% of students, respectively. These findings reflect a moderate burden of mental health issues in this population and underscore the influence of various socio-demographic and psychosocial factors.
This paper offers empirical evidence that explainable machine learning models based on non-stigmatizing lifestyle and sociodemographic variables can be useful for screening anxiety and depression in medical students. The achieved accuracies, 76.69% for anxiety and 77.30% for depression, are similar to those of current machine learning applications in mental health, where prediction is a complicated task because of multifactorial effects ( 15 ). This evidence supports the usefulness of such models as screening tools rather than diagnostic ones. One of the major contributions of the research is its use of explainable AI with SHAP analysis, which increases the interpretability of the models and can give actionable insights into risk factors. Among these, sleep quality was the most significant potentially modifiable predictor, in line with the extensive literature identifying a two-way relationship between sleep disturbances and mental health problems ( 16 ). The SHAP analysis indicates that sleep duration is a significant predictor of anxiety, with optimal sleep (7–8 hours) showing the strongest protective effect. Conversely, the relationship between sleep patterns and depression appears more complicated and non-linear, with some unexpected protective associations across categories, especially for irregular sleep schedules. This finding should be interpreted with caution rather than as evidence of a protective effect. Recent evidence consistently shows that irregular sleep and shorter sleep duration are associated with greater severity of depressive symptoms ( 17 , 18 ). This association is especially relevant to medical student populations, in which sleep disturbances are frequent and are typically considered part of the broader psychosocial burden of training ( 19 ).
Intervention approaches such as sleep education, behavioral changes, and relaxation techniques are recommended to address contributing factors. This identifies sleep hygiene interventions as a viable and scalable entry point for mental health promotion in medical institutions ( 20 ). The association of high screen time with negative mental health outcomes is consistent with growing international evidence linking heavy digital use to higher rates of anxiety and depression, as well as sleep disturbances ( 21 ). This highlights the need to include digital well-being components in student support programs. Similarly, the protective influence of physical activity in the current study supports prior research on its antidepressant and anxiolytic effects ( 14 ). Campus-wide policies that encourage organized exercise and incorporate physical activity into campus life can have significant psychological benefits. The interdependence between academic and psychosocial determinants of mental health is also reflected in the roles of academic performance and social support. Academic underachievement can be both a cause and a consequence of psychological distress, which implies the need for combined academic and mental health support interventions ( 22 ). At the same time, the strong association with social support highlights the role of peer networks, mentorship systems, and community-building initiatives. By using non-stigmatizing variables, the proposed models address a major barrier in mental health screening, namely stigma, and thus may increase acceptability and participation. In addition, the models are explainable, which supports the ethical transparency needed to implement them in education and policy-making. These findings have significant implications for policy and practice.
First, such models can be incorporated into routine student wellness activities in medical colleges and universities as an early screening tool for the timely identification of at-risk students. Second, the identification of modifiable lifestyle factors provides the basis for specific preventive strategies, including sleep hygiene programs, physical activity promotion, and online wellness training. Third, at the systems level, this strategy is consistent with the demand for scalable, low-cost mental health interventions in LMICs like Pakistan, where resources for specialist care are limited.

eMental Health Implementation and mHealth Relevance.

The paper follows the guidelines of the mHealth Evidence Reporting and Assessment (mERA) initiative, as it shows the practicability of a digitally enabled and scalable mental health screening solution applicable to resource-limited educational environments. Routinely collectable, non-stigmatizing lifestyle and sociodemographic variables support acceptability and feasibility, which are major areas of focus in implementation research. This is especially relevant to LMIC settings like Pakistan, in which mental health workforce shortages, together with added challenges such as political instability and reduced access to institutional support, limit access to specialist care; in such settings the reach and scalability of the approach are an advantage. Since the model allows early stratification of risk at the population level, it enables task-shifting strategies in which educators and non-specialist staff could recognize at-risk students and arrange appropriate referrals or preventive measures. Moreover, the identification of modifiable lifestyle predictors is consistent with behavioral intervention models and can be combined with low-cost, evidence-based approaches, including sleep hygiene education, physical activity promotion, and peer support.
This makes the model not only a screening tool but also a decision-support model for planning specific interventions, furthering the agenda of precision mental health in the community.

Towards Precision Mental Health within Academic Institutions.

Integrating explainable machine learning is a step toward precision mental health. It allows individual risk profiling while remaining interpretable and ethically transparent. The model can be conceptualized as both a screening tool and a decision-support system that might inform specific preventive interventions, including sleep hygiene and peer-support programs.

Conclusion

To sum up, this paper illustrates that explainable machine learning models using non-stigmatizing lifestyle factors can be effective, scalable, and ethically transparent in screening for anxiety and depression in university students. These strategies have great potential to enhance early detection and targeted prevention, contributing to the broader goal of strengthening mental health systems in resource-constrained educational environments.

Strengths and Limitations

The large sample size, the stakeholder-informed selection of variables, and the application of explainable AI techniques to improve interpretability and acceptability are major strengths of this study. However, there are a few limitations that should be considered. The cross-sectional design does not allow causal inferences, and the use of self-reported data could introduce reporting bias. Also, class imbalance in the severe anxiety and depression groups is likely to have limited model performance for these categories.
Abbreviations

AI: Artificial Intelligence
AUC-ROC: Area Under the Curve–Receiver Operating Characteristic
aOR: Adjusted Odds Ratio
CI: Confidence Interval
GAD-7: Generalized Anxiety Disorder-7
LMICs: Low- and Middle-Income Countries
mERA: mHealth Evidence Reporting and Assessment
ML: Machine Learning
PHQ-9: Patient Health Questionnaire-9
SH+: Self-Help Plus
SHAP: SHapley Additive exPlanations
SPSS: Statistical Package for the Social Sciences
WHO: World Health Organization
XAI: Explainable Artificial Intelligence

Declarations

Ethics approval and consent to participate: The research was carried out in accordance with the principles of the Declaration of Helsinki and received ethical approval from the Institutional Review Board of the Health Services Academy (No. 00009 IHSA/P/D-2022). All participants provided written informed consent before enrollment in the study. They were informed about the purpose of the study, the voluntary nature of participation, and their right to withdraw at any time without any consequences. Confidentiality and data protection measures were ensured in compliance with ethical guidelines. The study supported the ethical implementation and adoption of machine learning-enhanced mental health screening instruments in a university environment through the use of non-stigmatizing variables and explainable models.

Consent for publication: Not applicable.

Availability of data and materials: All data generated or analyzed during this study are included in this published article and available in its supplementary information files.

Competing interests: The authors declare no competing interests.

Funding: No funds, grants, or other support was received.

Authors’ Contribution

FR: Conceptualization, Methodology, Data collection, Writing – Original final manuscript preparation, and oversaw the execution of the entire study.
AW: Writing – Review & Editing, Critical revision of the manuscript.
RR: Application of MLM, development of algorithms, and interpretation.
TA: Application of MLM and interpretation.
AR: Supervision, Technical oversight, Writing – Review & Editing.
AM: Methodological guidance, technical support, Writing – Review & Editing.
All authors contributed to the refinement of the manuscript, reviewed the final version, and approved it for publication.

Acknowledgements: The authors gratefully acknowledge the medical students who participated in this study. A part of this research has been accepted for poster presentation at the International Congress of the Royal College of Psychiatrists (UK), to be held in June 2026.

Authors’ information

Dr Farah Rashid (Corresponding Author). https://orcid.org/0009-0006-5931-6204. PhD Fellow, Public Health, Health Services Academy, Islamabad, Pakistan. IRSIP Fellowship at the University of Liverpool, UK. Faculty at National University of Sciences and Technology, Islamabad, Pakistan.
Dr Ahmed Waqas. ORCID 0000-0002-7492-5052. Clinical lecturer, SAS doctor, University of Liverpool, UK.
Rafay Rashed Siddiqui. MSc student, Computer Science, Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau, Germany.
Talha Ahmed. MSc student, Computer Science, California State Polytechnic University, Pomona, USA.
Dr Atif Rahman. ORCID 0000-0002-2066-4467. Professor of Child Psychiatry & Global Mental Health, Department of Primary Care and Mental Health, Institute of Population Liverpool L69 3BX, United Kingdom. Phone: +44(0)7807 10 6764.
Dr Abid Malik. ORCID 0000-0002-9084-2185. Professor and HOD, Public Mental Health, Health Services Academy, Islamabad. +923468544463.

References

1. Saipanish R. Stress among medical students in a Thai medical school. Med Teach. 2003;25(5):502–6.
2. Azim SR, Adnan N, Azim SN, Nisar M, Shamim MS. Frequency of mental distress among medical students from selected medical colleges of Pakistan: A systematic review. J Pak Med Assoc. 2022;72(10):2048–53.
3. Saravanan C, Wilks R. Medical students' experience of and reaction to stress: the role of depression and anxiety. ScientificWorldJournal. 2014;2014:737382.
4. Carson AJ, Dias S, Johnston A, McLoughlin MA, O'Connor M, Robinson BL, et al. Mental health in medical students. A case control study using the 60 item General Health Questionnaire. Scott Med J. 2000;45(4):115–6.
5. Rotenstein LS, Ramos MA, Torre M, Segal JB, Peluso MJ, Guille C, et al. Prevalence of Depression, Depressive Symptoms, and Suicidal Ideation Among Medical Students: A Systematic Review and Meta-Analysis. JAMA. 2016;316(21):2214–36.
6. Tian-Ci Quek T, Wai-San Tam W, X. Tran B, Zhang M, Zhang Z, Su-Hui Ho C, et al. The Global Prevalence of Anxiety Among Medical Students: A Meta-Analysis. International Journal of Environmental Research and Public Health. 2019;16(15).
7. Kubwimana L, Mutatsineza G, Tesi L, Wong R. Assessing the Stress Level among Medical Students in Rwanda. Open Journal of Psychiatry. 2022;12(02):174–87.
8. Sarfraz A, Siddiqui S, Galante J, Sikander S. Feasibility and Acceptability of an Online Mindfulness-Based Intervention for Stress Reduction and Psychological Wellbeing of University Students in Pakistan: A Pilot Randomized Controlled Trial. Int J Environ Res Public Health. 2023;20(8).
9. L'Hote D, Potiron L, Levaillant M. Assessing psychological distress among medical students: a systematic review and meta-analysis of tools available. BMC Med Educ. 2026;26(1):267.
10. Dyrbye LN, Thomas MR, Shanafelt TD. Systematic review of depression, anxiety, and other indicators of psychological distress among U.S. and Canadian medical students. Acad Med. 2006;81(4):354–73.
11. Rudin C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat Mach Intell. 2019;1(5):206–15.
12. (entry blank in the source)
13. Firth J, Solmi M, Wootton RE, Vancampfort D, Schuch FB, Hoare E, et al. A meta-review of "lifestyle psychiatry": the role of exercise, smoking, diet and sleep in the prevention and treatment of mental disorders. World Psychiatry. 2020;19(3):360–80.
14. Kandola A, Ashdown-Franks G, Hendrikse J, Sabiston CM, Stubbs B. Physical activity and depression: Towards understanding the antidepressant mechanisms of physical activity. Neurosci Biobehav Rev. 2019;107:525–39.
15. Shatte ABR, Hutchinson DM, Teague SJ. Machine learning in mental health: a scoping review of methods and applications. Psychol Med. 2019;49(9):1426–48.
16. Alvaro PK, Roberts RM, Harris JK. A Systematic Review Assessing Bidirectionality between Sleep Disturbances, Anxiety, and Depression. Sleep. 2013;36(7):1059–68.
17. Maki KA, Yang L, Farmer N, Papneja S, Wallen GR, Barb JJ. Sleep regularity and duration are associated with depression severity in a nationally representative United States sample. Neurobiol Sleep Circadian Rhythms. 2025;19:100133.
18. Wallace DA, Redline S, Sofer T, Kossowsky J. Environmental Bright Light Exposure, Depression Symptoms, and Sleep Regularity. JAMA Netw Open. 2024;7(7):e2422810.
19. Chaabane S, Chaabna K, Khawaja S, Aboughanem J, Mittal D, Mamtani R, et al. Sleep disorders and associated factors among medical students in the Middle East and North Africa: a systematic review and meta-analysis. Sci Rep. 2024;14(1):4656.
20. Nsengimana A, Mugabo E, Niyonsenga J, Hategekimana JC, Biracyaza E, Mutarambirwa R, et al. Sleep quality among undergraduate medical students in Rwanda: a comparative study. Sci Rep. 2023;13(1):265.
21. Twenge JM, Campbell WK. Associations between screen time and lower psychological well-being among children and adolescents: Evidence from a population-based study. Prev Med Rep. 2018;12:271–83.
22. Alchalabi S, Layth A. Exploring the Impact of Academic Stress on Depression Levels in Medical Students. The Medical Journal of Tikrit University. 2025;31(2):413–22.

Additional Declarations

The authors declare no competing interests.
Supplementary Files BMCsupplementaryfile.zip Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9558928","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":631379794,"identity":"85aff62c-7707-4da9-a51a-acd42a42c4f3","order_by":0,"name":"Farah Rashid","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABMElEQVRIie3RMWuDQBTA8SeCLlezngj1KyiCEEzIV7kjYKaUQpZO6RXBLOlu6eBXuLGjRUiXdLeklLpkLnQxdOnZQhpQId1Kuf/kIT/eOw5AJvuD6QxA/frCCsNvAOTnXwagsCZB2QExkz0hRxEACx1F9Mfy/RxebPs2utoM7+ZnNjtZmbsKTo2CaGXSQtDEsxKYufz5Pgqm63zmZEZoIQKeWRDd5U0ygrDehygc09iaxhnlgHxLLEa5mGK+tkzpbdUPQUZpIkg/ntOUId+sCFx2Ehxq9RTKCkGUWKUsQz4WixGnJi2LIbzVAuSQMS9o1L+Oc8pz5AUoxO7Nuly4bdfvheoGXZBhmkzKYlcvtli6T9VgYBsP41W5bJLvnMPD/plAibtAd9rviUwmk/3DPgEd62PEmVlFuQAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0009-0006-5931-6204","institution":"National University of Sciences and Technology, 
Islamabad. Pakistan","correspondingAuthor":true,"prefix":"","firstName":"Farah","middleName":"","lastName":"Rashid","suffix":""},{"id":631380185,"identity":"83559672-bd19-414f-a48f-fea0fe61c6b4","order_by":1,"name":"Ahmed Waqas","email":"","orcid":"","institution":"University of Liverpool, UK","correspondingAuthor":false,"prefix":"","firstName":"Ahmed","middleName":"","lastName":"Waqas","suffix":""},{"id":631380186,"identity":"fa5ca6d3-e2b0-46a3-aabf-f92778311fdf","order_by":2,"name":"Rafay Rashed Siddiqui","email":"","orcid":"","institution":"Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau. Germany","correspondingAuthor":false,"prefix":"","firstName":"Rafay","middleName":"Rashed","lastName":"Siddiqui","suffix":""},{"id":631380187,"identity":"56c068bc-3dbc-442a-90f0-6de0b3862c44","order_by":3,"name":"Talha Ahmed","email":"","orcid":"","institution":"University: California State Polytechnic University, Pomona. USA","correspondingAuthor":false,"prefix":"","firstName":"Talha","middleName":"","lastName":"Ahmed","suffix":""},{"id":631380188,"identity":"87952ace-01b0-4116-b75b-d66b446f6570","order_by":4,"name":"Atif Rahman","email":"","orcid":"https://orcid.org/0000-0002-2066-4467","institution":"University of Liverpool. UK","correspondingAuthor":false,"prefix":"","firstName":"Atif","middleName":"","lastName":"Rahman","suffix":""},{"id":631380189,"identity":"211f9448-ec2e-4f04-bfbb-0da132f33b93","order_by":5,"name":"Abid Malik","email":"","orcid":"","institution":"Health Services Academy, Islamabad. 
Pakistan","correspondingAuthor":false,"prefix":"","firstName":"Abid","middleName":"","lastName":"Malik","suffix":""}],"badges":[],"createdAt":"2026-04-29 00:59:09","currentVersionCode":1,"declarations":{"humanSubjects":true,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":true,"humanSubjectConsent":true,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-9558928/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9558928/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":109147045,"identity":"68333e2e-debd-40c6-8b6d-888af6e01d53","added_by":"auto","created_at":"2026-05-13 04:26:42","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":192150,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/8c94f5c12e204d4f341a9f7a.png"},{"id":109205340,"identity":"8c73b1e4-29e4-49d8-a2b8-6765d52123c4","added_by":"auto","created_at":"2026-05-13 15:04:18","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":143516,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/e0317bd91692435cb0535403.png"},{"id":109205059,"identity":"c70e9ab4-18d9-4124-a6d7-36914210783d","added_by":"auto","created_at":"2026-05-13 15:03:12","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":191969,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this 
version\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/d3cfa46a139a3b69ba4564f4.png"},{"id":109205002,"identity":"93e4dd8c-3405-4323-be96-e006aba25e1f","added_by":"auto","created_at":"2026-05-13 15:03:10","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":166112,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/31dbdb59525f08ec83f1a2c7.png"},{"id":109147046,"identity":"d9a7365b-1dd0-43c6-a071-9b8bac25c3ca","added_by":"auto","created_at":"2026-05-13 04:26:42","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":191709,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/243468ef11098f5391d180e5.png"},{"id":109205350,"identity":"ccf24539-b051-43cc-a251-49a7622d15d7","added_by":"auto","created_at":"2026-05-13 15:04:22","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":432321,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this version\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/67f7ba79ba1bdf9f486c97be.png"},{"id":109147048,"identity":"dca50804-df0d-4cbc-bb31-4aa97086b238","added_by":"auto","created_at":"2026-05-13 04:26:42","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":469555,"visible":true,"origin":"","legend":"\u003cp\u003eLegend not included with this 
version\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/60f0cbae45ef18d07335839a.png"},{"id":109249413,"identity":"34e12517-4375-4436-82a7-bcddc1ebfe95","added_by":"auto","created_at":"2026-05-14 08:51:09","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1968900,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/59c11372-dad4-4bf7-b788-3fd4072c7a7c.pdf"},{"id":109206012,"identity":"ea0439cf-3dce-4308-90bb-30d19229c53b","added_by":"auto","created_at":"2026-05-13 15:10:30","extension":"zip","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":1770694,"visible":true,"origin":"","legend":"","description":"","filename":"BMCsupplementaryfile.zip","url":"https://assets-eu.researchsquare.com/files/rs-9558928/v1/f637b3c71c4efe2433298c84.zip"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003e\u003cstrong\u003eExplainable Machine Learning Models as Screening tool for Anxiety and Depression in Medical Students Based on Non-Stigmatizing Lifestyle and Sociodemographic Factors- Pakistan. A Validation Study\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cbr\u003e\u003c/p\u003e","fulltext":[{"header":"Introduction","content":"\u003cp\u003eStudents are the most vulnerable population group facing stressful events in life, especially those achieving higher professional education (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). Increasing levels of stress, anxiety and depression among medical undergraduates has been recognized as a significant and under-reported health problem in the community (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e). 
It is also observed that medical graduates have increased levels of mental distress compared to their peers and the general population (\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e). This alarming situation has also been identified in both developed and developing countries (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e). Global evidence shows that medical students have an excessively large proportion of psychological distress due to academic demands, competitive contexts, and clinical stressors (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e). Additional stressors were attributed to inflexible teaching schedules, an intensive curriculum, and stringent administrative frameworks (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e). This burden is further aggravated in low- and middle-income countries (LMICs) such as Pakistan due to added challenges of political instability, sociocultural stigma towards mental health, lowered access to professional mental health services and institutional support systems (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eAlthough the issue is substantial, there is still a lack of effective screening of medical students with mental health issues. There are multiple tools available for assessing psychological distress in medical students However, there is no consensus on the tools used to detect these symptoms (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e). Traditional screening methods are based on self-reported psychological symptoms that tend to be affected by stigma, fear of discrimination, and distress of academic or professional repercussions (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e). 
Such barriers are especially acute in collectivist and stigmatizing situations, in which the sharing of mental health issues can be discouraged.\u003c/p\u003e \u003cp\u003eNew developments in the field of artificial intelligence (AI) and machine learning (ML) present new opportunities to conduct mental health screening by detecting multidimensional trends in data. Nevertheless, the transparency, interpretability, and ethical accountability of the use of ML in mental health have been constrained due to high-stakes decision-making settings (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e). The limitations are mitigated by explainable machine learning (XAI) methods like SHapley Additive explanations (SHAP) which give understandable insights into how a model makes predictions, thus increasing trust and making it easier to use in practice (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eParticularly, some lifestyle and contextual variables, including sleeping patterns, physical activity, academic achievements, and social support, have been repeatedly related to mental health outcomes and are non-stigmatizing proxies of psychological distress(\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e). Using these variables will enable the creation of scalable and cultural acceptable screening instruments that will reduce stigma and allow early identification.\u003c/p\u003e \u003cp\u003eWith this background, the purpose of the current research was to design and test explainable machine learning models to identify anxiety and depression screening among medical students based on non-stigmatizing lifestyle and sociodemographic variables. 
Our hypothesis was that these models would have reasonable predictive accuracy and be interpretable, and hence, provide a scalable and ethically acceptable and context-specific solution to mental health screening in resource-constrained learning environments.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eThis was a cross-sectional study conducted in Islamabad, capital of Pakistan. The study recruited 1630 first- and second-year medical students through universal sampling approach from nine different medical colleges of Islamabad selected by simple random techniques. Data was collected from the selected medical colleges through an on-site survey. Prior to administering the questionnaire, the researchers conducted an introductory session with the students, during which the study objectives, procedures, and ethical considerations were explained. The questionnaire was then introduced, and its structure briefly explained to ensure understanding. Subsequently, all the students completed the questionnaire in the presence of the researchers, who were available to address any queries. This approach facilitated a high participation with response rate of 91%. Inclusion criteria were enrolled medical students in medical colleges of Islamabad. Students who declined to participate or didn\u0026rsquo;t give consent or were absent during data collection period and incomplete questionnaires were excluded.\u003c/p\u003e \u003cp\u003eParticipation was voluntary; informed consent was taken and confidentiality assured. The study protocol was reviewed and approved by the institutional ethics committee (Approval No. 00009 IHSA/P\\D-2022). Development of data collection tools was guided by extensive stakeholder engagement involving students, faculty, and mental health professionals to ensure contextual relevance and acceptability of variables to be included. 
Sociodemographic variables (e.g., age, gender, year of study, socioeconomic indicators) and lifestyle factors (e.g., sleep patterns, physical activity, academic workload, and social factors) were collected. Anxiety and depression were assessed using validated self-report instruments (GAD-7 and PHQ-9).\u003c/p\u003e \u003cp\u003e \u003cb\u003eData Preprocessing and MLM Development.\u003c/b\u003e \u003c/p\u003e \u003cp\u003eFollowing data cleaning, imputation, and encoding, the dataset was split into training (80%) and testing (20%) subsets. Separate Random Forest classification models were developed for anxiety and depression due to their capacity to model complex, non-linear relationships. Hyperparameters were optimized using cross-validation. The performance on the held-out test dataset was evaluated on the standard classification measures of accuracy, sensitivity, specificity and area under the receiver operating characteristic curve (AUC-ROC). These measurements were selected because they offer a level of balance when assessing predictive performance, especially when it comes to mental health screening. The MLM coding is available as supplementary file.\u003c/p\u003e \u003cp\u003e \u003cb\u003eExplainability of Models with SHAP Analysis.\u003c/b\u003e \u003c/p\u003e \u003cp\u003eSHapley Additive exPlanations (SHAP) were used to measure the importance of each predictor to model predictions in order to increase interpretability and facilitate practical application. SHAPs were computed per observation and summed up to establish variable\u0026rsquo;s significant on a global scale. 
Furthermore, analysis based on the class specific SHAP was performed to investigate the impact of various classes within sociodemographic and lifestyle variables on the probability of experiencing anxiety and depression.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStatistical Software\u003c/h2\u003e \u003cp\u003eAnalysis was done in existing packages of statistical and machine learning software. First part of Data analysis was performed using SPSS statistical software version 26.0 (IBM). Multiple regression was performed to assess the effect of several factors on the likelihood that respondents have anxiety and depression according to GAD-7 and PHQ-9 scale score. Analysis was conducted using anxiety and depression as the dependent variable and several sociodemographic and behavioral factors as the independent variables, including age, sex, place of residence, type of family setup, physical activity, family and social support, sleep pattern, and academic performance The results were presented with statistical significance (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05), regression coefficients, and 95% confidence intervals for beta-coefficient for each of the predictors. Secondly, Open-source libraries were used to develop the model, perform validation and SHAP analysis, and the reproducibility and transparency of the results. (coding files attached as supplementary file)\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eCross-Sectional study findings\u003c/h2\u003e \u003cp\u003eA total of 1630 participants were surveyed, and overall prevalence of depression and anxiety was found 57.8% and 46.4%, respectively. The age distribution was 1274 (78.2%) aged 18\u0026ndash;20 years, 314 (19.3%) aged 21-23years, 28 (1.7%) \u0026lt;18years, and 14 (0.9%) aged 24-26years. Of the participants, 1,100 (67.5%) were female and 530 (32.5%) were male. 
By academic year, 800 (49.1%) participants were first-year medical students, while 830 (50.9%) were in their second year. Socioeconomic backgrounds varied, with 1302 (79.9%) from upper middle-income, 272 (16.7%) from high-income, and 56 (3.4%) from low middle-income households. In terms of residency, 694 (42.6%) participants were hostellers, while 936 (57.4%) lived with their families. Most participants, 1216 (74.6%), came from nuclear families, while 414 (25.4%) were from joint families. Regarding parental status, 1524 (93.5%) participants had both parents living together and 106 (6.5%) had one deceased parent.\u003c/p\u003e \u003cp\u003eMultivariable ordinal logistic regression models were used to test the study hypotheses and to estimate adjusted odds ratios (aORs) for anxiety and depression. Before the analysis, the model assumptions were checked: the proportional odds assumption was tested using the Brant test, and variance inflation factors (VIFs) were used to detect multicollinearity (a VIF below 5 was considered acceptable). Variables with potential clinical or statistical relevance were included in the model to adjust for confounding. Adjusted odds ratios with 95% confidence intervals (CIs) were reported, and statistical significance was set at p\u0026thinsp;\u0026lt;\u0026thinsp;0.05. This analysis showed that female gender was associated with higher odds of both anxiety (aOR\u0026thinsp;=\u0026thinsp;2.13, 95% CI: 1.59\u0026ndash;2.86) and depression (aOR\u0026thinsp;=\u0026thinsp;1.64, 95% CI: 1.22\u0026ndash;2.22). 
Significant risk factors for both outcomes included irregular sleep (anxiety: aOR\u0026thinsp;=\u0026thinsp;1.96, 95% CI: 1.39\u0026ndash;2.78; depression: aOR\u0026thinsp;=\u0026thinsp;2.27, 95% CI: 1.64\u0026ndash;3.23), family history of mental illness (anxiety: aOR\u0026thinsp;=\u0026thinsp;2.38, 95% CI: 1.72\u0026ndash;3.33; depression: aOR\u0026thinsp;=\u0026thinsp;3.23, 95% CI: 2.33\u0026ndash;4.76), history of harassment (anxiety: aOR\u0026thinsp;=\u0026thinsp;2.04, 95% CI: 1.52\u0026ndash;2.70; depression: aOR\u0026thinsp;=\u0026thinsp;2.56, 95% CI: 1.85\u0026ndash;3.45), and peer pressure (anxiety: aOR\u0026thinsp;=\u0026thinsp;2.78, 95% CI: 2.04\u0026ndash;3.70; depression: aOR\u0026thinsp;=\u0026thinsp;3.45, 95% CI: 2.50\u0026ndash;4.76). Protective factors included strong social support (anxiety: aOR\u0026thinsp;=\u0026thinsp;0.63, 95% CI: 0.48\u0026ndash;0.84; depression: aOR\u0026thinsp;=\u0026thinsp;0.49, 95% CI: 0.37\u0026ndash;0.66), strong family support (anxiety: aOR\u0026thinsp;=\u0026thinsp;0.49, 95% CI: 0.32\u0026ndash;0.75; depression: aOR\u0026thinsp;=\u0026thinsp;0.47, 95% CI: 0.31\u0026ndash;0.71), and positive coping style (anxiety: aOR\u0026thinsp;=\u0026thinsp;0.53, 95% CI: 0.40\u0026ndash;0.70; depression: aOR\u0026thinsp;=\u0026thinsp;0.49, 95% CI: 0.37\u0026ndash;0.64). Additionally, higher screen time (\u0026ge;\u0026thinsp;5 h vs\u0026thinsp;\u0026lt;\u0026thinsp;2 h) increased the odds of anxiety (aOR\u0026thinsp;=\u0026thinsp;1.79, 95% CI: 1.12\u0026ndash;2.78) and depression (aOR\u0026thinsp;=\u0026thinsp;2.33, 95% CI: 1.45\u0026ndash;3.70), while excellent academic performance was associated with lower odds of both outcomes (anxiety: aOR\u0026thinsp;=\u0026thinsp;0.49, 95% CI: 0.33\u0026ndash;0.74; depression: aOR\u0026thinsp;=\u0026thinsp;0.35, 95% CI: 0.22\u0026ndash;0.53). 
The full detailed table is included in the supplementary file.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eMachine Learning Models Analysis\u003c/h3\u003e\n\u003cp\u003eThis section presents the findings of a machine learning analysis aimed at predicting mental health outcomes among students. Specifically, the analysis focuses on predicting anxiety (measured using the GAD-7 scale) and depression (measured using the PHQ-9 scale) from demographic and lifestyle risk factors. The analysis employs the Random Forest classification algorithm to identify patterns in the data and to determine which factors contribute most to mental health outcomes. Additionally, SHAP (SHapley Additive exPlanations) values are used to understand how specific classes within each variable influence predictions.\u003c/p\u003e\n\u003ch3\u003eDataset Overview\u003c/h3\u003e\n\u003cp\u003eThe dataset comprises survey responses from 1,630 student participants. The data were collected in SPSS format and contain information about demographic characteristics, lifestyle factors, and mental health assessments. 
The analysis utilized 11 independent variables (risk factors) to predict 2 dependent variables (anxiety and depression):\u003c/p\u003e \u003cp\u003e \u003cb\u003eIndependent Variables (Predictors)\u003c/b\u003e:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eGender: Gender of the participant (Male/Female)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eYOS: Year of Study (1st Year/2nd Year)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eResidential Status: Living arrangement (Hostel/Family)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eScreen Time: Daily screen time usage (\u0026lt;\u0026thinsp;2hrs/3-4hrs/\u0026gt;5hrs)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eSleep pattern: Quality of sleep patterns (\u0026lt;\u0026thinsp;6hrs/7-8hrs/irregular)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePeer Pressure: Level of peer pressure experienced (No/Yes)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eSocial Media Usage: Extent of social media engagement (rarely/occasionally/daily)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eFamily Support: Level of family support received (No/Yes)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eLack of Social Support: Lack of strong social support network (No/Yes)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003ePhysical Activity: Level of physical activity (sedentary/Moderate(1\u0026ndash;2 days/week)/High(+\u0026thinsp;3 days/week))\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eAcademic Performance: Academic performance level (Failing 
performance/poor performance/moderate performance/good performance/excellent performance)\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eDependent Variables (Outcomes)\u003c/b\u003e:\u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eAnxiety (GAD-7): Measured on a scale of 0\u0026ndash;3, representing severity levels of anxiety symptoms\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eDepression (PHQ-9): Measured on a scale of 0\u0026ndash;4, representing severity levels of depressive symptoms\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eData Quality\u003c/h2\u003e \u003cp\u003eThe dataset was examined for missing values and data quality issues. The analysis confirmed that there were no missing values across any of the 11 variables used in this study, ensuring complete data for all 1,630 participants.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eMethodology- Machine Learning Models\u003c/h3\u003e\n\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eRandom Forest Classifier\u003c/h2\u003e \u003cp\u003eThe Random Forest algorithm was selected for this classification task due to its robustness, ability to handle non-linear relationships, and built-in feature importance estimation. 
Random Forest is an ensemble learning method that constructs multiple decision trees during training and outputs the class that is the mode of the classes predicted by individual trees.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eModel Configuration\u003c/h2\u003e \u003cp\u003eThe following configuration was used for both the Anxiety and Depression prediction models:\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Taba\" border=\"1\"\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eParameter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValue\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNumber of Trees (n-estimators)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e100\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMaximum Depth\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNone (unlimited)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMinimum Samples to Split\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMinimum Samples per Leaf\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e 
\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRandom State\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e42 (for reproducibility)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eData Splitting Strategy\u003c/h2\u003e \u003cp\u003eThe dataset was divided into training (80%) and testing (20%) sets using stratified sampling. Stratification ensures that the proportion of each class in the target variable is maintained in both the training and testing sets, which is particularly important when dealing with imbalanced classes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eSHAP Analysis\u003c/h2\u003e \u003cp\u003eSHAP (SHapley Additive exPlanations) values were computed using TreeExplainer for Random Forest models. SHAP provides both the magnitude and direction of each feature's impact on predictions, allowing us to identify which specific classes within each variable increase or decrease the risk of anxiety and depression. 
Key metrics include:\u003c/p\u003e \u003cp\u003e \u003cul\u003e \u003cli\u003e \u003cp\u003eMean |SHAP|: Average absolute impact (higher\u0026thinsp;=\u0026thinsp;more influential)\u003c/p\u003e \u003c/li\u003e \u003cli\u003e \u003cp\u003eMean SHAP (signed): Direction of impact (positive\u0026thinsp;=\u0026thinsp;increases risk, negative\u0026thinsp;=\u0026thinsp;decreases risk)\u003c/p\u003e \u003c/li\u003e \u003c/ul\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eResults of SHAP-Model Performance\u003c/h2\u003e \u003cdiv id=\"Sec15\" class=\"Section3\"\u003e \u003ch2\u003eModel Performance\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabb\" border=\"1\"\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCorrect Predictions\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTotal Test Samples\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAnxiety (GAD-7)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e76.69%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e275\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e326\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDepression (PHQ-9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e77.30%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e264\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e326\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eDetailed Classification Metrics - Anxiety Model\u003c/h2\u003e \u003cp\u003eThe following table presents precision, recall, and F1-score for each anxiety severity class:\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabc\" border=\"1\"\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eClass\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSupport\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e0 (Minimal)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.82\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e175\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1 (Mild)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.72\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.76\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.74\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e128\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2 (Moderate)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.42\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e19\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3 (Severe)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.50\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.67\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eAcademic and lifestyle-related factors are the main drivers of the anxiety model. Academic performance makes the largest contribution (\u0026asymp;\u0026thinsp;15%), followed by physical activity, screen time, and sleep pattern, all of which are above the mean contribution (~\u0026thinsp;9%). This implies that the most influential predictors of anxiety risk are performance pressure and daily behavioral habits. Residential status and year of study have mid-level effects that capture contextual and transitional influences, whereas social and demographic factors such as inadequate social support, social media use, peer pressure, gender, and family support contribute less and fall below the average level. In general, the model indicates that modifiable behaviors and academic stressors are more predictive of anxiety than personal or social characteristics that remain constant.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe confusion matrix shows moderate overall performance (accuracy\u0026thinsp;\u0026asymp;\u0026thinsp;76.69%). Predictions are strongest for the lower levels of anxiety (classes 0 and 1), where most cases are correctly identified (158 and 103, respectively), so the model is reliable for the majority of cases. 
Nevertheless, performance decreases for the higher severity classes (2 and 3), which are often misclassified into the next lower class, indicating that moderate-to-severe anxiety is difficult to differentiate. Errors fall mainly between adjacent classes (e.g., between classes 0 and 1, and between classes 1 and 2), which means that the model captures the overall severity pattern but cannot resolve the class boundaries accurately. Overall, the model is trustworthy for identifying minimal to mild anxiety but less effective at identifying higher levels of anxiety.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eDetailed Classification Metrics - Depression Model\u003c/h2\u003e \u003cp\u003eThe following table presents precision, recall, and F1-score for each depression severity class:\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabd\" border=\"1\"\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eClass\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePrecision\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eRecall\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eF1-Score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSupport\u003c/p\u003e 
\u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e0 (Minimal)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.79\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.79\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e138\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1 (Mild)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.77\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.79\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e150\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2 (Moderate)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.65\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.53\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.59\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e32\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3 (Moderately Severe)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e0.60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.75\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4 (Severe)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSleep pattern, academic performance, and physical activity are the key factors in the depression model, each contributing about 13\u0026ndash;14%, which is above the average level of importance and indicates the key role of behavioral and functional factors. Screen time also makes a notable contribution, which marks it as a relevant lifestyle risk factor. Residential status and year of study are mid-level influences, whereas psychosocial and demographic factors (peer pressure, gender, social media use, lack of social support, family support) are less important and fall below average. 
In general, the model indicates that the risk of depression is mainly determined by modifiable everyday behaviors, especially sleep, rather than by fixed personal or social traits.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe confusion matrix for the depression model (accuracy\u0026thinsp;\u0026asymp;\u0026thinsp;77.30%) shows strong performance in identifying minimal and mild cases, which make up most of the data. Most individuals in these groups are correctly classified (110 and 130), which indicates good reliability for the common, lower-severity groups. However, performance drops for moderate cases, with observable misclassification into the neighboring lower categories, which may indicate underestimation of severity. In the moderately severe and severe classes, the model performs poorly because the support is very low: it makes only a few correct predictions and tends to misclassify cases into lower levels. Most errors are between adjacent classes (between classes 0 and 1, and between classes 1 and 2), which means that the model captures the overall severity progression but lacks specificity at the higher-risk levels.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eFeature Importance Analysis\u003c/h2\u003e \u003cp\u003eFeature importance analysis reveals which risk factors have the greatest influence on predicting mental health outcomes. 
The Random Forest algorithm calculates importance based on how much each feature contributes to reducing impurity across all trees.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003eTop Risk Factors for Anxiety (GAD-7):\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabe\" border=\"1\"\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRank\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRisk Factor\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eContribution (%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAcademic Performance\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e15.19%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePhysical Activity\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13.08%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eScreen Time\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e \u003cp\u003e11.78%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSleep Pattern\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e11.26%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eResidential Status\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9.08%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYOS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e8.33%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLack of Social Support\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7.10%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSocial Media Usage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7.08%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePeer Pressure\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e6.14%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGender\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.91%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFamily Support\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.05%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cdiv id=\"Sec20\" class=\"Section3\"\u003e \u003ch2\u003eTop Risk Factors for Depression (PHQ-9):\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"No\" id=\"Tabf\" border=\"1\"\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRank\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRisk Factor\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eContribution (%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSleep pattern\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e13.74%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAcademic Performance\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13.54%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePhysical Activity\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e13.53%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eScreen Time\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e11.49%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eResidential Status\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e8.81%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYOS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e8.06%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePeer Pressure\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e7.08%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGender\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6.77%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSocial Media Usage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.96%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLack of Social Support\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.83%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFamily Support\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5.20%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe comparison shows that behavioral and academic factors drive both models, but with different priorities: anxiety is driven most strongly by academic performance, which points to performance-related stress as the key factor, whereas depression is driven most strongly by sleep pattern, which points to a stronger physiological and regulatory component. 
Physical activity and screen time rank highly in both models, which supports the importance of lifestyle behaviors across conditions. Residential status and year of study are contextual factors with moderate influence in both models, while psychosocial and demographic variables play a lesser role overall. Notably, peer pressure has relatively greater importance for depression than for anxiety, whereas lack of social support is slightly more relevant for anxiety. In general, anxiety appears to be shaped more by external performance and activity pressures, whereas depression is shaped more by internal regulation, especially sleep, together with the shared lifestyle risks.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003eClass-Level Impact Analysis\u003c/h2\u003e \u003cp\u003eThis section examines which specific classes within each categorical variable have the highest and lowest association with anxiety and depression outcomes. This analysis helps identify specific risk groups that may benefit most from targeted interventions. (The full analysis is available as a separate supplementary file.)\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003eSHAP Values Analysis\u003c/h2\u003e \u003cp\u003eSHAP (SHapley Additive exPlanations) values provide insight into how each class within a variable influences the model's predictions. 
Positive SHAP values indicate that the class increases the predicted risk of anxiety/depression, while negative values indicate a protective effect.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe SHAP interaction plot shows that the plotted features (gender, year of study, residential status, and screen time) have relatively weak, tightly concentrated interaction effects on the anxiety prediction. Most values cluster around zero, implying that these features have no strong pairwise effects, in contrast to the strong main effects. Gender and year of study show slightly wider dispersion, suggesting some variability in their interactions, especially with residential status, although without a clear direction. The interactions involving screen time are weak and tightly concentrated around zero, which means that its effect on anxiety is largely independent of these demographic variables. Overall, the plot indicates that the anxiety prediction depends more on the contributions of individual features than on interactions between them, which supports the view that the primary behavioral factors outweigh demographic interactions.\u003c/p\u003e \u003cp\u003eBy contrast, the wider model results show that the most significant drivers are the key behavioral factors, namely academic performance, physical activity, sleep pattern, and screen time (as a main effect); that is, 
the prediction of anxiety is influenced far more by these individual lifestyle and performance-related factors than by interactions between demographic or contextual factors.\u003c/p\u003e \u003cdiv id=\"Sec23\" class=\"Section3\"\u003e \u003ch2\u003eKey SHAP Findings - Depression:\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe interaction plot indicates that most feature interactions in the depression model are weak, with most values centered around zero, which means that the prediction of depression is driven primarily by independent (main) effects rather than by strong pairwise interactions. Year of study (YOS) shows the widest dispersion, indicating the most prominent, though still modest, interactions with gender and residential status, possibly reflecting differences in stress or adaptation between academic levels. Gender interactions are also mildly spread but show no consistent directional effect, meaning that gender has little influence on the effects of other variables. Residential status shows very little interaction, with values tightly concentrated around zero, which means that its effect is largely independent.\u003c/p\u003e \u003cp\u003eScreen time and sleep pattern show the least interaction overall. Their values cluster very tightly around zero, meaning that their effects on depression are direct behavioral effects rather than the result of interactions with demographic or contextual variables. 
This is particularly notable because sleep pattern is a leading predictor in the model: its effect is large but mostly independent.\u003c/p\u003e \u003cp\u003eOverall, the plot confirms that key behavioral drivers such as sleep, physical activity, and screen time act independently, while demographic/contextual variables (gender, YOS, residential status) contribute weakly and do not meaningfully interact, reinforcing that depression risk is primarily shaped by individual lifestyle factors rather than complex interplay.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section2\"\u003e \u003ch2\u003eImportance of Features and SHAP-Based Model Explainability\u003c/h2\u003e \u003cp\u003eThe predictors contributing most to the risk of anxiety and depression were identified using the feature importance analysis of the Random Forest models. The strongest predictors of anxiety were academic performance, physical activity, screen time, and sleep patterns, followed by residential status, year of study, and social support indicators. A similar pattern was found for depression, where sleep patterns, academic performance, physical activity, and screen time were the most important predictors.\u003c/p\u003e \u003cp\u003eTo further explain model predictions, SHAP (SHapley Additive exPlanations) values were calculated using the TreeExplainer algorithm. SHAP analysis quantified the magnitude and direction of each variable's contribution, which was important for identifying risk-enhancing and protective effects. 
The mean absolute SHAP values of the lifestyle-related variables were consistently higher than those of the sociodemographic variables, which reinforces their dominance in the stratification of mental health risk.\u003c/p\u003e \u003cp\u003eFor both outcomes, poor sleep habits, poor academic achievement, low physical activity, and more screen time were linked to higher predicted risk, whereas positive social and familial conditions were protective. SHAP visualizations also identified non-linear and class-specific relationships, meaning that the impact of some factors differed across severity categories. The results illustrate the added value of explainable machine learning in revealing multifaceted, interpretable risk patterns that cannot readily be identified through more conventional regression-based models.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThis study investigated the prevalence and associated factors of anxiety and depression among undergraduate medical students, revealing that depression and anxiety were experienced by 57.8% and 46.4% of students, respectively. These findings reflect a moderate burden of mental health issues in this population and underscore the influence of various socio-demographic and psychosocial factors.\u003c/p\u003e \u003cp\u003eThis paper offers empirical evidence that explainable machine learning models based on non-stigmatizing lifestyle and sociodemographic variables can be useful for screening anxiety and depression in medical students. The achieved accuracies, 76.69% for anxiety and 77.30% for depression, are similar to those of current machine learning applications in mental health, where prediction is a complicated task because of multifactorial effects (\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e). 
This evidence supports the use of such models as screening tools rather than diagnostic ones.\u003c/p\u003e \u003cp\u003eOne of the major contributions of this research is its use of explainable AI with SHAP analysis, which increases the interpretability of the models and can give actionable insights into risk factors. Of these, sleep quality was the most significant potentially modifiable predictor, in line with the extensive literature identifying a two-way relationship between sleep disturbances and mental health disturbances (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e). The SHAP analysis indicates that sleep duration is a significant predictor of anxiety, with optimal sleep (7\u0026ndash;8 hours) showing the strongest protective effect. Conversely, the relationship between sleep patterns and depression appears more complicated and non-linear, with some unexpected protective associations across categories, especially for irregular sleep schedules. This finding should be interpreted with caution rather than taken as evidence of a protective effect. Recent evidence consistently shows that irregular sleep and shorter sleep duration are associated with more severe depressive symptoms (\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e). This association is particularly relevant to medical students, among whom sleep disturbances are frequent and are typically considered part of the broader psychosocial burden of training (\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e). Intervention approaches such as sleep education, behavioral changes, and relaxation techniques are recommended to address contributing factors. 
This identifies sleep hygiene interventions as a viable and scalable entry point for mental health promotion in medical institutions (\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe association of high screen time with negative mental health outcomes is consistent with growing global evidence linking heavy digital use to higher rates of anxiety and depression, as well as sleep disturbances (\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e). This highlights the need to include digital well-being in student support programs. Similarly, the protective influence of physical activity in the current study supports prior research on its antidepressant and anxiolytic effects (\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e). Campus-wide policies that encourage organized exercise, including incorporating physical activity into campus life, can have significant psychological benefits.\u003c/p\u003e \u003cp\u003eThe interdependence between academic and psychosocial determinants of mental health is also reflected in the roles of academic performance and social support. Academic underachievement can be both a cause and an effect of psychological distress, which implies the need for interventions that combine academic and mental health support (\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e). At the same time, the strong association with social support highlights the role of peer networks and mentorship systems, along with community-building initiatives.\u003c/p\u003e \u003cp\u003eBy using non-stigmatizing variables, the proposed models address a major barrier to mental health screening, namely stigma, and can thus increase acceptability and participation. 
In addition, the models are explainable, which supports the ethical transparency needed to implement them in education and policy-making.\u003c/p\u003e \u003cp\u003eThese findings have significant implications for policy and practice. First, such models can be incorporated into routine student wellness activities in medical colleges and universities as an early screening tool to identify at-risk students in a timely manner. Second, the identification of modifiable lifestyle factors provides a basis for specific preventive strategies, including sleep hygiene programs, physical activity, and online wellness training. Third, at the systems level, this strategy is consistent with the demand for scalable, low-cost mental health interventions in LMICs like Pakistan, where resources for specialist care are limited.\u003c/p\u003e \u003cp\u003e \u003cb\u003eeMental Health Implementation and mHealth Relevance.\u003c/b\u003e \u003c/p\u003e \u003cp\u003e The paper is consistent with the guidelines of the mHealth Evidence Reporting and Assessment (mERA) initiative, as it shows the practicability of a digitally enabled and scalable mental health screening solution that is applicable to resource-limited educational environments. The use of routinely collectable, non-stigmatizing lifestyle and sociodemographic variables supports acceptability and feasibility, which are major areas of focus in implementation research.\u003c/p\u003e \u003cp\u003eThis is especially relevant to LMIC settings like Pakistan, in which mental health workforce shortages, together with added challenges such as political instability and reduced access to institutional support, limit the reach and scalability of specialist care. 
Because the model allows early risk stratification at the population level, it enables task-shifting strategies in which educators and non-specialist staff could recognize at-risk students and arrange appropriate referrals or preventive measures.\u003c/p\u003e \u003cp\u003eMoreover, the identification of modifiable lifestyle predictors is consistent with behavioral intervention models and can be combined with low-cost, evidence-based measures, including sleep hygiene education, physical activity promotion, and peer support. This makes the model not only a screening tool but also a decision-support model for planning specific interventions, which furthers the agenda of precision mental health in the community.\u003c/p\u003e \u003cp\u003e \u003cb\u003eTowards Precision Mental Health within Academic Institutions.\u003c/b\u003e \u003c/p\u003e \u003cp\u003eIntegrating explainable machine learning is a promising direction for delivering precision mental health. It allows individual risk profiling while remaining interpretable and ethically transparent. The model can be conceptualized as moving from a screening tool to a decision-support system that might inform specific preventive interventions, including sleep-hygiene and peer-support programs.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eTo sum up, this paper illustrates that explainable machine learning models using non-stigmatizing lifestyle factors can be effective, scalable, and ethically transparent in screening for anxiety and depression in university students. 
These strategies have great potential to enhance early detection and targeted prevention, contributing to the broader goal of strengthening mental health systems in resource-constrained educational environments.\u003c/p\u003e \u003cdiv id=\"Sec27\" class=\"Section2\"\u003e \u003ch2\u003eStrengths and Limitations\u003c/h2\u003e \u003cp\u003eThe major strengths of this study are its large sample size, stakeholder-informed variable selection, and the use of explainable AI techniques to improve interpretability and acceptability. However, a few limitations should be considered. The cross-sectional design does not allow causal inferences, and the use of self-reported data could introduce reporting bias. Also, class imbalance in the severe anxiety and depression groups is likely to have limited model performance for these outcomes.\u003c/p\u003e \u003c/div\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eAI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArtificial Intelligence\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eAUC-ROC\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArea Under the Curve\u0026ndash;Receiver Operating Characteristic\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eaOR\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eAdjusted Odds Ratio\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eCI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e 
\u003cp\u003eConfidence Interval\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eGAD-7\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGeneralized Anxiety Disorder-7\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eLMICs\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLow- and Middle-Income Countries\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003emERA\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003emHealth Evidence Reporting and Assessment\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eML\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMachine Learning\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003ePHQ-9\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePatient Health Questionnaire-9\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eSH+\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSelf-Help Plus\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eSHAP\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSHapley Additive exPlanations\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv 
class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eSPSS\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eStatistical Package for the Social Sciences\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eWHO\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eWorld Health Organization\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eXAI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eExplainable Artificial Intelligence\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate: \u003c/strong\u003eThe research was carried out in accordance with the principles of the Declaration of Helsinki and received ethical approval from the Institutional Review Board of the Health Services Academy (No. 00009 IHSA/P/D-2022). All participants provided written informed consent before enrollment in the study. They were informed about the purpose of the study, the voluntary nature of participation, and their right to withdraw at any time without any consequences. Confidentiality and data protection measures were ensured in compliance with ethical guidelines. The study ensured ethical implementation and adoption of machine learning-enhanced mental health screening instruments through the adoption of non-stigmatizing variables and explainable models in a university environment. 
\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication – \u003c/strong\u003enot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials- \u003c/strong\u003eAll data generated or analyzed during this study are included in this published article and available in its supplementary information files.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests: \u003c/strong\u003eThe author declares no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e: No funds, grants, or other support was received.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors’ Contribution\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFR:\u003c/strong\u003e Conceptualization, Methodology, Data collection, Writing – Original final manuscript Preparation and oversaw the execution of the entire study\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAW:\u003c/strong\u003e Writing – Review \u0026amp; Editing, Critical revision of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRR: \u003c/strong\u003eApplication of MLM, development of algorithms and interpretation\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTA: \u003c/strong\u003eApplication of MLM and interpretation\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAR:\u003c/strong\u003e Supervision, Technical oversight, Writing – Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAM:\u003c/strong\u003e Methodological guidance, technical support, Writing – Review \u0026amp; Editing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAll authors\u003c/strong\u003e contributed to the refinement of the manuscript, reviewed the final version, and approved it for publication\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements:\u003c/strong\u003e The authors gratefully acknowledge the medical students who participated in this study. 
A part of this research has been accepted for poster presentation at the International Congress of the Royal College of Psychiatrists (UK), to be held in June 2026.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors’ information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e*Dr Farah Rashid – (Corresponding Author)*\u003c/strong\u003e\u003c/p\u003e\n\u003cp\[email protected]\u003c/p\u003e\n\u003cp\u003ehttps://orcid.org/0009-0006-5931-6204\u003c/p\u003e\n\u003cp\u003ePhD Fellow-Public Health, Health Services Academy, Islamabad. Pakistan. IRSIP Fellowship at the University of Liverpool, UK. Faculty at National University of Sciences and Technology. Islamabad. Pakistan\u003c/p\u003e\n\u003cp\u003eDr Ahmed Waqas\u003c/p\u003e\n\u003cp\[email protected]\u003c/p\u003e\n\u003cp\u003e0000-0002-7492-5052\u003c/p\u003e\n\u003cp\u003eclinical lecturer – SAS doctor\u003c/p\u003e\n\u003cp\u003eUniversity of Liverpool, UK\u003c/p\u003e\n\u003cp\u003eRafay Rashed Siddiqui \u003c/p\u003e\n\u003cp\[email protected]\u003c/p\u003e\n\u003cp\u003eMSc-student Computer Science\u003c/p\u003e\n\u003cp\u003eRheinland-Pfälzische Technische Universität Kaiserslautern-Landau. Germany\u003c/p\u003e\n\u003cp\u003eTalha Ahmed\u003c/p\u003e\n\u003cp\[email protected]\u003c/p\u003e\n\u003cp\u003eMSc-student Computer Science\u003c/p\u003e\n\u003cp\u003eUniversity: California State Polytechnic University, Pomona. 
USA\u003c/p\u003e\n\u003cp\u003eDr Atif Rahman\u003c/p\u003e\n\u003cp\[email protected] \u003c/p\u003e\n\u003cp\u003e0000-0002-2066-4467\u003c/p\u003e\n\u003cp\u003eProfessor of Child Psychiatry \u0026amp; Global Mental Health, Department of Primary Care and Mental Health Institute of Population Liverpool L69 3BX, United Kingdom \u003c/p\u003e\n\u003cp\u003ePhone: +44(0)7807 10 6764 \u003c/p\u003e\n\u003cp\u003eDr Abid Malik\u003c/p\u003e\n\u003cp\[email protected]\u003c/p\u003e\n\u003cp\u003e0000-0002-9084-2185\u003c/p\u003e\n\u003cp\u003eProfessor and HOD Public Mental Health \u003c/p\u003e\n\u003cp\u003eHealth Services Academy, Islamabad\u003c/p\u003e\n\u003cp\u003e+923468544463\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSaipanish R. Stress among medical students in a Thai medical school. Med Teach. 2003;25(5):502\u0026ndash;6.\u003c/li\u003e\n\u003cli\u003eAzim SR, Adnan N, Azim SN, Nisar M, Shamim MS. Frequency of mental distress among medical students from selected medical colleges of Pakistan: A systematic review. J Pak Med Assoc. 2022;72(10):2048\u0026ndash;53.\u003c/li\u003e\n\u003cli\u003eSaravanan C, Wilks R. Medical students\u0026apos; experience of and reaction to stress: the role of depression and anxiety. ScientificWorldJournal. 2014;2014:737382.\u003c/li\u003e\n\u003cli\u003eCarson AJ, Dias S, Johnston A, McLoughlin MA, O\u0026apos;Connor M, Robinson BL, et al. Mental health in medical students. A case control study using the 60 item General Health Questionnaire. Scott Med J. 2000;45(4):115\u0026ndash;6.\u003c/li\u003e\n\u003cli\u003eRotenstein LS, Ramos MA, Torre M, Segal JB, Peluso MJ, Guille C, et al. Prevalence of Depression, Depressive Symptoms, and Suicidal Ideation Among Medical Students: A Systematic Review and Meta-Analysis. JAMA. 2016;316(21):2214\u0026ndash;36.\u003c/li\u003e\n\u003cli\u003eTian-Ci Quek T, Wai-San Tam W, X. Tran B, Zhang M, Zhang Z, Su-Hui Ho C, et al. 
The Global Prevalence of Anxiety Among Medical Students: A Meta-Analysis. International Journal of Environmental Research and Public Health. 2019;16(15).\u003c/li\u003e\n\u003cli\u003eKubwimana L, Mutatsineza G, Tesi L, Wong R. Assessing the Stress Level among Medical Students in Rwanda. Open Journal of Psychiatry. 2022;12(02):174\u0026ndash;87.\u003c/li\u003e\n\u003cli\u003eSarfraz A, Siddiqui S, Galante J, Sikander S. Feasibility and Acceptability of an Online Mindfulness-Based Intervention for Stress Reduction and Psychological Wellbeing of University Students in Pakistan: A Pilot Randomized Controlled Trial. Int J Environ Res Public Health. 2023;20(8).\u003c/li\u003e\n\u003cli\u003eL\u0026apos;Hote D, Potiron L, Levaillant M. Assessing psychological distress among medical students: a systematic review and meta-analysis of tools available. BMC Med Educ. 2026;26(1):267.\u003c/li\u003e\n\u003cli\u003eDyrbye LN, Thomas MR, Shanafelt TD. Systematic review of depression, anxiety, and other indicators of psychological distress among U.S. and Canadian medical students. Acad Med. 2006;81(4):354\u0026ndash;73.\u003c/li\u003e\n\u003cli\u003eRudin C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat Mach Intell. 2019;1(5):206\u0026ndash;15.\u003c/li\u003e\n\u003cli\u003eLundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. Adv Neural Inf Process Syst. 2017;30.\u003c/li\u003e\n\u003cli\u003eFirth J, Solmi M, Wootton RE, Vancampfort D, Schuch FB, Hoare E, et al. A meta-review of \u0026quot;lifestyle psychiatry\u0026quot;: the role of exercise, smoking, diet and sleep in the prevention and treatment of mental disorders. World Psychiatry. 2020;19(3):360\u0026ndash;80.\u003c/li\u003e\n\u003cli\u003eKandola A, Ashdown-Franks G, Hendrikse J, Sabiston CM, Stubbs B. Physical activity and depression: Towards understanding the antidepressant mechanisms of physical activity. Neurosci Biobehav Rev. 
2019;107:525–39.
Shatte ABR, Hutchinson DM, Teague SJ. Machine learning in mental health: a scoping review of methods and applications. Psychol Med. 2019;49(9):1426–48.
Alvaro PK, Roberts RM, Harris JK. A Systematic Review Assessing Bidirectionality between Sleep Disturbances, Anxiety, and Depression. Sleep. 2013;36(7):1059–68.
Maki KA, Yang L, Farmer N, Papneja S, Wallen GR, Barb JJ. Sleep regularity and duration are associated with depression severity in a nationally representative United States sample. Neurobiol Sleep Circadian Rhythms. 2025;19:100133.
Wallace DA, Redline S, Sofer T, Kossowsky J. Environmental Bright Light Exposure, Depression Symptoms, and Sleep Regularity. JAMA Netw Open. 2024;7(7):e2422810.
Chaabane S, Chaabna K, Khawaja S, Aboughanem J, Mittal D, Mamtani R, et al. Sleep disorders and associated factors among medical students in the Middle East and North Africa: a systematic review and meta-analysis. Sci Rep. 2024;14(1):4656.
Nsengimana A, Mugabo E, Niyonsenga J, Hategekimana JC, Biracyaza E, Mutarambirwa R, et al. Sleep quality among undergraduate medical students in Rwanda: a comparative study. Sci Rep. 2023;13(1):265.
Twenge JM, Campbell WK. Associations between screen time and lower psychological well-being among children and adolescents: Evidence from a population-based study. Prev Med Rep. 2018;12:271–83.
Alchalabi S, Layth A. Exploring the Impact of Academic Stress on Depression Levels in Medical Students. The Medical Journal of Tikrit University. 2025;31(2):413–22.

Institution: National University of Sciences and Technology, Islamabad, Pakistan
Keywords: Anxiety, Artificial Intelligence, Depression, Machine Learning Models, Medical Students
Subject area: Psychiatry
Posted: May 13th, 2026 (Version 1, Research Square preprint)
DOI: https://doi.org/10.21203/rs.3.rs-9558928/v1
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2026) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00