Prediction of COVID 19 vaccine uptake among Nigerian women using supervised machine learning based on 2024 Demographic and Health Survey data

Research Article

Aklilu Habte Hailegebireal

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9288102/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 7

Abstract

Background
COVID-19 vaccination is a key strategy for reducing severe illness and mortality, yet uptake remains uneven in Nigeria. Although prior studies have highlighted multiple predictors of vaccine uptake among women, most rely on traditional analytical approaches, limiting their ability to capture complex, non-linear drivers of vaccine behaviour.
Applying robust, data-driven machine learning methods is therefore essential to identify key predictors of vaccine uptake more accurately and to inform targeted, equitable vaccination interventions.

Methods
This study applied machine learning techniques to data from the 2024 Nigeria Demographic and Health Survey. A weighted sample of 36,205 women aged 15–49 years was analysed in Python. Sociodemographic, household, reproductive, health-system, and media-exposure variables were used as features. After preprocessing, the dataset was split into training (80%) and testing (20%) sets. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied. Multiple supervised machine learning algorithms were fitted, namely Logistic Regression, Decision Tree Classifier, Random Forest, Gradient Boosting Machines, Extreme Gradient Boosting, CatBoost, Support Vector Machines, K-Nearest Neighbors, and Artificial Neural Networks. Model performance was evaluated using accuracy and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) as primary metrics. The best-performing model was interpreted using SHapley Additive exPlanations (SHAP) to identify the most influential predictors of vaccine uptake.

Results
Overall, 27.6% of women reported receiving at least one dose of a COVID-19 vaccine (95% CI: 27.14–28.06). Ten-fold cross-validation with SMOTE produced the best results, underscoring the importance of addressing class imbalance. Random Forest achieved the highest performance (accuracy = 75.1%, AUC = 82.8%). SHAP-based interpretation of the Random Forest model identified older age (35–49 years), region of residence (notably outside the North West and South East, particularly South West), and marital status (not never in union) as key positive contributors to vaccine uptake, while no radio exposure reduced the predicted likelihood of vaccination.
Conclusion
This study identified low COVID-19 vaccine coverage among Nigerian women. Policymakers should adopt targeted, equity-focused vaccination strategies that prioritise underserved regions and socioeconomically disadvantaged women. Integrating COVID-19 vaccination into routine maternal and reproductive health services, expanding community-based and mobile outreach, and strengthening radio-based and culturally tailored communication campaigns are essential to improve coverage.

Keywords: COVID-19, Vaccine, Women, Machine Learning, Nigeria, Demographic and Health Survey

Background
The emergence of the COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), marked a pivotal turning point in modern history, triggering widespread disruptions that reshaped nearly every facet of human life across the globe [ 1 , 2 ]. It has resulted in more than 778.8 million confirmed cases and 7.1 million deaths worldwide, with far-reaching economic and social consequences across nations [ 3 ]. Amid this disruption, vaccination has proven essential in easing the multifaceted burden by limiting transmission, severe illness, and death [ 4 ]. Women represent a critical demographic in the COVID-19 vaccination effort and require targeted education and equitable access. This applies especially to women of reproductive age, whose concerns around fertility and menstrual health influence uptake [ 5 , 6 ]. Previous findings indicate that COVID-19 vaccine hesitancy among women stems from individual factors such as education, misinformation, and perceived risk, and from broader community influences such as access, religion, and local health systems [ 7 – 11 ]. In Nigeria, the national rollout of COVID-19 vaccines began in March 2021, aiming to curb transmission and reduce mortality [ 12 ].
Of the 11 COVID-19 vaccines authorized globally by the World Health Organization, Nigeria has approved seven for national use: Vaxzevria, Covishield, Comirnaty, Jcovden, Spikevax, Sputnik V, and Covilo [ 13 , 14 ]. However, vaccine uptake has varied significantly across regions, socioeconomic groups, and gender lines [ 15 ]. Despite vaccine availability, uptake remains uneven owing to a mix of challenges that contribute to persistent hesitancy, namely misinformation, limited health literacy, and fear of side effects [ 16 , 17 ]. Geographic disparities, religious beliefs, and logistical constraints further complicate vaccine delivery and acceptance [ 18 ]. Understanding the level and drivers of COVID-19 vaccine uptake is crucial for targeted interventions, yet most studies in Nigeria have centered on the general population, overlooking women of reproductive age, whose unique health concerns and sociocultural context warrant focused analysis [ 19 – 21 ]. Despite a growing body of research on COVID-19 vaccine uptake among women [ 22 – 24 ], most existing studies have been limited in scale and analytical depth, relying primarily on basic statistical models. These approaches often fail to capture the complex, non-linear interactions among individual, social, and structural determinants of vaccine behavior. This study addresses this gap by applying machine learning algorithms to a large, representative dataset, enabling more accurate identification of key predictors of vaccine uptake among women in Nigeria. The findings have implications for public health: they offer actionable insights for improving vaccine uptake in the country. For researchers, the study demonstrates the value of advanced modeling in epidemiological analysis, while for policymakers, it provides evidence-based guidance to design equitable and targeted vaccination strategies that can enhance coverage and reduce gender disparities in health outcomes.
Methods

Study setting, data source, and population
This study is based on the women's individual recode (IR) data from the 2024 Nigeria Demographic and Health Survey (NDHS), which was conducted between 1 December 2023 and 7 May 2024 and covered all 36 states of Nigeria and the Federal Capital Territory (FCT). The study population comprised women aged 15–49 years with a reported COVID-19 vaccination status. Women who did not know their vaccination status or had missing data on this variable were excluded.

Sample
Sampling procedures followed a stratified two-stage design based on an updated cartographic frame developed for Nigeria’s first fully digital Population and Housing Census. In the first stage, 1,400 Enumeration Areas (EAs), comprising 701 urban and 699 rural clusters, were selected using probability proportional to size, stratified across 74 distinct strata defined by urban-rural residence and geopolitical zones. In the second stage, 30 households were systematically selected from each EA, yielding a nationally representative sample of households. Eligible women within these households were interviewed using the standardized Women’s Questionnaire, resulting in a final sample of 36,205 respondents. Further methodological details are available in the NDHS technical documentation.

Outcome Variable Definition
The primary outcome of interest was COVID-19 vaccine uptake, operationalized as a binary variable derived from a question asking whether a woman had received a COVID-19 vaccine. Women who reported receiving at least one dose were coded as vaccinated (Yes = 1), while those who had not received any dose were coded as not vaccinated (No = 0).

Predictor Variables (Features)
The predictor variables used in this study encompassed a broad range of sociodemographic, behavioral, and health system factors known to influence vaccine uptake.
These included age of the woman (15–24, 25–34, 35–49), marital status (unmarried, in a marital relationship, and others), educational attainment (no education, primary, secondary, higher), occupation (working, not working), residence (urban, rural), sex of the household head (male vs. female), and household wealth index (poorest, poorer, middle, richer, richest). Reproductive and media exposure variables, such as parity (nulliparous, primiparous, multiparous, grand multiparous) and frequency of listening to the radio, watching television, and reading newspapers (not at all, less than once a week, and at least once a week), were also incorporated. Perceived barriers to maternity care, such as distance to the nearest health facility, permission to seek care, and financial constraints, were assessed and categorized as a big problem or not a big problem. Additional variables included health insurance coverage (yes vs. no), current use of contraception (users vs. non-users), and recent visits to a health facility (yes vs. no).

Data management and analysis
A structured machine learning (ML) approach, including data preprocessing, feature engineering, model training, and performance evaluation, was used. All analyses were carried out in Python 3.13.6 (in Visual Studio Code). To ensure representativeness and account for the complex sampling design, analyses were weighted with sampling weights (v005/1000000). All potentially relevant variables were extracted using Stata version 18 and exported as CSV files for further analysis. To improve model performance and reduce bias, the dataset was cleaned, encoded, and partitioned. Both descriptive and predictive analyses were carried out, with ML algorithms chosen based on their suitability for classification tasks and interpretability in public health settings.

Data preprocessing and analysis
All subsequent preprocessing and modeling steps were conducted in Python.
Initial data preparation involved exploratory data analysis to understand variable distributions and relationships. This was followed by systematic handling of missing values, multicollinearity, discretization of relevant features, detection and treatment of outliers, and balancing of the target variable to address class imbalance. Feature engineering and transformation were applied to enhance model interpretability and performance. All categorical variables were encoded to numerical values using one-hot encoding to ensure compatibility with machine learning algorithms. To address high dimensionality and reduce model complexity, dimensionality reduction techniques were applied following feature engineering. Highly correlated predictors were identified and removed to minimise redundancy. Feature selection was performed to retain only the most informative predictors based on the literature and chi-square test, ensuring that the final model was both parsimonious and robust. The dataset was then partitioned into training (80%) and testing (20%) subsets, with all modeling restricted to features that passed the selection criteria. To address the potential class imbalance between vaccinated and unvaccinated groups, we applied the Synthetic Minority Over-sampling Technique (SMOTE) to the training dataset. This method generates synthetic samples for the minority class by interpolating between existing observations, thereby equalizing the number of vaccinated and non-vaccinated cases without duplicating data [ 25 , 26 ]. The balanced dataset was subsequently used to train all classification models, ensuring that each algorithm had equitable exposure to both outcome classes. This approach improves model sensitivity and reduces bias toward the majority class, enhancing the reliability of predictive performance across all metrics [ 27 ]. 
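The interpolation step described above can be sketched in a few lines. This is a minimal, pure-Python illustration of the SMOTE idea under stated assumptions, not the pipeline used in the study: the function name `smote_oversample`, the toy two-feature rows, and the choice `k=2` are all hypothetical, and a real analysis would normally rely on an established library implementation.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a randomly
    chosen minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy TRAINING split only (hypothetical values): 6 unvaccinated, 3 vaccinated rows.
train_majority = [[0.0, 1.0], [0.2, 0.9], [0.1, 0.8],
                  [0.3, 1.1], [0.0, 0.7], [0.2, 1.2]]
train_minority = [[1.0, 0.0], [0.9, 0.2], [1.1, 0.1]]

# Generate just enough synthetic rows to equalize the two classes.
n_needed = len(train_majority) - len(train_minority)
new_rows = smote_oversample(train_minority, n_needed, k=2)
balanced_minority = train_minority + new_rows
print(len(train_majority), len(balanced_minority))
```

Only the training rows are passed to the oversampler, so the held-out 20% continues to consist of real observations.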
To predict COVID-19 vaccine uptake, this study employed a diverse set of nine supervised machine learning algorithms: Logistic Regression (LR), Decision Tree Classifier (DT), Random Forest Classifier (RF), Gradient Boosting Machines (GB), Extreme Gradient Boosting (XGBoost), CatBoost, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Artificial Neural Networks (ANNs). These algorithms were selected to capture a wide spectrum of modeling approaches, ranging from interpretable linear models and tree-based ensembles to kernel-based classifiers and deep learning architectures. This diversity allowed for a comprehensive comparison of predictive performance across models with varying strengths in handling nonlinearity, feature interactions, and class imbalance. Hyperparameter tuning was conducted using grid search and cross-validation techniques to optimize each model’s performance while minimizing overfitting (Fig. 1).

Model performance evaluation
Model performance was evaluated using an 80:20 train-test split, ensuring that each algorithm was assessed on unseen data. Standardized metrics, including accuracy, sensitivity (recall), precision, Area Under the Receiver Operating Characteristic Curve (AUC-ROC), F1-score, specificity, and negative predictive value (NPV), were computed to capture both overall and class-specific predictive power. To enhance the reliability of these estimates, bootstrapped 95% confidence intervals were generated for each metric across 1,000 resampled iterations. Confusion matrices and classification reports were also produced to visualize prediction errors and class-wise performance. This multi-model evaluation strategy allowed for comparative assessment across diverse learning algorithms.
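As an illustration of how such metrics and a bootstrapped interval can be obtained, the sketch below computes the confusion-matrix metrics from four counts and a percentile bootstrap interval for accuracy. It assumes vaccinated = 1 is the positive class; the function names, the counts, and the percentile method are assumptions for the example, since the text does not state which bootstrap variant was used.

```python
import random


def classification_metrics(tp, fp, tn, fn):
    """Return confusion-matrix metrics, expressed in percent."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    names = ("sensitivity", "specificity", "precision", "npv", "accuracy", "f1")
    values = (sensitivity, specificity, precision, npv, accuracy, f1)
    return {name: round(100 * value, 2) for name, value in zip(names, values)}


def bootstrap_accuracy_ci(y_true, y_pred, n_boot=1000, seed=0):
    """Percentile bootstrap 95% interval for accuracy (cases resampled with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(sum(y_true[i] == y_pred[i] for i in idx) / n)
    scores.sort()
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot) - 1]


# Hypothetical counts, for illustration only (not results from the study).
metrics = classification_metrics(tp=60, fp=20, tn=100, fn=20)
print(metrics)

# The same hypothetical predictions as paired label lists.
y_true = [1] * 80 + [0] * 120
y_pred = [1] * 60 + [0] * 20 + [1] * 20 + [0] * 100
low, high = bootstrap_accuracy_ci(y_true, y_pred)
print(low, high)
```

The same resampling loop can wrap any of the other metrics by recomputing the four counts inside each bootstrap iteration.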
Sensitivity, also known as recall or the true positive rate, measures the proportion of vaccinated women whom the model correctly identifies. In the formulas below, TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, with vaccinated women as the positive class.

$$\text{Sensitivity}=\frac{TP}{TP+FN}\times 100$$

Specificity (true negative rate) measures the proportion of unvaccinated women whom the model correctly identifies, that is, its ability to avoid false positives.

$$\text{Specificity}=\frac{TN}{TN+FP}\times 100$$

Accuracy is the proportion of correctly classified women (both vaccinated and not vaccinated) and measures the overall correctness of the model’s predictions.

$$\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\times 100$$

Precision (positive predictive value) is the proportion of women predicted to be vaccinated who are actually vaccinated; it measures the reliability of positive predictions.

$$\text{Precision}=\frac{TP}{TP+FP}\times 100$$

The F1-score is the harmonic mean of precision and sensitivity, balancing false positives and false negatives. Because precision and sensitivity are already expressed as percentages above, the result is also a percentage.

$$\text{F1-score}=\frac{2\times\text{Precision}\times\text{Sensitivity}}{\text{Precision}+\text{Sensitivity}}$$

AUC-ROC evaluates a model’s ability to distinguish between women who received the COVID-19 vaccine and those who did not. The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) across classification thresholds. The AUC (Area Under the Curve) summarizes this plot into a single value ranging from 0 to 1. Higher AUC values indicate better discrimination between vaccinated and unvaccinated women.

Feature Importance Analysis
To identify the most influential predictors of COVID-19 vaccine uptake, we conducted a comprehensive feature importance analysis across all trained machine learning models.
Statistical association was first assessed using chi-square tests to examine the relationship between categorical predictors and vaccine uptake, helping to identify variables with significant univariate associations for initial screening. In addition, permutation importance with bootstrapped confidence intervals was applied to quantify the change in model performance when each feature was randomly shuffled, offering a robust, model-agnostic measure of variable relevance. To enhance interpretability, SHAP (SHapley Additive exPlanations) was employed across all classifiers to quantify and visualize the contribution of each feature to individual predictions. SHAP is based on cooperative game theory: by fairly assigning each feature's contribution to the model's output, it provides a unified and theoretically sound way to explain individual predictions [ 28 ]. Beeswarm plots were used to display the distribution of SHAP values across the dataset, highlighting both the magnitude and direction of each feature’s influence. Waterfall plots were generated to explain the cumulative impact of predictors on the probability of vaccine uptake for specific individuals, illustrating how each variable pushed the prediction toward or away from the positive class. These visualizations provided intuitive, model-consistent insights into how individual predictors shaped outcomes across algorithms.

Results

Sociodemographic characteristics of respondents
A total of 36,205 women were included in this study; the largest share were from the Northwest region (32.2%), followed by Northcentral (18.01%). The largest age group was 15–24 years (37.9%), and the most common education level was secondary (41.54%). More than one-third (36.39%) of women were multiparous, 18.0% reported using contraceptives, and just 8.38% were currently pregnant.
Regarding vaccination status, the highest proportion of vaccinated women was seen in the Southwest region (45.71%), followed by the Northeast (30.63%) and Northcentral (28.6%) regions. A relatively higher proportion of vaccinated women was observed among women with higher education (40.93%), employed women (32.03%), those living in urban areas (31.13%), and those in the richest wealth quintile (32.31%). Similarly, a higher proportion was reported among multiparous women (30.95%), contraceptive users (36.83%), women who listened to the radio at least once a week (35.18%), and those enrolled in a health insurance scheme (50.98%) (Table 1). Overall, uptake of the COVID-19 vaccine differed significantly across region, age, level of education, residence, wealth index, parity, media exposure (radio, television, newspaper), and barriers to maternity care (money, distance, permission) at p < 0.001 (Table 1).

Table 1 The distribution of sociodemographic and health service-related characteristics of respondents, Nigerian Demographic Health Survey, 2024

| Variable / category | Total (N = 36,205) | Vaccinated [n (%)] | Test statistics |
|---|---|---|---|
| Region | | | χ2 = 1433.31, p < 0.001 |
| Northwest | 11,658 (32.20) | 2,477 (21.24) | |
| Northeast | 5,502 (15.20) | 1,686 (30.63) | |
| Northcentral | 6,522 (18.01) | 1,865 (28.6) | |
| Southeast | 3,052 (8.43) | 559 (18.3) | |
| South | 4,229 (11.68) | 1,007 (23.81) | |
| Southwest | 5,242 (14.48) | 2,396 (45.71) | |
| Age | | | χ2 = 453.15, p < 0.001 |
| 15–24 | 13,658 (37.72) | 3,009 (22.03) | |
| 25–34 | 11,174 (30.86) | 3,202 (28.66) | |
| 35–49 | 11,374 (31.41) | 3,779 (33.23) | |
| Marital status | | | χ2 = 131.51, p < 0.001 |
| Unmarried | 10,025 (27.69) | 2,430 (24.24) | |
| In a marital relationship | 24,393 (67.38) | 6,957 (28.52) | |
| Others | 1,786 (4.93) | 602.8 (33.75) | |
| Educational level | | | χ2 = 825.71, p < 0.001 |
| No education | 11,978 (33.08) | 2,371 (19.80) | |
| Primary | 4,063 (11.22) | 1,176 (28.95) | |
| Secondary | 15,041 (41.54) | 4,345 (28.89) | |
| Higher | 5,123 (14.15) | 2,097 (40.93) | |
| Women occupation | | | χ2 = 664.36, p < 0.001 |
| Not working | 13,600 (37.56) | 2,738 (20.13) | |
| Working | 22,605 (62.44) | 7,253 (32.08) | |
| Residence | | | χ2 = 201.03, p < 0.001 |
| Urban | 17,718 (48.94) | 5,516 (31.13) | |
| Rural | 18,487 (51.06) | 4,474 (24.20) | |
| Wealth index | | | χ2 = 472.95, p < 0.001 |
| Poorest | 5,943 (16.41) | 1,009 (17.00) | |
| Poorer | 6,683 (18.46) | 1,571 (23.5) | |
| Middle | 7,259 (20.05) | 2,176 (29.98) | |
| Richer | 7,980 (22.04) | 2,540 (31.83) | |
| Richest | 8,341 (23.04) | 2,694 (32.31) | |
| Family size | | | χ2 = 24.82, p < 0.001 |
| =5 members | 23,837 (65.84) | 6,412 (26.90) | |
| Head of household | | | χ2 = 51.16, p < 0.001 |
| Male | 29,323 (81.0) | 7,854 (26.78) | |
| Female | 6,882 (19.0) | 2,136 (31.0) | |
| Parity | | | χ2 = 177.06, p < 0.001 |
| Nulliparous | 10,976 (30.32) | 2,614 (23.81) | |
| Primiparous | 4,420 (12.21) | 1,160 (26.25) | |
| Multiparous | 13,175 (36.39) | 4,077 (30.95) | |
| Grand multiparous | 7,634 (21.08) | 2,139 (28.02) | |
| Currently pregnant | | | χ2 = 8.14, p = 0.003 |
| Yes | 3,032 (8.38) | 726 (23.94) | |
| No | 33,173 (91.62) | 9,264 (27.93) | |
| Contraceptive uptake | | | χ2 = 319.26, p < 0.001 |
| Non-users | 29,702 (82.04) | 7,595 (25.57) | |
| Users | 6,503 (18.0) | 2,395 (36.83) | |
| Perceived distance | | | χ2 = 78.57, p < 0.001 |
| Big problem | 8,901 (24.58) | 2,161 (24.28) | |
| Not a big problem | 27,304 (75.42) | 7,829 (28.67) | |
| Getting permission | | | χ2 = 115.29, p < 0.001 |
| Big problem | 3,994 (11.03) | 843 (21.12) | |
| Not a big problem | 32,211 (88.97) | 9,147 (28.4) | |
| Getting money | | | χ2 = 84.83, p < 0.001 |
| Big problem | 17,158 (47.39) | 4,411 (25.71) | |
| Not a big problem | 19,047 (52.61) | 5,579 (29.29) | |
| Watching TV | | | χ2 = 281.60, p < 0.001 |
| Not at all | 19,082 (52.70) | 4,499 (23.58) | |
| < once a week | 5,938 (16.40) | 1,894 (31.90) | |
| At least once a week | 11,186 (30.90) | 3,597 (32.16) | |
| Listening to the radio | | | χ2 = 527.29, p < 0.001 |
| Not at all | 18,731 (51.74) | 4,180 (22.31) | |
| < once a week | 8,141 (22.49) | 2,527 (31.04) | |
| At least once a week | 9,333 (25.78) | 3,284 (35.18) | |
| Reading newspaper | | | χ2 = 100.12, p < 0.001 |
| Not at all | 32,602 (90.05) | 8,746 (26.83) | |
| < once a week | 2,319 (6.41) | 793 (34.18) | |
| At least once a week | 1,284 (3.55) | 451 (35.12) | |
| Health insurance enrolment | | | χ2 = 253.72, p < 0.001 |
| Yes | 1,250 (3.45) | 637 (50.98) | |
| No | 34,955 (96.55) | 9,353 (26.76) | |

Weighted prevalence of COVID-19 vaccine uptake
Over one-fourth of women, 27.6% (95% CI: 27.14, 28.06), were vaccinated against COVID-19.
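The reported interval can be checked with a short calculation. The sketch below uses a simple normal-approximation (Wald) interval for a proportion; this is an assumption made for illustration, because the text does not state how the interval was computed, and a design-based survey estimator would generally give a somewhat different width. The function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


# Prevalence and sample size as reported in the text.
low, high = wald_ci(p=0.276, n=36205)
print(round(100 * low, 2), round(100 * high, 2))  # 27.14 28.06
```

Under this assumption the calculation reproduces the published bounds.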
Of those vaccinated, 51.18% received one dose and 40.83% received two doses.

Results for data preprocessing
To meet the input requirements of machine learning classification models, all categorical variables were transformed into numerical representations using one-hot encoding. This process converted each category into a separate binary indicator (0/1), allowing the models to capture non-ordinal relationships without imposing artificial ordering. There was a marked class imbalance: 26,698 women were unvaccinated (the majority class) and 9,463 were vaccinated (the minority class). To address this imbalance and improve model reliability, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training data. SMOTE generated 17,235 synthetic samples for the vaccinated group, resulting in a balanced distribution of 26,698 observations in each class (Fig. 2).

Model building and performance comparison
We developed ten supervised machine learning algorithms to predict COVID-19 vaccine uptake: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Gradient Boosting, Extreme Gradient Boosting (XGB), CatBoost, Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), and Artificial Neural Network (ANN). Each model’s performance was evaluated and compared using accuracy before and after data balancing (SMOTE) to identify the most effective predictive algorithm. Prior to data balancing, Gradient Boosting (75.1% accuracy) and Support Vector Machine (74.9% accuracy) achieved the highest predictive performance. After applying SMOTE to balance the training data, Random Forest emerged as the top-performing model, achieving an accuracy of 75.1% (Table 2).
Table 2 Performance comparison of machine learning algorithms for predicting COVID-19 vaccine uptake using unbalanced and balanced training datasets, Nigeria, 2024

| Machine learning model | Core metric | Unbalanced training data (before SMOTE) | Balanced training data (after SMOTE) |
|---|---|---|---|
| Logistic regression | Accuracy | 74.7% | 66.0% |
| | AUC | 68.0% | 72.0% |
| Decision tree | Accuracy | 68.2% | 72.6% |
| | AUC | 56.0% | 75.3% |
| Random forest | Accuracy | 71.0% | 75.1% |
| | AUC | 63.0% | 82.8% |
| KNN | Accuracy | 73.8% | 71.3% |
| | AUC | 62.0% | 77.2% |
| Gradient boosting | Accuracy | 75.1% | 66.6% |
| | AUC | 69.0% | 72.4% |
| Extreme gradient boosting (XGB) | Accuracy | 74.4% | 70.4% |
| | AUC | 68.0% | 77.3% |
| CatBoost | Accuracy | 74.8% | 70.4% |
| | AUC | 68.0% | 77.1% |
| Support vector machine (SVM) | Accuracy | 74.9% | 69.0% |
| | AUC | 63.0% | 75.3% |
| Gaussian Naive Bayes (GNB) | Accuracy | 67.2% | 60.0% |
| | AUC | 64.0% | 64.6% |
| Artificial neural network (ANN) | Accuracy | 73.2% | 68.9% |
| | AUC | 66.0% | 75.3% |

Receiver Operating Characteristic (ROC) curves were generated for each machine learning model to evaluate classification performance on the training dataset. The curves were plotted both before and after applying balancing (SMOTE), allowing for visual comparison of model discrimination under imbalanced and resampled conditions (Fig. 3 and Fig. 4).

Important feature selection and model interpretability using RF
The probability of COVID-19 vaccine uptake was estimated using the random forest (RF), the ML algorithm with the highest accuracy and AUC in the balanced data. As RF is a tree-based model, we used the SHapley Additive exPlanations (SHAP) Tree Explainer to compute global and local feature importance. The SHAP waterfall plot illustrates how individual predictors contributed to the Random Forest model’s prediction of COVID-19 vaccine uptake for a representative observation. Starting from the baseline probability (expected model output) of E[f(x)] = 0.105, the combined effects of the predictors increased the predicted probability to f(x) = 0.918, indicating a high likelihood of vaccine uptake.
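A waterfall plot relies on the additivity property of SHAP values: the per-feature contributions sum to the difference between the prediction and the baseline, which for the observation above is 0.918 − 0.105 = 0.813. The sketch below demonstrates this property with exact Shapley values on a toy scoring function. Everything in it is hypothetical: `toy_model` and its coefficients are invented and are not the study's Random Forest, and a single reference row stands in for the expected model output that SHAP estimates from background data.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance, averaging each feature's marginal
    contribution over all feature orderings (feasible only for a few features)."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]  # switch feature i from its baseline to its actual value
            value = model(current)
            phi[i] += value - previous
            previous = value
    return [total / len(orderings) for total in phi]


def toy_model(features):
    """Hypothetical score with one interaction term (illustration only)."""
    age_35_49, south_west, no_radio = features
    return (0.20 + 0.30 * age_35_49 + 0.25 * south_west
            - 0.15 * no_radio + 0.20 * age_35_49 * south_west)


x = [1, 1, 0]         # aged 35-49, South West, listens to the radio
baseline = [0, 0, 1]  # reference row standing in for the expected output
phi = shapley_values(toy_model, x, baseline)
print(toy_model(baseline), phi, toy_model(x))
```

In this toy case the interaction between age and region is split equally between the two features, which is how a feature's contribution in a single waterfall plot can differ from its average effect across the dataset.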
Several features made strong positive contributions to this prediction. Being aged 35–49 years increased the predicted probability of vaccination, reflecting higher uptake among older reproductive-age women. Geographic location also played an important role: residing outside the North West and South East regions, particularly in the South West, contributed positively to vaccine uptake. Marital status was influential, with women who were not “never in union” showing a higher likelihood of vaccination. Similarly, belonging to a wealth category other than the richest quintile (in this specific observation) contributed positively, highlighting the nuanced, non-linear relationships captured by the Random Forest model. Collectively, the contribution of “other features” further reinforced the upward shift in the predicted probability. In contrast, some variables exerted negative effects on the prediction. Lack of radio exposure (“radio not at all”) reduced the likelihood of vaccination, underscoring the role of mass media and information access in shaping vaccine behaviour. Educational characteristics showed mixed effects: while having secondary education reduced the predicted probability in this instance, the absence of “no education” contributed positively, reflecting complex interactions between education levels and other sociodemographic factors within the model (Fig. 5).

Discussion
This study leveraged a comprehensive set of machine learning algorithms to identify key predictors of COVID-19 vaccine uptake among Nigerian women using nationally representative DHS data. By addressing class imbalance through the Synthetic Minority Oversampling Technique (SMOTE), we improved model stability and ensured equitable learning across vaccinated and unvaccinated groups [ 29 , 30 ].
The generation of synthetic observations for vaccinated women ensured equal class representation during training [ 31 , 32 ], which is particularly important in vaccine uptake studies where skewed outcome distributions can compromise both predictive performance and interpretability. From a statistical standpoint, this approach enhances internal validity by allowing algorithms to learn meaningful patterns from both outcome classes rather than overfitting to the dominant group [ 33 ]. Comparative model evaluation across ten supervised learning algorithms demonstrated that model effectiveness varied before and after data balancing. Before the application of SMOTE, Gradient Boosting (GB) and Support Vector Machine (SVM) models achieved the highest accuracy, reflecting their ability to handle non-linear relationships and margin-based classification in imbalanced datasets [ 34 – 36 ]. However, after balancing, Random Forest (RF) emerged as the top-performing model, achieving the highest accuracy and AUC. This improvement is likely attributable to RF’s ensemble structure, which benefits from increased representation of minority-class observations and reduces variance through aggregation of multiple decision trees [ 37 ]. This finding is consistent with existing evidence that ensemble, tree-based methods are particularly well suited for heterogeneous population data and benefit substantially from balanced class distributions. The superior post-SMOTE performance of RF underscores the importance of addressing class imbalance when deploying machine learning models for health behavior prediction. Using a Random Forest model coupled with SHAP-based interpretability, this study provides nuanced insights into the socio-demographic and contextual drivers of COVID-19 vaccine uptake. 
The SHAP waterfall analysis demonstrates how multiple factors interact to shift predicted vaccination probability from a low baseline to a high likelihood, underscoring the value of machine learning approaches in capturing non-linear and context-specific relationships that may be obscured in conventional regression models. Overall, the analysis demonstrates that COVID-19 vaccine uptake among Nigerian women is driven by a combination of sociodemographic, regional, and socioeconomic factors, with media exposure, region of residence, age, marital status, and education exerting particularly strong influences. Age emerged as an important predictor, with women aged 35–49 years contributing positively to vaccine uptake. This finding is consistent with evidence that older women of reproductive age may perceive themselves to be at greater risk of severe COVID-19 outcomes or may have higher engagement with healthcare services due to cumulative reproductive and health system contact over time [ 38 – 41 ]. Increased health literacy and greater autonomy in health decision-making among older women may also facilitate vaccine acceptance. Marital status was another influential predictor, with women who were not “never in union” (in a marital relationship) showing a higher likelihood of vaccination, which is backed by other findings [ 41 – 43 ]. This association may reflect greater social support, spousal influence, or shared decision-making within unions, which can facilitate health service utilization. Married or previously married women may also have more frequent interactions with the health system through maternal and child health services, increasing opportunities for vaccine information and uptake. Media exposure, particularly radio access, emerged as a critical enabling factor for vaccine uptake, which is supported by related studies [ 44 – 46 ]. 
Lack of radio exposure reduced the predicted probability of vaccination, highlighting the importance of mass media in disseminating accurate vaccine information and countering misinformation. In settings where literacy and internet access remain limited, radio continues to be a key channel for public health communication, especially in rural and underserved populations. This finding reinforces the central role of community-level communication strategies in vaccination campaigns. Women in the richest wealth quintile were more likely to be vaccinated, consistent with studies conducted elsewhere [47–50]. This association likely reflects greater access to healthcare services and digital health information among women in the richest wealth quintile, which enhances vaccine uptake. Conversely, lower uptake among poorer women suggests that financial and logistical constraints remain substantial barriers. Targeted interventions, including outreach vaccination services and integration of COVID-19 vaccination into routine maternal health care services, may help mitigate these disparities and promote more equitable vaccine coverage. Geographic variation played a substantial role in shaping vaccine uptake. Residence outside the North West and South East regions, and in the South West in particular, contributed positively to the predicted probability of vaccination. These regional differences likely reflect disparities in health system infrastructure, vaccine availability, and exposure to public health messaging. The South West region of Nigeria is comparatively more urbanized, with stronger health service coverage and higher levels of education and media exposure, factors that have consistently been linked to improved uptake of preventive health interventions. Conversely, lower uptake in the North West region may reflect uneven health system capacity, regional variations in trust toward public institutions, and sociocultural norms influencing health-seeking behavior.
This is supported by studies showing low vaccine uptake in this region [15, 51, 52]. These patterns highlight the importance of improving immunization strategies by decentralizing vaccine outreach and strengthening regional health infrastructure, with special attention to underserved areas.

Policy implications

The findings of this study have important implications for COVID-19 vaccination policy and implementation in Nigeria. The strong influence of age, wealth, region of residence, marital status, and media exposure highlights the need for targeted, equity-oriented vaccination strategies rather than uniform national approaches. Policymakers should prioritize region-specific interventions, particularly in underserved areas such as the North West and South East, by strengthening local health infrastructure and decentralizing vaccine delivery through community-based and mobile outreach services. The enabling role of radio exposure underscores the value of sustained, culturally tailored mass-media campaigns, especially in settings with limited digital access. Socioeconomic disparities in uptake suggest that integrating COVID-19 vaccination into routine maternal and reproductive health services and reducing indirect costs of access could improve coverage among poorer women. Finally, the demonstrated utility of machine learning and explainable models supports the incorporation of data-driven decision tools into immunization planning, enabling policymakers to identify high-risk subpopulations and allocate resources more efficiently to achieve equitable vaccine coverage.

Strengths and limitations

This study has some notable strengths. It utilised nationally representative Demographic and Health Survey (DHS) data, enhancing the generalisability of the findings to women of reproductive age across Nigeria.
In addition, the application of multiple supervised machine learning algorithms allowed for a robust comparison of predictive performance, ensuring that model selection was data-driven rather than assumption-based. Addressing class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) further strengthened the analysis by improving model stability, internal validity, and predictive fairness between vaccinated and unvaccinated groups. Despite these strengths, some limitations should be considered when interpreting the results. The cross-sectional nature of DHS data precludes causal inference, and the observed associations should therefore be interpreted as correlational rather than causal. COVID-19 vaccination status and key explanatory variables were self-reported, which may introduce recall or social desirability bias. Additionally, although DHS data are comprehensive, they lack detailed information on some potentially important determinants of vaccine uptake, such as vaccine supply constraints, individual risk perceptions, misinformation exposure, trust in government or health systems, and community-level influences.

Conclusion

Using nationally representative DHS data and explainable machine learning methods, this study identified key sociodemographic, socioeconomic, and contextual predictors of COVID-19 vaccine uptake among Nigerian women. Vaccine uptake was strongly shaped by age, region of residence, wealth status, marital status, and media exposure, with marked regional and socioeconomic inequalities. Policymakers should adopt targeted, equity-focused vaccination strategies that prioritise underserved regions and socioeconomically disadvantaged women. Integrating COVID-19 vaccination into routine maternal and reproductive health services, expanding community-based and mobile outreach, and strengthening radio-based and culturally tailored communication campaigns are essential to improve coverage.

Declarations

Ethics approval and consent to participate
Ethical approval for the original data collection was obtained from the National Health Research Ethics Committee of Nigeria (NHREC). The present study involved secondary analysis of publicly available, fully de-identified DHS data. Written informed consent was obtained from all participants at the time of the original survey, with parental or guardian consent and assent from minors where applicable. As the data were anonymized, no additional ethical approval or participant consent was required for this secondary analysis. All study procedures were conducted in accordance with the DHS data use agreement, relevant ethical guidelines and regulations, and the principles of the Declaration of Helsinki.

Consent for publication
Not applicable.

Availability of data and materials
The data supporting the findings of this study were obtained from the Demographic and Health Surveys (DHS) Program. Access to the anonymized dataset is available upon reasonable request through the DHS website (https://dhsprogram.com/Countries/Country-Main.cfm?ctry_id=30&c=Nigeria&Country=Nigeria&cn=&r=1), following the same procedure undertaken by the authors.

Competing interests
The author declares no conflict of interest with other individuals or organizations that could inappropriately influence or bias the content of the paper.

Funding
No funding to report.

Author contributions
AHH was involved in the conceptualization, design, literature review, data analysis and interpretation, and manuscript writing and editing. The author read and approved the final manuscript prior to submission.

Acknowledgments
The author thanks the DHS office for granting access to the data on reasonable request.

References

1. Lai, C.-C., et al., Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges. International journal of antimicrobial agents, 2020. 55(3): p. 105924.
2. Acter, T., et al., Evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as coronavirus disease 2019 (COVID-19) pandemic: A global health emergency. Science of the Total Environment, 2020. 730: p. 138996.
3. WHO, WHO COVID-19 dashboard, https://data.who.int/dashboards/covid19/cases?n=c. 2025.
4. Andreadakis, Z., et al., The COVID-19 vaccine development landscape. Nat Rev Drug Discov, 2020. 19(5): p. 305-306.
5. Galanis, P., et al., Intention of healthcare workers to accept COVID-19 vaccination and related factors: A systematic review and meta-analysis. Asian Pacific Journal of Tropical Medicine, 2021. 14(12): p. 543-554.
6. Chang, W.-H., A review of vaccine effects on women in light of the COVID-19 pandemic. Taiwanese Journal of Obstetrics and Gynecology, 2020. 59(6): p. 812-820.
7. Adeyemo, K.S., A.O. Mbata, and O.D. Balogun, Developing a Multidimensional Framework for Vaccine Confidence: Analyzing Socioeconomic, Cultural, and Psychological Determinants of Vaccine Decision-Making. Engineering and Technology Journal, 2025. 10(5): p. 4764-4776.
8. Smith, N., Exploring the Contextual Determinants of Vaccine Acceptability. 2022.
9. de Sousa, Á.F.L., et al., Determinants of COVID-19 vaccine hesitancy in Portuguese-speaking countries: a structural equations modeling approach. Vaccines, 2021. 9(10): p. 1167.
10. Liu, R. and G.M. Li, Hesitancy in the time of coronavirus: Temporal, spatial, and sociodemographic variations in COVID-19 vaccine hesitancy. SSM-population health, 2021. 15: p. 100896.
11. Rodrigues, F., S. Block, and S. Sood, What determines vaccine hesitancy: recommendations from childhood vaccine hesitancy to address COVID-19 vaccine hesitancy. Vaccines, 2022. 10(1): p. 80.
12. Uzochukwu, B.S.C., et al., A health technology assessment of COVID-19 vaccination for Nigerian decision-makers: Identifying stakeholders and pathways to support evidence uptake. Health Research Policy and Systems, 2024. 22(1): p. 73.
13. Nigeria – COVID19 Vaccine Tracker, https://covid19.trackvaccines.org/country/nigeria/. 2022.
14. NAFDAC, National Agency for Food and Drug Administration and Control (NAFDAC), Approved Covid-19 Vaccines, https://nafdac.gov.ng/vaccines-biologicals/covid-19-vaccine-update/. 2023.
15. Bamgboye, E.A., et al., Regional variation in COVID-19 vaccine uptake and intention in Nigeria: A computer assisted telephone survey. PLOS Global Public Health, 2024. 4(11): p. e0002895.
16. Chikezie, I.N., et al., Assessing the determinants of uptake and hesitancy in accessing COVID 19 vaccines in Nigeria: a scoping review. Frontiers in Health Services, 2025. 5: p. 1609418.
17. Sogbesan, A., et al., Exploring COVID-19 Pandemic Perceptions and Vaccine Uptake among Community Members and Primary Healthcare Workers in Nigeria: A Mixed Methods Study. medRxiv, 2024: p. 2024.09.02.24312966.
18. Ojumu, A., et al., Understanding factors influencing the implementation and uptake of less-established adult vaccination programmes: A meta-ethnography of COVID-19 vaccination in Nigeria. Global Public Health, 2025. 20(1): p. 2544183.
19. Eniade, O.D., et al., COVID-19 Vaccine Uptake, Unmet Need and Reported Side Effect in Nigeria: An Online Cross-sectional Study. Asian Journal of Research in Infectious Diseases, 2022. 9(4): p. 10-22.
20. Ojo, T.O., et al., Determinants of COVID-19 vaccine uptake among Nigerians: evidence from a cross-sectional national survey. Archives of Public Health, 2023. 81(1): p. 95.
21. Babatope, T., V. Ilyenkova, and D. Marais, COVID-19 vaccine hesitancy: a systematic review of barriers to the uptake of COVID-19 vaccine among adults in Nigeria. Bulletin of the National Research Centre, 2023. 47(1): p. 45.
22. Goncu Ayhan, S., et al., COVID-19 vaccine acceptance in pregnant women. International Journal of Gynecology & Obstetrics, 2021. 154(2): p. 291-296.
23. Khubchandani, J., et al., COVID-19 vaccination hesitancy in the United States: a rapid national assessment. Journal of community health, 2021. 46(2): p. 270-277.
24. Ward, C., L. Megaw, S. White, and Z. Bradfield, COVID-19 vaccination rates in an antenatal population: a survey of women's perceptions, factors influencing vaccine uptake and potential contributors to vaccine hesitancy. Australian and New Zealand Journal of Obstetrics and Gynaecology, 2022. 62(5): p. 695-700.
25. Barua, S., M.M. Islam, and K. Murase. A novel synthetic minority oversampling technique for imbalanced data set learning. in International conference on neural information processing. 2011. Springer.
26. Chawla, N.V., K.W. Bowyer, L.O. Hall, and W.P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 2002. 16: p. 321-357.
27. Liu, J., Importance-SMOTE: a synthetic minority oversampling method for noisy imbalanced data. Soft Computing, 2022. 26(3): p. 1141-1163.
28. Kovvuri, V., Explainable Artificial Intelligence across Domains: Refinement of SHAP and Practical Applications. 2024.
29. Pradipta, G.A., et al. SMOTE for handling imbalanced data problem: A review. in 2021 sixth international conference on informatics and computing (ICIC). 2021. IEEE.
30. Allal, Z., H.N. Noura, O. Salman, and K. Chahine, Leveraging the power of machine learning and data balancing techniques to evaluate stability in smart grids. Engineering Applications of Artificial Intelligence, 2024. 133: p. 108304.
31. Satriaji, W. and R. Kusumaningrum. Effect of synthetic minority oversampling technique (SMOTE), feature representation, and classification algorithm on imbalanced sentiment analysis. in 2018 2nd International Conference on Informatics and Computational Sciences (ICICoS). 2018. IEEE.
32. Mohammed, A.J., M. Muhammed Hassan, and D. Hussein Kadir, Improving classification performance for a novel imbalanced medical dataset using SMOTE method. International Journal of Advanced Trends in Computer Science and Engineering, 2020. 9(3): p. 3161-3172.
33. Gholampour, S., Impact of nature of Medical Data on Machine and Deep Learning for Imbalanced datasets: clinical validity of SMOTE is questionable. Machine Learning and Knowledge Extraction, 2024. 6(2): p. 827-841.
34. Mohasel, S.M. and H. Koosha, Robust support vector machines for imbalanced and noisy data via benders decomposition. arXiv preprint arXiv:2503.14873, 2025.
35. Ali, M. and Y. Fissha, Propose adjustable Support Vector Machine approach for classifying imbalanced work travel mode choice data. Transportation Research Interdisciplinary Perspectives, 2026. 35: p. 101786.
36. Cahyana, N., S. Khomsah, and A.S. Aribowo. Improving imbalanced dataset classification using oversampling and gradient boosting. in 2019 5th International Conference on Science in Information Technology (ICSITech). 2019. IEEE.
37. Pan Quan, S. Agarwal, N. Nissim, and S.K. Sabut, Random Forest Classifier, https://www.sciencedirect.com/topics/computer-science/random-forest-classifier. 2025.
38. Razzaghi, H., et al., COVID-19 vaccination coverage and intent among women aged 18–49 years by pregnancy status, United States, April–November 2021. Vaccine, 2022. 40(32): p. 4554-4563.
39. Malik, A.A., S.M. McFadden, J. Elharake, and S.B. Omer, Determinants of COVID-19 vaccine acceptance in the US. EClinicalMedicine, 2020. 26.
40. Wiltse, D. and F. Viskupič, Age and partisan self-identification predict uptake of additional COVID-19 booster doses: Evidence from a longitudinal study. Preventive Medicine Reports, 2023. 36: p. 102407.
41. Gilliss, L., et al., Factors associated with COVID-19 vaccine uptake and hesitancy among women of reproductive age in Mozambique. Frontiers in Public Health, 2025. 13: p. 1484477.
42. Liu, H., G.R. Nowak III, J. Wang, and Z. Luo, A national study of marital status differences in early uptake of COVID-19 vaccine among older Americans. Geriatrics, 2023. 8(4): p. 69.
43. Ndlovu, K., S. Ndlangamandla, and B.G. Olutola, Factors influencing Covid-19 vaccine uptake among women in the rural communities of South Africa (SDG 3, target 3.b.1). OIDA International Journal of Sustainable Development, 2024. 17(07): p. 35-46.
44. Jaravaza, D.C., J. Risiro, and P. Mukucha, COVID-19 vaccination national radio advertising credibility assessment by rural consumers: The influence of indigenous medical knowledge systems and traditional beliefs. Cogent Public Health, 2023. 10(1): p. 2178052.
45. Recio-Román, A., M. Recio-Menéndez, and M.V. Román-González, Influence of media information sources on vaccine uptake: the full and inconsistent mediating role of vaccine hesitancy. Computation, 2023. 11(10): p. 208.
46. Amodan, B.O., et al., Knowledge, attitudes and barriers to uptake of COVID-19 vaccine in Uganda, February 2021. BMJ Global Health, 2025. 10(3): p. e016959.
47. Patenaude, B.N., et al., Comparing multivariate with wealth-based inequity in vaccination coverage in 56 countries: Toward a better measure of equity in vaccination coverage. Vaccines, 2023. 11(3): p. 536.
48. Bayati, M., R. Noroozi, M. Ghanbari-Jahromi, and F.S. Jalali, Inequality in the distribution of Covid-19 vaccine: a systematic review. International journal for equity in health, 2022. 21(1): p. 122.
49. Januszek, S.M., et al., The approach of pregnant women to vaccination based on a COVID-19 systematic review. Medicina, 2021. 57(9): p. 977.
50. Syan, S.K., et al., COVID-19 vaccine perceptions and differences by sex, age, and education in 1,367 community adults in Ontario. Frontiers in public health, 2021. 9: p. 719665.
51. Sani, J., et al., Regional Disparities and Maternal Sociodemographic Determinants of Full Immunization Coverage Among Children Aged 12–23 Months in Nigeria: Insights from NDHS 2018. Pediatric health, medicine and therapeutics, 2025: p. 157-170.
52. Ogundele, O.A., et al., Determinants of vaccine hesitancy among pregnant women in South-West Nigeria: an explanatory sequential mixed method design. BMJ open, 2025. 15(10): p. e101767.

Additional Declarations
No competing interests reported.

Reviews received at journal: 08 May, 2026
Reviewers agreed at journal: 29 Apr, 2026
Reviewers invited by journal: 13 Apr, 2026
Editor assigned by journal: 13 Apr, 2026
Editor invited by journal: 13 Apr, 2026
Submission checks completed at journal: 08 Apr, 2026
First submitted to journal: 08 Apr, 2026
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9288102","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":624134653,"identity":"23c4a00a-e347-4513-812c-e8368ff43494","order_by":0,"name":"Aklilu Habte Hailegebireal","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/0lEQVRIiWNgGAWjYDACHuaGAw8YDjAYMDCwf/5TARRhZm4goIWx4UACSAsbAxsDzxmQFkbCWhjgWnjbQEIEtPD3HGw8kNh2R95cvvnYA8l5tdH87UAtPyq24dQicbaxAajlmeHONrZ0A8Ntx3NnHGZsYOw5cxu3NecZQVoOM244xmMgkbjtWG4DUAszYxtuLfJQLfYbjvF/kDg451jufEJaDCAOO5wItIVNsrGhJncDIS2GZw4CA/nc4eSdbWnGxgzHDuRuBGo5iM8vcmeSD3/4UHbYdjvz4YePGWrqcuedP3zwwY8KPN5HA4fB5AGi1QNBHSmKR8EoGAWjYIQAALu0aRGQBuvhAAAAAElFTkSuQmCC","orcid":"","institution":"Auckland University of Technology","correspondingAuthor":true,"prefix":"","firstName":"Aklilu","middleName":"Habte","lastName":"Hailegebireal","suffix":""}],"badges":[],"createdAt":"2026-04-01 07:25:29","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9288102/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9288102/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":107485291,"identity":"c31cf9fc-019e-4e8c-affc-9fcb1ddc6afb","added_by":"auto","created_at":"2026-04-22 02:34:08","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":346685,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of machine learning framework to predict COVID-19 vaccine uptake among Nigerian women, NDHS 
20204\u003c/p\u003e","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/be321606f7611d0ea93eae19.png"},{"id":107310226,"identity":"4923b65d-c836-4676-a91d-0e989b6aeb78","added_by":"auto","created_at":"2026-04-20 09:05:42","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":36920,"visible":true,"origin":"","legend":"\u003cp\u003eSMOTE balancing of COVID-19 vaccination status among women of reproductive age in Nigeria, DHS 2024\u003c/p\u003e","description":"","filename":"Figure2.png","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/77cfb895e38cd4f4cdebac73.png"},{"id":107486996,"identity":"b126b56d-d6a6-4a97-a389-b8a4dc67ec95","added_by":"auto","created_at":"2026-04-22 02:39:31","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":161172,"visible":true,"origin":"","legend":"\u003cp\u003eReceiver Operating Characteristic (ROC) curves for each machine learning model trained on the original, imbalanced dataset.\u003c/p\u003e","description":"","filename":"Figure3.png","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/12a1aa0308e2fc8cd8cac2dd.png"},{"id":107310228,"identity":"2c86d10c-04e1-47a1-974a-3eb1adde2bd5","added_by":"auto","created_at":"2026-04-20 09:05:42","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":156006,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eReceiver Operating Characteristic (ROC) curves for each machine learning model trained on the resampled (balanced) dataset.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"Figure4.png","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/9e9606f34b687ea158ccd574.png"},{"id":107310229,"identity":"ab947af3-9717-48b9-9e10-38860816d9c6","added_by":"auto","created_at":"2026-04-20 09:05:42","extension":"png","order_by":5,"title":"Figure 
5","display":"","copyAsset":false,"role":"figure","size":55589,"visible":true,"origin":"","legend":"\u003cp\u003eSHAP waterfall plot illustrating individual-level feature contributions to the Random Forest prediction of COVID-19 vaccine uptake among Nigerian women (Red bars indicate features that increase the predicted likelihood of vaccine uptake, while blue bars represent features that decrease it).\u003c/p\u003e","description":"","filename":"Figure5.png","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/14a93421e6df8a2e23382a16.png"},{"id":107708712,"identity":"3f362c6a-a082-4726-a7e4-dc7c37b16cf4","added_by":"auto","created_at":"2026-04-24 09:31:42","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1223944,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9288102/v1/9e18d0b6-3788-421c-be09-48c869cce8c7.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Prediction of COVID 19 vaccine uptake among Nigerian women using supervised machine learning based on 2024 Demographic and Health Survey data","fulltext":[{"header":"Background","content":"\u003cp\u003eThe emergence of the COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), marked a pivotal turning point in modern history, triggering widespread disruptions that reshaped nearly every facet of human life across the globe [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. It has resulted in more than 778.8\u0026nbsp;million confirmed cases and 7.1\u0026nbsp;million deaths worldwide, with far-reaching economic and social consequences across nations [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
Amid the global disruption caused by the pandemic, vaccination has proven essential in easing the multifaceted burden by limiting transmission, severe illness, and death [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Women represent a critical demographic in the COVID-19 vaccination effort, requiring targeted education and equitable access\u0026mdash;especially those of reproductive age, who face unique concerns around fertility and menstrual health that influence uptake [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Findings revealed that COVID-19 vaccine hesitancy among women stems from a myriad of factors, such as education, misinformation, and perceived risk, and broader community influences such as access, religion, and local health systems [\u003cspan additionalcitationids=\"CR8 CR9 CR10\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn Nigeria, the national rollout of COVID-19 vaccines began in March 2021, aiming to curb transmission and reduce mortality [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Of the 11 COVID-19 vaccines authorized globally by the World Health Organization, Nigeria has approved seven for national use: Vaxzevria, Covishield, Comirnaty, Jcovden, Spikevax, Sputnik V, and Covilo[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. However, vaccine uptake has varied significantly across regions, socioeconomic groups, and gender lines [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. 
Although the availability, uptake remains uneven due to a complex mix of challenges, which contributed to persistent hesitancy, namely misinformation, limited health literacy, and fear of side effects[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. Geographic disparities, religious beliefs, and logistical constraints further complicate vaccine delivery and acceptance [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eUnderstanding the level and drivers of COVID-19 vaccine uptake is crucial for targeted interventions, yet most studies in Nigeria have centered on the general population, overlooking women of reproductive age\u0026mdash;whose unique health concerns and sociocultural context warrant focused analysis [\u003cspan additionalcitationids=\"CR20\" citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Despite a growing body of research on COVID-19 vaccine uptake among women [\u003cspan additionalcitationids=\"CR23\" citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], most existing studies have been limited in scale and analytical depth, relying primarily on basic statistical models. These approaches often fail to capture the complex, non-linear interactions among individual, social, and structural determinants of vaccine behavior. This study addresses a critical gap by applying machine learning algorithms to a large, representative dataset, enabling more accurate identification of key predictors of vaccine uptake among women in Nigeria. The findings hold significant implications for public health: they offer actionable insights for improving vaccine uptake in the country. 
For researchers, the study demonstrates the value of advanced modeling in epidemiological analysis, while for policymakers, it provides evidence-based guidance to design equitable and targeted vaccination strategies that can enhance coverage and reduce gender disparities in health outcomes.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy setting, data source, and population\u003c/h2\u003e \u003cp\u003eThis study is based on women (IR) data from the 2024 Nigeria Demographic and Health Survey (NDHS), which was conducted between 1 December 2023 and 7 May 2024, encompassing all 36 states of Nigeria and the Federal Capital Territory (FCT). The study population comprised women aged 15\u0026ndash;49 years who were vaccinated for COVID-19. Women who didn\u0026rsquo;t know their vaccination status or had missing data on this variable were excluded.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eSample\u003c/h3\u003e\n\u003cp\u003eSampling procedures followed a stratified two-stage design based on an updated cartographic frame developed for Nigeria\u0026rsquo;s first fully digital Population and Housing Census. In the first stage, 1,400 Enumeration Areas (EAs)\u0026mdash;comprising 701 urban and 699 rural clusters\u0026mdash;were selected using probability proportional to size, stratified across 74 distinct strata defined by urban-rural residence and geopolitical zones. In the second stage, 30 households were systematically selected from each EA, yielding a nationally representative sample of households. Eligible women within these households were interviewed using the standardized Women\u0026rsquo;s Questionnaire, resulting in a final sample of 36,205 respondents. 
Further methodological details are available in the NDHS technical documentation.\u003c/p\u003e\n\u003ch3\u003eOutcome Variable Definition\u003c/h3\u003e\n\u003cp\u003eThe primary outcome of interest was COVID-19 vaccine uptake, operationalized as a binary classification variable, which was determined from responses to a question that asked whether a woman had received a COVID-19 vaccine or not. Women who reported receiving at least one dose were coded as vaccinated (Yes\u0026thinsp;=\u0026thinsp;1), while those who had not received any dose were coded as not vaccinated (No\u0026thinsp;=\u0026thinsp;0).\u003c/p\u003e\n\u003ch3\u003ePredictor Variables (Features)\u003c/h3\u003e\n\u003cp\u003eThe predictor variables used in this study encompassed a broad range of sociodemographic, behavioral, and health system factors known to influence vaccine uptake. These included age of the woman (15\u0026ndash;24, 25\u0026ndash;34, 35\u0026ndash;49), marital status(unmarried, in a marital relationship, and others), educational attainment (no education, primary, secondary, higher), occupation (working, not working), residence (urban, rura), sex of the household head (male vs female), and household wealth index (poorest, poorer, middle, richer, ricest). Reproductive and media exposure variables such as parity (nulliparous, primiparous, multiparous, grandmultiparous), frequency of listening to the radio, watching television, and reading newspapers (not at all, less than once a week, and at least once a week) were also incorporated. Perceived barriers to maternity care, such as distance to the nearest health facility, permission to seek care, and financial constraints, were assessed and categorized as a big problem or not a big problem. Additional variables included health insurance coverage (yes vs. no), current use of contraception (users vs. non-users), and recent visits to a health facility (yes vs. 
no).\u003c/p\u003e\n\u003ch3\u003eData management and analysis\u003c/h3\u003e\n\u003cp\u003eA structured machine learning (ML) approach, including data preprocessing, feature engineering, model training, and performance evaluation, was used. All analyses were carried out in Python (version 3.13.6, in Visual Studio Code). To ensure representativeness and account for the complex sampling design, analyses were weighted with sampling weights (v005/1,000,000). All potentially relevant variables were extracted using STATA version 18 and exported as CSV files for further analysis. To improve model performance and reduce bias, the dataset was cleaned, encoded, and partitioned. Both descriptive and predictive analyses were carried out, with ML algorithms chosen based on their suitability for classification tasks and interpretability in public health settings.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eData preprocessing and analysis\u003c/h2\u003e \u003cp\u003eAll subsequent preprocessing and modeling steps were conducted in Python (version 3.13.6, in Visual Studio Code). Initial data preparation involved exploratory data analysis to understand variable distributions and relationships. This was followed by handling of missing values, assessment of multicollinearity, discretization of relevant features, detection and treatment of outliers, and balancing of the target variable to address class imbalance. Feature engineering and transformation were applied to enhance model interpretability and performance. All categorical variables were converted to numerical values using one-hot encoding to ensure compatibility with machine learning algorithms. To address high dimensionality and reduce model complexity, dimensionality reduction techniques were applied following feature engineering. Highly correlated predictors were identified and removed to minimise redundancy. 
Feature selection was performed to retain only the most informative predictors based on the literature and chi-square tests, ensuring that the final model was both parsimonious and robust. The dataset was then partitioned into training (80%) and testing (20%) subsets, with all modeling restricted to features that passed the selection criteria. To address the potential class imbalance between vaccinated and unvaccinated groups, we applied the Synthetic Minority Over-sampling Technique (SMOTE) to the training dataset. This method generates synthetic samples for the minority class by interpolating between existing observations, thereby equalizing the number of vaccinated and non-vaccinated cases without duplicating data [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. The balanced training dataset was subsequently used to train all classification models, ensuring that each algorithm had equitable exposure to both outcome classes. This approach improves model sensitivity and reduces bias toward the majority class, enhancing the reliability of predictive performance across all metrics [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eTo predict COVID-19 vaccine uptake, this study employed a diverse set of nine supervised machine learning algorithms: Logistic Regression (LR), Decision Tree Classifier (DT), Random Forest Classifier (RF), Gradient Boosting Machines (GB), Extreme Gradient Boosting (XGBoost), CatBoost, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Artificial Neural Networks (ANNs). These algorithms were selected to capture a wide spectrum of modeling approaches, ranging from interpretable linear models and tree-based ensembles to kernel-based classifiers and deep learning architectures. 
This diversity allowed for a comprehensive comparison of predictive performance across models with varying strengths in handling nonlinearity, feature interactions, and class imbalance. Hyperparameter tuning was conducted using grid search with cross-validation to optimize each model\u0026rsquo;s performance while minimizing overfitting (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eModel performance evaluation\u003c/h3\u003e\n\u003cp\u003eModel performance was evaluated using an 80:20 train-test split, ensuring that each algorithm was assessed on unseen data. Standardized metrics, including accuracy, sensitivity (recall), precision, Area Under the Receiver Operating Characteristic Curve (AUC-ROC), F1-score, specificity, and negative predictive value (NPV), were computed to capture both overall and class-specific predictive power.\u003c/p\u003e \u003cp\u003eTo enhance the reliability of these estimates, bootstrapped 95% confidence intervals were generated for each metric from 1,000 resampled iterations. Confusion matrices and classification reports were also produced to visualize prediction errors and class-wise performance. 
This multi-model evaluation strategy allowed predictive performance to be compared across diverse learning algorithms.\u003c/p\u003e \u003cp\u003eSensitivity, also known as Recall or the True Positive Rate, measures the proportion of vaccinated women that the model correctly identifies. Here and in the formulas that follow, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, with vaccinated women as the positive class.\u003cdiv id=\"Equa\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equa\" name=\"EquationSource\"\u003e\n$$\\text{Sensitivity}=\\frac{TP}{TP+FN}\\times 100$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eSpecificity (True Negative Rate) measures the proportion of unvaccinated women that the model correctly identifies, that is, its ability to avoid false positives.\u003cdiv id=\"Equb\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equb\" name=\"EquationSource\"\u003e\n$$\\text{Specificity}=\\frac{TN}{TN+FP}\\times 100$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eAccuracy is the proportion of correctly classified women (both vaccinated and not vaccinated) and measures the overall correctness of the model\u0026rsquo;s predictions.\u003cdiv id=\"Equc\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equc\" name=\"EquationSource\"\u003e\n$$\\text{Accuracy}=\\frac{TP+TN}{TP+TN+FP+FN}\\times 100$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003ePrecision (Positive Predictive Value) is the proportion of women predicted as vaccinated who are actually vaccinated and measures the reliability of positive predictions.\u003cdiv id=\"Equd\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equd\" name=\"EquationSource\"\u003e\n$$\\text{Precision}=\\frac{TP}{TP+FP}\\times 100$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003e 
F1-score is the harmonic mean of precision and sensitivity, balancing false positives and false negatives. Because precision and sensitivity are already expressed as percentages, the result is also a percentage.\u003cdiv id=\"Eque\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Eque\" name=\"EquationSource\"\u003e\n$$\\text{F1-score}=\\frac{2\\times\\text{Precision}\\times\\text{Sensitivity}}{\\text{Precision}+\\text{Sensitivity}}$$\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eAUC-ROC is a performance metric that evaluates a model\u0026rsquo;s ability to distinguish between women who received the COVID-19 vaccine and those who did not. The ROC curve plots the True Positive Rate (Sensitivity) against the False Positive Rate (1\u0026thinsp;\u0026minus;\u0026thinsp;Specificity) across classification thresholds. The AUC (Area Under the Curve) summarizes this plot into a single value ranging from 0 to 1, where 0.5 corresponds to chance-level discrimination. Higher AUC values indicate better discrimination between vaccinated and unvaccinated women.\u003c/p\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eFeature Importance Analysis\u003c/h2\u003e \u003cp\u003eTo identify the most influential predictors of COVID-19 vaccine uptake, we conducted a feature importance analysis across all trained machine learning models. Statistical association was first assessed using chi-square tests of the relationship between each categorical predictor and vaccine uptake, identifying variables with significant univariate associations for initial screening. In addition, permutation importance with bootstrapped confidence intervals was applied to quantify the change in model performance when each feature was randomly shuffled, offering a model-agnostic measure of variable relevance. To enhance interpretability, SHAP (SHapley Additive exPlanations) was employed across all classifiers to quantify and visualize the contribution of each feature to individual predictions. 
SHAP is based on cooperative game theory and fairly attributes the model\u0026rsquo;s output to each feature, providing a unified and theoretically grounded way to explain individual predictions [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Beeswarm plots were used to display the distribution of SHAP values across the dataset, highlighting both the magnitude and direction of each feature\u0026rsquo;s influence. Waterfall plots were generated to explain the cumulative impact of predictors on the probability of vaccine uptake for specific individuals, illustrating how each variable pushed the prediction toward or away from the positive class. These visualizations showed how individual predictors shaped predictions across algorithms, supporting the transparency and policy relevance of the study.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eSociodemographic characteristics of respondents\u003c/h2\u003e \u003cp\u003eA total of 36,205 women were included in this study; the largest share were from the Northwest region (32.2%), followed by Northcentral (18.01%). The largest groups were women aged 15\u0026ndash;24 (37.72%) and women with secondary education (41.54%). More than one-third (36.39%) of women were multiparous, 18.0% reported using contraceptives, and just 8.38% were currently pregnant. Regarding vaccination status, the highest proportion of vaccinated women was seen in the Southwest region (45.71%), followed by the Northeast (30.63%) and Northcentral (28.6%) regions.\u003c/p\u003e \u003cp\u003eA relatively higher proportion of vaccinated women was observed among those with higher education (40.93%), those who were employed (32.08%), those living in urban areas (31.13%), and those in the richest wealth quintile (32.31%). 
Similarly, a higher proportion was reported among multiparous women (30.95%), contraceptive users (36.83%), women who listened to the radio at least once a week (35.18%), and women enrolled in a health insurance scheme (50.98%) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Overall, COVID-19 vaccine uptake differed significantly (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) across region, age, level of education, residence, wealth index, parity, media exposure (radio, television, newspaper), and perceived barriers to maternity care (money, distance, permission) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDistribution of sociodemographic and health service-related characteristics of respondents, Nigeria Demographic and Health Survey, 2024\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eVariable categories\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTotal (N\u0026thinsp;=\u0026thinsp;36,205)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eVaccinated [n(%)]\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c4\"\u003e \u003cp\u003eTest statistics\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRegion\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNorthwest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11,658(32.20)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,477(21.24)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"5\" rowspan=\"6\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;1433.31\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNortheast\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5,502(15.20)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,686(30.63)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNorthcentral\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6,522(18.01)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,865(28.6)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSoutheast\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3,052(8.43)\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e559(18.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSouth-south\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4,229(11.68)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,007(23.81)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSouthwest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5,242(14.48)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,396(45.71)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAge\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e15\u0026ndash;24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13,658(37.72)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,009(22.03)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;453.15\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e25\u0026ndash;34\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e 
\u003cp\u003e11,174(30.86)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,202(28.66)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e35\u0026ndash;49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11,374(31.41)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,779(33.23)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMarital status\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnmarried\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e10,025(27.69)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,430(24.24)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;131.51\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIn a marital relationship\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e24,393(67.38)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6,957(28.52)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOthers\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c2\"\u003e \u003cp\u003e1,786(4.93)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e602.8(33.75)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eEducational level\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo education\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11,978(33.08)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,371(19.80)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;825.71\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePrimary\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4,063(11.22)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,176(28.95)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSecondary\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e15,041(41.54)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,345(28.89)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHigher\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5,123(14.15)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,097(40.93)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eWomen\u0026rsquo;s occupation\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;664.36\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot working\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13,600(37.56)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,738(20.13)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWorking\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e22,605(62.44)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7,253(32.08)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eResidence\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;201.03\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUrban\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e17,718(48.94)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5,516(31.13)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRural\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e18,487(51.06)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,474(24.20)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eWealth index\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"5\" rowspan=\"6\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;472.95\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePoorest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5,943(16.41)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,009(17.00)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePoorer\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6,683(18.46)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,571(23.5)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e 
\u003cp\u003eMiddle\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7,259(20.05)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,176(29.98)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRicher\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7,980(22.04)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,540(31.83)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRichest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e8,341(23.04)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,694(32.31)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eFamily size\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;24.82\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;5 members\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e12,368(34.16)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,578(28.93)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u0026gt;=5 members\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e23,837(65.84)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6,412(26.90)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHead of household\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;51.16\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e29,323(81.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7,854(26.78)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFemale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6,882(19.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,136(31.0)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eParity\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"4\" rowspan=\"5\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;177.06\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNulliparous\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e10,976(30.32)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,614(23.81)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePrimiparous\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4,420(12.21)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,160(26.25)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMultiparous\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e13,175(36.39)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,077(30.95)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGrand multiparous\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e7,634(21.08)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,139(28.02)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eCurrently pregnant\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;8.14\u003c/p\u003e \u003cp\u003ep\u0026thinsp;=\u0026thinsp;0.003\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3,032(8.38)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e726(23.94)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e33,173(91.62)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9,264(27.93)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eContraceptive uptake\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;319.26\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNon-users\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e29,702(82.04)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7,595(25.57)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUsers\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e6,503(18.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,395(36.83)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePerceived 
distance\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;78.57\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBig problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e8,901(24.58)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,161(24.28)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot a big problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e27,304(75.42)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e7,829(28.67)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eGetting permission\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;115.29\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBig problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3,994(11.03)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e843(21.12)\u003c/p\u003e 
\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot a big problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e32,211(88.97)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9,147(28.4)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eGetting money\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;84.83\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBig problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e17,158(47.39)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,411(25.71)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot a big problem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e19,047(52.61)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e5,579(29.29)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eWatching TV\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"3\" rowspan=\"4\"\u003e 
\u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;281.60\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot at all\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e19,082(52.70)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,499(23.58)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u0026lt; once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5,938(16.40)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1,894(31.90)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAt least once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e11,186(30.90)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,597(32.16)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eListening to the radio\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;527.29\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot at all\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e18,731(51.74)\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e4,180(22.31)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u0026lt; once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e8,141(22.49)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2,527(31.04)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAt least once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e9,333(25.78)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3,284(35.18)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eReading newspaper\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;100.12\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNot at all\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e32,602(90.05)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e8,746(26.83)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u0026lt; once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e2,319(6.41)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c3\"\u003e \u003cp\u003e793(34.18)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAt least once a week\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1,284(3.55)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e451(35.12)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHealth insurance enrolment\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1,250(3.45)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e637(50.98)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eχ2\u0026thinsp;=\u0026thinsp;253.72\u003c/p\u003e \u003cp\u003ep\u0026thinsp;\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e34,955(96.55)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e9,353(26.76)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eWeighted prevalence of COVID-19 vaccine uptake\u003c/h2\u003e 
\u003cp\u003eJust over one-fourth of women (27.6%; 95% CI: 27.14, 28.06) were vaccinated against COVID-19. Of those vaccinated, most received either one dose (51.18%) or two doses (40.83%) of the vaccine.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eResults for data preprocessing\u003c/h2\u003e \u003cp\u003eTo meet the input requirements of machine learning classification models, all categorical variables were transformed into numerical representations using one-hot encoding. This process converted each category into a separate binary indicator (0/1), allowing the models to capture non-ordinal relationships without imposing artificial ordering. The outcome classes were markedly imbalanced: 26,698 women were unvaccinated (the majority class) and 9,463 were vaccinated (the minority class). To address this imbalance and improve model reliability, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training data. SMOTE generated 17,235 synthetic samples for the vaccinated group, resulting in a balanced distribution of 26,698 observations in each class (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eModel building and performance comparison\u003c/h2\u003e \u003cp\u003eWe trained ten supervised machine learning algorithms to predict COVID-19 vaccine uptake: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), Gradient Boosting, Extreme Gradient Boosting (XGB), CatBoost, Support Vector Machine (SVM), Gaussian Naive Bayes (GNB), and Artificial Neural Network (ANN). Each model\u0026rsquo;s performance was evaluated and compared using accuracy and AUC before and after data balancing (SMOTE) to identify the most effective predictive algorithm. 
Prior to data balancing, Gradient Boosting (with 75.1% accuracy) and Support Vector Machine (SVM) (with 74.9% accuracy) achieved the highest predictive performance. After applying the SMOTE oversampling technique to balance the training data, Random Forest emerged as the top-performing model, achieving an accuracy of 75.1% (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003ePerformance comparison of machine learning algorithms for predicting COVID-19 vaccine uptake using unbalanced and balanced training datasets, Nigeria, 2024\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMachine learning models\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCore metrics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eUnbalanced training data (before SMOTE)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eBalanced training data (after SMOTE)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eLogistic 
regression\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e74.7%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e66.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e72.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eDecision tree\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e72.6%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e56.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e75.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eRandom forest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e71.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e75.1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e63.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e82.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eKNN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e73.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e71.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e62.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e77.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eGradient boosting\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e75.1%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e66.6%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e69.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e72.4%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e 
\u003cp\u003eExtreme gradient boosting (XGB)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e74.4%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e70.4%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e77.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eCatBoost\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e74.8%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e70.4%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e68.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e77.1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eSupport vector machine (SVM)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e74.9%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e69.0%\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e63.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e75.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eGaussian Naive Bayes (GNB)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e67.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e60.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e64.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e64.6%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eArtificial neural network (ANN)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e73.2%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e68.9%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e66.0%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e75.3%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eReceiver Operating Characteristic (ROC) curves were generated for each machine learning model to evaluate classification performance on the training dataset. The curves were plotted both before and after applying balancing techniques (SMOTE), allowing for visual comparison of model discrimination under imbalanced and resampled conditions (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e and Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eImportant feature selection and model interpretability using RF\u003c/h2\u003e \u003cp\u003eThe probability of COVID-19 vaccine uptake was estimated using the random forest (RF), the ML algorithm with the highest accuracy and AUC in the balanced data. As RF is a tree-based model, we used SHapley Additive exPlanations (SHAP) Tree Explainer to compute global and local feature importance. The SHAP waterfall plot illustrates how individual predictors contributed to the Random Forest model\u0026rsquo;s prediction of COVID-19 vaccine uptake for a representative observation. Starting from the baseline probability (expected model output) of E[f(x)]\u0026thinsp;=\u0026thinsp;0.105, the combined effects of the predictors increased the predicted probability to f(x)\u0026thinsp;=\u0026thinsp;0.918, indicating a high likelihood of vaccine uptake. Several features made strong positive contributions to this prediction. Being aged 35\u0026ndash;49 years increased the predicted probability of vaccination, reflecting higher uptake among older reproductive-age women. 
Geographic location also played an important role: residing outside the North West and South East regions, and in the South West in particular, contributed positively to vaccine uptake. Marital status was influential, with women who were not \u0026ldquo;never in union\u0026rdquo; showing a higher likelihood of vaccination. Similarly, belonging to a wealth category other than the richest quintile (in this specific observation) contributed positively, highlighting the nuanced, non-linear relationships captured by the Random Forest model. Collectively, the contribution of \u0026ldquo;other features\u0026rdquo; further reinforced the upward shift in the predicted probability. In contrast, some variables exerted negative effects on the prediction. Lack of radio exposure (\u0026ldquo;radio not at all\u0026rdquo;) reduced the likelihood of vaccination, underscoring the role of mass media and information access in shaping vaccine behaviour. Educational characteristics showed mixed effects: while having secondary education reduced the predicted probability in this instance, the absence of \u0026ldquo;no education\u0026rdquo; contributed positively, reflecting complex interactions between education levels and other sociodemographic factors within the model (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThis study applied a comprehensive set of supervised machine learning algorithms to identify key predictors of COVID-19 vaccine uptake among Nigerian women using nationally representative DHS data. 
By addressing class imbalance through the Synthetic Minority Oversampling Technique (SMOTE), we improved model stability and ensured equitable learning across vaccinated and unvaccinated groups [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. The generation of synthetic observations for vaccinated women ensured equal class representation during training [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e], which is particularly important in vaccine uptake studies where skewed outcome distributions can compromise both predictive performance and interpretability. From a statistical standpoint, this approach enhances internal validity by allowing algorithms to learn meaningful patterns from both outcome classes rather than overfitting to the dominant group [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eComparative model evaluation across ten supervised learning algorithms demonstrated that model effectiveness varied before and after data balancing. Before the application of SMOTE, Gradient Boosting (GB) and Support Vector Machine (SVM) models achieved the highest accuracy, reflecting their ability to handle non-linear relationships and margin-based classification in imbalanced datasets [\u003cspan additionalcitationids=\"CR35\" citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. However, after balancing, Random Forest (RF) emerged as the top-performing model, achieving the highest accuracy and AUC. 
This improvement is likely attributable to RF\u0026rsquo;s ensemble structure, which benefits from increased representation of minority-class observations and reduces variance through aggregation of multiple decision trees [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. This finding is consistent with existing evidence that ensemble, tree-based methods are particularly well suited for heterogeneous population data and benefit substantially from balanced class distributions. The superior post-SMOTE performance of RF underscores the importance of addressing class imbalance when deploying machine learning models for health behavior prediction.\u003c/p\u003e \u003cp\u003eUsing a Random Forest model coupled with SHAP-based interpretability, this study provides nuanced insights into the socio-demographic and contextual drivers of COVID-19 vaccine uptake. The SHAP waterfall analysis demonstrates how multiple factors interact to shift predicted vaccination probability from a low baseline to a high likelihood, underscoring the value of machine learning approaches in capturing non-linear and context-specific relationships that may be obscured in conventional regression models. Overall, the analysis demonstrates that COVID-19 vaccine uptake among Nigerian women is driven by a combination of sociodemographic, regional, and socioeconomic factors, with media exposure, region of residence, age, marital status, and education exerting particularly strong influences.\u003c/p\u003e \u003cp\u003eAge emerged as an important predictor, with women aged 35\u0026ndash;49 years contributing positively to vaccine uptake. 
This finding is consistent with evidence that older women of reproductive age may perceive themselves to be at greater risk of severe COVID-19 outcomes or may have higher engagement with healthcare services due to cumulative reproductive and health system contact over time [\u003cspan additionalcitationids=\"CR39 CR40\" citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. Increased health literacy and greater autonomy in health decision-making among older women may also facilitate vaccine acceptance.\u003c/p\u003e \u003cp\u003eMarital status was another influential predictor: women who were not \u0026ldquo;never in union\u0026rdquo; (that is, currently or formerly in a union) showed a higher likelihood of vaccination, consistent with previous findings [\u003cspan additionalcitationids=\"CR42\" citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. This association may reflect greater social support, spousal influence, or shared decision-making within unions, which can facilitate health service utilization. Married or previously married women may also have more frequent interactions with the health system through maternal and child health services, increasing opportunities for vaccine information and uptake.\u003c/p\u003e \u003cp\u003eMedia exposure, particularly radio access, emerged as a critical enabling factor for vaccine uptake, which is supported by related studies [\u003cspan additionalcitationids=\"CR45\" citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. Lack of radio exposure reduced the predicted probability of vaccination, highlighting the importance of mass media in disseminating accurate vaccine information and countering misinformation. 
In settings where literacy and internet access remain limited, radio continues to be a key channel for public health communication, especially in rural and underserved populations. This finding reinforces the central role of community-level communication strategies in vaccination campaigns.\u003c/p\u003e \u003cp\u003eWomen in the richest wealth quintile were more likely to be vaccinated, which is backed by studies conducted elsewhere [\u003cspan additionalcitationids=\"CR48 CR49\" citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. This association likely reflects greater access to healthcare services and digital health information among women in the richest wealth quintile, which enhances vaccine uptake. Conversely, lower uptake among poorer women suggests that financial and logistical constraints remain substantial barriers. Targeted interventions, including outreach vaccination services and integration of COVID-19 vaccination into routine maternal health care services, may help mitigate these disparities and promote more equitable vaccine coverage.\u003c/p\u003e \u003cp\u003eGeographic variation played a substantial role in shaping vaccine uptake. Residing outside the North West and South East regions, and in the South West in particular, contributed positively to vaccination probability. These regional differences likely reflect disparities in health system infrastructure, vaccine availability, and exposure to public health messaging. The South West region of Nigeria is comparatively more urbanized, with stronger health service coverage and higher levels of education and media exposure, factors that have consistently been linked to improved uptake of preventive health interventions. 
Conversely, lower uptake in the North West region might be due to uneven health system capacity, regional variations in trust toward public institutions, and sociocultural norms influencing health-seeking behavior. This is consistent with studies showing low vaccine uptake in this region [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e]. These patterns highlight the importance of strengthening immunization strategies through decentralizing vaccine outreach efforts and strengthening regional health infrastructure, with special attention to underserved areas.\u003c/p\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003ePolicy implications\u003c/h2\u003e \u003cp\u003eThe findings of this study have important implications for COVID-19 vaccination policy and implementation in Nigeria. The strong influence of age, wealth, region of residence, marital status, and media exposure highlights the need for targeted, equity-oriented vaccination strategies rather than uniform national approaches. Policymakers should prioritize region-specific interventions, particularly in underserved areas such as the North West and South East, by strengthening local health infrastructure and decentralizing vaccine delivery through community-based and mobile outreach services. The protective role of radio exposure underscores the value of sustained, culturally tailored mass-media campaigns, especially in settings with limited digital access. Socioeconomic disparities in uptake suggest that integrating COVID-19 vaccination into routine maternal and reproductive health services and reducing indirect costs of access could improve coverage among poorer women. 
Finally, the demonstrated utility of machine learning and explainable models supports the incorporation of data-driven decision tools into immunization planning, enabling policymakers to identify high-risk subpopulations and allocate resources more efficiently to achieve equitable vaccine coverage.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003eStrengths and limitations\u003c/h2\u003e \u003cp\u003eThis study has some notable strengths. It utilised nationally representative Demographic and Health Survey (DHS) data, enhancing the generalisability of the findings to women of reproductive age across Nigeria. In addition, the application of multiple supervised machine learning algorithms allowed for a robust comparison of predictive performance, ensuring that model selection was data-driven rather than assumption-based. Addressing class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) further strengthened the analysis by improving model stability, internal validity, and predictive fairness between vaccinated and unvaccinated groups. Despite these strengths, some limitations should be considered when interpreting the results. The cross-sectional nature of DHS data precludes causal inference, and the observed associations should therefore be interpreted as correlational rather than causal. COVID-19 vaccination status and key explanatory variables were self-reported, which may introduce recall or social desirability bias. 
Additionally, although DHS data are comprehensive, they lack detailed information on some potentially important determinants of vaccine uptake, such as vaccine supply constraints, individual risk perceptions, misinformation exposure, trust in government or health systems, and community-level influences.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusion","content":"\u003cp\u003eUsing nationally representative DHS data and explainable machine learning methods, this study identified key sociodemographic, socioeconomic, and contextual predictors of COVID-19 vaccine uptake among Nigerian women. Vaccine uptake was strongly shaped by age, region of residence, wealth status, marital status, and media exposure, with marked regional and socioeconomic inequalities. Policymakers should adopt targeted, equity-focused vaccination strategies that prioritise underserved regions and socioeconomically disadvantaged women. Integrating COVID-19 vaccination into routine maternal and reproductive health services, expanding community-based and mobile outreach, and strengthening radio-based and culturally tailored communication campaigns are essential to improve coverage.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eEthical approval for the original data collection was obtained from the National Health Research Ethics Committee of Nigeria (NHREC). The present study involved secondary analysis of publicly available, fully de-identified DHS data. Written informed consent was obtained from all participants at the time of the original survey, with parental or guardian consent and assent from minors where applicable. As the data were anonymized, no additional ethical approval or participant consent was required for this secondary analysis. 
All study procedures were conducted in accordance with the DHS data use agreement, relevant ethical guidelines and regulations, and the principles of the Declaration of Helsinki.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data supporting the findings of this study were obtained from the Demographic and Health Surveys (DHS) Program. Access to the anonymized dataset is available upon reasonable request through the DHS website (https://dhsprogram.com/Countries/Country-Main.cfm?ctry_id=30\u0026amp;c=Nigeria\u0026amp;Country=Nigeria\u0026amp;cn=\u0026amp;r=1), following the same procedure undertaken by the authors.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eI would like to declare that there is no conflict of interest with other individuals or organizations that could influence or bias the content of the paper inappropriately.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNo funding to report\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAHH was involved in the conceptualization, design, literature review, data analysis and interpretation, and manuscript writing and editing. 
The author read and approved the final manuscript prior to submission.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eI would like to acknowledge the DHS office for letting me access the data based on a reasonable request.\u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eLai, C.-C., et al., \u003cem\u003eSevere acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges.\u003c/em\u003e International journal of antimicrobial agents, 2020. \u003cstrong\u003e55\u003c/strong\u003e(3): p. 105924.\u003c/li\u003e\n\u003cli\u003eActer, T., et al., \u003cem\u003eEvolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as coronavirus disease 2019 (COVID-19) pandemic: A global health emergency.\u003c/em\u003e Science of the Total Environment, 2020. \u003cstrong\u003e730\u003c/strong\u003e: p. 138996.\u003c/li\u003e\n\u003cli\u003eWHO, \u003cem\u003eWHO COVID-19 dashboard, \u003c/em\u003e\u003cem\u003ehttps://data.who.int/dashboards/covid19/cases?n=c\u003c/em\u003e\u003cem\u003e.\u003c/em\u003e 2025.\u003c/li\u003e\n\u003cli\u003eAndreadakis, Z., et al., \u003cem\u003eThe COVID-19 vaccine development landscape.\u003c/em\u003e Nat Rev Drug Discov, 2020. \u003cstrong\u003e19\u003c/strong\u003e(5): p. 305-306.\u003c/li\u003e\n\u003cli\u003eGalanis, P., et al., \u003cem\u003eIntention of healthcare workers to accept COVID-19 vaccination and related factors: A systematic review and meta-analysis.\u003c/em\u003e Asian Pacific Journal of Tropical Medicine, 2021. \u003cstrong\u003e14\u003c/strong\u003e(12): p. 543-554.\u003c/li\u003e\n\u003cli\u003eChang, W.-H., \u003cem\u003eA review of vaccine effects on women in light of the COVID-19 pandemic.\u003c/em\u003e Taiwanese Journal of Obstetrics and Gynecology, 2020. \u003cstrong\u003e59\u003c/strong\u003e(6): p. 
812-820.\u003c/li\u003e\n\u003cli\u003eAdeyemo, K.S., A.O. Mbata, and O.D. Balogun, \u003cem\u003eDeveloping a Multidimensional Framework for Vaccine Confidence: Analyzing Socioeconomic, Cultural, and Psychological Determinants of Vaccine Decision-Making.\u003c/em\u003e Engineering and Technology Journal, 2025. \u003cstrong\u003e10\u003c/strong\u003e(5): p. 4764-4776.\u003c/li\u003e\n\u003cli\u003eSmith, N., \u003cem\u003eExploring the Contextual Determinants of Vaccine Acceptability\u003c/em\u003e. 2022.\u003c/li\u003e\n\u003cli\u003ede Sousa, \u0026Aacute;.F.L., et al., \u003cem\u003eDeterminants of COVID-19 vaccine hesitancy in Portuguese-speaking countries: a structural equations modeling approach.\u003c/em\u003e Vaccines, 2021. \u003cstrong\u003e9\u003c/strong\u003e(10): p. 1167.\u003c/li\u003e\n\u003cli\u003eLiu, R. and G.M. Li, \u003cem\u003eHesitancy in the time of coronavirus: Temporal, spatial, and sociodemographic variations in COVID-19 vaccine hesitancy.\u003c/em\u003e SSM-population health, 2021. \u003cstrong\u003e15\u003c/strong\u003e: p. 100896.\u003c/li\u003e\n\u003cli\u003eRodrigues, F., S. Block, and S. Sood, \u003cem\u003eWhat determines vaccine hesitancy: recommendations from childhood vaccine hesitancy to address COVID-19 vaccine hesitancy.\u003c/em\u003e Vaccines, 2022. \u003cstrong\u003e10\u003c/strong\u003e(1): p. 80.\u003c/li\u003e\n\u003cli\u003eUzochukwu, B.S.C., et al., \u003cem\u003eA health technology assessment of COVID-19 vaccination for Nigerian decision-makers: Identifying stakeholders and pathways to support evidence uptake.\u003c/em\u003e Health Research Policy and Systems, 2024. \u003cstrong\u003e22\u003c/strong\u003e(1): p. 
73.\u003c/li\u003e\n\u003cli\u003e\u003cem\u003eNigeria \u0026ndash; COVID19 Vaccine Tracker,2022 \u003c/em\u003e\u003cem\u003ehttps://covid19.trackvaccines.org/country/nigeria/\u003c/em\u003e\u003cem\u003e.\u003c/em\u003e\u003c/li\u003e\n\u003cli\u003eNAFDAC, \u003cem\u003eNational Agency for Food and Drug Administration and Control (NAFDAC),Approved Covid-19 Vaccines \u003c/em\u003e\u003cem\u003ehttps://nafdac.gov.ng/vaccines-biologicals/covid-19-vaccine-update/\u003c/em\u003e\u003cem\u003e.\u003c/em\u003e 2023.\u003c/li\u003e\n\u003cli\u003eBamgboye, E.A., et al., \u003cem\u003eRegional variation in COVID-19 vaccine uptake and intention in Nigeria: A computer assisted telephone survey.\u003c/em\u003e PLOS Global Public Health, 2024. \u003cstrong\u003e4\u003c/strong\u003e(11): p. e0002895.\u003c/li\u003e\n\u003cli\u003eChikezie, I.N., et al., \u003cem\u003eAssessing the determinants of uptake and hesitancy in accessing COVID 19 vaccines in Nigeria: a scoping review.\u003c/em\u003e Frontiers in Health Services, 2025. \u003cstrong\u003e5\u003c/strong\u003e: p. 1609418.\u003c/li\u003e\n\u003cli\u003eSogbesan, A., et al., \u003cem\u003eExploring COVID-19 Pandemic Perceptions and Vaccine Uptake among Community Members and Primary Healthcare Workers in Nigeria: A Mixed Methods Study.\u003c/em\u003e medRxiv, 2024: p. 2024.09. 02.24312966.\u003c/li\u003e\n\u003cli\u003eOjumu, A., et al., \u003cem\u003eUnderstanding factors influencing the implementation and uptake of less-established adult vaccination programmes: A meta-ethnography of COVID-19 vaccination in Nigeria.\u003c/em\u003e Global Public Health, 2025. \u003cstrong\u003e20\u003c/strong\u003e(1): p. 2544183.\u003c/li\u003e\n\u003cli\u003eEniade, O.D., et al., \u003cem\u003eCOVID-19 Vaccine Uptake, Unmet Need and Reported Side Effect in Nigeria: An Online Cross-sectional Study.\u003c/em\u003e Asian Journal of Research in Infectious Diseases, 2022. \u003cstrong\u003e9\u003c/strong\u003e(4): p. 
10-22.\u003c/li\u003e\n\u003cli\u003eOjo, T.O., et al., \u003cem\u003eDeterminants of COVID-19 vaccine uptake among Nigerians: evidence from a cross-sectional national survey.\u003c/em\u003e Archives of Public Health, 2023. \u003cstrong\u003e81\u003c/strong\u003e(1): p. 95.\u003c/li\u003e\n\u003cli\u003eBabatope, T., V. Ilyenkova, and D. Marais, \u003cem\u003eCOVID-19 vaccine hesitancy: a systematic review of barriers to the uptake of COVID-19 vaccine among adults in Nigeria.\u003c/em\u003e Bulletin of the National Research Centre, 2023. \u003cstrong\u003e47\u003c/strong\u003e(1): p. 45.\u003c/li\u003e\n\u003cli\u003eGoncu Ayhan, S., et al., \u003cem\u003eCOVID‐19 vaccine acceptance in pregnant women.\u003c/em\u003e International Journal of Gynecology \u0026amp; Obstetrics, 2021. \u003cstrong\u003e154\u003c/strong\u003e(2): p. 291-296.\u003c/li\u003e\n\u003cli\u003eKhubchandani, J., et al., \u003cem\u003eCOVID-19 vaccination hesitancy in the United States: a rapid national assessment.\u003c/em\u003e Journal of community health, 2021. \u003cstrong\u003e46\u003c/strong\u003e(2): p. 270-277.\u003c/li\u003e\n\u003cli\u003eWard, C., L. Megaw, S. White, and Z. Bradfield, \u003cem\u003eCOVID‐19 vaccination rates in an antenatal population: a survey of women\u0026apos;s perceptions, factors influencing vaccine uptake and potential contributors to vaccine hesitancy.\u003c/em\u003e Australian and New Zealand Journal of Obstetrics and Gynaecology, 2022. \u003cstrong\u003e62\u003c/strong\u003e(5): p. 695-700.\u003c/li\u003e\n\u003cli\u003eBarua, S., M.M. Islam, and K. Murase. \u003cem\u003eA novel synthetic minority oversampling technique for imbalanced data set learning\u003c/em\u003e. in \u003cem\u003eInternational conference on neural information processing\u003c/em\u003e. 2011. Springer.\u003c/li\u003e\n\u003cli\u003eChawla, N.V., K.W. Bowyer, L.O. Hall, and W.P. 
Kegelmeyer, \u003cem\u003eSMOTE: synthetic minority over-sampling technique.\u003c/em\u003e Journal of artificial intelligence research, 2002. \u003cstrong\u003e16\u003c/strong\u003e: p. 321-357.\u003c/li\u003e\n\u003cli\u003eLiu, J., \u003cem\u003eImportance-SMOTE: a synthetic minority oversampling method for noisy imbalanced data.\u003c/em\u003e Soft Computing, 2022. \u003cstrong\u003e26\u003c/strong\u003e(3): p. 1141-1163.\u003c/li\u003e\n\u003cli\u003eKOVVURI, V., \u003cem\u003eExplainable Artificial Intelligence across Domains: Refinement of SHAP and Practical Applications.\u003c/em\u003e 2024.\u003c/li\u003e\n\u003cli\u003ePradipta, G.A., et al. \u003cem\u003eSMOTE for handling imbalanced data problem: A review\u003c/em\u003e. in \u003cem\u003e2021 sixth international conference on informatics and computing (ICIC)\u003c/em\u003e. 2021. IEEE.\u003c/li\u003e\n\u003cli\u003eAllal, Z., H.N. Noura, O. Salman, and K. Chahine, \u003cem\u003eLeveraging the power of machine learning and data balancing techniques to evaluate stability in smart grids.\u003c/em\u003e Engineering Applications of Artificial Intelligence, 2024. \u003cstrong\u003e133\u003c/strong\u003e: p. 108304.\u003c/li\u003e\n\u003cli\u003eSatriaji, W. and R. Kusumaningrum. \u003cem\u003eEffect of synthetic minority oversampling technique (SMOTE), feature representation, and classification algorithm on imbalanced sentiment analysis\u003c/em\u003e. in \u003cem\u003e2018 2nd International Conference on Informatics and Computational Sciences (ICICoS)\u003c/em\u003e. 2018. IEEE.\u003c/li\u003e\n\u003cli\u003eMohammed, A.J., M. Muhammed Hassan, and D. Hussein Kadir, \u003cem\u003eImproving classification performance for a novel imbalanced medical dataset using SMOTE method.\u003c/em\u003e International Journal of Advanced Trends in Computer Science and Engineering, 2020. \u003cstrong\u003e9\u003c/strong\u003e(3): p. 
3161-3172.\u003c/li\u003e\n\u003cli\u003eGholampour, S., \u003cem\u003eImpact of nature of Medical Data on Machine and Deep Learning for Imbalanced datasets: clinical validity of SMOTE is questionable.\u003c/em\u003e Machine Learning and Knowledge Extraction, 2024. \u003cstrong\u003e6\u003c/strong\u003e(2): p. 827-841.\u003c/li\u003e\n\u003cli\u003eMohasel, S.M. and H. Koosha, \u003cem\u003eRobust support vector machines for imbalanced and noisy data via benders decomposition.\u003c/em\u003e arXiv preprint arXiv:2503.14873, 2025.\u003c/li\u003e\n\u003cli\u003eAli, M. and Y. Fissha, \u003cem\u003ePropose adjustable Support Vector Machine approach for classifying imbalanced work travel mode choice data.\u003c/em\u003e Transportation Research Interdisciplinary Perspectives, 2026. \u003cstrong\u003e35\u003c/strong\u003e: p. 101786.\u003c/li\u003e\n\u003cli\u003eCahyana, N., S. Khomsah, and A.S. Aribowo. \u003cem\u003eImproving imbalanced dataset classification using oversampling and gradient boosting\u003c/em\u003e. in \u003cem\u003e2019 5th International Conference on Science in Information Technology (ICSITech)\u003c/em\u003e. 2019. IEEE.\u003c/li\u003e\n\u003cli\u003ePan Quan, S. Agarwal, N. Nissim, and S.K. Sabut, \u003cem\u003eRandom Forest Classifier \u003c/em\u003e\u003cem\u003ehttps://www.sciencedirect.com/topics/computer-science/random-forest-classifier\u003c/em\u003e\u003cem\u003e.\u003c/em\u003e 2025.\u003c/li\u003e\n\u003cli\u003eRazzaghi, H., et al., \u003cem\u003eCOVID-19 vaccination coverage and intent among women aged 18\u0026ndash;49 years by pregnancy status, United States, April\u0026ndash;November 2021.\u003c/em\u003e Vaccine, 2022. \u003cstrong\u003e40\u003c/strong\u003e(32): p. 4554-4563.\u003c/li\u003e\n\u003cli\u003eMalik, A.A., S.M. McFadden, J. Elharake, and S.B. Omer, \u003cem\u003eDeterminants of COVID-19 vaccine acceptance in the US.\u003c/em\u003e EClinicalMedicine, 2020. 
\u003cstrong\u003e26\u003c/strong\u003e.\u003c/li\u003e\n\u003cli\u003eWiltse, D. and F. Viskupič, \u003cem\u003eAge and partisan self-identification predict uptake of additional COVID-19 booster doses: Evidence from a longitudinal study.\u003c/em\u003e Preventive Medicine Reports, 2023. \u003cstrong\u003e36\u003c/strong\u003e: p. 102407.\u003c/li\u003e\n\u003cli\u003eGilliss, L., et al., \u003cem\u003eFactors associated with COVID-19 vaccine uptake and hesitancy among women of reproductive age in Mozambique.\u003c/em\u003e Frontiers in Public Health, 2025. \u003cstrong\u003e13\u003c/strong\u003e: p. 1484477.\u003c/li\u003e\n\u003cli\u003eLiu, H., G.R. Nowak III, J. Wang, and Z. Luo, \u003cem\u003eA national study of marital status differences in early uptake of COVID-19 vaccine among older Americans.\u003c/em\u003e Geriatrics, 2023. \u003cstrong\u003e8\u003c/strong\u003e(4): p. 69.\u003c/li\u003e\n\u003cli\u003eNdlovu, K., S. Ndlangamandla, and B.G. Olutola, \u003cem\u003eFactors influencing Covid-19 vaccine uptake among women in the rural communities of South Africa (SDG 3, target 3. b. 1).\u003c/em\u003e OIDA International Journal of Sustainable Development, 2024. \u003cstrong\u003e17\u003c/strong\u003e(07): p. 35-46.\u003c/li\u003e\n\u003cli\u003eJaravaza, D.C., J. Risiro, and P. Mukucha, \u003cem\u003eCOVID-19 vaccination national radio advertising credibility assessment by rural consumers: The influence of indigenous medical knowledge systems and traditional beliefs.\u003c/em\u003e Cogent Public Health, 2023. \u003cstrong\u003e10\u003c/strong\u003e(1): p. 2178052.\u003c/li\u003e\n\u003cli\u003eRecio-Rom\u0026aacute;n, A., M. Recio-Men\u0026eacute;ndez, and M.V. Rom\u0026aacute;n-Gonz\u0026aacute;lez, \u003cem\u003eInfluence of media information sources on vaccine uptake: the full and inconsistent mediating role of vaccine hesitancy.\u003c/em\u003e Computation, 2023. \u003cstrong\u003e11\u003c/strong\u003e(10): p. 
208.\u003c/li\u003e\n\u003cli\u003eAmodan, B.O., et al., \u003cem\u003eKnowledge, attitudes and barriers to uptake of COVID-19 vaccine in Uganda, February 2021.\u003c/em\u003e BMJ Global Health, 2025. \u003cstrong\u003e10\u003c/strong\u003e(3): p. e016959.\u003c/li\u003e\n\u003cli\u003ePatenaude, B.N., et al., \u003cem\u003eComparing multivariate with wealth-based inequity in vaccination coverage in 56 countries: Toward a better measure of equity in vaccination coverage.\u003c/em\u003e Vaccines, 2023. \u003cstrong\u003e11\u003c/strong\u003e(3): p. 536.\u003c/li\u003e\n\u003cli\u003eBayati, M., R. Noroozi, M. Ghanbari-Jahromi, and F.S. Jalali, \u003cem\u003eInequality in the distribution of Covid-19 vaccine: a systematic review.\u003c/em\u003e International journal for equity in health, 2022. \u003cstrong\u003e21\u003c/strong\u003e(1): p. 122.\u003c/li\u003e\n\u003cli\u003eJanuszek, S.M., et al., \u003cem\u003eThe approach of pregnant women to vaccination based on a COVID-19 systematic review.\u003c/em\u003e Medicina, 2021. \u003cstrong\u003e57\u003c/strong\u003e(9): p. 977.\u003c/li\u003e\n\u003cli\u003eSyan, S.K., et al., \u003cem\u003eCOVID-19 vaccine perceptions and differences by sex, age, and education in 1,367 community adults in Ontario.\u003c/em\u003e Frontiers in public health, 2021. \u003cstrong\u003e9\u003c/strong\u003e: p. 719665.\u003c/li\u003e\n\u003cli\u003eSani, J., et al., \u003cem\u003eRegional Disparities and Maternal Sociodemographic Determinants of Full Immunization Coverage Among Children Aged 12\u0026ndash;23 Months in Nigeria: Insights from NDHS 2018.\u003c/em\u003e Pediatric health, medicine and therapeutics, 2025: p. 157-170.\u003c/li\u003e\n\u003cli\u003eOgundele, O.A., et al., \u003cem\u003eDeterminants of vaccine hesitancy among pregnant women in South-West Nigeria: an explanatory sequential mixed method design.\u003c/em\u003e BMJ open, 2025. \u003cstrong\u003e15\u003c/strong\u003e(10): p. 
e101767.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"discover-artificial-intelligence","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"diai","sideBox":"Learn more about [Discover Artificial Intelligence](https://www.springer.com/44163)","snPcode":"","submissionUrl":"","title":"Discover Artificial Intelligence","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"COVID-19, Vaccine, Women, Machine Learning, Nigeria, Demographic and Health Survey","lastPublishedDoi":"10.21203/rs.3.rs-9288102/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9288102/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eCOVID-19 vaccination remains a key strategy for reducing severe illness and mortality, yet uptake remains uneven in Nigeria. Although prior studies have highlighted multiple predictors of vaccine uptake among women, most rely on traditional analytical approaches, limiting their ability to capture complex, non-linear drivers of vaccine behaviour. 
Applying robust, data-driven machine learning methods is therefore essential to more accurately identify key predictors of vaccine uptake and to inform targeted, equitable vaccination interventions.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eThis study employed machine learning techniques using data from the 2024 Nigeria Demographic and Health Survey. A weighted sample of 36,205 women aged 15\u0026ndash;49 years was considered using Python software. Sociodemographic, household, reproductive, health-system, and media-exposure variables were considered as features. After preprocessing, the dataset was split into training (80%) and testing (20%) sets. To address class imbalance, Synthetic Minority Oversampling Technique (SMOTE) was applied. Multiple supervised machine-learning algorithms, namely Logistic Regression, Decision Tree Classifier, Random Forest, Gradient Boosting Machines, extreme gradient boosting, CatBoost, Support Vector Machines, K-Nearest Neighbors, and Artificial Neural Networks, were applied. Model effectiveness was evaluated using accuracy and Area Under the Receiver Operating Characteristic Curve (AUC-ROC) as primary metrics. The best-performing model was interpreted using SHapley Additive exPlanations (SHAP) to identify the most influential predictors of vaccine uptake.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eOverall, 27.6% of women reported receiving at least one dose of a COVID-19 vaccine (95% CI: 27.14\u0026ndash;28.06). Ten-fold cross-validation with SMOTE produced the best results, emphasizing the significance of addressing class imbalance. Random Forest achieved the highest performance (accuracy\u0026thinsp;=\u0026thinsp;75.1%, AUC\u0026thinsp;=\u0026thinsp;82.8%). 
SHAP-based interpretation of the Random Forest model identified older age (35\u0026ndash;49 years), region of residence (notably outside the North West and South East, particularly South West), and marital status (not never in union) as key positive contributors to vaccine uptake, while no radio exposure reduced the predicted likelihood of vaccination.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eThis study identified low coverage of COVID-19 vaccine uptake among Nigerian women. Policymakers should adopt targeted, equity-focused vaccination strategies that prioritise underserved regions and socioeconomically disadvantaged women. Integrating COVID-19 vaccination into routine maternal and reproductive health services, expanding community-based and mobile outreach, and strengthening radio-based and culturally tailored communication campaigns are essential to improve coverage.\u003c/p\u003e","manuscriptTitle":"Prediction of COVID 19 vaccine uptake among Nigerian women using supervised machine learning based on 2024 Demographic and Health Survey data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-20 09:05:38","doi":"10.21203/rs.3.rs-9288102/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"editorInvitedReview","content":"","date":"2026-05-08T21:00:30+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"302336215682811018875954675926393438787","date":"2026-04-29T16:52:22+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-04-13T12:33:09+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-04-13T12:29:49+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-04-13T11:44:40+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-04-08T14:46:28+00:00","index":"","fulltext":""},{"type":"submitted","content":"Discover Artificial 
Intelligence","date":"2026-04-08T14:15:23+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"discover-artificial-intelligence","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"diai","sideBox":"Learn more about [Discover Artificial Intelligence](https://www.springer.com/44163)","snPcode":"","submissionUrl":"","title":"Discover Artificial Intelligence","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Discover Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"fe6e82fb-d832-4ce1-a1c2-decebfa6f7a7","owner":[],"postedDate":"April 20th, 2026","published":true,"recentEditorialEvents":[{"type":"editorInvitedReview","content":"","date":"2026-05-08T21:00:30+00:00","index":58,"fulltext":""},{"type":"reviewerAgreed","content":"302336215682811018875954675926393438787","date":"2026-04-29T16:52:22+00:00","index":45,"fulltext":""}],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[],"tags":[],"updatedAt":"2026-04-20T09:05:38+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-20 09:05:38","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9288102","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9288102","identity":"rs-9288102","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
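The abstract reports an uptake of 27.6% (95% CI: 27.14 to 28.06) in a weighted sample of 36,205 women. As a minimal arithmetic check, and not a reconstruction of the author's procedure, the sketch below computes a normal-approximation (Wald) interval from those two numbers. It assumes simple random sampling, which is a simplification: DHS data come from a stratified cluster design, and a design-based interval would usually be wider. The function name `wald_interval` is hypothetical and introduced only for illustration.

```python
from statistics import NormalDist


def wald_interval(p_hat, n, level=0.95):
    """Normal-approximation (Wald) interval for a proportion.

    Assumes simple random sampling; survey weights, strata and
    clusters are ignored, so this is only a rough sanity check.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    se = (p_hat * (1 - p_hat) / n) ** 0.5       # standard error of p_hat
    return p_hat - z * se, p_hat + z * se


# Values taken from the abstract: 27.6% uptake, weighted n = 36,205.
low, high = wald_interval(0.276, 36205)
print(f"{low:.2%} to {high:.2%}")
```

Under these assumptions the bounds round to the same two decimals as the reported interval, which suggests (but does not establish) that the published interval was computed without a design-effect adjustment.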
