Comparative Machine Learning Models for Early Prediction of Preterm Birth from Maternal Serum Biomarkers

Research Article (Research Square)

Kaleem Maqsood, Javeria Malik, Mahnoor Fatima, Sundas Akram, Husna Ahmad, and 2 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8240167/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 23 Feb, 2026 in BMC Pregnancy and Childbirth. This is Version 1, the latest preprint version.

Abstract

Background

Preterm birth (PTB) is a major cause of neonatal morbidity and mortality. Inflammation and metabolic disruption are involved in its pathology. This study aimed to assess maternal serum inflammatory and lipid markers as predictors of preterm birth using various machine learning models.

Methods

Pregnant women attending antenatal clinics were recruited for this study.
The 186 women who delivered before 37 weeks formed the PTB group, and 140 women with term deliveries were selected at random as controls. T-tests were used to evaluate differences in baseline and clinical parameters, and Pearson correlations were visualized via a heatmap. We built random forest (RF), logistic regression (LR), XGBoost, and support vector machine (SVM) models using a 70/30 train/test split and 5-fold cross-validation. Model performance was measured using accuracy and AUC.

Results

CRP (r ≈ 0.45), IL-6 (r ≈ 0.40), C3 (r ≈ 0.31), BMI, and lipids correlated positively with PTB, whereas HDL correlated inversely (r ≈ −0.13). Multivariable logistic regression identified age, BMI, IL-6, C3, and CRP as independent predictors. All ML models showed good discrimination (test AUC ≥ 0.819); logistic regression performed best (accuracy 78.57%, AUC 0.849), with cross-validated AUCs around 0.86–0.87 across models. SHAP analysis confirmed that IL-6, BMI, CRP, age, and C3 were the dominant contributors to PTB risk.

Conclusions

Maternal inflammation and high BMI are important risk factors for preterm birth in this cohort. The logistic regression model combining clinical and serum measures predicted preterm birth as well as the more complex ML algorithms. It is an interpretable model that can help with early risk assessment in similar settings.

Keywords: Inflammation, Preterm birth, Machine learning, Biomarkers

Introduction

Globally, preterm birth (PTB) is a major healthcare issue because of its strong link with increased neonatal morbidity and mortality. It is clinically defined as the delivery of an infant before 37 gestational weeks [1]. Around 15 million children are born preterm worldwide each year, a global preterm birth rate of 11% [2]; of these, about 1 million die from PTB complications [3]. South Asia and Sub-Saharan Africa account for over 60% of PTBs worldwide [4].
Pakistan (14.3%) has one of the highest PTB rates in the world and, within South Asia, is second only to Bangladesh (16.2%) [5]. The etiology of preterm birth is multifactorial, which makes its causative mechanisms difficult to outline. Hence, it remains a central focus of research and clinical examination [6, 7]. Almost half of all cases of preterm birth occur without a clearly identifiable cause, reflecting the characteristic complexity of the condition [8].

During labour, an influx of immune cells initiates parturition, which is characterised by cervical ripening, uterine contractility, and membrane rupture, and leads to inflammation [9, 10]; about 40 and 80% of PTBs are associated with inflammation and infection [11]. Immune molecules such as IL-6 (high levels at labour onset) [12, 13], C-reactive protein (CRP), C3, and C4 have been associated with PTB [14, 15]. Obesity can lead to maternal morbidities such as thromboembolic and hypertensive disorders, gestational diabetes, and infections [16]. Dyslipidemia is associated with maternal hormonal changes and metabolic syndrome [17, 18]. These complications increase the risk of PTB.

At present, testing methods for predicting preterm birth fall into three groups: assessment of maternal risk factors, biochemical biomarkers, and cervical parameters. These diagnostics may not be sensitive enough to detect true-positive PTB cases. Biochemical tests, for example, are costly and may affect pregnant women psychologically and physically. Risk-factor analysis is a commonly used method that relies on evidence from statistical hypothesis testing. Most often, it involves testing a single factor under controlled circumstances. It is nonetheless expensive and time-consuming, and it omits many untested but relevant factors.
A previous history of preterm birth is one of the strongest predictors, with a relative risk of 13.56, but it cannot predict risk in nulliparous women [19, 20]. These observations highlight the limitations of current methods in identifying high-risk pregnancies, particularly among women experiencing their first pregnancy. Although several predictive systems using proven risk factors, maternal demographics, and medical and obstetric history have been explored, their predictive performance has remained limited [21, 22]. This may be attributed to their reliance on simple linear statistical models, which are inadequate for capturing the complex and multifactorial nature of preterm birth. Consequently, traditional risk-factor-based approaches fail to identify more than half of PTB cases [23].

Compared with logistic regression, machine learning (ML) models can process higher-dimensional data and have a self-learning capacity [24]. In recent years, ML techniques have therefore emerged as promising tools for enhancing individual risk prediction beyond traditional models [25]. Moreover, these algorithms have improved the predictive accuracy of PTB prediction models [26].

In Pakistan, biochemical tests and fetal fibronectin testing to predict PTB remain costly. Similarly, the uptake of cervical-length ultrasound is low because of a shortage of resources and trained personnel. There is also a lack of PTB prediction guidelines in Pakistan, resulting in delayed identification of women at risk [27]. Therefore, this study aimed to evaluate maternal inflammatory and lipid biomarkers as predictors of preterm birth and to compare the predictive performance of four machine-learning models: logistic regression, random forest, XGBoost, and support vector machine [14]. We hypothesized that machine learning approaches would outperform traditional regression by better capturing nonlinear interactions among biomarkers.
This work fills a major knowledge gap by providing the first biomarker-based ML prediction framework for preterm birth in Pakistan, where reliable screening tools remain limited.

Methodology

Study Design and Population

This study received ethical approval from the ethics review board of the University of the Punjab, Lahore. Participants were recruited at Lady Wellington Hospital and Jinnah Hospital, Lahore, during their antenatal visits in the second and third trimesters. Each participant was followed until delivery, and gestational age at birth was recorded. Of these, 186 women who delivered preterm were included in the PTB group, and 140 of the remaining women, who delivered at term, were randomly selected as controls (term birth). Informed consent was obtained from all participants after the study's objectives were explained. Detailed information was collected on their sociodemographic and clinical baseline characteristics. Women with medical complications such as uterine fibroids, polycystic ovarian syndrome, carcinoma, hepatitis, and other undefined pathological conditions were excluded.

Sample Collection and Serum Analysis

About 5 mL of blood was collected from each participant. The blood samples were transferred into labeled collection tubes and centrifuged at 3000 × g for 10 minutes to collect the serum for analysis. The serum was aliquoted into labeled cryovials and kept at -80°C until further biochemical or molecular analysis was carried out. Serum lipid analysis was done with commercially available kits from Monlab (Spain). Serum concentrations of immune markers, including C-reactive protein (CRP), interleukin-6 (IL-6), and complement factor 3 (C3), were measured using enzyme-linked immunosorbent assay (ELISA) kits from Glory Bioscience (China). All reagents and materials were handled according to the manufacturer’s biosafety and storage recommendations.
Descriptive and Inferential Statistical Analysis

A one-way ANOVA was conducted to compare biomarker levels across cohorts. When appropriate, Tukey’s post hoc tests were run to detect specific differences. A p-value of < 0.05 was considered statistically significant. All values are presented as mean ± SEM. A Pearson correlation matrix was also computed to examine the inter-relationships between the parameters. The variables studied were maternal age, BMI, triglycerides, HDL-C, total cholesterol, LDL-C, IL-6, complement C3, C-reactive protein, and the outcome, preterm birth. A heatmap visualization was generated using seaborn.

Data Processing and Preparation

The outcome variable was encoded in binary form: 0 = term birth and 1 = preterm birth (including both very preterm and extremely preterm cases). Categorical predictors were converted to numerical values via one-hot encoding where necessary. Because some algorithms are sensitive to predictor scale, standardized scaling was applied selectively. For the logistic regression and SVM models, continuous predictors were standardized to zero mean and unit variance using StandardScaler after mean imputation. For the random forest and XGBoost models, only mean imputation was applied; tree-based methods were trained on unscaled features, as they are inherently scale-invariant. Scaling and imputation were implemented using scikit-learn Pipelines, ensuring that all preprocessing was fitted on the training data only and then applied consistently to the test data.

Data Partitioning and Model Development

For predictive modeling, the dataset was split into training (70%) and test (30%) sets using stratified random sampling to preserve the outcome proportions across groups. Four supervised machine learning algorithms, spanning traditional and advanced approaches, were used to predict preterm birth: LR [28], RF [29], XGBoost [30], and SVM [31].
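As an illustration only, the selective preprocessing and the stratified 70/30 partition described above could be sketched as follows. This is a minimal sketch on synthetic stand-in data, not the study's code: the random data, the helper names `scaled` and `unscaled`, the hyperparameters, and the `random_state` values are all assumptions. XGBoost is left out to keep the sketch dependent on scikit-learn alone; an `xgboost.XGBClassifier` would presumably take the same unscaled pipeline as the random forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the study data): 140 term (0) and 186 preterm (1)
# rows with 9 numeric predictors and a few missing values to impute.
rng = np.random.default_rng(0)
y = np.r_[np.zeros(140, dtype=int), np.ones(186, dtype=int)]
X = rng.normal(size=(326, 9)) + 0.8 * y[:, None]
X[rng.random(X.shape) < 0.02] = np.nan

# Stratified 70/30 split that preserves the term/preterm proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

def scaled(model):
    """Mean imputation, then standardization (scale-sensitive models)."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
        ("model", model),
    ])

def unscaled(model):
    """Mean imputation only (tree-based models)."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("model", model),
    ])

pipelines = {
    "logistic_regression": scaled(LogisticRegression(max_iter=1000)),
    "svm_rbf": scaled(SVC(kernel="rbf", probability=True, random_state=42)),
    "random_forest": unscaled(
        RandomForestClassifier(n_estimators=200, random_state=42)
    ),
}

# Each pipeline learns its imputation means and scaling parameters from the
# training rows only, then reuses them unchanged on the test rows.
for name, pipe in pipelines.items():
    pipe.fit(X_train, y_train)
    print(f"{name}: test accuracy = {pipe.score(X_test, y_test):.3f}")
```

Wrapping the imputer and scaler inside each pipeline, rather than transforming the full dataset up front, is what keeps test-set statistics from leaking into training.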
Logistic regression was used as the baseline interpretable model. Random forest and XGBoost were used to capture nonlinear interactions and assess feature importance. The SVM model was included to assess classification efficiency in high-dimensional data spaces.

Model Evaluation and Validation

Model performance was evaluated on the independent test dataset. The main evaluation metrics were accuracy, area under the curve (AUC), precision, sensitivity, and F1-score. Each model was evaluated using its own ROC curve to assess discriminative performance. Among the classifiers tested, logistic regression achieved the best predictive accuracy, with a mean cross-validated AUC of 0.86. To ensure model stability and reduce overfitting, five-fold cross-validation was used.

Feature Importance Analysis

For the random forest and XGBoost models, global feature importance was obtained from the fitted models via the feature_importances_ attribute, which reflects the average reduction in node impurity attributable to each variable's splits. Importance scores were normalized and plotted as horizontal bar charts using descriptive feature names. Because the RBF-kernel SVM does not yield direct coefficients, permutation feature importance was used to assess variable contributions. Using the trained SVM pipeline, baseline accuracy on the test set was recorded, and each predictor was permuted repeatedly while all other features were kept fixed. The mean decrease in classification accuracy across 30 permutations per feature, along with its standard deviation, was used as the importance score and visualized in a horizontal bar plot.

SHAP (Shapley Additive Explanations) Analysis

To quantify the relative importance and direction of influence of each predictor in the LR model, SHAP analysis was performed.
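For illustration, the permutation-importance procedure described for the SVM under Feature Importance Analysis could be sketched with scikit-learn's `permutation_importance`. This is a hedged sketch on synthetic data, not the study's code: the four feature names, the simulated outcome, and the `random_state` values are hypothetical. Only the RBF kernel, accuracy scoring on the test set, and the 30 repeats per feature are taken from the text.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the study data). Feature names are hypothetical:
# two columns carry signal and two are pure noise.
rng = np.random.default_rng(1)
feature_names = ["signal_strong", "signal_weak", "noise_1", "noise_2"]
X = rng.normal(size=(326, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=326) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# RBF-kernel SVM inside an imputation + scaling pipeline, fitted on training data.
svm_pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("model", SVC(kernel="rbf", random_state=42)),
]).fit(X_train, y_train)

# Shuffle one test-set column at a time, 30 times each, and record how far
# accuracy falls below the unpermuted baseline.
result = permutation_importance(
    svm_pipeline, X_test, y_test,
    scoring="accuracy", n_repeats=30, random_state=42,
)

# Report features from most to least important (mean ± SD of the accuracy drop).
for idx in np.argsort(result.importances_mean)[::-1]:
    print(f"{feature_names[idx]:>13}: "
          f"{result.importances_mean[idx]:.3f} ± {result.importances_std[idx]:.3f}")
```

Because the whole pipeline is passed to `permutation_importance`, the shuffled column goes through the same imputation and scaling as the original data, so the score reflects the full preprocessing-plus-model chain.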
SHAP values measure the marginal contribution of each feature to the model’s output using cooperative game theory, providing both local (individual-level) and global (model-level) interpretability. Summary plots and bar-type feature importance plots were generated using the shap Python library. These plots illustrate how increases or decreases in biomarkers contribute to the predicted risk of preterm birth. All SHAP visualizations were produced on the test set to avoid information leakage.

Five-Fold Cross-Validation

Internal validation was performed using 5-fold stratified cross-validation on the 70% training set. The data were divided into five folds, preserving the proportions of term and preterm births. In each cycle, four folds were used to train the full preprocessing–model pipeline and the remaining fold served as the validation set. This process was repeated until every fold had been used once for validation. For each model and fold, accuracy, AUC, sensitivity, specificity, and the F1-score for preterm birth were calculated, and the mean and standard deviation across the five folds were taken as the cross-validated performance estimates.

Software and Tools

All conventional statistical analyses were executed using IBM SPSS (Version 20). All computational analyses were performed using Python libraries including pandas, numpy, scikit-learn, matplotlib, seaborn, and xgboost. Graphical illustrations, confusion matrices, ROC curves, and summary tables were generated using Matplotlib and Seaborn. Data integrity and reproducibility were maintained by using the same train-test split across all algorithms, allowing for direct model comparison.

Results

Demographic and Clinical Characteristics

We compared maternal characteristics and inflammatory biomarkers between mothers who delivered at term and those who delivered preterm. The two groups differed significantly on every factor.
Mean maternal age (P < 0.001) and BMI (P < 0.001) were higher in the preterm birth group than in the term birth group. Among lipid profile markers, women with preterm birth had higher triglycerides, total cholesterol, and LDL cholesterol (P < 0.001). Conversely, HDL cholesterol was significantly lower among preterm births (P = 0.0283) (Table 1). Inflammatory biomarkers were markedly elevated in the preterm birth group: IL-6 levels (P < 0.001), complement C3 concentrations (P < 0.001), and C-reactive protein (CRP) (P < 0.001) were all higher in preterm deliveries (Table 1).

Table 1 Baseline maternal and pregnancy characteristics in term and preterm birth groups. Values are mean ± SD.

Parameter                    Term group       Preterm group    P value
Maternal Age                 27.77 ± 4.96     30.17 ± 5.11     < 0.0001
Body Mass Index (BMI)        24.99 ± 5.28     29.09 ± 6.55     < 0.0001
Triglycerides                161.25 ± 44.69   183.26 ± 41.68   < 0.0001
HDL Cholesterol              60.96 ± 20.79    55.63 ± 21.06    0.0283
Total Cholesterol            174.93 ± 38.59   192.57 ± 36.24   0.0001
LDL Cholesterol              71.59 ± 25.37    81.18 ± 23.76    0.0008
Interleukin-6 (IL-6)         42.98 ± 9.84     52.75 ± 12.11    < 0.0001
Complement C3                65.46 ± 20.60    79.17 ± 21.20    < 0.0001
C-Reactive Protein (CRP)     6.76 ± 1.97      9.69 ± 3.41      < 0.0001

Correlation Among Maternal Serum Markers and Preterm Birth

Inflammatory markers and metabolic factors were positively correlated with preterm birth and other adverse pregnancy outcomes. Serum CRP showed the strongest correlation with preterm birth (r ≈ 0.45), followed by IL-6 (r ≈ 0.40) and complement C3 (r ≈ 0.31). Maternal BMI showed a weak positive correlation (r ≈ 0.32), as did triglycerides (r ≈ 0.25), total cholesterol (r ≈ 0.23), and maternal age (r ≈ 0.23). HDL cholesterol, by contrast, had a weak inverse association with preterm birth (r ≈ −0.13), which is consistent with a possible protective effect of higher HDL levels.
Overall, these findings suggest that a greater inflammatory burden and an unfavourable metabolic profile are associated with an increased risk of preterm birth (Fig. 1).

Logistic Regression Analysis

Binary logistic regression was applied to identify the most significant predictors of preterm birth. Multivariable logistic regression identified maternal age, BMI, IL-6, C3, and CRP as independent predictors. For each 1-year increase in age, the odds of preterm birth increased by about 11% (OR = 1.108, P = 0.001), and each 1-kg/m² increase in BMI raised the odds by about 13% (OR = 1.133, P < 0.001). Inflammatory markers remained strongly associated with risk: IL-6 (OR = 1.060, P < 0.001), C3 (OR = 1.023, P = 0.002), and especially CRP (OR = 1.338, P < 0.001). In contrast, triglycerides showed only a borderline association (P = 0.054), while HDL, total cholesterol, and LDL were not significant predictors after adjustment (Table 2).

Table 2 Multivariable logistic regression model for preterm birth.

Parameter                    β        OR      95% CI      P value
Maternal Age                 0.103    1.108   1.04–1.18   0.001
Body Mass Index (BMI)        0.125    1.133   1.07–1.20   < 0.001
Triglycerides                0.007    1.007   1.00–1.01   0.054
HDL Cholesterol              -0.005   0.995   0.98–1.01   0.557
Total Cholesterol            0.004    1.004   0.99–1.01   0.500
LDL Cholesterol              0.007    1.007   0.99–1.02   0.397
Interleukin-6 (IL-6)         0.058    1.060   1.03–1.09   < 0.001
Complement C3                0.023    1.023   1.01–1.04   0.002
C-Reactive Protein (CRP)     0.291    1.338   1.18–1.51   < 0.001

Test-Set Performance of Machine Learning Models

All four models showed good discrimination for preterm birth, with accuracies ranging from 74.49% to 78.57% and ROC-AUC values of at least 0.819. Logistic regression achieved the best overall performance (accuracy 78.57%, ROC-AUC 0.849), closely followed by SVM (accuracy 77.55%, ROC-AUC 0.819).
The ensemble tree-based methods performed slightly worse but still achieved competitive accuracies of 74.49% and 75.51%, with AUCs of 0.835 and 0.823, for random forest and XGBoost, respectively (Table 3; Fig. 2).

Table 3 Test-set performance of ML models for preterm birth prediction.

Model                 Accuracy (%)   Sensitivity (%)   Specificity (%)   F1-score   AUC
Logistic regression   78.57          76.79             80.95             0.80       0.849 (0.77–0.92)
Random forest         74.49          75.00             73.81             0.77       0.835 (0.76–0.91)
XGBoost               75.51          76.79             73.81             0.78       0.823 (0.74–0.89)
SVM                   77.55          78.57             76.19             0.80       0.819 (0.73–0.89)

Five-Fold Cross-Validated Performance (Internal Validation)

In 5-fold stratified cross-validation of the training set, all models demonstrated stable and comparable discrimination (Table 4). Mean accuracies ranged from 75.86% to 79.39%, with logistic regression showing the highest accuracy (79.39 ± 4.24%), closely followed by SVM (78.51 ± 4.82%). Sensitivity for preterm birth remained around 78–82% across models, while specificity for term birth ranged from 71% to 79%, indicating a balanced ability to identify both preterm and term outcomes. The F1-scores for the preterm class were consistently around 0.79–0.82, reflecting good overall classification of preterm cases. All models achieved cross-validated AUC values between 0.86 and 0.87, confirming robust discriminatory performance and supporting the reliability of the test-set results (Table 4).

Table 4 Five-fold cross-validated performance of ML models for preterm birth prediction.
Model                 Accuracy (%)    Sensitivity (%)   Specificity (%)   F1-score      AUC
Logistic regression   79.39 ± 4.24    81.54 ± 8.21      76.63 ± 6.68      0.82 ± 0.04   0.86 ± 0.04
Random forest         77.61 ± 5.15    80.00 ± 7.46      74.53 ± 9.20      0.80 ± 0.05   0.87 ± 0.02
XGBoost               75.86 ± 3.74    79.23 ± 5.22      71.42 ± 5.32      0.79 ± 0.03   0.86 ± 0.03
SVM                   78.51 ± 4.82    78.46 ± 8.63      78.68 ± 7.84      0.80 ± 0.05   0.87 ± 0.04

Feature Importance Across Models

Across all models, inflammatory biomarkers consistently emerged as the strongest predictors of preterm birth. Logistic regression identified CRP as the dominant factor, followed by BMI, maternal age, IL-6, and C3, while lipid measures contributed minimally after adjustment. Feature-importance rankings from the random forest and XGBoost models reinforced this pattern, with CRP, IL-6, and BMI showing the highest contributions, and triglycerides and C3 providing moderate additional value. In contrast, total, HDL, and LDL cholesterol played only a small role. The SVM model showed a slightly different pattern, placing C3 and maternal age highest, followed by BMI, CRP, and IL-6 (Fig. 3).

Model Explainability Using SHAP for Logistic Regression

SHAP analysis of the best-performing logistic regression model showed that IL-6, BMI, and CRP were the dominant contributors to predicted preterm birth risk, followed by maternal age and complement C3, whereas lipid markers had much smaller average SHAP values (Fig. 4). The SHAP summary plot further illustrated that higher values of IL-6, BMI, CRP, age, and C3 were associated with positive SHAP values, pushing predictions toward preterm birth. In contrast, lower values of these features tended to be protective. Lipid variables clustered around zero SHAP, and higher HDL levels were slightly associated with reduced predicted risk. Together, these results confirm that systemic inflammation and maternal adiposity drive most of the model’s discriminative power (Table 5; Fig. 5).

Table 5 Mean absolute SHAP values for each predictor.
Feature                      Mean absolute SHAP value
Interleukin-6 (IL-6)         0.693
Body Mass Index (BMI)        0.623
C-Reactive Protein (CRP)     0.619
Maternal Age                 0.334
Complement C3                0.313
Total Cholesterol            0.224
Triglycerides                0.222
LDL Cholesterol              0.065
HDL Cholesterol              0.059

Discussion

In this study, women who delivered preterm had a less favorable inflammatory and metabolic profile. Older age, higher BMI, and an atherogenic lipid pattern were associated with preterm birth, as were elevated inflammatory markers, including IL-6, CRP, and C3. In the adjusted models, maternal age, BMI, IL-6, C3, and especially CRP were significant predictors of preterm birth. All four machine-learning models performed reasonably well, with AUCs of at least 0.819 on the test set. The mean cross-validated AUCs were in the range of 0.86–0.87. Logistic regression performed similarly to, and occasionally better than, the more complex models. SHAP analysis was also consistent with the regression results, indicating that IL-6, BMI, CRP, maternal age, and C3 were the strongest predictors. These findings affirm that simple, readily obtainable maternal biomarkers can support meaningful early risk assessment.

The performance of all four machine learning models in our study is comparable to that reported in previous studies using these algorithms to predict PTB from maternal biomarkers. Teng et al. identified seven major predictors of PTB (ALP, AFP, hemoglobin, urea, sodium, lymphocyte count, and RBCs) using LR and four machine learning models followed by SHAP analysis. The XGBoost model performed best (AUC = 0.893), followed by logistic regression (0.872), LightGBM (0.840), and GBDT (0.879) [32].
Sun et al. analyzed 9550 pregnant women and found that RF had the highest AUC (0.885) and accuracy (0.816) among the algorithms tested for predicting PTB from predictors such as gestational age, serum inorganic phosphorus, magnesium, platelet volume, waist size, total cholesterol, total bilirubin, triglycerides, and globulins [33]. In a study by Kloska et al. on maternal data collected from 50 patients, the linear SVM (accuracy: 82%) and the logistic regression model (accuracy: 80%) demonstrated comparable performance [1], which is in line with our findings.

In our study, the logistic regression model somewhat outperformed the other machine learning models. Other studies comparing machine learning models with conventional logistic regression for predicting various clinical conditions have also shown that, in general, no single method consistently provides the best prediction [34, 35]. Linear models (e.g., linear SVM and logistic regression) work well when data relationships are linear or roughly linear, as is often the case with patients' medical data such as their histories and blood tests [36]. These linear models offer greater interpretability, which is important in medical settings, and can effectively capture simple, linear relationships between predictors and PTB risk [1]. Even though logistic regression is widely used, it assumes that predictor effects are linear and that predictors are independent. Machine learning, on the other hand, offers non-parametric methods that can handle complex and non-linear relationships.

The elevated levels of CRP, IL-6, and C3 that we observed in women who delivered preterm fit with the growing evidence in the literature that maternal inflammation contributes to preterm birth. An increase in CRP in early or mid-pregnancy has long been associated with shorter gestation. IL-6, in turn, is among the most frequently cited cytokines in preterm labour and rupture of membranes.
Increased IL-6 and CRP levels may damage membranes and lead to uterine contractions [37–40]. In this study, the inflammatory markers differed significantly between the groups and were strong predictors in both the regression and machine-learning models. We found a strong association between C-reactive protein (CRP) and the outcome. CRP is widely regarded as an easily available marker of inflammatory stress. According to earlier literature, inflammation is the primary cause of preterm labor and birth, driven by bacteria infiltrating the mother’s uterus [41] and the subsequent activation of maternal and fetal immune responses [42]. Chorioamnionitis, characterized by inflammation of the fetal membranes, is frequently linked to such infections and leads to elevated levels of pro-inflammatory cytokines, including IL-6 and IL-1β. These mediators enhance uterine contractions and compromise the integrity of the membranes, thereby increasing the risk of premature delivery [43].

Complement activation has been less widely studied in general obstetric populations, yet existing literature supports its involvement in adverse pregnancy outcomes. The complement system is increasingly recognized as a key contributor to preterm birth, as its activation has been linked to enhanced myometrial contractions, cervical collagen remodeling, and the recruitment of inflammatory immune cells [44, 45]. The association we observed between higher C3 levels and preterm birth suggests that complement-driven inflammatory pathways may contribute meaningfully to early parturition. Our findings are supported by earlier reports that the complement cascade facilitates the recruitment and activation of neutrophils and macrophages [46, 47]. In preterm birth, activated macrophages promote the release of matrix metalloproteinase-9 (MMP-9), which degrades cervical collagen, contributing to tissue remodeling, distension, and eventual dilation [46, 48].
This finding offers a potentially important avenue for further mechanistic and translational research.

Women who delivered preterm had higher levels of triglycerides, total cholesterol, and LDL cholesterol than women who delivered at term. Previous studies have similarly reported that maternal dyslipidemia influences preterm birth. Elevated cholesterol and triglyceride levels in early pregnancy have been reported to raise the risk of delivery before 34 weeks more than 2.8-fold and the risk of delivery at 34–37 weeks two-fold [49]. These findings suggest that pregnancy-related lipid alterations may reflect underlying inflammatory or infectious processes, as hypertriglyceridemia can function as an innate immune response and heightened inflammatory activity may contribute to hypercholesterolemia. Furthermore, excessive cholesterol levels have been associated with thrombotic events, potentially increasing the likelihood of obstetric complications such as placental abruption, which can precipitate preterm birth [50]. Nevertheless, most lipid components lost their significance after controlling for age, BMI, and inflammatory markers, suggesting that their association with preterm birth may reflect broader metabolic and inflammatory effects. This interpretation is consistent with evidence that metabolic stress, adiposity, and inflammation are linked during pregnancy. Dyslipidemia is probably indicative of a higher-risk metabolic state, but our data indicate that inflammatory markers and BMI more directly predict preterm birth in this population.

Maternal age and BMI were both independent predictors of preterm birth with clinically relevant effect sizes. Obstetric risk factors can influence gestational outcomes through their potential effects on vascular, metabolic, and inflammatory function, and these effects are well recognised for these two variables.
Women who are overweight or obese before pregnancy are at higher risk of gestational hypertension and gestational diabetes [51, 52], both of which increase the likelihood of labor induction or planned cesarean delivery. These obstetric interventions may partly account for the elevated risk of preterm birth observed among individuals with higher BMI [53]. Given the global rise in advanced maternal age and obesity, these findings reinforce the need for strategies that support maternal metabolic health before and during pregnancy. Interventions focused on weight optimisation, improved nutrition, and management of chronic inflammation may help lower the risk of preterm birth, although further evidence from interventional research is needed.

Although numerous machine learning models for PTB prediction exist, relatively few have been developed using this biomarker panel, especially in LMICs, where patient demographics and clinical characteristics differ from those in high-income countries. This study is therefore particularly important for providing realistic and reproducible insights for the subcontinent population. Given their high AUC and accuracy, these machine learning models, if applied early in pregnancy, could allow timely identification of high-risk individuals and prompt interventions to reduce PTB, particularly in LMICs. Future research should increase sample sizes and incorporate multicenter data to enhance generalizability. Furthermore, clinical validation of machine learning models and evaluation of their real-world effectiveness and cost-efficiency are crucial next steps.

The four machine-learning models, namely logistic regression, SVM, random forest, and XGBoost, showed very similar and robust predictive capacity. Logistic regression performed particularly well and offered the advantage of interpretability. These results highlight an important message: the choice of predictors and data quality often matter more than model complexity.
Earlier research has shown that linear-kernel support vector machines and logistic regression are sensitive to variable scaling, and that appropriate normalization can improve their ability to detect meaningful patterns in the dataset [ 1 ]. The SHAP analysis further strengthened the interpretability of the findings, showing that the same set of variables (IL-6, BMI, CRP, maternal age, and C3) dominated model decision-making across approaches. Lipid markers, in contrast, played a much smaller role. The coherence between statistical and ML-driven insights increases confidence in the robustness of the results. Our findings lend support to the routine use of a focused panel of easily measurable biomarkers, namely CRP, IL-6, and C3, in conjunction with age and BMI, which may assist in early identification of women at higher risk of preterm birth. Because this approach is simple, it could be introduced into the workflows of clinics that lack advanced diagnostic imaging. The models showed similar sensitivity and specificity; the next step is external validation before clinical use. Even so, these results show how biomarker-informed risk stratification could support close monitoring, timely referral, and targeted preventive measures. One of the strengths of this study is its prospective design, which provides access to well-characterized biochemical and clinical data. We applied both conventional and machine learning methods, coupled with the SHAP explainability tool, which allowed us to evaluate both predictive performance and the biological meaning behind the predictions. The primary contribution of this study is the application of advanced machine learning models for PTB prediction paired with strong SHAP-based interpretability. By measuring the contribution of each predictor, SHAP improves the model’s transparency and clinical relevance, giving clinicians a clearer and more reliable decision-support tool [ 32 ]. 
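To make the idea of per-predictor contributions concrete, the sketch below standardizes three predictors to zero mean and unit variance and then computes SHAP-style contributions for a linear model on the log-odds scale. It assumes independent features, in which case the contribution of feature j is w_j (z_j − mean(z_j)). The coefficients, intercept, and patient values are hypothetical and are not estimates from this study, and the study itself presumably used a SHAP software package rather than this hand calculation.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Scale each feature column to zero mean and unit variance (population SD)."""
    scaled = {}
    for name, values in columns.items():
        mu, sd = mean(values), pstdev(values)
        scaled[name] = [(v - mu) / sd for v in values]
    return scaled


def linear_shap(weights, scaled, row):
    """Contribution of each feature, w_j * (z_j - mean(z_j)), on the log-odds scale."""
    return {
        name: weights[name] * (scaled[name][row] - mean(scaled[name]))
        for name in weights
    }


# Hypothetical toy values for four women (not study data).
raw = {
    "IL6": [2.0, 4.0, 6.0, 8.0],
    "BMI": [22.0, 24.0, 30.0, 28.0],
    "CRP": [1.0, 3.0, 2.0, 6.0],
}
# Hypothetical logistic-regression coefficients for the standardized predictors.
weights = {"IL6": 0.9, "BMI": 0.6, "CRP": 0.4}
intercept = -0.2

z = standardize(raw)
row = 3  # explain the prediction for the fourth woman
phi = linear_shap(weights, z, row)

log_odds = intercept + sum(weights[k] * z[k][row] for k in weights)
base_value = intercept + sum(weights[k] * mean(z[k]) for k in weights)

# Additivity: the base value plus all contributions recovers the model output.
print(abs(base_value + sum(phi.values()) - log_odds) < 1e-9)  # True
```

The additivity check is the property that makes such contributions readable at the bedside: each woman's predicted log-odds is the cohort average plus one signed term per predictor.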
Transparent interpretability is especially critical for clinicians in LMIC settings because they often work with limited resources, high patient loads, and variable data quality. Nevertheless, the study participants were drawn from two tertiary centres and may not reflect the wider community. Although adequate for the study’s analyses, the sample size is moderate for machine-learning applications. Because biomarker samples were collected in the second and third trimesters, inference regarding earlier screening is limited. We also did not account for numerous clinical or environmental factors that influence preterm birth. Our models were validated only internally, and external validation is still needed. Future studies should validate the findings in larger and more diverse populations, explore early-pregnancy or longitudinal biomarker trajectories, and integrate biochemical, clinical, and ultrasound parameters into a joint model. Implementation studies will also help determine whether these models can meaningfully reduce preterm birth rates or improve outcomes. Conclusion This study shows that preterm birth is closely linked to maternal inflammatory and metabolic status. Higher CRP, IL-6, C3, BMI, and maternal age were the strongest predictors, while lipid markers contributed less after these factors were taken into account. All machine-learning models performed well, with logistic regression matching more complex approaches and offering clearer interpretability. A small set of routine biomarkers, combined with basic clinical data, can provide meaningful early prediction of preterm birth, although external validation is needed before clinical use. Declarations Ethics approval and consent to participate: Ethical approval for this study was obtained from the Institutional Ethics Review Board, University of the Punjab, Lahore. 
Written informed consent was obtained from all participants or their legal guardians.
Consent for publication: Not applicable.
Competing interests: The authors declare no potential conflicts of interest.
Funding: This study received no specific funding.
Authors' contributions: KM, JM and MF contributed equally to the study design, data analysis, and manuscript drafting and interpretation. SA and HA supported literature review and manuscript preparation. NR and SB supervised the project, provided critical revisions, and finalized the manuscript.
Acknowledgements: We gratefully acknowledge the Institute of Zoology, University of the Punjab, Lahore for providing financial support for this study.
Data Availability: There is no additional supporting data to be declared. All data generated or analyzed in this study are fully reported within the Results section of the manuscript.
References
1. Kloska A, et al. Predicting preterm birth using machine learning methods. Sci Rep. 2025;15(1):5683.
2. Blencowe H, et al. Chapter 2: 15 million preterm births: priorities for action based on national, regional and global estimates. In: Born too soon: the global action report on preterm birth. 2012.
3. United Nations Inter-agency Group for Child Mortality Estimation (UN IGME). Levels & Trends in Child Mortality: Report. Estimates developed by the United Nations Inter-agency Group for Child Mortality Estimation.
4. WHO. Preterm birth. [cited 2018]. Available from: http://www.who.int/en/news-room/factsheets/detail/preterm-birth
5. Blencowe H, et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. Lancet. 2012;379(9832):2162–72.
6. Goldenberg RL, et al. Epidemiology and causes of preterm birth. Lancet. 2008;371(9606):75–84.
7. Koullali B, et al. Risk assessment and management to prevent preterm birth. In: Seminars in Fetal and Neonatal Medicine. Elsevier; 2016.
8. Shivkumar PV, Priyadarshani P, Choksi N. Preterm Labor. In: Sharma A, editor. Labour Room Emergencies. Singapore: Springer Singapore; 2020. pp. 33–8.
9. Young A, et al. Immunolocalization of proinflammatory cytokines in myometrium, cervix, and fetal membranes during human parturition at term. Biol Reprod. 2002;66(2):445–9.
10. Rinaldi SF, et al. Anti-inflammatory mediators as physiological and pharmacological regulators of parturition. Expert Rev Clin Immunol. 2011;7(5):675–96.
11. Miller FA, et al. Interventions for Infection and Inflammation-Induced Preterm Birth: a Preclinical Systematic Review. Reproductive Sci. 2023;30(2):361–79.
12. Leimert KB, et al. Inflammatory amplification: a central tenet of uterine transition for labor. Front Cell Infect Microbiol. 2021;11:660983.
13. Gilman-Sachs A, et al. Inflammation induced preterm labor and birth. J Reprod Immunol. 2018;129:53–8.
14. Moghaddam Banaem L, et al. Maternal serum C-reactive protein in early pregnancy and occurrence of preterm premature rupture of membranes and preterm birth. J Obstet Gynecol Res. 2012;38(5):780–6.
15. Lynch AM, et al. Early elevations of the complement activation fragment C3a and adverse pregnancy outcomes. Obstet Gynecol. 2011;117(1):75–83.
16. Galtier-Dereure F, Boegner C, Bringer J. Obesity and pregnancy: complications and cost. Am J Clin Nutr. 2000;71(5):S1242–8.
17. Aghaie Z, Hajian S, Abdi F. The relationship between lipid profiles in pregnancy and preterm delivery: a systematic review. Biomedical Res Therapy. 2018;5(8):2590–609.
18. Chatzi L, et al. Metabolic syndrome in early pregnancy and risk of preterm birth. Am J Epidemiol. 2009;170(7):829–36.
19. Tran T, et al. Preterm birth prediction: Deriving stable and interpretable rules from high dimensional data. arXiv preprint arXiv:1607.08310, 2016.
20. Esplin MS, et al. Estimating recurrence of spontaneous preterm delivery. Obstet Gynecol. 2008;112(3):516–23.
21. Mercer B, et al. The preterm prediction study: a clinical risk assessment system. Am J Obstet Gynecol. 1996;174(6):1885–95.
22. Lee KA, et al. A model for prediction of spontaneous preterm birth in asymptomatic women. J Women's Health. 2011;20(12):1825–31.
23. Georgiou HM, et al. Predicting Preterm Labour: Current Status and Future Prospects. Dis Markers. 2015;2015(1):435014.
24. Ngiam KY, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. 2019;20(5):e262–73.
25. Koivu A, Sairanen M. Predicting risk of stillbirth and preterm pregnancies with machine learning. Health Inform Sci Syst. 2020;8(1):14.
26. Vovsha I, et al. Predicting Preterm Birth Is Not Elusive: Machine Learning Paves the Way to Individual Wellness. In: AAAI Spring Symposia. 2014.
27. Bhutta ZA, et al. Reproductive, maternal, newborn, and child health in Pakistan: challenges and opportunities. Lancet. 2013;381(9884):2207–18.
28. Maalouf M. Logistic regression in data analysis: an overview. Int J Data Anal Techniques Strategies. 2011;3(3):281–99.
29. Gupta D, Malviya A, Singh S. Performance Analysis of Classification Tree Learning Algorithms. International Journal of Computer Applications. ISSN 0975–8887.
30. Devan P, Khare N. An efficient XGBoost–DNN-based classification model for network intrusion detection system. Neural Comput Appl. 2020;32(16):12499–514.
31. Ayat N-E, Cheriet M, Suen CY. Automatic model selection for the optimization of SVM kernels. Pattern Recogn. 2005;38(10):1733–45.
32. Teng X, et al. Machine learning prediction of preterm birth in women under 35 using routine biomarkers in a retrospective cohort study. Sci Rep. 2025;15(1):10213.
33. Sun Q, et al. Machine Learning-Based Prediction Model of Preterm Birth Using Electronic Health Record. J Healthc Eng. 2022;2022(1):9635526.
34. Tu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol. 1996;49(11):1225–31.
35. Christodoulou E, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22.
36. Rocha TAH, et al. Data-driven risk stratification for preterm birth in Brazil: a population-based study to develop of a machine learning risk assessment approach. The Lancet Regional Health–Americas; 2021. p. 3.
37. Shahshahan Z, Rasouli O. The use of maternal C-reactive protein in the predicting of preterm labor and tocolytic therapy in preterm labor women. Adv Biomedical Res. 2014;3(1):154.
38. Pandey M, Chauhan M, Awasthi S. Interplay of cytokines in preterm birth. Indian J Med Res. 2017;146(3):316–27.
39. Kadivnik M, et al. Role of IL-6, IL-10 and TNFα gene variants in preterm birth. J Clin Med. 2024;13(8):2429.
40. Huang S, et al. Elevated C-reactive protein and complement C3 levels are associated with preterm birth: a nested case–control study in Chinese women. BMC Pregnancy Childbirth. 2020;20(1):131.
41. Jefferson KK. Chapter One - The Bacterial Etiology of Preterm Birth. In: Sariaslani S, Gadd GM, editors. Advances in Applied Microbiology. Academic Press; 2012. pp. 1–22.
42. Areia AL, Mota-Pinto A. Inflammation and Preterm Birth: A Systematic Review. Reproductive Med. 2022;3(2):101–11.
43. Romero R, et al. Inflammation in preterm and term labour and delivery. In: Seminars in Fetal and Neonatal Medicine. Elsevier; 2006.
44. Gonzalez JM, et al. Complement activation triggers metalloproteinases release inducing cervical remodeling and preterm birth in mice. Am J Pathol. 2011;179(2):838–49.
45. Gonzalez JM, Pedroni SM, Girardi G. Statins prevent cervical remodeling, myometrial contractions and preterm labor through a mechanism that involves hemoxygenase-1 and complement inhibition. Mol Hum Reprod. 2014;20(6):579–89.
46. Gonzalez JM, et al. Cervical remodeling/ripening at term and preterm delivery: the same mechanism initiated by different mediators and different effector cells. PLoS ONE. 2011;6(11):e26877.
47. Stygar D, et al. Increased level of matrix metalloproteinases 2 and 9 in the ripening process of the human cervix. Biol Reprod. 2002;67(3):889–94.
48. Choi S-J, et al. Cervicovaginal matrix metalloproteinase-9 and cervical ripening in human term parturition. Eur J Obstet Gynecol Reproductive Biology. 2009;142(1):43–7.
49. Catov JM, et al. Early pregnancy lipid concentrations and spontaneous preterm birth. Am J Obstet Gynecol. 2007;197(6):610.e1–610.e7.
50. Zhang B, et al. Combination of Colchicine and Ticagrelor Inhibits Carrageenan-Induced Thrombi in Mice. Oxidative Med Cell Longev. 2022;2022(1):3087198.
51. Santos S, et al. Impact of maternal body mass index and gestational weight gain on pregnancy complications: an individual participant data meta-analysis of European, North American and Australian cohorts. BJOG: Int J Obstet Gynecol. 2019;126(8):984–95.
52. Zong XN, et al. Maternal pre-pregnancy body mass index categories and infant birth outcomes: a population-based study of 9 million mother–infant pairs. Front Nutr. 2022;9:789833.
53. Kersten I, et al. Chronic diseases in pregnant women: prevalence and birth outcomes based on the SNiP-study. BMC Pregnancy Childbirth. 2014;14(1):75.
Additional Declarations: No competing interests reported.
Editorial history at journal: first submitted 30 Nov, 2025; submission checks completed 01 Dec, 2025; editor assigned 01 Dec, 2025; editor invited 02 Dec, 2025; reviewers invited 11 Dec, 2025; reviewers agreed 11–16 Dec, 2025; reviews received 11–17 Dec, 2025; editorial decision (revision requested) 17 Dec, 2025.
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8240167","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":560827560,"identity":"e9f4191f-5e75-4f05-ae69-28028828e0a4","order_by":0,"name":"Kaleem Maqsood","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABBklEQVRIiWNgGAWjYBACgwPMDQgeD48NkGRsPIBPi2EDI7IWmTSQlga8WowZULTYHAbTeLWYsR9sfPAxZ5s83/nDBx+8yTlvt7b9MNCWGptoXFpseBKbDWduu20480ZasuGcM7eTt51JBGo5lpbbgEsLQ2KbNO+224wbbvCYSfP23E42OwDUwthwGKcWM/6H7b//brttv+H8GaCWf+eSzc4/xK/FWCKxjZlx2+3EDQdyzKR5eA7Ymd0gYIvhjIfNkr3bbidD/MKTnGB2A2hLAh6/GJxPPvjh57bbtn3gEOOxszc7n/7wwYcaG5xaEOAAhEoEq0wgqBxJiz1RikfBKBgFo2BEAQAYOG13jEoi4gAAAABJRU5ErkJggg==","orcid":"","institution":"Lahore garrison University","correspondingAuthor":true,"prefix":"","firstName":"Kaleem","middleName":"","lastName":"Maqsood","suffix":""},{"id":560827567,"identity":"314d5923-668a-4584-88a2-68e34e12de83","order_by":1,"name":"Javeria Malik","email":"","orcid":"","institution":"University of the Punjab","correspondingAuthor":false,"prefix":"","firstName":"Javeria","middleName":"","lastName":"Malik","suffix":""},{"id":560827568,"identity":"a083134b-a61e-4ca7-a25b-9b5fdc3a7ce5","order_by":2,"name":"Mahnoor Fatima","email":"","orcid":"","institution":"University of the 
Punjab","correspondingAuthor":false,"prefix":"","firstName":"Mahnoor","middleName":"","lastName":"Fatima","suffix":""},{"id":560827569,"identity":"edb16017-912c-4997-8c77-d6257f4ffa52","order_by":3,"name":"Sundas Akram","email":"","orcid":"","institution":"University of the Punjab","correspondingAuthor":false,"prefix":"","firstName":"Sundas","middleName":"","lastName":"Akram","suffix":""},{"id":560827571,"identity":"a1f37c61-6313-4a18-b791-9295331496f4","order_by":4,"name":"Husna Ahmad","email":"","orcid":"","institution":"University of the Punjab","correspondingAuthor":false,"prefix":"","firstName":"Husna","middleName":"","lastName":"Ahmad","suffix":""},{"id":560827572,"identity":"d01a095b-4a6d-4007-99ca-a211022af7f8","order_by":5,"name":"Nabila Roohi","email":"","orcid":"","institution":"University of the Punjab","correspondingAuthor":false,"prefix":"","firstName":"Nabila","middleName":"","lastName":"Roohi","suffix":""},{"id":560827574,"identity":"4358340d-5b36-4be0-8f67-f6c8b75faff0","order_by":6,"name":"Shahid Bashir","email":"","orcid":"","institution":"King Fahad Specialist Hospital","correspondingAuthor":false,"prefix":"","firstName":"Shahid","middleName":"","lastName":"Bashir","suffix":""}],"badges":[],"createdAt":"2025-11-30 05:23:12","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8240167/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8240167/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12884-026-08784-0","type":"published","date":"2026-02-23T15:58:20+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":98322513,"identity":"b1187888-9fa4-4a93-baa4-9807f285aaba","added_by":"auto","created_at":"2025-12-16 
14:11:50","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":416643,"visible":true,"origin":"","legend":"","description":"","filename":"FinalmanuscriptMLPTB.docx","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/0c6a1ece4a56cf5e10f3722e.docx"},{"id":98322518,"identity":"c83c27df-173d-45fc-9edc-ce627dbf9dc4","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":8399,"visible":true,"origin":"","legend":"","description":"","filename":"80164575afe64258ae7617d9f8b61e4b.json","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/861069ebceb1b0a0192605d9.json"},{"id":98322525,"identity":"1d1e3d52-7dd2-4ff1-99d5-c064cb1d9afc","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"xml","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":119389,"visible":true,"origin":"","legend":"","description":"","filename":"80164575afe64258ae7617d9f8b61e4b1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/748ec10f5b093ae0e7d8240e.xml"},{"id":98437701,"identity":"aa3c3adf-b0c7-4c65-88d2-cb6f32b44806","added_by":"auto","created_at":"2025-12-17 16:57:35","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":54534,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/9f95c0934e7b5a5c349b1281.png"},{"id":98436861,"identity":"4e294bb6-0fbd-43e0-b8cb-d99333b3b6d5","added_by":"auto","created_at":"2025-12-17 
16:56:22","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":21421,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/d5f72c1c182f0bc6bac2a788.png"},{"id":98436066,"identity":"4c8b7a2f-a25b-4c1b-a84c-839b98996024","added_by":"auto","created_at":"2025-12-17 16:54:50","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":67398,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/44ca05b0471279a92a92a499.png"},{"id":98322523,"identity":"e41e2e52-aebe-484d-8092-7d4b3b91f820","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"png","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":15970,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/9334160b8e2c264818861370.png"},{"id":98436292,"identity":"645458b3-e34a-4eee-bff7-071bc055378f","added_by":"auto","created_at":"2025-12-17 16:55:19","extension":"png","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":20242,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/ba51761629c4290f322d7b00.png"},{"id":98322528,"identity":"209bebe1-fd3b-43c8-b2d2-acddfde0636b","added_by":"auto","created_at":"2025-12-16 
14:11:50","extension":"xml","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":117504,"visible":true,"origin":"","legend":"","description":"","filename":"80164575afe64258ae7617d9f8b61e4b1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/b2966d9c52bd181eb32a8749.xml"},{"id":98322526,"identity":"f99e12cc-841c-4198-a4de-4422727f7781","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"html","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":130859,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/12c634748fe41b9f5e2dbd9b.html"},{"id":98436297,"identity":"7eeb35d8-f34b-462a-aebb-cf99a6419ee2","added_by":"auto","created_at":"2025-12-17 16:55:19","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":170054,"visible":true,"origin":"","legend":"\u003cp\u003eHeatmap showing correlation coefficient matrix. Blue shows a positive relationship while yellow negative relationship. 
The color's brightness indicates the strength of the correlation coefficient; a blue tint indicates a coefficient close to 1 while a yellow tint indicates a coefficient close to -1.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/7cf6ffe3c603ecee1285be53.png"},{"id":98322514,"identity":"614cdbc7-17c7-41b3-acf8-13a2a513e36a","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":62292,"visible":true,"origin":"","legend":"\u003cp\u003eROC curves for all four models on the test set.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/eb0a0616cc991de045d47fcb.png"},{"id":98436832,"identity":"2ba95036-85b0-40ab-ac18-3f0c0c7acffd","added_by":"auto","created_at":"2025-12-17 16:56:18","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":454235,"visible":true,"origin":"","legend":"\u003cp\u003eFeature Importance plot for all ML models.\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/26684026b9ac8dcf46dfc1f7.jpeg"},{"id":98322517,"identity":"2a737c64-a526-41e3-8116-97728194cf5d","added_by":"auto","created_at":"2025-12-16 14:11:50","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":16862,"visible":true,"origin":"","legend":"\u003cp\u003eSHAP bar plot (global mean absolute SHAP values for LR).\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/bd48d54fe168dd0e33ebbeda.png"},{"id":98436847,"identity":"0e2275cb-b59b-44b6-92b1-7d5f5dd62e8f","added_by":"auto","created_at":"2025-12-17 16:56:19","extension":"png","order_by":5,"title":"Figure 
5","display":"","copyAsset":false,"role":"figure","size":73390,"visible":true,"origin":"","legend":"\u003cp\u003eBeeswarm plot showing SHAP summary with magnitude and direction of feature effects.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/6023fac5edc55a8d69397d7a.png"},{"id":103766372,"identity":"3844d1de-b449-4f9f-8693-d1283461f0a6","added_by":"auto","created_at":"2026-03-02 16:14:14","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1919575,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8240167/v1/bce4db62-5f86-466a-8982-ed0ca0573d72.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Comparative Machine Learning Models for Early Prediction of Preterm Birth from Maternal Serum Biomarkers","fulltext":[{"header":"Introduction","content":"\u003cp\u003eGlobally, preterm birth (PTB) is a main healthcare issue due to its strong link with increased neonatal morbidity and mortality. It is clinically defined as the delivery of an infant before 37 gestational weeks [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Around 15\u0026nbsp;million children are born preterm yearly worldwide, with a global 11% preterm birth rate [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e], among these, about 1\u0026nbsp;million die due to PTB complications [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. South Asia and Sub-Saharan Africa account for over 60% of these PTBs worldwide [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. 
In terms of PTB rates, Pakistan (14.3%) is one of the worst affected countries in the world, and in South Asia it is only after Bangladesh (16.2%) [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. The etiology of preterm birth is multifactorial, making it difficult to outline its causative mechanisms. Hence, it remains a central focus of significant research and clinical examination [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Almost half of all cases of preterm birth occur without a clearly identifiable cause, reflecting the characteristic complexity of the condition [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eDuring labour, there is an influx of immune cells which initiates the parturition process characterised by cervical ripening uterine contractility, and membrane rupturing, leads to inflamation [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e], and about 40 and 80% PTBs are associated with inflammation and infection [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e] Immune moleules like IL-6 (high levels at labour onset ) [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], C-reactive protein (CRP), C3, and C4 have been associated with PTB [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. Obesity can lead to maternal morbidities like thromboembolic and hypertensive disorders, gestational diabetes, and infections [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. 
Dyslipidemia is associated with maternal hormonal changes and metabolic syndrome [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. These complications increase the risk of PTB\u003c/p\u003e \u003cp\u003eAt present, testing methods for predicting preterm birth can be categorized into three groups: assessment of maternal risk factors, biochemical biomarkers, and cervical parameters. These diagnostics may not be sensitive enough to detect true-positive PTB cases. For example, biochemical tests are costly and may affect pregnant women psychologically and physically. Analyzing risk factors is a commonly used method that relies on evidence from statistical hypothesis testing. More often than not, the case involves a single-factor test in controlled circumstances. Nonetheless, it is very expensive and time-consuming, and it omits many untested and relevant factors. A previous history of preterm birth is one of the strongest predictors with relative risks of 13.56; these do not predict risk in nulliparous women [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThese observations highlight the limitations of current methods in identifying high-risk pregnancies, particularly among women experiencing their first pregnancy. Although several predictive systems using proven risk factors, maternal demographics, and medical and obstetric history have been explored, their predictive performance has remained limited [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. This may be attributed to their reliance on simple linear statistical models, which are inadequate for capturing the complex and multifactorial nature of preterm birth. 
Consequently, traditional risk-factor-based approaches fail to identify more than half of PTB cases [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. In comparison to logistic regression algorithm, the machine learning (ML) models can process higher-dimensional data and have self-learn capacity [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. Therefore, in recent years, ML techniques have emerged as promising tools for enhancing individual risk prediction beyond traditional models [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]. Moreover, these algorithms have improved the predictive accuracy of the PTB prediction models [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn Pakistan, biochemical tests and fetal fibronectin testing to predict PTB remain costly. Similarly, the uptake of cervical-length ultrasound is low due to a shortage of resources and trained personnel. These is also a lack of PTB prediction guidelines in Pakistan, resulting in delayed identification of women at risk [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. Therefore, this study aimed to evaluate maternal inflammatory and lipid biomarkers as predictors of preterm birth and to compare the predictive performance of these machine-learning models Logistic Regression, Random Forest, XGBoost, and Support Vector Machine [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. We hypothesized that machine learning approaches would outperform traditional regression by better capturing nonlinear interactions among biomarkers. 
This work fills a major knowledge gap by providing the first biomarker-based ML prediction framework for preterm birth in Pakistan, where reliable screening tools remain limited.\u003c/p\u003e"},{"header":"Methodology","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy Design and Population\u003c/h2\u003e \u003cp\u003e This study was approved by the ethics review board of the University of the Punjab, Lahore. Subjects were recruited at Lady Willingdon Hospital and Jinnah Hospital, Lahore, during their antenatal visits in the second and third trimesters. Each pregnancy was followed until delivery, and gestational age at birth was recorded. Of these subjects, 186 women who delivered preterm were included in the PTB group, and the remaining 140 women, who delivered at term, were assigned as controls (term birth).\u003c/p\u003e \u003cp\u003e \u003cstrong\u003eInformed consent\u003c/strong\u003e \u003cp\u003ewas acquired from all subjects after the study's objectives were explained. Detailed information was collected on their sociodemographic and clinical baseline characteristics. Women with medical conditions such as uterine fibroids, polycystic ovarian syndrome, carcinoma, hepatitis, and other undefined pathological conditions were excluded.\u003c/p\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eSample Collection and Serum Analysis\u003c/h3\u003e\n\u003cp\u003eAbout five milliliters (5 mL) of blood was collected from each individual. The blood samples were transferred into labeled collection tubes and centrifuged for 10 minutes at 3000 \u0026times; g to separate the serum for analysis. The serum was aliquoted into labeled cryovials and kept at -80\u0026deg;C until further biochemical or molecular analysis was carried out. 
Serum lipid analysis was performed using commercially available kits from \u0026ldquo;Monlab\u0026rdquo; (Spain).\u003c/p\u003e \u003cp\u003eSerum concentrations of immune markers, including C-reactive protein (CRP), interleukin-6 (IL-6), and complement factor 3 (C3), were measured using enzyme-linked immunosorbent assay (ELISA) kits from Glory Bioscience (China). All reagents and materials were handled according to the manufacturer\u0026rsquo;s biosafety and storage recommendations.\u003c/p\u003e\n\u003ch3\u003eDescriptive and Inferential Statistical Analysis\u003c/h3\u003e\n\u003cp\u003eA one-way ANOVA was conducted to compare biomarker levels between cohorts. When appropriate, Tukey\u0026rsquo;s post hoc tests were run to detect specific differences. A p-value of \u0026lt;\u0026thinsp;0.05 was considered statistically significant. All values are presented as Mean\u0026thinsp;\u0026plusmn;\u0026thinsp;SEM. In addition, a Pearson correlation matrix was computed to examine the inter-relationships between the parameters. The variables included were maternal age, BMI, triglycerides, HDL-C, total cholesterol, LDL-C, IL-6, Complement C3, C-reactive protein, and the outcome (preterm birth). A heatmap visualization was generated using seaborn.\u003c/p\u003e\n\u003ch3\u003eData Processing and Preparation\u003c/h3\u003e\n\u003cp\u003eThe outcome variable was encoded into a binary form: \u003cem\u003e0\u0026thinsp;=\u0026thinsp;term birth\u003c/em\u003e and \u003cem\u003e1\u0026thinsp;=\u0026thinsp;preterm birth\u003c/em\u003e (including both very preterm and extremely preterm cases). Categorical predictors were converted to numerical values via one-hot encoding where necessary. Because some algorithms are sensitive to predictor scale, standardization was applied selectively. For logistic regression and SVM models, continuous predictors were standardized to zero mean and unit variance using StandardScaler after mean imputation. 
For the random forest and XGBoost models, only mean imputation was applied; tree-based methods were trained on unscaled features, as they are inherently scale-invariant.\u003c/p\u003e \u003cp\u003eScaling and imputation were implemented using scikit-learn Pipelines, ensuring that all preprocessing steps were fitted on the training data only and then applied unchanged to the test data.\u003c/p\u003e\n\u003ch3\u003eData Partitioning and Model Development\u003c/h3\u003e\n\u003cp\u003eFor predictive modeling, the dataset was split into training (70%) and test (30%) sets using stratified random sampling to preserve the outcome proportions across groups. Four supervised machine learning algorithms, both traditional and advanced, were used to predict preterm birth outcomes: LR [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e], RF [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e], XGBoost [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e], and SVM [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Logistic Regression was used as the baseline interpretable model. Random Forest and XGBoost were used to capture nonlinear interactions and assess feature importance. The SVM model was included to assess classification efficiency in high-dimensional data spaces.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eModel Evaluation and Validation\u003c/h2\u003e \u003cp\u003eModel performance was evaluated on the independent test dataset. The main evaluation metrics were Accuracy, Area Under Curve (AUC), Precision, Sensitivity, and F1-Score. Each model was evaluated using its own ROC curve to assess discriminative performance. Among the classifiers tested, Logistic Regression achieved the best predictive accuracy, with a mean cross-validated AUC of 0.86. 
To assess model stability and guard against overfitting, five-fold cross-validation was performed.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eFeature Importance Analysis\u003c/h3\u003e\n\u003cp\u003eFor the random forest and XGBoost models, global feature importance was obtained from the fitted models via the feature_importances_ attribute, which reflects the average reduction in node impurity attributable to each variable's splits. Importance scores were normalized and plotted as horizontal bar charts using descriptive feature names.\u003c/p\u003e \u003cp\u003eBecause the RBF-kernel SVM does not yield direct coefficients, permutation feature importance was used to assess variable contributions. Using the trained SVM pipeline, baseline accuracy on the test set was recorded, and each predictor was permuted repeatedly while keeping all other features fixed. The mean decrease in classification accuracy across 30 permutations per feature, along with its standard deviation, was used as the importance score and visualized in a horizontal bar plot.\u003c/p\u003e\n\u003ch3\u003eSHAP (Shapley Additive Explanations) Analysis\u003c/h3\u003e\n\u003cp\u003eTo quantify the relative importance and direction of influence of each predictor in the LR model, SHAP analysis was performed. SHAP values measure the marginal contribution of each feature to the model\u0026rsquo;s output using cooperative game theory, providing both local (individual-level) and global (model-level) interpretability. Summary plots and bar-type feature importance plots were generated using the shap Python library. These plots visually illustrate how increases or decreases in biomarkers contribute to the predicted risk of preterm birth. 
All SHAP visualizations were produced on the test set to avoid information leakage.\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eFive-fold cross-validation\u003c/h2\u003e \u003cp\u003eInternal validation was performed using 5-fold stratified cross-validation on the 70% training set. The data were divided into five folds, preserving the proportions of term and preterm births. In each cycle, four folds were used to train the full preprocessing\u0026ndash;model pipeline and the remaining fold served as the validation set. This process was repeated until every fold had been used once for validation. For each model and fold, accuracy, AUC, sensitivity, specificity, and the F1-score for preterm birth were calculated, and the mean and standard deviation across the five folds were taken as the cross-validated performance estimates.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eSoftware and Tools\u003c/h2\u003e \u003cp\u003eAll conventional statistical analyses were performed using IBM SPSS (Version 20). All computational analyses were performed using Python libraries including pandas, numpy, scikit-learn, matplotlib, seaborn, and xgboost. Graphical illustrations, confusion matrices, ROC curves, and summary tables were generated using Matplotlib and Seaborn. Data integrity and reproducibility were maintained by using the same train-test split across all algorithms, allowing for direct model comparison.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eDemographic and Clinical Characteristics\u003c/h2\u003e \u003cp\u003eMaternal characteristics, lipid markers, and inflammatory biomarkers were compared between mothers who delivered at term and those who delivered preterm. All parameters differed significantly between the two groups. 
The mean maternal age (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and BMI (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) were higher in the preterm birth group than in the term birth group. Among lipid profile markers, women with preterm birth had higher triglycerides, total cholesterol, and LDL cholesterol (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001). Conversely, HDL cholesterol was significantly lower among preterm births (P\u0026thinsp;=\u0026thinsp;0.0283) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eInflammatory biomarkers demonstrated pronounced elevations in the preterm birth group. IL-6 levels (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001), Complement C3 concentrations (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001), and C-reactive protein (CRP) (P\u0026thinsp;\u0026lt;\u0026thinsp;0.001) were markedly higher in preterm deliveries (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eBaseline maternal and pregnancy characteristics in term and preterm birth groups.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e 
\u003cp\u003eParameters\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e \u003cp\u003eMean\u0026thinsp;\u0026plusmn;\u0026thinsp;SD\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003eP value\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTerm Group\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePreterm Group\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMaternal Age\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e27.77\u0026thinsp;\u0026plusmn;\u0026thinsp;4.96\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e30.17\u0026thinsp;\u0026plusmn;\u0026thinsp;5.11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eBody Mass Index (BMI)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e24.99\u0026thinsp;\u0026plusmn;\u0026thinsp;5.28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e29.09\u0026thinsp;\u0026plusmn;\u0026thinsp;6.55\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTriglycerides\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" 
colname=\"c2\"\u003e \u003cp\u003e161.25\u0026thinsp;\u0026plusmn;\u0026thinsp;44.69\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e183.26\u0026thinsp;\u0026plusmn;\u0026thinsp;41.68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHDL Cholesterol\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e60.96\u0026thinsp;\u0026plusmn;\u0026thinsp;20.79\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e55.63\u0026thinsp;\u0026plusmn;\u0026thinsp;21.06\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0283\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTotal Cholesterol\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e174.93\u0026thinsp;\u0026plusmn;\u0026thinsp;38.59\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e192.57\u0026thinsp;\u0026plusmn;\u0026thinsp;36.24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLDL Cholesterol\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e71.59\u0026thinsp;\u0026plusmn;\u0026thinsp;25.37\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e 
\u003cp\u003e81.18\u0026thinsp;\u0026plusmn;\u0026thinsp;23.76\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0008\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eInterleukin-6 (IL-6)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e42.98\u0026thinsp;\u0026plusmn;\u0026thinsp;9.84\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e52.75\u0026thinsp;\u0026plusmn;\u0026thinsp;12.11\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eComplement C3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e65.46\u0026thinsp;\u0026plusmn;\u0026thinsp;20.60\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e79.17\u0026thinsp;\u0026plusmn;\u0026thinsp;21.20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eC-Reactive Protein (CRP)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e6.76\u0026thinsp;\u0026plusmn;\u0026thinsp;1.97\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e9.69\u0026thinsp;\u0026plusmn;\u0026thinsp;3.41\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.0000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e 
\u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eCorrelation Among Maternal Serum Markers and Preterm Birth\u003c/h2\u003e \u003cp\u003eThe study found that inflammatory markers and metabolic factors are positively correlated with preterm birth and other adverse pregnancy outcomes. Serum CRP showed the strongest correlation with preterm birth (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.45), followed by IL-6 (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.40) and complement C3 (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.31). Maternal BMI showed a weak positive correlation (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.32), as did triglycerides (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.25), total cholesterol (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.23), and maternal age (r\u0026thinsp;\u0026asymp;\u0026thinsp;0.23). In contrast, HDL cholesterol showed a weak inverse association with preterm birth, which is consistent with a possible protective effect of higher HDL levels. Overall, these findings suggest that a greater inflammatory burden and an unfavourable metabolic profile are associated with an increased risk of preterm birth (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eLogistic Regression Analysis\u003c/h2\u003e \u003cp\u003eBinary logistic regression was applied to identify the most significant predictors of preterm birth. The multivariable model identified maternal age, BMI, IL-6, C3, and CRP as independent predictors of preterm birth. For each 1-year increase in age, the odds of preterm birth increased by about 11% (OR\u0026thinsp;=\u0026thinsp;1.108, P\u0026thinsp;=\u0026thinsp;0.001), and each 1-kg/m\u0026sup2; increase in BMI raised the odds by about 13% (OR\u0026thinsp;=\u0026thinsp;1.133, P\u0026thinsp;\u0026lt;\u0026thinsp;0.001). 
Inflammatory markers remained strongly associated with risk: IL-6 (OR\u0026thinsp;=\u0026thinsp;1.060, P\u0026thinsp;\u0026lt;\u0026thinsp;0.001), C3 (OR\u0026thinsp;=\u0026thinsp;1.023, P\u0026thinsp;=\u0026thinsp;0.002), and especially CRP (OR\u0026thinsp;=\u0026thinsp;1.338, P\u0026thinsp;\u0026lt;\u0026thinsp;0.001). In contrast, triglycerides showed only a borderline association (P\u0026thinsp;=\u0026thinsp;0.054), while HDL, total cholesterol, and LDL were not significant predictors after adjustment (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eMultivariable logistic regression model for preterm birth.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eParameters\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eβ\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eOR\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e95% CI\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eP 
value\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMaternal Age\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.103\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.108\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.04\u0026ndash;1.18\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.001\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eBody Mass Index (BMI)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.125\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.133\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.07\u0026ndash;1.20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTriglycerides\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.007\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.007\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.00-1.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.054\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eHDL Cholesterol\u003c/b\u003e\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e-0.005\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.995\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.98\u0026ndash;1.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.557\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTotal Cholesterol\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.004\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.004\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.99\u0026ndash;1.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.500\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLDL Cholesterol\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.007\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.007\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.99\u0026ndash;1.02\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.397\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eInterleukin-6 (IL-6)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.058\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e 
\u003cp\u003e1.060\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.03\u0026ndash;1.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eComplement C3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.023\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.023\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.01\u0026ndash;1.04\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.002\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eC-Reactive Protein (CRP)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.291\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.338\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.18\u0026ndash;1.51\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.000\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eTest-Set Performance of Machine Learning Models\u003c/h2\u003e \u003cp\u003eAcross the four models evaluated, all approaches showed good discrimination for preterm birth, with accuracies ranging from 74.49% to 78.57% and ROC-AUC values ranging from 0.819 to 0.849. 
Logistic regression achieved the best overall performance (accuracy 78.57%, ROC-AUC 0.849). SVM had the next-highest accuracy (77.55%) but the lowest ROC-AUC (0.819). The tree-based ensemble methods had slightly lower accuracies of 74.49% and 75.51%, with AUCs of 0.835 and 0.823 for random forest and XGBoost, respectively (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eTest-set performance of ML models for preterm birth prediction.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSensitivity (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSpecificity (%)\u003c/p\u003e \u003c/th\u003e 
\u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLogistic regression\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e78.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e76.79\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e80.95\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.849 (0.77\u0026ndash;0.92)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRandom forest\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e74.49\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e75.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e73.81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.77\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.835 (0.76\u0026ndash;0.91)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eXGBoost\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e75.51\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e76.79\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e73.81\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.823 (0.74\u0026ndash;0.89)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eSVM\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e77.55\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e78.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e76.19\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.819 (0.73\u0026ndash;0.89)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eFive-Fold Cross-Validated Performance (Internal Validation)\u003c/h2\u003e \u003cp\u003eIn 5-fold stratified cross-validation of the training set, all models demonstrated stable and comparable discrimination (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Mean accuracies ranged from 75.86% to 79.39%, with logistic regression showing the highest accuracy (79.39\u0026thinsp;\u0026plusmn;\u0026thinsp;4.24%), closely followed by SVM (78.51\u0026thinsp;\u0026plusmn;\u0026thinsp;4.82%). 
Sensitivity for preterm birth remained around 78\u0026ndash;82% across models, while specificity for term birth ranged from 71\u0026ndash;79%, indicating balanced ability to identify both preterm and term outcomes. The F1-scores for the preterm class were consistently around 0.79\u0026ndash;0.82, reflecting good overall classification of preterm cases. All models achieved cross-validated AUC values between 0.86 and 0.87, confirming robust discriminatory performance and supporting the reliability of the test-set results (Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eFive-fold cross-validated performance of ML models for preterm birth prediction.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAccuracy (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e 
\u003cp\u003eSensitivity (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSpecificity (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eF1-score\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLogistic regression\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e79.39\u0026thinsp;\u0026plusmn;\u0026thinsp;4.24\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e81.54\u0026thinsp;\u0026plusmn;\u0026thinsp;8.21\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e76.63\u0026thinsp;\u0026plusmn;\u0026thinsp;6.68\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.82\u0026thinsp;\u0026plusmn;\u0026thinsp;0.04\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.04\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRandom forest\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e77.61\u0026thinsp;\u0026plusmn;\u0026thinsp;5.15\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e80.00\u0026thinsp;\u0026plusmn;\u0026thinsp;7.46\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e74.53\u0026thinsp;\u0026plusmn;\u0026thinsp;9.20\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.80\u0026thinsp;\u0026plusmn;\u0026thinsp;0.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.87\u0026thinsp;\u0026plusmn;\u0026thinsp;0.02\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eXGBoost\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e75.86\u0026thinsp;\u0026plusmn;\u0026thinsp;3.74\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e79.23\u0026thinsp;\u0026plusmn;\u0026thinsp;5.22\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e71.42\u0026thinsp;\u0026plusmn;\u0026thinsp;5.32\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e \u003cp\u003e0.79\u0026thinsp;\u0026plusmn;\u0026thinsp;0.03\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.03\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eSVM\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c2\"\u003e \u003cp\u003e78.51\u0026thinsp;\u0026plusmn;\u0026thinsp;4.82\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c3\"\u003e \u003cp\u003e78.46\u0026thinsp;\u0026plusmn;\u0026thinsp;8.63\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e \u003cp\u003e78.68\u0026thinsp;\u0026plusmn;\u0026thinsp;7.84\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e 
\u003cp\u003e0.80\u0026thinsp;\u0026plusmn;\u0026thinsp;0.05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c6\"\u003e \u003cp\u003e0.87\u0026thinsp;\u0026plusmn;\u0026thinsp;0.04\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003eFeature Importance Across Models\u003c/h2\u003e \u003cp\u003eAcross all models, inflammatory biomarkers consistently emerged as the strongest predictors of preterm birth. Logistic regression identified CRP as the dominant factor, followed by BMI, maternal age, IL-6, and C3, while lipid measures contributed minimally after adjustment. Feature-importance rankings from the random forest and XGBoost models reinforced this pattern, with CRP, IL-6, and BMI showing the highest contributions, and triglycerides and C3 providing moderate additional value. In contrast, total, HDL, and LDL cholesterol played only a small role. The SVM model showed a slightly different pattern, placing C3 and maternal age highest, followed by BMI, CRP, and IL-6 (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003eModel Explainability Using SHAP for Logistic Regression\u003c/h2\u003e \u003cp\u003eSHAP analysis of the best-performing logistic regression model showed that IL-6, BMI, and CRP were the dominant contributors to predicted preterm birth risk, followed by maternal age and complement C3. At the same time, lipid markers had much smaller average SHAP values (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). 
The SHAP summary plot further illustrated that higher values of IL-6, BMI, CRP, age, and C3 were associated with positive SHAP values, pushing predictions toward preterm birth. In contrast, lower values of these features tended to be protective. Lipid variables clustered around a SHAP value of zero, and higher HDL levels were slightly associated with reduced predicted risk. Together, these results indicate that systemic inflammation and maternal adiposity drive most of the model\u0026rsquo;s discriminative power (Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e; Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eMean absolute SHAP values for each predictor.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFeature\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMean absolute SHAP value\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eInterleukin-6 (IL-6)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.693\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBody Mass Index (BMI)\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.623\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eC-Reactive Protein (CRP)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.619\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMaternal Age\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.334\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComplement C3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.313\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTotal Cholesterol\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.224\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTriglycerides\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.222\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLDL Cholesterol\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.065\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHDL Cholesterol\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.059\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, women who delivered 
preterm had a less favorable inflammatory and metabolic profile. Older age, higher BMI, an atherogenic lipid pattern, and elevated inflammatory markers (IL-6, CRP, and C3) were all associated with preterm birth. In the adjusted models, maternal age, BMI, IL-6, C3, and especially CRP were significant predictors of preterm birth. All four machine-learning models performed reasonably well, with test-set AUCs of 0.82\u0026ndash;0.85. The mean cross-validated AUCs were in the range of 0.86\u0026ndash;0.87. Logistic regression performed similarly to, and occasionally better than, the more complex models. SHAP analysis was also consistent with the regression results, indicating that IL-6, BMI, CRP, maternal age, and C3 were the strongest predictors. These findings suggest that simple, readily obtainable maternal biomarkers can support meaningful early risk assessment.\u003c/p\u003e \u003cp\u003eThe performance of all four machine learning models in our study is comparable to that reported in previous studies that used these algorithms to predict PTB from maternal biomarkers. Teng et al. identified seven major predictors of PTB (ALP, AFP, hemoglobin, urea, sodium, lymphocyte count, and RBCs) using LR and four machine learning models followed by SHAP analysis. The XGBoost model performed best (AUC\u0026thinsp;=\u0026thinsp;0.893), ahead of GBDT (0.879), logistic regression (0.872), and LightGBM (0.840) [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Sun et al. analyzed 9550 pregnant women and found that RF achieved the highest AUC (0.885) and accuracy (0.816) among the algorithms compared for predicting PTB, using predictors such as gestational age, serum inorganic phosphorus, magnesium, platelet volume, waist size, total cholesterol, total bilirubin, triglycerides, and globulins [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. 
In a study by Kloska et al. on maternal data collected from 50 patients, the linear SVM (accuracy: 82%) and the logistic regression model (accuracy: 80%) demonstrated comparable performance [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e], which is in line with our findings.\u003c/p\u003e \u003cp\u003eIn our study, the logistic regression model slightly outperformed the more complex machine learning models. Other studies that compared machine learning models with conventional logistic regression for predicting various clinical conditions also showed that, in general, no single method consistently provides the best prediction [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. Linear models (e.g., linear SVM and logistic regression) work well when data relationships are linear or roughly linear, as is often the case with patients' medical data, e.g., their histories and blood tests [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e]. These linear models offer greater interpretability, which is important in medical settings, and can effectively capture simple, linear relationships between predictors and PTB risk [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Although logistic regression is widely used, it assumes that each predictor is linearly related to the log-odds of the outcome and that predictors are not strongly collinear. Tree-based methods such as random forest and XGBoost, by contrast, are non-parametric and can capture complex, non-linear relationships.\u003c/p\u003e \u003cp\u003eOur finding of elevated CRP, IL-6, and C3 in women who delivered preterm is broadly consistent with the growing literature linking maternal inflammation to preterm birth. Elevated CRP in early or mid-pregnancy has long been associated with shorter gestation. 
IL-6, in turn, is among the most frequently cited cytokines in preterm labour and rupture of membranes. Increased IL-6 and CRP levels may damage membranes and lead to uterine contractions [\u003cspan additionalcitationids=\"CR38 CR39\" citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe inflammatory markers differed significantly between the groups and were strong predictors in both the regression and machine-learning models in this study. In particular, C-reactive protein (CRP) was strongly associated with the outcome. CRP is widely regarded as a readily available marker of inflammatory stress. According to earlier literature, inflammation triggered by bacterial invasion of the uterus [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] and the subsequent activation of maternal and fetal immune responses [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e] is a primary cause of preterm labor and birth. Chorioamnionitis, characterized by inflammation of the fetal membranes, is frequently linked to such infections and leads to elevated levels of pro-inflammatory cytokines, including IL-6 and IL-1β. These mediators enhance uterine contractions and compromise the integrity of the membranes, thereby increasing the risk of premature delivery [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eComplement activation has been less widely studied in general obstetric populations, yet existing literature supports its involvement in adverse pregnancy outcomes. 
The complement system is increasingly recognized as a key contributor to preterm birth, as its activation has been linked to enhanced myometrial contractions, cervical collagen remodeling, and the recruitment of inflammatory immune cells [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe association we observed between higher C3 levels and preterm birth suggests that complement-driven inflammatory pathways may contribute meaningfully to early parturition. This is consistent with earlier reports that the complement cascade facilitates the recruitment and activation of neutrophils and macrophages [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e, \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. In preterm birth, activated macrophages promote the release of matrix metalloproteinase-9 (MMP-9), which degrades cervical collagen, contributing to tissue remodeling, distension, and eventual dilation [\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e, \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. This finding offers a potentially important avenue for further mechanistic and translational research.\u003c/p\u003e \u003cp\u003eWomen who delivered preterm had higher levels of triglycerides, total cholesterol, and LDL cholesterol than women who delivered at term. Previous studies have similarly reported an influence of maternal dyslipidemia on preterm birth. Elevated cholesterol and triglyceride levels in early pregnancy have been associated with a more than 2.8-fold higher risk of delivery before 34 weeks and a two-fold higher risk of delivery at 34\u0026ndash;37 weeks [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. 
These findings suggest that pregnancy-related lipid alterations may reflect underlying inflammatory or infectious processes, as hypertriglyceridemia can function as an innate immune response and heightened inflammatory activity may contribute to hypercholesterolemia. Furthermore, excessive cholesterol levels have been associated with thrombotic events, potentially increasing the likelihood of obstetric complications such as placental abruption, which can precipitate preterm birth [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eNevertheless, most lipid components lost statistical significance after controlling for age, BMI, and inflammatory markers, suggesting that they reflect broader metabolic and inflammatory disturbances rather than act as independent predictors. This interpretation is consistent with evidence that metabolic stress, adiposity, and inflammation are linked during pregnancy. Dyslipidemia is probably indicative of a higher-risk metabolic state, but our data indicate that inflammatory markers and BMI more directly predict preterm birth in this population.\u003c/p\u003e \u003cp\u003eMaternal age and BMI were both independent predictors of preterm birth with clinically relevant effect sizes. Both are well-recognised obstetric risk factors that can influence gestational outcomes through vascular, metabolic, and inflammatory dysfunction. Women who are overweight or obese before pregnancy are at higher risk of gestational hypertension and gestational diabetes [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e], both of which increase the likelihood of labor induction or planned cesarean delivery. 
These obstetric interventions may partly account for the elevated risk of preterm birth observed among individuals with higher BMI [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eGiven the rising trend in advanced maternal age and obesity globally, these findings reinforce the need for strategies that support maternal metabolic health before and during pregnancy. Interventions focused on weight optimisation, improved nutrition, and management of chronic inflammation may help lower the risk of preterm birth, although further evidence from interventional research is needed. Although numerous machine learning models for PTB prediction exist, relatively few have been developed using this biomarker panel, especially in LMICs, where patient demographics and clinical characteristics differ from those in high-income countries. This study therefore provides realistic and reproducible insights for the subcontinent population. Given their AUC and accuracy, applying these machine learning models early in pregnancy could allow timely identification of high-risk individuals and prompt interventions to reduce PTB, particularly in LMICs. Future research should increase sample sizes and incorporate multicenter data to enhance generalizability. Furthermore, clinical validation of machine learning models and evaluation of their real-world effectiveness and cost-efficiency are crucial next steps.\u003c/p\u003e \u003cp\u003eThe four machine-learning models (logistic regression, SVM, random forest, and XGBoost) showed very similar and robust predictive capacity. Logistic regression performed particularly well and offered the advantage of interpretability. These results highlight an important message: the choice of predictors and data quality often matter more than model complexity. 
Earlier research has shown that linear-kernel support vector machines and logistic regression are sensitive to variable scaling, and that appropriate normalization can improve their ability to detect meaningful patterns in the dataset [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe SHAP analysis further strengthened the interpretability of the findings, showing that the same set of variables (IL-6, BMI, CRP, maternal age, and C3) dominated model decision-making across approaches. Lipid markers, in contrast, played a much smaller role. The coherence between statistical and ML-driven insights increases confidence in the robustness of the results.\u003c/p\u003e \u003cp\u003eOur findings lend support to the use of a focused panel of easily measurable biomarkers, namely CRP, IL-6, and C3, in conjunction with age and BMI, to assist in early identification of women at higher risk of preterm birth. Because this approach is simple, it might be introduced into the workflows of clinics that lack advanced diagnostic imaging.\u003c/p\u003e \u003cp\u003eThe models showed similar sensitivity and specificity; the next step is external validation before clinical use. Even so, these results show how biomarker-informed risk stratification could support closer monitoring, timely referral, and targeted preventive measures.\u003c/p\u003e \u003cp\u003eOne of the strengths of this study is its prospective design, which provides access to well-characterized biochemical and clinical data. We applied both conventional and machine learning methods, coupled with the SHAP explainability tool, which allowed us to evaluate both predictive performance and the biological meaning behind the predictions. The primary contribution of this study is the application of machine learning models for PTB prediction paired with SHAP-based interpretability. 
By measuring the contribution of each predictor, SHAP improves the model\u0026rsquo;s transparency and clinical relevance, giving clinicians a clearer and more reliable decision-support tool [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. Transparent interpretability is especially critical for clinicians in LMIC settings because they often work with limited resources, high patient loads, and variable data quality.\u003c/p\u003e \u003cp\u003eThis study also has limitations. The participants were drawn from two tertiary centres and may not reflect the wider community. Although adequate for the study\u0026rsquo;s analyses, the sample size is moderate for machine-learning applications. Because biomarker samples were collected in the second and third trimesters, inference regarding earlier screening is limited. We also did not account for numerous clinical or environmental factors that influence preterm birth. Finally, our models were validated internally only; external validation is still required.\u003c/p\u003e \u003cp\u003eFuture studies should validate these findings in larger and more diverse populations. They could also explore early-pregnancy measurements or longitudinal trajectories of biomarkers and integrate biochemical, clinical, and ultrasound parameters into a joint model. Implementation studies will also help determine whether these models can significantly reduce preterm birth rates or improve outcomes in such cases.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis study shows that preterm birth is closely linked to maternal inflammatory and metabolic status. Higher CRP, IL-6, C3, BMI, and maternal age were the strongest predictors, while lipid markers contributed less after these factors were taken into account. All machine-learning models performed well, with logistic regression matching more complex approaches and offering clearer interpretability. 
A small set of routine biomarkers, combined with basic clinical data, can provide meaningful early prediction of preterm birth, although external validation is needed before clinical use.\u003c/p\u003e"},{"header":"Declarations","content":" \u003cp\u003e \u003cstrong\u003eEthics approval and consent to participate:\u003c/strong\u003e \u003cp\u003e Ethical approval for this study was obtained from the Institutional Ethics Review Board, University of the Punjab, Lahore. Written informed consent was obtained from all participants or their legal guardians.\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003cstrong\u003eConsent for publication:\u003c/strong\u003e \u003cp\u003eNot Applicable\u003c/p\u003e \u003c/p\u003e \u003cp\u003e \u003cstrong\u003eCompeting interests:\u003c/strong\u003e \u003cp\u003eThe authors declare no potential conflicts of interest.\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eFunding:\u003c/h2\u003e \u003cp\u003eThis study received no specific funding.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eAuthors' contributions: KM, JM and MF contributed equally to the study design, data analysis, and manuscript drafting and interpretation. SA and HA supported literature review and manuscript preparation. NR and SB supervised the project, provided critical revisions, and finalized the manuscript.\u003c/p\u003e\u003ch2\u003eAcknowledgements:\u003c/h2\u003e \u003cp\u003eWe gratefully acknowledge the Institute of Zoology, University of the Punjab, Lahore for providing financial support for this study.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eThere is no additional supporting data to be declared. All data generated or analyzed in this study are fully reported within the Results section of the manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eKloska A, et al. Predicting preterm birth using machine learning methods. Sci Rep. 
2025;15(1):5683.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBlencowe H, et al. Chapter 2: \u003cem\u003e15 million preterm births: Priorities for action based on national, regional and global estimates.\u003c/em\u003e Born too soon: The global action report on preterm birth, 2012.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUnited Nations Inter-agency Group for Child Mortality Estimation (UN IGME). \u003cem\u003eLevels \u0026amp; Trends in Child Mortality: Report. Estimates developed by the United Nations Inter-agency Group for Child Mortality Estimation\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWHO. Preterm birth. [cited 2018]. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.who.int/en/news-room/factsheets/detail/preterm-birth\u003c/span\u003e\u003cspan address=\"http://www.who.int/en/news-room/factsheets/detail/preterm-birth\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBlencowe H, et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. Lancet. 2012;379(9832):2162\u0026ndash;72.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoldenberg RL, et al. Epidemiology and causes of preterm birth. Lancet. 2008;371(9606):75\u0026ndash;84.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKoullali B, et al. \u003cem\u003eRisk assessment and management to prevent preterm birth\u003c/em\u003e. In: \u003cem\u003eSeminars in fetal and neonatal medicine\u003c/em\u003e. Elsevier; 2016.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShivkumar PV, Priyadarshani P, Choksi N. Preterm Labor. Labour Room Emergencies. Springer Singapore: Singapore; 2020. pp. 33\u0026ndash;8. A. 
Sharma, Editor.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYoung A, et al. Immunolocalization of proinflammatory cytokines in myometrium, cervix, and fetal membranes during human parturition at term. Biol Reprod. 2002;66(2):445\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRinaldi SF, et al. Anti-inflammatory mediators as physiological and pharmacological regulators of parturition. Expert Rev Clin Immunol. 2011;7(5):675\u0026ndash;96.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMiller FA, et al. Interventions for Infection and Inflammation-Induced Preterm Birth: a Preclinical Systematic Review. Reproductive Sci. 2023;30(2):361\u0026ndash;79.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeimert KB, et al. Inflammatory amplification: a central tenet of uterine transition for labor. Front Cell Infect Microbiol. 2021;11:660983.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGilman-Sachs A, et al. Inflammation induced preterm labor and birth. J Reprod Immunol. 2018;129:53\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMoghaddam Banaem L, et al. Maternal serum C-reactive protein in early pregnancy and occurrence of preterm premature rupture of membranes and preterm birth. J Obstet Gynecol Res. 2012;38(5):780\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLynch AM, et al. Early elevations of the complement activation fragment C3a and adverse pregnancy outcomes. Obstet Gynecol. 2011;117(1):75\u0026ndash;83.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGaltier-Dereure F, Boegner C, Bringer J. Obesity and pregnancy: complications and cost. Am J Clin Nutr. 2000;71(5):S1242\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAghaie Z, Hajian S, Abdi F. The relationship between lipid profiles in pregnancy and preterm delivery: a systematic review. Biomedical Res Therapy. 
2018;5(8):2590\u0026ndash;609.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChatzi L, et al. Metabolic syndrome in early pregnancy and risk of preterm birth. Am J Epidemiol. 2009;170(7):829\u0026ndash;36.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTran T et al. \u003cem\u003ePreterm birth prediction: Deriving stable and interpretable rules from high dimensional data.\u003c/em\u003e arXiv preprint arXiv:1607.08310, 2016.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEsplin MS, et al. Estimating recurrence of spontaneous preterm delivery. Obstet Gynecol. 2008;112(3):516\u0026ndash;23.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMercer B, et al. The preterm prediction study: a clinical risk assessment system. Am J Obstet Gynecol. 1996;174(6):1885\u0026ndash;95.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee KA, et al. A model for prediction of spontaneous preterm birth in asymptomatic women. J Women's Health. 2011;20(12):1825\u0026ndash;31.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGeorgiou HM, et al. Predicting Preterm Labour: Current Status and Future Prospects. Dis Markers. 2015;2015(1):435014.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNgiam KY, Khor IW. Big data and machine learning algorithms for health-care delivery. Lancet Oncol. 2019;20(5):e262\u0026ndash;73.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKoivu A, Sairanen M. Predicting risk of stillbirth and preterm pregnancies with machine learning. Health Inform Sci Syst. 2020;8(1):14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVovsha I et al. \u003cem\u003ePredicting Preterm Birth Is Not Elusive: Machine Learning Paves the Way to Individual Wellness\u003c/em\u003e. in \u003cem\u003eAAAI Spring Symposia\u003c/em\u003e. 2014.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBhutta ZA, et al. 
Reproductive, maternal, newborn, and child health in Pakistan: challenges and opportunities. Lancet. 2013;381(9884):2207\u0026ndash;18.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaalouf M. Logistic regression in data analysis: an overview. Int J Data Anal Techniques Strategies. 2011;3(3):281\u0026ndash;99.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGupta D, Malviya A. \u003cem\u003eSatyendra Singh Performance Analysis of Classification Tree Learning Algorithms.\u003c/em\u003e International Journal of Computer Applications: pp. 0975\u0026ndash;8887.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDevan P, Khare N. An efficient XGBoost\u0026ndash;DNN-based classification model for network intrusion detection system. Neural Comput Appl. 2020;32(16):12499\u0026ndash;514.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAyat N-E, Cheriet M, Suen CY. Automatic model selection for the optimization of SVM kernels. Pattern Recogn. 2005;38(10):1733\u0026ndash;45.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTeng X, et al. Machine learning prediction of preterm birth in women under 35 using routine biomarkers in a retrospective cohort study. Sci Rep. 2025;15(1):10213.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun Q, et al. Machine Learning-Based Prediction Model of Preterm Birth Using Electronic Health Record. J Healthc Eng. 2022;2022(1):9635526.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol. 1996;49(11):1225\u0026ndash;31.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChristodoulou E, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 
2019;110:12\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRocha TAH, et al. Data-driven risk stratification for preterm birth in Brazil: a population-based study to develop of a machine learning risk assessment approach. The Lancet Regional Health\u0026ndash;Americas; 2021. p. 3.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShahshahan Z, Rasouli O. The use of maternal C-reactive protein in the predicting of preterm labor and tocolytic therapy in preterm labor women. Adv biomedical Res. 2014;3(1):154.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePandey M, Chauhan M, Awasthi S. Interplay of cytokines in preterm birth. Indian J Med Res. 2017;146(3):316\u0026ndash;27.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKadivnik M, et al. Role of IL-6, IL-10 and TNFα gene variants in preterm birth. J Clin Med. 2024;13(8):2429.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang S, et al. Elevated C-reactive protein and complement C3 levels are associated with preterm birth: a nested case\u0026ndash;control study in Chinese women. BMC Pregnancy Childbirth. 2020;20(1):131.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJefferson KK. \u003cem\u003eChapter One - The Bacterial Etiology of Preterm Birth\u003c/em\u003e, in \u003cem\u003eAdvances in Applied Microbiology\u003c/em\u003e, S. Sariaslani and G.M. Gadd, Editors. 2012, Academic Press. pp. 1\u0026ndash;22.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAreia AL, Mota-Pinto A. Inflammation and Preterm Birth: A Systematic Review. Reproductive Med. 2022;3(2):101\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRomero R, et al. \u003cem\u003eInflammation in preterm and term labour and delivery\u003c/em\u003e. in \u003cem\u003eSeminars in Fetal and Neonatal Medicine\u003c/em\u003e. Elsevier; 2006.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGonzalez JM, et al. 
Complement activation triggers metalloproteinases release inducing cervical remodeling and preterm birth in mice. Am J Pathol. 2011;179(2):838\u0026ndash;49.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGonzalez JM, Pedroni SM, Girardi G. Statins prevent cervical remodeling, myometrial contractions and preterm labor through a mechanism that involves hemoxygenase-1 and complement inhibition. Mol Hum Reprod. 2014;20(6):579\u0026ndash;89.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGonzalez JM, et al. Cervical remodeling/ripening at term and preterm delivery: the same mechanism initiated by different mediators and different effector cells. PLoS ONE. 2011;6(11):e26877.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStygar D, et al. Increased level of matrix metalloproteinases 2 and 9 in the ripening process of the human cervix. Biol Reprod. 2002;67(3):889\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChoi S-J, et al. Cervicovaginal matrix metalloproteinase-9 and cervical ripening in human term parturition. Eur J Obstet Gynecol Reproductive Biology. 2009;142(1):43\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCatov JM, et al. Early pregnancy lipid concentrations and spontaneous preterm birth. Am J Obstet Gynecol. 2007;197(6):610. e1-610. e7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang B, et al. Combination of Colchicine and Ticagrelor Inhibits Carrageenan-Induced Thrombi in Mice. Oxidative Med Cell Longev. 2022;2022(1):3087198.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSantos S, et al. Impact of maternal body mass index and gestational weight gain on pregnancy complications: an individual participant data meta-analysis of European, North American and Australian cohorts. BJOG: Int J Obstet Gynecol. 2019;126(8):984\u0026ndash;95.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZong Xn, et al. 
maternal pre-pregnancy body mass index categories and infant birth outcomes: a population-based study of 9 million mother\u0026ndash;infant pairs. Front Nutr. 2022;9:789833.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKersten I, et al. Chronic diseases in pregnant women: prevalence and birth outcomes based on the SNiP-study. BMC Pregnancy Childbirth. 2014;14(1):75.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-pregnancy-and-childbirth","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"prch","sideBox":"Learn more about [BMC Pregnancy and Childbirth](http://bmcpregnancychildbirth.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/prch/default.aspx","title":"BMC Pregnancy and Childbirth","twitterHandle":"@BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Inflammation, Preterm birth, Machine learning, Biomarkers","lastPublishedDoi":"10.21203/rs.3.rs-8240167/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8240167/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePreterm birth (PTB) is a major cause of neonatal morbidity and mortality. Inflammation and metabolic disruption are involved in its pathology. 
This study aimed to assess maternal serum inflammatory and lipid markers as predictors of preterm birth using various machine learning models.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePregnant women attending antenatal clinics were recruited for this study. The 186 women who delivered before 37 weeks formed the PTB group. The 140 term-delivery controls were selected at random. T-tests were used to evaluate differences in baseline and clinical parameters. Pearson correlations were visualized via a heatmap. We built random forest (RF), logistic regression (LR), XGBoost, and support vector machine (SVM) models using a 70/30 train/test split and 5-fold cross-validation. Model performance was measured using accuracy and AUC.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eCRP (r ≈ 0.45), IL-6 (r ≈ 0.40), C3 (r ≈ 0.31), BMI, and lipids correlated positively with PTB, whereas HDL correlated inversely (r ≈ −0.13). Multivariable logistic regression identified age, BMI, IL-6, C3, and CRP as independent predictors. All ML models showed good discrimination (test AUC ≥ 0.819); logistic regression performed best (accuracy 78.57%, AUC 0.849) with cross-validated AUCs around 0.86–0.87 across models. SHAP analysis confirmed that IL-6, BMI, CRP, age, and C3 were dominant contributors to PTB risk.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMaternal inflammation and high BMI are important risk factors for preterm birth in this cohort. The logistic regression model combining clinical and serum measures is as good a predictor as complex ML algorithms. 
It is an interpretable model that can help with risk assessment at an early stage in similar settings.\u003c/p\u003e","manuscriptTitle":"Comparative Machine Learning Models for Early Prediction of Preterm Birth from Maternal Serum Biomarkers","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-16 14:11:40","doi":"10.21203/rs.3.rs-8240167/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-12-17T10:39:18+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-17T07:28:55+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"192322270914906116415107256174649767985","date":"2025-12-17T02:23:28+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-16T08:38:53+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-16T05:19:41+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-16T03:18:42+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-15T21:51:17+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"19952706674170692864686835042750195029","date":"2025-12-15T19:49:25+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-15T03:25:58+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-13T13:54:50+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"224905327190088061923407127411948936546","date":"2025-12-13T11:48:53+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"2938184080420535886171700717282502378","date":"2025-12-13T09:06:29+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-12T23:01:01+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content
":"","date":"2025-12-12T15:37:42+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"175862545186536707507055094544835963098","date":"2025-12-12T07:02:26+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"198084718078264943309200202219110702624","date":"2025-12-12T04:12:13+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"96892516073998454432362656978663874927","date":"2025-12-12T01:59:18+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-11T19:06:39+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"221676053317933491597529200814315869143","date":"2025-12-11T15:18:36+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-11T14:10:11+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"103286564890191956118291145054669224400","date":"2025-12-11T14:03:18+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"51660777991536359355477670949410258389","date":"2025-12-11T12:20:10+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"204218641649548435801146827123295739096","date":"2025-12-11T09:48:44+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"38244645176821276514441763502206655974","date":"2025-12-11T09:27:48+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"157243016218881408077387982315039979162","date":"2025-12-11T09:09:15+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"309519328796177205247665433765232145105","date":"2025-12-11T09:01:37+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"198419806223534152249103004001940378880","date":"2025-12-11T08:59:32+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"154329609274424945869235742482841118033","date":"2025-12-11T08:57:52+00:00","index":"hide","fulltext":""},{"type":"revi
ewersInvited","content":"","date":"2025-12-11T08:53:32+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-12-02T17:43:18+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-12-01T23:19:00+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-12-01T23:18:10+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Pregnancy and Childbirth","date":"2025-11-30T05:13:29+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-pregnancy-and-childbirth","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"prch","sideBox":"Learn more about [BMC Pregnancy and Childbirth](http://bmcpregnancychildbirth.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/prch/default.aspx","title":"BMC Pregnancy and Childbirth","twitterHandle":"@BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"9e5bd4b4-fc24-42eb-b04d-e482407097a4","owner":[],"postedDate":"December 16th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-03-02T16:11:00+00:00","versionOfRecord":{"articleIdentity":"rs-8240167","link":"https://doi.org/10.1186/s12884-026-08784-0","journal":{"identity":"bmc-pregnancy-and-childbirth","isVorOnly":false,"title":"BMC Pregnancy and Childbirth"},"publishedOn":"2026-02-23 15:58:20","publishedOnDateReadable":"February 23rd, 2026"},"versionCreatedAt":"2025-12-16 
14:11:40","video":"","vorDoi":"10.1186/s12884-026-08784-0","vorDoiUrl":"https://doi.org/10.1186/s12884-026-08784-0","workflowStages":[]},"version":"v1","identity":"rs-8240167","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8240167","identity":"rs-8240167","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

