Machine Learning in Early Screening for High-Grade Cervical Intraepithelial Neoplasia Using Blood Testing

Research Article (Research Square preprint)

Congbo Yue, Shichao Liu, Wenhua Wang, Yu Zhao, Xiaofeng Zhang, Guanghui Zhao

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7574572/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 18 Dec, 2025; the published version appears in BMC Medical Informatics and Decision Making.
Version 1 posted 30

Abstract

Background: High-grade cervical intraepithelial neoplasia (CIN2/3) is a critical precursor to cervical cancer, yet current screening methods (e.g., HPV testing, colposcopy) face challenges in accessibility and invasiveness, especially in resource-limited settings.
We aimed to develop a non-invasive, machine learning (ML)-based model using routine blood biomarkers to predict high-grade CIN, offering a scalable and cost-effective screening alternative.

Methods: Data from 128 cases of high-grade CIN and 120 cases of low-grade CIN were collected from a hospital in China. A total of 29 clinical characteristics and blood test measurements were considered for use in model development. Four feature selection algorithms (F-test, LASSO regression, decision tree, and random forest) were used to identify key predictors, and 11 machine learning algorithms were employed for model training. The dataset was split into training (70%) and testing (30%) cohorts. Model performance was evaluated using learning curves, receiver operating characteristic (ROC) curves, area under the curve (AUC), Brier score, calibration curves, precision-recall (PR) curves, and decision curve analysis (DCA). A web-based calculator was developed for clinical deployment.

Results: Key features selected for the model included creatinine (CREA), red blood cell count (RBC), neutrophil percentage (NEU%), direct bilirubin (DBIL), and monocyte count (MON). The Support Vector Machine (SVM) algorithm achieved the best predictive performance, with an AUC of 0.75 and a Brier score of 0.21. The web tool (https://dvhl6xsf29zmdewixjx7kz.streamlit.app) provides real-time risk stratification.

Conclusions: The model demonstrated moderate predictive performance across the validation metrics, indicating potential clinical utility. We also developed a web-based calculator to estimate the risk of high-grade CIN.

Keywords: Cervical intraepithelial neoplasia; Machine learning; Blood biomarkers; Prediction model; Decision support tool

Background

Cervical intraepithelial neoplasia (CIN) is a precursor to cervical cancer, with the transition from low-grade (CIN1) to high-grade (CIN2/3) lesions representing a critical step in the progression to invasive cervical cancer [1].
High-grade CIN, particularly CIN2 and CIN3, is a precursor of cervical cancer, which is among the leading causes of morbidity and mortality in women worldwide [2]. In 2020, there were approximately 604,000 new cases of cervical cancer and 342,000 related deaths globally, with disproportionately high rates in certain regions [3]; although cervical cancer rates have declined in high-income countries owing to widespread use of screening programs, the disease remains a major public health challenge in low- and middle-income countries [4]. Early detection and accurate histopathological classification of cervical high-grade squamous intraepithelial lesions (HSIL) are critical to preventing progression to invasive cervical cancer. Several techniques have been employed to detect and diagnose cervical HSIL, including human papilloma virus (HPV) DNA testing, cytology, colposcopy, and biopsy [5, 6]. HPV testing is a non-invasive and highly sensitive method for identification of women at risk of developing cervical cancer [7, 8]. Colposcopy, an optical examination of the cervix, can provide more definitive information but involves subjective interpretation and often requires specialized training [9]. Biopsy, the gold standard for diagnosis of cervical HSIL, is invasive and not suitable for widespread use owing to its cost and requirement for skilled professionals [10]. Moreover, although these methods are in widespread use, they are often not seamlessly integrated in an automated and cost-effective process, especially in resource-limited regions. Recent advances in machine learning show potential for enhancing the accuracy of detection and prediction of high-grade CIN. Machine learning techniques have shown efficacy in analysis of large-scale datasets, enabling identification of complex patterns and features that may be overlooked with traditional diagnostic methods [11]. 
Numerous studies have demonstrated the potential of machine learning models in medical diagnostics, particularly in the fields of imaging and pathology, in which they contribute to increased diagnostic precision and reproducibility [12–15]. However, there is a research gap regarding specific applications of machine learning for prediction of high-grade CIN, with few comprehensive studies having developed and validated models for this purpose [16].

In the present study, we aimed to develop and validate a predictive model for high-grade CIN detection using machine learning, thereby addressing critical gaps in current screening techniques. By employing a machine learning approach, we sought to provide an effective, accessible, and standardized method for high-grade CIN prediction that could reduce reliance on subjective interpretation of cytology and histology results and promote early detection. The predictive model developed in this study was trained and validated on separate datasets, and its performance was evaluated using key validation indicators including sensitivity, specificity, and AUC to optimize its diagnostic accuracy. We thus present an evidence-based, machine-learning-driven diagnostic tool that could facilitate earlier intervention. Ultimately, this may reduce the incidence and mortality of cervical cancer, particularly in resource-limited settings [17, 18].

Methods

Study population and design

For this study, we prospectively collected 128 cases of high-grade CIN (CIN2/3) and 120 cases of low-grade CIN (CIN1) from the gynecology department of Peking University People’s Hospital, Qingdao, China, from January 1, 2024, to May 31, 2025.
The inclusion criteria were as follows: (1) patients diagnosed with high-grade CIN or low-grade CIN based on histopathological examination of biopsy specimens; (2) patients who had not undergone any prior treatment for cervical lesions (e.g., loop electrosurgical excision procedure, conization, or cryotherapy) [19]; and (3) patients with complete and available relevant blood test data. The exclusion criteria were: (1) previous treatment for cervical lesions; (2) insufficient or missing data; or (3) other gynecological conditions that could interfere with the assessment of cervical lesions (e.g., severe pelvic inflammatory disease or endometrial cancer) [20]. This study was approved by the Ethics Committee of Peking University People’s Hospital, Qingdao.

Data collection and feature selection

For this study, we collected baseline characteristic data, including patient age, CIN grade, and various predictor variables comprising routine hematological and biochemical parameters. The routine hematological parameters included red blood cell count (RBC), white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), neutrophil ratio (NEU%), lymphocyte ratio (LYM%), monocyte ratio (MON%), neutrophil count (NEU), lymphocyte count (LYM), monocyte count (MON), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean platelet volume (MPV), and large platelet ratio (P_LCR). The routine biochemical parameters included indirect bilirubin (IBIL), direct bilirubin (DBIL), total bilirubin (TBIL), total protein (TP), globulin (GLB), albumin (ALB), aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT), alkaline phosphatase (ALP), creatinine (CREA), and urea (UREA). Together with age, these 16 hematological and 12 biochemical parameters make up the 29 candidate features. All laboratory data, as well as age, were standardized and used for analysis.
After data collection, four algorithms (F-test, LASSO regression, decision tree, and random forest) were used to screen the 29 features included in the study. Subsequently, Pearson correlation analysis and LASSO regression were used to identify the most suitable features for construction of a simplified predictive model.

Machine learning algorithms

A total of 11 machine learning algorithms were used for model development and evaluation: Naive Bayes (NB), K-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT), Artificial Neural Network (ANN), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost).

Model establishment and evaluation

All modeling was implemented in Python (3.10). All samples were randomly divided into a training cohort and a testing cohort in a 7:3 ratio. Scikit-learn (version 1.2.2) was used for data splitting, with stratified sampling to ensure the class distribution remained consistent between the training and testing cohorts. The 11 machine learning algorithms were built using Python libraries (scikit-learn 1.2.2, XGBoost 1.7.4, LightGBM 4.0.0). Grid search and ten-fold cross-validation were used to select the optimal hyperparameters for each algorithm. We compared the 11 algorithms using the selected predictor features and identified the best-performing model. The effectiveness of the algorithms was evaluated using ROC curves plotted with matplotlib (3.7.1) and learning curves plotted with scikit-plot (0.3.7). The optimal diagnostic model was selected by considering both the AUC and the behavior of each algorithm's learning curve. Predictive accuracy was evaluated using the ROC AUC and calibration curve. Decision curve analysis (DCA) was employed to assess the clinical usefulness and net benefit of the models.
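As a point of reference for the net benefit used in DCA, the conventional definition at a risk threshold p_t is NB = TP/n − (FP/n) × p_t/(1 − p_t), compared against a "treat all" strategy and a "treat none" strategy (net benefit 0). The sketch below is a minimal NumPy illustration of that definition on toy values; the function names and the data are hypothetical and are not taken from the study's code.

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every case whose predicted risk is >= threshold."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = y_true.size
    treated = y_prob >= threshold
    true_pos = np.sum(treated & (y_true == 1))
    false_pos = np.sum(treated & (y_true == 0))
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * weight

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every case."""
    prevalence = np.mean(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy labels (1 = high-grade) and predicted risks; illustrative values only.
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.8, 0.3])

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the three strategies together.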
Precision–recall (PR) curves were also used, as they provide a more informative indication of performance than accuracy or ROC evaluation when positive class prediction is of primary interest. Details of the study design are displayed in Figure 1.

Web application development

To bridge the gap between research and clinical practice, we developed a user-friendly web application based on the final prediction model using the Python Streamlit framework. This tool allows healthcare professionals to input patient data and receive real-time predictions of the probability of high-grade CIN. It can also be used to generate a force plot for each participant, providing a visual representation of how different features contribute to the prediction.

Statistical processing

Data analysis was conducted using SPSS 23.0 software. The normality of the data was assessed using the Shapiro–Wilk (S-W) test. For data that followed a normal distribution, results are expressed as mean ± standard deviation (x̄ ± SD), and the homogeneity of variance between two groups was tested using Levene’s test. If homogeneity of variance was assumed, independent-samples t-tests were performed; otherwise, Welch’s ANOVA, a parametric test that includes adjustment for unequal variances, was applied. For data that did not follow a normal distribution, results are presented as the median and interquartile range (IQR), and comparisons between two groups were made using the Mann–Whitney U test. Statistical significance for all tests was determined at a threshold of P < 0.05.

Results

Clinical characteristics of the training cohort and testing cohort

A total of 248 patients with high-grade or low-grade CIN were included in the study and randomly divided into a training cohort and testing cohort. The baseline characteristics of patients in the training and testing cohorts, which were crucial for assessing group comparability, are shown in Table 1.
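The cohort comparisons in Table 1 follow the test-selection rules given under Statistical processing. The sketch below illustrates that decision flow with SciPy rather than SPSS, which is a substitution made here for illustration only. The function name `compare_two_groups` and the synthetic data are hypothetical. For two groups, Welch's t-test is used in place of Welch's ANOVA, since the two give the same p-value in that case.

```python
import numpy as np
from scipy import stats

def compare_two_groups(a, b, alpha=0.05):
    """Choose and run a two-group test: Shapiro-Wilk, then Levene, then t or U."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Step 1: normality of each group (Shapiro-Wilk).
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    if not normal:
        # Non-normal data: rank-based Mann-Whitney U test.
        res = stats.mannwhitneyu(a, b, alternative="two-sided")
        return "Mann-Whitney U", res.statistic, res.pvalue
    # Step 2: homogeneity of variance (Levene's test centered on the mean).
    equal_var = stats.levene(a, b, center="mean").pvalue > alpha
    # Step 3: Student's t-test if variances are equal, Welch's otherwise.
    res = stats.ttest_ind(a, b, equal_var=equal_var)
    return ("Student t" if equal_var else "Welch t"), res.statistic, res.pvalue

rng = np.random.default_rng(0)
# Synthetic right-skewed values standing in for a non-normal laboratory marker.
skewed_train = rng.exponential(scale=1.0, size=173)
skewed_test = rng.exponential(scale=1.0, size=75)
test_name, statistic, p_value = compare_two_groups(skewed_train, skewed_test)
print(test_name, round(float(p_value), 3))
```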
The training cohort consisted of 173 cases (approximately 69.8% of the total sample), whereas the testing cohort comprised 75 cases (approximately 30.2%). The distribution of high-grade and low-grade CIN cases was well balanced between the two cohorts; this ensured consistency in outcome proportions across both groups.

Table 1. Perioperative Statistical Data of Participants

| Variable | Summary | Train (n = 173) | Test (n = 75) | Statistic | P |
|---|---|---|---|---|---|
| WBC | Mean ± SD | 5.54±1.50 | 5.59±1.31 | t=-0.241 | 0.810 |
| RBC | Mean ± SD | 4.29±0.38 | 4.38±0.35 | t=-1.630 | 0.104 |
| HGB | Mean ± SD | 129.35±12.40 | 132.33±10.72 | t=-1.809 | 0.072 |
| PLT | Mean ± SD | 245.38±55.27 | 249.43±62.35 | t=-0.510 | 0.611 |
| NEU% | Mean ± SD | 60.04±8.67 | 59.63±8.86 | t=0.344 | 0.731 |
| LYM% | Mean ± SD | 32.29±7.81 | 32.70±8.32 | t=-0.372 | 0.710 |
| MON% | Mean ± SD | 5.42±1.47 | 5.35±1.35 | t=0.338 | 0.735 |
| LYM | Mean ± SD | 1.74±0.47 | 1.77±0.43 | t=-1.773 | 0.584 |
| HCT | Mean ± SD | 38.40±3.17 | 39.15±2.77 | t=-1.773 | 0.077 |
| MCV | Mean ± SD | 89.70±6.01 | 89.66±5.02 | t=0.048 | 0.962 |
| MCH | Mean ± SD | 30.21±2.49 | 30.31±2.11 | t=-0.309 | 0.758 |
| MCHC | Mean ± SD | 336.51±8.82 | 337.75±7.83 | t=-1.049 | 0.295 |
| MPV | Mean ± SD | 9.84±0.95 | 9.90±1.15 | t=-0.455 | 0.649 |
| TP | Mean ± SD | 73.34±4.29 | 73.64±4.19 | t=-0.515 | 0.607 |
| ALB | Mean ± SD | 45.32±2.43 | 45.76±2.29 | t=-1.339 | 0.182 |
| GLB | Mean ± SD | 28.02±3.22 | 27.88±3.02 | t=0.316 | 0.752 |
| UREA | Mean ± SD | 4.84±1.13 | 5.18±1.19 | t=-2.139 | 0.033 |
| CREA | Mean ± SD | 56.63±8.47 | 55.66±10.64 | t=0.764 | 0.446 |
| Age | M (Q1, Q3) | 38.00 (32.00, 49.50) | 37.00 (31.00, 51.00) | Z=-0.781 | 0.435 |
| NEU | M (Q1, Q3) | 3.28 (2.54, 3.98) | 3.26 (2.54, 4.01) | Z=-0.220 | 0.826 |
| MON | M (Q1, Q3) | 0.28 (0.23, 0.33) | 0.28 (0.24, 0.33) | Z=-0.441 | 0.659 |
| TBIL | M (Q1, Q3) | 11.40 (9.00, 14.45) | 10.90 (9.50, 13.30) | Z=-0.765 | 0.444 |
| DBIL | M (Q1, Q3) | 3.20 (2.60, 4.25) | 3.10 (2.60, 4.30) | Z=-0.786 | 0.432 |
| P_LCR | M (Q1, Q3) | 24.20 (20.05, 28.65) | 24.80 (19.30, 31.00) | Z=-0.384 | 0.701 |
| IBIL | M (Q1, Q3) | 8.00 (6.27, 10.20) | 7.71 (6.20, 9.60) | Z=-0.589 | 0.556 |
| ALT | M (Q1, Q3) | 15.40 (11.25, 20.35) | 16.70 (12.90, 22.10) | Z=-1.869 | 0.062 |
| AST | M (Q1, Q3) | 17.30 (15.00, 20.00) | 18.55 (15.10, 22.50) | Z=-1.706 | 0.088 |
| GGT | M (Q1, Q3) | 17.40 (13.75, 22.55) | 17.00 (14.20, 22.80) | Z=-2.78 | 0.781 |
| ALP | M (Q1, Q3) | 60.00 (48.65, 76.55) | 59.60 (49.00, 73.00) | Z=-0.09 | 0.993 |

t: t-test; Z: Mann–Whitney test; SD: standard deviation; M: median; Q1: 1st quartile; Q3: 3rd quartile.

Characteristic indicators and screening results from different algorithms

We generated a Venn diagram based on the results of the four algorithms (F-test, LASSO regression, decision tree, and random forest) and found that the following seven features were selected by three or more algorithms: CREA, RBC, HCT, NEU%, DBIL, TBIL, and MON (Figure 2A). Pearson correlation coefficient analysis was performed to evaluate pairwise correlations among these seven features. As shown in Figure 2B, strong positive correlations were observed between HCT and RBC, as well as between DBIL and TBIL, with Pearson correlation coefficients exceeding 0.6. Figure 2C presents the results of the LASSO regression analysis, which was used to reduce overfitting by selecting the most relevant features (CREA, RBC, NEU%, DBIL, and MON) for subsequent model construction.

Predictive accuracy of the developed model for diagnosis of HSIL

The five selected indicators were modeled and evaluated using 11 different algorithms. Participants were randomly divided into a training set of 173 cases and a test set of 75 cases at a ratio of 7:3. As shown in Figure 3A and B, the model showed satisfactory overall performance, even after a reduction in the number of indicators. SVM was identified as the best learning algorithm; as shown in Figure 4A–E, the learning curve, calibration curve, decision curve, PR curve, and ROC curve all demonstrated the strong performance of the model constructed using the SVM algorithm.
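The split-and-tune procedure behind these results (stratified 7:3 split, grid search with ten-fold cross-validation, then AUC and Brier score on the held-out cohort) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the hyperparameter grid is a hypothetical example, and placing the scaler inside the pipeline (so that it is refit within each fold) is a design choice made here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 248 patients with five predictors; not patient data.
X, y = make_classification(
    n_samples=248, n_features=5, n_informative=3, n_redundant=0,
    weights=[120 / 248, 128 / 248], random_state=0,
)

# Stratified 7:3 split keeps the class proportions similar in both cohorts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scaling inside the pipeline avoids leaking test-fold statistics into training.
pipeline = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))

# Hypothetical grid; the grid actually searched is not given in the text.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out testing cohort.
prob = search.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, prob)
brier = brier_score_loss(y_test, prob)
print(f"train={len(y_train)}, test={len(y_test)}, AUC={auc:.2f}, Brier={brier:.2f}")
```

With 248 samples and `test_size=0.3`, scikit-learn assigns 75 samples to the test cohort and 173 to the training cohort, which matches the cohort sizes reported above.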
Specifically, the AUC was 0.75, and the Brier score was 0.21. To better elucidate the contributions of different factors to the prediction ability of our SVM-based model, we performed further evaluation using SHAP. The results indicated that RBC made the most significant contribution to the performance of the model, followed by CREA, NEU%, MON, and DBIL (Figure 4F).

Clinical applications

To facilitate clinical applications of the SVM model, we developed a user-friendly web application (https://dvhl6xsf29zmdewixjx7kz.streamlit.app). This application allows clinicians to input values for the five key features and automatically predicts the risk of high-grade CIN for individual patients. A screenshot of the application is shown in Figure 5.

Discussion

In this study, we developed and validated a machine learning model using routine blood biomarkers to predict high-grade CIN. The SVM-based model achieved an AUC of 0.75 and a well-calibrated Brier score of 0.21, demonstrating robust diagnostic potential. Five key features (RBC, CREA, NEU%, DBIL, and MON) emerged as optimal predictors and were combined in a model that offers a minimally invasive, cost-effective tool for early risk stratification.

Traditional high-grade CIN detection relies on HPV testing, cytology, colposcopy, and biopsy; however, the use of these techniques is hindered by their limited accessibility, as well as their subjectivity and invasiveness, particularly in resource-constrained regions [21]. Our model addresses these limitations by leveraging ubiquitous blood parameters, enabling detection without the need for specialized equipment or expertise. The biomarkers identified here are all relevant to the pathophysiology of cervical cancer: RBC and MON reflect anemia and chronic inflammation, CREA is correlated with metabolic dysregulation, and NEU% is an indicator of immune response [22–24]. This approach complements existing methods by providing a scalable, objective, first-line screening tool.
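To put the reported Brier score in context, it can help to compute the score of a non-informative reference that predicts the overall prevalence for every patient. The sketch below does this for the study's class counts (128 high-grade, 120 low-grade). It uses the whole-sample prevalence as an approximation for the testing cohort, which is an assumption, and the helper name `brier_score` is illustrative.

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

# Class counts reported in the study: 128 high-grade, 120 low-grade.
n_high, n_low = 128, 120
y = np.concatenate([np.ones(n_high), np.zeros(n_low)])

# Reference model: predict the prevalence for everyone; score is p * (1 - p).
prevalence = y.mean()
reference = brier_score(y, np.full_like(y, prevalence))
print(f"prevalence={prevalence:.3f}, reference Brier={reference:.4f}")
```

Under these assumptions the reference value is about 0.25, so the reported score of 0.21 lies below it; lower values indicate better probability estimates.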
Although studies of machine learning models using imaging or genomic data have reported AUC values of 0.80–0.95 [25], the performance of our blood-based model (AUC 0.75) remains clinically meaningful given its practicality. Blood tests are routinely performed at low cost and can be integrated seamlessly into primary care workflows; these aspects are critical for expanding screening in low- and middle-income countries, in which the burden of cervical cancer is highest [26]. The strong calibration (Brier score of 0.21) further supports the clinical utility of our model, indicating reliable probability estimates for individual risk assessment.

To support clinical applications, we have created an intuitive web application using the Streamlit framework to make diagnosis of high-grade CIN more accessible to healthcare providers. By utilizing simple and commonly available clinical indicators, the tool provides fast, real-time predictions without requiring specialized hardware or high-end computing equipment. Its user-friendly interface also enables smooth integration into standard clinical workflows, thereby promoting early detection and timely intervention for high-grade CIN across diverse healthcare environments.

Although the model shows promise, it is important to note some limitations related to sample size and potential for overfitting. For instance, Rahimi et al. have demonstrated the importance of large, diverse datasets in ensuring the generalizability and robustness of predictive models [27]. The current study had a relatively small sample size (n = 248), which may have limited the generalizability of the model across different populations. In addition, the model relies solely on blood-based features, whereas other critical factors, including HPV status and histopathological characteristics, are known to influence CIN progression [28].
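One way to see how sample size bears on these limitations is to approximate the uncertainty of an AUC estimated on a testing cohort of 75 cases. The sketch below uses the Hanley and McNeil approximation for the standard error of an AUC. It is an illustrative calculation only: the split of the 75 test cases into 39 high-grade and 36 low-grade is assumed from the overall 128:120 ratio and is not reported in the text, and the resulting interval is not a result of the study.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

# Assumed testing-cohort composition (hypothetical; see the note above).
n_pos, n_neg = 39, 36
auc = 0.75

se = hanley_mcneil_se(auc, n_pos, n_neg)
ci_low, ci_high = auc - 1.96 * se, auc + 1.96 * se
print(f"SE={se:.3f}, approximate 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

Under these assumptions the approximate 95% interval spans roughly 0.64 to 0.86, which illustrates why a larger validation cohort would give a more precise estimate.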
Future research should therefore focus on increasing sample size and integrating a wider range of clinical and molecular markers to enhance the accuracy and robustness of the model.

Conclusions

In this study, we successfully developed and validated a non-invasive, machine-learning-based model that utilizes readily available, routine blood test parameters (specifically CREA, RBC, NEU%, DBIL, and MON) to predict high-grade CIN (CIN2/3). The SVM model demonstrated moderate predictive accuracy (AUC of 0.75) and calibration (Brier score of 0.21) and thus represents a cost-effective and accessible means of identifying women at higher risk. This approach, which has been translated into a web-based calculator, shows significant potential to enhance cervical cancer prevention by enabling early detection and prioritizing women needing more intensive follow-up, particularly in resource-limited settings.

Declarations

Ethics approval and consent to participate

This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Peking University People's Hospital Qingdao Hospital (Approval No.: 2025PHQDB022-01). The need for informed consent was waived by the Ethics Committee of Peking University People's Hospital Qingdao Hospital because this was a retrospective analysis of anonymized data.

Consent for publication

Written informed consent for publication was obtained from all participants.

Availability of data and materials

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Competing interests

The authors declare that they have no competing interests.

Funding

This study was sponsored by the National High Performance Medical Device Innovation Center Project (No. NMED2025KF-01-005) and the Application of Intelligent Mutual Recognition of Inspection Results Project (No. JYHRXZ2025B06).

Authors' contributions

C.Y. and S.L.
designed the study, performed data analysis, and wrote the original manuscript. X.Z. collected and curated clinical data. W.W. developed the methodology and validated results. Y.Z. reviewed pathological classifications. G.Z. supervised the project, acquired funding, and revised the manuscript. All authors reviewed the manuscript. Note : C.Y. and S.L. contributed equally as co-first authors. Acknowledgements This study was supported by the National High Performance Medical Device Innovation Center Project (No. NMED2025KF-01-005) and the Application of Intelligent Mutual Recognition of Inspection Results Project (No. JYHRXZ2025B06). We sincerely thank the medical staff at Peking University People's Hospital Qingdao Hospital for their assistance in data collection, as well as all patients who participated in this research. References Soergel P, Dahl GF, Onsrud M, Hillemanns P. Photodynamic therapy of cervical intraepithelial neoplasia 1–3 and human papilloma virus (HPV) infection with methylaminolevulinate and hexaminolevulinate—a double‐blind, dose‐finding study. Lasers Surg Med. 2012;44:468–74. Saslow D, Solomon D, Lawson HW, Killackey M, Kulasingam SL, Cain J, et al. American Cancer Society, American Society for Colposcopy and Cervical Pathology, and American Society for Clinical Pathology screening guidelines for the prevention and early detection of cervical cancer. CA Cancer J Clin. 2012;62:147–72. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209–49. Petersen Z, Jaca A, Ginindza TG, Maseko G, Takatshana S, Ndlovu P, et al. Barriers to uptake of cervical cancer screening services in low-and-middle-income countries: a systematic review. BMC Womens Health. 2022;22:486. Mao C, Balasubramanian A, Koutsky LA. Should liquid-based cytology be repeated at the time of colposcopy? 
J Low Genit Tract Dis. 2005;9:82–8. Zhang L, Tian P, Li B, Xu L, Qiu L, Bi Z, et al. Risk‐stratified management of cervical high‐grade squamous intraepithelial lesion based on machine learning. J Med Virol. 2024;96:70016. Arbyn M, Weiderpass E, Bruni L, de Sanjosé S, Saraiya M, Ferlay J, et al. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Glob Health. 2020;8:e191–203. Castle PE, Cremer M. Human Papillomavirus Testing in Cervical Cancer Screening. Obstet Gynecol Clin North Am. 2013;40:377–90. Steinauer JE, Turk JK, Pomerantz T, Simonson K, Learman LA, Landy U. Abortion training in US obstetrics and gynecology residency programs. Am J Obstet Gynecol. 2018;219:86.e1–6. Stoler MH. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA. 2001;285:1500. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–8. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44–56. Rajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, Mehta H, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018;15:e1002686. Bejnordi BE, Veta M, Van Diest PJ, Van Ginneken B, Karssemeijer N, Litjens G, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017:2199–210. Mantula F, Toefy Y, Sewram V. Barriers to cervical cancer screening in Africa: a systematic review. BMC Public Health. 2024;24:525. Rahimi M, Akbari A, Asadi F, Emami H. 
Cervical cancer survival prediction by machine learning algorithms: a systematic review. BMC Cancer. 2023;23:341. Liu Y, Chen PC, Krause J, Peng L. How to read articles that use machine learning: users' guides to the medical literature. JAMA. 2019;322:1806–16. Castle PE, Sideri M, Jeronimo J, Solomon D, Schiffman M. Risk assessment to guide the prevention of cervical cancer. Am J Obstet Gynecol. 2007;197:356.e1–6. Massad LS, Einstein MH, Huh WK, Katki HA, Kinney WK, Schiffman M, et al. 2012 updated consensus guidelines for the management of abnormal cervical cancer screening tests and cancer precursors. J Low Genit Tract Dis. 2013;17(Supplement 1):S1–27. Arbyn M, Sankaranarayanan R, Muwonge R, Keita N, Dolo A, Mbalawa CG, et al. Pooled analysis of the accuracy of five cervical cancer screening tests assessed in eleven studies in Africa and India. Int J Cancer. 2008;123:153–60. Purde M-T, Nock S, Risch L, Medina Escobar P, Grebhardt C, Nydegger UE, et al. The cystatin C/creatinine ratio, a marker of glomerular filtration quality: associated factors, reference intervals, and prediction of morbidity and mortality in healthy seniors. Transl Res. 2016;169:80–90.e2. Pan Y-P, Fang Y-P, Xu Y-H, Wang Z-X, Shen J-L. The diagnostic value of procalcitonin versus other biomarkers in prediction of bloodstream infection. Clin Lab. 2017;63(02/2017):277–85. Hou F, Qiao Y, Qiao Y, Shi Y, Chen M, Kong M, et al. A retrospective analysis comparing metagenomic next-generation sequencing with conventional microbiology testing for the identification of pathogens in patients with severe infections. Front Cell Infect Microbiol. 2025;15:1530486. Batra U, Nathany S, Nath SK, Jose JT, Sharma T, P P, et al. AI-based pipeline for early screening of lung cancer: integrating radiology, clinical, and genomics data. Lancet Reg Health Southeast Asia. 2024;24:100352. Xu M, Lin Z, Siegel CE, Laska EM, Abu-Amara D, Genfi A, et al. 
Screening for PTSD and TBI in veterans using routine clinical laboratory blood tests. Transl Psychiatry. 2023;13:64. Mohammad‐Rahimi H, Sohrabniya F, Ourang SA, Dianat O, Aminoshariae A, Nagendrababu V, et al. Artificial intelligence in endodontics: data preparation, clinical applications, ethical considerations, limitations, and future directions. Int Endod J. 2024;57:1566–95. Castle PE, Sideri M, Jeronimo J, Solomon D, Schiffman M. Risk assessment to guide the prevention of cervical cancer. Am J Obstet Gynecol. 2007;197:356.e1–6. Additional Declarations No competing interests reported. Cite Share Download PDF Status: Published Journal Publication published 18 Dec, 2025 Read the published version in BMC Medical Informatics and Decision Making → Version 1 posted Editorial decision: Revision requested 06 Oct, 2025 Reviewers agreed at journal 24 Sep, 2025 Reviewers agreed at journal 22 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviews received at journal 21 Sep, 2025 Reviewers agreed at journal 21 Sep, 2025 Reviews received at journal 20 Sep, 2025 Reviewers agreed at journal 20 Sep, 2025 Reviewers agreed at journal 20 Sep, 2025 Reviews received at journal 20 Sep, 2025 Reviewers agreed at journal 20 Sep, 2025 Reviewers agreed at journal 20 Sep, 2025 Reviews received at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviews received at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviewers agreed at journal 19 Sep, 2025 Reviewers agreed at journal 17 Sep, 2025 Reviewers invited by journal 12 Sep, 2025 Editor invited by journal 12 Sep, 2025 Editor assigned by journal 11 Sep, 2025 Submission checks completed at journal 11 Sep, 2025 First submitted 
to journal 09 Sep, 2025 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7574572","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":518369668,"identity":"2af72e95-aeeb-413f-8a12-6fbc45610211","order_by":0,"name":"Congbo Yue","email":"","orcid":"","institution":"Peking University People's Hospital, Qingdao","correspondingAuthor":false,"prefix":"","firstName":"Congbo","middleName":"","lastName":"Yue","suffix":""},{"id":518369669,"identity":"d961523b-ed3f-4f1b-a718-01db082cf693","order_by":1,"name":"Shichao Liu","email":"","orcid":"","institution":"Qilu Hospital of Shandong University","correspondingAuthor":false,"prefix":"","firstName":"Shichao","middleName":"","lastName":"Liu","suffix":""},{"id":518369670,"identity":"fbc079a6-cbe6-40ec-a46e-6fd809243f23","order_by":2,"name":"Wenhua Wang","email":"","orcid":"","institution":"Peking University People's Hospital, 
Qingdao","correspondingAuthor":false,"prefix":"","firstName":"Wenhua","middleName":"","lastName":"Wang","suffix":""},{"id":518369671,"identity":"1839954f-435c-4ba5-a9c3-45c238e27d42","order_by":3,"name":"Yu Zhao","email":"","orcid":"","institution":"Qingdao Eighth People's Hospital","correspondingAuthor":false,"prefix":"","firstName":"Yu","middleName":"","lastName":"Zhao","suffix":""},{"id":518369672,"identity":"e832ae0e-fd0b-4f41-b240-c10921d3f37a","order_by":4,"name":"Xiaofeng Zhang","email":"","orcid":"","institution":"Peking University People's Hospital, Qingdao","correspondingAuthor":false,"prefix":"","firstName":"Xiaofeng","middleName":"","lastName":"Zhang","suffix":""},{"id":518369673,"identity":"d580a7ea-1026-4963-bd57-9d5b25a0e296","order_by":5,"name":"Guanghui Zhao","email":"","orcid":"","institution":"Peking University People's Hospital, Qingdao","correspondingAuthor":true,"prefix":"","firstName":"Guanghui","middleName":"","lastName":"Zhao","suffix":""}],"badges":[],"createdAt":"2025-09-09 13:53:38","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7574572/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7574572/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12911-025-03321-z","type":"published","date":"2025-12-18T15:57:56+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":91936430,"identity":"58e927e4-555d-43f6-9c90-90b59efa7c87","added_by":"auto","created_at":"2025-09-23 
02:49:23","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":397559,"visible":true,"origin":"","legend":"\u003cp\u003eFlow chart of the study design\u003c/p\u003e","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/6b60270dffa8dabde5db6b73.png"},{"id":91934749,"identity":"b1034609-3242-4626-af0f-05b32239312f","added_by":"auto","created_at":"2025-09-23 02:41:23","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":636096,"visible":true,"origin":"","legend":"\u003cp\u003eComprehensive screening of the included features. (A) Intersection of the features selected by the four algorithms (F-test, LASSO regression, decision tree, and random forest). (B) Pearson correlation coefficients of the 7 features. (C) Degree of fit of the LASSO regression parameters.\u003c/p\u003e","description":"","filename":"Figure2.png","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/0e640ec5d183e5b759097f4d.png"},{"id":91934752,"identity":"16fa40ae-9b23-4c35-92d3-5e9f7e45f94e","added_by":"auto","created_at":"2025-09-23 02:41:23","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":730136,"visible":true,"origin":"","legend":"\u003cp\u003eComparison of the areas under the receiver operating characteristic curve before (A) and after (B) feature screening.\u003c/p\u003e","description":"","filename":"Figure3.png","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/2632c54bd49dee8aaf6894ae.png"},{"id":91936432,"identity":"1d133d93-d7f4-409d-8971-fb468c217ac5","added_by":"auto","created_at":"2025-09-23 02:49:23","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":771425,"visible":true,"origin":"","legend":"\u003cp\u003eEvaluation of the machine learning algorithms by learning curve (A), calibration curve (B), DCA curve (C), PR 
curve (D), and ROC curve (E). (F) Interpretation of the model constructed by the SVM algorithm with 5 variables.\u003c/p\u003e","description":"","filename":"Figure4.png","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/a13d0b7b577ec33e066ff8d0.png"},{"id":91936431,"identity":"5bc9060f-ae37-44bd-b3c0-9774ce8cb96b","added_by":"auto","created_at":"2025-09-23 02:49:23","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":370326,"visible":true,"origin":"","legend":"\u003cp\u003eAn example output of the web application.\u003c/p\u003e","description":"","filename":"Figure5.png","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/76e9eeda88589e4e0b0f7909.png"},{"id":98814578,"identity":"1fff4b2e-a98d-445f-92d7-1f65805a9e23","added_by":"auto","created_at":"2025-12-22 16:12:39","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3413834,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7574572/v1/78f464d3-3588-4576-85c3-272759ca6fa9.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Machine Learning in Early Screening for High-Grade Cervical Intraepithelial Neoplasia Using Blood Testing","fulltext":[{"header":"Background","content":"\u003cp\u003eCervical intraepithelial neoplasia (CIN) is a precursor to cervical cancer, with the progression from low-grade (CIN1) to high-grade (CIN3) lesions representing a critical step in the progression to invasive cervical cancer [1]. High-grade CIN, particularly CIN2 and CIN3, is a precursor of cervical cancer, which is among the leading causes of morbidity and mortality in women worldwide [2]. 
In 2020, there were approximately 604,000 new cases of cervical cancer and 342,000 related deaths globally, with disproportionately high rates in certain regions [3]. Although cervical cancer rates have declined in high-income countries owing to widespread use of screening programs, the disease remains a major public health challenge in low- and middle-income countries [4]. Early detection and accurate histopathological classification of cervical high-grade squamous intraepithelial lesions (HSIL) are critical to preventing progression to invasive cervical cancer.\u003c/p\u003e\n\u003cp\u003eSeveral techniques have been employed to detect and diagnose cervical HSIL, including human papillomavirus (HPV) DNA testing, cytology, colposcopy, and biopsy [5, 6]. HPV testing is a non-invasive and highly sensitive method for identification of women at risk of developing cervical cancer [7, 8]. Colposcopy, an optical examination of the cervix, can provide more definitive information but involves subjective interpretation and often requires specialized training [9]. Biopsy, the gold standard for diagnosis of cervical HSIL, is invasive and not suitable for widespread use owing to its cost and requirement for skilled professionals [10]. Moreover, although these methods are in widespread use, they are often not seamlessly integrated in an automated and cost-effective process, especially in resource-limited regions.\u003c/p\u003e\n\u003cp\u003eRecent advances in machine learning show potential for enhancing the accuracy of detection and prediction of high-grade CIN. Machine learning techniques have shown efficacy in analysis of large-scale datasets, enabling identification of complex patterns and features that may be overlooked with traditional diagnostic methods [11]. 
Numerous studies have demonstrated the potential of machine learning models in medical diagnostics, particularly in the fields of imaging and pathology, in which they contribute to increased diagnostic precision and reproducibility\u0026nbsp;[12−15]. However, there is a research gap regarding specific applications of machine learning for prediction of high-grade CIN, with few comprehensive studies having developed and validated models for this purpose [16].\u003c/p\u003e\n\u003cp\u003eIn the present study, we aimed to develop and validate a predictive model for high-grade CIN detection using machine learning, thereby addressing critical gaps in current screening techniques. By employing a machine learning approach, we sought to provide an effective, accessible, and standardized method for high-grade CIN prediction that could reduce reliance on subjective interpretation of cytology and histology results and promote early detection. The predictive model developed in this study was trained and validated on separate datasets, and its performance was evaluated using key validation indicators including sensitivity, specificity, and AUC to optimize its diagnostic accuracy. We thus present an evidence-based, machine-learning-driven diagnostic tool that could facilitate earlier intervention. Ultimately, this may reduce the incidence and mortality of cervical cancer, particularly in resource-limited settings [17, 18].\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cstrong\u003eStudy population and design\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFor this study, we prospectively collected 128 cases of high-grade CIN (CIN2/3) and 120 cases of low-grade CIN (CIN1) from the gynecology department of Peking University People\u0026rsquo;s Hospital, Qingdao, China, from January 1, 2024, to May 31, 2025. 
The inclusion criteria were as follows: (1) patients diagnosed with high-grade CIN or low-grade CIN based on histopathological examination of biopsy specimens; (2) patients who had not undergone any prior treatment for cervical lesions (e.g., loop electrosurgical excision procedure, conization, or cryotherapy) [19]; and (3) patients with complete and available relevant blood test data. The exclusion criteria were: (1) previous treatment for cervical lesions; (2) insufficient or missing data; or (3) other gynecological conditions that could interfere with the assessment of cervical lesions (e.g., severe pelvic inflammatory disease or endometrial cancer) [20]. This study was approved by the Ethics Committee of Peking University People\u0026rsquo;s Hospital, Qingdao.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData collection and feature selection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFor this study, we collected baseline characteristic data, including patient age, CIN grade, and various predictor variables comprising routine hematological and biochemical parameters. The routine hematological parameters included red blood cell count (RBC), white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), neutrophil ratio (NEU%), lymphocyte ratio (LYM%), monocyte ratio (MON%), neutrophil count (NEU), lymphocyte count (LYM), monocyte count (MON), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean platelet volume (MPV), and large platelet ratio (P_LCR). The routine biochemical parameters included indirect bilirubin (IBIL), direct bilirubin (DBIL), total bilirubin (TBIL), total protein (TP), globulin (GLB), albumin (ALB), aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT), alkaline phosphatase (ALP), creatinine (CREA), and urea (UREA). 
All laboratory data, as well as age, were standardized and used for analysis.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAfter data collection, four algorithms (F-test, LASSO regression, decision tree, and random forest) were used to screen the 29 features included in the study. Subsequently, Pearson correlation analysis and LASSO regression were used to identify the most suitable features for construction of a simplified predictive model.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMachine learning algorithms\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA total of 11 machine learning algorithms were used for model development and evaluation: Naive Bayes (NB), \u003cem\u003eK\u003c/em\u003e-Nearest Neighbors (KNN), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT), Artificial Neural Network (ANN), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eModel establishment and evaluation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe entire model was implemented using Python (3.10). All samples were randomly divided into a training cohort and a testing cohort in a 7:3 ratio. Scikit-learn (version 1.2.2) was used for data splitting, with stratified sampling to ensure the class distribution remained consistent between the training and testing cohorts. Eleven machine learning algorithms were built using Python libraries (scikit-learn 1.2.2, XGBoost 1.7.4, LightGBM 4.0.0). Grid search and ten-fold cross-validation were used to select the optimal hyperparameters for each algorithm. 
We compared 11 distinct machine learning algorithms using the selected predictor features and identified the best-performing model.\u003c/p\u003e\n\u003cp\u003eThe effectiveness of the different algorithms was evaluated using ROC curves plotted with matplotlib (3.7.1) and learning curves plotted with scikit-plot (0.3.7). The optimal diagnostic model was selected by considering both the AUC and the learning-curve behavior of each algorithm. Predictive accuracy was evaluated using the ROC AUC and calibration curve. Decision curve analysis (DCA) was employed to assess the clinical usefulness and net benefit of the models. Precision\u0026ndash;recall (PR) curves were also used, as they provide a more informative indication of performance than accuracy or ROC evaluation when positive class prediction is of primary interest. Details of the study design are displayed in Figure 1.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eWeb application development\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo bridge the gap between research and clinical practice, we developed a user-friendly web application based on the final prediction model using the Python Streamlit framework. This tool allows healthcare professionals to input patient data and receive real-time predictions of the probability of high-grade CIN. It can also be used to generate a force plot for each participant, providing a visual representation of how different features contribute to the prediction.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatistical processing\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eData analysis was conducted using SPSS 23.0 software. The normality of the data was assessed using the Shapiro\u0026ndash;Wilk (S-W) test. For data that followed a normal distribution, results are expressed as mean \u0026plusmn; standard deviation (x̄ \u0026plusmn; SD), and the homogeneity of variance between two groups was tested using Levene\u0026rsquo;s test. 
If homogeneity of variance was assumed, independent-samples \u003cem\u003et\u003c/em\u003e-tests were performed; otherwise, Welch\u0026rsquo;s ANOVA, a parametric test that includes adjustment for unequal variances, was applied. For data that did not follow a normal distribution, results are presented as the median and interquartile range (IQR), and comparisons between two groups were made using the Mann\u0026ndash;Whitney U test. Statistical significance for all tests was determined at a threshold of \u003cem\u003eP\u003c/em\u003e \u0026lt; 0.05.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cstrong\u003eClinical characteristics of the training cohort and testing cohort\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA total of 248 patients with high-grade or low-grade CIN were included in the study and randomly divided into a training cohort and testing cohort. The baseline characteristics of patients in the training and testing cohorts, which were crucial for assessing group comparability, are shown in Table 1. The training cohort consisted of 173 cases (approximately 69.8% of the total sample), whereas the testing cohort comprised 75 cases (approximately 30.2%). The distribution of high-grade and low-grade CIN cases was well balanced between the two cohorts; this ensured consistency in outcome proportions across both groups.\u003c/p\u003e\n\u003cdiv\u003e\n \u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"567\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"10\" style=\"width: 59px;\"\u003e\n \u003cp\u003eTable 1. 
Perioperative Statistical Data of Participants\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" rowspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eVariables\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 121px;\"\u003e\n \u003cp\u003eCohort\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" rowspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003eStatistic\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" rowspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e\u003cem\u003eP\u003c/em\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003etrain (n = 173)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003etest (n = 75)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eWBC\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e5.54\u0026plusmn;1.50\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e5.59\u0026plusmn;1.31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=-0.241\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.810\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eRBC\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e4.29\u0026plusmn;0.38\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e4.38\u0026plusmn;0.35\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=-1.630\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.104\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eHGB\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e129.35\u0026plusmn;12.40\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e132.33\u0026plusmn;10.72\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=-1.809\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.072\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003ePLT\u0026nbsp;\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e245.38\u0026plusmn;55.27\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e249.43\u0026plusmn;62.35\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=-0.510\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.611\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eNEU%\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e60.04\u0026plusmn;8.67\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n 
\u003cp\u003e59.63\u0026plusmn;8.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=0.344\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.731\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eLYM%\u0026nbsp;\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e32.29\u0026plusmn;7.81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e32.70\u0026plusmn;8.32\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=-0.372\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.710\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 181px;\"\u003e\n \u003cp\u003eMON% (Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 126px;\"\u003e\n \u003cp\u003e5.42\u0026plusmn;1.47\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 121px;\"\u003e\n \u003cp\u003e5.35\u0026plusmn;1.35\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 81px;\"\u003e\n \u003cp\u003et=0.338\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 59px;\"\u003e\n \u003cp\u003e0.735\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n
\u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eLYM\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e1.74\u0026plusmn;0.47\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e1.77\u0026plusmn;0.43\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-1.773\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.584\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n 
\u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eHCT\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e38.40\u0026plusmn;3.17\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e39.15\u0026plusmn;2.77\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-1.773\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.077\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eMCV\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e89.70\u0026plusmn;6.01\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e89.66\u0026plusmn;5.02\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=0.048\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.962\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eMCH\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e30.21\u0026plusmn;2.49\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e30.31\u0026plusmn;2.11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n 
\u003cp\u003et=-0.309\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.758\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eMCHC\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e336.51\u0026plusmn;8.82\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e337.75\u0026plusmn;7.83\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-1.049\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.295\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eMPV\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e9.84\u0026plusmn;0.95\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e9.90\u0026plusmn;1.15\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-0.455\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.649\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eTP\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" 
style=\"width: 137px;\"\u003e\n \u003cp\u003e73.34\u0026plusmn;4.29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e73.64\u0026plusmn;4.19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-0.515\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.607\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eALB\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e45.32\u0026plusmn;2.43\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e45.76\u0026plusmn;2.29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-1.339\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.182\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eGLB\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e28.02\u0026plusmn;3.22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e27.88\u0026plusmn;3.02\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=0.316\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.752\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eUREA\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e4.84\u0026plusmn;1.13\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e5.18\u0026plusmn;1.19\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=-2.139\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.033\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eCREA\u003c/p\u003e\n \u003cp\u003e(Mean \u0026plusmn; SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e56.63\u0026plusmn;8.47\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e55.66\u0026plusmn;10.64\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003et=0.764\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.446\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eAge\u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e38.00(32.00,49.50)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 
137px;\"\u003e\n \u003cp\u003e37.00(31.00,51.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.781\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.435\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eNEU \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e3.28(2.54,3.98)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e3.26(2.54,4.01)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.220\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.826\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eMON \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e0.28(0.23,0.33)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e0.28(0.24,0.33)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.441\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.659\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n 
\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eTBIL \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e11.40(9.00,14.45)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e10.90(9.50,13.30)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.765\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.444\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eDBIL \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e3.20(2.60,4.25)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e3.10(2.60,4.30)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.786\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.432\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eP_LCR \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e24.20(20.05,28.65)\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e24.80(19.30,31.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.384\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.701\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eIBIL \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e8.00(6.27,10.20)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e7.71(6.20,9.60)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.589\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.556\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eALT \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e15.40(11.25,20.35)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e16.70(12.90,22.10)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-1.869\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.062\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eAST \u0026nbsp;\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e17.30(15.00,20.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e18.55(15.10,22.50)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-1.706\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.088\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"9\" style=\"width: 539px;\"\u003e\n \u003cp\u003eTable 1. Perioperative Statistical Data of Participants\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" style=\"width: 111px;\"\u003e\n \u003cp\u003eVariables\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"4\" style=\"width: 274px;\"\u003e\n \u003cp\u003eCohort\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" rowspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eStatistic\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" rowspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e\u003cem\u003eP\u003c/em\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003etrain (n = 173)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 
137px;\"\u003e\n \u003cp\u003etest (n = 75)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eGGT\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e17.40(13.75,22.55)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e17.00(14.20,22.80)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-2.78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.781\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 111px;\"\u003e\n \u003cp\u003eALP\u003c/p\u003e\n \u003cp\u003eM (Q\u003csub\u003e1\u003c/sub\u003e, Q\u003csub\u003e3\u003c/sub\u003e)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e60.00(48.65,76.55)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 137px;\"\u003e\n \u003cp\u003e59.60(49.00,73.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003eZ=-0.09\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" style=\"width: 77px;\"\u003e\n \u003cp\u003e0.993\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"9\" style=\"width: 539px;\"\u003e\n \u003cp\u003e\u0026nbsp;t: t-test, Z: Mann-Whitney test\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n 
\u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"9\" style=\"width: 539px;\"\u003e\n \u003cp\u003e\u0026nbsp;SD: standard deviation, M: Median, Q\u003csub\u003e1\u003c/sub\u003e: 1st Quartile, Q\u003csub\u003e3\u003c/sub\u003e: 3rd Quartile\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 28px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003e\u003cstrong\u003eCharacteristic indicators and screening results from different algorithms\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe generated a Venn diagram based on the results of the four algorithms (F-test, LASSO regression, decision tree, and random forest) and found that the following seven features were selected by three or more algorithms: CREA, RBC, HCT, NEU%, DBIL, TBIL, and MON (Figure 2A). Pearson correlation analysis was performed to evaluate pairwise correlations among these seven features. As shown in Figure 2B, strong positive correlations were observed between HCT and RBC and between DBIL and TBIL, with Pearson correlation coefficients exceeding 0.6. Figure 2C presents the results of the LASSO regression analysis, which was used to reduce overfitting by selecting the most relevant features (CREA, RBC, NEU%, DBIL, and MON) for subsequent model construction.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ePredictive accuracy of the developed model for diagnosis of HSIL\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe five selected indicators were modeled and evaluated using 11 different algorithms. Participants were randomly divided into a training set of 173 cases and a test set of 75 cases at a ratio of 7:3. As shown in Figure 3A and B, the model showed satisfactory overall performance, even after a reduction in the number of indicators. 
SVM was identified as the best learning algorithm; as shown in Figure 4A\u0026ndash;E, the learning curve, calibration curve, decision curve, PR curve, and ROC curve all indicated that the model constructed using the SVM algorithm performed best. Specifically, the AUC was 0.75, and the Brier score was 0.21.\u003c/p\u003e\n\u003cp\u003eTo better elucidate the contributions of different factors to the prediction ability of our SVM-based model, we performed further evaluation using SHAP. The results indicated that RBC made the largest contribution to the performance of the model, followed by CREA, NEU%, MON, and DBIL (Figure 4F).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eClinical applications\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTo facilitate clinical applications of the SVM model, we developed a user-friendly web application (https://dvhl6xsf29zmdewixjx7kz.streamlit.app). This application allows clinicians to input values for the five key features and automatically predicts the risk of high-grade CIN for individual patients. A screenshot of the application is shown in Figure 5.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, we developed and validated a machine learning model using routine blood biomarkers to predict high-grade CIN. The SVM-based model achieved an AUC of 0.75 and a Brier score of 0.21, indicating moderate discrimination and calibration. Five key features (RBC, CREA, NEU%, DBIL, and MON) emerged as optimal predictors and were combined in a model that offers a minimally invasive, cost-effective tool for early risk stratification.\u003c/p\u003e\n\u003cp\u003eTraditional high-grade CIN detection relies on HPV testing, cytology, colposcopy, and biopsy; however, the use of these techniques is hindered by their limited accessibility, as well as their subjectivity and invasiveness, particularly in resource-constrained regions [21]. 
Our model addresses these limitations by leveraging ubiquitous blood parameters, enabling detection without the need for specialized equipment or expertise. The biomarkers identified here are all relevant to the pathophysiology of cervical cancer: RBC and MON reflect anemia and chronic inflammation, CREA is correlated with metabolic dysregulation, and NEU% is an indicator of immune response [22–24]. This approach complements existing methods by providing a scalable, objective, first-line screening tool.\u003c/p\u003e\n\u003cp\u003eAlthough studies of machine learning models using imaging or genomic data have reported AUC values of 0.80–0.95 [25], the performance of our blood-based model (AUC 0.75) remains clinically relevant given its practicality. Blood tests are routinely performed at low cost and can be integrated seamlessly into primary care workflows; these aspects are critical for expanding screening in low- and middle-income countries, where the burden of cervical cancer is highest [26]. The calibration of the model (Brier score of 0.21) further supports its clinical utility, indicating usable probability estimates for individual risk assessment.\u003c/p\u003e\n\u003cp\u003eTo support clinical applications, we have created an intuitive web application using the Streamlit framework to make diagnosis of high-grade CIN more accessible to healthcare providers. By utilizing simple and commonly available clinical indicators, the tool provides fast, real-time predictions without requiring specialized hardware or high-end computing equipment. Its user-friendly interface also enables smooth integration into standard clinical workflows, thereby promoting early detection and timely intervention for high-grade CIN across diverse healthcare environments.\u003c/p\u003e\n\u003cp\u003eAlthough the model shows promise, it is important to note some limitations related to sample size and potential for overfitting. For instance, Rahimi et al. 
have demonstrated the importance of large, diverse datasets in ensuring the generalizability and robustness of predictive models [27]. The current study had a relatively small sample size (\u003cem\u003en\u003c/em\u003e = 248), which may have limited the generalizability of the model across different populations. In addition, the model relies solely on blood-based features, whereas other critical factors, including HPV status and histopathological characteristics, are known to influence CIN progression [28]. Future research should therefore focus on increasing sample size and integrating a wider range of clinical and molecular markers to enhance the accuracy and robustness of the model.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eIn this study, we successfully developed and validated a non-invasive, machine-learning-based model that utilizes readily available, routine blood test parameters (specifically CREA, RBC, NEU%, DBIL, and MON) to predict high-grade CIN (CIN2/3). The SVM model demonstrated moderate predictive accuracy (AUC of 0.75) and calibration (Brier score of 0.21) and thus represents a cost-effective and accessible means of identifying women at higher risk. This approach, which has been translated into a web-based calculator, shows significant potential to enhance cervical cancer prevention by enabling early detection and prioritizing women needing more intensive follow-up, particularly in resource-limited settings.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Peking University People's Hospital Qingdao Hospital (Approval No.: 2025PHQDB022-01). 
The need for informed consent was waived by the Ethics Committee of Peking University People's Hospital Qingdao Hospital because this was a retrospective analysis of anonymized data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWritten informed consent for publication was obtained from all participants.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was sponsored by the National High Performance Medical Device Innovation Center Project (No. NMED2025KF-01-005) and the Application of Intelligent Mutual Recognition of Inspection Results Project (No. JYHRXZ2025B06).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors' contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eC.Y.\u003c/strong\u003e and \u003cstrong\u003eS.L.\u003c/strong\u003e designed the study, performed data analysis, and wrote the original manuscript. \u003cstrong\u003eX.Z.\u003c/strong\u003e collected and curated clinical data. \u003cstrong\u003eW.W.\u003c/strong\u003e developed the methodology and validated results. \u003cstrong\u003eY.Z.\u003c/strong\u003e reviewed pathological classifications. \u003cstrong\u003eG.Z.\u003c/strong\u003e supervised the project, acquired funding, and revised the manuscript. 
All authors reviewed the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eNote\u003c/strong\u003e:\u003cstrong\u003e\u0026nbsp;C.Y.\u0026nbsp;\u003c/strong\u003eand \u003cstrong\u003eS.L.\u003c/strong\u003e contributed equally as co-first authors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the \u003cstrong\u003eNational High Performance Medical Device Innovation Center Project\u003c/strong\u003e (No. NMED2025KF-01-005) and the \u003cstrong\u003eApplication of Intelligent Mutual Recognition of Inspection Results Project\u003c/strong\u003e (No. JYHRXZ2025B06). We sincerely thank the medical staff at \u003cstrong\u003ePeking University People's Hospital Qingdao Hospital\u003c/strong\u003e for their assistance in data collection, as well as all patients who participated in this research.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSoergel P, Dahl GF, Onsrud M, Hillemanns P. Photodynamic therapy of cervical intraepithelial neoplasia 1\u0026ndash;3 and human papilloma virus (HPV) infection with methylaminolevulinate and hexaminolevulinate\u0026mdash;a double‐blind, dose‐finding study. Lasers Surg Med. 2012;44:468\u0026ndash;74.\u003c/li\u003e\n\u003cli\u003eSaslow D, Solomon D, Lawson HW, Killackey M, Kulasingam SL, Cain J, et al. American Cancer Society, American Society for Colposcopy and Cervical Pathology, and American Society for Clinical Pathology screening guidelines for the prevention and early detection of cervical cancer. CA Cancer J Clin. 2012;62:147\u0026ndash;72.\u003c/li\u003e\n\u003cli\u003eSung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 
2021;71:209\u0026ndash;49.\u003c/li\u003e\n\u003cli\u003ePetersen Z, Jaca A, Ginindza TG, Maseko G, Takatshana S, Ndlovu P, et al. Barriers to uptake of cervical cancer screening services in low-and-middle-income countries: a systematic review. BMC Womens Health. 2022;22:486.\u003c/li\u003e\n\u003cli\u003eMao C, Balasubramanian A, Koutsky LA. Should liquid-based cytology be repeated at the time of colposcopy? J Low Genit Tract Dis. 2005;9:82\u0026ndash;8.\u003c/li\u003e\n\u003cli\u003eZhang L, Tian P, Li B, Xu L, Qiu L, Bi Z, et al. Risk‐stratified management of cervical high‐grade squamous intraepithelial lesion based on machine learning. J Med Virol. 2024;96:70016.\u003c/li\u003e\n\u003cli\u003eArbyn M, Weiderpass E, Bruni L, de Sanjos\u0026eacute; S, Saraiya M, Ferlay J, et al. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Glob Health. 2020;8:e191\u0026ndash;203.\u003c/li\u003e\n\u003cli\u003eCastle PE, Cremer M. Human Papillomavirus Testing in Cervical Cancer Screening. Obstet Gynecol Clin North Am. 2013;40:377\u0026ndash;90.\u003c/li\u003e\n\u003cli\u003eSteinauer JE, Turk JK, Pomerantz T, Simonson K, Learman LA, Landy U. Abortion training in US obstetrics and gynecology residency programs. Am J Obstet Gynecol. 2018;219:86.e1\u0026ndash;6.\u003c/li\u003e\n\u003cli\u003eStoler MH. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA. 2001;285:1500.\u003c/li\u003e\n\u003cli\u003eEsteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115\u0026ndash;8.\u003c/li\u003e\n\u003cli\u003eLitjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60\u0026ndash;88.\u003c/li\u003e\n\u003cli\u003eTopol EJ. 
High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44\u0026ndash;56.\u003c/li\u003e\n\u003cli\u003eRajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, Mehta H, et al. Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS Med. 2018;15:e1002686.\u003c/li\u003e\n\u003cli\u003eBejnordi BE, Veta M, Van Diest PJ, Van Ginneken B, Karssemeijer N, Litjens G, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318:2199\u0026ndash;210.\u003c/li\u003e\n\u003cli\u003eMantula F, Toefy Y, Sewram V. Barriers to cervical cancer screening in Africa: a systematic review. BMC Public Health. 2024;24:525.\u003c/li\u003e\n\u003cli\u003eRahimi M, Akbari A, Asadi F, Emami H. Cervical cancer survival prediction by machine learning algorithms: a systematic review. BMC Cancer. 2023;23:341.\u003c/li\u003e\n\u003cli\u003eLiu Y, Chen PC, Krause J, Peng L. How to read articles that use machine learning: users\u0026apos; guides to the medical literature. JAMA. 2019;322:1806\u0026ndash;16.\u003c/li\u003e\n\u003cli\u003eCastle PE, Sideri M, Jeronimo J, Solomon D, Schiffman M. Risk assessment to guide the prevention of cervical cancer. Am J Obstet Gynecol. 2007;197:356.e1\u0026ndash;6.\u003c/li\u003e\n\u003cli\u003eMassad LS, Einstein MH, Huh WK, Katki HA, Kinney WK, Schiffman M, et al. 2012 updated consensus guidelines for the management of abnormal cervical cancer screening tests and cancer precursors. J Low Genit Tract Dis. 2013;17(Supplement 1):S1\u0026ndash;27.\u003c/li\u003e\n\u003cli\u003eArbyn M, Sankaranarayanan R, Muwonge R, Keita N, Dolo A, Mbalawa CG, et al. Pooled analysis of the accuracy of five cervical cancer screening tests assessed in eleven studies in Africa and India. Int J Cancer. 
2008;123:153\u0026ndash;60.\u003c/li\u003e\n\u003cli\u003ePurde M-T, Nock S, Risch L, Medina Escobar P, Grebhardt C, Nydegger UE, et al. The cystatin C/creatinine ratio, a marker of glomerular filtration quality: associated factors, reference intervals, and prediction of morbidity and mortality in healthy seniors. Transl Res. 2016;169:80\u0026ndash;90.e2.\u003c/li\u003e\n\u003cli\u003ePan Y-P, Fang Y-P, Xu Y-H, Wang Z-X, Shen J-L. The diagnostic value of procalcitonin versus other biomarkers in prediction of bloodstream infection. Clin Lab. 2017;63(02/2017):277\u0026ndash;85.\u003c/li\u003e\n\u003cli\u003eHou F, Qiao Y, Qiao Y, Shi Y, Chen M, Kong M, et al. A retrospective analysis comparing metagenomic next-generation sequencing with conventional microbiology testing for the identification of pathogens in patients with severe infections. Front Cell Infect Microbiol. 2025;15:1530486.\u003c/li\u003e\n\u003cli\u003eBatra U, Nathany S, Nath SK, Jose JT, Sharma T, P P, et al. AI-based pipeline for early screening of lung cancer: integrating radiology, clinical, and genomics data. Lancet Reg Health Southeast Asia. 2024;24:100352.\u003c/li\u003e\n\u003cli\u003eXu M, Lin Z, Siegel CE, Laska EM, Abu-Amara D, Genfi A, et al. Screening for PTSD and TBI in veterans using routine clinical laboratory blood tests. Transl Psychiatry. 2023;13:64.\u003c/li\u003e\n\u003cli\u003eMohammad‐Rahimi H, Sohrabniya F, Ourang SA, Dianat O, Aminoshariae A, Nagendrababu V, et al. Artificial intelligence in endodontics: data preparation, clinical applications, ethical considerations, limitations, and future directions. Int Endod J. 2024;57:1566\u0026ndash;95.\u003c/li\u003e\n\u003cli\u003eCastle PE, Sideri M, Jeronimo J, Solomon D, Schiffman M. Risk assessment to guide the prevention of cervical cancer. Am J Obstet Gynecol. 
2007;197:356.e1\u0026ndash;6.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"bmc-medical-informatics-and-decision-making","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"midm","sideBox":"Learn more about [BMC Medical Informatics and Decision Making](http://bmcmedinformdecismak.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/midm/default.aspx","title":"BMC Medical Informatics and Decision Making","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Cervical intraepithelial neoplasia, Machine learning, Blood biomarkers, Prediction model, Decision support tool","lastPublishedDoi":"10.21203/rs.3.rs-7574572/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7574572/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground:\u003c/strong\u003e High-grade cervical intraepithelial neoplasia (CIN2/3) is a critical precursor to cervical cancer, yet current screening methods (e.g., HPV testing, colposcopy) face challenges in accessibility and invasiveness, especially in resource-limited settings. We aimed to develop a non-invasive, machine learning (ML)-based model using routine blood biomarkers to predict high-grade CIN, offering a scalable and cost-effective screening alternative.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods:\u003c/strong\u003eData from 128 cases of high-grade CIN and 120 cases of low-grade CIN were collected from a hospital in China. A total of 29 clinical characteristics and blood test measurements were considered for use in model development. Four feature selection algorithms (F-test, LASSO regression, decision tree, and random forest) were used to identify key predictors, and 11 machine learning algorithms were employed for model training. 
The dataset was split into training (70%) and testing (30%) cohorts. Model performance was evaluated using learning curves, receiver operating characteristic curves (ROC), area under the curve (AUC), Brier score, calibration curves, Precision-Recall (PR) curves, and Decision Curve Analysis (DCA). A web-based calculator was developed for clinical deployment.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults:\u003c/strong\u003eKey features selected for the model included creatinine (CREA), red blood cell count (RBC), neutrophil percentage (NEU%), direct bilirubin (DBIL), and monocyte count (MON). The Support Vector Machine (SVM) algorithm achieved the best predictive performance, with an AUC of 0.75 and a Brier score of 0.21. The web tool (https://dvhl6xsf29zmdewixjx7kz.streamlit.app) provides real-time risk stratification.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusions: \u003c/strong\u003eThe model demonstrated strong performance across various validation metrics, indicating potential clinical utility. 
We also developed a web-based calculator to estimate high-grade CIN.

Manuscript title: Machine Learning in Early Screening for High-Grade Cervical Intraepithelial Neoplasia Using Blood Testing
Preprint: version 1, posted September 23rd, 2025, https://doi.org/10.21203/rs.3.rs-7574572/v1
Submitted to: BMC Medical Informatics and Decision Making on 2025-09-09; revision requested on 2025-10-06
Published version: BMC Medical Informatics and Decision Making, December 18th, 2025, https://doi.org/10.1186/s12911-025-03321-z