Explainable Machine Learning for Predicting Progression From IgA Vasculitis to IgA Vasculitis Nephritis in Children: A Dual-Centre Retrospective Study

doi:10.21203/rs.3.rs-9261325/v1


2026 · doi:10.21203/rs.3.rs-9261325/v1

preprint OA: closed


Research Square · Research Article

Qingkai Wang, Hao Qiu, Liran Shen, Jinxing Dai, Fachen Miao, Qianjin Shi, and 3 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9261325/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Objective
IgA vasculitis nephritis (IgAVN) plays a decisive role in the long-term prognosis of pediatric IgA vasculitis (IgAV), yet the temporal lag of routine urinalysis frequently hinders early and precise risk stratification.
This study aimed to develop a machine-learning predictive model using non-invasive peripheral blood parameters to facilitate early identification of IgAVN progression and to reveal underlying pathophysiological risk thresholds.

Methods
This retrospective study enrolled 509 pediatric IgAV patients from Siyang Hospital and Shanxian Central Hospital, among whom 213 developed IgAVN. Twelve core features were selected as the intersection of LASSO regression and the Boruta algorithm. Seven machine learning algorithms were systematically evaluated, and eXtreme Gradient Boosting (XGBoost) was selected as the optimal model. The Shapley Additive Explanations (SHAP) framework was then used to quantify feature importance and to decipher non-linear risk interactions.

Results
The XGBoost model demonstrated outstanding predictive performance in the independent validation set, achieving an area under the receiver operating characteristic curve (AUC) of 0.966, an accuracy of 0.907, an F1 score of 0.892, and a sensitivity of 0.921. SHAP analysis identified the Inflammatory Burden Index (IBI), C-reactive protein (CRP), and monocyte-to-lymphocyte ratio (MLR) as the primary driving factors. SHAP dependence plots revealed non-linear threshold effects: the risk of IgAVN escalated sharply in the presence of early subclinical albumin (ALB) depletion and decompensated inflammatory load. Decision curve analysis (DCA) demonstrated that the model achieved substantial clinical net benefit across a broad range of threshold probabilities.

Conclusion
The explainable XGBoost model, developed using routine non-invasive peripheral blood parameters, shows promise as a supportive tool for the early risk stratification of IgAVN. By visualising complex, data-driven non-linear risk inflexion points, this model may assist frontline clinicians in more effectively identifying high-risk pediatric patients in outpatient and emergency settings.
Ultimately, these findings provide an objective reference to inform future clinical strategies to optimise the timing of interventions and potentially minimise unnecessary immunosuppressive exposure in low-risk patients.

Keywords: IgA vasculitis; IgA vasculitis nephritis; Machine learning; Risk stratification

Introduction

IgA vasculitis (IgAV), formerly known as Henoch-Schönlein purpura (HSP), is among the most common systemic small-vessel vasculitides in childhood and is pathologically characterised by the deposition of IgA1-dominant immune complexes [1, 2]. Although the classic presentations of cutaneous purpura, as well as articular and gastrointestinal involvement, typically follow a self-limiting course, the long-term prognosis and overall disease burden in pediatric patients are largely dictated by the onset and severity of IgA vasculitis nephritis (IgAVN), the most severe form of target-organ damage [3, 4].

In clinical practice, the onset and progression of IgAVN exhibit substantial clinical heterogeneity and temporal unpredictability. While some patients present merely with transient microscopic hematuria or mild proteinuria, others may experience an insidious yet rapid progression to crescentic glomerulonephritis during the early stages of the disease, ultimately culminating in end-stage renal disease (ESRD) [5, 6]. This pronounced disparity in prognostic outcomes, coupled with the highly unpredictable course of disease, poses a significant challenge for clinicians in risk assessment at the time of initial IgAV diagnosis. Consequently, there is an urgent need to develop an early-warning tool that facilitates precise identification and risk-stratified management in the early phases of the disease.
Currently, renal biopsy remains the gold standard for the definitive diagnosis and pathological grading of IgAVN; however, its invasive nature precludes its widespread use as a routine screening modality for pediatric patients presenting in outpatient or emergency settings [7]. In clinical settings, the identification of early renal involvement relies primarily on routine urinalysis, including monitoring for proteinuria and microscopic hematuria [8]. However, abnormalities in urinary parameters often exhibit a temporal lag; by the time definitive urinary abnormalities are clinically detected, glomerular microvascular injury and parenchymal lesions have typically already occurred [9]. These diagnostic limitations present substantial challenges for early intervention: high-risk patients with initially negative urinalysis results may miss the optimal therapeutic window to arrest inflammatory progression, whereas low-risk patients may be subjected to unnecessary exposure to glucocorticoids or immunosuppressive therapies.

Given that the pathogenesis of IgAV inherently involves systemic immune dysregulation and pervasive microvascular inflammation, a cascade of inflammatory activation and early metabolic stress typically emerges in the peripheral blood of pediatric patients before the onset of irreversible structural renal damage [10]. Routine peripheral blood biomarkers are noninvasive and readily accessible. They include the inflammatory burden index (IBI) and C-reactive protein (CRP), reflecting acute inflammatory burden; the monocyte-to-lymphocyte ratio (MLR), indicating immune network dysregulation; and albumin (ALB) and uric acid (UA), representing early endothelial function and nutritional-metabolic status. Theoretically, these indices can objectively delineate the pathophysiological trajectory of disease progression across multiple dimensions [11].

Previously, Özdemir et al. systematically evaluated the diagnostic efficacy of routine haematological parameters in predicting visceral involvement in pediatric IgAV using traditional logistic regression. Their findings demonstrated that elevated peripheral monocyte counts, the neutrophil-to-lymphocyte ratio (NLR), and CRP levels are independent risk factors for acute organ involvement [12]. However, these high-dimensional clinical features frequently exhibit complex multicollinearity and nonlinear interaction effects, which traditional statistical models predicated on linear assumptions struggle to address adequately [13, 14]. In recent years, machine learning (ML) algorithms, particularly tree-based ensemble methods such as Extreme Gradient Boosting (XGBoost), have demonstrated substantial advantages in handling high-dimensional, collinear data and capturing higher-order nonlinear associations among variables [15, 16]. Concurrently, the Shapley Additive exPlanations (SHAP) framework mitigates the inherent "black-box" limitations of complex algorithms. It not only quantifies the global importance of individual features but also delineates nonlinear risk trajectories and potential threshold effects associated with deviations of continuous variables from physiological homeostasis [17, 18].

Building on this rationale, the present study aimed to develop and optimise a machine learning model based on routine peripheral blood inflammatory and metabolic indices to predict progression from IgAV to IgAVN in pediatric patients. Furthermore, we employed the SHAP framework to elucidate the pathophysiological contributions of core biomarkers to renal microvascular injury. Ultimately, this research seeks to provide an efficient primary screening tool for outpatient and emergency settings, facilitating individualised risk stratification and contributing evidence to guide clinical decision-making.
Methods

Study Design and Population

In this retrospective study, we initially screened 684 pediatric patients diagnosed with IgAV who presented to the Departments of Pediatrics at Siyang Hospital and Shanxian Central Hospital between June 2022 and August 2025. After applying the predefined exclusion criteria, 175 patients were excluded. Ultimately, 509 eligible patients (213 in the progression group and 296 in the non-progression group) were enrolled for subsequent feature extraction and model construction. This study was conducted in adherence to the principles of the Declaration of Helsinki and was approved by the respective Institutional Review Boards (Ethics Approval No. KS2026002).

All included patients were aged ≤ 18 years at initial presentation, and the diagnosis of IgAV was established according to the 2010 joint classification criteria endorsed by the European League Against Rheumatism (EULAR), the Paediatric Rheumatology International Trials Organisation (PRINTO), and the Paediatric Rheumatology European Society (PRES) [19]. The primary outcome was defined as progression to IgAVN within 6 months of disease onset. In accordance with the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines [20], incident IgAVN was defined by the persistent presence of any of the following criteria during the follow-up period: (1) a random urinary albumin-to-creatinine ratio (ACR) > 30 mg/g, a urinary protein-to-creatinine ratio (UPCR) > 0.2 mg/mg, or a qualitative proteinuria of ≥ 1+ on two consecutive routine urinalyses; (2) microscopic hematuria, defined as > 5 red blood cells per high-power field (HPF) on two consecutive examinations, or the presence of macroscopic hematuria; or (3) biopsy-proven IgAVN.
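The three-part outcome definition above can be read as a simple rule over a patient's follow-up urinalyses. The following is a minimal sketch of that reading, not the authors' labelling code. The `UrineVisit` record, its field names, and the function names are hypothetical. It also assumes that the "two consecutive" requirement applies to all three proteinuria measures and that the visit list has already been restricted to the 6-month follow-up window; the text leaves both points open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UrineVisit:
    """One follow-up urinalysis (hypothetical record; None means not measured)."""
    acr_mg_per_g: Optional[float] = None        # albumin-to-creatinine ratio
    upcr_mg_per_mg: Optional[float] = None      # protein-to-creatinine ratio
    dipstick_protein_plus: Optional[int] = None  # 0 = negative, 1 = "1+", ...
    rbc_per_hpf: Optional[float] = None         # red blood cells per HPF
    macroscopic_hematuria: bool = False


def has_proteinuria(v: UrineVisit) -> bool:
    # Criterion (1): ACR > 30 mg/g, UPCR > 0.2 mg/mg, or dipstick >= 1+.
    return (
        (v.acr_mg_per_g is not None and v.acr_mg_per_g > 30)
        or (v.upcr_mg_per_mg is not None and v.upcr_mg_per_mg > 0.2)
        or (v.dipstick_protein_plus is not None and v.dipstick_protein_plus >= 1)
    )


def has_microhematuria(v: UrineVisit) -> bool:
    # Criterion (2): more than 5 red blood cells per high-power field.
    return v.rbc_per_hpf is not None and v.rbc_per_hpf > 5


def two_consecutive(flags: List[bool]) -> bool:
    # True if any two adjacent visits are both positive.
    return any(a and b for a, b in zip(flags, flags[1:]))


def meets_igavn_definition(visits: List[UrineVisit], biopsy_proven: bool = False) -> bool:
    """Apply criteria (1)-(3) to visits assumed to lie within 6 months of onset."""
    if biopsy_proven:                                     # criterion (3)
        return True
    if any(v.macroscopic_hematuria for v in visits):      # criterion (2), gross
        return True
    return (
        two_consecutive([has_proteinuria(v) for v in visits])
        or two_consecutive([has_microhematuria(v) for v in visits])
    )
```

Under these assumptions, a patient with microscopic hematuria on two non-adjacent visits would not be labelled as IgAVN, which is one place where a stricter or looser reading of "persistent" would change the outcome label.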
To mitigate the influence of potential confounders on baseline peripheral blood immunological and metabolic profiles, patients were excluded if they met any of the following criteria: (1) a prior definitive diagnosis of primary nephrotic syndrome, IgA nephropathy (IgAN), or other secondary renal diseases; (2) concomitant systemic autoimmune disorders, such as systemic lupus erythematosus (SLE) or antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis; (3) the presence of severe acute infections at admission (e.g., sepsis or severe pneumonia) that could significantly alter the systemic inflammatory burden; or (4) exposure to systemic high-dose glucocorticoids or immunosuppressive therapies before baseline blood sample collection.

Data Selection and Preprocessing

We systematically reviewed the electronic medical records of the two participating centres to extract baseline data obtained within 24 hours of the initial diagnosis of IgAV, strictly before the administration of any systemic high-dose glucocorticoids or immunosuppressive therapies. Initial candidate features comprised data routinely available in outpatient or emergency settings: demographic data, prodromal infection-related parameters, biochemical indices, immunological profiles, coagulation function metrics, complete blood count (CBC) parameters, and acute-phase reactants.
Specific variables included age, sex, alanine aminotransferase (ALT), alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT), total protein (TP), ALB, urea, creatinine (CREA), UA, total carbon dioxide (TCO2), immunoglobulins (IgA, IgG, IgM), complement components (C3, C4), lactate dehydrogenase (LDH), antistreptolysin O (ASO), cystatin C (Cys-C), cholylglycine (CG), ischemia-modified albumin (IMA), prothrombin time (PT), activated partial thromboplastin time (APTT), fibrinogen (Fbg), thrombin time (TT), D-dimer (DD), white blood cell (WBC) count, platelet (PLT) count, CRP, total bile acid (TBA), adenosine deaminase (ADA), glutamate dehydrogenase (GLDH), homocysteine (HCY), non-esterified fatty acids (NEFA), and sialic acid (SA). To further quantify systemic immune-inflammatory homeostasis in pediatric patients with IgAV, four composite indices were derived from primary blood cell counts and inflammatory markers: MLR, CRP-to-lymphocyte ratio (CLR), neutrophil-monocyte-to-lymphocyte ratio (NMLR), and IBI.

During the machine learning modelling phase, all data preprocessing steps were fitted exclusively on the training set to prevent data leakage. For continuous variables with a missingness rate of less than 20%, imputation was performed using the k-nearest neighbours (KNN) algorithm to preserve the original feature distribution as far as possible. To mitigate the impact of differences in scale and magnitude among parameters on model training, all continuous features underwent Z-score standardisation (mean = 0, standard deviation = 1), and categorical variables were transformed via one-hot encoding. Applying this uniform standardisation protocol to the data from both centres was intended to reduce potential batch effects and systematic baseline variations stemming from divergent laboratory assays, improving the homogeneity and comparability of the pooled data.
Following these procedures, a final feature matrix was constructed for subsequent feature selection, machine learning model training, performance evaluation, and SHAP interpretability analysis.

Machine Learning Model Construction

To identify the optimal algorithm for predicting the progression from IgAV to IgAVN, the 509 eligible patients were randomly partitioned into a training set (n = 356) and an independent validation set (n = 153) at a 7:3 ratio using a stratified sampling strategy. This stratification ensured that the proportional distribution of positive (progression to IgAVN) and negative outcomes remained consistent across both cohorts, supporting model training and reliable performance evaluation. To prevent data leakage, all data preprocessing steps (missing-value imputation, standardisation, categorical-variable encoding, and feature selection) were fitted exclusively on the training set, and the resulting transformation parameters were subsequently applied to the validation set. The optimal hyperparameter configurations for each evaluated model are detailed in Table 1.
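The split and leakage-free preprocessing described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the data are synthetic, the feature count, the number of neighbours for KNN imputation (`n_neighbors=5`), and the random seeds are arbitrary choices that the text does not report, and one-hot encoding and feature selection are omitted for brevity.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort: 509 patients, 213 with the outcome.
rng = np.random.default_rng(0)
n_patients, n_features = 509, 5          # feature count is illustrative only
X = rng.normal(loc=50.0, scale=10.0, size=(n_patients, n_features))
X[rng.random(X.shape) < 0.05] = np.nan   # sprinkle in ~5% missing values
y = np.zeros(n_patients, dtype=int)
y[:213] = 1

# Stratified 7:3 split keeps the outcome proportion similar in both sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Imputation and Z-score parameters are learned from the training set only.
preprocess = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),   # k is an assumption
    ("scale", StandardScaler()),
])
X_train_std = preprocess.fit_transform(X_train)

# The validation set is transformed with the training-set parameters.
X_val_std = preprocess.transform(X_val)

print(X_train.shape[0], X_val.shape[0])
```

With `test_size=0.3`, scikit-learn rounds the validation size up, so 509 patients split into 356 and 153, matching the sizes reported above. Only `fit_transform` on the training data estimates parameters; calling `transform` on the validation data is what keeps validation information out of the fitted preprocessing.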
Table 1. Optimal hyperparameter combinations for the machine learning models

| Model | Optimal hyperparameter combination |
|---|---|
| Decision tree (DT) | ccp_alpha = 0.0; max_depth = 10; max_features = None; min_samples_split = 20 |
| Random forest (RF) | n_estimators = 50; max_features = 2 |
| Extreme Gradient Boosting (XGBoost) | learning_rate = 0.1; max_depth = 3; n_estimators = 100; subsample = 0.6 |
| Light Gradient Boosting Machine (LightGBM) | colsample_bytree = 0.6; learning_rate = 0.1; n_estimators = 100; num_leaves = 31; subsample = 0.6 |
| Support vector machine (SVM) | C = 0.1; degree = 2; gamma = scale; kernel = linear |
| Artificial neural network (ANN) | activation = relu; hidden_layer_sizes = (50,) |

During the model development phase, seven supervised classification algorithms, spanning traditional linear statistical models to complex tree-based ensembles and nonlinear networks, were trained in parallel and systematically evaluated on the training set. These algorithms included logistic regression (LR), decision tree (DT), random forest (RF), XGBoost, light gradient boosting machine (LightGBM), support vector machine (SVM), and artificial neural network (ANN). To mitigate the risk of overfitting and optimise predictive performance, hyperparameter tuning for the tree-based models, SVM, and ANN was conducted using a grid search with 10-fold cross-validation. The LR model was constructed using default parameters. Finally, the models with their optimised hyperparameters were applied to the independent validation set to assess their generalisation performance.

Evaluation of Model Performance and Clinical Utility

To comprehensively evaluate the predictive performance of the machine learning models in the independent validation set, we established a multidimensional evaluation framework.
Initially, the overall discrimination of each model was quantified by plotting receiver operating characteristic (ROC) curves and calculating the area under the curve (AUC) with corresponding 95% confidence intervals (CIs). Subsequently, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), Youden's J index, and Kappa value were calculated. These metrics, in conjunction with the F1 score, were used to systematically compare the classification performance of the various algorithms in identifying high-risk patients and ruling out low-risk cases. Following confirmation of discriminative ability, the models' goodness of fit and clinical translational potential were further assessed. Calibration curves, coupled with Brier scores, were used to visualise and quantify the agreement between predicted probabilities and the observed incidence of IgAVN. Furthermore, to address the issue of intervention thresholds in pediatric clinical practice, decision curve analysis (DCA) was used to calculate the clinical net benefit across a wide range of threshold probabilities. This analysis evaluated the practical utility of the models in balancing the risks of overtreatment (e.g., unnecessary immunosuppressive exposure) against those of under-intervention (e.g., missing the optimal therapeutic window to arrest disease progression). Together, these analyses allowed a comparison of the overall performance, robustness, and clinical feasibility of each model as the basis for selecting the optimal predictive tool.

Statistical Analysis and Model Interpretation

All baseline statistical analyses were conducted using R software (version 4.2.2). Normally distributed continuous variables were expressed as the mean ± standard deviation (SD) and compared using the independent-samples t-test. Non-normally distributed continuous variables were presented as medians and interquartile ranges (IQRs) and compared using the Wilcoxon rank-sum test. Categorical variables were summarised as frequencies and proportions (n, %) and compared using the Pearson chi-square test or Fisher's exact test, as appropriate. Statistical significance was defined as a two-sided P < 0.05.

Machine learning model development and interpretability analyses were implemented in Python (version 3.10.4). Following a comprehensive evaluation of discrimination, classification performance, calibration, and clinical net benefit across all models in the validation set, the XGBoost model, which yielded the best overall performance, was selected for SHAP analysis. Specifically, a SHAP bar plot was used to identify the core predictive features. A SHAP beeswarm plot was then employed to visualise the directional impact of individual feature values on the risk of IgAVN. Furthermore, SHAP dependence plots were constructed to quantify nonlinear associations and potential risk-threshold effects of the core continuous variables.

Results

Patient characteristics

A total of 509 pediatric patients were included in this study, comprising 296 in the non-IgAVN group and 213 in the IgAVN group. Patients in the IgAVN group were significantly older than those in the non-IgAVN group (P = 0.021). As detailed in Table 2, the IgAVN group exhibited significantly higher levels of ALT, urea, CREA, UA, ASO, Cys-C, Fbg, WBC, MLR, NMLR, CLR, IBI, PLT, and CRP, alongside significantly lower ALB levels, compared with the non-IgAVN group (all P < 0.05). Furthermore, both PT and APTT were significantly shorter in the IgAVN group (P = 0.001 for both). No statistically significant differences were observed between the two groups regarding sex distribution or the remaining baseline parameters (all P > 0.05).

Table 2. Baseline characteristics.
| Variables | Total (N = 509) | Non-IgAVN (N = 296) | IgAVN (N = 213) | P-value |
|---|---|---|---|---|
| Sex (%) | | | | 0.216 |
| Female | 250 (49.12) | 138 (46.62) | 112 (52.58) | |
| Male | 259 (50.88) | 158 (53.38) | 101 (47.42) | |
| Age (years) | 8.00 [7.00, 11.00] | 8.00 [7.00, 10.00] | 9.00 [7.00, 11.00] | 0.021 |
| ALT (U/L) | 10.00 [9.00, 11.00] | 9.00 [8.00, 10.00] | 10.00 [9.00, 11.00] | < 0.001 |
| ALP (U/L) | 182.00 [142.00, 224.00] | 182.00 [147.00, 226.25] | 179.00 [138.00, 217.00] | 0.225 |
| GGT (U/L) | 16.00 [12.00, 19.00] | 16.00 [12.00, 19.00] | 15.00 [11.00, 19.00] | 0.710 |
| TP (g/L) | 68.77 [65.48, 71.52] | 68.83 [65.56, 71.37] | 68.74 [65.37, 71.68] | 0.971 |
| ALB (g/L) | 39.69 [38.44, 41.76] | 40.27 [38.88, 42.29] | 38.91 [36.72, 40.11] | < 0.001 |
| UREA (mmol/L) | 3.59 ± 0.24 | 3.52 ± 0.22 | 3.69 ± 0.23 | < 0.001 |
| CREA (µmol/L) | 35.49 [33.70, 37.21] | 35.01 [33.15, 36.47] | 36.42 [34.58, 38.59] | < 0.001 |
| UA (µmol/L) | 238.00 [217.00, 257.00] | 230.50 [208.00, 250.25] | 249.00 [230.00, 275.00] | < 0.001 |
| TCO2 (mmol/L) | 22.27 [21.23, 23.84] | 22.29 [21.08, 23.96] | 22.20 [21.26, 23.83] | 0.834 |
| IgA (g/L) | 1.72 [1.65, 1.79] | 1.72 [1.64, 1.78] | 1.73 [1.66, 1.80] | 0.091 |
| IgG (g/L) | 10.87 [9.38, 12.23] | 10.91 [9.43, 12.19] | 10.80 [9.23, 12.40] | 0.977 |
| IgM (g/L) | 1.31 [1.13, 1.59] | 1.32 [1.09, 1.59] | 1.28 [1.15, 1.58] | 0.968 |
| C3 (g/L) | 1.37 [1.29, 1.46] | 1.37 [1.29, 1.46] | 1.38 [1.29, 1.46] | 0.982 |
| C4 (g/L) | 0.33 [0.26, 0.38] | 0.33 [0.26, 0.37] | 0.33 [0.28, 0.38] | 0.475 |
| LDH (U/L) | 181.00 [172.00, 190.00] | 181.00 [173.00, 190.25] | 181.00 [172.00, 190.00] | 0.360 |
| ASO (IU/mL) | 30.55 [23.48, 38.40] | 26.06 [20.70, 34.58] | 33.32 [26.35, 41.45] | < 0.001 |
| Cys-C (mg/L) | 0.81 ± 0.05 | 0.80 ± 0.04 | 0.83 ± 0.04 | < 0.001 |
| CG (µg/mL) | 1.02 [0.78, 1.19] | 1.02 [0.79, 1.22] | 1.02 [0.78, 1.17] | 0.430 |
| IMA (U/mL) | 70.64 [70.38, 70.94] | 70.65 [70.38, 70.98] | 70.62 [70.38, 70.92] | 0.708 |
| PT (s) | 11.47 [11.39, 11.56] | 11.48 [11.41, 11.58] | 11.45 [11.38, 11.54] | 0.001 |
| APTT (s) | 30.15 [29.89, 30.51] | 30.20 [29.92, 30.58] | 30.10 [29.82, 30.41] | 0.001 |
| Fbg (g/L) | 2.46 [2.36, 2.59] | 2.42 [2.34, 2.49] | 2.57 [2.43, 2.71] | < 0.001 |
| TT (s) | 14.89 [14.14, 16.08] | 14.89 [14.16, 16.08] | 14.87 [14.11, 16.07] | 0.938 |
| DD (mg/L) | 0.25 ± 0.03 | 0.25 ± 0.03 | 0.25 ± 0.03 | 0.154 |
| WBC (×10⁹/L) | 6.02 [5.69, 6.44] | 5.82 [5.55, 6.12] | 6.37 [5.96, 6.85] | < 0.001 |
| MLR | 0.15 [0.12, 0.20] | 0.13 [0.11, 0.17] | 0.19 [0.15, 0.23] | < 0.001 |
| NMLR | 3.65 [2.58, 4.49] | 3.34 [2.07, 3.96] | 4.16 [3.51, 5.32] | < 0.001 |
| CLR | 1.78 [1.36, 2.67] | 1.65 [1.18, 2.54] | 2.26 [1.51, 2.80] | < 0.001 |
| IBI | 6.41 [4.94, 8.66] | 5.30 [4.25, 7.44] | 8.02 [5.93, 10.25] | < 0.001 |
| PLT (×10⁹/L) | 230.99 ± 11.91 | 228.29 ± 10.51 | 234.74 ± 12.72 | < 0.001 |
| CRP (mg/L) | 3.14 [2.04, 4.82] | 2.48 [1.82, 3.52] | 4.49 [2.82, 6.02] | < 0.001 |
| TBA (µmol/L) | 4.81 [3.28, 6.20] | 4.79 [3.27, 6.18] | 4.84 [3.32, 6.24] | 0.822 |
| ADA (U/L) | 11.30 [9.93, 13.23] | 11.23 [9.86, 13.23] | 11.37 [10.02, 13.23] | 0.654 |
| GLDH (U/L) | 2.11 [1.68, 2.57] | 2.18 [1.70, 2.57] | 2.06 [1.66, 2.55] | 0.160 |
| HCY (µmol/L) | 7.19 [6.50, 7.83] | 7.16 [6.44, 7.79] | 7.21 [6.56, 7.91] | 0.495 |
| NEFA (mmol/L) | 0.50 [0.40, 0.65] | 0.50 [0.40, 0.65] | 0.50 [0.41, 0.64] | 0.946 |
| SA (mg/dL) | 53.87 [50.45, 59.81] | 53.95 [50.74, 59.94] | 53.74 [49.94, 59.50] | 0.393 |

Values are n (%), median [interquartile range], or mean ± standard deviation. ALT, alanine aminotransferase; ALP, alkaline phosphatase; GGT, gamma-glutamyl transferase; TP, total protein; ALB, albumin; UREA, urea; CREA, creatinine; UA, uric acid; TCO2, total carbon dioxide; IgA, immunoglobulin A; IgG, immunoglobulin G; IgM, immunoglobulin M; C3, complement 3; C4, complement 4; LDH, lactate dehydrogenase; ASO, antistreptolysin O; Cys-C, cystatin C; CG, cholylglycine; IMA, ischemia-modified albumin; PT, prothrombin time; APTT, activated partial thromboplastin time; Fbg, fibrinogen; TT, thrombin time; DD, D-dimer; WBC, white blood cell count; MLR, monocyte-to-lymphocyte ratio; NMLR, neutrophil-monocyte-to-lymphocyte ratio; CLR, C-reactive protein-to-lymphocyte ratio; IBI, inflammatory burden index; PLT, platelet count; CRP, C-reactive protein; TBA, total bile acid; ADA, adenosine deaminase; GLDH, glutamate dehydrogenase; HCY, homocysteine; NEFA, non-esterified fatty acid; SA, sialic acid.

Selection of Modelling Variables

To further identify key features associated with the incidence of IgAVN, the least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm were independently applied for feature selection in the training set. LASSO regression, coupled with 10-fold cross-validation, retained 18 candidate variables at the penalty parameter corresponding to the 1-standard-error (1-SE) criterion (Fig. 1A). The coefficient path plot further demonstrated that as the penalisation parameter increased, the regression coefficients for the respective variables shrank continuously, ultimately approaching zero (Fig. 1B). Concurrently, the Boruta algorithm identified 17 variables, all of which were classified as confirmed features (Fig. 1C-D). The intersection of the results from both feature selection methods yielded 12 consensus variables: ALT, ALB, urea, UA, ASO, APTT, MLR, NMLR, CLR, IBI, PLT, and CRP. These shared features were incorporated into the development of the downstream machine-learning predictive models (Fig. 1E).

Model development and performance assessment

Using the 12 selected feature variables, we developed seven predictive models (LR, DT, RF, XGBoost, LightGBM, SVM, and ANN) and systematically compared their performance in the independent validation set.
All seven models showed good predictive capability, with XGBoost and LightGBM demonstrating the best overall performance. Although the LightGBM model achieved the highest AUC of 0.967 (95% CI: 0.941–0.988), the XGBoost model (AUC: 0.966, 95% CI: 0.935–0.990) achieved the highest accuracy (0.907), with high sensitivity (0.921) and specificity (0.898). Furthermore, XGBoost achieved an F1 score of 0.892, a Kappa value of 0.811, and a Youden's J index of 0.818, indicating the best comprehensive classification performance. The RF model also performed well, with an AUC of 0.952 (95% CI: 0.914–0.984) and a specificity of 0.909. In contrast, the overall predictive performance of the DT, SVM, and ANN models was relatively inferior (Table 3).

Table 3. Comparative analysis of the performance outcomes across machine learning models.

| Model | AUC | Accuracy | Precision | Sensitivity | Specificity | F1 Score | Kappa | Youden's J | PPV | NPV |
|---|---|---|---|---|---|---|---|---|---|---|
| Logistic | 0.934 | 0.868 | 0.803 | 0.905 | 0.841 | 0.851 | 0.732 | 0.746 | 0.803 | 0.925 |
| Decision Tree | 0.885 | 0.854 | 0.797 | 0.873 | 0.841 | 0.833 | 0.704 | 0.714 | 0.797 | 0.902 |
| Random Forest | 0.952 | 0.887 | 0.871 | 0.857 | 0.909 | 0.864 | 0.768 | 0.766 | 0.871 | 0.899 |
| XGBoost | 0.966 | 0.907 | 0.866 | 0.921 | 0.898 | 0.892 | 0.811 | 0.818 | 0.866 | 0.94 |
| LightGBM | 0.967 | 0.894 | 0.841 | 0.921 | 0.875 | 0.879 | 0.785 | 0.796 | 0.841 | 0.939 |
| SVM | 0.93 | 0.841 | 0.791 | 0.841 | 0.841 | 0.815 | 0.676 | 0.682 | 0.791 | 0.881 |
| ANN | 0.923 | 0.841 | 0.82 | 0.794 | 0.875 | 0.806 | 0.672 | 0.669 | 0.82 | 0.856 |

XGBoost, Extreme Gradient Boosting; LightGBM, Light Gradient Boosting Machine; SVM, Support Vector Machine; ANN, Artificial Neural Network; PPV, Positive Predictive Value; NPV, Negative Predictive Value; AUC, Area Under the Curve.
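The threshold-dependent entries in the XGBoost row of Table 3 can be reproduced from the validation-set confusion counts reported for that model (Fig. 2D): 58 true positives, 5 false negatives, 79 true negatives, and 9 false positives. The sketch below uses the standard textbook formulas; the function name `confusion_metrics` is illustrative. Note that these four counts sum to 151 rather than the 153 stated for the validation set; the example simply takes the counts as reported.

```python
def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from a 2x2 confusion matrix."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)            # identical to precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn)
    youden_j = sensitivity + specificity - 1
    # Cohen's kappa: agreement beyond what the marginal totals give by chance.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "youden_j": youden_j,
        "kappa": kappa,
    }


# Counts reported for XGBoost in the validation set.
xgb = confusion_metrics(tp=58, fn=5, tn=79, fp=9)
for name, value in xgb.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, the output agrees with the tabulated accuracy (0.907), sensitivity (0.921), specificity (0.898), PPV (0.866), NPV (0.940), F1 score (0.892), Youden's J (0.818), and Kappa (0.811). The AUC is not recoverable this way because it depends on the full ranking of predicted probabilities.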
Confusion matrices from the validation set further confirmed that all models could discriminate between patients with and without IgAVN (Fig. 2A-G). Notably, the XGBoost model correctly identified 58 patients with IgAVN and 79 patients in the non-IgAVN group, misclassifying only 5 and 9 cases, respectively; this overall classification outcome was superior to those of the other models (Fig. 2D). The LightGBM model also performed well, accurately identifying 58 IgAVN cases and 77 non-IgAVN cases (Fig. 2E). The RF model showed a strong capacity to recognise non-IgAVN patients, correctly classifying 80 such cases, reflecting its higher specificity (Fig. 2C).

ROC curve analysis revealed that all seven models maintained good discriminative ability in the validation set, with XGBoost and LightGBM having the largest areas under the curve (Fig. 3A). Calibration curves indicated general concordance between the predicted probabilities and the observed incidences across the models, with XGBoost, LightGBM, and LR demonstrating superior calibration performance (Fig. 3B). DCA results showed that all models yielded a net clinical benefit across most threshold probabilities; however, the XGBoost model achieved a higher overall net benefit (Fig. 3C). Considering discriminative ability, classification performance, calibration, and DCA results together, XGBoost exhibited a more balanced and robust profile across accuracy, sensitivity, F1 score, Kappa value, Youden's J index, and overall net benefit. Consequently, XGBoost was selected as the optimal model for subsequent interpretability analysis and potential clinical deployment.

Interpretability analysis of the model

To quantify the contribution of each predictive variable to the model output, SHAP analysis was applied to the optimal XGBoost model.
IBI, CRP, and MLR exhibited the highest mean absolute SHAP values (all 0.11), followed in order by ASO (0.06), NMLR (0.06), ALB (0.05), CLR (0.04), UA (0.03), urea (0.02), ALT (0.02), APTT (0.01), and PLT (0.01) (Fig. 4A). The SHAP beeswarm plot revealed that elevated levels of IBI, CRP, MLR, ASO, NMLR, CLR, UA, urea, and ALT were generally associated with an increased risk of incident IgAVN. In contrast, higher levels of ALB and APTT were associated with a reduced risk (Fig. 4B). At the individual level, SHAP waterfall plots for specific patients showed that the directional contributions and effect magnitudes of identical features differed substantially between the non-IgAVN and IgAVN outcomes, with MLR, ASO, CRP, and NMLR emerging as the main variables driving the model's predictions (Fig. 4C-D).

Furthermore, SHAP dependence plots indicated pervasive nonlinear relationships between individual variables and the model output (Fig. 5). At lower levels, parameters including ALT, urea, UA, ASO, MLR, NMLR, CLR, IBI, and CRP exerted relatively limited impacts on the model output. However, as these indices increased, their corresponding SHAP values progressively increased, transitioning from negative to positive or showing continuous increases within the moderate-to-high ranges. Conversely, an increase in ALB levels corresponded with an overall decline in its SHAP value. APTT exhibited a similar negative trajectory, whereas PLT showed nonlinear fluctuations.

Web application deployment

To enhance the clinical accessibility and practical utility of the optimal XGBoost model, we deployed it as an online web-based prediction system (Fig. 6). Titled "Kidney Injury Risk Prediction System," this platform integrates individualised risk prediction and personalised SHAP explanation functionalities for specific pediatric patients.
Upon entering the required clinical parameters (ALT, ALB, urea, UA, ASO, APTT, MLR, NMLR, CLR, IBI, PLT, and CRP), users can click "Run Prediction" to obtain the corresponding predicted probability. The system also generates a personalised SHAP explanation that visualises the predicted outcome and its key driving factors. In addition, the webpage includes a dedicated results panel and usage instructions, explicitly cautioning that the tool is intended strictly for scientific research and adjunctive evaluation, rather than as a substitute for professional clinical judgment. The online prediction system is freely accessible at: https://predictinglymphnodemetastasisingastriccancer.shinyapps.io/shiny_kidney_app/. These results demonstrate the successful online deployment of the proposed XGBoost model, providing a convenient tool for individualised risk assessment of IgAVN.

Discussion

In pediatric outpatient and emergency settings, the early progression from IgAV to IgAVN often lacks definitive, specific clinical manifestations at initial presentation. The insidious nature of disease evolution, coupled with pronounced clinical heterogeneity, predisposes high-risk patients to miss the critical window for early identification and timely intervention. It also increases the risk of unnecessary exposure to immunosuppressive therapies among low-risk patients with a self-limiting disease course [21]. Given the invasive nature of renal biopsy and the inherent temporal lag of routine urinalysis in detecting early renal involvement, noninvasive, objective, and clinically feasible early predictive strategies are of considerable practical value. Driven by this clinical imperative, the present study developed and validated multiple machine-learning predictive models using routine peripheral blood parameters readily available at initial presentation.
Our findings revealed that the XGBoost model demonstrated strong discrimination and calibration within the independent validation set, supporting its stability and generalisation potential. Furthermore, subsequent SHAP interpretability analysis indicated that composite inflammatory burden indices (specifically IBI and MLR), alongside ALB and ASO, serve as pivotal predictors associated with the progression from IgAV to IgAVN.

Historically, risk assessment for IgAVN has predominantly relied upon traditional clinical scoring systems or statistical models such as logistic regression [22, 23]. However, the onset and progression of pediatric IgAV involve complex immune-inflammatory responses and metabolic remodelling processes. Consequently, routine peripheral blood indices are rarely independent; instead, they frequently exhibit substantial multicollinearity and higher-order nonlinear interactions [24–26]. Employing a two-sample Mendelian randomisation approach, Xie et al. systematically analysed the causal associations among IgAV, immune cells, metabolites, and inflammatory cytokines, revealing that the pathogenesis of IgAV is not a solitary inflammatory event, but rather a complex biological cascade co-orchestrated by immune-inflammatory responses and metabolic networks [27]. Within a modelling framework predicated on linear assumptions, such intricate relationships are difficult to capture adequately, potentially impairing the model's capacity to extract key information and, in turn, compromising predictive accuracy. Our findings demonstrate that the XGBoost algorithm, leveraging a tree-based ensemble strategy, can effectively capture complex nonlinear association patterns among features without requiring the a priori manual elimination of collinear variables.
Compared with the traditional logistic regression model, XGBoost exhibited superior overall predictive performance while achieving a better balance between sensitivity and specificity. Furthermore, its lower Brier score indicates that the model not only possesses robust discriminative capacity but also generates individualised predicted probabilities that are highly concordant with the actual observed risk.

By leveraging the capacity of tree-based models to capture high-dimensional interactions, our SHAP analysis not only enhances model interpretability but also delineates the critical pathological sequence of IgAV: triggering by prodromal infections, amplification via immune dysregulation, and culmination in microvascular endothelial injury. The high predictive importance of ASO supports the classical hypothesis that prodromal infections (e.g., Streptococcus) induce aberrant immune responses and pathogenic IgA1 production, thereby facilitating immune complex deposition [28]. Furthermore, the prominence of composite inflammatory indices, such as IBI and MLR, suggests that systemic inflammatory activation is a primary driver of disease progression. Mechanistically, this aligns with Li et al., who demonstrated that IgA1-containing immune complexes activate neutrophils via FcαRI, precipitating endothelial damage. This highlights the vital role of innate immune hyperactivation in mesangial injury and systemic microvascular destruction [29]. Notably, ALB demonstrated a strong inverse association with IgAVN risk in the SHAP analysis. This suggests that subtle albumin leakage or depletion resulting from early systemic endothelial barrier impairment may act as an early warning sign of disease progression before overt target-organ damage (e.g., massive proteinuria) manifests.
This finding is consistent with Shen et al., who identified baseline albumin as an independent predictor of renal involvement in pediatric IgAV, implying that an early decline in ALB reflects a latent risk of IgAVN [30]. Finally, SHAP dependence plots revealed nonlinear associations and distinct threshold effects between these key predictors and IgAVN risk. The risk of disease progression escalates nonlinearly once the systemic inflammatory burden exceeds the body's compensatory limits or when an early reduction in ALB occurs. These threshold effects underscore the complex, phased nature of IgAVN pathogenesis, providing clinicians with actionable metrics to identify high-risk patients nearing physiological decompensation.

Crucially, the XGBoost model relies entirely on features derived from standardised laboratory tests (e.g., routine inflammatory markers, ALB, and composite immune indices) universally available in outpatient and emergency settings. This noninvasive, readily accessible biomarker panel mitigates the risks associated with early renal biopsy and overcomes the inherent temporal delay of routine urinalysis in detecting systemic microvascular inflammation [31, 32]. Furthermore, decision curve analysis (DCA) substantiated the model's clinical utility, demonstrating a superior net clinical benefit across a broad range of threshold probabilities compared with the default "treat-all" or "treat-none" strategies. This supports the model's value as a clinical decision-support tool. By providing frontline clinicians with an objective basis for early risk stratification, the model can help optimise the clinical management of pediatric IgAV. Particularly in resource-constrained environments, it helps clinicians strike a balance between promptly initiating targeted interventions for patients at high risk of renal progression and sparing low-risk patients from unnecessary immunosuppressive exposure.
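The net-benefit comparison behind decision curve analysis can be made concrete with a short calculation. The sketch below uses the standard net-benefit definition, NB = TP/N − (FP/N) × pt/(1 − pt), together with the XGBoost validation-set counts reported in the Results (58 IgAVN and 79 non-IgAVN cases classified correctly, 5 and 9 misclassified). The threshold probability of 0.5 is an assumption made here purely for illustration, and the helper names are hypothetical. A single confusion matrix gives only one point on a decision curve, not the full curve shown in Fig. 3C.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a decision rule at a given threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    # Odds of the threshold weight the harm of each false positive.
    odds = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * odds


def treat_all_net_benefit(n_pos, n_neg, threshold):
    """Net benefit when every patient is treated as high risk."""
    return net_benefit(n_pos, n_neg, n_pos + n_neg, threshold)


# XGBoost validation-set counts as reported in the Results.
TP, FN, TN, FP = 58, 5, 79, 9
N = TP + FN + TN + FP

# Assumed threshold probability, for illustration only.
threshold = 0.5

nb_model = net_benefit(TP, FP, N, threshold)
nb_all = treat_all_net_benefit(TP + FN, TN + FP, threshold)
nb_none = 0.0  # treating no one has zero net benefit by definition

print(f"model:      {nb_model:.3f}")
print(f"treat-all:  {nb_all:.3f}")
print(f"treat-none: {nb_none:.3f}")
```

Under this assumed threshold, the model's net benefit (49/151, about 0.325) exceeds both default strategies, while treating everyone gives a negative value (−25/151, about −0.166) because false positives then outweigh true positives. Repeating the calculation over a grid of thresholds, with the model's classifications recomputed at each one, is what produces a full decision curve.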
Despite the model's robust predictive performance and clinical promise, these findings must be interpreted with caution. First, the retrospective design and limited sources of data may introduce selection bias. Second, the current reliance on internal validation means that independent external cohorts are needed to rigorously verify the model's stability and generalisability across diverse geographic regions, healthcare tiers, and heterogeneous populations.

In summary, by integrating early routine peripheral blood features with the XGBoost algorithm and SHAP framework, this study systematically delineates the nonlinear risk trajectories of IgAV progression to renal involvement. The resulting model provides a noninvasive, objective, and actionable decision-support tool that may help frontline clinicians promptly identify high-risk pediatric patients, optimise risk stratification, and improve the efficiency of medical resource allocation.

Conclusion

This study developed an interpretable XGBoost model based on routine non-invasive peripheral blood parameters that enables accurate early identification of pediatric patients at high risk of progression from IgAV to IgAVN, while revealing nonlinear associations between key predictors and disease progression. This model provides a quantitative basis for early clinical risk stratification, with potential to optimise the timing of intervention and reduce unnecessary immunosuppressive exposure in low-risk children. Although further validation in large prospective cohorts is warranted, it shows promising potential as a visual decision-support tool for clinical translation.
Abbreviations

Abbreviation: Full term
ACR: albumin-to-creatinine ratio
ADA: adenosine deaminase
ALB: albumin
ALP: alkaline phosphatase
ALT: alanine aminotransferase
ANN: artificial neural network
CLR: C-reactive protein-to-lymphocyte ratio
HCY: homocysteine
HPF: high-power field
HSP: Henoch-Schönlein purpura
IBI: Inflammatory Burden Index
LDH: lactate dehydrogenase
LR: logistic regression
ML: machine learning
MLR: monocyte-to-lymphocyte ratio
NEFA: non-esterified fatty acids
NLR: neutrophil-to-lymphocyte ratio
NMLR: neutrophil-monocyte-lymphocyte derived ratio
PPV: positive predictive value
ROC: receiver operating characteristic
SA: sialic acid
SVM: support vector machine
TCO2: total carbon dioxide
TP: total protein
TT: thrombin time
UA: uric acid
UPCR: urinary protein-to-creatinine ratio
UREA: urea
XGBoost: eXtreme Gradient Boosting

Declarations

Ethics approval and consent to participate

All procedures involving human participants were conducted in accordance with the ethical standards of the institutional research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. Ethical approval for this retrospective study was obtained from the Medical Ethics Committee of Siyang Hospital (approval number: KS2026002). The requirement for informed consent to participate was waived by the Medical Ethics Committee of Siyang Hospital due to the retrospective nature of the study.

Consent for publication

Not applicable.

Availability of data and materials

The datasets used and/or analysed during the current study are not publicly available due to institutional and patient privacy restrictions but are available from the corresponding author on reasonable request.

Competing Interests

The authors declare they have no conflict of interest.

Funding

This study was jointly funded by Siyang Hospital and the Suqian Municipal Health Commission, as part of the 2024 Suqian Municipal Health Commission Medical Research Project (ZD202409).
Authors' contributions

Conceptualisation: Qingkai Wang, Hao Qiu, Weibing Qiu; Data curation: Liran Shen, Fachen Miao, Jinxing Dai; Formal analysis: Qingkai Wang, Hao Qiu, Jinxing Dai; Methodology: Kang Shen; Software: Qingkai Wang, Jinxing Dai; Supervision: Qingkai Wang, Yubiao Zhang, Liran Shen, Weibing Qiu; Writing - original draft: Jinxing Dai, Qingkai Wang; Writing - review & editing: Hao Qiu, Weibing Qiu. All authors have read and approved the manuscript.

Acknowledgements

None.

References

Castañeda S, Quiroga-Colina P, Floranes P, Uriarte-Ecenarro M, Valero-Martínez C, Vicente-Rabaneda EF, et al. IgA Vasculitis (Henoch-Schönlein Purpura): An Update on Treatment. J Clin Med. 2024;13(21). https://doi.org/10.3390/jcm13216621
Sestan M, Jelusic M. Diagnostic and Management Strategies of IgA Vasculitis Nephritis/Henoch-Schönlein Purpura Nephritis in Pediatric Patients: Current Perspectives. Pediatr Health Med Ther. 2023;14:89–98. https://doi.org/10.2147/phmt.S379862
Oni L, Platt C, Marlais M, McCann L, Barakat F, Hesseling M, et al. National recommendations for the management of children and young people with IgA vasculitis: a best available evidence, group agreement-based approach. Arch Dis Child. 2024;110(1):67–76. https://doi.org/10.1136/archdischild-2024-327364
Jelusic M, Sestan M, Giani T, Cimaz R. New Insights and Challenges Associated With IgA Vasculitis and IgA Vasculitis With Nephritis-Is It Time to Change the Paradigm of the Most Common Systemic Vasculitis in Childhood? Front Pediatr. 2022;10:853724. https://doi.org/10.3389/fped.2022.853724
Amatruda M, Carucci NS, Chimenz R, Conti G. Immunoglobulin A vasculitis nephritis: Current understanding of pathogenesis and treatment. World J Nephrol. 2023;12(4):82–92. https://doi.org/10.5527/wjn.v12.i4.82
Gage A, Pepper RJ, Marro J, Salama AD, Oni L. IgA Vasculitis Across the Ages: Is It Time for a Precision Medicine Approach? ACR Open Rheumatol. 2025;7(9):e70083. https://doi.org/10.1002/acr2.70083
Marro J, Chetwynd AJ, Wright RD, Dliso S, Oni L. Urinary Protein Array Analysis to Identify Key Inflammatory Markers in Children with IgA Vasculitis Nephritis. Children (Basel). 2022;9(5). https://doi.org/10.3390/children9050622
Zhang Q, Lai LY, Cai YY, Wang MJ, Ma G, Qi LW, et al. Serum-Urine Matched Metabolomics for Predicting Progression of Henoch-Schonlein Purpura Nephritis. Front Med (Lausanne). 2021;8:657073. https://doi.org/10.3389/fmed.2021.657073
Oni L, Sampath S. Childhood IgA Vasculitis (Henoch Schonlein Purpura)-Advances and Knowledge Gaps. Front Pediatr. 2019;7:257. https://doi.org/10.3389/fped.2019.00257
Xu S, Han S, Dai Y, Wang L, Zhang X, Ding Y. A Review of the Mechanism of Vascular Endothelial Injury in Immunoglobulin A Vasculitis. Front Physiol. 2022;13:833954. https://doi.org/10.3389/fphys.2022.833954
Bi Y, Quan W, Hao W, Sun R, Li L, Jiang C, et al. A simple nomogram for assessing the risk of IgA vasculitis nephritis in IgA vasculitis Asian pediatric patients. Sci Rep. 2022;12(1):16809. https://doi.org/10.1038/s41598-022-20369-3
Özdemir ZC, Çetin N, Kar YD, Öcal HO, Bilgin M, Bör Ö. Hematologic Indices for Predicting Internal Organ Involvement in Henoch-Schönlein Purpura (IgA vasculitis). J Pediatr Hematol Oncol. 2020;42(1):e46–9. https://doi.org/10.1097/mph.0000000000001571
Oh MY, Kim HS, Jung YM, Lee HC, Lee SB, Lee SM. Machine Learning-Based Explainable Automated Nonlinear Computation Scoring System for Health Score and an Application for Prediction of Perioperative Stroke: Retrospective Study. J Med Internet Res. 2025;27:e58021. https://doi.org/10.2196/58021
Lee S, Kisiel MA, Lindberg P, Wheelock ÅM, Olofsson A, Eriksson J, et al. Using machine learning involving diagnoses and medications as a risk prediction tool for post-acute sequelae of COVID-19 (PASC) in primary care. BMC Med. 2025;23(1):251. https://doi.org/10.1186/s12916-025-04050-w
Xie Y, Chen Y, Han Y, Zhai S, Xiao L, Yin D, et al. Identifying influencing factors associated with sleep quality in undergraduates based on partial least squares regression and XGBoost. Front Psychol. 2025;16:1732946. https://doi.org/10.3389/fpsyg.2025.1732946
Yang S, Cao L, Zhou Y, Hu C. A Retrospective Cohort Study: Predicting 90-Day Mortality for ICU Trauma Patients with a Machine Learning Algorithm Using XGBoost Using MIMIC-III Database. J Multidiscip Healthc. 2023;16:2625–40. https://doi.org/10.2147/jmdh.S416943
Luo S, Lai J, Mo L, Shen X, Fang R. Prediction of hospital mortality in sepsis-associated acute kidney injury using a machine-learning approach: a multicenter study using SHAP interpretability analysis. Clin Kidney J. 2026;19(1):sfaf372. https://doi.org/10.1093/ckj/sfaf372
Zhang C, Niu B, Wang R, Zhang L. From traditional metabolic markers to ensemble learning: comparative application of machine learning models for predicting NAFLD risk in adolescents. Front Endocrinol (Lausanne). 2025;16:1681686. https://doi.org/10.3389/fendo.2025.1681686
Ozen S, Pistorio A, Iusan SM, Bakkaloglu A, Herlin T, Brik R, et al. EULAR/PRINTO/PRES criteria for Henoch-Schönlein purpura, childhood polyarteritis nodosa, childhood Wegener granulomatosis and childhood Takayasu arteritis: Ankara 2008. Part II: Final classification criteria. Ann Rheum Dis. 2010;69(5):798–806. https://doi.org/10.1136/ard.2009.116657
KDIGO 2021 Clinical Practice Guideline for the Management of Glomerular Diseases. Kidney Int. 2021;100(4S):S1–S276. https://doi.org/10.1016/j.kint.2021.05.021
Ercan Emreol H, Yildirim-Toruner C, Jelusic M, Twilt M, Ozen S. New avenues in childhood vasculitis. Pediatr Rheumatol Online J. 2025;23(1):97. https://doi.org/10.1186/s12969-025-01149-5
Guo L, Zhu A, Li W, Zeng F, Wang F. Clinical prediction model for progression from henoch-schönlein purpura to nephritis in pediatric patients. Am J Transl Res. 2024;16(12):7385–95. https://doi.org/10.62347/xdor8531
Podraza Z, Poplicha K, Ufniarski T, Ucieklak J, Łysiak N, Mizerska-Wasiak M. Laboratory Findings and Clinical Features in IgA Vasculitis: Identifying Predictors of Kidney Involvement and Disease Relapse in Pediatric Patients. J Clin Med. 2025;14(9). https://doi.org/10.3390/jcm14093055
Chen G, Yang Z. Risk prediction for gastrointestinal bleeding in pediatric Henoch-Schönlein purpura using an interpretable transformer model. Front Physiol. 2025;16:1630807. https://doi.org/10.3389/fphys.2025.1630807
Pan M, Li M, Li N, Mao J. Predicting renal damage in children with IgA vasculitis by machine learning. Pediatr Nephrol. 2024;39(10):2997–3004. https://doi.org/10.1007/s00467-024-06432-3
Cao T, Zhu Y, Zhu Y. Construction of Prediction Model of Renal Damage in Children with Henoch-Schönlein Purpura Based on Machine Learning. Comput Math Methods Med. 2022;2022:6991218. https://doi.org/10.1155/2022/6991218
Xie M, Zhou N, Liang Q, Lin Z, Yao Y. Impact of Immune Cells on IgA Vasculitis via Metabolites and Inflammatory Cytokines. J Clin Immunol. 2025;45(1):157. https://doi.org/10.1007/s10875-025-01946-3
Held M, Kozmar A, Sestan M, Turudic D, Kifer N, Srsen S, et al. Insight into the Interplay of Gd-IgA1, HMGB1, RAGE and PCDH1 in IgA Vasculitis (IgAV). Int J Mol Sci. 2024;25(8). https://doi.org/10.3390/ijms25084383
Li Q, Wu H, Yuan X, Shi S, Liu L, Lv J, et al. FcαRI-mediated neutrophil activation contributed to the pathogenesis of adult IgA vasculitis with nephritis. Nephrol Dial Transpl. 2025. https://doi.org/10.1093/ndt/gfaf260
Shen L, Miao L, Xu L. Risk factors associated with renal injury in patients initially diagnosed with IgA vasculitis. Front Pediatr. 2025;13:1584768. https://doi.org/10.3389/fped.2025.1584768
Williams CEC, Ging H, Skoutelis N, Marro J, Roberts L, Chetwynd AJ, et al. Biomarkers to predict kidney outcomes in children with IgA vasculitis. Minerva Pediatr (Torino). 2025;77(3):256–71. https://doi.org/10.23736/s2724-5276.24.07715-2
Jin Y, He X, Lin W, Peng Z, Li W, Xiang W, et al. Serum cytokine profiles in children with IgA vasculitis with nephritis. Biomol Biomed. 2025;25(6):1425–43. https://doi.org/10.17305/bb.2024.11081

Additional Declarations

No competing interests reported.
Author affiliations

Shanxian Central Hospital: Qingkai Wang, Liran Shen, Fachen Miao, Yunbiao Zhang
Siyang Hospital: Hao Qiu, Jinxing Dai, Qianjin Shi, Kang Shen, Weibing Qiu (corresponding author)

Figure legends

Figure 1. Feature selection process for model construction. (A): Optimal penalty parameter λ identified by 10-fold cross-validation in the LASSO regression; (B): LASSO coefficient profiles plotted against log(λ); (C): Variable importance across 100 classifier runs in the Boruta algorithm; (D): Boxplots of feature importance generated by the Boruta algorithm; (E): Overlap analysis of variables selected by LASSO regression and the Boruta algorithm.

Figure 2. Confusion matrices of the seven models in the validation set. (A): Logistic; (B): Decision Tree; (C): Random Forest; (D): XGBoost; (E): LightGBM; (F): SVM; (G): ANN. XGBoost, Extreme Gradient Boosting; LightGBM, Light Gradient Boosting Machine; SVM, Support Vector Machine; ANN, Artificial Neural Network.

Figure 3. ROC curves, calibration curves, and decision curve analysis of the seven models in the validation set. (A): ROC curves of the seven models in the validation set; (B): Calibration curves of the seven models in the validation set; (C): Decision curve analysis of the seven models in the validation set.

Figure 4. SHAP interpretation of the XGBoost model. (A): Ranking of feature importance for the XGBoost model output; (B): SHAP beeswarm plot showing the relationships between feature values and model output; (C-D): SHAP waterfall plots of the same sample under different outcomes, C: no IgAVN (class0); D: IgAVN (class1).

Figure 5. SHAP dependence plots showing the nonlinear relationships between predictors and model output in the XGBoost model.

Figure 6. Interface of the online web-based prediction system. The web-based system integrates IgAVN risk prediction with individualised SHAP interpretation, allowing users to obtain the predicted probability and corresponding explanations after entering relevant clinical variables.
02:45:42","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":21143916,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9261325/v1/9c409265-a2af-46c2-ad2c-583285971fa2.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Explainable Machine Learning for Predicting Progression From IgA Vasculitis to IgA Vasculitis Nephritis in Children: A Dual-Centre Retrospective Study","fulltext":[{"header":"Introduction","content":"\u003cp\u003eIgA vasculitis (IgAV), formerly known as Henoch-Sch\u0026ouml;nlein purpura (HSP), is among the most common systemic small-vessel vasculitides in childhood, pathologically characterised by the predominant deposition of IgA1-dominant immune complexes \u003csup\u003e[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e, \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]\u003c/sup\u003e. Although the classic presentations of cutaneous purpura, as well as articular and gastrointestinal involvement, typically follow a self-limiting course, the long-term prognosis and overall disease burden in pediatric patients are largely dictated by the onset and severity of IgA vasculitis nephritis (IgAVN), the most severe form of target-organ damage \u003csup\u003e[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]\u003c/sup\u003e.In clinical practice, the onset and progression of IgAVN exhibit substantial clinical heterogeneity and temporal unpredictability. 
While some patients present merely with transient microscopic hematuria or mild proteinuria, others may experience an insidious yet rapid progression to crescentic glomerulonephritis during the early stages of the disease, ultimately culminating in end-stage renal disease (ESRD) \u003csup\u003e[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]\u003c/sup\u003e. This pronounced disparity in prognostic outcomes, coupled with the highly unpredictable course of disease, poses a significant challenge for clinicians in risk assessment at the time of initial IgAV diagnosis. Consequently, there is an urgent need to develop an early-warning tool that facilitates precise identification and risk-stratified management in the early phases of the disease.\u003c/p\u003e \u003cp\u003eCurrently, renal biopsy remains the gold standard for the definitive diagnosis and pathological grading of IgAVN; however, its invasive nature precludes its widespread application as a routine screening modality for pediatric patients presenting at outpatient or emergency settings \u003csup\u003e[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]\u003c/sup\u003e. In clinical settings, the identification of early renal involvement relies primarily on routine urinalysis, including monitoring for proteinuria and microscopic hematuria \u003csup\u003e[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]\u003c/sup\u003e. However, abnormalities in urinary parameters often exhibit a temporal lag; by the time definitive urinary abnormalities are clinically detected, glomerular microvascular injury and parenchymal lesions have typically already occurred \u003csup\u003e[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]\u003c/sup\u003e. 
These diagnostic limitations present substantial challenges for early intervention: high-risk patients with initially negative urinalysis results may miss the optimal therapeutic window to arrest inflammatory progression, whereas low-risk patients may be subjected to unnecessary exposure to glucocorticoids or immunosuppressive therapies.\u003c/p\u003e \u003cp\u003eGiven that the pathogenesis of IgAV inherently involves systemic immune dysregulation and pervasive microvascular inflammation, an evident cascade of inflammatory activation and early metabolic stress typically emerges in the peripheral blood of pediatric patients before the onset of irreversible structural renal damage \u003csup\u003e[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]\u003c/sup\u003e. Routine peripheral blood biomarkers, including IBI and CRP (reflecting acute inflammatory burden), MLR (indicating immune network dysregulation), and ALB and UA (representing early endothelial function and nutritional-metabolic status), serve as noninvasive, readily accessible metrics. Theoretically, these indices can objectively delineate the pathophysiological trajectory of disease progression across multiple dimensions \u003csup\u003e[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]\u003c/sup\u003e. Previously, \u0026Ouml;zdemir et al. systematically evaluated the diagnostic efficacy of routine haematological parameters in predicting visceral involvement in pediatric IgAV using traditional logistic regression. Their findings demonstrated that elevated peripheral monocyte counts, the neutrophil-to-lymphocyte ratio (NLR), and CRP levels are independent risk factors for acute organ involvement \u003csup\u003e[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]\u003c/sup\u003e. 
However, these high-dimensional clinical features frequently exhibit complex multicollinearity and nonlinear interaction effects, which traditional statistical models predicated on linear assumptions struggle to adequately address \u003csup\u003e[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]\u003c/sup\u003e. In recent years, machine learning (ML) algorithms, particularly tree-based ensemble methods such as Extreme Gradient Boosting (XGBoost), have demonstrated substantial advantages in handling high-dimensional, collinear data and capturing higher-order nonlinear associations among variables \u003csup\u003e[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]\u003c/sup\u003e. Concurrently, the SHapley Additive exPlanations (SHAP) framework mitigates the inherent \"black-box\" limitations of complex algorithms. It not only quantifies the global importance of individual features but also delineates nonlinear risk trajectories and potential threshold effects associated with deviations of continuous variables from physiological homeostasis \u003csup\u003e[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eBuilding on this rationale, the present study aimed to develop and optimise a machine learning model based on routine peripheral blood inflammatory and metabolic indices to predict progression from IgAV to IgAVN in pediatric patients. Furthermore, we employed the SHAP framework to elucidate the pathophysiological contributions of core biomarkers to renal microvascular injury. 
Ultimately, this research endeavours to provide an efficient primary screening tool for outpatient and emergency settings, facilitating individualised precision risk stratification and establishing a robust evidence base to guide clinical decision-making.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy Design and Population\u003c/h2\u003e \u003cp\u003eIn this retrospective study, we initially screened 684 pediatric patients diagnosed with IgAV who presented to the Departments of Pediatrics at Siyang Hospital and Shanxian Central Hospital between June 2022 and August 2025. After strictly applying the predefined exclusion criteria, 175 patients were excluded. Ultimately, 509 eligible patients\u0026mdash;comprising 213 in the progression group and 296 in the non-progression group\u0026mdash;were enrolled for subsequent feature extraction and model construction. This study was conducted in strict adherence to the principles of the Declaration of Helsinki and was approved by the respective Institutional Review Boards (Ethics Approval No. KS2026002). All included patients were aged\u0026thinsp;\u0026le;\u0026thinsp;18 years at initial presentation, and the diagnosis of IgAV was established precisely according to the 2010 joint classification criteria endorsed by the European League Against Rheumatism (EULAR), the Paediatric Rheumatology International Trials Organisation (PRINTO), and the Paediatric Rheumatology European Society (PRES) \u003csup\u003e[\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThe primary outcome was defined as progression to IgAVN within 6 months of disease onset. 
In accordance with the Kidney Disease: Improving Global Outcomes (KDIGO) guidelines \u003csup\u003e[\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]\u003c/sup\u003e, incident IgAVN was defined by the persistent presence of any of the following criteria during the follow-up period: (1) a random urinary albumin-to-creatinine ratio (ACR)\u0026thinsp;\u0026gt;\u0026thinsp;30 mg/g, a urinary protein-to-creatinine ratio (UPCR)\u0026thinsp;\u0026gt;\u0026thinsp;0.2 mg/mg, or qualitative proteinuria of \u0026ge;\u0026thinsp;1+ on two consecutive routine urinalyses; (2) microscopic hematuria, defined as \u0026gt;\u0026thinsp;5 red blood cells per high-power field (HPF) on two consecutive examinations, or the presence of macroscopic hematuria; or (3) biopsy-proven IgAVN.\u003c/p\u003e \u003cp\u003eTo mitigate the influence of potential confounders on baseline peripheral blood immunological and metabolic profiles, patients were excluded if they met any of the following criteria: (1) a prior definitive diagnosis of primary nephrotic syndrome, IgA nephropathy (IgAN), or other secondary renal diseases; (2) concomitant systemic autoimmune disorders, such as systemic lupus erythematosus (SLE) or antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis; (3) the presence of severe acute infections at admission (e.g., sepsis or severe pneumonia) that could significantly alter the systemic inflammatory burden; or (4) exposure to systemic high-dose glucocorticoids or immunosuppressive therapies before baseline blood sample collection.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eData Selection and Preprocessing\u003c/h3\u003e\n\u003cp\u003eWe systematically reviewed the electronic medical records of the two participating centres to extract baseline data obtained within 24 hours of the initial diagnosis of IgAV, strictly before the administration of any systemic high-dose glucocorticoids or immunosuppressive therapies. 
Initial candidate features encompassed routinely available demographic data, prodromal infection-related parameters, biochemical indices, immunological profiles, coagulation function metrics, complete blood count (CBC) parameters, and acute-phase reactants in the outpatient or emergency settings. Specific variables included age, sex, alanine aminotransferase (ALT), alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT), total protein (TP), ALB, urea, creatinine (CREA), UA, total carbon dioxide (TCO2), immunoglobulins (IgA, IgG, IgM), complement components (C3, C4), lactate dehydrogenase (LDH), antistreptolysin O (ASO), cystatin C (Cys-C), cholyglycine (CG), ischemia-modified albumin (IMA), prothrombin time (PT), activated partial thromboplastin time (APTT), fibrinogen (Fbg), thrombin time (TT), D-dimer (DD), white blood cell (WBC) count, platelet (PLT) count, CRP, total bile acid (TBA), adenosine deaminase (ADA), glutamate dehydrogenase (GLDH), homocysteine (HCY), non-esterified fatty acids (NEFA), and sialic acid (SA). To further quantify the systemic immune-inflammatory homeostasis in pediatric patients with IgAV, four higher-order composite indices were derived from primary blood cell counts and inflammatory markers: MLR, CRP-to-lymphocyte ratio (CLR), neutrophil-monocyte-lymphocyte derived ratio (NMLR), and IBI.\u003c/p\u003e \u003cp\u003eDuring the machine learning modelling phase, all data preprocessing steps were performed exclusively on the training set to prevent data leakage. For continuous variables with a missingness rate of less than 20%, imputation was performed using the k-nearest neighbours (KNN) algorithm to preserve the original feature distribution optimally. To mitigate the impact of dimensional and magnitude discrepancies among parameters on model training, all continuous features underwent Z-score standardisation (mean\u0026thinsp;=\u0026thinsp;0, standard deviation\u0026thinsp;=\u0026thinsp;1). 
Categorical variables were transformed via one-hot encoding. Applying this uniform standardisation protocol to data from both participating centres was intended to reduce potential batch effects and systematic baseline variations stemming from divergent laboratory assays, improving the comparability of the pooled data. Following these procedures, a final high-dimensional feature matrix was constructed for subsequent feature selection, machine learning model training, performance evaluation, and SHAP interpretability analysis.\u003c/p\u003e\n\u003ch3\u003eMachine Learning Model Construction\u003c/h3\u003e\n\u003cp\u003eTo identify the optimal algorithm for predicting the progression from IgAV to IgAVN, the 509 eligible patients were randomly partitioned into a training set (n\u0026thinsp;=\u0026thinsp;356) and an independent validation set (n\u0026thinsp;=\u0026thinsp;153) at a 7:3 ratio using a stratified sampling strategy. This stratification ensured that the proportional distribution of positive (progression to IgAVN) and negative outcomes remained consistent across both cohorts, thereby establishing a robust foundation for model training and reliable performance evaluation. To strictly prevent data leakage, all data preprocessing steps (missing-value imputation, standardisation, categorical-variable encoding, and feature selection) were fitted exclusively on the training set, and the resulting transformation parameters were subsequently applied to the validation set. 
The optimal hyperparameter configurations for each evaluated model are detailed in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eOptimal parameter combination for machine learning models\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"2\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOptimal hyperparameter combination\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDecision tree (DT)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eccp_alpha\u0026thinsp;=\u0026thinsp;0.0; max_depth\u0026thinsp;=\u0026thinsp;10; max_features\u0026thinsp;=\u0026thinsp;None; min_samples_split\u0026thinsp;=\u0026thinsp;20\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRandom forest (RF)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003en_estimators\u0026thinsp;=\u0026thinsp;50; max_features\u0026thinsp;=\u0026thinsp;2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eExtreme Gradient Boosting (XGBoost)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e 
\u003cp\u003elearning_rate\u0026thinsp;=\u0026thinsp;0.1; max_depth\u0026thinsp;=\u0026thinsp;3; n_estimators\u0026thinsp;=\u0026thinsp;100; subsample\u0026thinsp;=\u0026thinsp;0.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLight Gradient Boosting Machine (LightGBM)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ecolsample_bytree\u0026thinsp;=\u0026thinsp;0.6; learning_rate\u0026thinsp;=\u0026thinsp;0.1; n_estimators\u0026thinsp;=\u0026thinsp;100; num_leaves\u0026thinsp;=\u0026thinsp;31; subsample\u0026thinsp;=\u0026thinsp;0.6\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSupport vector machine (SVM)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eC\u0026thinsp;=\u0026thinsp;0.1; degree\u0026thinsp;=\u0026thinsp;2; gamma\u0026thinsp;=\u0026thinsp;scale; kernel\u0026thinsp;=\u0026thinsp;linear\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eArtificial neural network (ANN)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eactivation\u0026thinsp;=\u0026thinsp;relu; hidden_layer_sizes = (50,)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eDuring the model development phase, seven supervised classification algorithms, spanning traditional linear statistical models to complex tree-based ensembles and nonlinear networks, were trained in parallel and systematically evaluated on the training set. These algorithms included logistic regression (LR), decision tree (DT), random forest (RF), XGBoost, light gradient boosting machine (LightGBM), support vector machine (SVM), and artificial neural network (ANN). 
To mitigate the risk of overfitting and optimise predictive performance, hyperparameter tuning for the tree-based models, SVM, and ANN was conducted using 10-fold cross-validation and a grid search. The LR model was constructed using default parameters. Finally, the models with their optimised hyperparameters were deployed on the independent validation set to provide an unbiased assessment of their generalisation performance.\u003c/p\u003e\n\u003ch3\u003eEvaluation of Model Performance and Clinical Utility\u003c/h3\u003e\n\u003cp\u003eTo comprehensively evaluate the predictive performance of the machine learning models in the independent validation set, we established a multidimensional evaluation framework. Initially, the overall discrimination of each model was quantified by plotting receiver operating characteristic (ROC) curves and calculating the area under the curve (AUC) with corresponding 95% confidence intervals (CIs). Subsequently, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), Youden\u0026rsquo;s J index, and Kappa value were calculated. These metrics, in conjunction with the F1 score, were utilised to systematically compare the classification performance of the various algorithms in identifying high-risk patients and ruling out low-risk cases.\u003c/p\u003e \u003cp\u003eFollowing confirmation of discriminative ability, the models' goodness of fit and clinical translational potential were further assessed. Calibration curves, coupled with Brier scores, were used to visualise and quantify the agreement (calibration) between predicted probabilities and the observed incidence of IgAVN. Furthermore, to address the critical issue of intervention thresholds in pediatric clinical practice, decision curve analysis (DCA) was utilised to calculate the clinical net benefit across a wide range of threshold probabilities. 
This analysis evaluated the practical utility of the models in balancing the risks of overtreatment (e.g., unnecessary immunosuppressive exposure) against those of under-intervention (e.g., missing the optimal therapeutic window to arrest disease progression). Ultimately, this multidimensional evaluation facilitated a rigorous comparison of the overall performance, robustness, and clinical feasibility of each model, providing a robust evidence base for selecting the optimal predictive tool.\u003c/p\u003e\n\u003ch3\u003eStatistical Analysis and Model Interpretation\u003c/h3\u003e\n\u003cp\u003eAll baseline statistical analyses were conducted using R software (version 4.2.2). Normally distributed continuous variables were expressed as the mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation (SD) and compared using the independent-samples t-test. Non-normally distributed continuous variables were presented as medians and interquartile ranges (IQRs) and compared using the Wilcoxon rank-sum test. Categorical variables were summarised as frequencies and proportions (n, %) and compared using the Pearson chi-square test or Fisher\u0026rsquo;s exact test, as appropriate. Statistical significance was defined as a two-sided P\u0026thinsp;\u0026lt;\u0026thinsp;0.05.\u003c/p\u003e \u003cp\u003eMachine learning model development and interpretability analyses were implemented in Python (version 3.10.4). Following a comprehensive evaluation of discrimination, classification performance, calibration, and clinical net benefit across all models in the validation set, the XGBoost model\u0026mdash;yielding the optimal overall performance\u0026mdash;was selected for SHAP analysis. Specifically, a SHAP bar plot was utilised to identify the core predictive features. A SHAP beeswarm plot was then employed to visualise the directional impact of individual feature values on the risk of IgAVN. 
Furthermore, SHAP dependence plots were constructed to quantify nonlinear associations and potential risk-threshold effects of the core continuous variables.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec9\"\u003e\n \u003ch2\u003ePatient characteristics\u003c/h2\u003e\n \u003cp\u003eA total of 509 pediatric patients were included in this study, comprising 296 in the non-IgAVN group and 213 in the IgAVN group. Patients in the IgAVN group were significantly older than those in the non-IgAVN group (P = 0.021). As detailed in Table 2, the IgAVN group exhibited significantly higher levels of ALT, urea, CREA, UA, ASO, Cys-C, Fbg, WBC, MLR, NMLR, CLR, IBI, PLT, and CRP, alongside significantly lower ALB levels, compared with the non-IgAVN group (all P \u0026lt; 0.05). Furthermore, both PT and APTT were significantly shorter in the IgAVN group (P = 0.001 for both). No statistically significant differences were observed between the two groups regarding sex distribution or the remaining baseline parameters (all P \u0026gt; 0.05).\u003c/p\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 2\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eBaseline characteristics.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eVariables\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eTotal (N = 509)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003eNon-IgAVN (N = 296)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003eIgAVN (N = 213)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003eP value\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n 
\u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eSex (%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.216\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eFemale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e250 (49.12)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e138 (46.62)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e112 (52.58)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eMale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e259 (50.88)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e158 (53.38)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e101 (47.42)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eAge (years)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e8.00 [7.00, 11.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e8.00 [7.00, 10.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n 
\u003cp\u003e9.00 [7.00, 11.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.021\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eALT (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e10.00 [9.00, 11.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e9.00 [8.00, 10.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e10.00 [9.00, 11.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eALP (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e182.00 [142.00, 224.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e182.00 [147.00, 226.25]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e179.00 [138.00, 217.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.225\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eGGT (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e16.00 [12.00, 19.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e16.00 [12.00, 19.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e15.00 [11.00, 19.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" 
colname=\"c5\"\u003e\n \u003cp\u003e0.710\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eTP (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e68.77 [65.48, 71.52]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e68.83 [65.56, 71.37]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e68.74 [65.37, 71.68]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.971\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eALB (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e39.69 [38.44, 41.76]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e40.27 [38.88, 42.29]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e38.91 [36.72, 40.11]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eUREA (mmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e3.59 ± 0.24\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e3.52 ± 0.22\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e3.69 ± 0.23\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCREA (µmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e35.49 [33.70, 37.21]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e35.01 [33.15, 36.47]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e36.42 [34.58, 38.59]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eUA (µmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e238.00 [217.00, 257.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e230.50 [208.00, 250.25]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e249.00 [230.00, 275.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eTCO2 (mmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e22.27 [21.23, 23.84]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e22.29 [21.08, 23.96]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e22.20 [21.26, 23.83]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.834\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eIgA (g/L)\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e1.72 [1.65, 1.79]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e1.72 [1.64, 1.78]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e1.73 [1.66, 1.80]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.091\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eIgG (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e10.87 [9.38, 12.23]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e10.91 [9.43, 12.19]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e10.80 [9.23, 12.40]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.977\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eIgM (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e1.31 [1.13, 1.59]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e1.32 [1.09, 1.59]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e1.28 [1.15, 1.58]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.968\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eC3 (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e1.37 [1.29, 1.46]\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e1.37 [1.29, 1.46]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e1.38 [1.29, 1.46]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.982\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eC4 (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e0.33 [0.26, 0.38]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e0.33 [0.26, 0.37]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e0.33 [0.28, 0.38]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.475\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eLDH (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e181.00 [172.00, 190.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e181.00 [173.00, 190.25]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e181.00 [172.00, 190.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.360\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eASO (IU/mL)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e30.55 [23.48, 38.40]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e26.06 [20.70, 
34.58]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e33.32 [26.35, 41.45]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCys. C (mg/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e0.81 ± 0.05\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e0.80 ± 0.04\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e0.83 ± 0.04\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCG (µg/mL)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e1.02 [0.78, 1.19]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e1.02 [0.79, 1.22]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e1.02 [0.78, 1.17]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.430\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eIMA (U/mL)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e70.64 [70.38, 70.94]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e70.65 [70.38, 70.98]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e70.62 [70.38, 
70.92]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.708\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003ePT (s)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e11.47 [11.39, 11.56]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e11.48 [11.41, 11.58]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e11.45 [11.38, 11.54]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eAPTT (s)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e30.15 [29.89, 30.51]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e30.20 [29.92, 30.58]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e30.10 [29.82, 30.41]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eFbg (g/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e2.46 [2.36, 2.59]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e2.42 [2.34, 2.49]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e2.57 [2.43, 2.71]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 
0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eTT (s)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e14.89 [14.14, 16.08]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e14.89 [14.16, 16.08]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e14.87 [14.11, 16.07]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.938\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eDD (mg/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e0.25 ± 0.03\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e0.25 ± 0.03\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e0.25 ± 0.03\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.154\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eWBC (×10⁹/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e6.02 [5.69, 6.44]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e5.82 [5.55, 6.12]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e6.37 [5.96, 6.85]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n 
\u003cp\u003eMLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e0.15 [0.12, 0.20]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e0.13 [0.11, 0.17]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e0.19 [0.15, 0.23]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eNMLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e3.65 [2.58, 4.49]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e3.34 [2.07, 3.96]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e4.16 [3.51, 5.32]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e1.78 [1.36, 2.67]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e1.65 [1.18, 2.54]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e2.26 [1.51, 2.80]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eIBI\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e6.41 
[4.94, 8.66]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e5.30 [4.25, 7.44]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e8.02 [5.93, 10.25]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003ePLT (×10⁹/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e230.99 ± 11.91\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e228.29 ± 10.51\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e234.74 ± 12.72\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eCRP (mg/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e3.14 [2.04, 4.82]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e2.48 [1.82, 3.52]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e4.49 [2.82, 6.02]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e\u0026lt; 0.001\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eTBA (µmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e4.81 [3.28, 6.20]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n 
\u003cp\u003e4.79 [3.27, 6.18]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e4.84 [3.32, 6.24]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.822\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eADA (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e11.30 [9.93, 13.23]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e11.23 [9.86, 13.23]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e11.37 [10.02, 13.23]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.654\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eGLDH (U/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e2.11 [1.68, 2.57]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e2.18 [1.70, 2.57]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e2.06 [1.66, 2.55]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.160\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eHCY (µmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e7.19 [6.50, 7.83]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e7.16 [6.44, 7.79]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n 
\u003cp\u003e7.21 [6.56, 7.91]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.495\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eNEFA (mmol/L)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e0.50 [0.40, 0.65]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e0.50 [0.40, 0.65]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e0.50 [0.41, 0.64]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.946\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eSA (mg/dL)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\n \u003cp\u003e53.87 [50.45, 59.81]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\n \u003cp\u003e53.95 [50.74, 59.94]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\n \u003cp\u003e53.74 [49.94, 59.50]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\n \u003cp\u003e0.393\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003ctfoot\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"5\"\u003eALT, alanine aminotransferase; ALP, alkaline phosphatase; GGT, gamma-glutamyl transferase; TP, total protein; ALB, albumin; UREA, urea; CREA, creatinine; UA, uric acid; TCO2, total carbon dioxide; IgA, immunoglobulin A; IgG, immunoglobulin G; IgM, immunoglobulin M; C3, complement 3; C4, complement 4; LDH, lactate dehydrogenase; ASO, antistreptolysin O; Cys. 
C, cystatin C; CG, cholylglycine; IMA, ischemia-modified albumin; PT, prothrombin time; APTT, activated partial thromboplastin time; Fbg, fibrinogen; TT, thrombin time; DD, D-dimer; WBC, white blood cell count; MLR, monocyte-to-lymphocyte ratio; NMLR, neutrophil-monocyte-to-lymphocyte ratio; CLR, C-reactive protein-to-lymphocyte ratio; IBI, inflammatory burden index; PLT, platelet count; CRP, C-reactive protein; TBA, total bile acid; ADA, adenosine deaminase; GLDH, glutamate dehydrogenase; HCY, homocysteine; NEFA, non-esterified fatty acid; SA, sialic acid.\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tfoot\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003ch3\u003eSelection of Modelling Variables\u003c/h3\u003e\n\u003cp\u003eTo further identify key features associated with the incidence of IgAVN, the least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm were independently applied for feature selection in the training set. LASSO regression, coupled with 10-fold cross-validation, retained 18 candidate variables at the optimal penalty parameter corresponding to the 1-standard-error (1-SE) criterion (Fig.\u0026nbsp;1A). The coefficient path plot further demonstrated that as the penalisation parameter increased, the regression coefficients for the respective variables shrank continuously, ultimately approaching zero (Fig.\u0026nbsp;1B). Concurrently, the Boruta algorithm identified 17 critical variables, all of which were classified as confirmed features (Fig.\u0026nbsp;1C-D). Subsequently, the intersection of the results from both feature selection methods yielded 12 consensus variables—namely, ALT, ALB, urea, UA, ASO, APTT, MLR, NMLR, CLR, IBI, PLT, and CRP. 
These shared features were then used to develop the downstream machine-learning predictive models (Fig.\u0026nbsp;1E).\u003c/p\u003e\n\u003cdiv id=\"Sec11\"\u003e\n \u003ch2\u003eModel development and performance assessment\u003c/h2\u003e\n \u003cp\u003eUsing the 12 selected features, we developed seven predictive models (LR, DT, RF, XGBoost, LightGBM, SVM, and ANN) and compared their performance in the independent validation set. All seven models discriminated well (AUC 0.885–0.967), with XGBoost and LightGBM performing best overall. Although LightGBM achieved the highest AUC, 0.967 (95% CI: 0.941–0.988), XGBoost (AUC: 0.966, 95% CI: 0.935–0.990) achieved the highest accuracy (0.907), together with high sensitivity (0.921) and specificity (0.898). XGBoost also achieved the highest F1 score (0.892), Kappa value (0.811), and Youden’s J index (0.818), indicating the best overall classification performance. The RF model also performed well, with an AUC of 0.952 (95% CI: 0.914–0.984) and the highest specificity of the seven models (0.909). 
In contrast, the DT, SVM, and ANN models performed less well overall (Table 3).\u003c/p\u003e\n \u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 3\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003e\u003cstrong\u003eComparative analysis of the performance outcomes across machine learning models.\u003c/strong\u003e XGBoost, Extreme Gradient Boosting; LightGBM, Light Gradient Boosting Machine; SVM, Support Vector Machine; ANN, Artificial Neural Network; PPV, Positive Predictive Value; NPV, Negative Predictive Value; AUC, Area Under the Curve.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003eModel\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003eAUC\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003eSensitivity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003eSpecificity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003eF1 Score\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003eKappa\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003eYouden's J\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003ePPV\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003eNPV\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eLogistic\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.934\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.868\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.803\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.905\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.851\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.732\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.746\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.803\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.925\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eDecision Tree\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.885\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.854\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.797\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.873\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.833\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.704\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" 
colname=\"c9\"\u003e\n \u003cp\u003e0.714\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.797\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.902\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eRandom Forest\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.952\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.887\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.871\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.857\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.909\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.864\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.768\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.766\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.871\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.899\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eXGBoost\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.966\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.907\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.866\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n 
\u003cp\u003e0.921\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.898\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.892\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.811\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.818\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.866\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.940\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eLightGBM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.967\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.894\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.921\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.875\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.879\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.785\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.796\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.939\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eSVM\u003c/strong\u003e\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.930\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.791\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.815\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.676\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.682\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.791\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.881\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\" colname=\"c1\"\u003e\n \u003cp\u003e\u003cstrong\u003eANN\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c2\"\u003e\n \u003cp\u003e0.923\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c3\"\u003e\n \u003cp\u003e0.841\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c4\"\u003e\n \u003cp\u003e0.820\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c5\"\u003e\n \u003cp\u003e0.794\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c6\"\u003e\n \u003cp\u003e0.875\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c7\"\u003e\n \u003cp\u003e0.806\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c8\"\u003e\n \u003cp\u003e0.672\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c9\"\u003e\n \u003cp\u003e0.669\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"left\" colname=\"c10\"\u003e\n \u003cp\u003e0.820\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\" colname=\"c11\"\u003e\n \u003cp\u003e0.856\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003cdiv\u003eConfusion matrices from the validation set confirmed that all models could discriminate between patients with and without IgAVN (Fig. 2A-G). Notably, the XGBoost model correctly identified 58 of 63 patients with IgAVN and 79 of 88 patients without IgAVN, misclassifying only 5 and 9 cases, respectively; this overall classification outcome was superior to those of the other models (Fig. 2D). The LightGBM model also performed well, correctly identifying 58 IgAVN cases and 77 non-IgAVN cases (Fig. 2E). The RF model was strongest at recognising non-IgAVN patients, correctly classifying 80 such cases, consistent with its higher specificity (Fig. 2C).\u003c/div\u003e\n \u003cp\u003eROC curve analysis showed that all seven models maintained good discriminative ability in the validation set, with XGBoost and LightGBM having the largest areas under the curve (Fig.\u0026nbsp;3A). Calibration curves indicated general agreement between the predicted probabilities and the observed incidences across the models, with XGBoost, LightGBM, and LR showing the best calibration (Fig.\u0026nbsp;3B). DCA results showed that all models yielded a net clinical benefit across most threshold probabilities; however, the XGBoost model achieved a higher overall net benefit (Fig.\u0026nbsp;3C). 
Considering discrimination, classification performance, calibration, and DCA results together, XGBoost showed the most balanced profile across accuracy, sensitivity, F1 score, Kappa value, Youden’s J index, and overall net benefit. XGBoost was therefore selected as the optimal model for subsequent interpretability analysis and potential clinical deployment.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\"\u003e\n \u003ch2\u003eModel interpretability analysis\u003c/h2\u003e\n \u003cp\u003eTo quantify the contribution of each predictive variable to the model output, SHAP analysis was applied to the optimal XGBoost model. IBI, CRP, and MLR had the highest mean absolute SHAP values (0.11 each), followed by ASO (0.06), NMLR (0.06), ALB (0.05), CLR (0.04), UA (0.03), urea (0.02), ALT (0.02), APTT (0.01), and PLT (0.01) (Fig.\u0026nbsp;4A). The SHAP beeswarm plot showed that elevated levels of IBI, CRP, MLR, ASO, NMLR, CLR, UA, urea, and ALT were generally associated with a higher predicted risk of IgAVN. In contrast, higher levels of ALB and APTT were associated with a lower predicted risk (Fig.\u0026nbsp;4B).\u003c/p\u003e\n \u003cp\u003eAt the individual level, SHAP waterfall plots for specific patients showed that the direction and magnitude of the contributions of the same features differed substantially between patients with non-IgAVN and IgAVN outcomes, with MLR, ASO, CRP, and NMLR contributing most to these individual predictions (Fig.\u0026nbsp;4C-D). Furthermore, SHAP dependence plots indicated widespread nonlinear relationships between individual variables and the model output (Fig.\u0026nbsp;5). At low values, ALT, urea, UA, ASO, MLR, NMLR, CLR, IBI, and CRP had relatively little effect on the model output. 
However, as these indices increased, their corresponding SHAP values progressively increased, transitioning from negative to positive or showing continuous increases within the moderate-to-high ranges. Conversely, an increase in ALB levels corresponded with an overall decline in its respective SHAP value. APTT exhibited a similar negative trajectory, whereas PLT showed nonlinear fluctuations.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\"\u003e\n \u003ch2\u003eWeb application deployment\u003c/h2\u003e\n \u003cp\u003eTo enhance the clinical accessibility and practical utility of the optimal XGBoost model, we deployed it as an online web-based prediction system (Fig.\u0026nbsp;6). Titled \"Kidney Injury Risk Prediction System,\" this platform integrates individualised risk prediction and personalised SHAP explanation functionalities for specific pediatric patients. Upon inputting the required clinical parameters—namely, ALT, ALB, urea, UA, ASO, APTT, MLR, NMLR, CLR, IBI, PLT, and CRP—users can click \"Run Prediction\" to obtain the corresponding predicted probability. Concurrently, the system generates a personalised SHAP explanation that provides a visual representation of the predicted outcome and its key driving factors. Furthermore, the webpage includes a dedicated results panel and usage instructions, explicitly cautioning that the tool is intended strictly for scientific research and adjunctive evaluation, rather than as a substitute for professional clinical judgment. The online prediction system is freely accessible at: https://predictinglymphnodemetastasisingastriccancer.shinyapps.io/shiny_kidney_app/. 
Ultimately, these results demonstrate the successful online deployment of the proposed XGBoost model, providing a convenient tool for individualised risk assessment of IgAVN.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn pediatric outpatient and emergency settings, the early progression from IgAV to IgAVN often lacks definitive, specific clinical manifestations at initial presentation. The insidious nature of disease evolution, coupled with pronounced clinical heterogeneity, predisposes high-risk patients to miss the critical therapeutic window for early identification and timely intervention. Concurrently, it increases the risk of unnecessary exposure to immunosuppressive therapies among low-risk patients with a self-limiting disease course \u003csup\u003e[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]\u003c/sup\u003e. Given the invasive nature of renal biopsy and the inherent temporal lag of routine urinalysis in detecting early renal involvement, exploring noninvasive, objective, and clinically feasible early predictive strategies is of profound practical significance. Driven by this clinical imperative, the present study developed and validated multiple machine-learning predictive models using routine peripheral blood parameters readily available at initial presentation. Our findings revealed that the XGBoost model demonstrated exceptional discrimination and calibration within the independent validation set, underscoring its superior stability and generalisation potential. 
Furthermore, subsequent SHAP interpretability analysis elucidated that composite inflammatory burden indices\u0026mdash;specifically IBI and MLR\u0026mdash;alongside ALB and ASO, serve as pivotal predictors intricately associated with the progression from IgAV to IgAVN.\u003c/p\u003e \u003cp\u003eHistorically, risk assessment for IgAVN has predominantly relied upon traditional clinical scoring systems or statistical models such as logistic regression \u003csup\u003e[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]\u003c/sup\u003e. However, the onset and progression of pediatric IgAV involve complex immune-inflammatory responses and metabolic remodelling processes. Consequently, routine peripheral blood indices are rarely independent; instead, they frequently exhibit substantial multicollinearity and higher-order nonlinear interactions \u003csup\u003e[\u003cspan additionalcitationids=\"CR25\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]\u003c/sup\u003e. Employing a two-sample Mendelian randomisation approach, Xie et al. systematically analysed the causal associations among IgAV, immune cells, metabolites, and inflammatory cytokines, revealing that the pathogenesis of IgAV is not a solitary inflammatory event, but rather a complex biological cascade co-orchestrated by immune-inflammatory responses and metabolic networks \u003csup\u003e[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/sup\u003e. Within a modelling framework predicated on linear assumptions, such intricate relationships are difficult to adequately capture, potentially impairing the model's capacity to extract key information and, in turn, compromising predictive accuracy. 
Our findings demonstrate that the XGBoost algorithm, leveraging a tree-based ensemble strategy, can effectively capture complex nonlinear association patterns among features without requiring the a priori manual elimination of collinear variables. Compared with the traditional logistic regression model, XGBoost exhibited superior overall predictive performance while achieving a better balance between sensitivity and specificity. Furthermore, its lower Brier score indicates that the model not only possesses robust discriminative capacity but also generates individualised predicted probabilities that are closely concordant with the observed risk.\u003c/p\u003e \u003cp\u003eBy leveraging the capacity of tree-based models to capture high-dimensional interactions, our SHAP analysis not only enhances model interpretability but also yields a feature ranking consistent with the proposed pathological sequence of IgAV: triggering by prodromal infections, amplification via immune dysregulation, and culmination in microvascular endothelial injury. The high predictive importance of ASO supports the classical hypothesis that prodromal infections (e.g., Streptococcus) induce aberrant immune responses and pathogenic IgA1 production, thereby facilitating immune complex deposition \u003csup\u003e[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]\u003c/sup\u003e. Furthermore, the prominence of composite inflammatory indices, such as IBI and MLR, suggests that systemic inflammatory activation is a primary driver of disease progression. Mechanistically, this aligns with Li et al., who demonstrated that IgA1-containing immune complexes activate neutrophils via FcαRI, precipitating endothelial damage. 
This highlights the vital role of innate immune hyperactivation in mesangial injury and systemic microvascular destruction \u003csup\u003e[\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eNotably, ALB demonstrated a strong inverse association with IgAVN risk in the SHAP analysis. This indicates that subtle albumin leakage or depletion\u0026mdash;resulting from early systemic endothelial barrier impairment\u0026mdash;may act as an early warning sign of disease progression before overt target-organ damage (e.g., massive proteinuria) manifests. This finding is consistent with Shen et al., who identified baseline albumin as an independent predictor of renal involvement in pediatric IgAV, implying that an early decline in ALB reflects a latent risk of IgAVN \u003csup\u003e[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]\u003c/sup\u003e. Finally, SHAP dependence plots revealed nonlinear associations and distinct threshold effects between these key predictors and IgAVN risk. The risk of disease progression escalates nonlinearly once the systemic inflammatory burden exceeds the body's compensatory limits or when an early reduction in ALB occurs. These threshold effects underscore the complex, phased nature of IgAVN pathogenesis, providing clinicians with actionable metrics to identify high-risk patients nearing physiological decompensation.\u003c/p\u003e \u003cp\u003eCrucially, the XGBoost model relies entirely on features derived from standardised laboratory tests (e.g., routine inflammatory markers, ALB, and composite immune indices) universally available in outpatient and emergency settings. 
This noninvasive, readily accessible biomarker panel may reduce reliance on early renal biopsy and help offset the inherent temporal delay of routine urinalysis in detecting systemic microvascular inflammation \u003csup\u003e[\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eFurthermore, decision curve analysis (DCA) substantiated the model's clinical utility, demonstrating a higher net clinical benefit across a broad range of threshold probabilities than the default \"treat-all\" or \"treat-none\" strategies. This supports the model's potential as a clinical decision-support tool. By providing frontline clinicians with an objective basis for early risk stratification, the model may help optimise the clinical management of pediatric IgAV. Particularly in resource-constrained environments, it could help clinicians balance prompt, targeted intervention for patients at high risk of renal progression against sparing low-risk patients from unnecessary immunosuppressive exposure.\u003c/p\u003e \u003cp\u003eDespite the model's robust predictive performance and clinical promise, these findings must be interpreted with caution. First, the retrospective design and limited sources of data introduce inherent selection bias. Second, the current reliance on internal validation necessitates the use of independent external cohorts to rigorously verify the model's stability and generalisability across diverse geographic regions, healthcare tiers, and heterogeneous populations. In summary, by integrating early routine peripheral blood features with the XGBoost algorithm and SHAP framework, this study systematically delineates the nonlinear risk trajectories of IgAV progression to renal involvement. 
Ultimately, this model provides a noninvasive, objective, and highly actionable decision-support tool. It empowers frontline clinicians to promptly identify high-risk pediatric patients, optimise risk stratification, and maximise the efficiency of medical resource allocation.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eIn conclusion, this study developed an interpretable XGBoost model based on routine non-invasive peripheral blood parameters that enables accurate early identification of pediatric patients at high risk of progression from IgAV to IgAVN, while revealing nonlinear associations between key predictors and disease progression. This model provides a quantitative basis for early clinical risk stratification, with potential to optimise the timing of intervention and reduce unnecessary immunosuppressive exposure in low-risk children. Although further validation in large prospective cohorts is warranted, it shows promising potential as a visual decision-support tool for clinical translation.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"499\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAbbreviation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 70.5411%;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFull term\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eACR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ealbumin-to-creatinine ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eADA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eadenosine 
deaminase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eALB\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ealbumin\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eALP\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ealkaline phosphatase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eALT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ealanine aminotransferase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eANN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eartificial neural network\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eCLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eC-reactive protein-to-lymphocyte ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eHCY\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ehomocysteine\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eHPF\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ehigh-power field\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eHSP\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
70.5411%;\"\u003e\n \u003cp\u003eHenoch-Sch\u0026ouml;nlein purpura\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eIBI\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eInflammatory Burden Index\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eLDH\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003elactate dehydrogenase\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003elogistic regression\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eML\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003emachine learning\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eMLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003emonocyte-to-lymphocyte ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eNEFA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003enon-esterified fatty acids\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eNLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eneutrophil-to-lymphocyte ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n 
\u003cp\u003eNMLR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eneutrophil-monocyte-lymphocyte derived ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003ePPV\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003epositive predictive value\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eROC\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ereceiver operating characteristic\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eSA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003esialic acid\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eSVM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003esupport vector machine\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eTCO2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003etotal carbon dioxide\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eTP\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003etotal protein\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eTT\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003ethrombin time\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n 
\u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eUA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003euric acid\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eUPCR\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eurinary protein-to-creatinine ratio\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eUREA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eurea\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 29.4589%;\"\u003e\n \u003cp\u003eXGBoost\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 70.5411%;\"\u003e\n \u003cp\u003eeXtreme Gradient Boosting\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll procedures involving human participants were conducted in accordance with the ethical standards of the institutional research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. Ethical approval for this retrospective study was obtained from the Medical Ethics Committee of Siyang Hospital (approval number: KS2026002). 
The requirement for informed consent to participate was waived by the Medical Ethics Committee of Siyang Hospital due to the retrospective nature of the study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analysed during the current study are not publicly available due to institutional and patient privacy restrictions but are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare they have no conflict of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was jointly funded by Siyang Hospital and the Suqian Municipal Health Commission, as part of the 2024 Suqian Municipal Health Commission Medical Research Project (ZD202409).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualisation: Qingkai Wang, Hao Qiu, Weibing Qiu; Data curation: Liran Shen, Fachen Miao, Jinxing Dai; Formal analysis: Qingkai Wang, Hao Qiu, Jinxing Dai; Methodology: Kang Shen; Software: Qingkai Wang, Jinxing Dai; Supervision: Qingkai Wang, Yubiao Zhang, Liran Shen, Weibing Qiu; Writing - original draft: Jinxing Dai, Qingkai Wang; Writing - review \u0026amp; editing: Hao Qiu, Weibing Qiu. 
All authors have read and approved the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNone.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eCasta\u0026ntilde;eda S, Quiroga-Colina P, Floranes P, Uriarte-Ecenarro M, Valero-Mart\u0026iacute;nez C, Vicente-Rabaneda EF, et al. IgA Vasculitis (Henoch-Sch\u0026ouml;nlein Purpura): An Update on Treatment. J Clin Med. 2024;13(21). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/jcm13216621\u003c/span\u003e\u003cspan address=\"10.3390/jcm13216621\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSestan M, Jelusic M. Diagnostic and Management Strategies of IgA Vasculitis Nephritis/Henoch-Sch\u0026ouml;nlein Purpura Nephritis in Pediatric Patients: Current Perspectives. Pediatr Health Med Ther. 2023;14:89\u0026ndash;98. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2147/phmt.S379862\u003c/span\u003e\u003cspan address=\"10.2147/phmt.S379862\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOni L, Platt C, Marlais M, McCann L, Barakat F, Hesseling M, et al. National recommendations for the management of children and young people with IgA vasculitis: a best available evidence, group agreement-based approach. Arch Dis Child. 2024;110(1):67\u0026ndash;76. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1136/archdischild-2024-327364\u003c/span\u003e\u003cspan address=\"10.1136/archdischild-2024-327364\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJelusic M, Sestan M, Giani T, Cimaz R. 
New Insights and Challenges Associated With IgA Vasculitis and IgA Vasculitis With Nephritis-Is It Time to Change the Paradigm of the Most Common Systemic Vasculitis in Childhood? Front Pediatr. 2022;10:853724. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fped.2022.853724\u003c/span\u003e\u003cspan address=\"10.3389/fped.2022.853724\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAmatruda M, Carucci NS, Chimenz R, Conti G. Immunoglobulin A vasculitis nephritis: Current understanding of pathogenesis and treatment. World J Nephrol. 2023;12(4):82\u0026ndash;92. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.5527/wjn.v12.i4.82\u003c/span\u003e\u003cspan address=\"10.5527/wjn.v12.i4.82\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGage A, Pepper RJ, Marro J, Salama AD, Oni L. IgA Vasculitis Across the Ages: Is It Time for a Precision Medicine Approach? ACR Open Rheumatol. 2025;7(9):e70083. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/acr2.70083\u003c/span\u003e\u003cspan address=\"10.1002/acr2.70083\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarro J, Chetwynd AJ, Wright RD, Dliso S, Oni L. Urinary Protein Array Analysis to Identify Key Inflammatory Markers in Children with IgA Vasculitis Nephritis. Children (Basel). 2022;9(5). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/children9050622\u003c/span\u003e\u003cspan address=\"10.3390/children9050622\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Q, Lai LY, Cai YY, Wang MJ, Ma G, Qi LW, et al. Serum-Urine Matched Metabolomics for Predicting Progression of Henoch-Schonlein Purpura Nephritis. Front Med (Lausanne). 2021;8:657073. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fmed.2021.657073\u003c/span\u003e\u003cspan address=\"10.3389/fmed.2021.657073\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOni L, Sampath S. Childhood IgA Vasculitis (Henoch Schonlein Purpura)-Advances and Knowledge Gaps. Front Pediatr. 2019;7:257. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fped.2019.00257\u003c/span\u003e\u003cspan address=\"10.3389/fped.2019.00257\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu S, Han S, Dai Y, Wang L, Zhang X, Ding Y. A Review of the Mechanism of Vascular Endothelial Injury in Immunoglobulin A Vasculitis. Front Physiol. 2022;13:833954. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fphys.2022.833954\u003c/span\u003e\u003cspan address=\"10.3389/fphys.2022.833954\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBi Y, Quan W, Hao W, Sun R, Li L, Jiang C, et al. A simple nomogram for assessing the risk of IgA vasculitis nephritis in IgA vasculitis Asian pediatric patients. Sci Rep. 2022;12(1):16809. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-022-20369-3\u003c/span\u003e\u003cspan address=\"10.1038/s41598-022-20369-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003e\u0026Ouml;zdemir ZC, \u0026Ccedil;etin N, Kar YD, \u0026Ouml;cal HO, Bilgin M, B\u0026ouml;r \u0026Ouml;. Hemotologic Indices for Predicting Internal Organ Involvement in Henoch-Sch\u0026ouml;nlein Purpura (IgA vasculitis). J Pediatr Hematol Oncol. 2020;42(1):e46\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1097/mph.0000000000001571\u003c/span\u003e\u003cspan address=\"10.1097/mph.0000000000001571\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOh MY, Kim HS, Jung YM, Lee HC, Lee SB, Lee SM. Machine Learning-Based Explainable Automated Nonlinear Computation Scoring System for Health Score and an Application for Prediction of Perioperative Stroke: Retrospective Study. J Med Internet Res. 2025;27:e58021. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/58021\u003c/span\u003e\u003cspan address=\"10.2196/58021\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee S, Kisiel MA, Lindberg P, Wheelock \u0026Aring;M, Olofsson A, Eriksson J, et al. Using machine learning involving diagnoses and medications as a risk prediction tool for post-acute sequelae of COVID-19 (PASC) in primary care. BMC Med. 2025;23(1):251. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12916-025-04050-w\u003c/span\u003e\u003cspan address=\"10.1186/s12916-025-04050-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXie Y, Chen Y, Han Y, Zhai S, Xiao L, Yin D, et al. Identifying influencing factors associated with sleep quality in undergraduates based on partial least squares regression and XGBoost. Front Psychol. 2025;16:1732946. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpsyg.2025.1732946\u003c/span\u003e\u003cspan address=\"10.3389/fpsyg.2025.1732946\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang S, Cao L, Zhou Y, Hu C. A Retrospective Cohort Study: Predicting 90-Day Mortality for ICU Trauma Patients with a Machine Learning Algorithm Using XGBoost Using MIMIC-III Database. J Multidiscip Healthc. 2023;16:2625\u0026ndash;40. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2147/jmdh.S416943\u003c/span\u003e\u003cspan address=\"10.2147/jmdh.S416943\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLuo S, Lai J, Mo L, Shen X, Fang R. Prediction of hospital mortality in sepsis-associated acute kidney injury using a machine-learning approach: a multicenter study using SHAP interpretability analysis. Clin Kidney J. 2026;19(1):sfaf372. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/ckj/sfaf372\u003c/span\u003e\u003cspan address=\"10.1093/ckj/sfaf372\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang C, Niu B, Wang R, Zhang L. From traditional metabolic markers to ensemble learning: comparative application of machine learning models for predicting NAFLD risk in adolescents. Front Endocrinol (Lausanne). 2025;16:1681686. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fendo.2025.1681686\u003c/span\u003e\u003cspan address=\"10.3389/fendo.2025.1681686\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOzen S, Pistorio A, Iusan SM, Bakkaloglu A, Herlin T, Brik R, et al. EULAR/PRINTO/PRES criteria for Henoch-Sch\u0026ouml;nlein purpura, childhood polyarteritis nodosa, childhood Wegener granulomatosis and childhood Takayasu arteritis: Ankara 2008. Part II: Final classification criteria. Ann Rheum Dis. 2010;69(5):798\u0026ndash;806. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1136/ard.2009.116657\u003c/span\u003e\u003cspan address=\"10.1136/ard.2009.116657\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKDIGO 2021 Clinical Practice Guideline for the Management of Glomerular Diseases. Kidney Int. 2021;100(4s):S1\u0026ndash;S276. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.kint.2021.05.021\u003c/span\u003e\u003cspan address=\"10.1016/j.kint.2021.05.021\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eErcan Emreol H, Yildirim-Toruner C, Jelusic M, Twilt M, Ozen S. New avenues in childhood vasculitis. Pediatr Rheumatol Online J. 2025;23(1):97. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12969-025-01149-5\u003c/span\u003e\u003cspan address=\"10.1186/s12969-025-01149-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuo L, Zhu A, Li W, Zeng F, Wang F. Clinical prediction model for progression from Henoch-Sch\u0026ouml;nlein purpura to nephritis in pediatric patients. Am J Transl Res. 2024;16(12):7385\u0026ndash;95. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.62347/xdor8531\u003c/span\u003e\u003cspan address=\"10.62347/xdor8531\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePodraza Z, Poplicha K, Ufniarski T, Ucieklak J, Łysiak N, Mizerska-Wasiak M. Laboratory Findings and Clinical Features in IgA Vasculitis: Identifying Predictors of Kidney Involvement and Disease Relapse in Pediatric Patients. J Clin Med. 2025;14(9). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/jcm14093055\u003c/span\u003e\u003cspan address=\"10.3390/jcm14093055\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen G, Yang Z. 
Risk prediction for gastrointestinal bleeding in pediatric Henoch-Sch\u0026ouml;nlein purpura using an interpretable transformer model. Front Physiol. 2025;16:1630807. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fphys.2025.1630807\u003c/span\u003e\u003cspan address=\"10.3389/fphys.2025.1630807\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePan M, Li M, Li N, Mao J. Predicting renal damage in children with IgA vasculitis by machine learning. Pediatr Nephrol. 2024;39(10):2997\u0026ndash;3004. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00467-024-06432-3\u003c/span\u003e\u003cspan address=\"10.1007/s00467-024-06432-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCao T, Zhu Y, Zhu Y. Construction of Prediction Model of Renal Damage in Children with Henoch-Sch\u0026ouml;nlein Purpura Based on Machine Learning. \u003cem\u003eComput Math Methods Med\u003c/em\u003e 2022, 2022:6991218.\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1155/2022/6991218\u003c/span\u003e\u003cspan address=\"10.1155/2022/6991218\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXie M, Zhou N, Liang Q, Lin Z, Yao Y. Impact of Immune Cells on IgA Vasculitis via Metabolites and Inflammatory Cytokines. J Clin Immunol. 2025;45(1):157. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10875-025-01946-3\u003c/span\u003e\u003cspan address=\"10.1007/s10875-025-01946-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeld M, Kozmar A, Sestan M, Turudic D, Kifer N, Srsen S, et al. Insight into the Interplay of Gd-IgA1, HMGB1, RAGE and PCDH1 in IgA Vasculitis (IgAV). Int J Mol Sci. 2024;25(8). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e.https://doi.org/10.3390/ijms25084383\u003c/span\u003e\u003cspan address=\".10.3390/ijms25084383\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi Q, Wu H, Yuan X, Shi S, Liu L, Lv J, et al. FcαRI-mediated neutrophil activation contributed to the pathogenesis of adult IgA vasculitis with nephritis. Nephrol Dial Transpl. 2025. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/ndt/gfaf260\u003c/span\u003e\u003cspan address=\"10.1093/ndt/gfaf260\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShen L, Miao L, Xu L. Risk factors associated with renal injury in patients initially diagnosed with IgA vasculitis. Front Pediatr. 2025;13:1584768. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fped.2025.1584768\u003c/span\u003e\u003cspan address=\"10.3389/fped.2025.1584768\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWilliams CEC, Ging H, Skoutelis N, Marro J, Roberts L, Chetwynd AJ, et al. Biomarkers to predict kidney outcomes in children with IgA vasculitis. Minerva Pediatr (Torino). 2025;77(3):256\u0026ndash;71. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.23736/s2724-5276.24.07715-2\u003c/span\u003e\u003cspan address=\"10.23736/s2724-5276.24.07715-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJin Y, He X, Lin W, Peng Z, Li W, Xiang W, et al. Serum cytokine profiles in children with IgA vasculitis with nephritis. Biomol Biomed. 2025;25(6):1425\u0026ndash;43. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.17305/bb.2024.11081\u003c/span\u003e\u003cspan address=\"10.17305/bb.2024.11081\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"IgA vasculitis, IgA vasculitis nephritis, Machine learning, Risk strat-ification","lastPublishedDoi":"10.21203/rs.3.rs-9261325/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9261325/v1","license":{"name":"CC BY 
4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cb\u003eObjective\u003c/b\u003e\u003c/p\u003e \u003cp\u003eIgA vasculitis nephritis (IgAVN) plays a decisive role in the long-term prognosis of pediatric IgA vasculitis (IgAV), yet the temporal lag of routine urinalysis frequently hinders early and precise risk stratification. This study aimed to develop a machine-learning predictive model using non-invasive peripheral blood parameters to facilitate early identification of IgAVN progression and to reveal underlying pathophysiological risk thresholds.\u003c/p\u003e\u003cp\u003e\u003cb\u003eMethods\u003c/b\u003e\u003c/p\u003e \u003cp\u003eThis retrospective study enrolled 509 pediatric IgAV patients from Siyang Hospital and Shanxian Central Hospital, among whom 213 developed IgAVN. Twelve core features were selected based on the intersection of LASSO regression and the Boruta algorithm. Seven machine learning algorithms were systematically evaluated to construct the optimal eXtreme Gradient Boosting (XGBoost) model. Furthermore, the Shapley Additive Explanations (SHAP) framework was incorporated to quantify feature importance and to decipher non-linear risk interactions.\u003c/p\u003e\u003cp\u003e\u003cb\u003eResults\u003c/b\u003e\u003c/p\u003e \u003cp\u003eThe XGBoost model demonstrated outstanding predictive performance in the independent validation set, achieving an area under the receiver operating characteristic curve (AUC) of 0.966, an accuracy of 0.907, an F1 score of 0.892, and a sensitivity of 0.921. SHAP analysis identified the Inflammatory Burden Index (IBI), C-reactive protein (CRP), and monocyte-to-lymphocyte ratio (MLR) as the primary driving factors. SHAP dependence plots revealed critical non-linear threshold effects: the risk of IgAVN escalated sharply and non-linearly in the presence of early subclinical albumin (ALB) depletion and decompensated inflammatory load. 
Decision curve analysis (DCA) demonstrated that the model achieved substantial clinical net benefit across a broad continuum of threshold probabilities.\u003c/p\u003e\u003cp\u003e\u003cb\u003eConclusion\u003c/b\u003e\u003c/p\u003e \u003cp\u003eThe explainable XGBoost model, developed utilising routine non-invasive peripheral blood parameters, demonstrates promising potential as a supportive tool for the early risk stratification of IgAVN. By visualising complex, data-driven nonlinear risk inflexion points, this model may assist frontline clinicians in more effectively identifying high-risk pediatric patients in outpatient and emergency settings. Ultimately, these findings provide an objective reference to inform future clinical strategies to optimise the timing of interventions and potentially minimise unnecessary immunosuppressive exposure in low-risk patients.\u003c/p\u003e","manuscriptTitle":"Explainable Machine Learning for Predicting Progression From IgA Vasculitis to IgA Vasculitis Nephritis in Children: A Dual-Centre Retrospective Study","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-20 17:03:13","doi":"10.21203/rs.3.rs-9261325/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2a9b83ad-cf96-486a-8492-2ebd9c9ed325","owner":[],"postedDate":"April 20th, 
2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-04-21T07:12:17+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-20 17:03:13","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9261325","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9261325","identity":"rs-9261325","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}


