A Radiomics-Based Machine Learning Model and SHAP for Predicting Spread Through Air Spaces and Its Prognostic Implications in Stage I Lung Adenocarcinoma: A Multicenter Cohort Study

doi:10.21203/rs.3.rs-6345504/v1

2025 · doi:10.21203/rs.3.rs-6345504/v1

preprint OA: closed CC-BY-4.0

Research Article

Yuhang Wang, Xufeng Liu, Xiaojiang Zhao, Zixiao Wang, Xin Li, and 1 more

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-6345504/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 29 Sep, 2025 in Cancer Imaging. Version 1 (latest preprint version).

Abstract

Background: Despite early detection via low-dose computed tomography and complete surgical resection for early-stage lung adenocarcinoma, postoperative recurrence remains high, particularly in patients with tumor spread through air spaces.
A reliable preoperative prediction model is urgently needed to adjust the treatment modality.

Methods: In this multicenter retrospective study, 609 patients with pathological stage I lung adenocarcinoma from 3 independent centers were enrolled. Regions of interest for the primary tumor and peritumoral areas (extended by three, six, and twelve voxel units) were manually delineated from preoperative CT imaging. Quantitative imaging features were extracted and filtered by correlation analysis and random forest ranking to yield 40 candidate features. Fifteen machine learning methods were evaluated, and a ten-fold cross-validated elastic net regression model was selected to construct the radiomics-based prediction model. A clinical model based on five key clinical variables and a combined model integrating imaging and clinical features were also developed.

Results: The radiomics model achieved accuracies of 0.801, 0.866, and 0.831 in the training set and two external test sets, with AUCs of 0.791, 0.829, and 0.807. In one external test set, the clinical model had an AUC of 0.689, significantly lower than that of the radiomics model (0.807, p < 0.05). The combined model achieved the highest performance, with AUCs of 0.834 in the training set and 0.894 in an external test set (p < 0.01 and p < 0.001 versus the radiomics model, respectively). Interpretability analysis revealed that wavelet-transformed features dominated the model, with the highest contribution from a feature reflecting small high-intensity clusters within the tumor and the second highest from a feature representing low-intensity clusters in the six-voxel peritumoral region. Kaplan–Meier analysis demonstrated that patients with either pathologically confirmed or model-predicted spread had significantly shorter progression-free survival (p < 0.001).
Conclusion: Our novel machine learning model, integrating imaging features from both tumor and peritumoral regions, preoperatively predicts tumor spread through air spaces in stage I lung adenocarcinoma. It outperforms traditional clinical models, highlighting the potential of quantitative imaging analysis in personalizing treatment. Future prospective studies and further optimization are warranted.

Keywords: Lung adenocarcinoma, STAS, Radiomics, Machine learning, SHAP

Introduction

Lung cancer remains the leading cause of cancer-related mortality worldwide [1]. Among its pathological subtypes, lung adenocarcinoma (LUAD) is the most common, and research on its early detection and treatment has been ongoing [2]. With advancements in medical imaging, an increasing number of early-stage LUAD cases are being detected and treated through low-dose computed tomography (CT) screening. Although complete surgical resection is the primary treatment for stage I LUAD, studies have shown that recurrence rates range from 20% to 50%, even after curative surgery [3]. Therefore, identifying high-risk patients at an early stage is crucial for improving patient prognosis and guiding treatment strategies.

In the 2015 WHO classification of lung cancer, the concept of spread through air spaces (STAS) was introduced. STAS is defined as the presence of tumor cells spreading beyond the tumor margin into the alveolar spaces in the form of micropapillary clusters, solid nests, or single cells [4]. Subsequent studies have demonstrated that STAS is a significant risk factor for recurrence in patients with stage I LUAD after surgical resection [5, 6]. However, STAS is typically diagnosed postoperatively through pathological examination, limiting its utility for preoperative treatment planning.
Therefore, a reliable preoperative STAS prediction model is essential for identifying high-risk patients and tailoring surgical and adjuvant treatment strategies.

With the rapid development of machine learning and artificial intelligence in the medical field, researchers have begun extracting quantitative features from medical images to assist in disease diagnosis and prognostication. Radiomics is a powerful approach that aims to transform standard imaging data into high-dimensional quantitative features, capturing complex tumor characteristics beyond human visual perception [7]. Previous studies have demonstrated that both tumoral and peritumoral CT radiomics features are valuable in assisting the diagnosis, prognostication, and histological classification of LUAD [8]. However, existing studies on preoperative STAS prediction have primarily focused on either tumoral or peritumoral radiomics features, and few have comprehensively integrated the two to improve predictive performance [9–12].

In our previous research, we developed a clinical model for preoperative STAS prediction in stage I non-small cell lung cancer (NSCLC) patients based on demographic and imaging characteristics, such as tumor size, spiculation, vacuole sign, and carcinoembryonic antigen (CEA) levels [13]. Other studies have constructed STAS prediction models based on traditional or deep learning radiomics features [14–16]. However, most of these studies were limited to single-center datasets, and their model generalizability remains uncertain. To address these limitations, we conducted a multicenter retrospective cohort study to develop and validate a multimodal preoperative STAS risk prediction model incorporating both tumoral and peritumoral radiomics features.
We employed a multi-machine learning approach, integrating various algorithms (including LASSO, support vector machine, random forest, gradient boosting, elastic net regression, and neural networks) to optimize model performance. We combined radiomics features with key clinical variables to construct a hybrid model, enhancing predictive accuracy. Our dataset was derived from three independent medical centers, allowing us to assess model robustness and generalizability. We believe that our findings will contribute to improved risk stratification and personalized treatment strategies for early-stage LUAD patients. The technical workflow and main findings of this study are summarized in Fig. 1.

Methods

In this multicenter retrospective cohort study, we developed and validated a multimodal preoperative STAS risk prediction model based on radiomics features extracted from preoperative CT, including both tumoral and peritumoral radiomics features. This model aims to assist clinicians in the early identification of high-recurrence-risk stage I LUAD patients, enabling timely adjustments to treatment strategies.

Clinical Data Collection and Follow-Up

We included patients with pathological stage I LUAD who had undergone complete surgical resection. The study samples were collected from three centers. Specifically, patients who underwent surgery at Tianjin Chest Hospital (Center A) between January 1, 2015, and December 31, 2018, were assigned to the training set. Patients who underwent surgery at the same institution from January 1, 2019, to December 31, 2019, were included in test set A. Additionally, patients who underwent surgery at Tianjin Binhai New Area Haibin People’s Hospital (Center B) and Qinhuangdao First Hospital (Center C) between January 1, 2019, and December 31, 2020, were included in test set B. The detailed inclusion and exclusion criteria are illustrated in Figure 2.
In the training set, STAS status was re-evaluated based on pathological slides (Supplementary Figure 1), whereas in test set A and test set B, the STAS status was obtained from pathology reports. Based on our previous research, several clinical features were identified as significantly associated with STAS status in stage I non-small cell lung cancer (NSCLC), including maximum tumor diameter (Tdmax), consolidation-to-tumor ratio (CTR), spiculation, vacuole sign, and carcinoembryonic antigen (CEA) levels. Therefore, only these five clinical features were incorporated to compare the performance of the radiomics-based STAS risk prediction model with a clinical feature-based model. These clinical features were collected only in the training set and test set B. Patients from Tianjin Chest Hospital (Center A) were followed up for prognosis assessment, primarily evaluating tumor progression. The last follow-up date was January 1, 2025.

Image Acquisition and Preprocessing

Preoperative thin-slice chest CT scans of patients were retrieved from the Picture Archiving and Communication System (PACS) in Digital Imaging and Communications in Medicine (DICOM) format. Given the multicenter nature of this study and variations in CT scanners, all CT images were resampled to standardize voxel size to 1 mm × 1 mm × 1 mm, ensuring a uniform slice thickness of 1 mm across all scans. Additionally, window width and window level were standardized to 1600 HU and -500 HU, respectively. ITK-SNAP open-source software was used for CT image parameter adjustments, while Python was employed for batch processing.

ROI Delineation

Following CT image preprocessing, manual delineation of the regions of interest (ROIs) was performed (Figure 3A). Based on the tumoral ROI (ROI-tumoral), the peritumoral ROI (ROI-peritumoral) was generated by outward expansion. After reviewing the literature [17–23], we selected peritumoral regions extending 3, 6, and 12 voxel units from the tumor boundary (Figure 3A).
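The outward expansion of a tumor mask by a fixed number of voxels can be illustrated with a morphological dilation. The sketch below is not the authors' procedure (they report using ITK-SNAP); it assumes a binary 3D mask on the 1 mm isotropic grid, a 6-connected structuring element, and that the peritumoral ROI excludes the tumor itself. The function name `peritumoral_ring` and the optional `lung_mask` argument are hypothetical.

```python
import numpy as np
from scipy import ndimage


def peritumoral_ring(tumor_mask, n_voxels, lung_mask=None):
    """Return the shell of voxels within n_voxels of the tumor boundary.

    Assumptions (not stated in the paper): 6-connected dilation, and the
    tumor voxels themselves are excluded from the peritumoral ROI.
    """
    tumor = np.asarray(tumor_mask, dtype=bool)
    # 6-connected neighbourhood in 3D; each iteration grows the mask by one voxel
    structure = ndimage.generate_binary_structure(3, 1)
    dilated = ndimage.binary_dilation(tumor, structure=structure, iterations=n_voxels)
    ring = dilated & ~tumor
    if lung_mask is not None:
        # Optionally keep only voxels that lie inside a lung mask
        ring &= np.asarray(lung_mask, dtype=bool)
    return ring


def all_rings(tumor_mask, lung_mask=None, margins=(3, 6, 12)):
    """Build one ring per margin, mirroring the 3-, 6-, and 12-voxel regions."""
    return {m: peritumoral_ring(tumor_mask, m, lung_mask) for m in margins}
```

On a 1 mm isotropic grid, a margin of 3, 6, or 12 voxels corresponds to roughly 3, 6, or 12 mm along the image axes; a different structuring element would give a rounder or boxier shell.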
To prevent extraction errors, areas of the peritumoral ROIs that extended beyond the lung parenchyma were removed [24] (Figure 3B). Both the manual delineation and region expansion were executed using ITK-SNAP.

Radiomics Feature Extraction and Selection

Radiomics features were extracted from both tumoral and peritumoral regions of the CT images using the PyRadiomics package in Python. The extracted features included first-order features, shape features (2D and 3D), and gray-level texture features, namely gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), gray-level run length matrix (GLRLM), neighboring gray tone difference matrix (NGTDM), and gray-level dependence matrix (GLDM), in addition to wavelet features. For each of the four feature groups (one per ROI: tumoral and the three peritumoral regions), an initial screening was performed in R using Spearman correlation analysis, whereby one feature from each pair with a correlation coefficient greater than 0.9 was randomly excluded. Subsequently, a random forest regression algorithm was applied to rank the remaining features by variable importance. The top 10 STAS-related features from each feature group were selected, resulting in a total of 40 features that were used for subsequent model development. The workflow for radiomics feature extraction and selection is illustrated in Figure 3C.

Machine Learning Model Development

We employed 15 machine learning algorithms, including LASSO, support vector machine, random forest, gradient boosting, elastic net regression, and neural networks. By optimizing hyperparameters and combining different models, we developed multiple radiomics models for STAS prediction. Additionally, we constructed a clinical prediction model for STAS risk (Clinic-LR) using five clinical features: Tdmax, CTR, spiculation, vacuole sign, and CEA. Furthermore, we integrated the radiomics-predicted STAS label with these clinical features to form a combined logistic regression model (Combined Model) for comparison.
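The two-step feature screening described above (a Spearman correlation filter followed by random forest ranking) can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' R code: the function names, the number of trees, and the random seeds are hypothetical, and the pairwise scan order is one of several reasonable ways to "randomly exclude" one feature from each correlated pair.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def drop_correlated(features, threshold=0.9, seed=0):
    """Randomly drop one feature from each pair with |Spearman rho| > threshold."""
    rng = np.random.default_rng(seed)
    corr = features.corr(method="spearman").abs()
    cols = list(features.columns)
    dropped = set()
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            a, b = cols[i], cols[j]
            if a in dropped or b in dropped:
                continue  # pair already resolved by an earlier removal
            if corr.loc[a, b] > threshold:
                dropped.add(a if rng.random() < 0.5 else b)
    return features.drop(columns=sorted(dropped))


def top_k_by_forest(features, labels, k=10, seed=0):
    """Rank features by random forest regression importance and keep the top k."""
    forest = RandomForestRegressor(n_estimators=100, random_state=seed)
    forest.fit(features, labels)
    importance = pd.Series(forest.feature_importances_, index=features.columns)
    return importance.sort_values(ascending=False).head(k).index.tolist()
```

Applied once per ROI feature group with `k=10`, this would yield four lists of ten features, which matches the 40-feature total reported in the text.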
Finally, the models were evaluated in both the training and test sets in terms of accuracy, discrimination ability, and clinical benefit. SHapley Additive exPlanations (SHAP) explainability analysis was used to interpret the machine learning models [25].

Results

Baseline Characteristics of Enrolled Patients

Table 1 summarizes the data of 609 patients included in this study. Among these, 226 patients from Center A (2015–2018) were used as the training set, for which both imaging features and prognostic follow-up were collected; the median follow-up was 2330 days, with a maximum follow-up of 10 years. In contrast, 306 patients from Center A (2019) formed Test Set A, with only prognostic follow-up data available; the median follow-up was 1940 days. Additionally, 77 patients from Centers B and C comprised Test Set B, where imaging features were collected without prognostic follow-up. Missing values in CEA were imputed using the median value.

Table 1 Baseline characteristics of patients in the model development cohort

Train set                      Negative (N=133)     Positive (N=93)      All (N=226)
Tdmax_cm, Mean (SD)            1.96 (0.747)         2.45 (0.812)         2.16 (0.810)
Tdmax_cm, Median [Min, Max]    1.87 [0.700, 4.50]   2.37 [0.990, 5.28]   2.07 [0.700, 5.28]
CTR, Mean (SD)                 0.614 (0.310)        0.769 (0.220)        0.678 (0.286)
CTR, Median [Min, Max]         0.638 [0, 1.00]      0.806 [0.136, 1.00]  0.707 [0, 1.00]
spiculation, No                69 (51.9%)           26 (28.0%)           95 (42.0%)
spiculation, Yes               64 (48.1%)           67 (72.0%)           131 (58.0%)
vacuole, No                    90 (67.7%)           56 (60.2%)           146 (64.6%)
vacuole, Yes                   43 (32.3%)           37 (39.8%)           80 (35.4%)
CEA, Mean (SD)                 3.36 (3.80)          5.06 (7.58)          4.06 (5.71)
CEA, Median [Min, Max]         2.80 [0.210, 34.0]   2.81 [0.760, 54.9]   2.81 [0.210, 54.9]
PFS.status, No-progress        122 (91.7%)          57 (61.3%)           179 (79.2%)
PFS.status, Progress           11 (8.3%)            36 (38.7%)           47 (20.8%)
PFS.time, Mean (SD)            2570 (482)           2000 (895)           2330 (736)
PFS.time, Median [Min, Max]    2490 [752, 3650]     2340 [246, 3460]     2440 [246, 3650]

Test set A                     Negative (N=183)     Positive (N=123)     All (N=306)
PFS.status, No-progress        176 (96.2%)          107 (87.0%)          283 (92.5%)
PFS.status, Progress           7 (3.8%)             16 (13.0%)           23 (7.5%)
PFS.time, Mean (SD)            1990 (184)           1870 (334)           1940 (262)
PFS.time, Median [Min, Max]    2010 [943, 2190]     1950 [695, 2190]     1990 [695, 2190]

Test set B                     Negative (N=47)      Positive (N=30)      All (N=77)
Tdmax_cm, Mean (SD)            1.71 (0.675)         1.89 (0.619)         1.78 (0.656)
Tdmax_cm, Median [Min, Max]    1.70 [0.730, 3.31]   1.91 [0.770, 3.10]   1.80 [0.730, 3.31]
CTR, Mean (SD)                 0.478 (0.439)        0.556 (0.407)        0.508 (0.426)
CTR, Median [Min, Max]         0.400 [0, 1.00]      0.480 [0, 1.00]      0.450 [0, 1.00]
spiculation, No                36 (76.6%)           8 (26.7%)            44 (57.1%)
spiculation, Yes               11 (23.4%)           22 (73.3%)           33 (42.9%)
vacuole, No                    44 (93.6%)           19 (63.3%)           63 (81.8%)
vacuole, Yes                   3 (6.4%)             11 (36.7%)           14 (18.2%)
CEA, Mean (SD)                 2.37 (1.43)          4.81 (4.83)          3.32 (3.40)
CEA, Median [Min, Max]         2.20 [0.660, 8.50]   3.04 [0.720, 25.5]   2.56 [0.660, 25.5]

Not available (shown as "-" in the original table): Tdmax_cm, CTR, spiculation, vacuole, and CEA in Test set A; PFS.status and PFS.time in Test set B.
Tdmax: Tumor diameter max; CTR: Consolidation-to-tumour ratio; CEA: Carcinoembryonic antigen; PFS: Progression-free survival (PFS.time in days).

Feature extraction and selection

For each ROI, PyRadiomics extracted 1,133 features. After Spearman correlation analysis, 258 features from ROI-tumoral, 232 from ROI-peritumor_3mm, 236 from ROI-peritumor_6mm, and 236 from ROI-peritumor_12mm were retained. Random forest was then used for further feature selection within each ROI, retaining the top 10 features based on variable importance ranking (Figure 4). Ultimately, these 40 selected features were incorporated into the radiomics model.

Model Development and Evaluation

A total of 237 optimized models were constructed. Based on the average accuracy across the training set and two test sets (Figure 5A, Supplementary Figure 2), the best-performing model was selected: a 10-fold cross-validated elastic net regression model (ENR−CV: 10-fold, cutoff = 0.5, alpha = 0.9), referred to as ENR-Rad. The accuracy of ENR-Rad was 0.801 in the training set, 0.866 in Test Set A, and 0.831 in Test Set B. The receiver operating characteristic (ROC) curves and corresponding area under the curve (AUC) values were calculated for all three datasets (train = 0.791, test A = 0.829, test B = 0.807) (Figure 5B).
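A 10-fold cross-validated elastic net classifier with a fixed mixing parameter and a 0.5 cutoff can be sketched with scikit-learn as below. Several points are assumptions rather than details from the paper: a logistic (binomial) formulation, feature standardization, the `saga` solver, the grid of 10 regularization strengths, and log-loss as the selection criterion. The reported "alpha = 0.9" follows the glmnet naming of the L1/L2 mixing weight, which scikit-learn calls `l1_ratio`; glmnet's lambda corresponds inversely to scikit-learn's `C`. The helper names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_elastic_net(X, y, l1_ratio=0.9, folds=10, seed=0):
    """Fit an elastic-net-penalized logistic model, tuning strength by k-fold CV."""
    model = make_pipeline(
        StandardScaler(),  # penalized models are sensitive to feature scale
        LogisticRegressionCV(
            penalty="elasticnet",
            solver="saga",          # the scikit-learn solver that supports elastic net
            l1_ratios=[l1_ratio],   # fixed mixing weight (glmnet's "alpha")
            Cs=10,                  # candidate inverse regularization strengths
            cv=folds,
            scoring="neg_log_loss",
            max_iter=5000,
            random_state=seed,
        ),
    )
    return model.fit(X, y)


def predict_stas(model, X, cutoff=0.5):
    """Convert predicted probabilities into 0/1 labels at the given cutoff."""
    return (model.predict_proba(X)[:, 1] >= cutoff).astype(int)
```

With an integer `cv` and a binary outcome, scikit-learn uses stratified folds, so each fold keeps approximately the same share of positive cases.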
Additionally, a logistic regression model based on clinical imaging features (Clinic-LR) was developed for comparison. The diagnostic performance of ENR-Rad, Clinic-LR, and a combined model (CM) integrating the ENR-Rad output with clinical features was evaluated (Figure 5C). In the training set, ENR-Rad (AUC = 0.791) demonstrated higher diagnostic performance than Clinic-LR (AUC = 0.738), though the difference was not statistically significant (DeLong test p = 0.1246). However, CM (AUC = 0.834) exhibited significantly better performance than ENR-Rad (DeLong test p = 0.0089). The superiority of the radiomics-based model was further highlighted in Test Set B (Figure 1), where ENR-Rad (AUC = 0.807) significantly outperformed Clinic-LR (AUC = 0.689) (DeLong test p < 0.05). Additionally, CM (AUC = 0.894) showed a significant improvement over ENR-Rad (DeLong test p < 0.001). Decision curve analysis (DCA) in the training set indicated that CM provided the highest net clinical benefit, followed by ENR-Rad, with Clinic-LR ranking last (Figure 5D). A similar trend was observed in Test Set B (Figure 1).

SHAP Interpretability Analysis

In the ENR-Rad model, only 12 features were retained at lambda = lambda.min, so only these 12 features contribute to the model's predictions (Supplementary Figure 3). SHAP interpretability analysis of ENR-Rad showed that wavelet.LLL_glszm_SmallAreaHighGrayLevelEmphasis contributed most to the model (Figure 5E). Figure 5F is a single-sample SHAP force plot that shows the extent to which each feature contributes to the prediction. Blue indicates positive contribution and red indicates negative contribution. The baseline expected value E[f(x)] = 0.412 is the overall mean prediction of the model. The prediction for the sample shown, f(x) = 0.025, is well below the baseline, indicating that its feature combination pushes the prediction downward, that is, toward STAS negative.
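The relationship between the baseline E[f(x)], the per-feature contributions, and the sample prediction f(x) in a force plot can be made concrete for a linear model. The sketch below is a minimal illustration, not the authors' SHAP pipeline: it assumes a linear predictor and treats features as independent, in which case the SHAP value of feature i is w_i (x_i − mean(x_i)). For a logistic model this decomposition applies on the log-odds scale; which scale the paper's force plot uses is not stated. The function name and arguments are hypothetical.

```python
import numpy as np


def linear_shap(weights, intercept, background, sample):
    """SHAP values for a linear model under a feature-independence assumption.

    Returns the base value E[f(x)] over the background data and one
    contribution per feature, so that base + sum(contributions) == f(sample).
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(background, dtype=float).mean(axis=0)
    base_value = intercept + means @ weights
    contributions = weights * (np.asarray(sample, dtype=float) - means)
    return base_value, contributions
```

Features whose coefficient was shrunk to zero by the elastic net penalty receive a contribution of exactly zero, which is why only the retained features appear in the explanation. A prediction below the base value means the negative contributions outweigh the positive ones for that sample.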
Prognostic Relevance of ENR-Rad

As a newly identified pathway of lung cancer dissemination, STAS has been shown to be closely associated with tumor prognosis. To evaluate the prognostic value of ENR-Rad, Kaplan–Meier survival curves were plotted to compare progression-free survival (PFS) between high-risk and low-risk groups predicted by the model. In the training set, STAS-positive patients exhibited significantly worse PFS than STAS-negative patients (p < 0.001) (Figure 6A). Similarly, patients predicted by ENR-Rad to be at high risk for STAS had significantly poorer PFS than those predicted to be at low risk (p = 0.011) (Figure 6B). This finding was further validated in Test Set A, the test cohort with follow-up data, where STAS-negative patients had a significantly longer PFS (p = 0.002) (Figure 6C), and those classified as low risk by ENR-Rad also showed a prolonged PFS (p < 0.001) (Figure 6D). These results reinforce the potential clinical utility of the ENR-Rad model, offering valuable insights for improving diagnostic accuracy and personalizing treatment strategies.

Discussion

Recent studies have shown that, for early-stage lung cancer less than 2 cm in diameter with negative hilar and mediastinal lymph nodes, sublobar resection yields disease-free survival and overall survival that are not inferior to those of lobectomy [26–28]. Currently, there are no clear international guidelines on surgical strategy for STAS-positive LUAD nodules ≤ 3 cm in diameter. However, several studies have suggested that lobectomy may provide a better prognosis for patients with STAS-positive early-stage lung adenocarcinoma [29–33]. Regarding postoperative adjuvant therapy, the 2023 NCCN guidelines recommend adjuvant chemotherapy as a standard treatment for early-stage NSCLC patients with high-risk factors, such as highly invasive tumors [34].
According to the Guidelines for Clinical Diagnosis and Treatment of Lung Cancer issued by the Chinese Medical Association in 2024 [35], adjuvant chemotherapy is recommended for patients with early-stage non-small cell lung cancer (NSCLC) with high-risk factors, including STAS-positive patients. In certain clinical trials, such as the KEYNOTE-091 trial [36], immune checkpoint inhibitors (ICIs) have been shown to significantly improve disease-free survival in early-stage (IB) high-risk NSCLC patients. In summary, STAS-positive patients may derive additional clinical benefits from immunotherapy. However, there are currently no clinical trials that specifically target STAS-positive patients as a separate subgroup, highlighting the need for further investigation.

Based on our previous research [13], we developed a clinical feature-based model (Clinic-LR) using five clinical factors (Tdmax, CTR, spiculation, vacuole sign, and CEA) and compared its performance with a radiomics-based model (ENR-Rad) and a combined model (CM). Our results demonstrated that ENR-Rad significantly outperformed Clinic-LR in the external validation cohort (Test Set B). Furthermore, the CM, incorporating both radiomic and clinical features, achieved the best predictive performance. These findings suggest that radiomic features serve as a powerful complement to traditional clinical factors in improving STAS prediction. The superiority of the CM further indicates that while radiomic features provide strong predictive capabilities, integrating clinical information can enhance model stability and generalizability.

Kaplan–Meier survival analysis revealed that STAS-positive patients had significantly shorter progression-free survival (PFS) than STAS-negative patients, further confirming STAS as a high-risk factor for recurrence.
Additionally, when patients were stratified according to STAS status predicted by the radiomics model, those predicted as STAS-positive exhibited worse survival outcomes. This finding suggests that our radiomics model not only serves as a tool for STAS risk assessment but may also function as a surrogate indicator for recurrence risk, thereby assisting in personalized treatment decision-making.

Recent studies have employed SHAP for feature selection in predicting STAS in LUAD, highlighting the significance of multi-resolution texture information and both tumor and peritumoral characteristics [37–39]. In our study, SHAP-based feature selection results showed that wavelet-transformed features played a dominant role, indicating that multi-resolution texture information is crucial for STAS prediction [40, 41]. Among the 12 selected features, tumor and peritumoral features contributed equally, suggesting that both intrinsic tumor characteristics and peritumoral tissue properties are essential for STAS identification. Features from peri3, peri6, and peri12 regions were all highly ranked, underscoring the significance of both proximal and distant peritumoral characteristics in STAS prediction. Several GLSZM and GLDM-based features (e.g., SmallAreaHighGrayLevelEmphasis and LargeDependenceLowGrayLevelEmphasis) were among the highest-ranking SHAP features, indicating that gray-level heterogeneity is a key factor in distinguishing STAS-positive cases. The top-ranked feature, wavelet.LLL_glszm_SmallAreaHighGrayLevelEmphasis, represents small high-intensity clusters within the tumor, suggesting that high-density heterogeneous regions are associated with STAS positivity.
The second-ranked feature, peri6_wavelet.LHL_glszm_SmallAreaLowGrayLevelEmphasis, derived from the 6 mm peritumoral region, describes small low-intensity clusters, emphasizing the role of peritumoral microstructural alterations in STAS dissemination. The integration of SHAP analysis enhances the model's clinical interpretability and provides novel insights into the biological mechanisms of STAS.

Based on our findings, we propose a treatment strategy for ≤ 3 cm pulmonary nodules, as illustrated in Fig. 7. Patients identified as high-risk for STAS through the CM model can undergo adjusted surgical planning preoperatively to reduce local recurrence risk. Additionally, patients confirmed to be STAS-positive via postoperative pathology should receive aggressive adjuvant therapy, including chemotherapy, targeted therapy, or immunotherapy, depending on individual clinical characteristics.

Our study has several limitations. First, this is a retrospective study, and while it includes survival analysis, the extended follow-up period limited our ability to perform prospective validation. Second, although this is a multicenter study, the dataset consists exclusively of patients from northern China, and given potential population heterogeneity, future studies should incorporate samples from diverse geographic regions for broader validation. Third, radiomic feature extraction is sensitive to imaging parameters and segmentation variability. Despite standardized preprocessing, subtle differences in imaging protocols or manual segmentation procedures may affect feature reproducibility and, consequently, model performance. Therefore, developing automated segmentation models would be beneficial for future applications.

Conclusion

In this study, we developed a novel radiomics-based machine learning model incorporating both tumor and peritumoral features to preoperatively predict STAS status in stage I LUAD.
The model outperformed traditional clinical feature-based models, demonstrating the potential application of radiomics in personalized surgical decision-making. Future prospective validation studies and model optimization will be crucial for the clinical translation of this technology.

Declarations

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. This study was approved by the ethical review committee of Tianjin Chest Hospital.

Consent for publication

Not applicable.

Availability of data and materials

Since the research data used in this paper are from clinical patients, they will not be directly disclosed for the sake of protecting patient privacy. The imaging data used and/or analysed during the current study are available from the corresponding author on reasonable request.

Competing interests

The authors declare that they have no competing interests.

Funding

The research was funded by Tianjin Key Medical Discipline (Specialty) Construction Project (TJYXZDXK-018A).

Authors' contributions

Conceptualization, Yuhang Wang and Xufeng Liu; Data curation, Yuhang Wang and Xin Li; Formal analysis, Xiaojiang Zhao; Funding acquisition, Xin Li and Daqiang Sun; Investigation, Zixiao Wang and Xufeng Liu; Methodology, Yuhang Wang and Xiaojiang Zhao; Project administration, Daqiang Sun; Resources, Zixiao Wang, Xufeng Liu and Xin Li; Software, Yuhang Wang and Xiaojiang Zhao; Supervision, Daqiang Sun; Writing – original draft, Yuhang Wang; Writing – review & editing, Daqiang Sun.

Acknowledgements

We thank all authors for their contributions to this manuscript. We also thank OnekeyAI for providing part of the technical support for this study.

References

1. Siegel RL, Kratzer TB, Giaquinto AN, Sung H, Jemal A. Cancer statistics, 2025. CA Cancer J Clin. 2025;75(1):10-45.
2. Chansky K, Detterbeck FC, Nicholson AG, et al. The IASLC Lung Cancer Staging Project: External Validation of the Revision of the TNM Stage Groupings in the Eighth Edition of the TNM Classification of Lung Cancer. J Thorac Oncol. 2017;12(7):1109-1121.
3. Hung JJ, Jeng WJ, Hsu WH, et al. Prognostic factors of postrecurrence survival in completely resected stage I non-small cell lung cancer with distant metastasis. Thorax. 2010;65(3):241-245.
4. Travis WD, Brambilla E, Burke AP, Marx A, Nicholson AG. Introduction to The 2015 World Health Organization Classification of Tumors of the Lung, Pleura, Thymus, and Heart. J Thorac Oncol. 2015;10(9):1240-1242.
5. Yanagawa N, Shiono S, Endo M, Ogata SY. Tumor spread through air spaces is a useful predictor of recurrence and prognosis in stage I lung squamous cell carcinoma, but not in stage II and III. Lung Cancer. 2018;120:14-21.
6. Dai C, Xie H, Su H, et al. Tumor Spread through Air Spaces Affects the Recurrence and Overall Survival in Patients with Lung Adenocarcinoma >2 to 3 cm. J Thorac Oncol. 2017;12(7):1052-1060.
7. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016 Feb;278(2):563-77.
8. Yip SS, Aerts HJ. Applications and limitations of radiomics. Phys Med Biol. 2016;61(13):R150-R166.
9. Wu L, Lou X, Kong N, Xu M, Gao C. Can quantitative peritumoral CT radiomics features predict the prognosis of patients with non-small cell lung cancer? A systematic review [published online ahead of print, 2022 Oct 29]. Eur Radiol. 2022;10.1007/s00330-022-09174-8.
10. Mu W, Jiang L, Zhang J, et al. Non-invasive decision support for NSCLC treatment using PET/CT radiomics. Nat Commun. 2020;11(1):5228. Published 2020 Oct 16.
11. Chen D, She Y, Wang T, et al. Radiomics-based prediction for tumour spread through air spaces in stage I lung adenocarcinoma using machine learning. Eur J Cardiothorac Surg. 2020;58(1):51-58.
12. Liao G, Huang L, Wu S, et al. Preoperative CT-based peritumoral and tumoral radiomic features prediction for tumor spread through air spaces in clinical stage I lung adenocarcinoma. Lung Cancer. 2022;163:87-95.
13. Ding Y, Chen Y, Wen H, et al. Pretreatment prediction of tumour spread through air spaces in clinical stage I non-small-cell lung cancer. Eur J Cardiothorac Surg. 2022;62(3):ezac248.
14. Zhuo Y, Feng M, Yang S, et al. Radiomics nomograms of tumors and peritumoral regions for the preoperative prediction of spread through air spaces in lung adenocarcinoma. Transl Oncol. 2020;13(10):100820.
15. Qi L, Li X, He L, et al. Comparison of Diagnostic Performance of Spread Through Airspaces of Lung Adenocarcinoma Based on Morphological Analysis and Perinodular and Intranodular Radiomic Features on Chest CT Images. Front Oncol. 2021;11:654413.
16. Liao G, Huang L, Wu S, et al. Preoperative CT-based peritumoral and tumoral radiomic features prediction for tumor spread through air spaces in clinical stage I lung adenocarcinoma. Lung Cancer. 2022;163:87-95.
17. Tang X, Huang H, Du P, Wang L, Yin H, Xu X. Intratumoral and peritumoral CT-based radiomics strategy reveals distinct subtypes of non-small-cell lung cancer. J Cancer Res Clin Oncol. 2022;148(9):2247-2260.
18. Ran J, Cao R, Cai J, Yu T, Zhao D, Wang Z. Development and Validation of a Nomogram for Preoperative Prediction of Lymph Node Metastasis in Lung Adenocarcinoma Based on Radiomics Signature and Deep Learning Signature. Front Oncol. 2021;11:585942.
19. Guo QK, Yang HS, Shan SC, et al. A radiomics nomogram prediction for survival of patients with "driver gene-negative" lung adenocarcinomas (LUAD). Radiol Med. 2023;128(6):714-725.
20. Zhao M, Kluge K, Papp L, et al. Multi-lesion radiomics of PET/CT for non-invasive survival stratification and histologic tumor risk profiling in patients with lung adenocarcinoma. Eur Radiol. 2022;32(10):7056-7067.
21. Wu L, Lou X, Kong N, Xu M, Gao C. Can quantitative peritumoral CT radiomics features predict the prognosis of patients with non-small cell lung cancer? A systematic review. Eur Radiol. 2023;33(3):2105-2117.
22. Chen Q, Shao J, Xue T, et al. Intratumoral and peritumoral radiomics nomograms for the preoperative prediction of lymphovascular invasion and overall survival in non-small cell lung cancer [published online ahead of print, 2022 Sep 6]. Eur Radiol. 2022;10.1007/s00330-022-09109-3.
23. Liu K, Li K, Wu T, et al. Improving the accuracy of prognosis for clinical stage I solid lung adenocarcinoma by radiomics models covering tumor per se and peritumoral changes on CT. Eur Radiol. 2022;32(2):1065-1077.
24. Tunali I, Hall LO, Napel S, et al. Stability and reproducibility of computed tomography radiomic features extracted from peritumoral regions of lung cancer lesions. Med Phys. 2019;46(11):5075-5085.
25. Rodríguez-Pérez R, Bajorath J. Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions. J Comput Aided Mol Des. 2020;34(10):1013-1026.
26. Zhang J, Bai W, Guo C, et al. Postoperative Short-term Outcomes Between Sublobar Resection and Lobectomy in Patients with Lung Adenocarcinoma. Cancer Manag Res. 2020;12:9485-9493.
27. Altorki NK, Wang X, Wigle D, et al. Perioperative mortality and morbidity after sublobar versus lobar resection for early-stage non-small-cell lung cancer: post-hoc analysis of an international, randomised, phase 3 trial (CALGB/Alliance 140503). Lancet Respir Med. 2018;6(12):915-924.
28. Altorki N, Wang X, Kozono D, et al. Lobar or Sublobar Resection for Peripheral Stage IA Non-Small-Cell Lung Cancer. N Engl J Med. 2023;388(6):489-498.
29. Eguchi T, Kameda K, Lu S, et al. Lobectomy Is Associated with Better Outcomes than Sublobar Resection in Spread through Air Spaces (STAS)-Positive T1 Lung Adenocarcinoma: A Propensity Score-Matched Analysis. J Thorac Oncol. 2019;14(1):87-98.
30. Toki MI, Harrington K, Syrigos KN. The role of spread through air spaces (STAS) in lung adenocarcinoma prognosis and therapeutic decision making. Lung Cancer. 2020;146:127-133.
31. Kadota K, Nitadori JI, Sima CS, et al. Tumor Spread through Air Spaces is an Important Pattern of Invasion and Impacts the Frequency and Location of Recurrences after Limited Resection for Small Stage I Lung Adenocarcinomas. J Thorac Oncol. 2015;10(5):806-814.
32. Yang Y, Xie X, Wang Y, et al. A systematic review and meta-analysis of the influence of STAS on the long-term prognosis of stage I lung adenocarcinoma. Transl Cancer Res. 2021;10(5):2428-2436.
33. Kagimoto A, Tsutani Y, Okada M. Segmentectomy for Spread Through Air Spaces-positive Lung Adenocarcinoma. Ann Thorac Surg. 2022;114(5):1989-1990.
34. Ettinger DS, Wood DE, Aisner DL, et al. NCCN Guidelines® Insights: Non-Small Cell Lung Cancer, Version 2.2023. J Natl Compr Canc Netw. 2023;21(4):340-350.
35. Oncology Society of Chinese Medical Association. Zhonghua Yi Xue Za Zhi. 2024;104(34):3175-3213.
36. O'Brien M, Paz-Ares L, Marreaud S, et al. Pembrolizumab versus placebo as adjuvant therapy for completely resected stage IB-IIIA non-small-cell lung cancer (PEARLS/KEYNOTE-091): an interim analysis of a randomised, triple-blind, phase 3 trial. Lancet Oncol. 2022;23(10):1274-1286.
37. Liu C, Meng A, Xue XQ, et al. Prediction of early lung adenocarcinoma spread through air spaces by machine learning radiomics: a cross-center cohort study. Transl Lung Cancer Res. 2024;13(12):3443-3459.
38. Zhang Z, Zhao Y, Ma YJ, et al. Prediction of STAS in lung adenocarcinoma with nodules ≤ 2 cm using machine learning: a multicenter retrospective study. BMC Cancer. 2025;25(1):417. Published 2025 Mar 7.
39. Ye G, Wu G, Li Y, et al. Advancing presurgical non-invasive spread through air spaces prediction in clinical stage IA lung adenocarcinoma using artificial intelligence and CT signatures. Front Surg. 2025;11:1511024. Published 2025 Jan 14.
40. Jiang Z, Yin J, Han P, et al.
Wavelet transformation can enhance computed tomography texture features: a multicenter radiomics study for grade assessment of COVID-19 pulmonary lesions. Quant Imaging Med Surg. 2022;12(10):4758-4770.
Huang K, Aviyente S. Wavelet feature selection for image classification. IEEE Trans Image Process. 2008;17(9):1709-1720.
Additional Declarations: No competing interests reported.
Supplementary Files: supplementfigure1.pdf, Supplementfigure2.pdf, supplementfigure3.pdf
Version 1 posted. Editorial decision: Revision requested 12 Jul, 2025; Reviews received at journal 04 Jul, 2025; Reviewers agreed at journal 04 Jul, 2025; Reviewers invited by journal 04 Apr, 2025; Editor assigned by journal 02 Apr, 2025; Submission checks completed at journal 01 Apr, 2025; First submitted to journal 31 Mar, 2025.
{"props":{"pageProps":{"initialData":{"identity":"rs-6345504","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":447525380,"identity":"ee844fb9-7fee-42fc-8bae-984cecbe55dd","order_by":0,"name":"Yuhang Wang","email":"","orcid":"","institution":"Department of Thoracic, Tianjin Chest Hospital","correspondingAuthor":false,"prefix":"","firstName":"Yuhang","middleName":"","lastName":"Wang","suffix":""},{"id":447525381,"identity":"14055f3d-b6df-4771-8031-7c4cab17fdbe","order_by":1,"name":"Xufeng Liu","email":"","orcid":"","institution":"Department of Cardiothoracic Surgery, Tianjin Binhai New Area Haibin People’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Xufeng","middleName":"","lastName":"Liu","suffix":""},{"id":447525382,"identity":"06401970-1b8b-4216-a14f-7f1bca328918","order_by":2,"name":"Xiaojiang Zhao","email":"","orcid":"","institution":"TianJin Chest Hospital of Tianjin University","correspondingAuthor":false,"prefix":"","firstName":"Xiaojiang","middleName":"","lastName":"Zhao","suffix":""},{"id":447525383,"identity":"5a24b623-896c-4a80-a517-9068bc693570","order_by":3,"name":"Zixiao Wang","email":"","orcid":"","institution":"Department of Thoracic Surgery, Qinhuangdao First Hospital","correspondingAuthor":false,"prefix":"","firstName":"Zixiao","middleName":"","lastName":"Wang","suffix":""},{"id":447525384,"identity":"0c2e4e43-2656-4bd6-8973-32c1254f7ff8","order_by":4,"name":"Xin Li","email":"","orcid":"","institution":"Department of Thoracic, Tianjin Chest 
Hospital","correspondingAuthor":false,"prefix":"","firstName":"Xin","middleName":"","lastName":"Li","suffix":""},{"id":447525385,"identity":"6b474ccd-9b26-467a-b8a2-03e216cd5dd4","order_by":5,"name":"Daqiang Sun","email":"","orcid":"","institution":"Department of Thoracic, Tianjin Chest Hospital","correspondingAuthor":true,"prefix":"","firstName":"Daqiang","middleName":"","lastName":"Sun","suffix":""}],"badges":[],"createdAt":"2025-03-31 14:08:09","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6345504/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6345504/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s40644-025-00935-4","type":"published","date":"2025-09-29T15:57:42+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":82077834,"identity":"a3ab32b7-0d0f-419c-b608-40c2f588655a","added_by":"auto","created_at":"2025-05-06 14:06:16","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":158337,"visible":true,"origin":"","legend":"\u003cp\u003eOverview of the data processing, feature selection, model development, and evaluation workflow. Data sources include three medical centers: Center A (2015-2018) for training set (n=226), Center A (2019) for test A (n=306), and Centers B and C for test B (n=77). Radiomics features selection utilizes the top 10 most important variables from random forest ranking. The 237 model combinations were developed using 15 machine learning algorithms. 
The final model, ENR-Rad, was evaluated using ROC curves (Test B: AUC Clinic-LR=0.567, ENR-Rad=0.807, Combined=0.861) and Decision Curve Analysis. \u003cstrong\u003eSTAS\u003c/strong\u003e: Spread Through Air Spaces; \u003cstrong\u003eCTR\u003c/strong\u003e: Consolidation-to-Tumor Ratio; \u003cstrong\u003eCEA\u003c/strong\u003e: Carcinoembryonic Antigen; \u003cstrong\u003eClinic-LR\u003c/strong\u003e: Clinical Logistic Regression Model;\u003cstrong\u003e ENR-rad\u003c/strong\u003e: Elastic Net Regression-Radiomics Model; \u003cstrong\u003eCB\u003c/strong\u003e: Combined Model; \u003cstrong\u003eROC\u003c/strong\u003e: Receiver Operating Characteristic\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/341964634fd25264c8a577ca.jpeg"},{"id":82079999,"identity":"9b80ed49-0b17-4290-930b-fcee0391d583","added_by":"auto","created_at":"2025-05-06 14:22:16","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":240574,"visible":true,"origin":"","legend":"\u003cp\u003eFlow diagram of patient selection. Patients with primary invasive lung adenocarcinoma were retrospectively recruited from multiple centers. From Tianjin Chest Hospital between January 2015 and December 2018 (Center A), 226 patients were finally included for model development (training set). 
In addition, 306 patients from the same center between January 2019 and December 2019 (Center A) were selected as Test A, and 77 patients from Tianjin Jinnan Hospital and Qinhuangdao First Hospital between January 2019 and December 2020 (Centers B and C) were included as Test B.\u003c/p\u003e","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/256fde23a7e7207fa4ac57df.jpeg"},{"id":82080001,"identity":"4dbf763a-82df-4180-894e-3d71ea0faa33","added_by":"auto","created_at":"2025-05-06 14:22:16","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":486311,"visible":true,"origin":"","legend":"\u003cp\u003eExample of region-of-interest delineation. A. The tumoral region of interest was delineated in ITK-SNAP and expanded outward by 3 mm, 6 mm, and 12 mm to define the peritumoral regions; radiomic features were extracted separately from the tumoral and peritumoral regions. B. Portions of the peritumoral region extending beyond the lung parenchyma were removed manually.\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/6a019e2dc94be2c98ce52557.jpeg"},{"id":82081229,"identity":"a0f588b7-1abd-4bef-b328-3afcd43901f2","added_by":"auto","created_at":"2025-05-06 14:30:16","extension":"jpeg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":206320,"visible":true,"origin":"","legend":"\u003cp\u003eRandom forest-based feature selection for radiomics. The figure displays the top 10 features ranked by importance for each region of interest: peritumoral regions at 3 mm, 6 mm, and 12 mm from the tumor boundary, as well as for the tumoral region. 
For each panel, the horizontal bars represent the importance scores derived from the random forest algorithm, and the vertical axis lists the feature names (including various wavelet, log-sigma, and original shape/first-order metrics).\u003c/p\u003e","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/8867addbb317f815ca344c89.jpeg"},{"id":82079517,"identity":"3b6b91e9-9d4c-4ae3-a873-627672a9da2d","added_by":"auto","created_at":"2025-05-06 14:14:16","extension":"jpeg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":890881,"visible":true,"origin":"","legend":"\u003cp\u003eModel construction and evaluation. A: Fifteen machine learning methods were combined and tuned to construct 237 models; their accuracy on the training and test sets is shown as the basis for model selection. Based on the average accuracy across the three data sets, ENR−CV:10 fold (cutoff:0.5, alpha:0.9) was selected and named ENR-Rad. B: ROC curves of ENR-Rad in the training and test sets. C: ROC curves, drawn from the training set data, comparing ENR-Rad, the LR model based only on clinical characteristics, and the Combined Model that integrates ENR-Rad with the clinical features; the Combined Model (CM) has the highest AUC (0.834). D: DCA curves of the three models, drawn from the training set data with 50 bootstrap resamples; CM shows the greatest clinical benefit. E: SHAP importance bar chart, based on SHAP values, for the 12 features retained in the ENR-Rad model. F: SHAP force plot for a single sample. 
\u003cstrong\u003e\u0026nbsp;STAS\u003c/strong\u003e: Spread Through Air Spaces; \u003cstrong\u003eClinic-LR\u003c/strong\u003e: Clinical Logistic Regression Model;\u003cstrong\u003eENR-rad\u003c/strong\u003e: Elastic Net Regression-Radiomics Model; \u003cstrong\u003eCB\u003c/strong\u003e: Combined Model; \u003cstrong\u003eROC\u003c/strong\u003e: Receiver Operating Characteristic; \u003cstrong\u003eAUC\u003c/strong\u003e: Area Under the Curve;\u003c/p\u003e","description":"","filename":"floatimage5.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/24f5b2561179b4763bff0977.jpeg"},{"id":82077838,"identity":"587f7288-7396-46da-8a72-18a336669bd9","added_by":"auto","created_at":"2025-05-06 14:06:16","extension":"jpeg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":212617,"visible":true,"origin":"","legend":"\u003cp\u003eEvaluate the prognostic relevance of ENR-Rad. A: Recurrence-free survival of the STAS positive group and negative group in the training set; B: Recurrence-free survival of the STAS positive group and negative group predicted by ENR-Rad in the training set; C: Recurrence-free survival of the STAS positive group and negative group in the testB; D: Recurrence-free survival of the STAS positive group and negative group predicted by ENR-Rad in the testB. 
\u003cstrong\u003eSTAS\u003c/strong\u003e: Spread Through Air Spaces; \u003cstrong\u003eENR-rad\u003c/strong\u003e: Elastic Net Regression-Radiomics Model;\u003c/p\u003e","description":"","filename":"floatimage6.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/82d4d553de0310e0253b2029.jpeg"},{"id":82079518,"identity":"120c136e-8182-44fd-b5bc-9413d7c58867","added_by":"auto","created_at":"2025-05-06 14:14:16","extension":"jpeg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":141035,"visible":true,"origin":"","legend":"\u003cp\u003eA new treatment strategy for ≤3 cm pulmonary nodules based on Combined Model. For patients with intraoperative frozen pathology considered invasive carcinoma, preoperative STAS evaluation should be performed by the model first, and sublobectomy can be performed if STAS is considered negative. If STAS is considered positive, lobectomy should be performed, and adjuvant therapy should be given postoperatively.\u003cstrong\u003eSTAS\u003c/strong\u003e: Spread Through Air Spaces;\u003c/p\u003e","description":"","filename":"floatimage7.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/53aa6123087a113d9b0d808d.jpeg"},{"id":92883996,"identity":"363ce158-c062-4d76-a6f4-eeb91833a2a0","added_by":"auto","created_at":"2025-10-06 16:11:55","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3236014,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/585b6810-9836-4058-9bf2-67fc36708f49.pdf"},{"id":82077874,"identity":"ce2250dd-87e9-4810-bf54-50955c142c8f","added_by":"auto","created_at":"2025-05-06 
14:06:18","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":19349094,"visible":true,"origin":"","legend":"","description":"","filename":"supplementfigure1.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/92815627283d14806e0a9e57.pdf"},{"id":82077832,"identity":"57692c3c-40ed-45c2-832f-136e2f47b0ee","added_by":"auto","created_at":"2025-05-06 14:06:16","extension":"pdf","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":24750,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementfigure2.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/385f58e094b3e1db32c1c300.pdf"},{"id":82077842,"identity":"292e9c16-9d95-4910-8649-9830dbc0aa70","added_by":"auto","created_at":"2025-05-06 14:06:16","extension":"pdf","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":533761,"visible":true,"origin":"","legend":"","description":"","filename":"supplementfigure3.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6345504/v1/b6b72f3b5cd12d98ad9bf67a.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"A Radiomics-Based Machine Learning Model and SHAP for Predicting Spread Through Air Spaces and Its Prognostic Implications in Stage I Lung Adenocarcinoma: A Multicenter Cohort Study","fulltext":[{"header":"Introduction","content":"\u003cp\u003eLung cancer remains the leading cause of cancer-related mortality worldwide [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Among its pathological subtypes, lung adenocarcinoma (LUAD) is the most common, and research on its early detection and treatment has been ongoing [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. With advancements in medical imaging, an increasing number of early-stage LUAD cases are being detected and treated through low-dose computed tomography (CT) screening. 
Despite complete surgical resection being the primary treatment for stage I LUAD, studies have shown that recurrence rates range from 20\u0026ndash;50%, even after curative surgery [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. Therefore, identifying high-risk patients at an early stage is crucial for improving patient prognosis and guiding treatment strategies.\u003c/p\u003e \u003cp\u003eIn the 2015 WHO classification of lung cancer, the concept of spread through air spaces (STAS) was introduced. STAS is defined as the presence of tumor cells spreading beyond the tumor margin into the alveolar spaces in the form of micropapillary clusters, solid nests, or single cells [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Subsequent studies have demonstrated that STAS is a significant risk factor for recurrence in patients with stage I LUAD after surgical resection [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e, \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. However, STAS is typically diagnosed postoperatively through pathological examination, limiting its utility for preoperative treatment planning. Therefore, a reliable preoperative STAS prediction model is essential for identifying high-risk patients and tailoring surgical and adjuvant treatment strategies.\u003c/p\u003e \u003cp\u003eWith the rapid development of machine learning and artificial intelligence in the medical field, researchers have begun extracting quantitative features from medical images to assist in disease diagnosis and prognostication. Radiomics is a powerful approach that aims to transform standard imaging data into high-dimensional quantitative features, capturing complex tumor characteristics beyond human visual perception [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. 
Previous studies have demonstrated that both tumoral and peritumoral CT radiomics features are valuable in assisting the diagnosis, prognostication, and histological classification of LUAD [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. However, existing studies on preoperative STAS prediction have primarily focused on either tumoral or peritumoral radiomics features, and few have comprehensively integrated these modalities to improve predictive performance [\u003cspan additionalcitationids=\"CR10 CR11\" citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eIn our previous research, we developed a clinical model for preoperative STAS prediction in stage I non-small cell lung cancer (NSCLC) patients based on demographic and imaging characteristics, such as tumor size, spiculation, vacuole sign, and carcinoembryonic antigen (CEA) levels [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Other studies have constructed STAS prediction models based on traditional or deep learning radiomics features [\u003cspan additionalcitationids=\"CR15\" citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. However, most of these studies were limited to single-center datasets, and their model generalizability remains uncertain.\u003c/p\u003e \u003cp\u003eTo address these limitations, we conducted a multicenter retrospective cohort study to develop and validate a multimodal preoperative STAS risk prediction model incorporating both tumoral and peritumoral radiomics features. 
We employed a multi-machine learning approach, integrating various algorithms\u0026mdash;including LASSO, support vector machine, random forest, gradient boosting, elastic net regression, and neural networks\u0026mdash;to optimize model performance. We combined radiomics features with key clinical variables to construct a hybrid model, enhancing predictive accuracy. Our dataset was derived from three independent medical centers, allowing us to assess model robustness and generalizability. We believe that our findings will contribute to improved risk stratification and personalized treatment strategies for early-stage LUAD patients.\u003c/p\u003e \u003cp\u003eThe technical workflow and main findings of this study are summarized in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eIn this multi-center retrospective cohort study, we developed and validated a multimodal preoperative STAS risk prediction model based on radiomics features extracted from preoperative CT, including both tumoral and peritumoral radiomics features. This model aims to assist clinicians in the early identification of high-recurrence-risk stage I LUAD patients, enabling timely adjustments to treatment strategies.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eClinical Data Collection and Follow-Up\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIn this multi-center cohort study, we included patients with pathological stage I lung adenocarcinoma (LUAD) who had undergone complete surgical resection. The study samples were collected from three centers. Specifically, patients who underwent surgery at Tianjin Chest Hospital (Center A) between January 1, 2015, and December 31, 2018, were assigned to the training set. Patients who underwent surgery at the same institution from January 1, 2019, to December 31, 2019, were included in test set A. 
Additionally, patients who underwent surgery at Tianjin Binhai New Area Haibin People\u0026rsquo;s Hospital (Center B) and Qinhuangdao First Hospital (Center C) between January 1, 2019, and December 31, 2020, were included in test set B. The detailed inclusion and exclusion criteria are illustrated in Figure 2.\u003c/p\u003e\n\u003cp\u003eIn the training set, STAS status was re-evaluated based on pathological slides (Supplement figure1), whereas in Test Set A and Test Set B, the STAS status was obtained from pathology reports. Based on our previous research, several clinical features were identified as significantly associated with STAS status in stage I non-small cell lung cancer (NSCLC), including maximum tumor diameter (Tdmax), consolidation-to-tumor ratio (CTR), spiculation, vacuole sign, and carcinoembryonic antigen (CEA) levels. Therefore, only these five clinical features were incorporated to compare the performance of the radiomics-based STAS risk prediction model with a clinical feature-based model. These clinical features were collected only in the training set and test set B.\u003c/p\u003e\n\u003cp\u003ePatients from Tianjin Chest Hospital (Center A) were followed up for prognosis assessment, primarily evaluating tumor progression. The last follow-up date was January 1, 2025.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eImage Acquisition and Preprocessing\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePreoperative thin-slice chest CT scans of patients were retrieved from the Picture Archiving and Communication System (PACS) in Digital Imaging and Communications in Medicine (DICOM) format.\u003c/p\u003e\n\u003cp\u003eGiven the multi-center nature of this study and variations in CT scanners, all CT images were resampled to standardize voxel size to 1 mm \u0026times; 1 mm \u0026times; 1 mm, ensuring a uniform slice thickness of 1 mm across all scans. Additionally, window width and window level were standardized to 1600 HU and -500 HU, respectively. 
ITK-SNAP open-source software was used for CT image parameter adjustments, while Python was employed for batch processing.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eROI Delineation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFollowing CT image preprocessing, manual delineation of the regions of interest (ROIs) was performed (Figure 3A). Based on the tumoral ROI (ROI-tumoral), the peritumoral ROI (ROI-peritumoral) was generated by outward expansion. After reviewing the literature [17-23], we selected peritumoral regions extending 3, 6, and 12 voxel units from the tumor boundary (Figure 3A). To prevent extraction errors, areas of the peritumoral ROIs that extended beyond the lung parenchyma were removed [24] (Figure 3B). Both the manual delineation and region expansion were executed using ITK-SNAP.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRadiomics Feature Extraction and Selection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRadiomics features were extracted from CT images using the Pyradiomics package in Python from both tumoral and peritumoral regions. The extracted features included first-order features, shape features (2D and 3D), and gray-level features\u0026mdash;namely, gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), gray-level run length matrix (GLRLM), neighboring gray tone difference matrix (NGTDM), and gray-level dependence matrix (GLDM)\u0026mdash;in addition to wavelet features.\u003c/p\u003e\n\u003cp\u003eFor the four feature groups, an initial screening was performed in R using Spearman correlation analysis, whereby one feature from each pair with a correlation coefficient greater than 0.9 was randomly excluded. Subsequently, a random forest regression algorithm was applied to rank the remaining features by variable importance. The top 10 STAS-related features from each feature group were selected, resulting in a total of 40 features that were used for subsequent model development. 
The workflow for radiomics feature extraction and selection is illustrated in Figure 3C.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMachine Learning Model Development\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe employed 15 machine learning algorithms, including LASSO, support vector machine, random forest, gradient boosting, elastic net regression, and neural networks. By optimizing hyperparameters and combining different models, we developed multiple radiomics models for STAS prediction. Additionally, we constructed a clinical prediction model for STAS risk (Clinic-LR) using five clinical features: Tdmax, CTR, spiculation, vacuole sign, and CEA. Furthermore, we integrated the radiomics-predicted STAS label with these clinical features to form a combined logistic regression model (Combined Model) for comparison. Finally, the models were evaluated in both the training and test sets in terms of accuracy, discrimination ability, and clinical benefit. SHapley Additive exPlanations (\u003cstrong\u003eSHAP\u003c/strong\u003e) explainability analysis was used to interpret the machine learning models [25].\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cstrong\u003eBaseline Characteristics of Enrolled Patients\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTable 1 summarizes the data of the 609 patients included in this study. Among these, 226 patients from Center A (2015\u0026ndash;2018) were used as the training set, for which both imaging features and prognostic follow-up were collected; the median follow-up was 2330 days, with a maximum follow-up of 10 years. In contrast, 306 patients from Center A (2019) formed Test Set A, with only prognostic follow-up data available; the median follow-up was 1940 days. Additionally, 77 patients from Centers B and C comprised Test Set B, where imaging features were collected without prognostic follow-up. 
Missing values in CEA were imputed using the median value.\u003c/p\u003e\n\u003cp\u003eTable 1 Baseline characteristics of patients in the model development cohort\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\"\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" style=\"width: 285px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTrain set\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 270px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTest setA\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" valign=\"top\" style=\"width: 278px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTest setB\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNegative\u003cbr\u003e\u0026nbsp;(N=133)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePositive\u003cbr\u003e\u0026nbsp;(N=93)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAll\u003cbr\u003e\u0026nbsp;(N=226)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eNegative\u003cbr\u003e\u0026nbsp;(N=183)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePositive\u003cbr\u003e\u0026nbsp;(N=123)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAll\u003cbr\u003e\u0026nbsp;(N=306)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eNegative\u003cbr\u003e\u0026nbsp;(N=47)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePositive\u003cbr\u003e\u0026nbsp;(N=30)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAll\u003cbr\u003e\u0026nbsp;(N=77)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTdmax_cm\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e1.96 (0.747)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e2.45 (0.812)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e2.16 (0.810)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.71 (0.675)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.89 (0.619)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.78 (0.656)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMedian [Min, Max]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e1.87 [0.700, 4.50]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e2.37 [0.990, 5.28]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e2.07 [0.700, 5.28]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.70 [0.730, 3.31]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.91 [0.770, 3.10]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1.80 [0.730, 3.31]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCTR\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e0.614 (0.310)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e0.769 (0.220)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e0.678 (0.286)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.478 (0.439)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.556 (0.407)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.508 (0.426)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMedian [Min, Max]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e0.638 [0, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e0.806 [0.136, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e0.707 [0, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n 
\u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.400 [0, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.480 [0, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e0.450 [0, 1.00]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003espiculation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eNo\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e69 (51.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e26 (28.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e95 (42.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e36 (76.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e8 (26.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e44 (57.1%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eYes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e64 (48.1%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e67 (72.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e131 (58.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e11 (23.4%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e22 (73.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e33 (42.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003evacuole\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n 
\u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eNo\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e90 (67.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e56 (60.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e146 (64.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e44 (93.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e19 (63.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e63 (81.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eYes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e43 (32.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e37 (39.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e80 (35.4%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n 
\u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e3 (6.4%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e11 (36.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e14 (18.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCEA\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e3.36 (3.80)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e5.06 (7.58)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e4.06 (5.71)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e2.37 (1.43)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e4.81 (4.83)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e3.32 (3.40)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMedian [Min, Max]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e2.80 [0.210, 34.0]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e2.81 [0.760, 54.9]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e2.81 [0.210, 54.9]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e2.20 [0.660, 8.50]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e3.04 [0.720, 25.5]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e2.56 [0.660, 25.5]\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePFS.status\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n 
\u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eNo-progress\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e122 (91.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e57 (61.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e179 (79.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e176 (96.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e107 (87.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e283 (92.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eProgress\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e11 (8.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e36 (38.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e47 (20.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e7 (3.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
93px;\"\u003e\n \u003cp\u003e16 (13.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e23 (7.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePFS.time\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e2570 (482)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e2000 (895)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e2330 (736)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e1990 (184)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1870 
(334)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1940 (262)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003eMedian [Min, Max]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 96px;\"\u003e\n \u003cp\u003e2490 [752, 3650]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 89px;\"\u003e\n \u003cp\u003e2340 [246, 3460]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 100px;\"\u003e\n \u003cp\u003e2440 [246, 3650]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 85px;\"\u003e\n \u003cp\u003e2010 [943, 2190]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1950 [695, 2190]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e1990 [695, 2190]\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 93px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eTdmax: Tumor diameter max; CTR: Consolidation-to-tumour ratio; CEA: Carcinoembryonic antigen; PFS: Progression free survival.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFeature extraction and selection\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFor each ROI, Pyradiomics extracted 1,133 features. After Spearman correlation analysis, 258 features from ROI-tumoral, 232 from ROI-peritumor_3mm, 236 from ROI-peritumor_6mm, and 236 from ROI-peritumor_12mm were retained. 
Random forest was then used for further feature selection within each ROI, retaining the top 10 features based on variable importance ranking (Figure 4). Ultimately, the 40 selected features (10 per ROI) were incorporated into the radiomics model.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eModel Development and Evaluation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA total of 237 optimized models were constructed. Based on the average accuracy across the training set and the two test sets (Figure 5A, Supplementary Figure 2), the best-performing model was selected: a 10-fold cross-validated elastic net regression model (ENR-CV: 10-fold, cutoff = 0.5, alpha = 0.9), referred to as \u003cstrong\u003eENR-Rad\u003c/strong\u003e. The accuracy of ENR-Rad was 0.801 in the training set, 0.866 in Test Set A, and 0.831 in Test Set B. The receiver operating characteristic (ROC) curves and corresponding area under the curve (AUC) values were calculated for all three datasets (train = 0.791, test A = 0.829, test B = 0.807) (Figure 5B).\u003c/p\u003e\n\u003cp\u003eAdditionally, a logistic regression model based on clinical imaging features (\u003cstrong\u003eClinic-LR\u003c/strong\u003e) was developed for comparison. The diagnostic performance of ENR-Rad, Clinic-LR, and a combined model (\u003cstrong\u003eCM\u003c/strong\u003e) integrating the ENR-Rad prediction with clinical features was evaluated (Figure 5C). In the training set, ENR-Rad (AUC = 0.791) achieved a higher AUC than Clinic-LR (AUC = 0.738), though the difference was not statistically significant (DeLong test \u003cem\u003ep\u003c/em\u003e = 0.1246). However, CM (AUC = 0.834) performed significantly better than ENR-Rad (DeLong test \u003cem\u003ep\u003c/em\u003e = 0.0089). 
The superiority of the radiomics-based model was further highlighted in Test Set B (Figure 1), where ENR-Rad (AUC = 0.807) significantly outperformed Clinic-LR (AUC = 0.689) (DeLong test \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.05). Additionally, CM (AUC = 0.894) showed a significant improvement over ENR-Rad (DeLong test \u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001).\u003c/p\u003e\n\u003cp\u003eDecision curve analysis (DCA) in the training set indicated that CM provided the highest net clinical benefit, followed by ENR-Rad, with Clinic-LR ranking last (Figure 5D). A similar trend was observed in Test Set B (Figure 1).\u003c/p\u003e\n\u003cp\u003e\u003cbr\u003e\u003cstrong\u003eSHAP Interpretability Analysis\u003c/strong\u003e\u003cbr\u003e\u0026nbsp;In the ENR-Rad model, only 12 of the 40 features were retained at lambda = lambda.min, so only these 12 features contribute to the model's predictions (Supplementary Figure 3). SHAP analysis of ENR-Rad showed that wavelet.LLL_glszm_SmallAreaHighGrayLevelEmphasis contributed most to the model (Figure 5E). Figure 5F is a single-sample SHAP force plot showing how much each feature contributes to that sample's prediction. Blue indicates a positive contribution and red indicates a negative contribution. The baseline expected value E[f(x)] = 0.412 is the model's mean prediction over all samples. The prediction for the current sample, f(x) = 0.025, is well below the baseline, indicating that its combination of features pushes the prediction downward, that is, toward STAS negative.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003ePrognostic Relevance of ENR-Rad\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAs a newly identified pathway of lung cancer dissemination, STAS has been shown to be closely associated with tumor prognosis. 
To evaluate the prognostic value of ENR-Rad, Kaplan-Meier survival curves were plotted to compare progression-free survival (PFS) between high-risk and low-risk groups predicted by the model.\u003c/p\u003e\n\u003cp\u003eIn the training set, STAS-positive patients exhibited significantly worse PFS than STAS-negative patients (\u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001) (Figure 6A). Similarly, patients predicted by ENR-Rad to be at high risk for STAS had significantly poorer PFS than those predicted to be at low risk (\u003cem\u003ep\u003c/em\u003e = 0.011) (Figure 6B). This finding was further validated in Test Set A, the test set with prognostic follow-up, where STAS-negative patients had a significantly longer PFS (\u003cem\u003ep\u003c/em\u003e = 0.002) (Figure 6C), and those classified as low risk by ENR-Rad also showed a prolonged PFS (\u003cem\u003ep\u003c/em\u003e \u0026lt; 0.001) (Figure 6D). These results support the potential clinical utility of the ENR-Rad model for improving diagnostic accuracy and personalizing treatment strategies.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eRecent studies have shown that, for early-stage lung cancer less than 2 cm in diameter with negative hilar and mediastinal lymph nodes, sublobar resection yields disease-free survival and overall survival that are not inferior to those of lobectomy [\u003cspan additionalcitationids=\"CR27\" citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. Currently, there are no clear international guidelines on surgical strategy for STAS-positive lung adenocarcinoma (LUAD) nodules\u0026thinsp;\u0026le;\u0026thinsp;3 cm in diameter. 
However, several studies have suggested that lobectomy may provide a better prognosis for patients with STAS-positive early-stage lung adenocarcinoma [\u003cspan additionalcitationids=\"CR30 CR31 CR32\" citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eRegarding postoperative adjuvant therapy, the 2023 NCCN guidelines recommend adjuvant chemotherapy as a standard treatment for patients with early-stage non-small cell lung cancer (NSCLC) and high-risk factors, such as highly invasive tumors [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. According to the Guidelines for Clinical Diagnosis and Treatment of Lung Cancer issued by the Chinese Medical Association in 2024 [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], adjuvant chemotherapy is recommended for patients with early-stage NSCLC and high-risk factors, including STAS positivity. In certain clinical trials, such as the KEYNOTE-091 trial [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e], immune checkpoint inhibitors (ICIs) have been shown to significantly improve disease-free survival in early-stage (IB) high-risk NSCLC patients. Taken together, STAS-positive patients may derive additional clinical benefit from immunotherapy. However, there are currently no clinical trials that specifically target STAS-positive patients as a separate subgroup, highlighting the need for further investigation.\u003c/p\u003e \u003cp\u003eBased on our previous research [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], we developed a clinical feature-based model (Clinic-LR) using five clinical factors (Tdmax, CTR, spiculation, vacuole sign, and CEA) and compared its performance with a radiomics-based model (ENR-Rad) and a combined model (CM). 
Our results demonstrated that the radiomics-based model (ENR-Rad) significantly outperformed the clinical feature-based model (Clinic-LR) in the external validation cohort (Test Set B). Furthermore, the combined model (CM) incorporating both radiomic and clinical features achieved the best predictive performance. These findings suggest that radiomic features serve as a powerful complement to traditional clinical factors in improving STAS prediction. The superiority of the CM model further indicates that while radiomic features provide strong predictive capabilities, integrating clinical information can enhance model stability and generalizability.\u003c/p\u003e \u003cp\u003eKaplan-Meier survival analysis revealed that STAS-positive patients had significantly shorter progression-free survival (PFS) than STAS-negative patients, further confirming STAS as a high-risk factor for recurrence. Additionally, when patients were stratified according to STAS status predicted by the radiomics model, those predicted as STAS-positive exhibited worse survival outcomes. This finding suggests that our radiomics model not only serves as a tool for STAS risk assessment but may also function as a surrogate indicator for recurrence risk, thereby assisting in personalized treatment decision-making.\u003c/p\u003e \u003cp\u003eRecent studies have employed SHapley Additive exPlanations (SHAP) for feature selection in predicting spread through air spaces (STAS) in lung adenocarcinoma (LUAD), highlighting the significance of multi-resolution texture information and both tumor and peritumoral characteristics [\u003cspan additionalcitationids=\"CR38\" citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e]. 
In our study, SHAP analysis showed that wavelet-transformed features played a dominant role, indicating that multi-resolution texture information is crucial for STAS prediction [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e, \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. Among the 12 selected features, tumor and peritumoral features contributed equally, suggesting that both intrinsic tumor characteristics and peritumoral tissue properties are essential for STAS identification. Features from the 3 mm, 6 mm, and 12 mm peritumoral regions (peri3, peri6, and peri12) were all highly ranked, underscoring the significance of both proximal and distant peritumoral characteristics in STAS prediction. Several GLSZM- and GLDM-based features (e.g., SmallAreaHighGrayLevelEmphasis and LargeDependenceLowGrayLevelEmphasis) were among the highest-ranking SHAP features, indicating that gray-level heterogeneity is a key factor in distinguishing STAS-positive cases. The top-ranked feature, wavelet.LLL_glszm_SmallAreaHighGrayLevelEmphasis, represents small high-intensity clusters within the tumor, suggesting that high-density heterogeneous regions are associated with STAS positivity. The second-ranked feature, peri6_wavelet.LHL_glszm_SmallAreaLowGrayLevelEmphasis, derived from the 6 mm peritumoral region, describes small low-intensity clusters, pointing to a role for peritumoral microstructural alterations in STAS dissemination. The integration of SHAP analysis enhances the model's clinical interpretability and may offer insights into the biological mechanisms of STAS.\u003c/p\u003e \u003cp\u003eBased on our findings, we propose a treatment strategy for pulmonary nodules \u0026le;\u0026thinsp;3 cm, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e. 
Patients identified as high-risk for STAS by the CM model can undergo adjusted preoperative surgical planning to reduce the risk of local recurrence. Additionally, patients confirmed to be STAS-positive on postoperative pathology should receive aggressive adjuvant therapy, including chemotherapy, targeted therapy, or immunotherapy, depending on individual clinical characteristics.\u003c/p\u003e \u003cp\u003eOur study has several limitations. First, this is a retrospective study; although it includes survival analysis, the extended follow-up that such analysis requires limited our ability to perform prospective validation. Second, although this is a multicenter study, the dataset consists exclusively of patients from northern China, and given potential population heterogeneity, future studies should incorporate samples from diverse geographic regions for broader validation. Third, radiomic feature extraction is sensitive to imaging parameters and segmentation variability. Despite standardized preprocessing, subtle differences in imaging protocols or manual segmentation procedures may affect feature reproducibility and, consequently, model performance. Developing automated segmentation models would therefore benefit future applications.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eIn this study, we developed a novel radiomics-based machine learning model incorporating both tumor and peritumoral features to preoperatively predict STAS status in stage I LUAD. The model outperformed traditional clinical feature-based models, demonstrating the potential application of radiomics in personalized surgical decision-making. 
Future prospective validation studies and model optimization will be crucial for the clinical translation of this technology.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003e\u003cem\u003eEthics approval and consent to participate\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. This study was approved by the ethical review committee of Tianjin Chest Hospital.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eConsent for publication\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAvailability of data and materials\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSince the research data used in this paper are from clinical patients, they will not be directly disclosed for the sake of protecting patient privacy. 
The imaging data used and/or analysed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eCompeting interests\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eFunding\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was funded by the Tianjin Key Medical Discipline (Specialty) Construction Project (TJYXZDXK-018A).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAuthors\u0026apos; contributions\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization, Yuhang Wang and Xufeng Liu; Data curation, Yuhang Wang and Xin Li; Formal analysis, Xiaojiang Zhao; Funding acquisition, Xin Li and Daqiang Sun; Investigation, Zixiao Wang and Xufeng Liu; Methodology, Yuhang Wang and Xiaojiang Zhao; Project administration, Daqiang Sun; Resources, Zixiao Wang, Xufeng Liu and Xin Li; Software, Yuhang Wang and Xiaojiang Zhao; Supervision, Daqiang Sun; Writing \u0026ndash; original draft, Yuhang Wang; Writing \u0026ndash; review \u0026amp; editing, Daqiang Sun.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAcknowledgements\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank all authors for their contributions to this manuscript. We also thank OnekeyAI for providing part of the technical support for this study.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eSiegel RL, Kratzer TB, Giaquinto AN, Sung H, Jemal A. Cancer statistics, 2025. \u003cem\u003eCA Cancer J Clin\u003c/em\u003e. 2025;75(1):10-45.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eChansky K, Detterbeck FC, Nicholson AG, et al. 
The IASLC Lung Cancer Staging Project: External Validation of the Revision of the TNM Stage Groupings in the Eighth Edition of the TNM Classification of Lung Cancer. J Thorac Oncol. 2017;12(7):1109-1121.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eHung JJ, Jeng WJ, Hsu WH, et al. Prognostic factors of postrecurrence survival in completely resected stage I non-small cell lung cancer with distant metastasis. \u003cem\u003eThorax\u003c/em\u003e. 2010;65(3):241-245.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eTravis WD, Brambilla E, Burke AP, Marx A, Nicholson AG. Introduction to The 2015 World Health Organization Classification of Tumors of the Lung, Pleura, Thymus, and Heart. \u003cem\u003eJ Thorac Oncol\u003c/em\u003e. 2015;10(9):1240-1242.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eYanagawa N, Shiono S, Endo M, Ogata SY. Tumor spread through air spaces is a useful predictor of recurrence and prognosis in stage I lung squamous cell carcinoma, but not in stage II and III. \u003cem\u003eLung Cancer\u003c/em\u003e. 2018;120:14-21.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eDai C, Xie H, Su H, et al. Tumor Spread through Air Spaces Affects the Recurrence and Overall Survival in Patients with Lung Adenocarcinoma \u0026gt;2 to 3 cm. \u003cem\u003eJ Thorac Oncol\u003c/em\u003e. 2017;12(7):1052-1060.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eGillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016 Feb;278(2):563-77.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eYip SS, Aerts HJ. Applications and limitations of radiomics. \u003cem\u003ePhys Med Biol\u003c/em\u003e. 2016;61(13):R150-R166.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eWu L, Lou X, Kong N, Xu M, Gao C. Can quantitative peritumoral CT radiomics features predict the prognosis of patients with non-small cell lung cancer? A systematic review [published online ahead of print, 2022 Oct 29]. Eur Radiol. 
2022;10.1007/s00330-022-09174-8.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eMu W, Jiang L, Zhang J, et al. Non-invasive decision support for NSCLC treatment using PET/CT radiomics. \u003cem\u003eNat Commun\u003c/em\u003e. 2020;11(1):5228. Published 2020 Oct 16.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eChen D, She Y, Wang T, et al. Radiomics-based prediction for tumour spread through air spaces in stage I lung adenocarcinoma using machine learning. Eur J Cardiothorac Surg. 2020;58(1):51-58.\u003c/li\u003e\n \u003cli\u003eLiao G, Huang L, Wu S, et al. Preoperative CT-based peritumoral and tumoral radiomic features prediction for tumor spread through air spaces in clinical stage I lung adenocarcinoma. Lung Cancer. 2022;163:87-95.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eDing Y, Chen Y, Wen H, et al. Pretreatment prediction of tumour spread through air spaces in clinical stage I non-small-cell lung cancer. \u003cem\u003eEur J Cardiothorac Surg\u003c/em\u003e. 2022;62(3):ezac248.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eZhuo Y, Feng M, Yang S, et al. Radiomics nomograms of tumors and peritumoral regions for the preoperative prediction of spread through air spaces in lung adenocarcinoma. \u003cem\u003eTransl Oncol\u003c/em\u003e. 2020;13(10):100820.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eQi L, Li X, He L, et al. Comparison of Diagnostic Performance of Spread Through Airspaces of Lung Adenocarcinoma Based on Morphological Analysis and Perinodular and Intranodular Radiomic Features on Chest CT Images. \u003cem\u003eFront Oncol\u003c/em\u003e. 2021;11:654413.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eLiao G, Huang L, Wu S, et al. Preoperative CT-based peritumoral and tumoral radiomic features prediction for tumor spread through air spaces in clinical stage I lung adenocarcinoma. \u003cem\u003eLung Cancer\u003c/em\u003e. 2022;163:87-95.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eTang X, Huang H, Du P, Wang L, Yin H, Xu X. 
Intratumoral and peritumoral CT-based radiomics strategy reveals distinct subtypes of non-small-cell lung cancer. J Cancer Res Clin Oncol. 2022;148(9):2247-2260.\u003c/li\u003e\n \u003cli\u003eRan J, Cao R, Cai J, Yu T, Zhao D, Wang Z. Development and Validation of a Nomogram for Preoperative Prediction of Lymph Node Metastasis in Lung Adenocarcinoma Based on Radiomics Signature and Deep Learning Signature. Front Oncol. 2021;11:585942.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eGuo QK, Yang HS, Shan SC, et al. A radiomics nomogram prediction for survival of patients with \u0026quot;driver gene-negative\u0026quot; lung adenocarcinomas (LUAD). Radiol Med. 2023;128(6):714-725.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eZhao M, Kluge K, Papp L, et al. Multi-lesion radiomics of PET/CT for non-invasive survival stratification and histologic tumor risk profiling in patients with lung adenocarcinoma. Eur Radiol. 2022;32(10):7056-7067.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eWu L, Lou X, Kong N, Xu M, Gao C. Can quantitative peritumoral CT radiomics features predict the prognosis of patients with non-small cell lung cancer? A systematic review. \u003cem\u003eEur Radiol\u003c/em\u003e. 2023;33(3):2105-2117.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eChen Q, Shao J, Xue T, et al. Intratumoral and peritumoral radiomics nomograms for the preoperative prediction of lymphovascular invasion and overall survival in non-small cell lung cancer [published online ahead of print, 2022 Sep 6]. \u003cem\u003eEur Radiol\u003c/em\u003e. 2022;10.1007/s00330-022-09109-3.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eLiu K, Li K, Wu T, et al. Improving the accuracy of prognosis for clinical stage I solid lung adenocarcinoma by radiomics models covering tumor per se and peritumoral changes on CT. Eur Radiol. 2022;32(2):1065-1077.\u003c/li\u003e\n \u003cli\u003eTunali I, Hall LO, Napel S, et al. 
Stability and reproducibility of computed tomography radiomic features extracted from peritumoral regions of lung cancer lesions. \u003cem\u003eMed Phys\u003c/em\u003e. 2019;46(11):5075-5085.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eRodr\u0026iacute;guez-P\u0026eacute;rez R, Bajorath J. Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions. J Comput Aided Mol Des. 2020;34(10):1013-1026.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eZhang J, Bai W, Guo C, et al. Postoperative Short-term Outcomes Between Sublobar Resection and Lobectomy in Patients with Lung Adenocarcinoma. \u003cem\u003eCancer Manag Res\u003c/em\u003e. 2020;12:9485-9493.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eAltorki NK, Wang X, Wigle D, et al. Perioperative mortality and morbidity after sublobar versus lobar resection for early-stage non-small-cell lung cancer: post-hoc analysis of an international, randomised, phase 3 trial (CALGB/Alliance 140503). \u003cem\u003eLancet Respir Med\u003c/em\u003e. 2018;6(12):915-924.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eAltorki N, Wang X, Kozono D, et al. Lobar or Sublobar Resection for Peripheral Stage IA Non-Small-Cell Lung Cancer. \u003cem\u003eN Engl J Med\u003c/em\u003e. 2023;388(6):489-498.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eEguchi T, Kameda K, Lu S, et al. Lobectomy Is Associated with Better Outcomes than Sublobar Resection in Spread through Air Spaces (STAS)-Positive T1 Lung Adenocarcinoma: A Propensity Score-Matched Analysis. \u003cem\u003eJ Thorac Oncol\u003c/em\u003e. 2019;14(1):87-98.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eToki MI, Harrington K, Syrigos KN. The role of spread through air spaces (STAS) in lung adenocarcinoma prognosis and therapeutic decision making. \u003cem\u003eLung Cancer\u003c/em\u003e. 2020;146:127-133.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eKadota K, Nitadori JI, Sima CS, et al. 
Tumor Spread through Air Spaces is an Important Pattern of Invasion and Impacts the Frequency and Location of Recurrences after Limited Resection for Small Stage I Lung Adenocarcinomas. \u003cem\u003eJ Thorac Oncol\u003c/em\u003e. 2015;10(5):806-814.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eYang Y, Xie X, Wang Y, et al. A systematic review and meta-analysis of the influence of STAS on the long-term prognosis of stage I lung adenocarcinoma. \u003cem\u003eTransl Cancer Res\u003c/em\u003e. 2021;10(5):2428-2436.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eKagimoto A, Tsutani Y, Okada M. Segmentectomy for Spread Through Air Spaces-positive Lung Adenocarcinoma. \u003cem\u003eAnn Thorac Surg\u003c/em\u003e. 2022;114(5):1989-1990.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eEttinger DS, Wood DE, Aisner DL, et al. NCCN Guidelines\u0026reg; Insights: Non-Small Cell Lung Cancer, Version 2.2023. \u003cem\u003eJ Natl Compr Canc Netw\u003c/em\u003e. 2023;21(4):340-350.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eOncology Society of Chinese Medical Association. \u003cem\u003eZhonghua Yi Xue Za Zhi\u003c/em\u003e. 2024;104(34):3175-3213.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eO\u0026apos;Brien M, Paz-Ares L, Marreaud S, et al. Pembrolizumab versus placebo as adjuvant therapy for completely resected stage IB-IIIA non-small-cell lung cancer (PEARLS/KEYNOTE-091): an interim analysis of a randomised, triple-blind, phase 3 trial. \u003cem\u003eLancet Oncol\u003c/em\u003e. 2022;23(10):1274-1286.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eLiu C, Meng A, Xue XQ, et al. Prediction of early lung adenocarcinoma spread through air spaces by machine learning radiomics: a cross-center cohort study. \u003cem\u003eTransl Lung Cancer Res\u003c/em\u003e. 2024;13(12):3443-3459.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eZhang Z, Zhao Y, Ma YJ, et al. Prediction of STAS in lung adenocarcinoma with nodules\u0026thinsp;\u0026le;\u0026thinsp;2 cm using machine learning: a multicenter retrospective study. 
\u003cem\u003eBMC Cancer\u003c/em\u003e. 2025;25(1):417. Published 2025 Mar 7.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eYe G, Wu G, Li Y, et al. Advancing presurgical non-invasive spread through air spaces prediction in clinical stage IA lung adenocarcinoma using artificial intelligence and CT signatures. \u003cem\u003eFront Surg\u003c/em\u003e. 2025;11:1511024. Published 2025 Jan 14.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eJiang Z, Yin J, Han P, et al. Wavelet transformation can enhance computed tomography texture features: a multicenter radiomics study for grade assessment of COVID-19 pulmonary lesions. \u003cem\u003eQuant Imaging Med Surg\u003c/em\u003e. 2022;12(10):4758-4770.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eHuang K, Aviyente S. Wavelet feature selection for image classification. \u003cem\u003eIEEE Trans Image Process\u003c/em\u003e. 2008;17(9):1709-1720. \u0026nbsp;\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"cancer-imaging","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"caig","sideBox":"Learn more about [Cancer Imaging](https://cancerimagingjournal.biomedcentral.com/)","snPcode":"40644","submissionUrl":"https://submission.nature.com/new-submission/40644/3","title":"Cancer Imaging","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Lung adenocarcinoma, STAS, Radiomics, Machine learning, 
SHAP","lastPublishedDoi":"10.21203/rs.3.rs-6345504/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6345504/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground\u003c/strong\u003e: Despite early detection via low-dose computed tomography and complete surgical resection for early-stage lung adenocarcinoma, postoperative recurrence remains high, particularly in patients with tumor spread through air spaces. A reliable preoperative prediction model is urgently needed to adjust the treatment modality.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods\u003c/strong\u003e: In this multicenter retrospective study, 609 patients with pathological stage I lung adenocarcinoma from 3 independent centers were enrolled. Regions of interest for the primary tumor and peritumoral areas (extended by three, six, and twelve voxel units) were manually delineated from preoperative CT imaging. Quantitative imaging features were extracted and filtered by correlation analysis and random forest ranking to yield 40 candidate features. Fifteen machine learning methods were evaluated, and a ten-fold cross-validated elastic net regression model was selected to construct the radiomics-based prediction model. A clinical model based on five key clinical variables and a combined model integrating imaging and clinical features were also developed.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e: The radiomics model achieved accuracies of 0.801, 0.866, and 0.831 in the training set and two external test sets, with AUC of 0.791, 0.829, and 0.807. In one external test set, the clinical model had an AUC of 0.689, significantly lower than the radiomics model (0.807, p \u0026lt; 0.05). The combined model achieved the highest performance, with AUC of 0.834 in the training set and 0.894 in an external test set (p \u0026lt; 0.01 and p \u0026lt; 0.001, respectively). 
Interpretability analysis revealed that wavelet-transformed features dominated the model, with the highest contribution from a feature reflecting small high-intensity clusters within the tumor and the second highest from a feature representing low-intensity clusters in the six-voxel peritumoral region. Kaplan–Meier analysis demonstrated that patients with either pathologically confirmed or model-predicted spread had significantly shorter progression-free survival (p \u0026lt; 0.001).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion\u003c/strong\u003e: Our novel machine learning model, integrating imaging features from both tumor and peritumoral regions, preoperatively predicts tumor spread through air spaces in stage I lung adenocarcinoma. It outperforms traditional clinical models, highlighting the potential of quantitative imaging analysis in personalizing treatment. Future prospective studies and further optimization are warranted.\u003c/p\u003e","manuscriptTitle":"A Radiomics-Based Machine Learning Model and SHAP for Predicting Spread Through Air Spaces and Its Prognostic Implications in Stage I Lung Adenocarcinoma: A Multicenter Cohort Study","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-06 14:06:11","doi":"10.21203/rs.3.rs-6345504/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision 
requested","date":"2025-07-12T16:16:40+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-07-04T04:27:21+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"189575419353027886692375701804131099285","date":"2025-07-04T04:16:51+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-04-04T12:20:32+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-04-02T05:59:17+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-04-02T02:25:43+00:00","index":"","fulltext":""},{"type":"submitted","content":"Cancer Imaging","date":"2025-03-31T13:53:26+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"cancer-imaging","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"caig","sideBox":"Learn more about [Cancer Imaging](https://cancerimagingjournal.biomedcentral.com/)","snPcode":"40644","submissionUrl":"https://submission.nature.com/new-submission/40644/3","title":"Cancer Imaging","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"c670d326-c787-4a60-b183-e3f1d32c9bec","owner":[],"postedDate":"May 6th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-10-06T16:06:34+00:00","versionOfRecord":{"articleIdentity":"rs-6345504","link":"https://doi.org/10.1186/s40644-025-00935-4","journal":{"identity":"cancer-imaging","isVorOnly":false,"title":"Cancer Imaging"},"publishedOn":"2025-09-29 15:57:42","publishedOnDateReadable":"September 29th, 2025"},"versionCreatedAt":"2025-05-06 
14:06:11","video":"","vorDoi":"10.1186/s40644-025-00935-4","vorDoiUrl":"https://doi.org/10.1186/s40644-025-00935-4","workflowStages":[]},"version":"v1","identity":"rs-6345504","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6345504","identity":"rs-6345504","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
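
The Discussion singles out two GLSZM features, SmallAreaHighGrayLevelEmphasis and SmallAreaLowGrayLevelEmphasis, as the top-ranked SHAP contributors. The sketch below illustrates what these quantities measure, using the standard gray-level size-zone definitions on a toy 2D image. It is not the authors' pipeline: the study's features were computed on wavelet-filtered CT volumes, whereas this example assumes an already discretized 2D image with gray levels of at least 1 and 8-connectivity. The function names and the `scattered` and `blob` images are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque


def glszm(image):
    """Count connected zones per (gray_level, zone_size) in a 2D image.

    Assumes integer gray levels >= 1 and 8-connectivity (illustrative choice).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    zones = Counter()
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level = image[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            # Breadth-first flood fill over same-valued neighbours.
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and image[ny][nx] == level):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            zones[(level, size)] += 1
    return zones


def small_area_high_gray_level_emphasis(zones):
    """Mean of level^2 / size^2 over all zones: large for small, bright zones."""
    n_zones = sum(zones.values())
    total = sum(n * level ** 2 / size ** 2 for (level, size), n in zones.items())
    return total / n_zones


def small_area_low_gray_level_emphasis(zones):
    """Mean of 1 / (level^2 * size^2) over all zones: large for small, dark zones."""
    n_zones = sum(zones.values())
    total = sum(n / (level ** 2 * size ** 2) for (level, size), n in zones.items())
    return total / n_zones


# Two toy images: isolated bright pixels versus one contiguous bright region.
scattered = [[3, 1, 1],
             [1, 1, 1],
             [1, 1, 3]]
blob = [[3, 3, 3],
        [3, 3, 3],
        [1, 1, 1]]

print(small_area_high_gray_level_emphasis(glszm(scattered)))
print(small_area_high_gray_level_emphasis(glszm(blob)))
```

In this toy setting the image with isolated bright pixels scores higher than the image with a single contiguous bright region, which matches the paper's reading of the feature as reflecting small high-intensity clusters.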

License: CC-BY-4.0