A machine learning based on CT radiomics signature and change value features for predicting the risk classification of thymoma

preprint OA: closed
Research Square

Research Article

Liang Zhu, Jiaming Li, Yihan Tang, Yaxuan Zhang, Chunyuan Chen, and 5 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-3983809/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1.

Abstract

Objective: This study proposes a method based on medical imaging and stacked ensemble learning for predicting the high-risk and low-risk categories of thymoma.

Methods: This retrospective study included 126 patients with thymoma and 5 patients with thymic carcinoma treated at our institution, comprising 65 low-risk and 66 high-risk cases. Of these, 78 cases formed the training cohort and the remaining 53 cases formed the validation cohort.
Radiomics features and change value features were extracted from the collected CT images. The Mann-Whitney U test was used to identify features that differed between the two risk categories, and features with p < 0.05 were retained. Feature selection was then performed with LASSO regression, after which the ten features with the highest discriminative potential were selected with the SelectKBest method. Using stacked ensemble learning, we combined three machine learning algorithms to provide an efficient and reliable approach to risk prediction of thymoma.

Results: A total of 54 features were identified as the most discriminative for low-risk and high-risk thymoma and were used to build the radiomics signature. The model successfully distinguished patients with low-risk and high-risk thymoma. For the radiomics model, the AUCs in the training and validation cohorts were 0.999 (95% CI, 0.988-1.000) and 0.967 (95% CI, 0.916-1.000), respectively. For the nomogram, the corresponding values were 0.999 (95% CI, 0.996-1.000) and 0.983 (95% CI, 0.990-1.000).

Conclusion: This study describes the application of CT-based radiomics in thymoma patients and proposes a clinical decision nomogram that can be used to predict the risk of thymoma. The nomogram can support clinical decision-making for thymoma patients.

Keywords: Machine learning, CT, Thymoma

Key points

CT-based radiomics features and their change values contribute to risk prediction of thymoma.
Radiomics models can predict the risk level of thymoma patients.
The comprehensive nomogram may serve as a tool to identify the risk level of thymoma and support clinical decision-making.

Introduction

Thymoma is a rare neoplasm of thymic epithelial origin [1]. Its incidence in Asia is approximately 0.49 per 100,000 person-years [2].
Nonetheless, it is the predominant malignancy of the anterior mediastinum [3, 4], representing approximately 47% of all neoplasms in this anatomical region [5]. Notably, this tumor is distinctively associated with paraneoplastic syndromes, specifically myasthenia gravis [6]. In 2015, the World Health Organization (WHO) introduced a new classification system for thymic epithelial tumors, which includes six categories: type A, AB, B1, B2, B3, and thymic carcinoma [7]. Because the biological behavior of thymoma differs across subtypes, the histological classification can be simplified into two categories: a low-risk group comprising types A, AB, and B1, and a high-risk group comprising types B2 and B3 [8].

Surgery is the primary treatment for thymoma, with complete resection offering the best survival rates [9, 10]. The likelihood of complete surgical resection is very high in the low-risk group, which typically removes the need for adjuvant therapy. Conversely, the high-risk group may require multimodal therapy [11]. Early and accurate diagnosis and differentiation are therefore of paramount importance.

Tissue biopsy plays a central role in assessing tumor status. However, because tumors are spatially and temporally heterogeneous and only part of the tissue can be sampled, the accuracy and reliability of biopsy remain limited. In addition, deep biopsy is an invasive procedure that carries a risk of complications, and transpleural biopsy may lead to tumor dissemination. By contrast, CT is a simple, noninvasive, and widely applicable modality [12]. Radiomics, which relies on extracting high-dimensional quantitative features from CT images, enables noninvasive quantification of tumor heterogeneity and identification of potentially malignant characteristics [13].
In the domain of thoracic tumors, and particularly thymoma, researchers have sought noninvasive approaches for early detection and risk stratification. For instance, Mayoral et al. showed that the best diagnostic performance was achieved by integrating conventional and radiomics features derived from CT into a machine learning algorithm [14]. While chest computed tomography (CT) remains the most commonly employed imaging modality for preoperative assessment of thymoma [12], its ability, like that of magnetic resonance imaging (MRI), to differentiate between the histological subtypes of thymoma is far from ideal. No clearly defined noninvasive preoperative standards yet exist to assist physicians in devising treatment strategies [8], which underscores the importance of noninvasive approaches for establishing the risk stratification of thymoma before surgery. To our knowledge, few studies have extracted and analyzed the differences in features between non-enhanced, arterial, and venous phase CT images.

The objective of this study is to propose an imaging-based radiomics and machine learning approach to predict the high-risk and low-risk categories of thymoma [15]. To this end, we extracted imaging features and their change values from non-enhanced, arterial, and venous phase CT images and input these data into machine learning algorithms to establish robust predictive models [16]. By combining the imaging features with pathological molecular indicators, the study aims to give clinicians more refined diagnostic and prognostic insights and thereby support more precise, personalized treatment decisions [17].

Materials and Methods

2.1 Patient cohort and pathologic evaluation

This retrospective study was approved by the ethics review committee and exempted from the need for informed consent.
The study design and pipeline are illustrated in Fig. 1. We collected data from a cohort of 126 patients diagnosed with thymoma and 5 patients diagnosed with thymic carcinoma (Fig. 2), obtained exclusively from the Picture Archiving and Communication System (PACS) of a single medical center. Data collection spanned 2015 to 2023 and encompassed 74 male and 57 female patients aged 16 to 80 years. The inclusion criteria were: (1) archived data of postoperative, pathologically diagnosed thymoma from January 2015 to October 2023; (2) CT image data available at our institution. The exclusion criteria were: (1) CT artifacts; (2) relevant treatment received before the preoperative CT scan; (3) incomplete clinical data; (4) age under 16 or over 80 years; (5) tumor diameter less than 1 cm. Expert pathologists assessed and classified the pathological specimens of each patient, confirming subtypes A, AB, B1, B2, and B3 through pathological examination. Additionally, immunohistochemical staining was performed on the pathological samples to quantify the expression levels of key pathological molecular markers such as Ki-67 and TdT, providing molecular biology features for subsequent model construction [18] [19].

2.2 CT Imaging Protocol

Computed tomography (CT) scans were performed using a GE Medical Systems Optima CT680 Series scanner at the Affiliated Hospital of Guangdong Medical University. The imaging protocol followed standardized procedures to ensure consistent image acquisition across all patients. For each patient, a series of axial images was acquired with the following settings: slice thickness, 0.625 mm; tube voltage, 120 kVp; tube current, 261 mA; reconstruction diameter, 380.00.
During enhanced scanning, iohexol was injected into the median cubital vein at a flow rate of 4 ml/s and a dose of 0.9-1.0 ml/kg. Scanning was triggered by bolus tracking in the aorta: the arterial phase began once the CT value reached 100 HU, and the venous phase followed after a 15-second delay.

2.3 Image segmentation and feature extraction

Two experienced radiologists manually segmented the plain, arterial, and venous phase CT images of the 131 cases [20]. Thymic tumors and surrounding tissues were outlined on each CT slice using ITK-SNAP software. Features were extracted from the segmented images using the PyRadiomics library with the following settings: bin width, 25; resampled pixel spacing, [3, 3, 3] (in millimeters); interpolator, nearest neighbor; normalization, enabled. The RadiomicsFeatureExtractor class was used for feature extraction with all features and image types enabled. The extracted features cover shape descriptors, texture, and gray-level attributes. Shape features include area, perimeter, and volume [21]. Texture features include gray-level co-occurrence matrix (GLCM) statistics and gray-level size zone matrix (GLSZM) attributes [22]. Combining ITK-SNAP with PyRadiomics allows thymoma feature information to be extracted accurately from CT images [23].

2.4 Feature Selection

After a comprehensive feature set had been extracted with PyRadiomics, a multi-stage feature selection process was applied to identify the most informative features [21]. The first step was the Mann-Whitney U rank sum test [24]; at this stage, features with p-values less than 0.05 were retained. Subsequently, LASSO (Least Absolute Shrinkage and Selection Operator) regression was used to make the feature coefficients sparse and thereby efficiently select the features that have a significant effect on the target variable.
[21] Finally, the SelectKBest method was used to score the relationship between each feature and the target variable and to select ten features from those retained by LASSO. This multi-stage process narrows the initial feature set to a subset that is both important and highly relevant, giving an accurate and efficient feature set for subsequent classification and predictive analysis [25].

2.5 Model building based on stacked ensemble learning

To combine the predictive power of multiple machine learning algorithms, we used a stacked ensemble learning approach to build a robust and accurate model for predicting high-risk thymoma [26]. Three machine learning algorithms were selected as base models: XGBoost, Random Forest, and Multilayer Perceptron, each trained to classify the inputs. Their results were fed into the second layer, which was then trained on these inputs using support vectors [27]. XGBoost yields the final model. In constructing the change value features, data from the plain, arterial, and venous phases were considered together with the corresponding change values between these phases. These change value features provide an important basis for understanding how the characteristics of thymoma change across imaging phases. As for the meta-learner, we chose XGBoost to summarize the prediction results of the base models [28]. The base models described above were trained on the plain, arterial, and venous phase data as well as the change value features. The prediction results of the base models were used to train the meta-learner and generate the final aggregated prediction.
[29] By applying stacked ensemble learning, we combine multiple machine learning algorithms to provide an efficient and reliable approach to risk prediction of thymoma. Finally, a personalized prediction nomogram was constructed by combining age, gender, and the radiomics model [26]. The Rad score is the predicted probability of the integrated imaging model, which uses the stacking algorithm. The first layer consists of XGBoost, Random Forest, and MLP; the second layer is XGBoost, which learns from the outputs of the first layer. A third layer learns from the results of the six imaging models in the first and second layers and outputs the Rad score. The process of establishing the nomogram is shown in Figure S2.

2.6 Statistical analysis

The Mann-Whitney U test was performed on the continuous radiomics features in Excel, and a one-sided P-value < 0.05 was considered statistically significant. The chi-square test and t-test were conducted in Excel for gender and age, respectively, and two-sided P-values < 0.05 were considered statistically significant. In Python (3.9.12), the LASSO and SelectKBest algorithms were used to filter radiomics features, the MLP, Random Forest, and XGBoost algorithms were used to develop the radiomics models, and the code at https://github.com/Hhy096/nomogram was used to construct the nomograms. The ROC curves of the models were compared with the DeLong test in Python (3.9.12); the results are shown in Table 2. Decision curve analysis (DCA) was performed in Python (3.9.12) to evaluate the clinical utility of the model, and calibration curves were drawn to describe the calibration of the nomogram in the training and validation cohorts.

Results

3.1 Patient Characteristics

A total of 131 patients with thymoma who received treatment at our hospital were included in this study, of whom 65 were low-risk and 66 were high-risk. Among these patients, 78 cases were assigned to the GDMU dataset as the training cohort.
The remaining 53 cases formed the validation cohort, known as the GDMU dataset. Table 1 presents the baseline characteristics of the thymoma patients at the onset of the study. Clinical and pathological characteristics did not differ significantly between the training and validation groups.

3.2 Feature selection

A set of 1698 radiomics features and change value features was extracted from the non-enhanced, arterial, and venous phase images, covering a diverse spectrum of quantitative imaging characteristics. A multi-step approach was used to select the most informative features. First, the Mann-Whitney U test was applied to identify differences between the two categories, and features with p < 0.05 were retained [30]. Subsequently, the LASSO method was used to further reduce the feature set based on the feature coefficients. After LASSO selection, the retained feature set was still substantial. For performance and interpretability, the SelectKBest method was then used to select the ten features with the highest discriminatory potential. The results are shown in Fig. 3.

3.3 Radiomics model development

The curated features were used to build a robust model for predicting high-risk tumors. Features selected from the different imaging phases were combined into a feature-enhanced dataset that captures the radiomics attributes of each patient case [31]. To build the radiomics model, several machine learning algorithms were employed, each configured to exploit the curated features [32]: Random Forest, XGBoost, and Multilayer Perceptron (MLP). Stacked ensemble learning was used to integrate the outputs of the individual machine learning models into a more powerful meta-model.
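The selection and stacking steps described in Sections 3.2 and 3.3 can be illustrated with a minimal sketch. It is not the authors' code, and several choices are assumptions made only for illustration: the data are synthetic (generated with scikit-learn's make_classification), the U test is two-sided, SelectKBest uses the ANOVA F score, and scikit-learn's GradientBoostingClassifier stands in for XGBoost as both a base model and the meta-learner.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the radiomics table; NOT the study data.
X, y = make_classification(n_samples=131, n_features=200, n_informative=15,
                           n_clusters_per_class=1, class_sep=1.5,
                           random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, train_size=78, stratify=y, random_state=0)

# Standardize with statistics from the training cohort only.
scaler = StandardScaler().fit(X_tr)
X_tr, X_va = scaler.transform(X_tr), scaler.transform(X_va)

# Stage 1: Mann-Whitney U filter, keep p < 0.05 (two-sided is an assumption).
pvals = np.array([
    mannwhitneyu(X_tr[y_tr == 0, j], X_tr[y_tr == 1, j],
                 alternative="two-sided").pvalue
    for j in range(X_tr.shape[1])
])
keep = np.flatnonzero(pvals < 0.05)

# Stage 2: LASSO, keep features with non-zero coefficients.
lasso = LassoCV(cv=5, random_state=0).fit(X_tr[:, keep], y_tr)
nonzero = np.flatnonzero(lasso.coef_)
if nonzero.size:  # otherwise fall back to the stage-1 set
    keep = keep[nonzero]

# Stage 3: SelectKBest, top ten (the scoring function is an assumption).
kbest = SelectKBest(f_classif, k=min(10, keep.size)).fit(X_tr[:, keep], y_tr)
selected = keep[kbest.get_support()]

# Stacking: three base models, then a boosted-tree meta-learner trained on
# their cross-validated predicted probabilities.
base_models = [
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                          random_state=0)),
]
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=GradientBoostingClassifier(n_estimators=50,
                                               random_state=0),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_tr[:, selected], y_tr)

# Predicted probability of the positive class, analogous to a Rad score.
rad_score = stack.predict_proba(X_va[:, selected])[:, 1]
val_auc = roc_auc_score(y_va, rad_score)
print(f"{selected.size} features selected, validation AUC = {val_auc:.3f}")
```

All three selection stages are fitted on the training cohort only, so the validation cohort does not influence which features are kept. Whether the study followed the same order of operations is not stated in the text.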
The performance of the radiomics models was evaluated using a variety of metrics [30], including accuracy, positive predictive value, negative predictive value, sensitivity, specificity, and area under the ROC curve. The performance of the models is shown in Fig. 4, and the detailed values are shown in Table 3.

3.4 Ensemble Model Development and Validation

The ensemble model combines the non-enhanced, arterial, and venous phase models, taking advantage of the predictive strengths of each imaging phase [33]. XGBoost was chosen as the meta-learner to aggregate the predictions of the base models. The nomogram combining age, gender, and the radiomics model is shown in Fig. 5. The performance of the integrated model was evaluated on the training and independent validation datasets. The results of the decision curve analysis are shown in Fig. 6 [34]. To assess the stability and generalization ability of the model on different datasets, cross-validation and external validation were performed on the GDMUMH dataset to demonstrate the robustness of the model beyond the training cohort [35]. A feature importance analysis was performed on the ensemble model to reveal the impact of individual radiomics attributes on the ensemble prediction and to aid interpretation of the model [36].

Discussion

In recent years, numerous studies have addressed risk prediction for thymoma using radiomics. For instance, Liu et al. [37] employed transfer learning combined with clinical, radiomic, and deep features to establish a support vector machine (SVM) classifier-based model for predicting high-risk and low-risk thymomas, achieving AUCs of 0.99 and 0.95, respectively. Tian et al. [38] investigated the performance of machine learning-based radiomic computed tomography phenotyping in predicting the pathological staging and survival outcomes of patients with thymic epithelial tumors, yielding integrated AUCs of 0.935 and 0.811.
Xiao et al. [39] developed a comprehensive radiomics diagnostic model using multivariate logistic regression analysis, incorporating clinical variables, conventional MR imaging variables, apparent diffusion coefficient (ADC) values, and radiomic features, which performed excellently in distinguishing low-risk from high-risk thymomas. Feng et al. [40] combined 14 machine learning models with different feature selection strategies to establish a three-class model based on radiomic features for predicting the simplified risk classification of thymic epithelial tumors (TETs). Mayoral et al. [41] trained a support vector machine classification model to differentiate between thymomas and thymic carcinomas [42]; integrating traditional and radiomic features in the model also achieved the highest diagnostic performance [43]. However, no study has yet used the differences in computed tomography (CT) values between the arterial, venous, and plain phases to predict the risk of thymomas [44]. In this study, we developed a radiomics model based on the three phases and their inter-phase differences [45].

Studies have shown significantly increased vascularity in highly malignant thymoma lesions (1) [46]. Compared with plain CT, enhanced CT better visualizes vascularity and improves the delineation of lesions from surrounding structures, both of which can be captured by radiomics [47]. We therefore used differencing to capture numerical changes between enhanced and non-enhanced imaging features [48]. Our model effectively captured the heterogeneity of tumor lesions reflected in these differences. Additionally, we used stacked ensemble learning to predict high-risk and low-risk thymomas.
Stacked learning integrates predictions from multiple base models, effectively improving the accuracy and robustness of the radiomics model in predicting thymoma risk [49] [50]. This approach not only handles complex imaging data but also enhances the generalization ability of the model, providing clinicians with a more reliable tool for assessing thymoma risk [51]. The radiomics model and the clinical-radiomics combined model constructed using the stacking method outperformed other models, with the AUC of the imaging model reaching 0.00 and the clinical model reaching 0.00 (2) [52]. Compared to biopsy, machine learning models have significant value in diagnosing benign and malignant tumors [53]. The advantage of these methods lies in their non-invasiveness, which provides an efficient clinical assessment tool while sparing patients discomfort [54].

Analysis of the selected features revealed several types of image features, including texture features (such as arterial_original_gldm_SmallDependenceEmphasis), morphological features (Shape-based (3D)), and first-order statistical features (First Order Statistics). Features extracted from the arterial phase, such as arterial_original_gldm_SmallDependenceEmphasis and arterial_wavelet-LLH_gldm_DependenceEntropy (abbreviated feature names), depict specific morphological and textural characteristics of lesions during arterial perfusion and are closely related to vascular perfusion. Features extracted from the venous phase, such as venous_wavelet-LHH_glszm_SizeZoneNonUniformity and venous_wavelet-HHH_gldm_DependenceVariance, highlight the specificity of lesions during venous perfusion. Non-enhanced phase features (such as plain_wavelet-LHL_gldm_SmallDependenceHighGrayLevelEmphasis and delta_plain_venous_exponential_glszm_LargeAreaHighGrayLevelEmphasis) provide baseline image information independent of contrast agents.
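The text does not define how the change value ("delta") features are computed. As a hedged illustration only, the sketch below assumes each one is the later-phase value minus the earlier-phase value of the same radiomics feature, and follows the delta_<phase>_<phase>_<feature> naming pattern seen in the feature names above. The function add_delta_features and the example values are hypothetical.

```python
from itertools import combinations

# Phase order used to decide which phase is "earlier" (an assumption).
PHASES = ("plain", "arterial", "venous")


def add_delta_features(features):
    """Return a copy of `features` extended with inter-phase differences.

    `features` maps names of the form '<phase>_<feature>' to numeric values
    for a single case. For every pair of phases and every feature present in
    both, the later-phase value minus the earlier-phase value is added under
    the name 'delta_<earlier>_<later>_<feature>'.
    """
    by_phase = {phase: {} for phase in PHASES}
    for name, value in features.items():
        phase, _, base = name.partition("_")
        if phase in by_phase:
            by_phase[phase][base] = value

    out = dict(features)
    for earlier, later in combinations(PHASES, 2):
        shared = by_phase[earlier].keys() & by_phase[later].keys()
        for base in sorted(shared):
            out[f"delta_{earlier}_{later}_{base}"] = (
                by_phase[later][base] - by_phase[earlier][base]
            )
    return out


# Made-up values for one case, for illustration only.
case = {
    "plain_original_glcm_MCC": 0.5,
    "arterial_original_glcm_MCC": 0.75,
    "venous_original_glcm_MCC": 0.625,
}
augmented = add_delta_features(case)
print(augmented["delta_plain_arterial_original_glcm_MCC"])
```

Other definitions, such as ratios or relative changes, would fit the same naming scheme; subtraction is used here only because it is the simplest reading of "change value".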
Differential features, such as delta_plain_arterial_original_glcm_MCC and delta_plain_arterial_original_glrlm_RunEntropy, may capture significant changes occurring between the two phases. These changes may be associated with pathological malignant transformation or other lesion features, providing strong clues for diagnosing benign and malignant thymomas [55]. Non-enhanced features, in turn, may focus more on the basic morphological characteristics of lesions, helping to assess their inherent properties without the influence of contrast agents [56]. Taken together, these features reveal significant changes in thymomas under different blood supply states. This multi-level feature extraction helps to characterize the complexity of thymomas and gives clinicians more information about lesions, which is expected to inform clinical decisions and the formulation of personalized treatment strategies [57].

Looking to the future, although our study did not build models that predict thymoma risk from pathological molecular markers such as Ki-67 and TdT, this does not mean that the importance of these markers is overlooked. Rather, it suggests a new perspective: by combining machine learning with medical imaging features, new predictive models can be developed that predict not only thymoma risk but also the pathological type of tumors. The advantage of this approach lies in its potential to provide personalized predictions, optimize treatment plans, and improve patient survival rates. Therefore, although our study uses radiomics to predict high-risk and low-risk thymomas, our longer-term goal is a general predictive model applicable to various types of cancer, opening more possibilities for future cancer research and treatment.
Developing such a model will require further research into the role of pathological molecular markers in cancer development and into how this knowledge can be applied to building predictive models. This is a challenging but promising task because it offers a new way to understand and combat cancer.

This study has some limitations, including its single-center nature, limited data, retrospective design, the need for clinical validation, and the lack of integration of genomic data. A single-center design limits the applicability of the results to other regions or medical environments, and the retrospective design restricts the available data, which may affect the training and performance evaluation of the model. Another limitation is the lack of genomic information: combining radiomic data with genomic information could provide a more comprehensive understanding of disease mechanisms and may identify molecular biomarkers associated with prognosis and treatment response.

Conclusion

Our study found that radiomics can effectively predict the risk level of patients with thymic tumors, with the combined clinical-radiomics nomogram showing the stronger performance in comparison. This knowledge can help clinicians select personalized treatment plans for early-stage thymoma patients and provides support for personalized therapy in future clinical practice.

Abbreviations

LASSO: Least Absolute Shrinkage and Selection Operator
AUC: Area under the curve
DCA: Decision curve analysis
ROC: Receiver operating characteristic
PACS: Picture Archiving and Communication System
ADC: Apparent diffusion coefficient
TETs: Thymic epithelial tumors
SVM: Support vector machine
MLP: Multilayer Perceptron

Declarations

Authors' contributions: Liang Zhu, Jiaming Li, Yihan Tang, Shuyan He: Conceptualization, Methodology, Software, Writing-Original draft preparation.
Yaxuan Zhang, Chunyuan Chen, Siyuan Li, Xuefeng Wang, Ziye Zhuang: Data curation, Writing-Original draft preparation. Shuyan He, Biao Deng: Supervision, Software, Validation. All authors reviewed the manuscript.

Availability of data and materials: All data generated or analysed during this study are included in this published article and supplementary material.

Ethics approval and consent to participate: Not applicable.

Consent for publication: Not applicable.

Competing interests: The authors declare that they have no competing interests.

Funding: None.

References

1. Wang J, Zhang S (2010) [Advances on diagnosis and treatment of malignant thymic tumors]. Zhongguo Fei Ai Za Zhi 13:985–991. doi:10.3779/j.issn.1009-3419.2010.10.10
2. Engels EA, Pfeiffer RM (2003) Malignant thymoma in the United States: Demographic patterns in incidence and associations with subsequent malignancies. International Journal of Cancer 105:546–551. doi:10.1002/ijc.11099
3. Roden AC, Fang W, Shen Y, et al (2020) Distribution of Mediastinal Lesions Across Multi-Institutional, International, Radiology Databases. J Thorac Oncol 15:568–579. doi:10.1016/j.jtho.2019.12.108
4. Du X, Yu L, Yang L, et al (2021) [Expression and Diagnostic Value of NPTX1 in Thymoma Patients]. Zhongguo Fei Ai Za Zhi 24:1–6. doi:10.3779/j.issn.1009-3419.2021.102.03
5. Detterbeck FC, Zeeshan A (2013) Thymoma: current diagnosis and treatment. Chin Med J (Engl) 126:2186–2191
6. Yuan D, Gu Z, Liang G, et al (2018) [Clinical Study on the Prognosis of Patients with Thymoma with Myasthenia Gravis]. Zhongguo Fei Ai Za Zhi 21:1–7. doi:10.3779/j.issn.1009-3419.2018.01.01
7. Travis WD, Brambilla E, Burke AP, et al (2015) Introduction to The 2015 World Health Organization Classification of Tumors of the Lung, Pleura, Thymus, and Heart. J Thorac Oncol 10:1240–1242. doi:10.1097/JTO.0000000000000663
8. Multidisciplinary Committee of Oncology, Chinese Physicians Association (2021) [Chinese guideline for clinical diagnosis and treatment of thymic epithelial tumors (2021 Edition)]. Zhonghua Zhong Liu Za Zhi 43:395–404. doi:10.3760/cma.j.cn112152-20210313-00226
9. Fang W, Chen W, Chen G, Jiang Y (2005) Surgical management of thymic epithelial tumors: a retrospective review of 204 cases. Ann Thorac Surg 80:2002–2007. doi:10.1016/j.athoracsur.2005.05.058
10. Liu X, Li X, Li J (2020) [Treatment of Recurrent Thymoma]. Zhongguo Fei Ai Za Zhi 23:204–210. doi:10.3779/j.issn.1009-3419.2020.03.11
11. Fang W, Fu J, Shen Y, et al (2016) [Management of Thymic Tumors - Consensus Based on the Chinese Alliance for Research in Thymomas Multi-institutional Retrospective Studies]. Zhongguo Fei Ai Za Zhi 19:414–417. doi:10.3779/j.issn.1009-3419.2016.07.02
12. Tomiyama N, Honda O, Tsubamoto M, et al (2009) Anterior mediastinal tumors: diagnostic accuracy of CT and MRI. Eur J Radiol 69:280–288. doi:10.1016/j.ejrad.2007.10.002
13. Jiao Y, Ren Y, Zheng X (2017) [Quantitative Imaging Assessment of Tumor Response to Chemoradiation in Lung Cancer]. Zhongguo Fei Ai Za Zhi 20:407–414. doi:10.3779/j.issn.1009-3419.2017.06.07
14. Mayoral M, Pagano AM, Araujo-Filho JAB, et al (2023) Conventional and radiomic features to predict pathology in the preoperative assessment of anterior mediastinal masses. Lung Cancer 178:206–212. doi:10.1016/j.lungcan.2023.02.014
15. Lu C-F, Hsu F-T, Hsieh KL-C, et al (2018) Machine Learning-Based Radiomics for Molecular Subtyping of Gliomas. Clin Cancer Res 24:4429–4436. doi:10.1158/1078-0432.CCR-17-3445
16. Hu Y, Xie C, Yang H, et al (2020) Assessment of Intratumoral and Peritumoral Computed Tomography Radiomics for Predicting Pathological Complete Response to Neoadjuvant Chemoradiation in Patients With Esophageal Squamous Cell Carcinoma. JAMA Netw Open 3:e2015927. doi:10.1001/jamanetworkopen.2020.15927
17. Lu C, Ward PS, Kapoor GS, et al (2012) IDH mutation impairs histone demethylation and results in a block to cell differentiation. Nature 483:474–478. doi:10.1038/nature10860
18. Egeland NG, Jonsdottir K, Lauridsen KL, et al (2020) Digital Image Analysis of Ki-67 Stained Tissue Microarrays and Recurrence in Tamoxifen-Treated Breast Cancer Patients. Clin Epidemiol 12:771–781. doi:10.2147/CLEP.S248167
19. Kim D-Y, Park H-S, Choi E-J, et al (2015) Immunophenotypic markers in adult acute lymphoblastic leukemia: the prognostic significance of CD20 and TdT expression. Blood Res 50:227–234. doi:10.5045/br.2015.50.4.227
20. Yan C, Liu J, Yang X, et al (2022) Automatic vs manual coronary CT angiography reconstruction for whole-heart coverage CT scanner: a comparison study in general patient population. J Xray Sci Technol 30:389–398. doi:10.3233/XST-211048
21. Aerts HJWL, Velazquez ER, Leijenaar RTH, et al (2014) Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5:4006. doi:10.1038/ncomms5006
22. Paris MT, Mourtzakis M (2021) Muscle Composition Analysis of Ultrasound Images: A Narrative Review of Texture Analysis. Ultrasound Med Biol 47:880–895. doi:10.1016/j.ultrasmedbio.2020.12.012
23. van Griethuysen JJM, Fedorov A, Parmar C, et al (2017) Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 77:e104–e107. doi:10.1158/0008-5472.CAN-17-0339
24. Huang C-B, Hu J-S, Tan K, et al (2022) Application of machine learning model to predict osteoporosis based on abdominal computed tomography images of the psoas muscle: a retrospective study. BMC Geriatr 22:796. doi:10.1186/s12877-022-03502-9
25. Brown MP, Grundy WN, Lin D, et al (2000) Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci U S A 97:262–267. doi:10.1073/pnas.97.1.262
26. Naimi AI, Balzer LB (2018) Stacked generalization: an introduction to super learning. Eur J Epidemiol 33:459–464. doi:10.1007/s10654-018-0390-z
27. Sipper M, Moore JH (2021) Conservation machine learning: a case study of random forests. Sci Rep 11:3629. doi:10.1038/s41598-021-83247-4
28. Li Y, Li M, Li C, Liu Z (2020) Forest aboveground biomass estimation using Landsat 8 and Sentinel-1A data with machine learning algorithms. Sci Rep 10:9952. doi:10.1038/s41598-020-67024-3
29. Pham TX, Siarry P, Oulhadj H (2020) Segmentation of MR Brain Images Through Hidden Markov Random Field and Hybrid Metaheuristic Algorithm. IEEE Trans Image Process. doi:10.1109/TIP.2020.2990346
30. Lambin P, Leijenaar RTH, Deist TM, et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749–762. doi:10.1038/nrclinonc.2017.141
31. Liu Y, Stojadinovic S, Hrycushko B, et al (2017) A deep convolutional neural network-based automatic delineation strategy for multiple brain metastases stereotactic radiosurgery. PLoS One 12:e0185844. doi:10.1371/journal.pone.0185844
32. Lambin P, Rios-Velazquez E, Leijenaar R, et al (2012) Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 48:441–446. doi:10.1016/j.ejca.2011.11.036
33. Fang Z, Ren J, MacLellan C, et al (2022) A Novel Multi-Stage Residual Feature Fusion Network for Detection of COVID-19 in Chest X-Ray Images. IEEE Trans Mol Biol Multiscale Commun 8:17–27. doi:10.1109/TMBMC.2021.3099367
34. Yan Z, Wang J, Dong Q, et al (2022) XGBoost algorithm and logistic regression to predict the postoperative 5-year outcome in patients with glioma. Ann Transl Med 10:860. doi:10.21037/atm-22-3384
35. Ma M, Liu R, Wen C, et al (2022) Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms. Eur Radiol 32:1652–1662. doi:10.1007/s00330-021-08271-4
36. Gafita A, Calais J, Grogan TR, et al (2021) Nomograms to predict outcomes after 177Lu-PSMA therapy in men with metastatic castration-resistant prostate cancer: an international, multicentre, retrospective study. Lancet Oncol 22:1115–1125. doi:10.1016/S1470-2045(21)00274-6
37. Liu W, Wang W, Zhang H, et al (2023) Development and Validation of Multi-Omics Thymoma Risk Classification Model Based on Transfer Learning. J Digit Imaging 36:2015–2024. doi:10.1007/s10278-023-00855-4
38. Tian D, Yan H-J, Shiiya H, et al (2023) Machine learning-based radiomic computed tomography phenotyping of thymic epithelial tumors: Predicting pathological and survival outcomes. J Thorac Cardiovasc Surg 165:502-516.e9. doi:10.1016/j.jtcvs.2022.05.046
39. Xiao G, Hu Y-C, Ren J-L, et al (2021) MR imaging of thymomas: a combined radiomics nomogram to predict histologic subtypes. Eur Radiol 31:447–457. doi:10.1007/s00330-020-07074-3
40. Feng X-L, Wang S-Z, Chen H-H, et al (2022) Optimizing the radiomics-machine-learning model based on non-contrast enhanced CT for the simplified risk categorization of thymic epithelial tumors: A large cohort retrospective study. Lung Cancer 166:150–160. doi:10.1016/j.lungcan.2022.03.007
41. Mayoral M, Pagano AM, Araujo-Filho JAB, et al (2023) Conventional and radiomic features to predict pathology in the preoperative assessment of anterior mediastinal masses. Lung Cancer 178:206–212. doi:10.1016/j.lungcan.2023.02.014
42. Su X-Y, Wang W-Y, Li J-N, et al (2015) Immunohistochemical differentiation between type B3 thymomas and thymic squamous cell carcinomas. Int J Clin Exp Pathol 8:5354–5362
43. Rao A, Pang M, Kim J, et al (2023) Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow. medRxiv 2023.02.21.23285886. doi:10.1101/2023.02.21.23285886
44. Fukumoto K, Taniguchi T, Ishikawa Y, et al (2012) The utility of [18F]-fluorodeoxyglucose positron emission tomography-computed tomography in thymic epithelial tumours.
Eur J Cardiothorac Surg 42:e152-156. doi:10.1093/ejcts/ezs527 Du K-P, Huang W-P, Liu S-Y, et al (2022) Application of computed tomography-based radiomics in differential diagnosis of adenocarcinoma and squamous cell carcinoma at the esophagogastric junction. World J Gastroenterol 28:4363–4375. doi:10.3748/wjg.v28.i31.4363 Hoogenboom M, Eikelenboom DC, van den Bijgaart RJE, et al (2017) Impact of MR-guided boiling histotripsy in distinct murine tumor models. Ultrason Sonochem 38:1–8. doi:10.1016/j.ultsonch.2017.02.035 Esposito D, Olsson DS, Ragnarsson O, et al (2019) Non-functioning pituitary adenomas: indications for pituitary surgery and post-surgical management. Pituitary 22:422–434. doi:10.1007/s11102-019-00960-0 Pinheiro LC, Candore G, Zaccaria C, et al (2018) An algorithm to detect unexpected increases in frequency of reports of adverse events in EudraVigilance. Pharmacoepidemiol Drug Saf 27:38–45. doi:10.1002/pds.4344 Chen K-H, Hu Y-J (2021) Residue-Residue Interaction Prediction via Stacked Meta-Learning. Int J Mol Sci 22:6393. doi:10.3390/ijms22126393 Ponsiglione A, Gambardella M, Stanzione A, et al (2023) Radiomics for the identification of extraprostatic extension with prostate MRI: a systematic review and meta-analysis. Eur Radiol. doi:10.1007/s00330-023-10427-3 Lin Y, Ma J, Wang Q, Sun D-W (2023) Applications of machine learning techniques for enhancing nondestructive food quality and safety detection. Crit Rev Food Sci Nutr 63:1649–1669. doi:10.1080/10408398.2022.2131725 Xing L, Lesperance ML, Zhang X (2020) Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics 36:65–72. doi:10.1093/bioinformatics/btz531 Liu X, Cheng D, Wang W (2015) MRI in differentiation of benign and malignant tongue tumors. Front Biosci (Landmark Ed) 20:614–620. doi:10.2741/4326 Norbash A, Yucel K, Yuh W, et al (2016) Effect of team training on improving MRI study completion rates and no-show rates. J Magn Reson Imaging 44:1040–1047. 
doi:10.1002/jmri.25219 Yin X, Li Y, Wang H, et al (2022) Small cell lung cancer transformation: From pathogenesis to treatment. Semin Cancer Biol 86:595–606. doi:10.1016/j.semcancer.2022.03.006 Kulikova OI, Stvolinsky SL, Migulin VA, et al (2020) A new derivative of acetylsalicylic acid and carnosine: synthesis, physical and chemical properties, biological activity. Daru 28:119–130. doi:10.1007/s40199-019-00323-x Gallastegui N, Steiner B, Aguero P, et al (2022) The role of point-of-Care Musculoskeletal Ultrasound for Routine Joint evaluation and management in the Hemophilia Clinic - A Real World Experience. BMC Musculoskelet Disord 23:1111. doi:10.1186/s12891-022-06042-w Tables Tables 1 to 3 are available in the Supplementary Files section. Supplementary Figure Supplementary Figure 3 is not available with this version. Figure S3: Calibration curves of the models. (a), (b): Radscore model in the training dataset(a) and test dataset(b). (c), (d): The combined model in the training dataset(c) and test dataset(d). Additional Declarations No competing interests reported. Supplementary Files Table1.xlsx Table1: Baseline characteristics of patients. Table2.xlsx Table2: Delong test with each model Table3.xlsx Table3: The performances of the prediction models. FigureS1.tif Figure S1:Pearson correlation heatmap for all the features and parameters. There was no linear correlation between these parameters and treatment response. FigureS2.tif Figure S2: The process of establishing the models.
{"props":{"pageProps":{"initialData":{"identity":"rs-3983809","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":277509525,"identity":"76db24e8-75c5-4fa8-a05d-a536687fc2ca","order_by":0,"name":"Liang zhu","email":"","orcid":"","institution":"Affiliated Hospital of Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Liang","middleName":"","lastName":"zhu","suffix":""},{"id":277509526,"identity":"95cf1805-8d5f-451f-96d3-2b53262ca99f","order_by":1,"name":"Jiaming Li","email":"","orcid":"","institution":"Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Jiaming","middleName":"","lastName":"Li","suffix":""},{"id":277509527,"identity":"05fe7308-cef7-4c33-8cce-433d7fc4c5a2","order_by":2,"name":"Yihan Tang","email":"","orcid":"","institution":"Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Yihan","middleName":"","lastName":"Tang","suffix":""},{"id":277509528,"identity":"04ee577a-36ff-4ff6-8d86-060754176a74","order_by":3,"name":"Yaxuan Zhang","email":"","orcid":"","institution":"Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Yaxuan","middleName":"","lastName":"Zhang","suffix":""},{"id":277509529,"identity":"1d86dd1b-a58c-435e-8ea3-1424e731bd5e","order_by":4,"name":"Chunyuan Chen","email":"","orcid":"","institution":"Affiliated Hospital of Guangdong Medical
University","correspondingAuthor":false,"prefix":"","firstName":"Chunyuan","middleName":"","lastName":"Chen","suffix":""},{"id":277509530,"identity":"dbc5357a-e2a5-4880-98af-d329e4bab6e4","order_by":5,"name":"Siyuan Li","email":"","orcid":"","institution":"Sun Yat-sen University","correspondingAuthor":false,"prefix":"","firstName":"Siyuan","middleName":"","lastName":"Li","suffix":""},{"id":277509531,"identity":"a99ce5a0-a247-479c-a605-072035c5eb00","order_by":6,"name":"Xuefeng Wang","email":"","orcid":"","institution":"Affiliated Hospital of Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Xuefeng","middleName":"","lastName":"Wang","suffix":""},{"id":277509532,"identity":"dbc5ac1e-ac30-4d62-b455-909bb478f4ad","order_by":7,"name":"Ziye Zhuang","email":"","orcid":"","institution":"Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"Ziye","middleName":"","lastName":"Zhuang","suffix":""},{"id":277509533,"identity":"87b6a30e-e041-47f0-8d88-a346cf5bee6d","order_by":8,"name":"Shuyan He","email":"","orcid":"","institution":"Guangzhou Medical University","correspondingAuthor":true,"prefix":"","firstName":"Shuyan","middleName":"","lastName":"He","suffix":""},{"id":277509534,"identity":"26380bd6-3fda-4d2e-9e7c-23cecce6d116","order_by":9,"name":"biao deng","email":"","orcid":"","institution":"Affiliated Hospital of Guangdong Medical University","correspondingAuthor":false,"prefix":"","firstName":"biao","middleName":"","lastName":"deng","suffix":""}],"badges":[],"createdAt":"2024-02-24
03:59:44","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-3983809/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-3983809/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":52617678,"identity":"c640f282-764b-4b31-9bbd-2a61ce99e875","added_by":"auto","created_at":"2024-03-13 16:21:13","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":1957288,"visible":true,"origin":"","legend":"\u003cp\u003eThe study design and pipeline.\u003c/p\u003e","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/df76f60b1edd385d6a30df2f.jpg"},{"id":52617676,"identity":"796f3001-8bc3-4727-9fa7-303964af6923","added_by":"auto","created_at":"2024-03-13 16:21:13","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":836161,"visible":true,"origin":"","legend":"\u003cp\u003eFlowchart of patient selection. CT, computed tomography.\u003c/p\u003e","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/3a5e238318b6a41b7d9a2fbe.jpg"},{"id":52618237,"identity":"f1b1c458-0634-4432-a11c-2f6d43216838","added_by":"auto","created_at":"2024-03-13 16:29:13","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":6238860,"visible":true,"origin":"","legend":"\u003cp\u003eFeature selection in radiomics. (a)-(f): Cross-validation curve of the LASSO regression model. (a): Arterial phase features (b): Venous phase features (c): Unenhanced phase features (d): The change between the arterial phase features and the venous phase features (e): The change between the arterial phase features and the unenhanced phase features (f): The change between the unenhanced phase features and the venous phase features. (g)-(l): Coefficient curves for radiomic features. 
(g): Arterial phase features (h): Venous phase features (i): Unenhanced phase features (j): The change between the arterial phase features and the venous phase features (k): The change between the arterial phase features and the unenhanced phase features (l): The change between the unenhanced phase features and the venous phase features. (m)-(r): Coefficients in the Lasso model. (m): Arterial phase features (n): Venous phase features (o): Unenhanced phase features (p): The change between the arterial phase features and the venous phase features (q): The change between the arterial phase features and the unenhanced phase features (r): The change between the unenhanced phase features and the venous phase features.\u003c/p\u003e","description":"","filename":"Figure3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/d5ddc37d5a868e7ff833898d.jpg"},{"id":52618234,"identity":"f730aeca-a2d5-431d-837f-9ac65dc495bd","added_by":"auto","created_at":"2024-03-13 16:29:13","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":2254961,"visible":true,"origin":"","legend":"\u003cp\u003eThe performances of the models. (a)-(d):ROC curves of the models. (a), (b): All models of the unenhanced phase, venous phase, arterial phase, the change between the arterial phase and the venous phase, the change between the arterial phase and the unenhanced phase and the change between the unenhanced phase and the venous phase in the training dataset(a) and test dataset(b). (c),(d): Radscore model(c) and the combined model(d) in the training dataset and test dataset. (e),(f): Association between AUC of training and test datasets in the radscore model(e) and the combined model(f). 
(g),(h): Bar plot of the performances of the eight prediction models in the training set and test set.\u003c/p\u003e","description":"","filename":"Figure4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/cc5ec810409e0ac76dba768c.jpg"},{"id":52617682,"identity":"f7ad71f4-15fd-4f24-9a57-fdd23036f6cc","added_by":"auto","created_at":"2024-03-13 16:21:13","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":1372721,"visible":true,"origin":"","legend":"\u003cp\u003eA combined radiomics nomogram for predicting thymomas risk stratification.\u003c/p\u003e","description":"","filename":"Figure5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/c51bcfc48cf1bcac132e3ef4.jpg"},{"id":52617680,"identity":"ff9862bc-7e34-4e32-9ff1-cfca4786668a","added_by":"auto","created_at":"2024-03-13 16:21:13","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":1342469,"visible":true,"origin":"","legend":"\u003cp\u003eDecision-curve analysis. The net benefit of each model is plotted on the y-axis, and the x-axis indicates the threshold probability. The black and dashed lines respectively indicate the assumptions that all or no patients have thymoma. (a), (b): Radscore model in the training dataset(a) and test dataset(b). (c), (d): The combined model in the training dataset(c) and test dataset(d). 
Supplementary.\u003c/p\u003e","description":"","filename":"Figure6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/e67b7b65e906319313e28df0.jpg"},{"id":52640560,"identity":"dc855c67-df4f-4778-94aa-a9e1cf6dc23e","added_by":"auto","created_at":"2024-03-14 01:54:45","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":994111,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/ba879b02-412f-4d16-9745-daf4ba52e0ba.pdf"},{"id":52619448,"identity":"17a57bff-9165-4513-9ce7-e85b4d3333d0","added_by":"auto","created_at":"2024-03-13 16:37:13","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":11020,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable1: \u003c/strong\u003eBaseline characteristics of patients.\u003c/p\u003e","description":"","filename":"Table1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/760ae35e270e617ead328274.xlsx"},{"id":52617674,"identity":"8eca9869-0f72-452a-9c21-24940ed7a861","added_by":"auto","created_at":"2024-03-13 16:21:13","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":10947,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable2: \u003c/strong\u003eDelong test with each model\u003c/p\u003e","description":"","filename":"Table2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/a73f9b42ebf07e356e865a88.xlsx"},{"id":52618236,"identity":"711d1d34-9003-4341-9c50-2ef9c0b1b5e8","added_by":"auto","created_at":"2024-03-13 16:29:13","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":10769,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eTable3: \u003c/strong\u003eThe performances of the prediction 
models.\u003c/p\u003e","description":"","filename":"Table3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/379b953b7bd765dc757f65d4.xlsx"},{"id":52617684,"identity":"5d972c3b-a4a3-4244-881c-e22ba82c3ca8","added_by":"auto","created_at":"2024-03-13 16:21:14","extension":"tif","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":1860194,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFigure S1:\u003c/strong\u003ePearson correlation heatmap for all the features and parameters. There was no linear correlation between these parameters and treatment response.\u003c/p\u003e","description":"","filename":"FigureS1.tif","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/e92de12e1e54448198349f57.tif"},{"id":52617683,"identity":"a8480805-092d-4448-a82b-a6f1255e0cc7","added_by":"auto","created_at":"2024-03-13 16:21:14","extension":"tif","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":1037988,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFigure S2: \u003c/strong\u003eThe process of establishing the models.\u003c/p\u003e","description":"","filename":"FigureS2.tif","url":"https://assets-eu.researchsquare.com/files/rs-3983809/v1/00ff750fac2a12eda138ebb6.tif"}],"financialInterests":"No competing interests reported.","formattedTitle":"A machine learning based on CT radiomics signature and change value features for predicting the risk classification of thymoma","fulltext":[{"header":"Key points","content":"\u003cp\u003eCT based radiomics features and their variation values contribute to risk prediction of thymoma\u003c/p\u003e\n\u003cp\u003eImaging omics models can predict the risk level of thymoma patients\u003c/p\u003e\n\u003cp\u003eThe comprehensive nomogram may serve as a tool to identify the risk level of thymoma and provide support for clinical decision-making\u003c/p\u003e"},{"header":"Introduction","content":"\u003cp\u003eThymoma is a rare 
neoplasm of thymic epithelial origin[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. Its incidence in Asia is approximately 0.49 per 100,000 person-years[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Nonetheless, it is the predominant malignancy of the anterior mediastinum[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e], representing approximately 47% of all neoplasms in this anatomical region[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Notably, this tumor is distinctively associated with paraneoplastic syndromes, particularly myasthenia gravis[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. In 2015, the World Health Organization (WHO) introduced a new classification system for thymic epithelial tumors, which includes six categories: type A, AB, B1, B2, B3, and thymic carcinoma[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Based on differences in biological behavior across subtypes, the histological classification can be simplified into two categories: a low-risk group comprising types A, AB, and B1, and a high-risk group comprising types B2 and B3[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. Surgery is the primary treatment for thymoma, with complete resection offering the best survival rates[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. The likelihood of complete surgical resection is very high in the low-risk group, typically obviating the need for adjuvant therapies.
Conversely, the high-risk group may necessitate multimodal therapy[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. Hence, early and accurate diagnosis and differentiation assume paramount importance.\u003c/p\u003e \u003cp\u003eThe importance of tissue biopsy in assessing the status of tumors cannot be overstated. However, due to the spatiotemporal heterogeneity of tumor contents, the accuracy and reliability of biopsy remain subject to limitations, as only partial tissue sampling is possible. In addition, deep biopsy is an invasive procedure fraught with the risk of complications, while transpleural biopsy may lead to tumor dissemination. By contrast, CT represents a simple and noninvasive modality that enjoys wide applicability[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. Radiomics, which hinges on the extraction of high-dimensional quantitative features from CT images, enables noninvasive quantification of tumor heterogeneity and identification of potential malignant characteristics[\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. In the domain of thoracic tumors, and particularly thymoma, researchers have been dedicated to devising noninvasive approaches for early detection and risk stratification. For instance, Mayoral et al. substantiated the optimal diagnostic performance achieved by integrating conventional and radiomics features derived from CT into a machine learning algorithm[\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eWhile chest computed tomography (CT) remains the most commonly employed imaging modality for preoperative assessment of thymoma[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], its ability, as well as that of magnetic resonance imaging (MRI), to differentiate between the histological subtypes of thymoma is far from ideal. 
To date, no clearly defined noninvasive preoperative standards exist to assist physicians in devising treatment strategies[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e], underscoring the importance of noninvasive approaches for establishing the risk stratification of thymoma prior to surgery. As far as we are aware, few studies have extracted and analyzed the feature differences between non-enhanced, arterial, and venous phase CT images.\u003c/p\u003e \u003cp\u003eThe objective of this study is to propose an imaging-based radiomics and machine learning approach to predict the high-risk and low-risk categories of thymoma[\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. To achieve this aim, we extracted imaging features and their change values from non-enhanced, arterial, and venous phase CT images, and input these data into machine learning algorithms to establish robust predictive models[\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. By combining the imaging features with pathological molecular indicators, we aim to furnish clinicians with more refined diagnostic and prognostic insights, thereby enabling them to make personalized treatment decisions with greater precision[\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e].\u003c/p\u003e"},{"header":"Materials and Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003e2.1 Patient cohort and pathologic evaluation\u003c/h2\u003e \u003cp\u003eThis retrospective study was approved by the ethics review committee, and the requirement for informed consent was waived. The study design and pipeline are illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.
We collected data from a cohort of 126 patients diagnosed with thymoma and 5 patients diagnosed with thymic carcinoma (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e), obtained exclusively from a single medical center's Picture Archiving and Communication System (PACS). Data collection spanned 2015 to 2023 and encompassed 74 male and 57 female patients aged 16 to 80 years. The inclusion criteria were as follows: (1) archived data of postoperatively, pathologically diagnosed thymoma from January 2015 to October 2023; (2) CT image data available at our institution. The following exclusion criteria were applied to these patients: (1) Patients with CT artifacts; (2) Patients who did not receive relevant treatment before preoperative CT scan; (3) Patients with incomplete clinical data; (4) Patients under 16 years old or over 80 years old; (5) Patients with tumor diameter less than 1 cm.\u003c/p\u003e \u003cp\u003eExpert pathologists assessed and classified the pathological specimens of each patient, confirming subtypes such as A, AB, B1, B2 and B3 through pathological examination. Additionally, immunohistochemical staining was performed on the pathological samples to quantify the expression levels of key pathological molecular markers such as Ki-67 and TdT, thereby providing molecular biology features for subsequent model construction. [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e] [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003e2.2 CT Imaging Protocol\u003c/h2\u003e \u003cp\u003eComputed Tomography (CT) scans were performed using a GE Medical Systems Optima CT680 Series scanner at the Affiliated Hospital of Guangdong Medical University.
The imaging protocol followed standardized procedures to ensure consistent image acquisition across all patients. For each patient, a series of axial images was acquired with the following settings: slice thickness, 0.625 mm; tube voltage, 120 kVp; tube current, 261 mA; reconstruction diameter, 380.00. During enhanced scanning, iodohexanol was injected into the median cubital vein at a flow rate of 4 ml/s and a dose of 0.9-1.0 ml/kg. Scanning was triggered by aortic tracking monitoring: the arterial phase was initiated when the CT value reached 100 HU, and the venous phase followed after a delay of 15 seconds.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003e2.3 Image segmentation and feature extraction\u003c/h2\u003e \u003cp\u003eTwo experienced radiologists manually segmented the plain, arterial and venous phase CT images of the 131 cases. [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] Thymic tumors and surrounding tissues were outlined on each CT slice using ITK-SNAP software. Features were extracted from the segmented images using the PyRadiomics library with the following settings: bin width, 25; resampled pixel spacing, [3, 3, 3] (in millimeters); interpolator, nearest neighbor; normalization, enabled. The RadiomicsFeatureExtractor class was used for feature extraction with all features and image types enabled. The extracted features cover shape descriptors, texture and gray-level attributes. Shape features include area, perimeter, and volume.
[\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] Texture features include gray-level co-occurrence matrix (GLCM) statistics and gray-level size zone matrix (GLSZM) attributes. [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e] By combining ITK-SNAP with the PyRadiomics library, thymoma feature information can be accurately extracted from CT images. [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003e2.4 Feature Selection\u003c/h2\u003e \u003cp\u003eAfter extracting a comprehensive feature set using the PyRadiomics library, a multi-stage feature selection process was implemented to identify the most informative features. [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] The first step was feature selection using the Mann-Whitney U rank sum test. [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] At this stage, features with p-values less than 0.05 were retained. Subsequently, LASSO (Least Absolute Shrinkage and Selection Operator) regression was applied to make the feature coefficients sparse and thereby efficiently select features that have a significant effect on the target variable. [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] Finally, the relationship between the features and the target variable was evaluated using the SelectKBest method to select the subset of features most useful to the problem, yielding ten features out of those selected by LASSO.
This multi-stage selection, ending with SelectKBest, narrows the initial feature set to a subset of features that are both important and highly relevant, giving an accurate and efficient feature set for subsequent classification and predictive analysis.[\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003e2.5 Model building based on stacked ensemble learning\u003c/h2\u003e \u003cp\u003eTo combine the predictive power of multiple machine learning algorithms, we used a stacked ensemble learning approach to build a robust and accurate model for predicting high-risk thymoma. [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] Three different machine learning algorithms were selected as base models: XGBoost, Random Forest, and Multilayer Perceptron, each trained to classify the inputs. Their results were fed into the second layer, which was then trained on these inputs using support vectors.[\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eXGBoost yields the final model. While constructing the change value features, data from the plain, arterial and venous phases were considered, along with the corresponding change value features between these phases. These change value features provide an important basis for understanding the characteristic changes of thymoma at different phases.\u003c/p\u003e \u003cp\u003eAs for the choice of meta-learner, we chose XGBoost as the meta-learner to summarize the prediction results of the base models.[\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eThe base models mentioned above were trained on data from the plain, arterial and venous phases as well as the change value features. 
The prediction results of the base models were used to train the meta-learner and generate the final aggregated prediction results. [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eBy applying stacked ensemble learning, we combine multiple machine learning algorithms to provide an efficient and reliable solution for predicting thymoma risk. Ultimately, a personalized prediction nomogram is constructed by combining age, gender and the radiomics model.[\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] The Rad score is the predicted probability output by the integrated imaging model, which uses a stacking learning algorithm. The first layer is composed of XGBoost, Random Forest, and MLP, while the second layer is XGBoost, which learns from the output of the first layer. The third layer of the model uses the results of six image models from the first and second layers to learn and output the Rad score. The process of establishing the nomogram is shown in Figure \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e2.6 Statistical analysis\u003c/h2\u003e \u003cp\u003eThe Mann-Whitney U test was performed on the continuous radiomics feature variables in Excel, and a one-sided P-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 was considered statistically significant. The chi-square test and t-test were conducted in Excel for gender and age, respectively, and two-sided P-values\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were considered statistically significant. 
In Python (3.9.12), the LASSO and SelectKBest algorithms were used to filter radiomics features, while the MLP, Random Forest, and XGBoost algorithms were used to develop radiomics models, and the code at \"\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/Hhy096/nomogram\"\u003c/span\u003e\u003cspan address=\"https://github.com/Hhy096/nomogram\u0026quot;\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e was used to construct nomograms. The ROC curves of the models were compared with the DeLong test in Python (3.9.12); the DeLong test results are shown in Table\u0026nbsp;2. Decision curve analysis (DCA) was performed in Python (3.9.12) to evaluate the clinical utility of the model, and calibration curves were drawn to describe the calibration of the nomogram in the training and validation cohorts.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\n \u003ch2\u003e3.1 Patient Characteristics\u003c/h2\u003e\n \u003cp\u003eThis study included a total of 131 patients with thymoma who received treatment at our hospital, of whom 65 were low-risk and 66 were high-risk. Among these patients, 78 cases were assigned to the GDMU dataset as the training cohort. The remaining 53 cases formed the validation cohort, known as the GDMU dataset. Table\u0026nbsp;1 presents the baseline characteristics of the thymoma patients at the onset of the study. Clinical and pathological characteristics did not differ significantly between the training and validation groups.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\n \u003ch2\u003e3.2 Feature selection\u003c/h2\u003e\n \u003cp\u003eAn extensive set of 1698 radiomics features and change value features was extracted from the non-enhanced, arterial and venous phase images. 
This encompassed a diverse spectrum of quantitative imaging characteristics.\u003c/p\u003e\n \u003cp\u003eTo ensure the selection of the most informative features, a multi-step approach was employed. First, the Mann-Whitney U test was applied to assess differences between the two risk categories, and features with p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were retained. [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e] Subsequently, the LASSO method was used to further streamline the feature set based on the feature coefficients. After LASSO selection, the retained feature set remained substantial. For optimal performance and interpretability, the SelectKBest method was then employed to select the ten features with the highest discriminatory potential. The relevant results are shown in Fig. \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\n \u003ch2\u003e3.3 Radiomics model development\u003c/h2\u003e\n \u003cp\u003eThe curated features were used to create a robust model for predicting high-risk tumors.\u003c/p\u003e\n \u003cp\u003eFeatures selected from the different imaging phases were used to build a feature-enhanced dataset that encapsulates the radiomics attributes of each patient case. [\u003cspan class=\"CitationRef\"\u003e31\u003c/span\u003e] To build the radiomics model, a series of machine learning algorithms was employed, each customized to exploit the potential of the curated features.[\u003cspan class=\"CitationRef\"\u003e32\u003c/span\u003e] These algorithms include Random Forest, XGBoost, and Multilayer Perceptron (MLP). Stacked ensemble learning was used to integrate the outputs of the individual machine learning models into a meta-model. 
The performance of the radiomics models was evaluated using a variety of metrics [\u003cspan class=\"CitationRef\"\u003e30\u003c/span\u003e], including accuracy, positive predictive value, negative predictive value, sensitivity, specificity, and area under the ROC curve. The performance of the models is shown in Fig. \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e and the detailed values are given in Table 3.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\n \u003ch2\u003e3.4 Ensemble Model Development and Validation\u003c/h2\u003e\n \u003cp\u003eThe ensemble model combines the non-enhanced, arterial and venous phase models, taking advantage of the predictive strengths of each imaging phase. [\u003cspan class=\"CitationRef\"\u003e33\u003c/span\u003e] XGBoost was chosen as the meta-learner to aggregate the predictions of the base models. The nomogram combining age, gender, and the radiomics model is shown in Fig. \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e. The performance of the integrated model was evaluated on the training and independent validation datasets. The results of the decision curve analysis are shown in Fig. \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e. [\u003cspan class=\"CitationRef\"\u003e34\u003c/span\u003e] To determine the stability and generalization ability of the model on different datasets, cross-validation and external validation were performed on the GDMUMH dataset to demonstrate the robustness of the model beyond the training cohort.[\u003cspan class=\"CitationRef\"\u003e35\u003c/span\u003e] A feature importance analysis was performed on the ensemble model to reveal the impact of individual radiomics attributes on the ensemble prediction and to facilitate the interpretation of the model. 
[\u003cspan class=\"CitationRef\"\u003e36\u003c/span\u003e]\u003c/p\u003e\n\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn recent years, there have been numerous studies on the predictive risk assessment of thymoma using radiomics. For instance, Liu et al. [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] employed transfer learning combined with clinical, radiomic, and deep features to establish a support vector machine (SVM) classifier-based model for predicting high and low-risk thymomas, achieving AUCs of 0.99 and 0.95, respectively. Tian et al. [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] investigated the performance of machine learning-based radiomic computed tomography phenotyping in predicting pathological staging and survival outcomes of thymic epithelial tumor patients, yielding integrated AUCs of 0.935 and 0.811. Xiao et al. [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] developed a comprehensive radiomics diagnostic model using multivariate logistic regression analysis, incorporating clinical, conventional MR imaging variables, apparent diffusion coefficient (ADC) values, and radiomic features, demonstrating excellent performance in distinguishing low and high-risk thymomas. Feng et al. [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e] utilized 14 machine learning models combined with different feature selection strategies to establish a three-class model based on radiomic features, predicting simplified risk classification of thymic epithelial tumors (TETs). Mayoral et al. 
[\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] trained a support vector machine classification model to differentiate between thymomas and thymic carcinomas.[\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e] Integration of traditional and radiomic features in the model also achieved the highest diagnostic performance.[\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eHowever, no study has yet used the differences in computed tomography (CT) values between the arterial, venous, and plain phases to predict the risk of thymomas.[\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eIn this study, we innovatively developed a radiomics model based on the three phases and their inter-phase differences.[\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e] Compared to CT plain scans, studies have shown a significant increase in vascularity in highly malignant thymoma lesions (1).[\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e] Enhanced CT better visualizes vascularity and enhances the delineation of lesion areas from surrounding structures, all of which can be represented in radiomics. [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e] Therefore, in this study, we utilized differential methods to capture numerical changes between enhanced and non-enhanced imaging features.[\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e] Our model effectively demonstrated the heterogeneity of tumor lesion area differences.\u003c/p\u003e \u003cp\u003eAdditionally, in this study, we utilized stacked ensemble learning for predicting high and low-risk thymomas. 
Stacked learning integrates predictions from multiple base models, effectively improving the accuracy and robustness of the radiomics model in predicting thymoma risk.[\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e] [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e] This approach not only handles complex imaging data but also enhances the model's generalization ability, providing clinicians with a more reliable tool for assessing thymoma risk. [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e] The radiomics model and the clinical-radiomics combined model constructed using the stacking method outperformed other models, with the AUC of the imaging model reaching 0.00 and the clinical model reaching 0.00 (2).[\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e] Compared to biopsy, machine learning models have significant value in diagnosing benign and malignant tumors.[\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e] The advantages of these methods lie in their non-invasiveness, providing efficient clinical assessment tools for patients' comfort and effectiveness.[\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eThrough the analysis of selected features, we observed various types of image features, including texture features (such as arterial_original_gldm_SmallDependenceEmphasis), morphological features (such as Shape-based (3D)), and first-order statistical features (such as First Order Statistics). Features extracted from the arterial phase, such as arterial_original_gldm_SmallDependenceEmphasis and arterial_wavelet-LLH_gldm_DependenceEntropy (abbreviated feature names), depict specific morphological and textural characteristics of lesions during arterial perfusion, closely related to vascular perfusion. 
Features extracted from the venous phase, such as venous_wavelet-LHH_glszm_SizeZoneNonUniformity and venous_wavelet-HHH_gldm_DependenceVariance, highlight the specificity of lesions during venous perfusion. Non-enhanced phase features (such as plain_wavelet-LHL_gldm_SmallDependenceHighGrayLevelEmphasis and delta_plain_venous_exponential_glszm_LargeAreaHighGrayLevelEmphasis) provide baseline image information independent of contrast agents. Differential features, such as delta_plain_arterial_original_glcm_MCC and delta_plain_arterial_original_glrlm_RunEntropy, may capture significant changes occurring between the two phases. These changes may be associated with pathological malignant transformation or other lesion features, providing strong clues for diagnosing benign and malignant thymomas.[\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e] These features may focus more on the basic morphological characteristics of lesions, helping to assess the inherent properties of lesions without the influence of contrast agents. [\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e] These features reveal significant changes in thymomas under different blood supply states. This multi-level feature extraction helps to comprehensively understand the complex characteristics of thymomas, providing clinicians with more information about lesions, which is expected to influence clinical decisions and the formulation of personalized treatment strategies.[\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e]\u003c/p\u003e \u003cp\u003eLooking to the future, although our study did not establish models to predict thymoma risk using pathological molecular markers such as Ki-67 and TdT, this does not mean that the importance of these markers is overlooked. 
In fact, this provides us with a new perspective: by combining machine learning with medical imaging features, new predictive models can be developed that not only predict thymoma risk but also predict the pathological type of tumors. The advantage of this approach lies in its ability to provide personalized predictions, optimize treatment plans, and improve patient survival rates. Therefore, although our study uses radiomics to predict high and low-risk thymomas, our goal is to develop a universal predictive model applicable to various types of cancers, thus providing more possibilities for future cancer research and treatment. The development of such a model will require further research and understanding of the role of pathological molecular markers in cancer development, and how to apply this knowledge to the establishment of predictive models. This is a challenging but hopeful task because it provides us with a completely new way to understand and combat cancer.\u003c/p\u003e \u003cp\u003eThis study has some limitations, including single-center nature, limited data, retrospective design, the need for clinical validation, and the lack of integration of genomic data. Single-center studies limit the applicability of results in other regions or medical environments, and retrospective design results in relatively limited available data, which may affect the training and performance evaluation of the model. Another limitation is the lack of integration of genomic information, while combining radiomic data with genomic information can provide a more comprehensive understanding of disease mechanisms and may identify molecular biomarkers associated with prognosis and treatment response.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eOur study found that radiomics can effectively predict the risk level of thymic tumor patients, with clinical radiomics correlation maps demonstrating stronger associations in comparison. 
This knowledge can aid clinicians in guiding the selection of personalized treatment plans for early-stage thymoma patients. This approach provides robust support for personalized therapy, holding significant implications for future clinical practice.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eLASSO \u0026nbsp; Least Absolute Shrinkage and Selection Operator\u003c/p\u003e\n\u003cp\u003eAUC \u0026nbsp; \u0026nbsp;Area under the curve\u003c/p\u003e\n\u003cp\u003eDCA \u0026nbsp; \u0026nbsp;Decision curve analysis\u003c/p\u003e\n\u003cp\u003eROC \u0026nbsp; \u0026nbsp;Receiver operating characteristic\u003c/p\u003e\n\u003cp\u003ePACS \u0026nbsp; Picture Archiving and Communication System\u003c/p\u003e\n\u003cp\u003eADC \u0026nbsp; Apparent diffusion coefficient\u003c/p\u003e\n\u003cp\u003eTETs \u0026nbsp; Thymic epithelial tumors\u003c/p\u003e\n\u003cp\u003eSVM \u0026nbsp; Support vector machine\u003c/p\u003e\n\u003cp\u003eMLP \u0026nbsp; Multilayer Perceptron\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthors\u0026rsquo; contributions\u003c/strong\u003e\u003cstrong\u003e:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eLiang Zhu, Jiaming Li, Yihan Tang, Shuyan He: Conceptualization, Methodology, Software, Writing-Original draft preparation. Yaxuan Zhang, Chunyuan Chen, Siyuan Li,\u0026nbsp;Xuefeng Wang, Ziye Zhuang: Data curation, Writing-Original draft preparation. Shuyan He, Biao Deng: Supervision, Software, Validation. 
All authors reviewed the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003cstrong\u003e:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;All data generated or analysed during this study are included in this published article and supplementary material.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNone\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eWang J, Zhang S (2010) [Advances on diagnosis and treatment of malignant thymic tumors]. Zhongguo Fei Ai Za Zhi 13:985\u0026ndash;991. doi:10.3779/j.issn.1009-3419.2010.10.10\u003c/li\u003e\n\u003cli\u003eEngels EA, Pfeiffer RM (2003) Malignant thymoma in the United States: Demographic patterns in incidence and associations with subsequent malignancies. International Journal of Cancer 105:546\u0026ndash;551. doi:10.1002/ijc.11099\u003c/li\u003e\n\u003cli\u003eRoden AC, Fang W, Shen Y, et al (2020) Distribution of Mediastinal Lesions Across Multi-Institutional, International, Radiology Databases. J Thorac Oncol 15:568\u0026ndash;579. doi:10.1016/j.jtho.2019.12.108\u003c/li\u003e\n\u003cli\u003eDu X, Yu L, Yang L, et al (2021) [Expression and Diagnostic Value of NPTX1 in Thymoma Patients]. Zhongguo Fei Ai Za Zhi 24:1\u0026ndash;6. doi:10.3779/j.issn.1009-3419.2021.102.03\u003c/li\u003e\n\u003cli\u003eDetterbeck FC, Zeeshan A (2013) Thymoma: current diagnosis and treatment. 
Chin Med J (Engl) 126:2186\u0026ndash;2191\u003c/li\u003e\n\u003cli\u003eYuan D, Gu Z, Liang G, et al (2018) [Clinical Study on the Prognosis of Patients with Thymoma with Myasthenia Gravis]. Zhongguo Fei Ai Za Zhi 21:1\u0026ndash;7. doi:10.3779/j.issn.1009-3419.2018.01.01\u003c/li\u003e\n\u003cli\u003eTravis WD, Brambilla E, Burke AP, et al (2015) Introduction to The 2015 World Health Organization Classification of Tumors of the Lung, Pleura, Thymus, and Heart. J Thorac Oncol 10:1240\u0026ndash;1242. doi:10.1097/JTO.0000000000000663\u003c/li\u003e\n\u003cli\u003eMultidisciplinary Committee of Oncology, Chinese Physicians Association (2021) [Chinese guideline for clinical diagnosis and treatment of thymic epithelial tumors (2021 Edition)]. Zhonghua Zhong Liu Za Zhi 43:395\u0026ndash;404. doi:10.3760/cma.j.cn112152-20210313-00226\u003c/li\u003e\n\u003cli\u003eFang W, Chen W, Chen G, Jiang Y (2005) Surgical management of thymic epithelial tumors: a retrospective review of 204 cases. Ann Thorac Surg 80:2002\u0026ndash;2007. doi:10.1016/j.athoracsur.2005.05.058\u003c/li\u003e\n\u003cli\u003eLiu X, Li X, Li J (2020) [Treatment of Recurrent Thymoma]. Zhongguo Fei Ai Za Zhi 23:204\u0026ndash;210. doi:10.3779/j.issn.1009-3419.2020.03.11\u003c/li\u003e\n\u003cli\u003eFang W, Fu J, Shen Y, et al (2016) [Management of Thymic Tumors - Consensus Based on the Chinese Alliance for Research in Thymomas Multi-institutional Retrospective Studies]. Zhongguo Fei Ai Za Zhi 19:414\u0026ndash;417. doi:10.3779/j.issn.1009-3419.2016.07.02\u003c/li\u003e\n\u003cli\u003eTomiyama N, Honda O, Tsubamoto M, et al (2009) Anterior mediastinal tumors: diagnostic accuracy of CT and MRI. Eur J Radiol 69:280\u0026ndash;288. doi:10.1016/j.ejrad.2007.10.002\u003c/li\u003e\n\u003cli\u003eJiao Y, Ren Y, Zheng X (2017) [Quantitative Imaging Assessment of Tumor Response to Chemoradiation \u2029in Lung Cancer]. Zhongguo Fei Ai Za Zhi 20:407\u0026ndash;414. 
doi:10.3779/j.issn.1009-3419.2017.06.07\u003c/li\u003e\n\u003cli\u003eMayoral M, Pagano AM, Araujo-Filho JAB, et al (2023) Conventional and radiomic features to predict pathology in the preoperative assessment of anterior mediastinal masses. Lung Cancer 178:206\u0026ndash;212. doi:10.1016/j.lungcan.2023.02.014\u003c/li\u003e\n\u003cli\u003eLu C-F, Hsu F-T, Hsieh KL-C, et al (2018) Machine Learning-Based Radiomics for Molecular Subtyping of Gliomas. Clin Cancer Res 24:4429\u0026ndash;4436. doi:10.1158/1078-0432.CCR-17-3445\u003c/li\u003e\n\u003cli\u003eHu Y, Xie C, Yang H, et al (2020) Assessment of Intratumoral and Peritumoral Computed Tomography Radiomics for Predicting Pathological Complete Response to Neoadjuvant Chemoradiation in Patients With Esophageal Squamous Cell Carcinoma. JAMA Netw Open 3:e2015927. doi:10.1001/jamanetworkopen.2020.15927\u003c/li\u003e\n\u003cli\u003eLu C, Ward PS, Kapoor GS, et al (2012) IDH mutation impairs histone demethylation and results in a block to cell differentiation. Nature 483:474\u0026ndash;478. doi:10.1038/nature10860\u003c/li\u003e\n\u003cli\u003eEgeland NG, Jonsdottir K, Lauridsen KL, et al (2020) Digital Image Analysis of Ki-67 Stained Tissue Microarrays and Recurrence in Tamoxifen-Treated Breast Cancer Patients. Clin Epidemiol 12:771\u0026ndash;781. doi:10.2147/CLEP.S248167\u003c/li\u003e\n\u003cli\u003eKim D-Y, Park H-S, Choi E-J, et al (2015) Immunophenotypic markers in adult acute lymphoblastic leukemia: the prognostic significance of CD20 and TdT expression. Blood Res 50:227\u0026ndash;234. doi:10.5045/br.2015.50.4.227\u003c/li\u003e\n\u003cli\u003eYan C, Liu J, Yang X, et al (2022) Automatic vs manual coronary CT angiography reconstruction for whole-heart coverage CT scanner: a comparison study in general patient population. J Xray Sci Technol 30:389\u0026ndash;398. 
doi:10.3233/XST-211048\u003c/li\u003e\n\u003cli\u003eAerts HJWL, Velazquez ER, Leijenaar RTH, et al (2014) Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5:4006. doi: 10.1038/ncomms5006\u003c/li\u003e\n\u003cli\u003eParis MT, Mourtzakis M (2021) Muscle Composition Analysis of Ultrasound Images: A Narrative Review of Texture Analysis. Ultrasound Med Biol 47:880\u0026ndash;895. doi:10.1016/j.ultrasmedbio.2020.12.012\u003c/li\u003e\n\u003cli\u003evan Griethuysen JJM, Fedorov A, Parmar C, et al (2017) Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 77:e104\u0026ndash;e107. doi:10.1158/0008-5472.CAN-17-0339\u003c/li\u003e\n\u003cli\u003eHuang C-B, Hu J-S, Tan K, et al (2022) Application of machine learning model to predict osteoporosis based on abdominal computed tomography images of the psoas muscle: a retrospective study. BMC Geriatr 22:796. doi:10.1186/s12877-022-03502-9\u003c/li\u003e\n\u003cli\u003eBrown MP, Grundy WN, Lin D, et al (2000) Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci U S A 97:262\u0026ndash;267. doi:10.1073/pnas.97.1.262\u003c/li\u003e\n\u003cli\u003eNaimi AI, Balzer LB (2018) Stacked generalization: an introduction to super learning. Eur J Epidemiol 33:459\u0026ndash;464. doi:10.1007/s10654-018-0390-z\u003c/li\u003e\n\u003cli\u003eSipper M, Moore JH (2021) Conservation machine learning: a case study of random forests. Sci Rep 11:3629. doi: 10.1038/s41598-021-83247-4\u003c/li\u003e\n\u003cli\u003eLi Y, Li M, Li C, Liu Z (2020) Forest aboveground biomass estimation using Landsat 8 and Sentinel-1A data with machine learning algorithms. Sci Rep 10:9952. doi:10.1038/s41598-020-67024-3\u003c/li\u003e\n\u003cli\u003ePham TX, Siarry P, Oulhadj H (2020) Segmentation of MR Brain Images Through Hidden Markov Random Field and Hybrid Metaheuristic Algorithm. IEEE Trans Image Process. 
doi:10.1109/TIP.2020.2990346\u003c/li\u003e\n\u003cli\u003eLambin P, Leijenaar RTH, Deist TM, et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749\u0026ndash;762. doi:10.1038/nrclinonc.2017.141\u003c/li\u003e\n\u003cli\u003eLiu Y, Stojadinovic S, Hrycushko B, et al (2017) A deep convolutional neural network-based automatic delineation strategy for multiple brain metastases stereotactic radiosurgery. PLoS One 12:e0185844. doi:10.1371/journal.pone.0185844\u003c/li\u003e\n\u003cli\u003eLambin P, Rios-Velazquez E, Leijenaar R, et al (2012) Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 48:441\u0026ndash;446. doi:10.1016/j.ejca.2011.11.036\u003c/li\u003e\n\u003cli\u003eFang Z, Ren J, MacLellan C, et al (2022) A Novel Multi-Stage Residual Feature Fusion Network for Detection of COVID-19 in Chest X-Ray Images. IEEE Trans Mol Biol Multiscale Commun 8:17\u0026ndash;27. doi:10.1109/TMBMC.2021.3099367\u003c/li\u003e\n\u003cli\u003eYan Z, Wang J, Dong Q, et al (2022) XGBoost algorithm and logistic regression to predict the postoperative 5-year outcome in patients with glioma. Ann Transl Med 10:860. doi: 10.21037/atm-22-3384\u003c/li\u003e\n\u003cli\u003eMa M, Liu R, Wen C, et al (2022) Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms. Eur Radiol 32:1652\u0026ndash;1662. doi:10.1007/s00330-021-08271-4\u003c/li\u003e\n\u003cli\u003eGafita A, Calais J, Grogan TR, et al (2021) Nomograms to predict outcomes after 177Lu-PSMA therapy in men with metastatic castration-resistant prostate cancer: an international, multicentre, retrospective study. Lancet Oncol 22:1115\u0026ndash;1125. doi:10.1016/S1470-2045(21)00274-6\u003c/li\u003e\n\u003cli\u003eLiu W, Wang W, Zhang H, et al (2023) Development and Validation of Multi-Omics Thymoma Risk Classification Model Based on Transfer Learning. 
J Digit Imaging 36:2015\u0026ndash;2024. doi:10.1007/s10278-023-00855-4\u003c/li\u003e\n\u003cli\u003eTian D, Yan H-J, Shiiya H, et al (2023) Machine learning-based radiomic computed tomography phenotyping of thymic epithelial tumors: Predicting pathological and survival outcomes. J Thorac Cardiovasc Surg 165:502-516.e9. doi:10.1016/j.jtcvs.2022.05.046\u003c/li\u003e\n\u003cli\u003eXiao G, Hu Y-C, Ren J-L, et al (2021) MR imaging of thymomas: a combined radiomics nomogram to predict histologic subtypes. Eur Radiol 31:447\u0026ndash;457. doi: 10.1007/s00330-020-07074-3\u003c/li\u003e\n\u003cli\u003eFeng X-L, Wang S-Z, Chen H-H, et al (2022) Optimizing the radiomics-machine-learning model based on non-contrast enhanced CT for the simplified risk categorization of thymic epithelial tumors: A large cohort retrospective study. Lung Cancer 166:150\u0026ndash;160. doi:10.1016/j.lungcan.2022.03.007\u003c/li\u003e\n\u003cli\u003eMayoral M, Pagano AM, Araujo-Filho JAB, et al (2023) Conventional and radiomic features to predict pathology in the preoperative assessment of anterior mediastinal masses. Lung Cancer 178:206\u0026ndash;212. doi:10.1016/j.lungcan.2023.02.014\u003c/li\u003e\n\u003cli\u003eSu X-Y, Wang W-Y, Li J-N, et al (2015) Immunohistochemical differentiation between type B3 thymomas and thymic squamous cell carcinomas. Int J Clin Exp Pathol 8:5354\u0026ndash;5362\u003c/li\u003e\n\u003cli\u003eRao A, Pang M, Kim J, et al (2023) Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow. medRxiv 2023.02.21.23285886. doi:10.1101/2023.02.21.23285886\u003c/li\u003e\n\u003cli\u003eFukumoto K, Taniguchi T, Ishikawa Y, et al (2012) The utility of [18F]-fluorodeoxyglucose positron emission tomography-computed tomography in thymic epithelial tumours. Eur J Cardiothorac Surg 42:e152-156. 
doi:10.1093/ejcts/ezs527\u003c/li\u003e\n\u003cli\u003eDu K-P, Huang W-P, Liu S-Y, et al (2022) Application of computed tomography-based radiomics in differential diagnosis of adenocarcinoma and squamous cell carcinoma at the esophagogastric junction. World J Gastroenterol 28:4363\u0026ndash;4375. doi:10.3748/wjg.v28.i31.4363\u003c/li\u003e\n\u003cli\u003eHoogenboom M, Eikelenboom DC, van den Bijgaart RJE, et al (2017) Impact of MR-guided boiling histotripsy in distinct murine tumor models. Ultrason Sonochem 38:1\u0026ndash;8. doi:10.1016/j.ultsonch.2017.02.035\u003c/li\u003e\n\u003cli\u003eEsposito D, Olsson DS, Ragnarsson O, et al (2019) Non-functioning pituitary adenomas: indications for pituitary surgery and post-surgical management. Pituitary 22:422\u0026ndash;434. doi:10.1007/s11102-019-00960-0\u003c/li\u003e\n\u003cli\u003ePinheiro LC, Candore G, Zaccaria C, et al (2018) An algorithm to detect unexpected increases in frequency of reports of adverse events in EudraVigilance. Pharmacoepidemiol Drug Saf 27:38\u0026ndash;45. doi:10.1002/pds.4344\u003c/li\u003e\n\u003cli\u003eChen K-H, Hu Y-J (2021) Residue-Residue Interaction Prediction via Stacked Meta-Learning. Int J Mol Sci 22:6393. doi:10.3390/ijms22126393\u003c/li\u003e\n\u003cli\u003ePonsiglione A, Gambardella M, Stanzione A, et al (2023) Radiomics for the identification of extraprostatic extension with prostate MRI: a systematic review and meta-analysis. Eur Radiol. doi:10.1007/s00330-023-10427-3\u003c/li\u003e\n\u003cli\u003eLin Y, Ma J, Wang Q, Sun D-W (2023) Applications of machine learning techniques for enhancing nondestructive food quality and safety detection. Crit Rev Food Sci Nutr 63:1649\u0026ndash;1669. doi:10.1080/10408398.2022.2131725\u003c/li\u003e\n\u003cli\u003eXing L, Lesperance ML, Zhang X (2020) Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics 36:65\u0026ndash;72. 
doi:10.1093/bioinformatics/btz531\u003c/li\u003e\n\u003cli\u003eLiu X, Cheng D, Wang W (2015) MRI in differentiation of benign and malignant tongue tumors. Front Biosci (Landmark Ed) 20:614\u0026ndash;620. doi:10.2741/4326\u003c/li\u003e\n\u003cli\u003eNorbash A, Yucel K, Yuh W, et al (2016) Effect of team training on improving MRI study completion rates and no-show rates. J Magn Reson Imaging 44:1040\u0026ndash;1047. doi:10.1002/jmri.25219\u003c/li\u003e\n\u003cli\u003eYin X, Li Y, Wang H, et al (2022) Small cell lung cancer transformation: From pathogenesis to treatment. Semin Cancer Biol 86:595\u0026ndash;606. doi:10.1016/j.semcancer.2022.03.006\u003c/li\u003e\n\u003cli\u003eKulikova OI, Stvolinsky SL, Migulin VA, et al (2020) A new derivative of acetylsalicylic acid and carnosine: synthesis, physical and chemical properties, biological activity. Daru 28:119\u0026ndash;130. doi:10.1007/s40199-019-00323-x\u003c/li\u003e\n\u003cli\u003eGallastegui N, Steiner B, Aguero P, et al (2022) The role of point-of-Care Musculoskeletal Ultrasound for Routine Joint evaluation and management in the Hemophilia Clinic - A Real World Experience. BMC Musculoskelet Disord 23:1111. doi:10.1186/s12891-022-06042-w\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003eTables 1 to 3 are available in the Supplementary Files section.\u003c/p\u003e"},{"header":"Supplementary Figure","content":"\u003cp\u003eSupplementary Figure 3 is not available with this version.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFigure S3:\u0026nbsp;\u003c/strong\u003eCalibration curves of the models. (a), (b): Radscore model in the training dataset(a) and test dataset(b). 
(c), (d): The combined model in the training dataset(c) and test dataset(d).\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Machine learning, CT, Thymoma","lastPublishedDoi":"10.21203/rs.3.rs-3983809/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3983809/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eObjective:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe aim of this study is to propose a method based on medical imaging and stacking ensemble learning for predicting the high- and low-risk categories of thymoma.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis retrospective study collected 126 patients with thymoma and 5 patients with thymic carcinoma treated at our institution, including 65 low-risk cases and 66 high-risk cases. Of these, 78 cases formed the training cohort and the remaining 53 cases formed the validation cohort. Radiomics features and variation features were extracted from the collected medical imaging data. 
The Mann-Whitney U test was used to identify features that differed between the two risk categories, and features with p\u0026lt;0.05 were retained. Feature selection was then performed using LASSO regression, after which the ten features with the highest discriminative potential were selected using the SelectKBest method. Using stacked ensemble learning, we combined three machine learning algorithms to provide an efficient and reliable solution for risk prediction of thymoma.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA total of 54 features were identified as the most discriminative for low-risk and high-risk thymoma and were used to develop the radiomics signature. Our model successfully distinguished patients with low-risk thymoma from those with high-risk thymoma. For the radiomics model, the AUCs in the training and validation cohorts were 0.999 (95% CI, 0.988-1.000) and 0.967 (95% CI, 0.916-1.000), respectively. For the nomogram, the corresponding values were 0.999 (95% CI, 0.996-1.000) and 0.983 (95% CI, 0.990-1.000).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study describes the application of CT-based radiomics in thymoma patients and proposes a clinical decision nomogram that can be used to predict the risk of thymoma. 
This nomogram is advantageous for clinical decision-making concerning thymoma patients.\u003c/p\u003e","manuscriptTitle":"A machine learning based on CT radiomics signature and change value features for predicting the risk classification of thymoma","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-03-13 16:21:08","doi":"10.21203/rs.3.rs-3983809/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ea6c5724-2a52-42fa-9fd5-79b886497591","owner":[],"postedDate":"March 13th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-03-14T01:46:34+00:00","versionOfRecord":[],"versionCreatedAt":"2024-03-13 16:21:08","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3983809","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3983809","identity":"rs-3983809","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
