Predicting MGMT Methylation in Glioblastoma for Informed Clinical Decisions: An AI-Driven Approach in Resource-Limited Settings

doi:10.21203/rs.3.rs-4644889/v1

2024 · doi:10.21203/rs.3.rs-4644889/v1

preprint OA: closed CC-BY-4.0

Felipe Cicci Farinha Restini, Tarraf Torfeh, Souha Aouadi, Rabih Hammoud, and 15 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-4644889/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 14 Nov, 2024; the published version is available in Scientific Reports. Version 1 posted 10

Abstract

Background

Glioblastoma is an aggressive brain cancer with a poor prognosis. MGMT (O6-methylguanine-DNA methyltransferase) gene methylation status is crucial for treatment stratification, yet economic constraints often limit access to testing. This study aims to develop an artificial intelligence (AI) framework for predicting MGMT methylation status.

Methods

Machine learning (ML) and deep learning (DL) techniques were applied to diagnostic MR images from the NIH and a private institution. The images were segmented according to ESTRO-ACROP 2016 guidelines for radiotherapy treatment volumes and combined with clinical evaluations from neuroradiology experts. Radiomic features (quantitative) and clinical impressions (qualitative) were extracted for ML models. Feature selection methods were used to identify relevant phenotypes for training and validation with ML classifiers.

Results

We evaluated 100 patients from the NIH and 46 patients from a local institution. A total of 343 features were extracted. Eight feature selection methods produced seven independent predictive frameworks. The top-performing ML models included Recursive Feature Elimination (RFE) combined with Linear Discriminant Analysis (LDA) (accuracy of 0.75). The DL approach achieved an accuracy of 0.74 using convolutional networks.

Conclusion

This study demonstrates that integrating clinical and radiotherapy-derived AI-driven phenotypes can accurately predict MGMT methylation. The framework also addresses constraints that limit access to molecular diagnosis.

Biological sciences/Cancer; Biological sciences/Computational biology and bioinformatics; Health sciences/Health care; Health sciences/Medical research; Health sciences/Oncology

Introduction

Glioblastoma is the most common malignant primary brain cancer in adulthood 1. Despite treatment advances, prognosis remains poor, with a median overall survival (OS) ranging from 9 to 15 months 2,3. Patients with methylation of the MGMT (O6-methylguanine-DNA methyltransferase) gene have been shown to have improved survival rates due to a better response to alkylating agents such as temozolomide 3,4. Age is another important prognostic factor, and median survival in patients over 65 years can be as low as 4 to 5 months 5.
Recent studies have investigated tailored strategies for the elderly, such as chemotherapy alone or hypofractionation 6. However, MGMT methylation status remains a key factor for guiding treatment allocation 6–8. In low- and middle-income regions, stratifying glioblastoma treatment by MGMT methylation status is frequently prevented by economic limitations that affect the entire spectrum of cancer care. A stark illustration of this challenge is access to radiotherapy (RT) in Brazil, where only 15.9% of patients start RT within the recommended 30 days after diagnosis 9. Integrating MGMT methylation status into clinical decision-making could improve treatment planning and the allocation of medical resources.

Artificial Intelligence (AI) applications are increasingly used to predict MGMT methylation status 10, which may help narrow the gap in healthcare accessibility. Recent reports show improved predictive accuracy (ACC) for these algorithms, with the area under the receiver operating characteristic curve (AUC) rising from 0.67 to 0.87, a promising development for precision-oriented management of glioblastoma in resource-constrained settings 11,12. Nonetheless, most AI studies rely on costly, complex imaging acquisition protocols to predict diagnostic test results such as MGMT status. They also mostly use non-standardized volume of interest definitions 13,14, which may hinder practical application and increase variability in results. Hence, the objective of this study was to create an AI model that uses readily available diagnostic images, such as T1 and FLAIR MRI, to predict MGMT test results.
The imaging data were extracted from predefined regions of interest following standardized tumor definition protocols and combined with clinical and neuroradiological assessments.

Methods

Data Collection

This study utilized two data sources. The training cohort's data were retrieved from The Cancer Imaging Archive's UPENN-GBM public database 15. The validation cohort comprised GBM patients from a single Brazilian institution, with inclusion criteria of available cranial MRI and MGMT data. The diagnostic T1-weighted with gadolinium (T1GD) and FLAIR MR images were evaluated by radiation oncologists (tumor delineation) and neuroradiologists (standardized radiological interpretation). The clinical features acquired from medical records were age, sex, MGMT status (defined as TARGET, for supervised learning purposes), and IDH status.

This study was conducted per the principles of the Declaration of Helsinki and was submitted to and approved by the Brazilian National Health Council through the Brazil platform with the identifier 63591922.6.0000.5461. As this study involved a retrospective analysis of hospital database records, a waiver of Informed Consent was requested and approved by the local bioethics committee.

RadF extraction

For radiomic feature (RadF) extraction, Volumes of Interest (VOI) were created by contouring according to a modified version of the ESTRO-ACROP 2016 guideline for target delineation in radiation therapy 16. Contours were drawn in the Eclipse Treatment Planning System (Varian - Siemens Healthineers, Palo Alto, USA), and a 256x256 matrix size was used. We defined three distinct VOIs using two MR sequences. On the T1GD sequence, we first outlined the gross tumor volume (GTV), identified by the contrast-enhancing lesion, referred to as T1VOL. We then constructed a second VOI by extending the boundaries of the T1VOL symmetrically by 2 cm.
This expansion was adjusted to respect anatomical barriers and to encompass all pathological changes evident in the FLAIR sequence, thus defining the T1_FLAIR VOI. The third VOI, termed FLAIR, was drawn to mirror the dimensions and configuration of the T1_FLAIR VOI, yet delineated exclusively on the FLAIR sequence. The DICOM images and the VOIs (as RT structure sets) were exported to 3D Slicer for RadF extraction 17. We used the PyRadiomics 18 extension to extract 150 radiomic phenotypes from each VOI. The RadF included shape (26 features), first-order statistics (19 features), and textural features (75 features). Shape features are related to the geometric properties of the tumor. First-order features describe the distribution of the tumor intensity. Texture features represent the heterogeneity of the tumor and were extracted from the gray level co-occurrence matrix (GLCM) 19, gray level run length matrix (GLRLM) 20, gray level size zone matrix (GLSZM) 21, neighboring gray-tone difference matrix 19, and gray level dependence matrix (GLDM) 22. The detailed calculation of these RadF is available on the PyRadiomics website 18 (https://pyradiomics.readthedocs.io/en/latest/features.html). A bin width of 25 was used for gray-level discretization before texture calculation. Preprocessing filters, including Laplacian-of-Gaussian and wavelet, were also applied.

Categorical Neuroradiology Evaluation

Four expert neuroradiologists evaluated the same images following a guideline for categorical impressions of the images (Table S1 in Supplementary Appendix).

Data Frame Structuring

The clinical data, RadF, and categorical assessments were unified into a spreadsheet, forming a comprehensive data frame. This data frame was preprocessed using Python 3.6 on the Google Colab platform. Continuous variables were normalized using the MinMax scaling technique 23, while categorical variables were transformed via OneHotEncoding 24, both from the scikit-learn library.
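
The two transformations named above can be sketched in a few lines. The study used scikit-learn's implementations; the following is only a minimal standard-library illustration of what min-max scaling and one-hot encoding do, and the sample values and the helper names `min_max_scale` and `one_hot` are hypothetical, not taken from the study.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Turn categorical labels into 0/1 indicator vectors.

    Returns the sorted category names and one indicator row per label.
    """
    categories = sorted(set(labels))
    rows = [[1 if label == cat else 0 for cat in categories] for label in labels]
    return categories, rows


# Hypothetical example values (not study data).
ages = [45.0, 63.7, 57.5, 80.0]
laterality = ["Right", "Left", "Left", "Right"]

scaled_ages = min_max_scale(ages)
categories, encoded = one_hot(laterality)

print(scaled_ages)
print(categories, encoded)
```

Scaling puts continuous radiomic values on a common range, and one-hot encoding avoids imposing an artificial order on categories such as laterality.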
Subsequently, this data frame was divided into two: the Public and Private data frames, corresponding to the training and validation databases, respectively. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) 25 was employed to upsample the Public data frame.

Features Selection

The structured data frame underwent various filtering techniques to identify the most pertinent features for predicting MGMT methylation. In univariate filters, the ANOVA F-test was applied to continuous variables and the Chi-Squared test to categorical variables, with the p-value threshold as a varying hyperparameter set at 0.1, 0.05, and 0.01 14. Bonferroni corrections for multiple comparisons were also applied in the univariate filters 26. For wrapper methods, Recursive Feature Elimination (RFE) 27 was utilized. In embedded methods, we employed the Least Absolute Shrinkage and Selection Operator (LASSO) with an L1 penalty, alongside a LightBoost regressor 28,29. Additionally, for dimensionality reduction, Principal Component Analysis (PCA) was used to generate principal components encompassing 95% of the variance 30. These processes resulted in new data frames, each corresponding to one of the supervised machine-learning experiments to be conducted.

Machine Learning Experiments

Machine learning (ML) and Deep Learning (DL) techniques were compared. For ML, each experiment was associated with a distinct filtering method, as depicted in Fig. 1. All experiments were executed within the PyCaret 31 environment, enabling a concurrent assessment of various classifiers.
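
As an aside on the univariate filters described under Features Selection, the effect of the p-value threshold and of the Bonferroni correction can be sketched as follows. This is a minimal illustration and not the authors' code: the p-values are invented, the function name `select_by_p_value` is hypothetical, and the number of tests is assumed to equal the 343 features reported in the abstract.

```python
def select_by_p_value(p_values, alpha, n_tests=None):
    """Keep features whose p-value is at or below the threshold.

    If n_tests is given, the Bonferroni correction is applied by
    dividing alpha by the number of tests performed.
    """
    threshold = alpha if n_tests is None else alpha / n_tests
    return [name for name, p in p_values.items() if p <= threshold]


# Invented p-values for five hypothetical features (not study data).
p_values = {
    "feat_a": 0.00001,
    "feat_b": 0.004,
    "feat_c": 0.03,
    "feat_d": 0.08,
    "feat_e": 0.40,
}

# Uncorrected filters at the three thresholds used as hyperparameters.
selected = {alpha: select_by_p_value(p_values, alpha) for alpha in (0.1, 0.05, 0.01)}

# Bonferroni-corrected filter, assuming 343 tests (one per feature).
selected_bonferroni = select_by_p_value(p_values, 0.05, n_tests=343)

print(selected)
print(selected_bonferroni)
```

With 343 tests the corrected threshold at alpha = 0.05 is about 0.00015, which shows how sharply the correction shrinks the selected set compared with the uncorrected thresholds.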
A total of eighteen binary classifiers were tested: linear and quadratic discriminant analysis, tree-based classifiers (extra trees, decision trees, random forests), ridge classifier, MLP classifier, boosting algorithms (gradient boosting, ada boost, extreme gradient boosting, light gradient boosting machine, cat boost), linear and radial support vector machines, logistic regression, k-nearest neighbors classifier, Naive Bayes, and Gaussian process classifier. A standard setup of pre-processing parameters was maintained across all experiments. The Public dataset was used for training and testing the models, while the Private dataset was reserved as a validation set and was therefore not involved in the modeling process (unseen data). For model construction, the data were divided in a 70:30 ratio, resulting in x_train, y_train, x_test, and y_test subsets. Here, 'x' represents the input variables used by the algorithm for predicting 'y', the TARGET variable. The efficacy of the models was assessed using 10-fold stratified K-fold cross-validation. To address multicollinearity, a threshold of 0.80 was set on the absolute Pearson correlation: any column correlated with another column at or above this threshold was removed. The top five algorithms, based on their initial performance comparison, underwent hyperparameter tuning using the Optuna library 32 where applicable. These models were refined using ensemble bagging and either blending or stacking techniques and evaluated with 10-fold cross-validation. In stacking, the predictions of the base models are provided as features for the meta-model. The best-performing model was selected based on ACC and AUC metrics and then applied to the unseen data to assess its robustness and effectiveness in handling new datasets.
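
The multicollinearity rule above (absolute Pearson correlation of 0.80 or more) can be illustrated with a short standard-library sketch. It is a simplification under stated assumptions: the study relied on PyCaret's built-in handling, whereas this version simply keeps the first column of each correlated pair; the column names, the values, and the helpers `pearson` and `drop_correlated` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sd_x * sd_y)


def drop_correlated(columns, threshold=0.8):
    """Keep a column only if |r| with every already-kept column is below threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical feature columns (not study data).
columns = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface": [2.1, 4.0, 6.2, 7.9, 10.1],  # nearly proportional to volume
    "entropy": [3.0, 1.0, 4.0, 1.0, 5.0],
}

kept = drop_correlated(columns, threshold=0.8)
print(list(kept))
```

Which member of a correlated pair is dropped is a design choice; keeping the first-seen column is the simplest option, and other tools may instead keep the column more strongly related to the target.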

Implementation of Deep Learning Workflow

Data Frame Structuring

For Deep Learning (DL) methodologies, the training dataset included 100 MR studies (T1 and T2 FLAIR) of GBM patients from the public UPENN-GBM dataset. Only slices containing structure sets were used to train the models. To increase the size of the training set and make the model more robust to variations in the data, data augmentation was applied to the original training images by performing random rotations of ± 7 degrees and translations of ± 2 mm. The model's output is a probability that ranges from 0 to 1, for unmethylated and methylated, respectively. After training, the models were tested using a different set of 46 images of GBM patients from the private dataset.

DL Experiments

For the DL experiments, two models were used to predict the output. The first model, shown in Fig. 2, consists of three convolutional layers with adjustable filter sizes, three max-pooling layers for downsampling, a flattening layer, and two dense layers with adjustable units. In the second model, we added two additional convolutional layers and an additional pooling layer after each convolutional layer. This increases the depth of the model, allowing it to capture more complex features and patterns in the images. The loss function adopted for training was binary cross-entropy, and the optimizer used was ADAM (Adaptive Moment Estimation). For all models, a thorough hyperparameter tuning process was conducted using the Keras Tuner library. The best hyperparameters, including the number of filters, the number of units in the densely connected layer, the learning rate, the batch size, and the number of training epochs, were determined through a random search with 10 trials.

Statistical Analyses

For the primary endpoint assessment and validation of model metrics, each model's performance was evaluated based on its ACC and the quality of its Receiver Operating Characteristic (ROC) curve.
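
For readers less familiar with these two metrics, the sketch below shows how ACC and the area under the ROC curve can be computed from predicted probabilities. It is a standard-library illustration only: the labels and probabilities are invented, the 0.5 decision threshold is an assumption, and the AUC is computed through its rank interpretation (the probability that a randomly chosen methylated case scores higher than a randomly chosen unmethylated one).

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded probability matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for pred, truth in zip(predictions, y_true) if pred == truth)
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [p for p, t in zip(y_prob, y_true) if t == 1]
    negatives = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented labels (1 = methylated) and model probabilities, not study data.
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.4, 0.3]

acc = accuracy(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(round(acc, 3), round(auc, 3))
```

ACC depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are reported together.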
In exploratory analysis, the t-test and Mann-Whitney U test were applied to continuous variables where applicable. For categorical variables, Fisher's exact test and the Chi-squared test were performed. Statistical analyses were performed using SPSS version 20 and Python 3.6; a two-sided p-value of 0.05 or less was considered statistically significant.

Results

Patient Characteristics

From July 16, 2022, to December 15, 2022, 146 patients were selected for further analysis: one hundred from the UPENN-GBM database (training set) and 46 from the private institution (validation set). The median age was 63.7 years for the training set and 57.5 years for the validation set. Table 1 presents baseline characteristics.

Table 1. Clinical and radiological characteristics of patients, tumor characteristics, and methylation status from both sets. Values are n (%).

Feature / Description                                            Training     Validation
Tumor Location
  Type 1 (in contact with subventricular zone (SVZ) and cortex)  65 (65)      21 (45.7)
  Type 2 (contact with SVZ but NOT cortex)                       4 (4)        6 (13)
  Type 3 (contact ONLY with cortex)                              31 (31)      17 (37)
  NA                                                             NA           2 (4.3)
Laterality
  Right                                                          45 (45)      16 (34.8)
  Left                                                           48 (48)      24 (52.2)
  NA                                                             7 (7)        6 (13)
Well Defined Borders
  Yes                                                            18 (18)      19 (41.3)
  No                                                             82 (82)      27 (58.7)
  NA                                                             NA           NA
Multifocality
  Yes                                                            28 (28)      9 (19.6)
  No                                                             72 (72)      35 (76.1)
  NA                                                             NA           2 (4.3)
Midline Crossing
  Yes                                                            32 (32)      13 (28.3)
  No                                                             68 (68)      31 (67.4)
  NA                                                             NA           2 (4.3)
Greatest Dimension of Contrast Enhancing Lesion
  < 1.5 cm                                                       5 (5)        3 (6.5)
  ≥ 1.5 cm                                                       95 (95)      40 (87)
  NA                                                             NA           3 (6.5)
IDH 1/2 Status
  Wildtype                                                       100 (100)    42 (91.3)
  Mutated                                                        0 (0)        4 (8.7)
Gender
  Female                                                         41 (41)      29 (63)
  Male                                                           59 (59)      17 (37)
MGMT Status
  Unmethylated                                                   61 (61)      30 (65.2)
  Methylated                                                     39 (39)      16 (34.8)

Feature Extraction

Using PyRadiomics, 450 radiomic features (RadF) were extracted and narrowed down to 330 after eliminating location-specific and redundant features. These were merged with 13 categorical and clinical characteristics.
To balance the dataset, SMOTE generated 22 synthetic cases, equalizing the proportions of MGMT-methylated and unmethylated patients. The eight filtering methods, illustrated in Figure S1 of the Supplementary Appendix, yielded diverse results: PCA identified 24 principal components; the ANOVA F-test and Chi-squared tests at varied p-values isolated 1, 8, 15, and 44 features; LASSO regression and LightBoost selected 21 and 74 features, respectively; and RFE selected 17 features.

Algorithms

Across eight experiments, 105 algorithms were developed, and the best from each was selected, yielding seven top performers detailed in Table S3 of the Supplementary Appendix. Access to the datasets and code is also provided.

Best Performance in Training and Validation set

The best performance on the Public dataset, with an AUC of 0.78 and ACC of 0.75, was achieved using RFE with a stacked estimator, specifically Logistic Regression (see Fig. 3). When applied to the validation set, this model reached an AUC of 0.62 and an ACC of 0.71. Conversely, the best performance on the unseen dataset, with an AUC of 0.77 and ACC of 0.76, was obtained using a Ridge Classifier after LASSO feature selection. This exceeded the same model's results on the Public dataset, where it achieved an AUC of 0.70 and ACC of 0.65.

Secondary Analyses

We investigated associations between MGMT methylation and clinical factors such as tumor characteristics and patient age, detailed in Table 2. Neuroradiology expert opinions were consistent across categorical analyses, with no significant variance (Fisher's exact p > .05).

Table 2. Significance of clinical variables, with odds ratios and 95% confidence intervals.

Variable                   p-value    Odds Ratio   CI 95%
SVZ + Cortical             1.00       1.01         0.54–1.87
SVZ                        0.04       0.12         0.01–0.97
Cortical                   0.67       1.22         0.64–2.32
Intermediate Necrosis      0.26       1.64         0.78–3.45
Severe Necrosis            0.16       0.59         0.31–1.15
Right Hemisphere           0.09       1.80         0.97–3.35
Left Hemisphere            < 0.001    0.27         0.14–0.52
Well defined borders       < 0.001    0.19         0.08–0.47
Unifocal                   0.03       0.44         0.22–0.87
Midline crossing           0.47       1.35         0.70–2.60
Contrast Enhancement
  Nodular                  0.51       0.57         0.14–2.38
  Patchy                   0.43       1.30         0.70–2.45
  Ring                     0.16       0.63         0.34–1.15
Hemorrhage                 0.09       1.76         0.94–3.29
Cystic                     1.00       0.78         0.13–4.81
Size greater than 1.5 cm   0.77       0.84         0.26–2.70

SVZ: subventricular zone.

Regarding continuous variables, 42 variables showed a significant association with MGMT status. The interrelationships among these variables were assessed and are illustrated in a heatmap (Fig. 4). Spearman's test showed that, out of approximately 1,500 pairwise interactions, only 25 were significant (p-value < 0.05), with correlation coefficients from 0.96 to 1.0; four of these were negative correlations (-0.93 to -0.94). Because so few pairs were strongly correlated, most phenotype relationships were weak to moderate. For the description of each variable, refer to Table S2 in the Supplementary Appendix.

DL models

After training the models on the public dataset, the best hyperparameters obtained for the first custom-developed CNN model were 48, 112, and 256 filters for the first, second, and third convolution layers, respectively; 80 units for the densely connected layer; a learning rate of 0.0001; a batch size of 16; and 10 training epochs. The best ACC achieved was 69%, and the precision was 70%. Figure 5 illustrates the ACC and loss graphs created during the training and testing phases. For the second custom-developed CNN model, the best hyperparameters were: 16, 112, 224, 192, and 768 filters for the first, second, third, fourth, and fifth convolution layers respectively.
The remaining best hyperparameters for the second model were 112 units for the densely connected layer, a learning rate of 0.0001, a batch size of 32, and 20 training epochs. The best ACC achieved was 74%, with a precision of 75%. Figure 6 illustrates the ACC and loss graphs created during the training and testing phases for the second DL model. When evaluated on unseen data, the first model achieved an ACC of 62% and a precision of 54%, while the second model achieved an ACC of 70% and a precision of 67%.

Discussion

The present study provides a framework for predicting MGMT methylation by applying AI techniques to diagnostic MRIs together with clinical information. The final best algorithm reached 0.75 ACC and 0.78 AUC, with 0.83 sensitivity (Fig. 3). The results showed notable variations in performance across experiments, with the primary distinction arising from the feature selection methods employed. This underscores how strongly pre-processing steps influence overall predictive performance.

The ESTRO-ACROP guideline was adapted to create the VOIs from which the RadF were extracted 16, in an effort to make this framework reproducible (Fig. 1). This contrasts with previous research, which mostly relied on experts' qualitative judgment when defining disease areas 10,33. Moreover, this study intentionally used only two MRI sequences that are generally already acquired for diagnosis, to align with constraints in low-resource settings. While many high-performance studies employ a variety of imaging types, such an approach may not be feasible in environments with limited resources.

The neuroradiologist interpretation, together with other clinical features, was intentionally combined with AI-derived features (RadF), since previous studies demonstrated that this increases the predictive performance of AI algorithms 11,34.
Additionally, we observed these features in all of the final best algorithms; we therefore hypothesize that the clinical perspective is relevant for algorithm engineering.

The use of multiple filters in the pre-processing phase is another strong point of this research. Typically, these models use only a small subset of the features after applying filtering techniques 13,14,33,35. However, most studies used only one or a few methods for feature filtering, potentially overlooking the value of other filter methods 14. This raised the question of the best number of features to use (Figure S1, Supplementary Appendix). The Mann-Whitney U test found 42 RadFs significantly linked to MGMT methylation. However, experiments with this number of features led to poor performance. Common statistical tests such as ANOVA and Chi-Squared did not control type I errors well, resulting in many false positives and poor performance. With the Bonferroni correction, only one variable was selected, highlighting its conservative nature and its high risk of type II errors (excluding significant variables), a finding already reported in previous genome-wide analyses 36 (Figure S2, Supplementary Appendix). In other words, these results indicate that common statistical methods do not provide the optimal balance between the risks of type I and type II errors, leading to inaccurate outcomes.

Additionally, most correlations between features were weak to moderate according to Spearman's test. This adds a layer of complexity, especially since high correlations ultimately lead to multicollinearity in AI algorithm development 37. This condition can make it difficult to predict the impact of these variables and can cause overfitting 38, reducing the algorithm's ability to generalize to new data.
The performance decline of the algorithms on unseen data, although expected, could also be related to notable differences in patient characteristics between the two cohorts (100 public database patients vs. 46 from private data). These populations may differ significantly, affecting algorithm effectiveness. A larger sample could balance these disparities, ensuring a more representative comparison and potentially stabilizing algorithm performance.

Another point to explore is the difference between DL and ML performance. There was a marginal difference in ACC: 0.74 vs 0.75 for DL and ML, respectively. DL has recently become an important approach within ML and, owing to its ability to handle massive amounts of data, has achieved outstanding performance on complex tasks 39. In the present study, a CNN strategy was used to predict MGMT status; this approach is known to handle large numbers of images well and is thus suited to this task 40. It is very likely that, with more patients, the predictive power of DL would surpass that of ML; however, our results cannot confirm this hypothesis.

Despite these challenges, the algorithm showed commendable performance while using minimal resources, namely a small number of MRI sequences and routinely available clinical data. Enhancing this framework while maintaining its simplicity could be crucial for enabling reproducibility in low- to middle-income countries, suggesting that strategic improvements could broaden its applicability without sacrificing accessibility.

Conclusion

This research highlights the potential of an affordable AI tool to enhance clinical decision-making in oncology, where molecular characteristics of tumors are crucial. It shows AI's ability to improve clinical outcomes through precision medicine, even in resource-constrained settings, advancing equitable healthcare in oncology.
Declarations Disclaimers: The content of this article represents the views of the involved authors and not the official positions of their respective institutions. Source of Support: This study did not receive any grants or financial support. Data Availability Statement: This project will share most of the data generated to formulate the presented results. Additionally, the code used for algorithm development will be available online for public access. Author Names, Degrees, and Contributions Felipe Cicci Farinha Restini: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. Tarraf Torfeh: Resources, Software, Supervision, Validation, Visualization, Writing – original draft. Souha Aouadi: Resources, Software, Supervision, Validation, Visualization, Writing – original draft. Rabih Hammoud: Supervision. Noora Al-Hammadi: Supervision. Maria Thereza Mansur Starling: Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing. Cecília Felix Penido Mendes Souza: Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing. Anselmo Mancini: Software. Leticia Hernandes Brito: Methodology. Fernanda Hayashida Yoshimoto: Methodology. Nildevande Firmino Lima-Júnior: Methodology. Marcelo Moro Queiroz: Methodology. Ula Lindoso Passos: Investigation, Methodology, Project administration, Resources. Camila Trolez Amancio: Investigation, Methodology, Project administration, Resources. Jorge Tomio Takahashi: Investigation, Methodology, Project administration, Resources. Daniel De Souza Delgado: Investigation, Methodology, Project administration, Resources. Samir Abdallah Hanna: Conceptualization, Investigation, Project administration, Resources, Supervision, Visualization, Writing – original draft. 
Gustavo Nader Marta: Conceptualization, Investigation, Project administration, Resources, Supervision, Visualization, Writing – original draft. Wellington Furtado Pimenta Neves-Junior: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

References

1. Ostrom QT, Price M, Neff C, et al. CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2016–2020. Neuro Oncol 2023;25(Supplement_4):iv1–99.
2. Brown NF, Ottaviani D, Tazare J, et al. Survival Outcomes and Prognostic Factors in Glioblastoma. Cancers [Internet] 2022 [cited 2024 Jan 20];14(13):3161. Available from: https://www.mdpi.com/2072-6694/14/13/3161
3. Stupp R, Mason WP, van den Bent MJ, et al. Radiotherapy plus Concomitant and Adjuvant Temozolomide for Glioblastoma. New England Journal of Medicine [Internet] 2005 [cited 2023 Apr 26];352(10):987–96. Available from: https://doi.org/10.1056/NEJMoa043330
4. Hegi ME, Diserens A-C, Gorlia T, et al. MGMT Gene Silencing and Benefit from Temozolomide in Glioblastoma. New England Journal of Medicine [Internet] 2005 [cited 2023 May 21];352(10):997–1003. Available from: https://doi.org/10.1056/NEJMoa043331
5. Marta GN, Moraes FY, Feher O, et al. Social determinants of health and survival on Brazilian patients with glioblastoma: a retrospective analysis of a large populational database. The Lancet Regional Health – Americas [Internet] 2021 [cited 2024 Jan 20];4. Available from: https://www.thelancet.com/journals/lanam/article/PIIS2667-193X(21)00062-4/fulltext
6. Perry JR, Laperriere N, O’Callaghan CJ, et al. Short-Course Radiation plus Temozolomide in Elderly Patients with Glioblastoma. N Engl J Med 2017;376(11):1027–37.
7. Wick W, Platten M, Meisner C, et al. Temozolomide chemotherapy alone versus radiotherapy alone for malignant astrocytoma in the elderly: the NOA-08 randomised, phase 3 trial. Lancet Oncol 2012;13(7):707–15.
8. Rivera AL, Pelloski CE, Gilbert MR, et al. MGMT promoter methylation is predictive of response to radiotherapy and prognostic in the absence of adjuvant alkylating chemotherapy for glioblastoma. Neuro Oncol [Internet] 2010 [cited 2024 Jan 20];12(2):116–21. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2940581/
9. RT2030 - Home [Internet]. Sociedade Brasileira de Radioterapia. [cited 2023 May 21]. Available from: https://sbradioterapia.com.br/rt2030/
10. Chen S, Xu Y, Ye M, et al. Predicting MGMT Promoter Methylation in Diffuse Gliomas Using Deep Learning with Radiomics. J Clin Med [Internet] 2022 [cited 2023 May 22];11(12):3445. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9224690/
11. He J, Ren J, Niu G, et al. Multiparametric MR radiomics in brain glioma: models comparation to predict biomarker status. BMC Medical Imaging [Internet] 2022 [cited 2023 Dec 18];22(1):137. Available from: https://doi.org/10.1186/s12880-022-00865-8
12. Sasaki T, Kinoshita M, Fujita K, et al. Radiomics and MGMT promoter methylation for prognostication of newly diagnosed glioblastoma. Sci Rep [Internet] 2019 [cited 2023 Dec 18];9(1):14435. Available from: https://www.nature.com/articles/s41598-019-50849-y
13. Gómez OV, Herraiz JL, Udías JM, et al. Analysis of Cross-Combinations of Feature Selection and Machine-Learning Classification Methods Based on [18F]F-FDG PET/CT Radiomic Features for Metabolic Response Prediction of Metastatic Breast Cancer Lesions. Cancers [Internet] 2022 [cited 2023 Dec 18];14(12):2922. Available from: https://www.mdpi.com/2072-6694/14/12/2922
14. Pudjihartono N, Fadason T, Kempa-Liehr AW, O’Sullivan JM. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Front Bioinform [Internet] 2022 [cited 2023 Dec 18];2:927312. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580915/
15. The University of Pennsylvania glioblastoma (UPenn-GBM) cohort: advanced MRI, clinical, genomics, & radiomics | Scientific Data [Internet]. [cited 2024 Jan 20]. Available from: https://www.nature.com/articles/s41597-022-01560-7
16. Niyazi M, Brada M, Chalmers AJ, et al. ESTRO-ACROP guideline “target delineation of glioblastomas.” Radiother Oncol 2016;118(1):35–42.
17. Fedorov A, Beichel R, Kalpathy-Cramer J, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30(9):1323–41.
18. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 2017;77(21):e104–7.
19. Haralick RM, Shanmugam K, Dinstein I. Textural Features for Image Classification. IEEE Transactions on Systems, Man, and Cybernetics [Internet] 1973 [cited 2024 Mar 2];SMC-3(6):610–21. Available from: https://ieeexplore.ieee.org/document/4309314
20. Galloway MM. Texture analysis using gray level run lengths. Computer Graphics and Image Processing [Internet] 1975 [cited 2024 Mar 2];4(2):172–9. Available from: https://www.sciencedirect.com/science/article/pii/S0146664X75800086
21. Texture indexes and gray level size zone matrix. Application to cell nuclei classification – ScienceOpen [Internet]. [cited 2024 Mar 2]. Available from: https://www.scienceopen.com/document?vid=2c91747d-b5c9-4a39-8751-9e17e9776f22
22. Sun C, Wee WG. Neighboring gray level dependence matrix for texture classification. Computer Graphics and Image Processing [Internet] 1982 [cited 2024 Mar 2];20(3):297. Available from: https://www.sciencedirect.com/science/article/pii/0146664X82900934
23. Deepa B, Ramesh K. Epileptic seizure detection using deep learning through min max scaler normalization. ijhs [Internet] 2022 [cited 2024 Mar 2];10981–96. Available from: https://sciencescholar.us/journal/index.php/ijhs/article/view/7801
24. Applied Sciences | Free Full-Text | Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions [Internet]. [cited 2024 Mar 2]. Available from: https://www.mdpi.com/2076-3417/11/3/1291
25. Raghuwanshi BS, Shukla S. SMOTE based class-specific extreme learning machine for imbalanced learning. Knowledge-Based Systems [Internet] 2020 [cited 2024 Mar 2];187:104814. Available from: https://www.sciencedirect.com/science/article/pii/S0950705119302898
26. Genetic Epidemiology | Human Genetics Journal | Wiley Online Library [Internet]. [cited 2024 Mar 2]. Available from: https://onlinelibrary.wiley.com/doi/10.1002/gepi.20297
27. Chen X, Jeong JC. Enhanced recursive feature elimination [Internet]. In: Sixth International Conference on Machine Learning and Applications (ICMLA 2007). 2007 [cited 2024 Apr 5]. p. 429–35. Available from: https://ieeexplore.ieee.org/document/4457268
28. Feature Extraction: Foundations and Applications | SpringerLink [Internet]. [cited 2024 Mar 2]. Available from: https://link.springer.com/book/10.1007/978-3-540-35488-8
29. A systematic comparison of statistical methods to detect interactions in exposome-health associations | Environmental Health | Full Text [Internet]. [cited 2024 Mar 2]. Available from: https://ehjournal.biomedcentral.com/articles/10.1186/s12940-017-0277-6
30. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics and Intelligent Laboratory Systems [Internet] 1987 [cited 2024 Mar 2];2(1):37–52. Available from: https://www.sciencedirect.com/science/article/pii/0169743987800849
31. PyCaret - pycaret 3.0.4 documentation [Internet]. [cited 2024 Jan 21]. Available from: https://pycaret.readthedocs.io/en/latest/
32. Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: A Next-generation Hyperparameter Optimization Framework [Internet].
In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, NY, USA: Association for Computing Machinery; 2019 [cited 2024 Jan 21]. p. 2623–31.Available from: https://doi.org/10.1145/3292500.3330701 Tibshirani R. Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological) [Internet] 1996 [cited 2023 Dec 18];58(1):267–88. Available from: https://rss.onlinelibrary.wiley.com/doi/ 10.1111/j.2517–6161.1996.tb02080.x Ren J, Li Y, Yang J-J, et al. MRI-based radiomics analysis improves preoperative diagnostic performance for the depth of stromal invasion in patients with early stage cervical cancer. Insights into Imaging [Internet] 2022 [cited 2024 Mar 17];13(1):17. Available from: https://doi.org/10.1186/s13244-022-01156–0 Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. Benchmarking relief-based feature selection methods for bioinformatics data mining. Journal of Biomedical Informatics [Internet] 2018 [cited 2023 Dec 18];85:168–88. Available from: https://linkinghub.elsevier.com/retrieve/pii/S1532046418301412 Panagiotou OA, Ioannidis JPA, Genome-Wide Significance Project. What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations. Int J Epidemiol 2012;41(1):273–86. Sundus KI, Hammo BH, Al-Zoubi MB, Al-Omari A. Solving the multicollinearity problem to improve the stability of machine learning algorithms applied to a fully annotated breast cancer dataset. Informatics in Medicine Unlocked [Internet] 2022 [cited 2024 Mar 2];33:101088. Available from: https://www.sciencedirect.com/science/article/pii/S2352914822002246 Cook JA, Ranstam J. Overfitting. British Journal of Surgery [Internet] 2016 [cited 2024 Mar 2];103(13):1814. Available from: https://doi.org/10.1002/bjs.10244 Alzubaidi L, Zhang J, Humaidi AJ, et al. 
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data [Internet] 2021 [cited 2024 Apr 5];8(1):53. Available from: https://doi.org/10.1186/s40537-021-00444–8 Manakitsa N, Maraslidis GS, Moysis L, Fragulis GF. A Review of Machine Learning and Deep Learning for Object Detection, Semantic Segmentation, and Human Action Recognition in Machine and Robotic Vision. Technologies [Internet] 2024 [cited 2024 Apr 5];12(2):15. Available from: https://www.mdpi.com/2227–7080/12/2/15 Additional Declarations No competing interests reported. Supplementary Files Supplementanonymized2.docx Cite Share Download PDF Status: Published Journal Publication published 14 Nov, 2024 Read the published version in Scientific Reports → Version 1 posted Editorial decision: Revision requested 05 Aug, 2024 Reviews received at journal 31 Jul, 2024 Reviewers agreed at journal 28 Jul, 2024 Reviews received at journal 08 Jul, 2024 Reviewers agreed at journal 05 Jul, 2024 Reviewers invited by journal 04 Jul, 2024 Editor assigned by journal 03 Jul, 2024 Editor invited by journal 03 Jul, 2024 Submission checks completed at journal 02 Jul, 2024 First submitted to journal 26 Jun, 2024 You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. 
{"props":{"pageProps":{"initialData":{"identity":"rs-4644889","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":324472212,"identity":"fe9d3075-7071-4401-b31b-f49185f17d88","order_by":0,"name":"Felipe Cicci Farinha Restini","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA+UlEQVRIie3PMWsCMRTA8TwCdWm59YmFfgVv0q0fpIvuZr/BHg8KyeIHUCj6Fdrl6KgEziXgB+gSlzq7XYdCX6QUl/NuLDR/SEjC+w0RIhb7q40EJqIDJETGNympCQAx6ZIMxAUCLQhvTMJR/z7UNzDm4PdvQ0yk1P64fHxIDJMqK2rJrXMpjR1i9wlMuii2am6BYObeawniBGisMe9b0L2bolTERIK+QO4O+0DwPpCv51KtGgmK9ET6PNYDmqqXRnI9SeeBoOW/zMq1emWyufiXztYfP3WOiTEfvprmarmzG19l9eS8K172dFq3mv8hedvhWCwW+0d9AxVkWSa2aHboAAAAAElFTkSuQmCC","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":true,"prefix":"","firstName":"Felipe","middleName":"Cicci Farinha","lastName":"Restini","suffix":""},{"id":324472214,"identity":"0b9db60d-cb24-44e6-b6f2-f048be427f5b","order_by":1,"name":"Tarraf Torfeh","email":"","orcid":"","institution":"National Center for Cancer Care and Research","correspondingAuthor":false,"prefix":"","firstName":"Tarraf","middleName":"","lastName":"Torfeh","suffix":""},{"id":324472217,"identity":"5bdd3715-80ae-4e75-a408-175a6d356197","order_by":2,"name":"Souha Aouadi","email":"","orcid":"","institution":"National Center for Cancer Care and Research","correspondingAuthor":false,"prefix":"","firstName":"Souha","middleName":"","lastName":"Aouadi","suffix":""},{"id":324472218,"identity":"ce979604-af72-41ec-a2ed-24eee2734d05","order_by":3,"name":"Rabih 
Hammoud","email":"","orcid":"","institution":"National Center for Cancer Care and Research","correspondingAuthor":false,"prefix":"","firstName":"Rabih","middleName":"","lastName":"Hammoud","suffix":""},{"id":324472219,"identity":"549da922-caf9-47cd-87e8-08acf4e4f1e5","order_by":4,"name":"Noora Al-Hammadi","email":"","orcid":"","institution":"National Center for Cancer Care and Research","correspondingAuthor":false,"prefix":"","firstName":"Noora","middleName":"","lastName":"Al-Hammadi","suffix":""},{"id":324472223,"identity":"63895a6f-05da-45f7-9544-29588ba7a231","order_by":5,"name":"Maria Thereza Mansur Starling","email":"","orcid":"","institution":"London Health Sciences Centre","correspondingAuthor":false,"prefix":"","firstName":"Maria","middleName":"Thereza Mansur","lastName":"Starling","suffix":""},{"id":324472224,"identity":"ef712aa5-a42b-4746-b595-efbd9d9a9fe2","order_by":6,"name":"Cecília Felix Penido Mendes Souza","email":"","orcid":"","institution":"Johns Hopkins Bloomberg School of Public Health","correspondingAuthor":false,"prefix":"","firstName":"Cecília","middleName":"Felix Penido Mendes","lastName":"Souza","suffix":""},{"id":324472225,"identity":"8e521014-5d8d-49a7-b8d1-f5db428bd5c4","order_by":7,"name":"Anselmo Mancini","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Anselmo","middleName":"","lastName":"Mancini","suffix":""},{"id":324472226,"identity":"3322eeae-6f99-4723-8935-28185618cf64","order_by":8,"name":"Leticia Hernandes Brito","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Leticia","middleName":"Hernandes","lastName":"Brito","suffix":""},{"id":324472227,"identity":"42aae23f-c848-4842-ad76-547ddb572600","order_by":9,"name":"Fernanda Hayashida Yoshimoto","email":"","orcid":"","institution":"Hospital 
Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Fernanda","middleName":"Hayashida","lastName":"Yoshimoto","suffix":""},{"id":324472228,"identity":"d43aaa2e-8119-426a-a279-7a123be6c6bd","order_by":10,"name":"Nildevande Firmino Lima-Júnior","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Nildevande","middleName":"Firmino","lastName":"Lima-Júnior","suffix":""},{"id":324472229,"identity":"a9dc56a9-361a-44cb-a4b9-d202ee9f9236","order_by":11,"name":"Marcelo Moro Queiroz","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Marcelo","middleName":"Moro","lastName":"Queiroz","suffix":""},{"id":324472230,"identity":"a3fcc374-9639-4dfa-9fba-75b068c3a0e0","order_by":12,"name":"Ula Lindoso Passos","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Ula","middleName":"Lindoso","lastName":"Passos","suffix":""},{"id":324472231,"identity":"a0986777-b0a3-4ef9-b0e6-e507d1d0cb2f","order_by":13,"name":"Camila Trolez Amancio","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Camila","middleName":"Trolez","lastName":"Amancio","suffix":""},{"id":324472232,"identity":"06d60fa6-47cd-4526-8c25-781153f2643f","order_by":14,"name":"Jorge Tomio Takahashi","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Jorge","middleName":"Tomio","lastName":"Takahashi","suffix":""},{"id":324472235,"identity":"7b04ea87-54b6-4d30-b500-cf1b48674345","order_by":15,"name":"Daniel De Souza Delgado","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Daniel","middleName":"De Souza","lastName":"Delgado","suffix":""},{"id":324472239,"identity":"1b8feae0-68f2-4127-a670-48a624b62dde","order_by":16,"name":"Samir Abdallah 
Hanna","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Samir","middleName":"Abdallah","lastName":"Hanna","suffix":""},{"id":324472240,"identity":"16fb6c7c-faa8-4f31-a67c-a8e392ed0f83","order_by":17,"name":"Gustavo Nader Marta","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Gustavo","middleName":"Nader","lastName":"Marta","suffix":""},{"id":324472241,"identity":"52412764-e105-432f-a7ec-59bfcbf673e3","order_by":18,"name":"Wellington Furtado Pimenta Neves-Junior","email":"","orcid":"","institution":"Hospital Sírio-Libanês","correspondingAuthor":false,"prefix":"","firstName":"Wellington","middleName":"Furtado Pimenta","lastName":"Neves-Junior","suffix":""}],"badges":[],"createdAt":"2024-06-26 21:22:13","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4644889/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4644889/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1038/s41598-024-78189-6","type":"published","date":"2024-11-14T15:58:16+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":61185003,"identity":"4e0bb03b-c555-4dde-9391-ccd4f733da7a","added_by":"auto","created_at":"2024-07-26 17:20:11","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":178360,"visible":true,"origin":"","legend":"\u003cp\u003eWorkflow summary demonstrating the VOI's delineation by experts from the radiation oncology field using T1GD and Flair weighted sequences, the neuro-radiology assessment was carried together in this phase. Subsequently, these images together with the created VOI's underwent RadF extraction. The data was then generated. 
Different ML experiments were carried out and compared with DL experiments.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/690770b2c8dad4fc5b82619d.png"},{"id":61182216,"identity":"95dce216-01e3-4466-b596-1b8285ec9acd","added_by":"auto","created_at":"2024-07-26 16:56:11","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":88104,"visible":true,"origin":"","legend":"\u003cp\u003eDeep Learning experiment design using convolution matrix to reach the clinical outcome.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/def0e441a56c758843fe791f.png"},{"id":61182223,"identity":"d7ef26c4-381c-4e1b-8656-01a1d2e99267","added_by":"auto","created_at":"2024-07-26 16:56:11","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":76541,"visible":true,"origin":"","legend":"\u003cp\u003eROC curve and performance summary of the best classifier.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/a8e5c0097d5dda07b4eee7f6.png"},{"id":61182218,"identity":"f79f9e82-ce5c-486f-acde-4f79ee3bcc73","added_by":"auto","created_at":"2024-07-26 16:56:11","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":92430,"visible":true,"origin":"","legend":"\u003cp\u003eHeatmap with the correlation between variables in the third quartile of strength regarding both positive and negative correlations.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eTo access the description of each variable, refer to Table S2 in the Supplement 
Appendix.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/5fe042523c34f3fd0fc63786.png"},{"id":61182221,"identity":"0ea04100-1dcc-45df-9038-6de61583cd03","added_by":"auto","created_at":"2024-07-26 16:56:11","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":78309,"visible":true,"origin":"","legend":"\u003cp\u003eAccuracy and loss graphs for the first CNN model.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/a65524323719c896ef8facc2.png"},{"id":61183436,"identity":"f038eba8-9b2c-4985-9d1a-37b94be1286b","added_by":"auto","created_at":"2024-07-26 17:04:11","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":91687,"visible":true,"origin":"","legend":"\u003cp\u003eAccuracy and loss graphs for the second CNN model.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/ef2264c234190b55b97a419b.png"},{"id":69285207,"identity":"9e1aff67-ad82-4d7c-81a9-ab0ba5792399","added_by":"auto","created_at":"2024-11-18 19:24:37","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1350290,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/421c6435-fe3a-495e-81a7-edb4269c50ca.pdf"},{"id":61182222,"identity":"7e524314-af12-4096-ab7e-b55a4628016a","added_by":"auto","created_at":"2024-07-26 16:56:11","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":173749,"visible":true,"origin":"","legend":"","description":"","filename":"Supplementanonymized2.docx","url":"https://assets-eu.researchsquare.com/files/rs-4644889/v1/394e0ab3aa949a735770c94c.docx"}],"financialInterests":"No competing 
interests reported.","formattedTitle":"Predicting MGMT Methylation in Glioblastoma for Informed Clinical Decisions: An AI-Driven Approach in Resource-Limited Settings","fulltext":[{"header":"Introduction","content":"\u003cp\u003eGlioblastoma is the most common malignant primary brain cancer in adulthood\u003csup\u003e1\u003c/sup\u003e. Despite treatment advances, prognosis remains poor, with a median overall survival (OS) ranging from 9 to 15 months\u003csup\u003e2,3\u003c/sup\u003e. Patients with methylation of the MGMT (O6-methylguanine-DNA methyltransferase) gene have been shown to have improved survival rates due to a better response to alkylating agents such as temozolomide\u003csup\u003e3,4\u003c/sup\u003e. Age is another important prognostic factor, and median survival in patients over 65 years can be as low as 4 to 5 months\u003csup\u003e5\u003c/sup\u003e. Recent studies have investigated tailored strategies for the elderly, such as chemotherapy alone or hypofractionation\u003csup\u003e6\u003c/sup\u003e. However, MGMT methylation status remains a key factor for guiding treatment allocation\u003csup\u003e6\u0026ndash;8\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eIn low- and middle-income regions, stratifying glioblastoma treatment by MGMT methylation status is frequently hindered by economic limitations that affect the entire spectrum of cancer care. A stark illustration of this challenge is access to radiotherapy (RT) in Brazil, where only 15.9% of patients initiate RT within the recommended 30 days post-diagnosis\u003csup\u003e9\u003c/sup\u003e. 
Integrating MGMT methylation status into clinical decision-making could optimize treatment planning and support judicious allocation of medical resources.\u003c/p\u003e \u003cp\u003eArtificial intelligence (AI) applications are increasingly used to predict MGMT methylation status\u003csup\u003e10\u003c/sup\u003e, which could help narrow gaps in healthcare access. Recent reports show improved predictive accuracy (ACC) for these AI algorithms, with the area under the receiver operating characteristic curve (AUC) rising from 0.67 to 0.87. Such advances are promising for the precision-oriented management of glioblastoma in resource-constrained settings\u003csup\u003e11,12\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eNonetheless, most AI studies rely on costly, complex imaging acquisition protocols to predict diagnostic test results such as MGMT status. They also mostly use non-standardized volume of interest definitions\u003csup\u003e13,14\u003c/sup\u003e, which may hinder practical application and increase variability in results.\u003c/p\u003e \u003cp\u003eHence, the objective of this study was to create an AI model that uses readily available diagnostic images, such as T1 and FLAIR MRI, to predict MGMT test results. The imaging data were extracted from predefined regions of interest following standardized tumor definition protocols and combined with clinical and neuroradiological assessments.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eData Collection\u003c/h2\u003e \u003cp\u003eThis study used two data sources. The training cohort's data were retrieved from The Cancer Imaging Archive's UPENN-GBM public database\u003csup\u003e15\u003c/sup\u003e. 
The validation cohort comprised GBM patients from a single Brazilian institution; the inclusion criteria were available cranial MRI and MGMT data.\u003c/p\u003e \u003cp\u003eThe diagnostic gadolinium-enhanced T1-weighted (T1GD) and FLAIR MR images were evaluated by radiation oncologists (tumor delineation) and neuroradiologists (standardized radiological interpretation). The patients' clinical features acquired from medical records were age, sex, MGMT status (defined as TARGET, for supervised learning purposes), and IDH status.\u003c/p\u003e \u003cp\u003eThis study was conducted per the principles of the Declaration of Helsinki and was submitted to and approved by the Brazilian National Health Council through the Brazil platform with the identifier 63591922.6.0000.5461. As this study involved a retrospective analysis of hospital database records, a waiver of informed consent was requested and approved by the local bioethics committee.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eRadF extraction\u003c/h2\u003e \u003cp\u003eFor radiomic feature (RadF) extraction, Volumes of Interest (VOI) were created by contouring following a modified version of the ESTRO-ACROP 2016 guideline for target delineation in radiation therapy treatment\u003csup\u003e16\u003c/sup\u003e. Contours were drawn with the Eclipse Treatment Planning System (Varian - Siemens Healthineers, Palo Alto, USA), and a 256x256 matrix size was used.\u003c/p\u003e \u003cp\u003eWe defined three distinct VOIs using two MR sequences. On the T1GD sequence, we first outlined the gross tumor volume (GTV), identified by the contrast-enhancing lesion and referred to as T1VOL. Subsequently, we constructed a second VOI, extending symmetrically 2 cm from the boundaries of the T1VOL. This expansion was adjusted to respect anatomical barriers and to encompass all pathological changes evident in the FLAIR sequence, thus defining the T1_FLAIR VOI. 
The third VOI, termed FLAIR, mirrored the dimensions and configuration of the T1_FLAIR VOI but was delineated exclusively on the FLAIR sequence.\u003c/p\u003e \u003cp\u003eThe DICOM images and the VOIs, stored as an RT Structure Set, were exported to 3D Slicer for RadF extraction\u003csup\u003e17\u003c/sup\u003e. We used the PyRadiomics\u003csup\u003e18\u003c/sup\u003e extension to extract 150 radiomic phenotypes from each VOI. The RadF included shape (26 features), first-order statistics (19 features), and textural features (75 features). Shape features are related to the geometric properties of the tumor. The first-order features describe the distribution of the tumor intensity. Texture features represent the heterogeneity of the tumor and were extracted from the gray level co-occurrence matrix (GLCM)\u003csup\u003e19\u003c/sup\u003e, gray level run length matrix (GLRLM)\u003csup\u003e20\u003c/sup\u003e, gray level size zone matrix (GLSZM)\u003csup\u003e21\u003c/sup\u003e, neighboring gray-tone difference matrix (NGTDM)\u003csup\u003e19\u003c/sup\u003e, and gray level dependence matrix (GLDM)\u003csup\u003e22\u003c/sup\u003e. The detailed calculation of these RadF is available on the PyRadiomics website\u003csup\u003e18\u003c/sup\u003e (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pyradiomics.readthedocs.io/en/latest/features.html\u003c/span\u003e\u003cspan address=\"https://pyradiomics.readthedocs.io/en/latest/features.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). A bin width\u0026thinsp;=\u0026thinsp;25 was used for gray-level discretization before texture calculation. 
Preprocessing filters, including Laplacian-of-Gaussian and wavelet filters, were also applied.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eCategorical Neuroradiology Evaluation\u003c/h2\u003e \u003cp\u003eFour expert neuroradiologists evaluated the same images following a guideline for categorical impressions of the images (Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e in the Supplementary Appendix).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eData Frame Structuring\u003c/h2\u003e \u003cp\u003eThe clinical data, RadF, and categorical assessments were unified into a spreadsheet, forming a comprehensive data frame. This data frame was preprocessed using Python 3.6 on the Google Colab platform. Continuous variables were normalized using MinMax scaling\u003csup\u003e23\u003c/sup\u003e, while categorical variables were transformed via one-hot encoding\u003csup\u003e24\u003c/sup\u003e, both from the scikit-learn library. Subsequently, this data frame was divided into two separate data frames, Public and Private, corresponding to the training and validation databases, respectively.\u003c/p\u003e \u003cp\u003eTo address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE)\u003csup\u003e25\u003c/sup\u003e was employed to upsample the Public data frame.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eFeature Selection\u003c/h2\u003e \u003cp\u003eThe structured data frame underwent various filtering techniques to identify the most pertinent features for predicting MGMT methylation. In univariate filters, the ANOVA F-test was applied to continuous variables and the chi-squared test to categorical variables, with the p-value threshold treated as a hyperparameter set at 0.1, 0.05, and 0.01\u003csup\u003e14\u003c/sup\u003e. 
Bonferroni corrections for multiple comparisons were also applied in the univariate filters\u003csup\u003e26\u003c/sup\u003e. For wrapper methods, Recursive Feature Elimination (RFE)\u003csup\u003e27\u003c/sup\u003e was used. In embedded methods, we employed the Least Absolute Shrinkage and Selection Operator (LASSO) with an L1 penalty, alongside a LightBoost regressor\u003csup\u003e28,29\u003c/sup\u003e. Additionally, for dimensionality reduction, Principal Component Analysis (PCA) was used to generate principal components explaining 95% of the variance\u003csup\u003e30\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThese processes resulted in new data frames, each corresponding to one of the supervised machine-learning experiments to be conducted.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eMachine Learning Experiments\u003c/h2\u003e \u003cp\u003eMachine learning (ML) and Deep Learning (DL) techniques were compared. For ML, each experiment was associated with a distinct filtering method, as depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. All experiments were executed within the PyCaret\u003csup\u003e31\u003c/sup\u003e environment, enabling a concurrent assessment of various classifiers. A total of eighteen binary classifiers were tested: linear and quadratic discriminant analysis, tree-based classifiers (extra trees, decision trees, random forests), ridge classifier, MLP classifier, boosting algorithms (gradient boosting, AdaBoost, extreme gradient boosting, light gradient boosting machine, CatBoost), linear and radial support vector machines, logistic regression, k-nearest neighbors classifier, naive Bayes, and Gaussian process classifier. The same pre-processing parameters were used across all experiments. 
The Public dataset was used for training and testing the models, while the Private dataset was reserved as a validation set and was therefore not involved in the modeling process (unseen data).\u003c/p\u003e \u003cp\u003eFor model construction, the data were split in a 70:30 ratio, resulting in x_train, y_train, x_test, and y_test subsets. Here, 'x' represents the input variables used by the algorithm for predicting 'y', the TARGET variable. The efficacy of the models was assessed using 10-fold stratified cross-validation. To address multicollinearity, a threshold of 80% was set on the absolute Pearson correlation: if two columns were correlated at or above this threshold, the redundant column was removed.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe top five algorithms, based on their initial performance comparison, underwent hyperparameter tuning using the Optuna library\u003csup\u003e32\u003c/sup\u003e where applicable.\u003c/p\u003e \u003cp\u003eThese models were refined using ensemble bagging and either blending or stacking techniques and evaluated with 10-fold cross-validation. In stacking, the predictions of the base models are provided as features for the meta-model. The best-performing model was selected on the basis of ACC and AUC and then applied to the unseen data to assess its robustness and effectiveness in handling new datasets.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eImplementation of Deep Learning Workflow\u003c/h2\u003e \u003cdiv id=\"Sec10\" class=\"Section3\"\u003e \u003ch2\u003eData Frame Structuring\u003c/h2\u003e \u003cp\u003eFor the Deep Learning (DL) methodologies, the training dataset included 100 T1 and T2 FLAIR MR studies of GBM patients from the public UPENN-GBM dataset. Only slices containing structure sets were used to train the models. 
To increase the size of the training set and make the model more robust to variations in the data, data augmentation was applied to the original training images by performing random rotations of \u0026plusmn;\u0026thinsp;7 degrees and translations of \u0026plusmn;\u0026thinsp;2 mm. The model's output is a probability between 0 (unmethylated) and 1 (methylated). After training, the models were tested on a separate set of 46 images of GBM patients from the private data.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eDL Experiments\u003c/h2\u003e \u003cp\u003eFor the DL experiments, two models were used to predict the output. The first model, shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, consists of three convolutional layers with adjustable filter sizes, three max-pooling layers for downsampling, a flattening layer, and two dense layers with adjustable units.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the second model, we added two additional convolutional layers and an additional pooling layer after each convolutional layer. This increases the depth of the model, allowing it to capture more complex features and patterns in the images. The loss function adopted for training was binary cross-entropy, and the optimizer was Adam (Adaptive Moment Estimation).\u003c/p\u003e \u003cp\u003eFor all the models, hyperparameter tuning was conducted using the Keras Tuner library. 
The best hyperparameters, including the number of filters, the number of units in the densely connected layer, the learning rate, the batch size, and the number of training epochs, were determined through a random search with 10 trials.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eStatistical Analyses\u003c/h2\u003e \u003cp\u003eFor the primary endpoint assessment and validation of model metrics, each model's performance was evaluated based on its ACC and the quality of its Receiver Operating Characteristic (ROC) curve. In the exploratory analysis, the t-test and the Mann-Whitney U test were performed for continuous variables when applicable. For categorical variables, Fisher's exact test and the Chi-squared test were performed. Statistical analyses were performed using SPSS version 20 and Python 3.6. A two-sided p-value of 0.05 or less was considered statistically significant.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003ePatient Characteristics\u003c/h2\u003e \u003cp\u003eFrom July 16, 2022, to December 15, 2022, 146 patients were selected for further analysis: 100 from the UPENN-GBM database (training set) and 46 from the private institution (validation set). The median age was 63.7 years for the training set and 57.5 years for the validation set. 
Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e presents baseline characteristics.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eClinical and radiological characteristics of patients, tumor characteristics, and methylation status from both sets.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eFeature\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDescription\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTraining\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eValidation\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eTumor Location\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eType 1 (in contact with subventricular zone (SVZ) and cortex)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e65(65)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e21(45.7)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eType 2 (contact with SVZ but NOT cortex)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e4(\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e6(\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eType 3 (Contact ONLY with cortex)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e31(\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e17(\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2(4.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLaterality\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRight\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e45(45)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e16 (34.8)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLeft\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e48(48)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e24(52.2)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e7(\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e6(\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eWell Defined Borders\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e18(\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e19(41.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e82(82)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e27(58.7)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMultifocality\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e28(\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e9(19.6)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e72(72)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e35(76.1)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2(4.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMidline Crossing\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eYes\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e32(\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e13(28.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e 
\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNo\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68(68)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e31(67.4)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2(4.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003e\u003cb\u003eGreatest Dimension of Contrast Enhancing Lesion\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;1.5cm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5(\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e3(6.5)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u0026ge;\u0026thinsp;1.5cm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e95(95)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e40(87)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNA\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003e3(6.5)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eIDH 1/2 Status\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eWildtype\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e100(100)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e42(91.3)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMutated\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0(0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e4(8.7)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eGender\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFemale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e41(41)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e29(63)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMale\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e59(59)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e17(\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMGMT Status\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e 
\u003cp\u003eUnmethylated\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e61(61)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e30(65.2)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMethylated\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e39(\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e16(34.8)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eFeature Extraction\u003c/h2\u003e \u003cp\u003eUsing PyRadiomics, 450 radiomic features (RadF) were extracted and narrowed down to 330 after eliminating location-specific and redundant features. These were merged with 13 categorical and clinical characteristics. To balance the dataset, SMOTE generated 22 synthetic cases, equalizing the proportions of MGMT-methylated and unmethylated patients. The 8 filtering methods, illustrated in Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e in the Supplement Appendix, yielded diverse results: PCA identified 24 principal components; the ANOVA F-test and Chi-square tests at varied p-values isolated 1, 8, 15, and 44 features; and LASSO Regression and LightBoost pinpointed 21 and 74 features, respectively. 
RFE selected 17 features.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eAlgorithms\u003c/h2\u003e \u003cp\u003eAcross eight experiments, 105 algorithms were developed, and the best from each was selected, yielding seven top performers detailed in Table S3 of the Supplementary Appendix. Access to the datasets and code was also provided.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eBest Performance in Training and Validation Sets\u003c/h2\u003e \u003cp\u003eThe best performance on the Public dataset, with an AUC of 0.78 and ACC of 0.75, was achieved using RFE with a stacked estimator, specifically Logistic Regression (see Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). When applied to the validation set, the AUC was 0.62, with an ACC of 0.71.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eConversely, the best performance on the unseen dataset, with an AUC of 0.77 and ACC of 0.76, was obtained using a Ridge Classifier after LASSO feature selection. This exceeded the same model's results on the Public dataset, where it achieved an AUC of 0.70 and ACC of 0.65.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003eSecondary Analyses\u003c/h2\u003e \u003cp\u003eWe investigated associations between MGMT methylation and clinical factors, such as tumor characteristics and patient age, as detailed in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. 
Neuroradiology expert opinions were consistent across categorical analyses, with no significant variance (Fisher's exact p\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eSignificance of clinical variables with their odds ratios and 95% confidence intervals.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eVariable\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003ep-value\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eOdds Ratio\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCI 95%\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSVZ\u0026thinsp;+\u0026thinsp;Cortical\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.54\u0026ndash;1.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSVZ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.04\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.12\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.01\u0026ndash;0.97\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCortical\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.67\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.22\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.64\u0026ndash;2.32\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIntermediate Necrosis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.26\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.64\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.78\u0026ndash;3.45\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSevere Necrosis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.59\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.31\u0026ndash;1.15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRight Hemisphere\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e 
\u003cp\u003e0.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.80\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.97\u0026ndash;3.35\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLeft Hemisphere\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.27\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.14\u0026ndash;0.52\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eWell defined borders\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e\u0026lt;\u0026thinsp;0.001\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.19\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.08\u0026ndash;0.47\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnifocal\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.03\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.44\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.22\u0026ndash;0.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMidline crossing\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.47\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e \u003cp\u003e1.35\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.70\u0026ndash;2.60\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eContrast Enhancement\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNodular\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.51\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.57\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.14\u0026ndash;2.38\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePatchy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.43\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.70\u0026ndash;2.45\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRing\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.63\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.34\u0026ndash;1.15\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003eHemorrhage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.76\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.94\u0026ndash;3.29\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCystic\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1.00\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.78\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.13\u0026ndash;4.81\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSize greater than 1.5\u0026thinsp;cm\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e0.77\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.84\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.26\u0026ndash;2.70\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eSVZ: subventricular zone.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003cp\u003eRegarding continuous variables, 42 variables showed a significant association with MGMT status. The interrelationships among these variables were assessed and are illustrated in the subsequent heatmap. Spearman's test showed that, out of approximately 1,500 interactions, only 25 were significant (p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05), with correlations from 0.96 to 1.0. 
Four of these were negative correlations (-0.93 to -0.94), indicating mostly weak to moderate phenotype relationships. These findings are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cem\u003eTo access the description of each variable, refer to Table S2 in the Supplement Appendix.\u003c/em\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003eDL models\u003c/h2\u003e \u003cp\u003eAfter training the models on the public dataset, the best hyperparameters obtained for the first custom-developed CNN model were: 48, 112, and 256 filters for the first, second, and third convolution layers, respectively; 80 units for the densely connected layer; a learning rate of 0.0001; a batch size of 16; and 10 training epochs. The best ACC achieved was 69%, and the precision was 70%. Figure\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e illustrates the ACC and loss graphs created during the training and testing phases.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor the second custom-developed CNN model, the best hyperparameters were: 16, 112, 224, 192, and 768 filters for the first, second, third, fourth, and fifth convolution layers, respectively; 112 units for the densely connected layer; a learning rate of 0.0001; a batch size of 32; and 20 training epochs. The best ACC achieved was 74%, with a precision of 75%. 
Figure\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e illustrates the ACC and loss graphs created during the training and testing phases for the second DL model.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eWhen evaluated on unseen data, the first model achieved an ACC of 62% and a precision of 54%, while the second model achieved an ACC of 70% and a precision of 67%.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe present study provides a framework for predicting MGMT methylation by applying AI techniques to diagnostic MRIs together with clinical information. The final best algorithm reached an ACC of 0.75 and an AUC of 0.78, with a sensitivity of 0.83 (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). The results showed notable variations in performance across different experiments, with the primary distinction arising from the feature selection methods employed. This underscores how crucial pre-processing steps are in influencing the overall predictive performance.\u003c/p\u003e \u003cp\u003eThe ESTRO-ACROP guideline was adapted to create the VOI from which the RadF would be extracted\u003csup\u003e16\u003c/sup\u003e, representing an effort to allow reproducibility for this framework (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). This is in contrast with previous research that was mostly based on experts' qualitative perspectives, i.e., defining disease areas\u003csup\u003e10,33\u003c/sup\u003e. Moreover, this study intentionally used only two MRI sequences that are generally already performed for diagnosis, to align with constraints in low-resource settings. 
While many high-performance studies typically employ a variety of imaging types, such an approach may not be feasible in environments with limited resources.\u003c/p\u003e \u003cp\u003eThe neuroradiologist interpretation, together with other clinical features, was intentionally combined with the AI-derived features, i.e., RadF, since previous studies demonstrated an increased predictive performance of AI algorithms\u003csup\u003e11,34\u003c/sup\u003e. Additionally, we observed the presence of these features in all final best algorithms; therefore, we hypothesized that the clinical perspective is relevant for algorithm engineering.\u003c/p\u003e \u003cp\u003eThe implementation of multiple filters in the pre-processing phase is another strong point of this research. Typically, such models use only a small subset of the available features after applying filtering techniques\u003csup\u003e13,14,33,35\u003c/sup\u003e. Regrettably, most studies used only one or a few methods for feature filtering, potentially overlooking the value of other filter methods\u003csup\u003e14\u003c/sup\u003e. This raises the question of the best number of features to use (Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e, \u003cem\u003esupplement appendix\u003c/em\u003e).\u003c/p\u003e \u003cp\u003eThe Mann-Whitney U test found 42 RadFs significantly linked to MGMT methylation. However, experiments with this number of features led to poor performance. Common statistical tests like ANOVA and Chi-squared did not control type I errors well, resulting in many false positives and poor performance. With the Bonferroni correction, only one variable was selected, highlighting its conservative nature and the high risk of type II errors (excluding significant variables), a finding already reported in previous genome-wide analyses\u003csup\u003e36\u003c/sup\u003e (Figure S2, \u003cem\u003esupplement appendix\u003c/em\u003e). 
In other words, these results indicate that common statistical methods do not provide the optimal balance between the risks of type I and type II errors, leading to inaccurate outcomes.\u003c/p\u003e \u003cp\u003eAdditionally, most correlations between features were weak to moderate according to Spearman's test. This adds a layer of complexity, especially since high correlations ultimately lead to multicollinearity in AI algorithm development\u003csup\u003e37\u003c/sup\u003e. This condition can make it difficult to predict the impact of these variables and potentially cause overfitting\u003csup\u003e38\u003c/sup\u003e, reducing the algorithm's ability to generalize to new data.\u003c/p\u003e \u003cp\u003eThe observed performance decline of algorithms on unseen data, although expected, could also be related to notable differences in patient characteristics between the two cohorts (100 public database patients vs. 46 from private data). These populations may differ significantly, impacting algorithm effectiveness. Increasing the sample size could balance these disparities, ensuring a more representative comparison and potentially stabilizing algorithm performance.\u003c/p\u003e \u003cp\u003eAnother point to explore is the difference in DL and ML performances. There was a marginal difference in ACC performance, 0.74 vs 0.75 for DL and ML, respectively. Recently, DL has become an important approach within ML, and because it handles massive amounts of data better, it has reached outstanding performance on complex tasks\u003csup\u003e39\u003c/sup\u003e. In the present study, a CNN strategy was used for predicting MGMT status; this approach is known to handle large numbers of images well and is thus useful for this specific task\u003csup\u003e40\u003c/sup\u003e. 
It is likely that, as the number of patients analyzed increases, the predictive power of DL would surpass that of ML; however, our results cannot confirm this hypothesis.\u003c/p\u003e \u003cp\u003eDespite these challenges, the algorithm showed commendable performance while making efficient use of minimal resources, namely a limited number of MRI sequences and clinical data. Enhancing this framework while maintaining simplicity could be crucial for enabling reproducibility in low- to middle-income countries, suggesting that strategic improvements could broaden its applicability without sacrificing accessibility.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis research highlights the potential of an affordable AI tool to enhance clinical decision-making in oncology, where molecular characteristics of tumors are crucial. It shows AI's ability to improve clinical outcomes through precision medicine, even in resource-constrained settings, advancing equitable healthcare in oncology.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eDisclaimers:\u0026nbsp;\u003c/strong\u003eThe content of this article represents the views of the involved authors and not the official positions of their respective institutions.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eSource of Support:\u003c/strong\u003e This study did not receive any grants or financial support.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability Statement:\u0026nbsp;\u003c/strong\u003eThis project will share most of the data generated to formulate the presented results. 
Additionally, the code used for algorithm development will be available online for public access.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Names, Degrees, and Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFelipe Cicci Farinha Restini: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing \u0026ndash; original draft, Writing \u0026ndash; review \u0026amp; editing.\u003c/p\u003e\n\u003cp\u003eTarraf Torfeh: Resources, Software, Supervision, Validation, Visualization, Writing \u0026ndash; original draft.\u003c/p\u003e\n\u003cp\u003eSouha Aouadi: Resources, Software, Supervision, Validation, Visualization, Writing \u0026ndash; original draft.\u003c/p\u003e\n\u003cp\u003eRabih Hammoud: Supervision.\u003c/p\u003e\n\u003cp\u003eNoora Al-Hammadi: Supervision.\u003c/p\u003e\n\u003cp\u003eMaria Thereza Mansur Starling: Investigation, Methodology, Project administration, Writing \u0026ndash; original draft, Writing \u0026ndash; review \u0026amp; editing.\u003c/p\u003e\n\u003cp\u003eCec\u0026iacute;lia Felix Penido Mendes Souza: Investigation, Methodology, Project administration, Writing \u0026ndash; original draft, Writing \u0026ndash; review \u0026amp; editing.\u003c/p\u003e\n\u003cp\u003eAnselmo Mancini: Software.\u003c/p\u003e\n\u003cp\u003eLeticia Hernandes Brito: Methodology.\u003c/p\u003e\n\u003cp\u003eFernanda Hayashida Yoshimoto: Methodology.\u003c/p\u003e\n\u003cp\u003eNildevande Firmino Lima-J\u0026uacute;nior: Methodology.\u003c/p\u003e\n\u003cp\u003eMarcelo Moro Queiroz: Methodology.\u003c/p\u003e\n\u003cp\u003eUla Lindoso Passos: Investigation, Methodology, Project administration, Resources.\u003c/p\u003e\n\u003cp\u003eCamila Trolez Amancio: Investigation, Methodology, Project administration, Resources.\u003c/p\u003e\n\u003cp\u003eJorge Tomio Takahashi: Investigation, Methodology, Project 
administration, Resources.\u003c/p\u003e\n\u003cp\u003eDaniel De Souza Delgado: Investigation, Methodology, Project administration, Resources.\u003c/p\u003e\n\u003cp\u003eSamir Abdallah Hanna: Conceptualization, Investigation, Project administration, Resources, Supervision, Visualization, Writing \u0026ndash; original draft.\u003c/p\u003e\n\u003cp\u003eGustavo Nader Marta: Conceptualization, Investigation, Project administration, Resources, Supervision, Visualization, Writing \u0026ndash; original draft.\u003c/p\u003e\n\u003cp\u003eWellington Furtado Pimenta Neves-Junior: \u0026nbsp; Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing \u0026ndash; original draft, Writing \u0026ndash; review \u0026amp; editing.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eOstrom QT, Price M, Neff C, et al. CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2016\u0026ndash;2020. Neuro Oncol 2023;25(Supplement_4):iv1\u0026ndash;99.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrown NF, Ottaviani D, Tazare J, et al. Survival Outcomes and Prognostic Factors in Glioblastoma. Cancers [Internet] 2022 [cited 2024 Jan 20];14(13):3161. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.mdpi.com/2072\u0026ndash;6694/14/13/3161\u003c/span\u003e\u003cspan address=\"https://www.mdpi.com/2072\u0026ndash;6694/14/13/3161\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStupp R, Mason WP, van den Bent MJ, et al. Radiotherapy plus Concomitant and Adjuvant Temozolomide for Glioblastoma. New England Journal of Medicine [Internet] 2005 [cited 2023 Apr 26];352(10):987\u0026ndash;96. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1056/NEJMoa043330\u003c/span\u003e\u003cspan address=\"10.1056/NEJMoa043330\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHegi ME, Diserens A-C, Gorlia T, et al. MGMT Gene Silencing and Benefit from Temozolomide in Glioblastoma. New England Journal of Medicine [Internet] 2005 [cited 2023 May 21];352(10):997\u0026ndash;1003. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1056/NEJMoa043331\u003c/span\u003e\u003cspan address=\"10.1056/NEJMoa043331\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarta GN, Moraes FY, Feher O, et al. Social determinants of health and survival on Brazilian patients with glioblastoma: a retrospective analysis of a large populational database. The Lancet Regional Health \u0026ndash; Americas [Internet] 2021 [cited 2024 Jan 20];4. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.thelancet.com/journals/lanam/article/PIIS2667\u0026ndash;193X\u003c/span\u003e\u003cspan address=\"https://www.thelancet.com/journals/lanam/article/PIIS2667\u0026ndash;193X\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e(21)00062\u0026ndash;4/fulltext\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePerry JR, Laperriere N, O\u0026rsquo;Callaghan CJ, et al. Short-Course Radiation plus Temozolomide in Elderly Patients with Glioblastoma. N Engl J Med 2017;376(11):1027\u0026ndash;37.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWick W, Platten M, Meisner C, et al. 
Temozolomide chemotherapy alone versus radiotherapy alone for malignant astrocytoma in the elderly: the NOA\u0026ndash;08 randomised, phase 3 trial. Lancet Oncol 2012;13(7):707\u0026ndash;15.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRivera AL, Pelloski CE, Gilbert MR, et al. MGMT promoter methylation is predictive of response to radiotherapy and prognostic in the absence of adjuvant alkylating chemotherapy for glioblastoma. Neuro Oncol [Internet] 2010 [cited 2024 Jan 20];12(2):116\u0026ndash;21. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2940581/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2940581/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRT2030 - Home [Internet]. Sociedade Brasileira de Radioterapia. [cited 2023 May 21];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://sbradioterapia.com.br/rt2030/\u003c/span\u003e\u003cspan address=\"https://sbradioterapia.com.br/rt2030/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen S, Xu Y, Ye M, et al. Predicting MGMT Promoter Methylation in Diffuse Gliomas Using Deep Learning with Radiomics. J Clin Med [Internet] 2022 [cited 2023 May 22];11(12):3445. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9224690/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9224690/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHe J, Ren J, Niu G, et al. 
Multiparametric MR radiomics in brain glioma: models comparation to predict biomarker status. BMC Medical Imaging [Internet] 2022 [cited 2023 Dec 18];22(1):137. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12880-022-00865\u0026ndash;8\u003c/span\u003e\u003cspan address=\"10.1186/s12880-022-00865\u0026ndash;8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSasaki T, Kinoshita M, Fujita K, et al. Radiomics and MGMT promoter methylation for prognostication of newly diagnosed glioblastoma. Sci Rep [Internet] 2019 [cited 2023 Dec 18];9(1):14435. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.nature.com/articles/s41598-019-50849-y\u003c/span\u003e\u003cspan address=\"https://www.nature.com/articles/s41598-019-50849-y\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eG\u0026oacute;mez OV, Herraiz JL, Ud\u0026iacute;as JM, et al. Analysis of Cross-Combinations of Feature Selection and Machine-Learning Classification Methods Based on [18F]F-FDG PET/CT Radiomic Features for Metabolic Response Prediction of Metastatic Breast Cancer Lesions. Cancers [Internet] 2022 [cited 2023 Dec 18];14(12):2922. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.mdpi.com\u003c/span\u003e\u003cspan address=\"https://www.mdpi.com\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e/2072\u0026ndash;6694/14/12/2922\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePudjihartono N, Fadason T, Kempa-Liehr AW, O\u0026rsquo;Sullivan JM. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Front Bioinform [Internet] 2022 [cited 2023 Dec 18];2:927312. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580915/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580915/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThe University of Pennsylvania glioblastoma (UPenn-GBM) cohort: advanced MRI, clinical, genomics, \u0026amp; radiomics | Scientific Data [Internet]. [cited 2024 Jan 20];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.nature.com/articles/s41597-022-01560\u0026ndash;7\u003c/span\u003e\u003cspan address=\"https://www.nature.com/articles/s41597-022-01560\u0026ndash;7\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNiyazi M, Brada M, Chalmers AJ, et al. ESTRO-ACROP guideline \u0026ldquo;target delineation of glioblastomas.\u0026rdquo; Radiother Oncol 2016;118(1):35\u0026ndash;42.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFedorov A, Beichel R, Kalpathy-Cramer J, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30(9):1323\u0026ndash;41.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 2017;77(21):e104\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaralick RM, Shanmugam K, Dinstein I. Textural Features for Image Classification. IEEE Transactions on Systems, Man, and Cybernetics [Internet] 1973 [cited 2024 Mar 2];SMC\u0026ndash;3(6):610\u0026ndash;21. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ieeexplore.ieee.org/document/4309314\u003c/span\u003e\u003cspan address=\"https://ieeexplore.ieee.org/document/4309314\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGalloway MM. Texture analysis using gray level run lengths. Computer Graphics and Image Processing [Internet] 1975 [cited 2024 Mar 2];4(2):172\u0026ndash;9. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.sciencedirect.com/science/article/pii/S0146664X75800086\u003c/span\u003e\u003cspan address=\"https://www.sciencedirect.com/science/article/pii/S0146664X75800086\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTexture indexes and gray level size zone matrix. Application to cell nuclei classification \u0026ndash; ScienceOpen [Internet]. [cited 2024 Mar 2];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.scienceopen.com/document?vid=2c91747d-b5c\u003c/span\u003e\u003cspan address=\"https://www.scienceopen.com/document?vid=2c91747d-b5c\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e9\u0026ndash;4a39-8751-9e17e9776f22\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun C, Wee WG. Neighboring gray level dependence matrix for texture classification. Computer Graphics and Image Processing [Internet] 1982 [cited 2024 Mar 2];20(3):297. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.sciencedirect.com/science/article/pii/0146664X82900934\u003c/span\u003e\u003cspan address=\"https://www.sciencedirect.com/science/article/pii/0146664X82900934\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeepa B, Ramesh K. Epileptic seizure detection using deep learning through min max scaler normalization. ijhs [Internet] 2022 [cited 2024 Mar 2];10981\u0026ndash;96. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://sciencescholar.us/journal/index.php/ijhs/article/view/7801\u003c/span\u003e\u003cspan address=\"https://sciencescholar.us/journal/index.php/ijhs/article/view/7801\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eApplied Sciences | Free Full-Text | Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions [Internet]. [cited 2024 Mar 2];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.mdpi.com/2076\u0026ndash;3417/11/3/1291\u003c/span\u003e\u003cspan address=\"https://www.mdpi.com/2076\u0026ndash;3417/11/3/1291\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRaghuwanshi BS, Shukla S. SMOTE based class-specific extreme learning machine for imbalanced learning. Knowledge-Based Systems [Internet] 2020 [cited 2024 Mar 2];187:104814. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.sciencedirect.com/science/article/pii/S0950705119302898\u003c/span\u003e\u003cspan address=\"https://www.sciencedirect.com/science/article/pii/S0950705119302898\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGenetic Epidemiology | Human Genetics Journal | Wiley Online Library [Internet]. [cited 2024 Mar 2];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://onlinelibrary.wiley.com/doi/\u003c/span\u003e\u003cspan address=\"https://onlinelibrary.wiley.com/doi/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/gepi.20297\u003c/span\u003e\u003cspan address=\"10.1002/gepi.20297\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen X, Jeong JC. Enhanced recursive feature elimination [Internet]. In: Sixth International Conference on Machine Learning and Applications (ICMLA 2007). 2007 [cited 2024 Apr 5]. p. 429\u0026ndash;35.Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ieeexplore.ieee.org/document/4457268\u003c/span\u003e\u003cspan address=\"https://ieeexplore.ieee.org/document/4457268\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFeature Extraction: Foundations and Applications | SpringerLink [Internet]. 
[cited 2024 Mar 2];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://link.springer.com/book/10.1007/978-3-540\u0026ndash;\u003c/span\u003e\u003cspan address=\"https://link.springer.com/book/10.1007/978-3-540\u0026ndash;\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e35488\u0026ndash;8\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eA systematic comparison of statistical methods to detect interactions in exposome-health associations | Environmental Health | Full Text [Internet]. [cited 2024 Mar 2];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ehjournal.biomedcentral.com/articles/\u003c/span\u003e\u003cspan address=\"https://ehjournal.biomedcentral.com/articles/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12940-017-0277\u0026ndash;6\u003c/span\u003e\u003cspan address=\"10.1186/s12940-017-0277\u0026ndash;6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics and Intelligent Laboratory Systems [Internet] 1987 [cited 2024 Mar 2];2(1):37\u0026ndash;52. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.sciencedirect.com/science/article/pii/0169743987800849\u003c/span\u003e\u003cspan address=\"https://www.sciencedirect.com/science/article/pii/0169743987800849\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePyCaret \u0026mdash; pycaret 3.0.4 documentation [Internet]. 
[cited 2024 Jan 21];Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://pycaret.readthedocs.io/en/latest/\u003c/span\u003e\u003cspan address=\"https://pycaret.readthedocs.io/en/latest/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAkiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: A Next-generation Hyperparameter Optimization Framework [Internet]. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \u0026amp; Data Mining. New York, NY, USA: Association for Computing Machinery; 2019 [cited 2024 Jan 21]. p. 2623\u0026ndash;31.Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1145/3292500.3330701\u003c/span\u003e\u003cspan address=\"10.1145/3292500.3330701\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTibshirani R. Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological) [Internet] 1996 [cited 2023 Dec 18];58(1):267\u0026ndash;88. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://rss.onlinelibrary.wiley.com/doi/\u003c/span\u003e\u003cspan address=\"https://rss.onlinelibrary.wiley.com/doi/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1111/j.2517\u0026ndash;6161.1996.tb02080.x\u003c/span\u003e\u003cspan address=\"10.1111/j.2517\u0026ndash;6161.1996.tb02080.x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRen J, Li Y, Yang J-J, et al. 
MRI-based radiomics analysis improves preoperative diagnostic performance for the depth of stromal invasion in patients with early stage cervical cancer. Insights into Imaging [Internet] 2022 [cited 2024 Mar 17];13(1):17. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13244-022-01156\u0026ndash;0\u003c/span\u003e\u003cspan address=\"10.1186/s13244-022-01156\u0026ndash;0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUrbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. Benchmarking relief-based feature selection methods for bioinformatics data mining. Journal of Biomedical Informatics [Internet] 2018 [cited 2023 Dec 18];85:168\u0026ndash;88. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://linkinghub.elsevier.com/retrieve/pii/S1532046418301412\u003c/span\u003e\u003cspan address=\"https://linkinghub.elsevier.com/retrieve/pii/S1532046418301412\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePanagiotou OA, Ioannidis JPA, Genome-Wide Significance Project. What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations. Int J Epidemiol 2012;41(1):273\u0026ndash;86.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSundus KI, Hammo BH, Al-Zoubi MB, Al-Omari A. Solving the multicollinearity problem to improve the stability of machine learning algorithms applied to a fully annotated breast cancer dataset. Informatics in Medicine Unlocked [Internet] 2022 [cited 2024 Mar 2];33:101088. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.sciencedirect.com/science/article/pii/S2352914822002246\u003c/span\u003e\u003cspan address=\"https://www.sciencedirect.com/science/article/pii/S2352914822002246\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCook JA, Ranstam J. Overfitting. British Journal of Surgery [Internet] 2016 [cited 2024 Mar 2];103(13):1814. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/bjs.10244\u003c/span\u003e\u003cspan address=\"10.1002/bjs.10244\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlzubaidi L, Zhang J, Humaidi AJ, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data [Internet] 2021 [cited 2024 Apr 5];8(1):53. Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s40537-021-00444\u0026ndash;8\u003c/span\u003e\u003cspan address=\"10.1186/s40537-021-00444\u0026ndash;8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eManakitsa N, Maraslidis GS, Moysis L, Fragulis GF. A Review of Machine Learning and Deep Learning for Object Detection, Semantic Segmentation, and Human Action Recognition in Machine and Robotic Vision. Technologies [Internet] 2024 [cited 2024 Apr 5];12(2):15. 
Available from: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.mdpi.com/2227\u0026ndash;7080/12/2/15\u003c/span\u003e\u003cspan address=\"https://www.mdpi.com/2227\u0026ndash;7080/12/2/15\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-4644889/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4644889/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eGlioblastoma is an aggressive brain cancer with a poor prognosis. MGMT (O6-methylguanine-DNA methyltransferase) gene methylation status is crucial for treatment stratification, yet economic constraints often limit access. 
This study aims to develop an artificial intelligence (AI) framework for predicting MGMT methylation status.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eMachine learning (ML) and deep learning (DL) techniques were applied to diagnostic MR images from the NIH and a private institution. The images were segmented according to ESTRO-ACROP 2016 guidelines for radiotherapy treatment volumes and combined, with clinical evaluations from neuroradiology experts. Radiomic features (quantitative) and clinical impressions (qualitative) were extracted for ML models. Feature selection methods were used to identify relevant phenotypes for training and validation with ML classifiers.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003e We evaluated 100 patients from the NIH and 46 patients from a local institution. A total of 343 features were extracted. Eight feature selection methods produced seven independent predictive frameworks. The top-performing ML models included Recursive Feature Elimination (RFE) combined with Linear Discriminant Analysis (LDA) (accuracy of 0.75). DL performance achieved an accuracy of 0.74 using convolutional networks.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eThis study demonstrates that integrating clinical and radiotherapy-derived AI-driven phenotypes can accurately predict MGMT methylation. 
The framework also addresses constraints that limit molecular diagnosis access.\u003c/p\u003e","manuscriptTitle":"Predicting MGMT Methylation in Glioblastoma for Informed Clinical Decisions: An AI-Driven Approach in Resource-Limited Settings","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-07-26 16:56:06","doi":"10.21203/rs.3.rs-4644889/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2024-08-05T04:16:17+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-07-31T16:29:06+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"171112145402709754658904140363010109249","date":"2024-07-28T19:20:03+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2024-07-09T03:43:18+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"63691083796252861804414749896199963601","date":"2024-07-05T10:39:47+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-07-04T04:05:38+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-07-04T00:05:39+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2024-07-03T23:26:34+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2024-07-03T03:34:23+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2024-06-26T21:21:01+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific 
Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"2656cd5c-92bc-4987-9adc-912b3e9c6aae","owner":[],"postedDate":"July 26th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":34319064,"name":"Biological sciences/Cancer"},{"id":34319065,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":34319066,"name":"Health sciences/Health care"},{"id":34319067,"name":"Health sciences/Medical research"},{"id":34319068,"name":"Health sciences/Oncology"}],"tags":[],"updatedAt":"2024-11-18T19:18:20+00:00","versionOfRecord":{"articleIdentity":"rs-4644889","link":"https://doi.org/10.1038/s41598-024-78189-6","journal":{"identity":"scientific-reports","isVorOnly":false,"title":"Scientific Reports"},"publishedOn":"2024-11-14 15:58:16","publishedOnDateReadable":"November 14th, 2024"},"versionCreatedAt":"2024-07-26 16:56:06","video":"","vorDoi":"10.1038/s41598-024-78189-6","vorDoiUrl":"https://doi.org/10.1038/s41598-024-78189-6","workflowStages":[]},"version":"v1","identity":"rs-4644889","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4644889","identity":"rs-4644889","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}



License: CC-BY-4.0