A Radio-Genomics Biomarker for Precision Epidermal Growth Factor Receptor Mutation Targeting Therapy in Non-Small Cell Lung Cancer

preprint OA: gold CC-BY-4.0
Mitchell Chen, Susan J Copley, Kristofer Linton-Reid, Patrizia Viola, and 9 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-8158721/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 06 Mar, 2026; the published version is in Scientific Reports.
Version 1 posted 6

Abstract

Newer-generation tyrosine kinase inhibitors (TKIs) have shown increasing efficacy in cancers driven by specific mutations, with epidermal growth factor receptor (EGFR) alterations remaining the most common actionable targets in non-small cell lung cancer (NSCLC).
Treatment decisions are currently guided by tissue sampling and genetic testing, which are limited by procedural risks, patient tolerance, tumour heterogeneity and mutation evolution. Because co-mutations involving EGFR and other targetable genes can diminish treatment response, identifying an exclusive EGFR mutation, defined by the absence of other actionable alterations, represents a clinically favourable scenario for first-line EGFR-TKI therapy. We developed a CT-based radiomics signature, EGFR-RPV, to predict exclusive EGFR mutational status using NSCLC patients (n = 304) from a multi-centre cohort with paired imaging and genomics data, and validated performance in an independent testing set (n = 51), alongside transcriptomics enrichment analysis. EGFR-RPV predicted exclusive EGFR mutation with accuracies of 0.77 (95% CI 0.66–0.88) and 0.71 (95% CI 0.54–0.89) in internal and external testing, respectively, and stratified patient prognosis (hazard ratio 2.15, 95% CI 1.50–3.08). FAM190A and BCMO1 were enriched in exclusive EGFR-positive cases, consistent with their roles in cell division regulation and vitamin A biosynthesis, respectively. EGFR-RPV thus offers a non-invasive approach to identify exclusive EGFR mutations, with a potential role in guiding first-line EGFR-TKI use.

Subject areas: Health sciences/Biomarkers; Biological sciences/Cancer; Biological sciences/Computational biology and bioinformatics; Biological sciences/Genetics; Health sciences/Oncology
Keywords: non-small cell lung cancer; imaging biomarker; radiogenomics; EGFR mutation; tyrosine kinase inhibitor

Introduction

Lung cancer is the leading cause of cancer-related deaths worldwide, with non-small cell lung cancer (NSCLC) accounting for 80–85% of cases.[1] Over 70% of NSCLC cases are diagnosed at advanced stages, carrying poor prognoses.[2] Tyrosine kinase inhibitors (TKIs) have an excellent response profile in cancers exhibiting certain oncogenic driver mutations, with those relating to epidermal growth factor receptor (EGFR) being the most common targets in NSCLC.[3] Newer-generation EGFR-TKIs, such as osimertinib, are more effective against advanced and metastatic NSCLC, with prolonged patient survival, improved quality of life and fewer adverse events compared to chemotherapy,[4] and are now the first-line therapy for cancers harbouring suitable mutations.[5] In precision EGFR-TKI clinical pathways, it is crucial to ensure timely commencement of therapies to maximise patient benefit, avoid unnecessary treatment-related adverse events,[6] and select the best first-line treatment.
For example, in the case of immune checkpoint inhibitors (ICIs), increased pneumotoxicity has been observed in patients with EGFR mutation.[7] Treatment decisions are currently guided by tissue sampling followed by next-generation sequencing (NGS),[8] which is burdened by procedural invasiveness, patient acceptance, tumour heterogeneity and the quality of sampled tissue.[9] Up to 30% of patients do not have a suitable biopsy sample available.[10] More recently, liquid biopsy with plasma-derived cell-free DNA (cfDNA) analysed by digital droplet polymerase chain reaction (ddPCR) has been proposed as an alternative test,[11] but it has two limitations: suboptimal specificity, owing to confounding non-tumour cfDNA released by normal tissue necrosis, lysis of leukocytes after blood collection or clonal haematopoiesis; and a lack of mutation localisation in multi-focal disease.[12] An overall genotyping error rate of up to 11.1% has been reported for cfDNA.[10]

These limitations highlight the need for strategies that can complement existing molecular testing to improve the delivery of precision EGFR-TKI therapy in NSCLC. Imaging-based biomarkers hold notable promise in this regard, given their non-invasiveness, broad availability, and low cost. Radiomic features are quantitative metrics derived from imaging data and can non-invasively capture important disease information.[13–16] Prior studies have demonstrated the utility of radiomics for predicting EGFR mutation, though they have been limited to single-mutation prediction.[17–20] In clinical practice, multiple actionable mutations are routinely tested in non-squamous NSCLC.[21] Co-occurring mutations can be found in up to 12.9% of EGFR mutation-positive patients,[22] including up to 3.9% with concomitant anaplastic lymphoma kinase (ALK)[23] and 1.1% with Kirsten rat sarcoma viral oncogene homologue (KRAS) mutations.
[24] Co-mutation status is associated with increased resistance to EGFR-TKI treatment and worse patient survival.[22, 24–26] Although radiomics studies have previously investigated the prediction of targetable mutations, including EGFR and KRAS,[27–29] the identification of exclusive EGFR mutation, characterised by the absence of other key mutations (ALK and KRAS), remains unaddressed in the literature, despite its relevance as a clinical scenario favouring response to EGFR-TKIs. In this paper, we introduce a CT-based radiomic biomarker, EGFR-RPV, for the prediction of this important mutation profile.

Results

Patient characteristics

We included NSCLC patients presenting to our multi-centre institution between February 2012 and July 2019 (n = 304, age (mean ± standard deviation): 67.6 ± 10.7, male:female [M:F] = 174:130) as a discovery cohort, and a dataset[30] from the Cancer Imaging Archive (TCIA) (n = 51, age: 69.6 ± 8.1, M:F = 35:16) for external testing. The discovery cohort was split 2:1 into training and internal validation sets, balanced for patient age, sex and tumour histology. We define exclusive EGFR positivity as EGFR positivity in the absence of ALK and KRAS mutations.

Radiomics predictive vector

The biomarker development pipeline is presented in Fig. 1A. EGFR-RPV was developed using multi-regional segmentation and comprehensive radiomic feature extraction from three regions of interest (ROIs: lesion, perilesional annulus and lung parenchyma), followed by benchmarking of various dimensionality reduction and regression methods to find the best-performing statistical learning pipeline. EGFR-RPV is a 10-feature composite radiomics vector (Fig. 1B) developed using Spearman and Least Absolute Shrinkage and Selection Operator (LASSO) methods. It predicts exclusive EGFR positivity with an accuracy of 0.77 (95% CI 0.66–0.88) and 0.71 (95% CI 0.54–0.89) in the internal and external testing sets, respectively (Figs. 2A and 2B). The component features belong to first-order and higher-order classes (texture and fractals) and are extracted from all three ROIs, with the largest number (n = 5) from the peri-lesional area, consistent with the hypothesised distribution of oncogenic cells harbouring driver mutations.[31] EGFR-RPV also delivers effective prognostic stratification of patients into high- and low-risk groups (Cox hazard ratio (HR) 2.15, 95% CI 1.50–3.08, log-rank p < 0.001; Fig. 2C).

Logistic regression analysis and clinico-radiomics integration

We performed univariable and multivariable logistic regression analyses to identify clinical features with a statistically significant association with exclusive EGFR mutation status (Figs. 3A and 3B). Patient sex and EGFR-RPV were statistically significant (p < 0.05) in both univariable and multivariable analyses. We developed a nomogram combining these clinical features with EGFR-RPV to aid clinical decision making (Fig. 3C).

Genomics analysis

Given the wide use of genomics in cancer biomarker research, including its demonstrated utility in various precision oncology scenarios[32, 33] and relevance to radiomics advancements for NSCLC,[15, 34, 35] we further performed RNA transcriptomics analysis to better understand the radio-genomics landscape of NSCLC in the context of exclusive EGFR positivity. In this study arm, we analysed the bulk RNA transcriptomics readouts from the NSCLC Radiogenomics dataset, an independent cohort of 51 patients (age: 69.6 ± 8.1, M:F = 35:16).[30] In the latent space formed by the two top-ranked principal components, which explain most of the data variance (Figs. 4A and 4B), exclusive EGFR positivity was not distinctly separated by unsupervised clustering of the samples (Fig. 4C) nor by hierarchical clustering by Euclidean distance on a heatmap (Fig. 4D).
We found the FAM190A and BCMO1 genes to be the most differentially expressed in cases with exclusive EGFR positivity (Fig. 5A); they are expressed in separate clusters on the Uniform Manifold Approximation and Projection (UMAP) plot (Fig. 5B).

Discussion

Tissue sampling followed by mutational profiling enables targeted treatment for NSCLC harbouring EGFR mutations, regardless of disease stage. Nevertheless, this approach is burdened by the invasiveness of the tissue sampling procedure, tumour heterogeneity and the quality of sampled tissue.[9] The more recently introduced liquid biopsy with plasma cfDNA is limited by false positives arising from non-tumour cfDNA, and a lack of mutation localisation in multi-focal disease.[12] To tackle these challenges, we developed a novel, non-invasive imaging biomarker for guiding clinical decisions using routinely acquired imaging data. It demonstrates good performance for predicting exclusive EGFR mutation and achieves effective patient prognostic stratification.

EGFR mutations are more prevalent in non-smoking, female and East Asian patients.[36] The commonest actionable types of EGFR mutations include del19 (exon 19) and L858R (exon 21), which collectively constitute up to 90% of all EGFR mutations in NSCLC.[37] Previously, target mutations such as EGFR, ALK and KRAS were considered mutually exclusive, but their co-occurrence is increasingly recognised for its association with increased resistance to EGFR-TKI treatment and worse patient survival.[22, 24–26] This supports a comprehensive analysis of key actionable mutations when determining patient suitability for this treatment. During the study period (2012–2018), KRAS mutation was not considered targetable, as no approved therapies were available for clinical use at that time.
The therapeutic landscape changed significantly in 2021, when the first KRAS inhibitor demonstrated clinical efficacy and gained regulatory approval, thereby establishing KRAS as an actionable target.[38] This distinction is crucial for interpreting our results: while KRAS was not clinically targetable during the study period, it is now directly targetable and its presence has also been associated with reduced benefit from EGFR-TKI.[39] In current clinical practice, knowing that KRAS mutation is absent in EGFR-mutant disease improves treatment strategy by permitting more confident use of EGFR-targeting therapies.

In precision treatment for NSCLC, predicting exclusive EGFR positivity is important for several reasons. First, mutation exclusivity indicates that the tumour is primarily dependent on the EGFR-driven pathway, thus maximising the likelihood of a strong therapeutic response to EGFR-TKI. Second, by identifying patients with driver mutations other than EGFR, such as ALK and KRAS, we could avoid the use of ineffective or potentially toxic therapies in those cases. Third, in tissue-scarce scenarios, a high predicted probability of exclusive EGFR positivity can justify focused assays while avoiding delays from broad yet low-yield testing; it can also avoid futile therapy in cases where other non-EGFR actionable mutations are present and, in the contemporary setting, redirect such candidates toward appropriate targeted strategies in subsequent lines. Finally, with their more favourable survival profile, tumours harbouring exclusive EGFR mutations would benefit from more accurate disease prognostication when they are readily identified at the time of diagnosis.

In our study, all patients underwent mutational testing on tissue biopsy specimens, and as such, EGFR-RPV was developed in a cohort where tissue acquisition was feasible.
The aim of this study was not to replace tissue- or plasma-based molecular testing, but to evaluate the potential of a rapid, low-cost, non-invasive imaging signature to complement the current diagnostic workflow. When biopsy is performed as standard, EGFR-RPV could provide early molecular insights prior to the availability of biopsy-derived results, and offer a surrogate for longitudinal monitoring without repeated invasive procedures. EGFR-RPV can also provide complementary information in clinical scenarios where standard assays yield inconclusive results, such as when there is inadequate tissue DNA quality due to poor cellularity, necrosis, or degraded formalin-fixed paraffin-embedded (FFPE) material;[40] very low variant allele frequency (VAF) in tissue, below validated diagnostic thresholds;[41] or, in the case of liquid biopsy, low tumour fraction in circulating cfDNA, particularly in oligometastatic disease or protected compartments (e.g. brain), which can yield false-negative results.[42] Additionally, EGFR-RPV can help to adjudicate discordant tissue versus plasma results, acting as a tie-breaker.[43] In patients with negative or unavailable molecular testing but clinical features strongly suggestive of EGFR-mutant disease, such as female non-smokers with adenocarcinoma histology,[36] EGFR-RPV could provide clinical-decision guidance while confirmatory assays are pending. In the above contexts, the utility of EGFR-RPV lies not only in its non-invasiveness, but also in its speed, affordability, and potential to bridge gaps when conventional genomic testing is inconclusive.

EGFR-RPV is the first imaging biomarker to address the question of predicting exclusive EGFR positivity. Several radiomic biomarkers have previously been presented in the literature for EGFR mutation. We searched PubMed using the keywords “CT”, “radiomics”, “EGFR” and “Lung cancer” for original related radiomics studies published since 2012.
[13] We reviewed the search results and listed the most relevant studies in Table 2; most were small-scale studies, lacked external validation, and/or suffered from methodological shortfalls such as not meeting the recommendations of the Image Biomarker Standardisation Initiative (IBSI).[44] None dealt with the question of exclusive EGFR positivity that our study addresses.

Previously, the radiomic features found to be predictive of EGFR mutation were from multiple feature classes. For example, texture features such as gray level size zone matrix (GLSZM) and wavelet-transformed features have been consistently included in published radiomic biomarkers for actionable EGFR mutations.[45–48] In our study, most constituent features derived from the peri-lesional ROI, which is consistent with the hypothesised distribution of oncogenic cells harbouring driver mutations[31] and is supported radiologically on contrast-enhanced CT and ¹⁸F-fluorodeoxyglucose positron emission tomography (FDG-PET) imaging. Evaluating the constituent radiomic features for their biophysical significance, EGFR-RPV includes wavelet-transformed first-order, texture, fractal dimension (FD) and wavelet-LoG-transformed texture features from the tumour; wavelet-transformed first-order, texture, FD and wavelet-LoG-transformed texture features from the perilesional annulus; and a wavelet-transformed texture feature from the lung parenchyma, with the most positive weight attached to a wavelet-transformed first-order statistic (FOS Imedian LLH) from the perilesional annulus. A raised median intensity of the perilesional area can be associated with tissue invasion into the surrounding lung, as commonly observed radiologically in adenocarcinoma, the most common NSCLC histological subtype to harbour targetable EGFR mutations.
[49] Interestingly, the weight with the highest absolute value is negative and is attached to the only feature from the lung parenchyma (glcm Correl LLL), which suggests that this feature is negatively associated with the presence of exclusive EGFR mutation. The gray level co-occurrence matrix (GLCM) records how often pairs of gray-level intensities occur in neighbouring voxels, and its correlation feature measures the linear dependency between them; GLCM texture has been associated with the presence of pulmonary emphysema.[50] This finding would be consistent with the understanding that EGFR mutations are commonly found in adenocarcinomas and non-smokers.[49] With tobacco smoking commonly associated with squamous cell carcinoma of the lung,[51] an emphysema-associated radiomic feature could imply squamous cell characteristics not otherwise detected histologically in the sampled tissue, particularly in poorly differentiated or mixed NSCLC histology cases diagnosed on limited or necrotic tissue specimens.[52] This latter hypothesis, if proven, would support the use of radiomics to screen for squamous cell cancer features, which could in turn influence the patient’s initial clinical pathway.

The molecular analysis of paired bulk RNA data provides a complementary genomic perspective that supports the overarching aim of this study through a non-imaging approach. It reflects the central role of genomics in cancer biomarker research, where transcriptomic profiling has demonstrated substantial utility across diverse precision oncology applications[32, 33] and has contributed to key advances in radiomics for NSCLC.[34, 35] Incorporating transcriptomic analysis establishes a biologically grounded reference against which our imaging-based innovation can be contextualised. Notably, unlike EGFR-RPV, a genomics-based approach does not readily enable direct prediction of exclusive EGFR mutation. However, the transcriptomic findings offer mechanistic insight into the tumour biology underpinning this molecular subtype.
The observed upregulation of FAM190A and BCMO1 in exclusive EGFR-positive cases aligns with their respective roles in regulating aberrant cell division[53] and catalysing the conversion of β-carotene to vitamin A,[54] the latter being particularly relevant given epidemiological evidence linking β-carotene and vitamin A supplementation to elevated NSCLC risk.[55, 56] This genomic analysis therefore enriches the biological interpretability of EGFR-RPV and strengthens the framework supporting its clinical relevance.

The limitations of this study include its retrospective nature and the relatively limited size of the external testing set. While EGFR-RPV showed promising accuracy in predicting exclusive EGFR mutation status across internal and external validation datasets, further refinements are needed to strengthen its clinical applicability. There are some significant differences between the disease characteristics of the discovery and testing cohorts. For example, in the testing cohort, most patients (82.4%) had early-stage, non-metastatic NSCLC, whereas 41.6% of the discovery cohort presented with stage 3 or 4 disease. Additional statistically significant differences in histological subtype, PD-L1 expression and EGFR exclusive positivity are present between the two cohorts. The observed differences in AUCs, ROC curves, and Youden Index values between the development and testing cohorts likely reflect these dataset differences, which can influence model prediction and the optimal threshold for classification. Despite these differences, EGFR-RPV maintained good predictive performance across the cohorts, suggesting robustness to variations in these characteristics. Nevertheless, further external validation in diverse populations is necessary to confirm the generalisability of EGFR-RPV.

Radiomic features can be affected by the type of scanner, scanning protocol and reconstruction setting used.
[57] We ensured feature reproducibility by retaining only features that were repeatable in a test-retest experiment and met an ICC threshold. Furthermore, we used resampling, standardisation and feature harmonisation techniques, and validated EGFR-RPV in an external testing dataset acquired in a different country (USA) from the training data (UK), with different scanners and scanning protocols, as well as statistically significant differences in certain cohort characteristics (Table 1). Although the multi-institutional nature of our dataset reduces single-centre bias, residual heterogeneity in CT protocols and hardware could still affect feature stability and model generalisability, which remains an important consideration requiring continued methodological rigour in future work.

Table 1 Characteristics of patients included in the study and p-values showing statistical differences between the discovery and testing cohorts. Note: p-values were calculated using Wilcoxon rank-sum test for continuous variables, and chi-square test for categorical variables. Percentage figures are given in brackets, unless otherwise specified. ECOG, Eastern Cooperative Oncology Group; disease stage based on International Association for the Study of Lung Cancer (IASLC) 7th edition; TPS: tumour proportion score; exclusive EGFR mutation is defined as EGFR positivity in the absence of ALK and KRAS positivity. * denotes statistically significant difference.

| Characteristic | Discovery Cohort (n = 304), No. (%) | Testing Cohort (n = 51), No. (%) | p-value (Training vs Testing) |
|---|---|---|---|
| Age (years) | | | 0.2085 |
| Median | 67.6 | 69.6 | |
| Standard Deviation | 10.7 | 8.1 | |
| Range | 32–92 | 50–85 | |
| Sex | | | 0.1268 |
| Female | 130 (42.8) | 16 (31.3) | |
| Male | 174 (57.2) | 35 (68.7) | |
| ECOG Performance Status | | | |
| 0 | 145 (48.4) | | |
| 1 | 90 (30.0) | | |
| 2 | 43 (14.3) | | |
| 3 | 19 (6.3) | | |
| 4 | 2 (0.7) | | |
| Unknown | 1 (0.3) | | |
| T Stage | | | < 0.0* |
| 1 | 91 (30.5) | 21 (41.2) | |
| 2 | 81 (27.2) | 21 (41.2) | |
| 3 | 47 (15.8) | 5 (9.8) | |
| 4 | 77 (25.8) | 4 (7.8) | |
| Unknown | 2 (0.7) | 0 | |
| N Stage | | | < 0.05* |
| 0 | 113 (37.8) | 41 (80.4) | |
| 1 | 44 (14.7) | 5 (9.8) | |
| 2 | 78 (26.1) | 5 (9.8) | |
| 3 | 63 (21.1) | 0 | |
| Unknown | 1 (0.3) | 0 | |
| Metastases | | | < 0.05* |
| 0 | 154 (51.3) | 48 (94.1) | |
| 1 | 145 (48.3) | 3 (5.9) | |
| Unknown | 1 (0.3) | 0 | |
| Histological type | | | < 0.05* |
| Adenocarcinoma | 286 (94.1) | 45 (88.2) | |
| Other | 18 (5.9) | 5 (11.8) | |
| PD-L1 expression | | | < 0.05* |
| TPS < 1% | 252 (82.9) | 28 (54.9) | |
| TPS ≥ 1% | 52 (17.1) | 23 (45.1) | |
| EGFR mutation | | | 0.142 |
| Negative | 262 (86.2) | 38 (74.5) | |
| Positive | 42 (13.8) | 13 (25.5) | |
| Exclusive Positivity | 34 (11.2) | 12 (23.5) | < 0.05* |

Over the past decade, osimertinib has been established as the standard first-line treatment for patients with advanced or metastatic NSCLC harbouring actionable EGFR mutations. However, the field is now shifting: recent landmark trials, such as FLAURA-2[58] and MARIPOSA,[59] have moved beyond the development of newer EGFR inhibitors, instead focusing on combination strategies to further enhance patient outcomes. Findings from these trials would support an extension of this work to scenarios such as osimertinib plus chemotherapy or amivantamab (bispecific antibody targeting EGFR and MET) +/- lazertinib (EGFR-TKI). MET co-alterations have emerged as a clinical challenge in EGFR-mutant NSCLC, with MET amplification recognised as a mechanism of resistance to EGFR-TKI monotherapy.[60] In our study, given its retrospective nature, an evaluation of EGFR-RPV for MET mutation status prediction was not feasible.
Although the Ion Torrent Hotspot Panel used in the discovery cohort provided limited coverage of MET hotspot mutations, it did not reliably capture the most relevant alterations, namely MET amplification and exon 14 skipping, and such data were also not provided in the testing cohort. Consequently, MET co-alterations could not be reliably evaluated in this study; future work should assess EGFR-RPV for predicting MET–EGFR co-mutational status, to ensure alignment with contemporary molecular oncology practice.

Other future work includes testing the biomarker prospectively and evaluating its utility in clinical practice. Additional pertinent outcome measures could include response to treatment. Given the expanding use of EGFR-TKI, we could consider developing a related biomarker for early-stage NSCLC treated with resection followed by adjuvant TKI. In patients with concomitant PD-L1 positivity and EGFR mutation, EGFR-TKI can be combined with ICI for improved systemic therapy, but this is associated with an increased incidence of immune-related adverse events, warranting further investigation.[61]

Methods

Data collection

This retrospective study was approved by the institutional review board (IRB) and Health Research Authority UK (HRA: 18HH4616), conducted in accordance with the Declaration of Helsinki, and adhered to the STROBE and REMARK guidelines. Informed consent was obtained from all study participants at the time of the image data acquisition, as in routine clinical practice. The requirement for separate consent for this study was waived by the IRB and HRA, due to the study’s retrospective and observational nature and use of de-identified patient data. The discovery cohort consisted of 304 patients with NSCLC (age (mean ± standard deviation): 67.6 ± 10.7, male:female [M:F] = 174:130) who underwent CT scans and tissue sampling followed by genetic testing at our multi-centre institution between February 2012 and July 2018.
An independent cohort of 51 patients (age: 69.6 ± 8.1, M:F = 35:16) from TCIA was used for external validation.[30] Patients in this cohort had mutational tests for actionable mutations and, additionally, bulk RNA sequencing of their tumour tissue specimens using a HiSeq 2500 (Illumina, San Diego, USA) system. Clinical data including patient demographics and tumour histology were collected from the electronic patient record. Actionable EGFR mutation was the primary study endpoint. Patient overall survival was documented up to 3 years post-diagnosis.[62] It is defined as the time from the baseline diagnostic CT to 3-year follow-up or patient death from any cause, whichever occurred earlier. We excluded cases with tumour histology other than NSCLC, missing clinical or molecular data, or non-contrast or thick axial slice (> 3 mm) scans. Patient flow diagrams and characteristics are presented in Fig. 6A and Table 1, respectively.

Table 2 Key radiomics literature on EGFR mutation prediction. Performance metric given in AUROC, unless stated otherwise.

| Study | Training | Validation | Performance* | Feature Type | Features | Limitation |
|---|---|---|---|---|---|---|
| Le et al.[47] | TCIA (n = 143) | Internal (n = 18) | 0.778 | Hand-crafted features | Wavelet, first order energy | No external |
| Moreno et al.[48] | TCIA (n = 83) | Internal (80:20) | 0.857 | Hand-crafted and deep learning features | Texture | No external |
| Wu et al.[45] | Local (n = 67) | Cross-validation only | 0.882 | Hand-crafted features | Shape/surface volume ratio, texture, wavelet features | No external |
| Zhang et al.[46] | Local (n = 297) | Independent dataset (n = 127) | 0.753 | Hand-crafted features | First-order, texture, wavelet features | Only portal venous phase scans routinely performed |

Genetic testing

In the discovery cohort, DNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue using the Qiagen QIAsymphony DSP DNA Mini Kit (Qiagen N.V., Hilden, Germany).
[63] Mutational screening was performed by next-generation sequencing using the Ion Torrent Hotspot Panel (Ion Torrent Systems, now part of Thermo Fisher Scientific, South San Francisco, CA, USA). The assay comprised 207 amplicons in 50 oncogenes frequently mutated in solid tumours, including EGFR, KRAS, NRAS, BRAF and PIK3CA. Reference sequences NM_005228.3, NM_004985.3, NM_002524.4, NM_004333.4 and NM_006218.2 were used to screen the EGFR, KRAS, NRAS, BRAF and PIK3CA genes, respectively. Fluorescence in situ hybridisation (FISH) was used in parallel for ALK rearrangements (EML4-ALK translocation).

In the testing cohort, EGFR, KRAS and ALK mutation status were available. Single-nucleotide variant detection was performed using the SNaPshot assay (Applied Biosystems, now Thermo Fisher Scientific, Waltham, MA, USA).[30] EGFR mutations were assessed in exons 18–21 and KRAS mutations in exon 2. ALK rearrangements were evaluated using FISH for detection of EML4–ALK translocations. We defined exclusive EGFR positivity as EGFR positivity in the absence of ALK and KRAS mutation; we considered the EGFR, ALK and KRAS mutations that were actionable at the time of writing and covered in both cohorts.

Image acquisition

All patients in the discovery cohort had contrast-enhanced chest CTs demonstrating a primary NSCLC at the time of diagnosis. The three centres (A, B, C) at our institution used different scanners (Site A: Siemens Definition AS+; Site B: Philips Brilliance and Philips Ingenuity; Site C: Siemens Definition AS+) and institutional scanning protocols, with a peak kilovoltage (kVp) ranging from 100–140 kVp (120 kVp), tube current of 120–650 mA (mean 200 mA) and slice thickness of 0.625–3 mm (median 1.5 mm), with contrast given in the portal venous phase. Scans were acquired with subjects in the supine position with arms at their sides, from the apex of the lung to the adrenal gland, within a single breath-hold.
Patients in the testing cohort received their CT scans at Stanford, USA,[30] performed using various scanners with 80–140 kVp (mean 120 kVp), 124–699 mA (mean 220 mA) and slice thickness of 0.625–3 mm (median 1.5 mm). Contrast enhancement phase, subject positioning, breath holding and scan coverage were similar to those in the discovery cohort.

Multi-label segmentation

Two chest radiologists with 13 and 9 years of experience, blinded to clinical and histological data, both reviewed all scans using mediastinal (width, 350 HU; level, 40 HU) and lung (width, 1500 HU; level, -600 HU) window settings. They performed semi-automated segmentation of the tumour, a peri-lesional annulus of 5 mm thickness, and a spherical parenchymal patch of 2 cm diameter in the same pulmonary lobe or, where no appreciable aerated lung remained there, in an ipsilateral lobe. This multi-region approach (Fig. 6B) is currently the mainstay in lung cancer radiomics workflow.[64] All delineations were made using 3DSlicer (Slicer Community, Boston, USA).[65]

Image processing and radiomic features extraction

After segmentation, the imaging data were resampled to a uniform voxel size of 1 × 1 × 2 mm and analysed for a total of 1,998 radiomic features from each scan (666 features per ROI), using in-house software (TexLab 2.0) in Matlab 2020b (MathWorks Inc., Natick, MA, USA).[15, 66] The computed features included those pertaining to tumour image intensity, shape and texture from the original, wavelet and Laplacian of Gaussian (LoG) transformed images, which are compliant with the Image Biomarker Standardisation Initiative (IBSI).[44, 67] We additionally extracted a texture descriptor not yet covered by IBSI, fractal dimension (FD), to capture complex spatial patterns not well characterised by traditional metrics.
[15, 68] FD was computed using a box-counting algorithm, which involved overlaying grids of varying box sizes on the ROIs and computing the number of boxes \(N(\epsilon)\) required to cover the object as a function of box size \(\epsilon\). The fractal dimension was then estimated as the slope of the linear regression line fitted to the log-log plot of \(\log N(\epsilon)\) versus \(\log(1/\epsilon)\) (equivalently, the negative of the slope against \(\log \epsilon\)).

The computed radiomic features were standardised to a mean of zero and a standard deviation (SD) of one. To further counter batch effects resulting from inter-scanner and inter-site variability, features were harmonised using ComBat, [69] in keeping with IBSI recommendations. [44, 67] Inter-observer reproducibility of the radiomic features was assessed using the intraclass correlation coefficient (ICC), based on a two-way random-effects model. A total of 1,452 features had an ICC of at least 0.8; these were deemed reproducible and carried forward to the dimensionality reduction and regression steps. Test-retest repeatability was assessed using an ICC cutoff of at least 0.9 on the publicly available RIDER dataset (n = 29), in which repeat CT scans were acquired 15 minutes apart for every participant. [70]

Model development and validation

Common dimensionality reduction and regression methods were benchmarked to select the best-performing combination, as assessed by predictive performance in the internal validation cohort (Fig. 6C). In this work, Spearman correlation-based feature selection combined with Least Absolute Shrinkage and Selection Operator (LASSO) regression yielded the best performance and was adopted to develop the predictive biomarker, EGFR-RPV.
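As an illustration of the box-counting estimate of fractal dimension described under image processing, the sketch below counts occupied boxes at several box sizes and fits the slope of \(\log N(\epsilon)\) against \(\log(1/\epsilon)\). It is a minimal Python sketch under stated assumptions, not the TexLab 2.0 implementation: the ROI is assumed to be a binary mask given as integer voxel coordinates, boxes are assumed cubic, and the function names and the box sizes (1, 2, 4, 8) are hypothetical choices made for illustration.

```python
import math


def box_count(voxels, box_size):
    """Count the cubic boxes of edge `box_size` that contain at least one voxel.

    `voxels` is an iterable of integer (x, y, z) coordinates inside the ROI
    (assumption: the ROI is represented as a binary mask).
    """
    occupied = {(x // box_size, y // box_size, z // box_size) for x, y, z in voxels}
    return len(occupied)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def fractal_dimension(voxels, box_sizes=(1, 2, 4, 8)):
    """Estimate FD as the slope of log N(eps) versus log(1/eps)."""
    voxels = list(voxels)
    log_inv_eps = [math.log(1.0 / size) for size in box_sizes]
    log_counts = [math.log(box_count(voxels, size)) for size in box_sizes]
    return fit_slope(log_inv_eps, log_counts)


# Sanity check on a synthetic object: a filled 16 x 16 x 16 cube is
# three-dimensional, so the estimate should be close to 3.
cube = [(x, y, z) for x in range(16) for y in range(16) for z in range(16)]
print(round(fractal_dimension(cube), 6))
```

A flat 16 × 16 sheet of voxels gives an estimate close to 2 and a straight line of voxels close to 1, which is a quick way to confirm the sign convention of the regression.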
This study has a radiomics quality score of 23/36. [71] We performed univariable and multivariable logistic regression analyses to identify clinical features significantly associated with exclusive EGFR mutation status, and developed a nomogram based on EGFR-RPV and the clinical features found to be significant in these analyses, using the R package rms.

Statistical analysis

All statistical analyses and machine learning were performed using R version 4.3.0 (R Project for Statistical Computing, http://www.r-project.org/). All statistical tests were two-sided, with a significance threshold of p < 0.05 adopted throughout. Differences between cohorts were tested using analysis of variance for continuous variables and the chi-squared test for categorical variables. Kaplan-Meier plots were used to evaluate the utility of the model for patient prognostication, and the log-rank test was used to assess differences between survival curves. Receiver operating characteristic (ROC) analysis was used to assess the predictive performance of EGFR-RPV. Univariable and multivariable regression models were used to investigate the effects of EGFR-RPV and various clinical features on exclusive EGFR mutation status.

Genomics analysis

Bulk RNA reads were aligned to the human genome (hg19) using the STAR aligner, version 2.3, with 91 bases of splice junction overhang. The read counts were then normalised on a per-gene basis. We computed counts per million (CPM) using the R package edgeR, retained genes above a CPM threshold of 0.5, log2-transformed the retained values, and performed hierarchical clustering based on Euclidean distance, displayed as a heatmap. We derived a latent feature space using principal component analysis (PCA), followed by Uniform Manifold Approximation and Projection (UMAP), and performed differential expression analysis to identify the genes most differentially expressed by exclusive EGFR mutation status.
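The CPM filtering and log2 transformation described above can be sketched as follows. This is a minimal Python illustration, not the edgeR code used in the study. It assumes that CPM is computed from raw library sizes without normalisation factors, that a gene is retained when its CPM reaches the threshold in at least one sample, and that a pseudocount of 1 is added before the log2 transform. The function names, the `min_samples` and `pseudocount` parameters, and the toy counts are all hypothetical.

```python
import math


def counts_per_million(counts):
    """Convert raw read counts to CPM.

    `counts` maps gene name -> list of raw counts, one entry per sample.
    Assumption: library size is the plain column sum, with no
    normalisation factors applied.
    """
    n_samples = len(next(iter(counts.values())))
    library_sizes = [
        sum(gene_counts[j] for gene_counts in counts.values())
        for j in range(n_samples)
    ]
    return {
        gene: [c * 1e6 / library_sizes[j] for j, c in enumerate(gene_counts)]
        for gene, gene_counts in counts.items()
    }


def filter_and_log2(cpm, threshold=0.5, min_samples=1, pseudocount=1.0):
    """Keep genes with CPM >= threshold in at least `min_samples` samples,
    then return log2(CPM + pseudocount) for the retained genes."""
    kept = {}
    for gene, values in cpm.items():
        n_expressed = sum(1 for v in values if v >= threshold)
        if n_expressed >= min_samples:
            kept[gene] = [math.log2(v + pseudocount) for v in values]
    return kept


# Toy example with two samples; GENE_C has no reads and is filtered out.
raw = {
    "GENE_A": [999_999, 1_999_998],
    "GENE_B": [1, 2],
    "GENE_C": [0, 0],
}
log_cpm = filter_and_log2(counts_per_million(raw))
print(sorted(log_cpm))
```

The retained log2 values would then feed the clustering, PCA and UMAP steps. The choice of `min_samples` materially changes how many genes survive filtering, so it should be set to match the actual analysis scripts.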
Declarations

Data availability

Our institutional study data (clinical, genetic and imaging) are retrospective in nature and protected through institutional compliance; they can be shared subject to specific institutional review board (IRB) requirements. Upon reasonable request, a data sharing agreement can be initiated between the interested parties and the clinical institution following institution-specific guidelines. The gene expression dataset analysed during the study is from the NSCLC Radiogenomics public domain dataset, available in the Gene Expression Omnibus (GEO) repository under accession number GSE103584. Its paired imaging and clinical data are deposited in The Cancer Imaging Archive (TCIA) and can be accessed through the NSCLC Radiogenomics collection at: https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/ .

Code availability

All R scripts and guidance for use can be found on GitHub: https://github.com/scat2801/egfr-rpv/ .

Conflict of interest

The authors declare no competing interests in the publication of this paper. MC sits on the Royal College of Radiologists AI steering committee; SJC and MC sit on the North West London Imaging Network AI Diagnostic Fund panel.

Funding

MC is funded by Medical Research Council (MRC) Clinician Scientist Fellowship WSCC_PB1626 and a North West London Pathology E&R grant; EOA received funding from the Imperial College London Biomedical Research Centre (ICL-BRC), the Experimental Cancer Medicines Centre (ECMC) and the National Cancer Imaging Translational Accelerator consortium (NCITA). EOA also acknowledges the MRC.
Author Contribution

M.C.: Conceptualisation, data curation, formal analysis, funding acquisition, project administration, resources and software, manuscript – original draft; S.J.C.: Data curation, funding acquisition, resources, supervision; K.L.: Software; P.V.: Investigation, resources; Y.H.: Data curation; A.C.: Resources; H.L.: Investigation; A.M.: Data curation; M.B.: Data curation; D.J.P.: Resources; D.P.: Resources; A.G.R.: Resources; E.O.A.: Conceptualisation, funding acquisition, resources, supervision. All authors contributed to manuscript – review & editing.

References

1. Cancer Research UK. Types of lung cancer (2019). https://www.cancerresearchuk.org/about-cancer/lung-cancer/stages-types-grades/types
2. Polanco, D. et al. Prognostic value of symptoms at lung cancer diagnosis: a three-year observational study. J. Thorac. Dis. 13, 1485 (2021).
3. Rothschild, S. I. Targeted Therapies in Non-Small Cell Lung Cancer – Beyond EGFR and ALK. Cancers (Basel) 7, 930–949 (2015).
4. Simeone, J. C., Nordstrom, B. L., Patel, K. & Klein, A. B. Treatment patterns and overall survival in metastatic non-small-cell lung cancer in a real-world, US setting. Future Oncol. 15, 3491–3502 (2019).
5. Araki, T., Kanda, S., Horinouchi, H. & Ohe, Y. Current treatment strategies for EGFR-mutated non-small cell lung cancer: from first line to beyond osimertinib resistance. Jpn. J. Clin. Oncol. 53, 547 (2023).
6. Hirsh, V. Managing treatment-related adverse events associated with EGFR tyrosine kinase inhibitors in advanced non-small-cell lung cancer. Curr. Oncol. 18, 126 (2011).
7. Gavralidis, A. & Gainor, J. F. Immunotherapy in EGFR-Mutant and ALK-Positive Lung Cancer: Implications for Oncogene-Driven Lung Cancer. Cancer J. 26, 517–524 (2020).
8. Lee, C. et al. Next-generation sequencing reveals novel resistance mechanisms and molecular heterogeneity in EGFR-mutant non-small cell lung cancer with acquired resistance to EGFR-TKIs. Lung Cancer 113, 106–114 (2017).
9. Czyzewski, A. ‘Virtual biopsy’ uses AI to help doctors assess lung cancer. Imperial News (2024). https://www.imperial.ac.uk/news/251593/virtual-biopsy-uses-ai-help-doctors/
10. Fairley, J. A. et al. Results of a worldwide external quality assessment of cfDNA testing in lung Cancer. BMC Cancer 22, 1–12 (2022).
11. Helman, E. et al. Cell-Free DNA Next-Generation Sequencing Prediction of Response and Resistance to Third-Generation EGFR Inhibitor. Clin. Lung Cancer 19, 518–530.e7 (2018).
12. Cucchiara, F. et al. Integrating Liquid Biopsy and Radiomics to Monitor Clonal Heterogeneity of EGFR-Positive Non-Small Cell Lung Cancer. Front. Oncol. 10 (2020).
13. Lambin, P. et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur. J. Cancer 48, 441–446 (2012).
14. Chen, M., Copley, S. J., Viola, P., Lu, H. & Aboagye, E. O. Radiomics and artificial intelligence for precision medicine in lung cancer treatment. Semin. Cancer Biol. 93, 97–113 (2023).
15. Chen, M. et al. A Novel Radiogenomics Biomarker for Predicting Treatment Response and Pneumotoxicity From Programmed Cell Death Protein or Ligand-1 Inhibition Immunotherapy in NSCLC. J. Thorac. Oncol. 18, 718–730 (2023).
16. Chen, M., Linton-Reid, K., Aboagye, E. O. & Copley, S. J. Translating Radiomics into Clinical Practice: A Step-by-Step Guide to Study Design and Evaluation. Clin. Radiol. 107053 (2025). https://doi.org/10.1016/J.CRAD.2025.107053
17. Liu, Y. et al. Radiomic Features Are Associated With EGFR Mutation Status in Lung Adenocarcinomas. Clin. Lung Cancer 17, 441–448.e6 (2016).
18. Wu, S., Shen, G., Mao, J. & Gao, B. CT Radiomics in Predicting EGFR Mutation in Non-small Cell Lung Cancer: A Single Institutional Study. Front. Oncol. 10, 542957 (2020).
19. Rossi, G. et al. Radiomic Detection of EGFR Mutations in NSCLC. Cancer Res. 81, 724–731 (2021).
20. Cheng, B. et al. Predicting EGFR mutation status in lung adenocarcinoma presenting as ground-glass opacity: utilizing radiomics model in clinical translation. Eur. Radiol. 32, 5869–5879 (2022).
21. Le, X., Elamin, Y. Y. & Zhang, J. New Actions on Actionable Mutations in Lung Cancers. Cancers (Basel) 15, 2917 (2023).
22. Barnet, M. B. et al. EGFR–Co-Mutated Advanced NSCLC and Response to EGFR Tyrosine Kinase Inhibitors. J. Thorac. Oncol. 12, 585–590 (2017).
23. Yang, J. J. et al. Lung cancers with concomitant EGFR mutations and ALK rearrangements: Diverse responses to EGFR-TKI and crizotinib in relation to diverse receptors phosphorylation. Clin. Cancer Res. 20, 1383–1392 (2014).
24. Zhang, Y. et al. The co-mutation of EGFR and tumor-related genes leads to a worse prognosis and a higher level of tumor mutational burden in Chinese non-small cell lung cancer patients. J. Thorac. Dis. 14, 185–193 (2022).
25. Peng, P. et al. Co-mutations of epidermal growth factor receptor and BRAF in Chinese non-small cell lung cancer patients. Ann. Transl. Med. 9, 1321 (2021).
26. Stockhammer, P. et al. Co-Occurring Alterations in Multiple Tumor Suppressor Genes Are Associated With Worse Outcomes in Patients With EGFR-Mutant Lung Cancer. J. Thorac. Oncol. 19, 240–251 (2024).
27. Chen, M. et al. Concurrent Driver Gene Mutations as Negative Predictive Factors in Epidermal Growth Factor Receptor-Positive Non-Small Cell Lung Cancer. EBioMedicine 42, 304–310 (2019).
28. Imyanitov, E. N., Iyevleva, A. G. & Levchenko, E. N. Molecular testing and targeted therapy for non-small cell lung cancer: Current status and perspectives. Crit. Rev. Oncol. Hematol. 157, 103194 (2021).
29. Shiri, I. et al. Next-Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Algorithms. Mol. Imaging Biol. 22, 1132–1148 (2020).
30. Bakr, S. et al. A radiogenomic dataset of non-small cell lung cancer. Sci. Data 5 (2018).
31. Jiménez-Sánchez, J. et al. Evolutionary dynamics at the tumor edge reveal metabolic imaging biomarkers. Proc. Natl. Acad. Sci. USA 118, e2018110118 (2021).
32. Hayasaka, K. et al. Clinical, Genomic, and Transcriptomic Features of Lung Adenocarcinoma With Uncommon EGFR Mutation. Clin. Lung Cancer 25, e43–e51 (2024).
33. Izumi, M. et al. Integrative single-cell RNA-seq and spatial transcriptomics analyses reveal diverse apoptosis-related gene expression profiles in EGFR-mutated lung cancer. Cell Death Dis. 15, 1–11 (2024).
34. Vanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022).
35. Chen, W., Qiao, X., Yin, S., Zhang, X. & Xu, X. Integrating Radiomics with Genomics for Non-Small Cell Lung Cancer Survival Analysis. J. Oncol. (2022).
36. Bieńkowski, M., Dziadziuszko, R. & Jassem, J. Complex EGFR mutations in non-small cell lung cancer: a distinct entity? J. Thorac. Dis. 14, 2738–2741 (2022).
37. Boch, C. et al. The frequency of EGFR and KRAS mutations in non-small cell lung cancer (NSCLC): routine screening data for central Europe from a cohort study. BMJ Open 3, e002560 (2013).
38. Skoulidis, F. et al. Sotorasib for Lung Cancers with KRAS p.G12C Mutation. N. Engl. J. Med. 384, 2371–2381 (2021).
39. Rachiglio, A. M. et al. The presence of concomitant mutations affects the activity of EGFR tyrosine kinase inhibitors in EGFR-mutant non-small cell lung cancer (NSCLC) patients. Cancers (Basel) 11 (2019).
40. Zhang, C. et al. The Performance of an Extended Next Generation Sequencing Panel Using Endobronchial Ultrasound-Guided Fine Needle Aspiration Samples in Non-Squamous Non-Small Cell Lung Cancer: A Pragmatic Study. Clin. Lung Cancer 24, e105 (2022).
41. Shin, H. T. et al. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. Nat. Commun. 8, 1–10 (2017).
42. Rolfo, C. D. et al. Measurement of ctDNA Tumor Fraction Identifies Informative Negative Liquid Biopsy Results and Informs Value of Tissue Confirmation. Clin. Cancer Res. 30, 2452–2460 (2024).
43. Bote-de Cabo, H. et al. Clinical Utility of Combined Tissue and Plasma Next-Generation Sequencing in Patients With Advanced, Treatment-Naïve NSCLC. JTO Clin. Res. Rep. 6, 100778 (2025).
44. Zwanenburg, A. et al. The image biomarker standardization initiative: Standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).
45. Wu, S., Shen, G., Mao, J. & Gao, B. CT Radiomics in Predicting EGFR Mutation in Non-small Cell Lung Cancer: A Single Institutional Study. Front. Oncol. 10, 542957 (2020).
46. Zhang, G. et al. Using Multi-phase CT Radiomics Features to Predict EGFR Mutation Status in Lung Adenocarcinoma Patients. Acad. Radiol. 31, 2591–2600 (2024).
47. Le, N. Q. K. et al. Machine learning-based radiomics signatures for EGFR and KRAS mutations prediction in non-small-cell lung cancer. Int. J. Mol. Sci. 22 (2021).
48. Moreno, S. et al. A Radiogenomics Ensemble to Predict EGFR and KRAS Mutations in NSCLC. Tomography 7, 154–168 (2021).
49. Li, H. et al. Frequency of well-identified oncogenic driver mutations in lung adenocarcinoma of smokers varies with histological subtypes and graduated smoking dose. Lung Cancer 79, 8–13 (2013).
50. Yang, J. et al. Novel Subtypes of Pulmonary Emphysema Based on Spatially-Informed Lung Texture Learning: The Multi-Ethnic Study of Atherosclerosis (MESA) COPD Study. IEEE Trans. Med. Imaging 40, 3652 (2021).
51. Stellman, S. D., Muscat, J. E., Hoffmann, D. H. & Wynder, E. L. Impact of filter cigarette smoking on lung cancer histology. Prev. Med. (Baltim) 26, 451–456 (1997).
52. Inamura, K. Update on Immunohistochemistry for the Diagnosis of Lung Cancer. Cancers (Basel) 10, 72 (2018).
53. Patel, K. et al. FAM190A deficiency creates a cell division defect. Am. J. Pathol. 183, 296–303 (2013).
54. Wang, Y. et al. Regulatory mechanisms of Beta-carotene and BCMO1 in adipose tissues: A gene enrichment-based bioinformatics analysis. Hum. Exp. Toxicol. 41 (2022).
55. Omenn, G. S. et al. Effects of a Combination of Beta Carotene and Vitamin A on Lung Cancer and Cardiovascular Disease. N. Engl. J. Med. 334, 1150–1155 (1996).
56. Kordiak, J., Bielec, F., Jabłoński, S. & Pastuszak-Lewandoska, D. Role of Beta-Carotene in Lung Cancer Primary Chemoprevention: A Systematic Review with Meta-Analysis and Meta-Regression. Nutrients 14 (2022).
57. Orlhac, F., Frouin, F., Nioche, C., Ayache, N. & Buvat, I. Validation of A Method to Compensate Multicenter Effects Affecting CT Radiomics. Radiology 291, 53–59 (2019). https://doi.org/10.1148/radiol.2019182023
58. Planchard, D. et al. Osimertinib with or without Chemotherapy in EGFR-Mutated Advanced NSCLC. N. Engl. J. Med. 389, 1935–1948 (2023).
59. Cho, B. C. et al. Amivantamab plus Lazertinib in Previously Untreated EGFR-Mutated Advanced NSCLC. N. Engl. J. Med. 391, 1486–1498 (2024).
60. Feldt, S. L. & Bestvina, C. M. The Role of MET in Resistance to EGFR Inhibition in NSCLC: A Review of Mechanisms and Treatment Implications. Cancers (Basel) 15, 2998 (2023).
61. Chan, D. W. K., Choi, H. C. W. & Lee, V. H. F. Treatment-Related Adverse Events of Combination EGFR Tyrosine Kinase Inhibitor and Immune Checkpoint Inhibitor in EGFR-Mutant Advanced Non-Small Cell Lung Cancer: A Systematic Review and Meta-Analysis. Cancers (Basel) 14, 2157 (2022).
62. Li, Q. et al. CT imaging features associated with recurrence in non-small cell lung cancer patients after stereotactic body radiotherapy. Radiat. Oncol. 12, 1–10 (2017).
63. Spinelli, M., Du Parcq, P., Gupta, N., Khorashad, J. & Viola, P. Coexistence of two missense mutations in the KRAS gene in adenocarcinoma of the lung: a possible indicator of poor prognosis. Pathologica 114, 221 (2022).
64. Khorrami, M. et al. Combination of peri- and intratumoral radiomic features on baseline CT scans predicts response to chemotherapy in lung adenocarcinoma. Radiol. Artif. Intell. 1 (2019).
65. Fedorov, A. et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn. Reson. Imaging 30, 1323–1341 (2012).
66. Lu, H. et al. A mathematical-descriptor of tumor-mesoscopic-structure from computed-tomography images annotates prognostic- and molecular-phenotypes of epithelial ovarian cancer. Nat. Commun. 10, 1–11 (2019).
67. Whybra, P. et al. The Image Biomarker Standardization Initiative: Standardized Convolutional Filters for Reproducible Radiomics and Enhanced Clinical Insights. Radiology 310 (2024).
68. Thawani, R. et al. Radiomics and radiogenomics in lung cancer: A review for the clinician. Lung Cancer 115, 34–41 (2018).
69. Mahon, R. N., Ghita, M., Hugo, G. D. & Weiss, E. ComBat harmonization for radiomic features in independent phantom and lung cancer patient computed tomography datasets. Phys. Med. Biol. 65, 015010 (2020).
70. Armato, S. G. et al. The Reference Image Database to Evaluate Response to Therapy in Lung Cancer (RIDER) Project: A Resource for the Development of Change Analysis Software. Clin. Pharmacol. Ther. 84, 448 (2008).
71. Lambin, P. et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 14, 749–762 (2017).

Additional Declarations

No competing interests reported.

Editorial decision: Revision requested, 12 Feb, 2026
Reviews received at journal: 11 Feb, 2026
Reviewers agreed at journal: 02 Feb, 2026
Reviewers invited by journal: 08 Dec, 2025
Submission checks completed at journal: 08 Dec, 2025
First submitted to journal: 08 Dec, 2025
We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8158721","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":557047111,"identity":"7ff4f8fe-e994-45a3-842d-5e244d5ee476","order_by":0,"name":"Mitchell Chen","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABFklEQVRIiWNgGAWjYDACZjCykGFg4AHyKiTAggcYCmCS2LQwgyQkeCBazsC0GODRwoCshbENJopHi247/8HPBRVALexnDz4unGcRbc7e+/DADwMGef4GHmMDLFrMDjMzS884A9TCk5dsPHObRO7OnuMGB3sMGAxnHOAxTsCuhY2Ztw2oRYLHTJoXqGXDjTSGAzwGDIwbGHiMD+DU8g+sxfw37xyglvvPGA7+MWCwx6+lAWILiAG0hY3hMNCWRJAWHA4zluY5JsHDxpMDZgD9ksZwWMZAInnGYbZirN4/f/DhZ54aGzl+9jOGQEZd7nb2Y8wf31TY2Pa3N2+WwBbKMMAGY0ANlsAZkZgAm1tGwSgYBaNgZAMAyh5QJ58KLtMAAAAASUVORK5CYII=","orcid":"","institution":"Imperial College London","correspondingAuthor":true,"prefix":"","firstName":"Mitchell","middleName":"","lastName":"Chen","suffix":""},{"id":557047113,"identity":"bb2bc660-a438-47cc-9a00-5f881a52546f","order_by":1,"name":"Susan J Copley","email":"","orcid":"","institution":"Imperial College Healthcare NHS Trust, Imperial College Healthcare NHS Trust, Hammersmith 
Hospital","correspondingAuthor":false,"prefix":"","firstName":"Susan","middleName":"J","lastName":"Copley","suffix":""},{"id":557047115,"identity":"8f157a1c-2ce2-461c-8a5e-e1450510c833","order_by":2,"name":"Kristofer Linton-Reid","email":"","orcid":"","institution":"Imperial College London","correspondingAuthor":false,"prefix":"","firstName":"Kristofer","middleName":"","lastName":"Linton-Reid","suffix":""},{"id":557047116,"identity":"ec2f508e-0f47-4573-ae83-3796a661dbc5","order_by":3,"name":"Patrizia Viola","email":"","orcid":"","institution":"North West London Pathology","correspondingAuthor":false,"prefix":"","firstName":"Patrizia","middleName":"","lastName":"Viola","suffix":""},{"id":557047117,"identity":"33a42e69-1825-4b52-84df-4fec45612476","order_by":4,"name":"Yidong Han","email":"","orcid":"","institution":"Imperial College Healthcare NHS Trust","correspondingAuthor":false,"prefix":"","firstName":"Yidong","middleName":"","lastName":"Han","suffix":""},{"id":557047118,"identity":"20169c69-14dd-48c9-96f6-abee6bff541d","order_by":5,"name":"Alessio Cortellini","email":"","orcid":"","institution":"Fondazione Policlinico Universitario Campus Bio-Medico","correspondingAuthor":false,"prefix":"","firstName":"Alessio","middleName":"","lastName":"Cortellini","suffix":""},{"id":557047119,"identity":"c1d1a72c-7067-4f16-87ef-b64256584a27","order_by":6,"name":"Haonan Lu","email":"","orcid":"","institution":"University of Hong Kong","correspondingAuthor":false,"prefix":"","firstName":"Haonan","middleName":"","lastName":"Lu","suffix":""},{"id":557047120,"identity":"4a735f3e-dfdf-492b-a065-b601058025c9","order_by":7,"name":"Aleksander Mani","email":"","orcid":"","institution":"Imperial College Healthcare NHS Trust","correspondingAuthor":false,"prefix":"","firstName":"Aleksander","middleName":"","lastName":"Mani","suffix":""},{"id":557047121,"identity":"8af9f007-7e3c-4514-a5c4-922a0a3d4455","order_by":8,"name":"Marize Bahket","email":"","orcid":"","institution":"Imperial 
College Healthcare NHS Trust","correspondingAuthor":false,"prefix":"","firstName":"Marize","middleName":"","lastName":"Bahket","suffix":""},{"id":557047122,"identity":"8fa9f7ba-3c7c-477f-ab02-9996befe238d","order_by":9,"name":"David J Pinato","email":"","orcid":"","institution":"Imperial College London","correspondingAuthor":false,"prefix":"","firstName":"David","middleName":"J","lastName":"Pinato","suffix":""},{"id":557047123,"identity":"a179e5fd-ecd3-4b21-9a1c-df6ab8a1821a","order_by":10,"name":"Danielle Power","email":"","orcid":"","institution":"Imperial College Healthcare NHS Trust","correspondingAuthor":false,"prefix":"","firstName":"Danielle","middleName":"","lastName":"Power","suffix":""},{"id":557047124,"identity":"0c241433-65ae-49f4-9304-99e3ba02d1a8","order_by":11,"name":"Andrea G Rockall","email":"","orcid":"","institution":"Imperial College London","correspondingAuthor":false,"prefix":"","firstName":"Andrea","middleName":"G","lastName":"Rockall","suffix":""},{"id":557047128,"identity":"c8b37285-844b-4746-8fed-8310fa9682bf","order_by":12,"name":"Eric O Aboagye","email":"","orcid":"","institution":"Imperial College London","correspondingAuthor":false,"prefix":"","firstName":"Eric","middleName":"O","lastName":"Aboagye","suffix":""}],"badges":[],"createdAt":"2025-11-19 21:38:09","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-8158721/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8158721/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1038/s41598-026-42948-4","type":"published","date":"2026-03-06T16:00:10+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":98423980,"identity":"d7136711-327d-4886-a8ed-39ad43e7d212","added_by":"auto","created_at":"2025-12-17 
16:32:49","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":3095878,"visible":true,"origin":"","legend":"","description":"","filename":"Manuscriptrevisedclean.docx","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/847b7f91443f4cb6fe5e9e11.docx"},{"id":97965647,"identity":"f32cf4c1-0d17-480e-a6d7-6ae7ef4747c9","added_by":"auto","created_at":"2025-12-11 09:50:00","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":14196,"visible":true,"origin":"","legend":"","description":"","filename":"4465b5a45fba4a5ea265c3715a06d92c.json","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/8434eed082f267759264013e.json"},{"id":98423646,"identity":"00878808-5c06-4f51-b4c0-dc27a49b6ebe","added_by":"auto","created_at":"2025-12-17 16:32:28","extension":"xml","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":159443,"visible":true,"origin":"","legend":"","description":"","filename":"4465b5a45fba4a5ea265c3715a06d92c1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/f1961d17ad725681b5e10499.xml"},{"id":98422556,"identity":"16524954-c792-4a99-b8f3-bd2ff4b15e34","added_by":"auto","created_at":"2025-12-17 16:31:12","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":176562,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/8067b6ac53d54a0c122d3a87.png"},{"id":98423108,"identity":"1e0c1fe4-1bc8-47c1-b31d-fa1b2463d7e4","added_by":"auto","created_at":"2025-12-17 
16:31:51","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":94363,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/353be586e2e5d9a0e48db182.png"},{"id":98422894,"identity":"c2934e88-f298-4a58-bb0f-41b6471ffa64","added_by":"auto","created_at":"2025-12-17 16:31:37","extension":"png","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":65988,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/09ed282db74e0d152790f45e.png"},{"id":98422921,"identity":"a799a619-777f-44d6-8000-09917557bf29","added_by":"auto","created_at":"2025-12-17 16:31:39","extension":"png","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":125604,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/146e4bcdbffdd8c757117fae.png"},{"id":98423636,"identity":"0e028250-49b2-4ae3-b9e6-eea5085f7afe","added_by":"auto","created_at":"2025-12-17 16:32:28","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":54834,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/a76129464d05f52a753bd29e.png"},{"id":97965661,"identity":"87301e45-a993-4969-9844-2aacbf156314","added_by":"auto","created_at":"2025-12-11 
09:50:01","extension":"png","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":148306,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/c045407e26122d969929e94b.png"},{"id":98423272,"identity":"3a3fe100-0c71-4c8d-b0c6-9be1734ab84b","added_by":"auto","created_at":"2025-12-17 16:32:02","extension":"xml","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":156383,"visible":true,"origin":"","legend":"","description":"","filename":"4465b5a45fba4a5ea265c3715a06d92c1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/4154b99e99a0e5b6c7a20f25.xml"},{"id":97965663,"identity":"9fc2dff5-afbb-41b4-9a47-7389051ef63b","added_by":"auto","created_at":"2025-12-11 09:50:01","extension":"html","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":175383,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/91eaf357e44c22996e7468cb.html"},{"id":98423318,"identity":"43c8887c-cf95-40e8-88dd-d077ea5c4519","added_by":"auto","created_at":"2025-12-17 16:32:06","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":748503,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eDevelopment and composition of EGFR-RPV. A. \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eEGFR-RPV development and testing pipeline. \u003c/em\u003e\u003cem\u003e\u003cstrong\u003eB. \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eEnriched radiomic features in EGFR-RPV; which were drawn from all three regions of interests (ROIs), with most number of features deriving from the annulus followed by lesion, from all classes including first-order, texture and fractal dimension (FD). 
Some features are related to wavelet or Laplacian-of-Gaussian (LoG) transformed images.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/a90d375d939272f3a467c652.jpeg"},{"id":97965649,"identity":"077439aa-a853-4bd1-ba4c-ec4d6c51c335","added_by":"auto","created_at":"2025-12-11 09:50:00","extension":"jpeg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":528622,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003ePerformance of EGFR-RPV for predicting exclusive EGFR mutation\u003c/strong\u003e\u003c/em\u003e\u003cem\u003e in A. internal validation set. B. external testing set, and C. patient stratification into high and low risk groups in the discovery cohort; three patients with missing survival information were excluded.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage2.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/ebb3ba6db2e30ba742e03f99.jpeg"},{"id":98423959,"identity":"e4e27930-a117-488b-bc97-d399b7056ff9","added_by":"auto","created_at":"2025-12-17 16:32:48","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":371397,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eClinico-radiomics statistical analysis and integration as a clinical-decision nomogram. \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eA. Univariable\u003c/em\u003e\u003cem\u003e\u003cstrong\u003e \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eand B. multivariable logistic regression of clinical features for their clinical predictive values for exclusive EGFR mutation status, showing the statistical significance of patient gender and EGFR-RPV. C. nomogram based on clinco-radiomic features for predicting exclusive EGFR mutation status. 
The red and blue lines show the calculation of predicted probability of the mutational status in two patients with otherwise identical key clinical characteristics (yellow line), showcasing two use cases.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/90ab93c3899bc43c8df5f400.jpeg"},{"id":98423241,"identity":"6544de0f-84d4-46f2-a3f4-b40f222ebafa","added_by":"auto","created_at":"2025-12-17 16:31:59","extension":"jpeg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":657166,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eTranscriptomics correlates of exclusive EGFR mutation status\u003c/strong\u003e\u003c/em\u003e\u003cstrong\u003e. \u003c/strong\u003e\u003cem\u003eA. Principal component analysis (PCA) of bulk RNA sequencing data showing the first two components accounting for most of the observed variance. B. Top constituent genes of these principal components. C. Unsupervised clustering based on these components do not satisfactorily stratify the patients based on their exclusive EGFR mutation status. D. Hierarchical clustering by Euclidean distance presented as a heatmap again failed to stratify for exclusive EGFR mutation. WT: wild type.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/6a40cf16c18b68fbe91985e2.jpeg"},{"id":97965657,"identity":"7c518cc0-9ffb-4652-9142-4bbd9681c149","added_by":"auto","created_at":"2025-12-11 09:50:01","extension":"jpeg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":305979,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eDifferential expression gene analysis for exclusive EGFR mutation status.\u003c/strong\u003e\u003c/em\u003e\u003cem\u003e A. 
Differentially expressed genes (DEGs) by exclusive EGFR mutation status. B. Distribution of the top two DEGs (FAM190A and BCMO1) on Uniform Manifold Approximation and Projection (UMAP). WT: wild type.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage5.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/203ead36aff1349f26c85d13.jpeg"},{"id":97965651,"identity":"3836b7a7-df93-486b-8b44-9889795b2cfe","added_by":"auto","created_at":"2025-12-11 09:50:00","extension":"jpeg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":788077,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003e\u003cstrong\u003eStudy methodology. \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eA. Flow diagrams of the discovery and external testing cohorts. \u003c/em\u003e\u003cem\u003e\u003cstrong\u003eB.\u003c/strong\u003e\u003c/em\u003e \u003cem\u003eMulti-label segmentation for radiomics feature extraction. \u003c/em\u003e\u003cem\u003e\u003cstrong\u003eC. \u003c/strong\u003e\u003c/em\u003e\u003cem\u003eOptimal dimensionality reduction and regression method selection, as assessed by the area under the receiver operating characteristic curve. In our case, the combination of LASSO and Spearman yielded the best performance and was adopted for use. 
ENet: elastic net; GLM: generalised linear model; LASSO: least absolute shrinkage and selection operator; KNN: k-nearest neighbour; RF: random forest; SVM: support vector machine; PLS: partial least squares; XGB: extreme gradient boosting; NNet: neural net; RFE: recursive feature elimination.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"floatimage6.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/fce008da10ee05cf8ed6b471.jpeg"},{"id":104251685,"identity":"40657410-a9cf-4ae3-aaee-ab0ff0e6a3c9","added_by":"auto","created_at":"2026-03-09 16:14:54","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":4630992,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-8158721/v1/447ad8e2-2831-4901-8eae-2b61c05f1311.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"A Radio-Genomics Biomarker for Precision Epidermal Growth Factor Receptor Mutation Targeting Therapy in Non-Small Cell Lung Cancer","fulltext":[{"header":"Introduction","content":"\u003cp\u003eLung cancer is the leading cause of cancer-related deaths worldwide, with non-small cell lung cancer (NSCLC) accounting for 80\u0026ndash;85% of cases.[\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e] Over 70% of NSCLC cases are diagnosed at advanced stages, carrying poor prognoses.[\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e] Tyrosine kinase inhibitors (TKIs) have an excellent response profile in cancers exhibiting certain oncogenic driver mutations, with those relating to epidermal growth factor receptor (EGFR) being the most common targets in NSCLC.[\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e] Newer generation EGFR-TKIs, such as osimertinib, are more effective against advanced and metastatic NSCLC, with prolonged patient survival, 
improved quality of life and fewer adverse events compared to chemotherapy,[\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e] and are now the first-line therapy for cancers harbouring suitable mutations.[\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]\u003c/p\u003e\u003cp\u003eIn precision EGFR-TKI clinical pathways, it is crucial to ensure timely commencement of therapies to maximise patient benefit, avoid unnecessary treatment-related adverse events,[\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e] and select the best first-line treatment. For example, in the case of immune checkpoint inhibitors (ICIs), increased pneumotoxicity has been observed in patients with EGFR mutation.[\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e] Treatment decisions are currently guided by tissue sampling followed by next generation sequencing (NGS),[\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e] which is burdened by procedural invasiveness, patient acceptance, tumour heterogeneity and quality of sampled tissue.[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] Up to 30% of patients do not have a suitable biopsy sample available.[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e] More recently, liquid biopsy with plasma-derived cell free DNA (cfDNA) analysed by droplet digital polymerase chain reaction (ddPCR) has been proposed as an alternative test,[\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e] but it is limited by confounding effects of non-tumour cfDNA from normal tissue necrosis, lysis of leukocytes after blood collection or clonal haematopoiesis, leading to its suboptimal specificity, and by the lack of mutation localisation in multi-focal disease.[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] An overall genotyping error rate of 
up to 11.1% has been reported for cfDNA.[\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e] These limitations highlight the need for strategies that can complement existing molecular testing to improve the delivery of precision EGFR-TKI therapy in NSCLC. Imaging-based biomarkers hold notable promise in this regard, given their non-invasiveness, broad availability, and low cost.\u003c/p\u003e\u003cp\u003eRadiomic features are quantitative metrics derived from imaging data and can non-invasively capture important disease information. [\u003cspan additionalcitationids=\"CR14 CR15\" citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e] Prior studies have demonstrated the utility of radiomics for predicting EGFR mutation, though they have been limited to single mutation prediction. [\u003cspan additionalcitationids=\"CR18 CR19\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e] In clinical practice, multiple actionable mutations are routinely tested in non-squamous NSCLC. [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e] Co-occurring mutations can be found in up to 12.9% of EGFR mutation-positive patients, [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e] including up to 3.9% with concomitant anaplastic lymphoma kinase (ALK) [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e] and 1.1% with Kirsten rat sarcoma viral oncogene homologue (KRAS) mutations. [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e] Co-mutation status is associated with increased treatment resistance to EGFR-TKI and worse patient survival. 
[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan additionalcitationids=\"CR25\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] Although radiomics studies have previously investigated the prediction of targetable mutations, including EGFR and KRAS, [\u003cspan additionalcitationids=\"CR28\" citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] the identification of \u003cem\u003eexclusive\u003c/em\u003e EGFR mutation, characterised by the absence of other key mutations (ALK and KRAS), remains unaddressed in the literature, despite its relevance as a clinical scenario favouring response to EGFR-TKIs. In this paper, we introduce a CT-based radiomic biomarker, EGFR-RPV, for the prediction of this important mutation profile.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003ePatient characteristics\u003c/h2\u003e\u003cp\u003eWe included NSCLC patients presenting to our multi-centre institution between February 2012 and July 2019 (n\u0026thinsp;=\u0026thinsp;304, age (mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation): 67.6\u0026thinsp;\u0026plusmn;\u0026thinsp;10.7, male: female [M:F]\u0026thinsp;=\u0026thinsp;174:130) as a discovery cohort, and a dataset [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] from the Cancer Imaging Archive (TCIA) (n\u0026thinsp;=\u0026thinsp;51, age: 69.6\u0026thinsp;\u0026plusmn;\u0026thinsp;8.1, M:F\u0026thinsp;=\u0026thinsp;35:16) for external testing. The discovery cohort was split 2:1 into training and internal validation sets, balanced for patient age, sex, and tumour histology. 
We define exclusive EGFR positivity as EGFR mutation positivity in the absence of ALK and KRAS mutations.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eRadiomics predictive vector\u003c/h3\u003e\n\u003cp\u003eThe biomarker development pipeline is presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA. EGFR-RPV was developed using multi-regional segmentation and comprehensive radiomic feature extraction (regions of interest (ROIs): lesion, perilesional annulus and lung parenchyma), followed by benchmarking of various dimensionality reduction and regression methods to identify the best-performing statistical learning pipeline.\u003c/p\u003e\u003cp\u003eEGFR-RPV is a 10-feature composite radiomics vector (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB) developed using Spearman and Least Absolute Shrinkage and Selection Operator (LASSO) methods. It predicts exclusive EGFR positivity with an accuracy of 0.77 (95% CI: 0.66\u0026ndash;0.88) and 0.71 (95% CI: 0.54\u0026ndash;0.89) in the internal validation and external testing sets, respectively (Figs.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA and \u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). The component features belong to first order and higher classes (texture and fractals) and are extracted from all three ROIs, with the largest number (n\u0026thinsp;=\u0026thinsp;5) from the peri-lesional area, consistent with the hypothesised distribution of oncogenic cells harbouring driver mutations. [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e] EGFR-RPV also delivers effective patient prognostic stratification into high and low risk groups (Cox hazard ratio (HR) 2.15, 95% CI 1.50\u0026ndash;3.08, log-rank p\u0026thinsp;\u0026lt;\u0026thinsp;0.001; Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC).
\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eLogistic regression analysis and clinico-radiomics integration\u003c/h3\u003e\n\u003cp\u003eWe performed univariable and multivariable logistic regression analyses to identify clinical features with a statistically significant association with exclusive EGFR mutation status (Figs.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA \u0026amp; \u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). Patient sex and EGFR-RPV were statistically significant (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) in both univariable and multivariable analyses. We developed a nomogram combining clinical features with EGFR-RPV to aid clinical decision making (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\n\u003ch3\u003eGenomics analysis\u003c/h3\u003e\n\u003cp\u003eGiven the wide use of genomics in cancer biomarker research, including its demonstrated utility in various precision oncology scenarios [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] and relevance to radiomics advancements for NSCLC [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e], we further performed RNA transcriptomics analysis to advance the understanding of the radio-genomics landscape of NSCLC in the context of exclusive EGFR positivity.\u003c/p\u003e\u003cp\u003eIn this study arm, we analysed the bulk RNA transcriptomics readouts from the NSCLC Radiogenomics dataset, an independent cohort of 51 
patients (Age: 69.6\u0026thinsp;\u0026plusmn;\u0026thinsp;8.1, M:F\u0026thinsp;=\u0026thinsp;35:16). [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] In the latent space formed by the two top ranked principal components which explain most data variance (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA\u0026amp; \u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB), we found exclusive EGFR positivity is not distinctly predicted by unsupervised clustering of the samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eC) nor by their hierarchical clustering by Euclidean distance on a heatmap (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). We discovered \u003cem\u003eFAM190A\u003c/em\u003e and \u003cem\u003eBCMO1\u003c/em\u003e genes to be most differentially expressed in cases with exclusive EGFR positivity (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eA), which are expressed in separate clusters on Uniform Manifold Approximation and Projection (UMAP) plot (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eB).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eTissue sampling followed by mutational profiling enables targeted treatment for NSCLC harbouring EGFR mutations, regardless of disease stage. 
Nevertheless, this approach is burdened by the invasiveness of tissue sampling procedure, tumour heterogeneity and quality of sampled tissue.[\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e] The more recently introduced liquid biopsy with plasma cfDNA is limited by false positives arising from non-tumour cfDNA, and a lack of mutation localisation in multi-focal disease.[\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] To tackle these challenges, we developed a novel, non-invasive, imaging biomarker for guiding clinical decisions using routinely acquired imaging data. It demonstrates good performance for predicting exclusive EGFR mutation and achieves effective patient prognostic stratification.\u003c/p\u003e\u003cp\u003eEGFR mutations are more prevalent in non-smoking, female and East Asian patients. [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e] The commonest actionable types of EGFR mutations include del19 (exon 19) and L858R (exon 21), which collectively constitute up to 90% of all EGFR mutations in NSCLC. [\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e] Previously, target mutations such as EGFR, ALK and KRAS were considered mutually exclusive, but their co-mutational status is becoming increasingly recognised for their association with increased treatment resistance to EGFR-TKI and worse patient survival. 
[\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan additionalcitationids=\"CR25\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e–\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e] This supports a comprehensive analysis of key actionable mutations when determining patient suitability for this treatment.\u003c/p\u003e\u003cp\u003eDuring the study period (2012–2018), KRAS mutation was not considered targetable, as no approved therapies were available for clinical use at that time. The therapeutic landscape changed significantly in 2021, when the first KRAS inhibitor demonstrated clinical efficacy and gained regulatory approval, thereby establishing KRAS as an actionable target. [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e] This distinction is crucial for interpreting our results: while KRAS was not clinically targetable during the study period, its presence is now directly targetable and has also been associated with reduced benefit from EGFR-TKI. [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] In current clinical practice, the knowledge of the absence of KRAS mutation in EGFR-mutant disease helps to improve treatment strategy by permitting more confident use of EGFR-targeting therapies.\u003c/p\u003e\u003cp\u003eIn precision treatment for NSCLC, the importance of predicting for exclusive EGFR positivity cannot be overstated. First, mutation exclusivity ensures the tumour is primarily dependent on the EGFR-driven pathway, thus maximising the likelihood of a strong therapeutic response to EGFR-TKI. By identifying patients with driver mutations other than EGFR, such as ALK and KRAS, we could avoid the use of ineffective or potentially toxic therapies in those cases. 
In tissue-scarce scenarios, a high predicted probability of exclusive EGFR positivity can justify focused assays while avoiding delays from broad yet low-yield testing; and it can avoid futile therapy in cases where other non-EGFR actionable mutations are present and, in the contemporary setting, redirect such candidates toward appropriate targeted strategies in subsequent lines. Finally, with their more favourable survival profile, tumours harbouring exclusive EGFR mutations would benefit from more accurate disease prognostication when they are readily identified at the time of diagnosis.\u003c/p\u003e\u003cp\u003eIn our study, all patients underwent mutational testing on tissue biopsy specimens, and as such, EGFR-RPV was developed in a cohort where tissue acquisition was feasible. The aim of this study was not to replace tissue- or plasma-based molecular testing, but to evaluate the potential of a rapid, low-cost, non-invasive imaging signature to complement the current diagnostic workflow. When biopsy is performed as standard, EGFR-RPV could provide early molecular insights prior to the availability of biopsy-derived results, and offer a surrogate for longitudinal monitoring without repeated invasive procedures. EGFR-RPV can also provide complementary information in clinical scenarios where standard assays yield inconclusive results, such as when there is inadequate tissue DNA quality due to poor cellularity, necrosis, or degraded FFPE material; [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e] very low variant allele frequency (VAF) in tissue below validated diagnostic thresholds; [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e] or in the case of liquid biopsy, low tumour fraction in circulating cfDNA, particularly in oligometastatic disease or protected compartments (e.g. brain) which can yield false-negative results. 
[\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e] Additionally, EGFR-RPV can help to adjudicate discordant tissue versus plasma results, acting as a tie-breaker. [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e] In patients with negative or unavailable molecular testing but clinical features strongly suggestive of EGFR-mutant disease, such as female sex, non-smoking status and adenocarcinoma histology, [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e] EGFR-RPV could provide clinical-decision guidance while confirmatory assays are pending. In the above contexts, the utility of EGFR-RPV lies not only in its non-invasiveness, but also in its speed, affordability, and potential to bridge gaps when conventional genomic testing is inconclusive.\u003c/p\u003e\u003cp\u003eTo our knowledge, EGFR-RPV is the first imaging biomarker to address the prediction of exclusive EGFR positivity.\u003c/p\u003e\u003cp\u003eSeveral radiomic biomarkers for EGFR mutation have previously been presented in the literature. We searched PubMed using the keywords “CT”, “radiomics”, “EGFR” and “Lung cancer” for original related radiomics studies published since 2012. [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] We reviewed the bibliometric search results and listed the most relevant ones in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e; most were limited to small scale studies, lacked external validation, and/or suffered from methodological shortfalls such as not meeting the recommendations stipulated by the Image Biomarker Standardisation Initiative. 
[\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e] None dealt with the question of exclusive EGFR positivity that our study addresses.\u003c/p\u003e\u003cp\u003ePreviously, the radiomic features found to be predictive of EGFR mutation were from multiple feature classes. For example, texture features such as gray level size zone matrix (\u003cem\u003eGLSZM\u003c/em\u003e) and wavelet transformed features have been consistently included in various published radiomic biomarkers for actionable EGFR mutations [\u003cspan additionalcitationids=\"CR46 CR47\" citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e–\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. In our study, we found the largest share of constituent features to derive from the peri-lesional ROI, which would be consistent with the hypothesised distribution of oncogenic cells harbouring driver mutations, [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e] and is supported radiologically on contrast enhanced CT and ¹⁸F-fluorodeoxyglucose-Positron Emission Tomography (FDG-PET) imaging.\u003c/p\u003e\u003cp\u003eEvaluating the specific enriched radiomic features for their biophysical significance, EGFR-RPV includes wavelet transformed first order, texture, fractal dimension (FD) and wavelet-LoG transformed texture features from the tumour; wavelet transformed first order, texture, FD and wavelet-LoG transformed texture features from the perilesional annulus; and a wavelet transformed texture feature from the lung parenchyma, with the most positive weight attached to a wavelet transformed first order statistic (\u003cem\u003eFOS Imedian LLH\u003c/em\u003e) from the perilesional annulus. A raised median intensity of the perilesional area can be associated with tissue invasion into the surrounding lung, as commonly observed radiologically in adenocarcinoma, the most common NSCLC histological subtype to harbour targetable EGFR mutations. 
[\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]\u003c/p\u003e\u003cp\u003eInterestingly, the highest absolute weight is a negative one attached to the only feature from the lung parenchyma (\u003cem\u003eglcm Correl LLL\u003c/em\u003e), which suggests that this lung parenchyma feature is negatively associated with the presence of exclusive EGFR mutation. The gray level co-occurrence matrix (\u003cem\u003eGLCM\u003c/em\u003e) quantifies how often pairs of gray-level intensities occur in neighbouring voxels within the image, with its correlation feature measuring the linear dependency between those gray levels, and has been associated with the presence of pulmonary emphysema [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. This finding would be consistent with the understanding that EGFR mutations are commonly found in adenocarcinomas and non-smokers. [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e] With tobacco smoking commonly associated with squamous cell carcinoma of the lung, [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e] an emphysema-associated radiomic feature could imply squamous cell characteristics not otherwise detected histologically in the sampled tissue, particularly in poorly differentiated or mixed NSCLC histology cases diagnosed on limited or necrotic tissue specimens [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e]. 
This latter hypothesis, if proven, would support the use of radiomics to screen for squamous cell cancer features, which could in turn influence the patient’s initial clinical pathway.\u003c/p\u003e\u003cp\u003eThe molecular analysis of paired bulk RNA data provides a complementary genomic perspective that supports the overarching aim of this study through a non-imaging approach, which reflects the central role of genomics in cancer biomarker research, where transcriptomic profiling has demonstrated substantial utility across diverse precision oncology applications [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e] and has contributed to key advances in radiomics for NSCLC [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. Incorporating transcriptomic analysis establishes a biologically grounded reference against which our imaging-based innovation can be contextualised. Notably, unlike EGFR-RPV, a genomics-based approach does not readily enable direct prediction of exclusive EGFR mutation. However, the transcriptomic findings offer mechanistic insight into the tumour biology underpinning this molecular subtype. The observed upregulation of \u003cem\u003eFAM190A\u003c/em\u003e and \u003cem\u003eBCMO1\u003c/em\u003e in exclusive EGFR-positive cases aligns with their respective roles in regulating aberrant cell division [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e] and catalysing the conversion of β-carotene to vitamin A, [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e] the latter being particularly relevant given epidemiological evidence linking β-carotene and vitamin A supplementation to elevated NSCLC risk. 
[\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e, \u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e] This genomic analysis therefore enriches the biological interpretability of EGFR-RPV and strengthens the framework supporting its clinical relevance.\u003c/p\u003e\u003cp\u003eThe limitations of this study include its retrospective nature and relatively limited size of the external testing set. While EGFR-RPV showed promising accuracy in predicting exclusive EGFR mutation status across internal and external validation datasets, further refinements are needed to strengthen its clinical applicability. There are some significant differences between the disease characteristics of the discovery and testing cohorts. For example, in the testing cohort, most patients (82.4%) had early-stage, non-metastatic NSCLC, whereas 41.6% of the discovery cohort presented with stage 3 or 4 disease. Additional statistically significant differences in histological subtype, PD-L1 expression and EGFR exclusive positivity are present between the two cohorts. The observed differences in AUCs, ROC curves, and Youden Index values between the development and testing cohorts likely reflect these dataset differences, which can influence model prediction and the optimal threshold for classification. Despite these differences, however, we note EGFR-RPV maintained good predictive performance across the cohorts, demonstrating good robustness to variations in these characteristics. Nevertheless, further external validation in diverse populations is necessary to confirm the generalisability of EGFR-RPV.\u003c/p\u003e\u003cp\u003eRadiomic features can be affected by the type of scanner, scanning protocol and reconstruction setting used. 
[\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e] We have ascertained feature reproducibility by including only reproducible features validated in a test-retest experiment, and features meeting an ICC threshold. Furthermore, we have used resampling, standardisation, and feature harmonisation techniques, and validated EGFR-RPV in an external testing dataset acquired in a different country (USA) from training (UK), with different scanners and scanning protocols, as well as statistically significant differences in certain cohort characteristics (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Although the multi-institutional nature of our dataset reduces single-centre bias, residual heterogeneity in CT protocols and hardware could still affect feature stability and model generalisability, which remains an important consideration requiring continued methodological rigour in future works.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003e\u003cem\u003eCharacteristics of patients included in the study and p-values showing statistical differences between the discovery and testing cohorts. Note: p-values were calculated using Wilcoxon rank-sum test for continuous variables, and chi-square test for categorical variables. Percentage figures are given in brackets, unless otherwise specified. 
ECOG: Eastern Cooperative Oncology Group; disease stage based on International Association for the Study of Lung Cancer (IASLC) 7th edition; TPS: tumour proportion score; exclusive EGFR mutation is defined as EGFR mutation positivity in the absence of ALK and KRAS positivity. *denotes statistically significant difference\u003c/em\u003e\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eDiscovery Cohort\u003c/p\u003e\u003cp\u003e(n = 304) No. (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eTesting Cohort\u003c/p\u003e\u003cp\u003e(n = 51) No. (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003ep-value Discovery vs Testing\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAge (years)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMean\u003c/p\u003e\u003cp\u003eStandard Deviation\u003c/p\u003e\u003cp\u003eRange\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e67.6\u003c/p\u003e\u003cp\u003e10.7\u003c/p\u003e\u003cp\u003e32–92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e69.6\u003c/p\u003e\u003cp\u003e8.1\u003c/p\u003e\u003cp\u003e50–85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.2085\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eSex\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFemale\u003c/p\u003e\u003cp\u003eMale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e130 (42.8)\u003c/p\u003e\u003cp\u003e174 (57.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e16 (31.3)\u003c/p\u003e\u003cp\u003e35 (68.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.1268\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eECOG Performance Status\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e0\u003c/p\u003e\u003cp\u003e1\u003c/p\u003e\u003cp\u003e2\u003c/p\u003e\u003cp\u003e3\u003c/p\u003e\u003cp\u003e4\u003c/p\u003e\u003cp\u003eUnknown\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e145 (48.4)\u003c/p\u003e\u003cp\u003e90 (30.0)\u003c/p\u003e\u003cp\u003e43 (14.3)\u003c/p\u003e\u003cp\u003e19 (6.3)\u003c/p\u003e\u003cp\u003e2 (0.7)\u003c/p\u003e\u003cp\u003e1 (0.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eT 
Stage\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e1\u003c/p\u003e\u003cp\u003e2\u003c/p\u003e\u003cp\u003e3\u003c/p\u003e\u003cp\u003e4\u003c/p\u003e\u003cp\u003eUnknown\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e91 (30.5)\u003c/p\u003e\u003cp\u003e81 (27.2)\u003c/p\u003e\u003cp\u003e47 (15.8)\u003c/p\u003e\u003cp\u003e77 (25.8)\u003c/p\u003e\u003cp\u003e2 (0.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e21 (41.2)\u003c/p\u003e\u003cp\u003e21 (41.2)\u003c/p\u003e\u003cp\u003e5 (9.8)\u003c/p\u003e\u003cp\u003e4 (7.8)\u003c/p\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eN Stage\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e0\u003c/p\u003e\u003cp\u003e1\u003c/p\u003e\u003cp\u003e2\u003c/p\u003e\u003cp\u003e3\u003c/p\u003e\u003cp\u003eUnknown\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e113 (37.8)\u003c/p\u003e\u003cp\u003e44 (14.7)\u003c/p\u003e\u003cp\u003e78 (26.1)\u003c/p\u003e\u003cp\u003e63 (21.1)\u003c/p\u003e\u003cp\u003e1 (0.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e41 (80.4)\u003c/p\u003e\u003cp\u003e5 
(9.8)\u003c/p\u003e\u003cp\u003e5 (9.8)\u003c/p\u003e\u003cp\u003e0\u003c/p\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eMetastases\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e0\u003c/p\u003e\u003cp\u003e1\u003c/p\u003e\u003cp\u003eUnknown\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e154 (51.3)\u003c/p\u003e\u003cp\u003e145 (48.3)\u003c/p\u003e\u003cp\u003e1 (0.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e48 (94.1)\u003c/p\u003e\u003cp\u003e3 (5.9)\u003c/p\u003e\u003cp\u003e0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eHistological type\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAdenocarcinoma\u003c/p\u003e\u003cp\u003eOther\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e286 (94.1)\u003c/p\u003e\u003cp\u003e18 (5.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e45 (88.2)\u003c/p\u003e\u003cp\u003e5 
(11.8)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003ePD-L1 expression\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eTPS \u0026lt; 1%\u003c/p\u003e\u003cp\u003eTPS ≥ 1%\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e252 (82.9)\u003c/p\u003e\u003cp\u003e52 (17.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e28 (54.9)\u003c/p\u003e\u003cp\u003e23 (45.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eEGFR\u003c/b\u003e \u003cb\u003emutation\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003cp\u003ePositive\u003c/p\u003e\u003cp\u003e\u003cb\u003eExclusive Positivity\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e262 (86.2)\u003c/p\u003e\u003cp\u003e42 (13.8)\u003c/p\u003e\u003cp\u003e34 (11.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e38 (74.5)\u003c/p\u003e\u003cp\u003e13 (25.5)\u003c/p\u003e\u003cp\u003e12 
(23.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.142\u003c/p\u003e\u003cp\u003e\u0026lt; 0.05*\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eOver the past decade, osimertinib has been established as the standard first-line treatment for patients with advanced or metastatic NSCLC harbouring actionable EGFR mutations. However, the field is now shifting: recent landmark trials, such as FLAURA-2[\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e] and MARIPOSA,[\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e] have moved beyond the development of newer EGFR inhibitors, instead focusing on combination strategies to further enhance patient outcomes. Findings from these trials would support an extension of this work to scenarios such as osimertinib plus chemotherapy, or amivantamab (a bispecific antibody targeting EGFR and MET) with or without lazertinib (an EGFR-TKI).\u003c/p\u003e\u003cp\u003eMET co-alterations have emerged as a clinical challenge in EGFR-mutant NSCLC, with MET amplification recognised as a mechanism of resistance to EGFR-TKI monotherapy.[\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e] In our study, given its retrospective nature, an evaluation of EGFR-RPV for MET mutation status prediction was not feasible. Although the Ion Torrent Hotspot Panel used in the discovery cohort provided limited coverage of MET hotspot mutations, it did not reliably capture the most relevant alterations, namely MET amplification and exon 14 skipping, and such data were also not available for the testing cohort. 
Consequently, MET co-alterations could not be reliably evaluated in this study; future work should assess whether EGFR-RPV can predict MET–EGFR co-mutational status, to ensure alignment with contemporary molecular oncology practice.\u003c/p\u003e\u003cp\u003eOther future work includes testing the biomarker prospectively and evaluating its utility in clinical practice. Additional pertinent outcome measures could include response to treatment. Given the expanding use of EGFR-TKI, we could consider developing a related biomarker for early-stage NSCLC treated with resection followed by adjuvant TKI. In patients with concomitant PD-L1 positivity and EGFR mutation, EGFR-TKI can be combined with ICI for improved systemic therapy, but this is associated with an increased incidence of immune-related adverse events, which warrants further investigation. [\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e]\u003c/p\u003e"},{"header":"Methods","content":"\u003ch2\u003eData collection\u003c/h2\u003e\u003cp\u003eThis retrospective study was approved by the institutional review board (IRB) and Health Research Authority UK (HRA: 18HH4616), conducted in accordance with the Declaration of Helsinki, and adhered to the STROBE and REMARK guidelines. Informed consent was obtained from all study participants at the time of the image data acquisition, as in routine clinical practice. The requirement for separate consent for this study was waived by the IRB and HRA, due to the study’s retrospective and observational nature and use of de-identified patient data.\u003c/p\u003e\u003cp\u003eThe discovery cohort consisted of 304 patients with NSCLC (age, mean ± standard deviation: 67.6 ± 10.7 years; male:female [M:F] = 174:130) who underwent CT scans and tissue sampling followed by genetic testing at our multi-centre institution between February 2012 and July 2018. 
An independent cohort of 51 patients (age: 69.6 ± 8.1 years; M:F = 35:16) from TCIA was used for external validation.[\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] Patients in this cohort underwent testing for actionable mutations and, additionally, bulk RNA sequencing of their tumour tissue specimens using a HiSeq 2500 (Illumina, San Diego, USA) system.\u003c/p\u003e\u003cp\u003eClinical data including patient demographics and tumour histology were collected from the electronic patient record. Actionable EGFR mutation was the primary study endpoint. Patient overall survival was documented up to 3 years post-diagnosis.[\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e] It was defined as the time from the baseline diagnostic CT to 3-year follow-up or death from any cause, whichever occurred earlier. We excluded cases with tumour histology other than NSCLC, missing clinical or molecular data, or with non-contrast or thick axial slice (\u0026gt; 3 mm) scans. 
Patient flow diagrams and characteristics are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA and Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e, respectively.\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eKey radiomics literature on EGFR mutation prediction. 
Performance metric given in AUROC, unless stated otherwise.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"7\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eStudy\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTraining\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eValidation\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003ePerformance*\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eFeature Type\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eFeatures\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003eLimitation\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eLe et al. [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTCIA (n = 143)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eInternal (n = 18)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.778\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eHand-crafted features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eWavelet, first order energy\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eNo external\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMoreno et al. 
[\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTCIA (n = 83)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eInternal (80:20)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.857\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eHand-crafted and deep learning features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eTexture\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eNo external\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eWu et al. [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLocal (n = 67)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCross-validation only\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.882\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eHand-crafted features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eShape/surface volume ratio, texture, wavelet features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eNo external\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eZhang et al.[\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eLocal (n = 297)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eIndependent dataset (n = 127)\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.753\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eHand-crafted features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003eFirst-order, texture, wavelet features\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003eOnly portal venous phase scans routinely performed\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003ch3\u003eGenetic testing\u003c/h3\u003e\u003cp\u003eIn the discovery cohort, DNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue using the QIAsymphony DSP DNA Mini Kit (Qiagen N.V., Hilden, Germany). [\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e] Mutational screening was performed by next-generation sequencing using the Ion Torrent Hotspot Panel (Ion Torrent Systems, now part of Thermo Fisher Scientific, South San Francisco, CA, USA). The assay comprised 207 amplicons in 50 oncogenes frequently mutated in solid tumours, including EGFR, KRAS, NRAS, BRAF, and PIK3CA. Reference sequences NM_005228.3, NM_004985.3, NM_002524.4, NM_004333.4, and NM_006218.2 were used to screen the EGFR, KRAS, NRAS, BRAF and PIK3CA genes, respectively. Fluorescence in situ hybridisation (FISH) was used in parallel to detect ALK rearrangements (EML4-ALK translocation).\u003c/p\u003e\u003cp\u003eIn the testing cohort, EGFR, KRAS, and ALK mutation status were available. Single-nucleotide variant detection was performed using the SNaPshot assay (Applied Biosystems, now Thermo Fisher Scientific, Waltham, MA, USA). [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] EGFR mutations were assessed in exons 18–21, and KRAS mutations in exon 2. 
ALK rearrangements were evaluated using FISH for detection of EML4–ALK translocations.\u003c/p\u003e\u003cp\u003eWe defined exclusive EGFR positivity as EGFR positivity in the absence of ALK and KRAS mutation. Only EGFR, ALK, and KRAS mutations, which were actionable at the time of writing and covered in both cohorts, were considered.\u003c/p\u003e\u003ch2\u003eImage acquisition\u003c/h2\u003e\u003cp\u003eAll patients in the discovery cohort had contrast-enhanced chest CTs demonstrating a primary NSCLC at the time of diagnosis. The three centres (A, B, C) at our institution used different scanners (Site A: Siemens Definition AS+; Site B: Philips Brilliance and Philips Ingenuity; Site C: Siemens Definition AS+) and institutional scanning protocols, with a peak kilovoltage (kVp) of 100–140 kVp (120 kVp), tube current of 120–650 mA (mean 200 mA), slice thickness of 0.625–3 mm (median: 1.5 mm), and contrast given in the portal venous phase. Scans were acquired with subjects in the supine position with arms at their sides, from the apex of the lung to the adrenal gland, within a single breath-hold.\u003c/p\u003e\u003cp\u003ePatients in the testing cohort received their CT scans at Stanford, USA, [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e] performed using various scanners with 80–140 kVp (mean 120 kVp), 124–699 mA (mean 220 mA) and slice thickness of 0.625–3 mm (median: 1.5 mm). 
Contrast enhancement phase, subject positioning, breath holding and scan coverage were similar to those in the discovery cohort.\u003c/p\u003e\u003ch2\u003eMulti-label segmentation\u003c/h2\u003e\u003cp\u003eTwo chest radiologists with 13 and 9 years of experience, blinded to clinical and histological data, both reviewed all scans using mediastinal (width, 350 HU; level, 40 HU) and lung (width, 1500 HU; level, -600 HU) window settings, and performed semi-automated segmentation of three regions: the tumour, a peri-lesional annulus of 5 mm thickness, and a spherical parenchymal patch of 2 cm diameter in the same lobe or, where no appreciable aerated lung remained in that lobe, in another ipsilateral lobe. This multi-region approach (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB) is currently the mainstay in lung cancer radiomics workflow. [\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e] All delineations were made using 3D Slicer (Slicer Community, Boston, USA). [\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e]\u003c/p\u003e\u003ch2\u003eImage processing and radiomic feature extraction\u003c/h2\u003e\u003cp\u003eAfter segmentation, the imaging data were resampled to a uniform voxel size of 1 × 1 × 2 mm, and a total of 1,998 radiomic features were extracted from each scan (666 features per ROI) using in-house software (TexLab 2.0) in MATLAB 2020b (MathWorks Inc., Natick, MA, USA). [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e]\u003c/p\u003e\u003cp\u003eThe computed features included ones pertaining to tumour image intensity, shape and texture from the original, wavelet and Laplacian of Gaussian (LoG) transformed images, which are compliant with the Image Biomarker Standardisation Initiative (IBSI). 
[\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e] We additionally extracted a texture descriptor not yet covered by IBSI, fractal dimension (FD), to capture complex spatial patterns not well characterised by traditional metrics. [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e] FD was computed using a box-counting algorithm, which involved overlaying grids of varying box sizes over the ROIs and counting the number of boxes \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(N\\left(\\epsilon\\right)\\)\u003c/span\u003e\u003c/span\u003e required to cover the object as a function of box size \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\epsilon\\)\u003c/span\u003e\u003c/span\u003e. The fractal dimension was then estimated as the slope of the linear regression line fitted to the log-log plot of \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\log\\left(N\\left(\\epsilon\\right)\\right)\\)\u003c/span\u003e\u003c/span\u003e versus \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\(\\log\\left(1/\\epsilon\\right)\\)\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eThe computed radiomic features were standardised to a mean of zero and standard deviation (SD) of one. To further counter batch effects resulting from inter-scanner and inter-site variabilities, features were harmonised using ComBat, [\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e] in keeping with IBSI recommendations. 
[\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e]\u003c/p\u003e\u003cp\u003eInter-observer radiomic feature reproducibility was assessed by calculating the intraclass correlation coefficient (ICC), on the basis of a two-way random model. A total of 1,452 features had an ICC greater than or equal to 0.8; these were deemed reproducible and included in subsequent dimensionality reduction and regression steps. Test-retest repeatability was assessed using a cutoff ICC of greater than or equal to 0.9, based on the publicly available RIDER dataset (n = 29), where repeated CT scans were taken 15 minutes apart for every participant.[\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e]\u003c/p\u003e\u003ch2\u003eModel development and validation\u003c/h2\u003e\u003cp\u003eCommon dimensionality reduction and regression methods were benchmarked to select the combination with the best predictive performance in the internal validation cohort (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC). In this work, the combination of Spearman correlation and Least Absolute Shrinkage and Selection Operator (LASSO) yielded the best performance and was adopted to develop the predictive biomarker, EGFR-RPV. 
This study has a radiomics quality score of 23/36.[\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e]\u003c/p\u003e\u003cp\u003eWe performed univariable and multivariable logistic regression analyses to identify clinical features with a statistically significant association with exclusive EGFR mutation status, and developed a nomogram based on EGFR-RPV and the clinical features deemed statistically significant in these analyses, using the R package \u003cem\u003erms\u003c/em\u003e.\u003c/p\u003e\u003ch2\u003eStatistical analysis\u003c/h2\u003e\u003cp\u003eAll statistical analyses and machine learning were performed using R version 4.3.0 (R Project for Statistical Computing, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.r-project.org/\u003c/span\u003e\u003cspan address=\"http://www.r-project.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The statistical tests were two-sided, with a p-value threshold of significance at 5% adopted throughout. Differences between cohorts were tested using the analysis of variance test for continuous variables and the chi-square test for categorical variables. Kaplan-Meier plots were used to evaluate the utility of the model for patient prognostication, and the log-rank test was used to assess differences between survival curves. Receiver operating characteristic (ROC) analysis was used to assess the predictive performance of EGFR-RPV. Univariable and multivariable regression models were used to investigate the effects of EGFR-RPV and various clinical features on exclusive EGFR mutation.\u003c/p\u003e\u003ch2\u003eGenomics analysis\u003c/h2\u003e\u003cp\u003eBulk RNA reads were aligned to the human genome (hg19) using the alignment algorithm STAR version 2.3 with 91 bases of splice junction overhangs. Next, the readouts were normalised on an individual gene basis. 
We computed counts per million (CPM) using the R package \u003cem\u003eedgeR\u003c/em\u003e, retained readouts meeting a CPM threshold of 0.5, applied a \u003cem\u003elog2\u003c/em\u003e transformation, and performed hierarchical clustering with a heatmap based on Euclidean distance. We derived the feature latent space using principal component analysis (PCA) followed by Uniform Manifold Approximation and Projection (UMAP), and performed differential expression analysis to identify the most differentially expressed genes according to exclusive EGFR mutation status.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\u003ch2\u003eData availability\u003c/h2\u003e\u003cp\u003eOur institutional study data (clinical, genetics and imaging) are retrospective in nature and protected through institutional compliance, and can be shared as per specific institutional review board (IRB) requirements. Upon reasonable request, a data sharing agreement can be initiated between the interested parties and the clinical institution following institution-specific guidelines.\u003c/p\u003e\u003cp\u003eThe gene expression dataset analysed during the study is from the NSCLC Radiogenomics public domain dataset, available in the Gene Expression Omnibus (GEO) repository under accession number GSE103584. 
Its paired imaging and clinical data are deposited in The Cancer Imaging Archive (TCIA), which can be accessed through the NSCLC Radiogenomics collection at: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/\u003c/span\u003e\u003cspan address=\"https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\u003ch2\u003eCode availability\u003c/h2\u003e\u003cp\u003eAll R scripts and guidance for use can be found on GitHub: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/scat2801/egfr-rpv/\u003c/span\u003e\u003cspan address=\"https://github.com/scat2801/egfr-rpv/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/p\u003e\u003c/div\u003e\u003ch2\u003eConflict of interest\u003c/h2\u003e\u003cp\u003eThe authors declare no competing interests in the publication of this paper. MC sits on the Royal College of Radiologists AI steering committee; SJC and MC sit on the North West London Imaging Network AI Diagnostic Fund panel.\u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e\u003cp\u003eMC is funded by Medical Research Council (MRC) Clinician Scientist Fellowship WSCC_PB1626 and a North West London Pathology E\u0026amp;R grant; EOA received funding from the Imperial College London Biomedical Research Centre (ICL-BRC), Experimental Cancer Medicines Centre (ECMC) and National Cancer Imaging Translational Accelerator consortium (NCITA). 
EOA also acknowledges the MRC.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eM.C: Conceptualisation, data curation, formal analysis, funding acquisition, project administration, resources and software, manuscript \u0026ndash; original draft; S.J.C: Data curation, funding acquisition, resources, supervision; K.L: Software; P.V: Investigation, resources; Y.H: Data curation; A.C: Resources; H.L: Investigation; A.M: Data curation; M.B: Data curation; D.J.P: Resources; D.P: Resources; A.G.R: Resources; E.O.A: Conceptualisation, funding acquisition, resources, supervision; All authors contributed to Manuscript \u0026ndash; review \u0026amp; editing.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eCancer Research UK. Types of lung cancer. (2019). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.cancerresearchuk.org/about-cancer/lung-cancer/stages-types-grades/types\u003c/span\u003e\u003cspan address=\"https://www.cancerresearchuk.org/about-cancer/lung-cancer/stages-types-grades/types\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePolanco, D. et al. Prognostic value of symptoms at lung cancer diagnosis: a three-year observational study. \u003cem\u003eJ. Thorac. Dis.\u003c/em\u003e \u003cb\u003e13\u003c/b\u003e, 1485 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRothschild, S. I. Targeted Therapies in Non-Small Cell Lung Cancer\u0026mdash;Beyond EGFR and ALK. \u003cem\u003eCancers (Basel)\u003c/em\u003e. \u003cb\u003e7\u003c/b\u003e, 930\u0026ndash;949 (2015).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSimeone, J. C., Nordstrom, B. L., Patel, K. \u0026amp; Klein, A. B. Treatment patterns and overall survival in metastatic non-small-cell lung cancer in a real-world, US setting. 
\u003cem\u003eFuture Oncol.\u003c/em\u003e \u003cb\u003e15\u003c/b\u003e, 3491\u0026ndash;3502 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAraki, T., Kanda, S., Horinouchi, H. \u0026amp; Ohe, Y. Current treatment strategies for EGFR-mutated non-small cell lung cancer: from first line to beyond osimertinib resistance. \u003cem\u003eJpn J. Clin. Oncol.\u003c/em\u003e \u003cb\u003e53\u003c/b\u003e, 547 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHirsh, V. Managing treatment-related adverse events associated with egfr tyrosine kinase inhibitors in advanced non-small-cell lung cancer. \u003cem\u003eCurr. Oncol.\u003c/em\u003e \u003cb\u003e18\u003c/b\u003e, 126 (2011).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGavralidis, A. \u0026amp; Gainor, J. F. Immunotherapy in EGFR-Mutant and ALK-Positive Lung Cancer: Implications for Oncogene-Driven Lung Cancer. \u003cem\u003eCancer J.\u003c/em\u003e \u003cb\u003e26\u003c/b\u003e, 517\u0026ndash;524 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLee, C. et al. Next-generation sequencing reveals novel resistance mechanisms and molecular heterogeneity in EGFR-mutant non-small cell lung cancer with acquired resistance to EGFR-TKIs. \u003cem\u003eLung Cancer\u003c/em\u003e. \u003cb\u003e113\u003c/b\u003e, 106\u0026ndash;114 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCzyzewski, A. \u0026lsquo;Virtual biopsy\u0026rsquo; uses AI to help doctors assess lung cancer. \u003cem\u003eImperial News\u003c/em\u003e (2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.imperial.ac.uk/news/251593/virtual-biopsy-uses-ai-help-doctors/\u003c/span\u003e\u003cspan address=\"https://www.imperial.ac.uk/news/251593/virtual-biopsy-uses-ai-help-doctors/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFairley, J. 
A. et al. Results of a worldwide external quality assessment of cfDNA testing in lung Cancer. \u003cem\u003eBMC Cancer\u003c/em\u003e. \u003cb\u003e22\u003c/b\u003e, 1\u0026ndash;12 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHelman, E. et al. Cell-Free DNA Next-Generation Sequencing Prediction of Response and Resistance to Third-Generation EGFR Inhibitor. \u003cem\u003eClin. Lung Cancer\u003c/em\u003e. \u003cb\u003e19\u003c/b\u003e, 518\u0026ndash;530e7 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCucchiara, F. et al. Integrating Liquid Biopsy and Radiomics to Monitor Clonal Heterogeneity of EGFR-Positive Non-Small Cell Lung Cancer. \u003cem\u003eFront Oncol\u003c/em\u003e \u003cb\u003e10\u003c/b\u003e, (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLambin, P. et al. Radiomics: extracting more information from medical images using advanced feature analysis. \u003cem\u003eEur. J. Cancer\u003c/em\u003e. \u003cb\u003e48\u003c/b\u003e, 441\u0026ndash;446 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, M., Copley, S. J., Viola, P., Lu, H. \u0026amp; Aboagye, E. O. Radiomics and artificial intelligence for precision medicine in lung cancer treatment. \u003cem\u003eSemin Cancer Biol.\u003c/em\u003e \u003cb\u003e93\u003c/b\u003e, 97\u0026ndash;113 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, M. et al. A Novel Radiogenomics Biomarker for Predicting Treatment Response and Pneumotoxicity From Programmed Cell Death Protein or Ligand-1 Inhibition Immunotherapy in NSCLC. \u003cem\u003eJ. Thorac. Oncol.\u003c/em\u003e \u003cb\u003e18\u003c/b\u003e, 718\u0026ndash;730 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, M., Linton-Reid, K., Aboagye, E. O. \u0026amp; Copley, S. J. Translating Radiomics into Clinical Practice: A Step-by-Step Guide to Study Design and Evaluation. \u003cem\u003eClin. 
Radiol.\u003c/em\u003e \u003cb\u003e107053\u003c/b\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/J.CRAD.2025.107053\u003c/span\u003e\u003cspan address=\"10.1016/J.CRAD.2025.107053\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e (2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu, Y. et al. Radiomic Features Are Associated With EGFR Mutation Status in Lung Adenocarcinomas. \u003cem\u003eClin. Lung Cancer\u003c/em\u003e. \u003cb\u003e17\u003c/b\u003e, 441\u0026ndash;448e6 (2016).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWu, S., Shen, G., Mao, J. \u0026amp; Gao, B. CT Radiomics in Predicting EGFR Mutation in Non-small Cell Lung Cancer: A Single Institutional Study. \u003cem\u003eFront. Oncol.\u003c/em\u003e \u003cb\u003e10\u003c/b\u003e, 542957 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRossi, G. et al. Radiomic Detection of EGFR Mutations in NSCLC. \u003cem\u003eCancer Res.\u003c/em\u003e \u003cb\u003e81\u003c/b\u003e, 724\u0026ndash;731 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCheng, B. et al. Predicting EGFR mutation status in lung adenocarcinoma presenting as ground-glass opacity: utilizing radiomics model in clinical translation. \u003cem\u003eEur. Radiol.\u003c/em\u003e \u003cb\u003e32\u003c/b\u003e, 5869\u0026ndash;5879 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLe, X., Elamin, Y. Y. \u0026amp; Zhang, J. New Actions on Actionable Mutations in Lung Cancers. \u003cem\u003eCancers (Basel)\u003c/em\u003e. \u003cb\u003e15\u003c/b\u003e, 2917 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBarnet, M. B. et al. EGFR\u0026ndash;Co-Mutated Advanced NSCLC and Response to EGFR Tyrosine Kinase Inhibitors. \u003cem\u003eJ. Thorac. 
Oncol.\u003c/em\u003e \u003cb\u003e12\u003c/b\u003e, 585\u0026ndash;590 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYang, J. J. et al. Lung cancers with concomitant egfr mutations and ALK rearrangements: Diverse responses to EGFR-TKI and crizotinib in relation to diverse receptors phosphorylation. \u003cem\u003eClin. Cancer Res.\u003c/em\u003e \u003cb\u003e20\u003c/b\u003e, 1383\u0026ndash;1392 (2014).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, Y. et al. The co-mutation of EGFR and tumor-related genes leads to a worse prognosis and a higher level of tumor mutational burden in Chinese non-small cell lung cancer patients. \u003cem\u003eJ. Thorac. Dis.\u003c/em\u003e \u003cb\u003e14\u003c/b\u003e, 185\u0026ndash;193 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePeng, P. et al. Co-mutations of epidermal growth factor receptor and BRAF in Chinese non-small cell lung cancer patients. \u003cem\u003eAnn. Transl Med.\u003c/em\u003e \u003cb\u003e9\u003c/b\u003e, 1321\u0026ndash;1321 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eStockhammer, P. et al. Co-Occurring Alterations in Multiple Tumor Suppressor Genes Are Associated With Worse Outcomes in Patients With EGFR-Mutant Lung Cancer. \u003cem\u003eJ. Thorac. Oncol.\u003c/em\u003e \u003cb\u003e19\u003c/b\u003e, 240\u0026ndash;251 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, M. et al. Concurrent Driver Gene Mutations as Negative Predictive Factors in Epidermal Growth Factor Receptor-Positive Non-Small Cell Lung Cancer. \u003cem\u003eEBioMedicine\u003c/em\u003e 42, 304\u0026ndash;310 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eImyanitov, E. N., Iyevleva, A. G. \u0026amp; Levchenko, E. N. Molecular testing and targeted therapy for non-small cell lung cancer: Current status and perspectives. \u003cem\u003eCrit. Rev. Oncol. 
Hematol.\u003c/em\u003e \u003cb\u003e157\u003c/b\u003e, 103194 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShiri, I. et al. Next-Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Algorithms. \u003cem\u003eMol. Imaging Biol.\u003c/em\u003e \u003cb\u003e22\u003c/b\u003e, 1132\u0026ndash;1148 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBakr, S. et al. A radiogenomic dataset of non-small cell lung cancer. \u003cem\u003eSci Data\u003c/em\u003e \u003cb\u003e5\u003c/b\u003e, (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJim\u0026eacute;nez-S\u0026aacute;nchez, J. et al. Evolutionary dynamics at the tumor edge reveal metabolic imaging biomarkers. \u003cem\u003eProc. Natl. Acad. Sci. U S A\u003c/em\u003e. \u003cb\u003e118\u003c/b\u003e, e2018110118 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHayasaka, K. et al. Clinical, Genomic, and Transcriptomic Featurses of Lung Adenocarcinoma With Uncommon EGFR Mutation. \u003cem\u003eClin. Lung Cancer\u003c/em\u003e. \u003cb\u003e25\u003c/b\u003e, e43\u0026ndash;e51 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIzumi, M. et al. Integrative single-cell RNA-seq and spatial transcriptomics analyses reveal diverse apoptosis-related gene expression profiles in EGFR-mutated lung cancer. \u003cem\u003eCell Death \u0026amp; Disease 2024 15:8\u003c/em\u003e 15, 1\u0026ndash;11 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. \u003cem\u003eNat. Cancer\u003c/em\u003e. \u003cb\u003e3\u003c/b\u003e, 1151\u0026ndash;1164 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChen, W., Qiao, X., Yin, S., Zhang, X. 
\u0026amp; Xu, X. Integrating Radiomics with Genomics for Non-Small Cell Lung Cancer Survival Analysis. \u003cem\u003eJ Oncol\u003c/em\u003e (2022). (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBieńkowski, M., Dziadziuszko, R. \u0026amp; Jassem, J. Complex EGFR mutations in non-small cell lung cancer: a distinct entity? \u003cem\u003eJ. Thorac. Dis.\u003c/em\u003e \u003cb\u003e14\u003c/b\u003e, 2738\u0026ndash;2741 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBoch, C. et al. The frequency of EGFR and KRAS mutations in non-small cell lung cancer (NSCLC): routine screening data for central Europe from a cohort study. \u003cem\u003eBMJ Open.\u003c/em\u003e \u003cb\u003e3\u003c/b\u003e, e002560 (2013).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSkoulidis, F. et al. Sotorasib for Lung Cancers with KRAS p.G12C Mutation. \u003cem\u003eN. Engl. J. Med.\u003c/em\u003e \u003cb\u003e384\u003c/b\u003e, 2371\u0026ndash;2381 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRachiglio, A. M. et al. The presence of concomitant mutations affects the activity of egfr tyrosine kinase inhibitors in egfr-mutant non-small cell lung cancer (Nsclc) patients. \u003cem\u003eCancers (Basel)\u003c/em\u003e \u003cb\u003e11\u003c/b\u003e, (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, C. et al. The Performance of an Extended Next Generation Sequencing Panel Using Endobronchial Ultrasound-Guided Fine Needle Aspiration Samples in Non-Squamous Non-Small Cell Lung Cancer: A Pragmatic Study. \u003cem\u003eClin. Lung Cancer\u003c/em\u003e. \u003cb\u003e24\u003c/b\u003e, e105 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShin, H. T. et al. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. \u003cem\u003eNat. 
Commun.\u003c/em\u003e \u003cb\u003e8\u003c/b\u003e, 1\u0026ndash;10 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRolfo, C. D. et al. Measurement of ctDNA Tumor Fraction Identifies Informative Negative Liquid Biopsy Results and Informs Value of Tissue Confirmation. \u003cem\u003eClin. Cancer Res.\u003c/em\u003e \u003cb\u003e30\u003c/b\u003e, 2452\u0026ndash;2460 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBote-de Cabo, H. et al. Clinical Utility of Combined Tissue and Plasma Next-Generation Sequencing in Patients With Advanced, Treatment-Na\u0026iuml;ve NSCLC. \u003cem\u003eJTO Clin. Res. Rep.\u003c/em\u003e \u003cb\u003e6\u003c/b\u003e, 100778 (2025).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZwanenburg, A. et al. The image biomarker standardization initiative: Standardized quantitative radiomics for high-throughput image-based phenotyping. \u003cem\u003eRadiology\u003c/em\u003e \u003cb\u003e295\u003c/b\u003e, 328\u0026ndash;338 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWu, S., Shen, G., Mao, J. \u0026amp; Gao, B. CT Radiomics in Predicting EGFR Mutation in Non-small Cell Lung Cancer: A Single Institutional Study. \u003cem\u003eFront. Oncol.\u003c/em\u003e \u003cb\u003e10\u003c/b\u003e, 542957 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZhang, G. et al. Using Multi-phase CT Radiomics Features to Predict EGFR Mutation Status in Lung Adenocarcinoma Patients. \u003cem\u003eAcad. Radiol.\u003c/em\u003e \u003cb\u003e31\u003c/b\u003e, 2591\u0026ndash;2600 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLe, N. Q. K. et al. Machine learning-based radiomics signatures for egfr and kras mutations prediction in non-small-cell lung cancer. \u003cem\u003eInt J. Mol. Sci\u003c/em\u003e \u003cb\u003e22\u003c/b\u003e, (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMoreno, S. et al. 
A Radiogenomics Ensemble to Predict EGFR and KRAS Mutations in NSCLC. \u003cem\u003eTomography\u003c/em\u003e \u003cb\u003e7\u003c/b\u003e, 154\u0026ndash;168 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi, H. et al. Frequency of well-identified oncogenic driver mutations in lung adenocarcinoma of smokers varies with histological subtypes and graduated smoking dose. \u003cem\u003eLung Cancer\u003c/em\u003e. \u003cb\u003e79\u003c/b\u003e, 8\u0026ndash;13 (2013).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYang, J. et al. Novel Subtypes of Pulmonary Emphysema Based on Spatially-Informed Lung Texture Learning: The Multi-Ethnic Study of Atherosclerosis (MESA) COPD Study. \u003cem\u003eIEEE Trans. Med. Imaging\u003c/em\u003e. \u003cb\u003e40\u003c/b\u003e, 3652 (2021).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eStellman, S. D., Muscat, J. E., Hoffmann, D. H. \u0026amp; Wynder, E. L. Impact of filter cigarette smoking on lung cancer histology. \u003cem\u003ePrev. Med. (Baltim)\u003c/em\u003e. \u003cb\u003e26\u003c/b\u003e, 451\u0026ndash;456 (1997).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eInamura, K. Update on Immunohistochemistry for the Diagnosis of Lung Cancer. \u003cem\u003eCancers 2018\u003c/em\u003e. \u003cb\u003e10, Page 72\u003c/b\u003e (10), 72 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatel, K. et al. FAM190A deficiency creates a cell division defect. \u003cem\u003eAm. J. Pathol.\u003c/em\u003e \u003cb\u003e183\u003c/b\u003e, 296\u0026ndash;303 (2013).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWang, Y. et al. Regulatory mechanisms of Beta-carotene and BCMO1 in adipose tissues: A gene enrichment-based bioinformatics analysis. \u003cem\u003eHum Exp. Toxicol\u003c/em\u003e \u003cb\u003e41\u003c/b\u003e, (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOmenn, G. S. et al. 
Effects of a Combination of Beta Carotene and Vitamin A on Lung Cancer and Cardiovascular Disease. \u003cem\u003eN. Engl. J. Med.\u003c/em\u003e \u003cb\u003e334\u003c/b\u003e, 1150\u0026ndash;1155 (1996).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKordiak, J., Bielec, F., Jabłoński, S. \u0026amp; Pastuszak-Lewandoska, D. Role of Beta-Carotene in Lung Cancer Primary Chemoprevention: A Systematic Review with Meta-Analysis and Meta-Regression. \u003cem\u003eNutrients\u003c/em\u003e 14, (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOrlhac, F., Frouin, F., Nioche, C., Ayache, N. \u0026amp; Buvat, I. Validation of A Method to Compensate Multicenter Effects Affecting CT Radiomics. (2019). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1148/radiol.2019182023\u003c/span\u003e\u003cspan address=\"https://doi.org/10.1148/radiol.2019182023\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e 291, 53\u0026ndash;59\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePlanchard, D. et al. Osimertinib with or without Chemotherapy in EGFR -Mutated Advanced NSCLC. \u003cem\u003eN. Engl. J. Med.\u003c/em\u003e \u003cb\u003e389\u003c/b\u003e, 1935\u0026ndash;1948 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCho, B. C. et al. Amivantamab plus Lazertinib in Previously Untreated EGFR -Mutated Advanced NSCLC. \u003cem\u003eN. Engl. J. Med.\u003c/em\u003e \u003cb\u003e391\u003c/b\u003e, 1486\u0026ndash;1498 (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeldt, S. L. \u0026amp; Bestvina, C. M. The Role of MET in Resistance to EGFR Inhibition in NSCLC: A Review of Mechanisms and Treatment Implications. \u003cem\u003eCancers (Basel)\u003c/em\u003e. \u003cb\u003e15\u003c/b\u003e, 2998 (2023).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChan, D. W. K., Choi, H. C. W. \u0026amp; Lee, V. H. F. 
Treatment-Related Adverse Events of Combination EGFR Tyrosine Kinase Inhibitor and Immune Checkpoint Inhibitor in EGFR-Mutant Advanced Non-Small Cell Lung Cancer: A Systematic Review and Meta-Analysis. \u003cem\u003eCancers (Basel)\u003c/em\u003e. \u003cb\u003e14\u003c/b\u003e, 2157 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi, Q. et al. CT imaging features associated with recurrence in non-small cell lung cancer patients after stereotactic body radiotherapy. \u003cem\u003eRadiat. Oncol.\u003c/em\u003e \u003cb\u003e12\u003c/b\u003e, 1\u0026ndash;10 (2017).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSpinelli, M., Parcq, P., Du, Gupta, N., Khorashad, J. \u0026amp; Viola, P. Coexistence of two missense mutations in the KRAS gene in adenocarcinoma of the lung: a possible indicator of poor prognosis. \u003cem\u003ePathologica\u003c/em\u003e \u003cb\u003e114\u003c/b\u003e, 221 (2022).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKhorrami, M. et al. Combination of peri- and intratumoral radiomic features on baseline CT scans predicts response to chemotherapy in lung adenocarcinoma. \u003cem\u003eRadiol Artif. Intell\u003c/em\u003e \u003cb\u003e1\u003c/b\u003e, (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFedorov, A. et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. \u003cem\u003eMagn. Reson. Imaging\u003c/em\u003e. \u003cb\u003e30\u003c/b\u003e, 1323\u0026ndash;1341 (2012).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLu, H. et al. A mathematical-descriptor of tumor-mesoscopic-structure from computed-tomography images annotates prognostic- and molecular-phenotypes of epithelial ovarian cancer. \u003cem\u003eNat. Commun.\u003c/em\u003e \u003cb\u003e10\u003c/b\u003e, 1\u0026ndash;11 (2019).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWhybra, P. et al. 
The Image Biomarker Standardization Initiative: Standardized Convolutional Filters for Reproducible Radiomics and Enhanced Clinical Insights. \u003cem\u003eRadiology\u003c/em\u003e 310, (2024).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eThawani, R. et al. Radiomics and radiogenomics in lung cancer: A review for the clinician. \u003cem\u003eLung Cancer\u003c/em\u003e. \u003cb\u003e115\u003c/b\u003e, 34\u0026ndash;41 (2018).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMahon, R. N., Ghita, M., Hugo, G. D. \u0026amp; Weiss, E. ComBat harmonization for radiomic features in independent phantom and lung cancer patient computed tomography datasets. \u003cem\u003ePhys. Med. Biol.\u003c/em\u003e \u003cb\u003e65\u003c/b\u003e, 015010 (2020).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eArmato, S. G. et al. The Reference Image Database to Evaluate Response to Therapy in Lung Cancer (RIDER) Project: A Resource for the Development of Change Analysis Software. \u003cem\u003eClin. Pharmacol. Ther.\u003c/em\u003e \u003cb\u003e84\u003c/b\u003e, 448 (2008).\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLambin, P. et al. Radiomics: the bridge between medical imaging and personalized medicine. \u003cem\u003eNat. Rev. Clin. 
Oncol.\u003c/em\u003e \u003cb\u003e14\u003c/b\u003e, 749\u0026ndash;762 (2017).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Non-small cell lung cancer, imaging biomarker, radiogenomics, EGFR mutation, tyrosine kinase inhibitor","lastPublishedDoi":"10.21203/rs.3.rs-8158721/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8158721/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eNewer-generation tyrosine kinase inhibitors (TKIs) have shown increasing efficacy in cancers driven by specific mutations, with epidermal growth factor receptor (EGFR) alterations remaining the most common actionable targets in non-small cell lung cancer (NSCLC). Treatment decisions are currently guided by tissue sampling and genetic testing, which are limited by procedural risks, patient tolerance, tumour heterogeneity and mutation evolution. 
Because co-mutations involving EGFR and other targetable genes can diminish treatment response, identifying \u003cem\u003eexclusive\u003c/em\u003e EGFR mutation, defined by the absence of other actionable alterations, represents a clinically favourable scenario for first-line EGFR-TKI therapy. We developed a CT-based radiomics signature, EGFR-RPV, to predict exclusive EGFR mutational status using NSCLC patients (n\u0026thinsp;=\u0026thinsp;304) from a multi-centre cohort with paired imaging and genomics data, and validated performance in an independent testing set (n\u0026thinsp;=\u0026thinsp;51), alongside transcriptomics enrichment analysis. EGFR-RPV predicted exclusive EGFR mutation with accuracies of 0.77 (95% CI 0.66\u0026ndash;0.88) and 0.71 (95% CI 0.54\u0026ndash;0.89) in internal and external testing, respectively, and stratified patient prognosis (hazard ratio 2.15, 95% CI 1.50\u0026ndash;3.08). FAM190A and BCMO1 were enriched in exclusive EGFR-positive cases, consistent with their roles in cell division regulation and vitamin A biosynthesis, respectively. 
EGFR-RPV thus offers a non-invasive approach to identify exclusive EGFR mutations, with a potential role in guiding first-line EGFR-TKI use.\u003c/p\u003e","manuscriptTitle":"A Radio-Genomics Biomarker for Precision Epidermal Growth Factor Receptor Mutation Targeting Therapy in Non-Small Cell Lung Cancer","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-12-11 09:49:56","doi":"10.21203/rs.3.rs-8158721/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2026-02-12T11:43:16+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2026-02-12T04:04:05+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"66163350051342966058401286589161378915","date":"2026-02-02T09:02:00+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-12-08T13:42:36+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-12-08T12:46:17+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2025-12-08T12:32:21+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"5316e957-167b-49fb-819d-d01f007f623e","owner":[],"postedDate":"December 11th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":59285191,"name":"Health 
sciences/Biomarkers"},{"id":59285192,"name":"Biological sciences/Cancer"},{"id":59285193,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":59285194,"name":"Biological sciences/Genetics"},{"id":59285195,"name":"Health sciences/Oncology"}],"tags":[],"updatedAt":"2026-03-09T16:10:35+00:00","versionOfRecord":{"articleIdentity":"rs-8158721","link":"https://doi.org/10.1038/s41598-026-42948-4","journal":{"identity":"scientific-reports","isVorOnly":false,"title":"Scientific Reports"},"publishedOn":"2026-03-06 16:00:10","publishedOnDateReadable":"March 6th, 2026"},"versionCreatedAt":"2025-12-11 09:49:56","video":"","vorDoi":"10.1038/s41598-026-42948-4","vorDoiUrl":"https://doi.org/10.1038/s41598-026-42948-4","workflowStages":[]},"version":"v1","identity":"rs-8158721","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8158721","identity":"rs-8158721","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

License: CC-BY-4.0