Ferroptosis-related gene signature predicts clinical metastasis of clear cell renal cell carcinoma based on single-cell RNA-seq

Research Square · Research Article

Wenxing Yue, Meiyuan Huang, Qian Che, Manling Tang, Xiyun Quan, and 2 more

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-7083430/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted (Version 1)

Abstract

Metastasis of clear cell renal cell carcinoma (ccRCC) negatively affects patient survival, and ferroptosis genes have been linked to cancer metastasis. Here, we screened metastasis-related genes using the ccRCC single-cell sequencing dataset GSE73121 and intersected them with a ferroptosis gene database to obtain 13 metastasis-related ferroptosis genes.
Next, we performed weighted gene co-expression network analysis (WGCNA) on ccRCC patient data from TCGA, which retained 9 metastasis-related ferroptosis genes. We then applied univariate logistic regression, LASSO, and multivariate logistic regression to these 9 genes, ultimately identifying 3 key metastasis-associated ferroptosis genes (MAFGs). A risk score (RS) for predicting metastasis was constructed from the three MAFGs (DPP4, SLC1A5, and AIFM2). The RS performed well in ROC analysis, calibration curves, and goodness-of-fit tests, and it was validated in ccRCC data from GSE22541 and ICGC. It also showed good predictive value for several survival endpoints. By combining the RS with patients' clinical information, we constructed a nomogram score. The nomogram score better predicts patient prognosis (AUC of 0.858). A higher MAFG nomogram score is associated with fatty acid metabolism, the mTOR signaling pathway, and the P53 signaling pathway in ccRCC. In summary, we constructed a robust MAFG signature from several types of sequencing data and validated the model in multiple patient cohorts, which is of value for prognostic stratification and treatment selection in metastatic ccRCC.

Keywords: ccRCC, single-cell RNA-seq, metastasis, ferroptosis, LASSO Cox regression

1. Introduction

Renal cell carcinoma (RCC) accounts for 2%-3% of all human malignancies, and its incidence has been increasing globally in recent years [1, 2]. The most common histopathological type is clear cell renal cell carcinoma (ccRCC), which constitutes 80%-90% of kidney cancers [1, 3]. Renal cancer is insensitive to radiotherapy and chemotherapy and lacks effective systemic treatments. Localized renal cancer nevertheless has a good prognosis after surgical treatment, with a 5-year survival rate exceeding 70%.
Nearly 20% of ccRCC cases are already advanced or metastatic at diagnosis, with a 5-year overall survival (OS) rate of less than 25% [4]. The diagnosis of metastatic renal cancer still relies on imaging, and precise molecular markers that can detect metastasis early are lacking. It is therefore possible that some patients, although diagnosed with localized renal cancer, have already developed micro-metastatic lesions that imaging cannot detect. These patients may be more prone to metastasis and recurrence after the primary lesion is surgically removed. Previous researchers have attempted to identify key metastasis-related biomarkers from large numbers of transcriptomic profiles. Screening and identifying valuable metastasis-related genes can expand our understanding of the genomic changes between primary and metastatic ccRCC. Such biomarkers also offer additional avenues for refining treatment strategies and forecasting progression events. Consequently, there is a need to investigate the molecular mechanisms behind ccRCC metastasis and to identify new therapeutic targets that improve patient prognosis.

Ferroptosis is a regulated form of cell death characterized by the accumulation of iron-dependent lipid peroxides to lethal levels. It is distinct from apoptosis, necrosis, and autophagy. The interplay between ferroptosis and lipid metabolism plays a crucial role in tumor development, invasion, metastasis, drug resistance, and tumor immunity [5]. Inducing ferroptosis holds significant potential as a therapeutic strategy to trigger cancer cell death, particularly in tumors that have developed resistance to conventional treatments [6, 7]. Ferroptosis-related genes are linked to tumor metastasis and prognosis.
Nevertheless, the relationship between ferroptosis genes and ccRCC metastasis, along with its prognostic implications, remains to be fully clarified.

Single-cell sequencing dissociates tissues into single cells for transcriptomic sequencing, revealing the gene expression status of individual cells in tissues or bodily fluids. Single-cell sequencing technology (SCS) allows precise cell-level studies of the genomics, transcriptomics, and epigenomics of tumor cells [8], which supports precise and individualized tumor treatment. We therefore studied important marker genes in primary and metastatic ccRCC cell subpopulations through single-cell expression profiling.

In the current study, we identified genes differentially expressed between primary and metastatic ccRCC using the single-cell sequencing dataset GSE73121. Additionally, we compiled transcriptomic and clinical data from roughly 700 patient tissue samples sourced from multiple databases: GSE22541, The Cancer Genome Atlas (TCGA), and the International Cancer Genome Consortium (ICGC). We carried out a multi-omics analysis to identify metastasis-associated ferroptosis genes (MAFGs), with the goal of verifying the predictive accuracy of these biomarkers for ccRCC metastasis and tumor recurrence. This validation is anticipated to refine the selection of treatment regimens and improve the overall prognosis of patients.

2. Results

2.1. Screening metastasis-related genes from single-cell sequencing data

We acquired sequencing data for 118 single cells from GSE73121. After quality control based on the number of genes and sequencing counts for each cell (Fig. 1A), we retained 116 single cells for further analysis. All cells within the GSE73121 dataset were classified as epithelial cells.
Furthermore, we employed tSNE and UMAP to separate the cells into two distinct subgroups (Fig. 1B, C). These two subgroups correspond to primary and metastatic cells (Supplementary Table 1). Accordingly, we performed differential analysis using the DESeq2 package in R, identifying a total of 2383 metastasis-related differential genes with |log fold change (FC)| > 0.5 and adjusted P-value < 0.05 (Fig. 1D, Supplementary Table 2). We then intersected these metastasis-related differential genes with the ferroptosis-related gene set, yielding 13 MAFGs (Fig. 1E).

[insert Fig. 1.]

Figure 1. Preliminary screening of metastasis-related ferroptosis genes in single-cell sequencing. (A) Quality control of single-cell sequencing data GSE73121, removing substandard cells. (B-C) Dimensionality reduction analysis of the selected single cells using tSNE and UMAP, clustering the cells into two clusters. (D) Screening of differential genes between the primary and metastatic clusters (|log fold change (FC)| > 0.5, FDR < 0.05). (E) Intersection of ferroptosis genes with the previously screened metastasis-related differential genes.

2.2. WGCNA screening of metastasis-related differentially expressed ferroptosis genes

Using the weighted gene co-expression network analysis (WGCNA) package in R, we constructed a network based on KIRC expression data from TCGA and screened gene modules in conjunction with patient clinical survival information (Fig. 2A). We then screened for metastasis-related gene modules; as shown in Fig. 2B, the pink module is most closely related to metastasis. By integrating gene significance (GS value) and module membership (MM value), 9 of the 13 MAFGs met the criteria for further analysis (Supplementary Table 3).

[insert Fig. 2.]

Figure 2. WGCNA method screening and validation of metastasis-related ferroptosis genes.
(A) Heatmap showing the relationship between gene modules constructed by the WGCNA method and traits. (B) Correlation analysis between metastasis and various gene modules.

2.3. Screening of metastasis-associated ferroptosis genes using LASSO method and logistic regression

First, we collected the expression profiles of the 9 MAFGs and integrated them with clinical information from the TCGA Kidney Renal Clear Cell Carcinoma (KIRC) dataset into a table (Supplementary Table 4). The patient data from Supplementary Table 3 were randomly divided at a ratio of 2:1 into a training cohort and a testing cohort, with corresponding statistical information provided in Supplementary Table 5 (patients without metastasis information were not included in the statistical analysis). In the training cohort, we first performed univariate logistic regression on the 9 genes in R, which retained 7 genes (Fig. 3A). Next, we applied the least absolute shrinkage and selection operator (LASSO) method and multivariate logistic regression to the 7 genes (Fig. 3B-D). By combining the two methods, we ultimately obtained 3 MAFGs (DPP4, SLC1A5, AIFM2). Then, based on the multivariate logistic regression, we established the MAFG risk score (MAFG-RS) = (-0.248 × Exp[DPP4]) + (0.661 × Exp[SLC1A5]) + (0.737 × Exp[AIFM2]) [9]. The area under the receiver operating characteristic (ROC) curve (AUC) for predicting metastasis events was 0.727 in the training cohort and 0.738 in the testing cohort (Fig. 3E). Calibration curves plotted using the calibrate function in R also showed good agreement between predicted and observed outcomes (Fig. 3F-G), with mean absolute errors of 0.023 and 0.03, respectively.

[insert Fig. 3.]

Figure 3. Screening and validation of MAFG risk score constructed in TCGA. (A) Univariate logistic regression screening of MAFGs. (B) Further analysis of MAFGs using the LASSO method. (D) Multivariate logistic regression screening to obtain the final 3 MAFGs.
(E) ROC curve analysis of MAFG risk score predicting metastasis in TCGA training and testing cohorts. (F) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA training cohort. (G) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA testing cohort.

Furthermore, in the independent GSE22541 and ICGC cohorts, the AUCs of the ROC curves for predicting metastasis events using the MAFG risk score were 0.729 and 0.666, respectively (Fig. 4A-B). Similarly, the calibration curves in Fig. 4C-D demonstrated good predictive performance for patient metastasis, with mean absolute errors of 0.028 and 0.022, respectively. The Hosmer-Lemeshow test showed that the MAFG risk score (MAFG-RS) had a good fit in both the GSE22541 (X-squared = 8.1585, df = 8, p-value = 0.4181) and ICGC (X-squared = 4.8809, df = 8, p-value = 0.7702) cohorts. Additionally, the MAFG risk score could effectively distinguish patients with different survival prognoses in the TCGA, GSE22541, and ICGC cohorts (Fig. 4E-I). As shown, a high MAFG-RS was associated with more death or recurrence/progression cases.

[insert Fig. 4.]

Figure 4. Further validation of the MAFG risk score's predictive ability for metastasis and survival prognosis. (A) ROC curve analysis of MAFG risk score predicting metastasis events in the GSE22541 cohort. (B) ROC curve analysis of MAFG risk score predicting metastasis events in the ICGC cohort. (C) Calibration curve analysis of MAFG risk score predicting metastasis in the GSE22541 cohort. (D) Calibration curve analysis of MAFG risk score predicting metastasis in the ICGC cohort. (E-G) Comparison of overall survival time, disease-specific survival time, and progression-free survival time between high and low MAFG risk score groups in the TCGA-KIRC cohort.
(H) Kaplan-Meier survival curve of disease-free survival time for patients with high and low MAFG risk scores in the GSE22541 cohort. (I) Kaplan-Meier survival curve of overall survival time for patients with high and low MAFG risk scores in the ICGC cohort.

2.4. Constructing a prediction model for MAFG nomogram

Furthermore, we combined the MAFG-RS with other clinical variables to construct a comprehensive model for monitoring ccRCC progression. Since most patients lacked AJCC-N stage information, we excluded it from the candidate factors for the model. We first performed univariate and multivariate logistic regression screening on the candidate factors (age, gender, tumor grade, pathological stage, AJCC-T stage, and MAFG-RS) in the TCGA-KIRC training cohort, ultimately including AJCC-T stage and MAFG-RS in the prediction model (Supplementary Table 6, Fig. 5A). Using the rms and regplot packages in R, we fitted a generalized linear model (GLM) that includes AJCC-T stage and MAFG-RS to create the MAFG nomogram (Fig. 5B). We plotted calibration curves in the training cohort based on the MAFG nomogram score (riskRegression package in R) and calculated the AUC of the ROC curve and the Brier score for probability calibration. As shown in Fig. 5C, the predicted and observed values were closely aligned, with an AUC of 85.8% and a Brier score of 10.9%. Similarly, in the TCGA-KIRC testing cohort the AUC was 85.7% and the Brier score was 9.1% (Fig. 5D). In summary, the constructed MAFG nomogram prediction model can effectively predict patients' metastasis status.

[insert Fig. 5.]

Figure 5. Constructing the MAFG nomogram scoring model and validating its effectiveness. (A) MAFG risk score (MAFG-RS) and T stage results obtained through multivariate logistic regression screening.
(B) MAFG nomogram based on MAFG risk score (MAFG-RS) and T stage for predicting metastasis and calculating MAFG nomogram scores. (C) Calibration curve of the MAFG nomogram score for predicting metastasis in the TCGA training set, with AUC and Brier scores. (D) Calibration curve of the MAFG nomogram score for predicting metastasis in the TCGA testing set, with AUC and Brier scores.

Additionally, in the independent ICGC cohort, the AUC of the ROC curve for the MAFG nomogram score predicting metastatic events was 0.949 (Fig. 6A). The calibration curve in Fig. 6B also showed that the MAFG nomogram score can effectively predict metastatic events, with a Brier score of 5.2%, and the p-value of the Unreliability test (U test) was 0.954. Similarly, the Hosmer-Lemeshow test indicated that the MAFG nomogram score fits the occurrence of metastatic events well in the ICGC cohort (X-squared = 0.056828, df = 8, p-value = 1). Furthermore, in both the TCGA and ICGC cohorts, patients with high MAFG nomogram scores tended to have higher risks of death or recurrence/progression (Fig. 6C-F). Comparison of the Kaplan-Meier curves suggests that the MAFG nomogram score better differentiates patients with different prognoses.

[insert Fig. 6.]

Figure 6. Further examining the predictive ability of the MAFG nomogram score for metastasis and survival prognosis. (A) ROC curve for predicting metastatic events using the MAFG nomogram score in the ICGC cohort. (B) Calibration curve of the MAFG nomogram score for predicting metastasis in the ICGC cohort. (C-E) Differences in overall survival, disease-free survival, and progression-free survival between patients with high and low MAFG nomogram scores in the TCGA-KIRC cohort.
(F) Kaplan-Meier survival curves of patients with high and low MAFG nomogram scores in the ICGC cohort.

2.5. Gene set enrichment analysis

Using the MAFG nomogram score as the reference phenotype, we selected the transcriptomic data of patients from the total TCGA-KIRC cohort for gene set enrichment analysis. We observed that the adipocytokine signaling pathway, glycosylphosphatidylinositol (GPI) anchor biosynthesis, phosphatidylinositol metabolism, fatty acid metabolism, renal cell carcinoma, and the mTOR signaling pathway were upregulated in the high MAFG nomogram score group. However, α-linolenic acid metabolism, the cytosolic DNA-sensing pathway, and the P53 signaling pathway were downregulated in the low MAFG nomogram score group (Fig. 7). Thus, changes in the adipocytokine signaling pathway, mTOR signaling pathway, and P53 signaling pathway may be involved in the occurrence and development of ferroptosis-related renal cancer.

[insert Fig. 7.]

Figure 7. GSEA results showing significant differences in biological processes between high and low MAFG nomogram score levels.

3. Discussion

Malignant progression and high recurrence rates make renal cell carcinoma one of the deadliest cancers of the urinary system [10]. Previous studies mainly focused on screening biomarkers that are differentially expressed between tumor and non-tumor tissues [11, 12]. However, when transcriptomes are analyzed in bulk across large cell populations, important genes may be overlooked [13, 14]. Clinically, some patients diagnosed with localized renal cancer may actually have micro-metastases that are undetectable by imaging. This subset of patients may be more prone to metastasis and recurrence after resection of the primary lesion. Therefore, studying the mechanisms that regulate RCC progression is an important basis for improving RCC treatment.
Identifying more valuable metastasis-related genes can enhance our understanding of the genomic changes between primary and metastatic ccRCC. These key biomarkers can inform better treatment strategies and the prediction of cancer progression. Elucidating the mechanisms underlying metastasis and recurrence of renal cell carcinoma is therefore especially meaningful. Recent studies have shown that ferroptosis is closely related to the occurrence and development of various tumors [5]. In our study, we analyzed scRNA-seq profiles of 116 high-quality single cells to characterize the genomic differences between primary and metastatic ccRCC, and then cross-referenced the results with a ferroptosis gene database to obtain 13 MAFGs. We used ccRCC tissue sequencing data from TCGA for WGCNA, univariate logistic regression, LASSO, and multivariate logistic regression analysis, ultimately identifying three MAFGs: DPP4, SLC1A5, and AIFM2. We validated the MAFG signature in internal and independent external cohorts and constructed an integrated MAFG nomogram model, which combines AJCC-T stage and MAFG-RS to predict tumor progression. Multi-omics analysis indicated that high MAFG nomogram risk scores are associated with high TMB, which has been proven to be a risk factor for prognosis. These findings suggest that scRNA-seq combined with validation in patient cohorts is a powerful and sensitive strategy for obtaining gene signatures with potential clinical value in ccRCC. Many studies have examined differences in gene expression profiles and identified molecular biomarkers related to RCC, suggesting that molecular biomarkers hold great promise for predicting patient prognosis [15].

The single-cell RNA sequencing (scRNA-seq) analysis was carried out under stringent quality control conditions.
Cells with a high proportion of mitochondrial reads (> 5%) were excluded, as they are a confounding variable in the statistical outcomes. Subsequently, t-distributed stochastic neighbor embedding (tSNE) and uniform manifold approximation and projection (UMAP) were applied for nonlinear dimensionality reduction. This process separated ccRCC cells into primary and metastatic subtypes consistent with the actual cell characteristics. On the basis of these findings, marker genes were screened between the two cell clusters to facilitate the subsequent analysis of MAFGs.

Some of the final three MAFGs have been reported to play significant roles in the progression of malignant tumors. For instance, DPP4 has been shown to facilitate angiogenesis and inflammatory modulation in ccRCC, which indirectly supports a role in promoting cancer metastasis [16]. A study by Kawakami I et al. showed that downregulation of SLC1A5 significantly inhibits tumor growth, invasion, and migration [17]. Furthermore, a high level of SLC1A5 expression is correlated with a poor prognosis [18]. AIFM2 enhances mitochondrial biosynthesis to promote hepatocellular carcinoma metastasis by activating the SIRT1/PGC-1α signaling pathway [19]. Previous investigations have underlined the crucial roles of these genes in cancer metabolic regulation [20-22]. We observed differences in these three genes between in situ and metastatic cell populations, which are associated with a high probability of tumor progression, providing a new direction for our subsequent research. We used the GSE22541 and ICGC cohorts as external datasets to further test our MAFG signature and found that the MAFGs have clinical value in predicting OS or DFS.
Subsequent multivariable Cox regression analysis was performed to screen for clinically relevant information pertaining to the metastasis of renal cell carcinoma (RCC). The resulting MAFG nomogram shows improved ability to predict the likelihood that a patient will experience metastasis. Nevertheless, the predictive value of MAFGs in the context of drug treatment remains unclear and is a valuable area for future research.

To further corroborate the relevance of MAFGs, we carried out functional enrichment analyses, which highlighted several well-known pathways, such as the mTOR signaling pathway and the P53 signaling pathway, that are recognized as crucial signaling crosstalk pathways in ccRCC. This approach aimed to elucidate the potential associations of MAFGs with these pathways, thereby providing a fuller understanding of their role in RCC biology and potentially uncovering novel therapeutic targets or biomarkers [23-25]. Additionally, the adipocytokine signaling pathway, glycosylphosphatidylinositol (GPI) anchor biosynthesis, phosphoinositide metabolism, and fatty acid metabolism were also enriched [26], suggesting that the metabolic aspects of ccRCC might be a promising research direction.

One strength of our research lies in the integration of scRNA-seq with validation in patient cohorts. We analyzed both internal and external datasets to demonstrate the robustness of the MAFG signature that we identified [27, 28]. scRNA-seq offers distinct advantages in detecting potential hub markers that might be concealed within bulk sequencing data. Additionally, we incorporated multi-omics and large-sample analyses to comprehensively characterize the MAFGs implicated in ccRCC metastasis.
However, several limitations remain. First, the majority of the cells and tumor tissues were sourced from American or European populations, and it remains uncertain whether the identified MAFGs are applicable to Asian populations. Hence, it is essential to validate our findings in cohorts from local hospitals. Second, although the signature and nomogram have been validated in a large sample of the ccRCC population, further experimental and clinical studies are needed to elucidate the specific mechanisms through which MAFGs drive tumor occurrence and development.

In conclusion, this study represents the first attempt to screen marker genes based on scRNA-seq, TCGA data, and a ferroptosis gene library, and these findings have been validated in a substantial number of ccRCC samples. We not only delineated the genomic characteristics and heterogeneity between in situ RCC and metastatic RCC but also identified three MAFGs (DPP4, SLC1A5, AIFM2), thereby providing a reliable signature for prognosis prediction and novel avenues for subsequent research. A deeper molecular understanding of these phenomena may lead to the development of innovative anticancer therapies.

4. Materials and methods

4.1. Collection of cell samples and ccRCC cohort

We retrieved the raw transcriptome sequencing data of 118 single cells from the GSE73121 dataset in the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). After removing low-quality cells through a stringent filtering process, we retained 116 ccRCC single cells. The transcriptome data were then integrated into a matrix, and the DESeq2 package was employed for normalization and the screening of differentially expressed genes. In addition, we obtained tissue sequencing data for 48 ccRCC samples, accompanied by clinical information, from the GSE22541 dataset in the GEO database.
Moreover, we downloaded the expression profiles of 607 Kidney Renal Clear Cell Carcinoma (KIRC) samples from the TCGA database (https://portal.gdc.cancer.gov/), among which 498 samples had metastasis information. We also downloaded the expression profiles of 91 ccRCC patients from the International Cancer Genome Consortium (ICGC) database (https://icgc.org/).

4.2. Processing of single-cell RNA-seq data

We used GRCh38 as the reference genome and extracted the transcriptome sequencing data of 118 ccRCC single cells. We used the Seurat package to generate objects and filter out poor-quality cells [29]. We then performed standard data preprocessing, excluding genes detected in fewer than 3 cells and cells with fewer than 200 detected genes, while restricting the proportion of mitochondrial genes to less than 10%. Ultimately, we obtained transcriptome sequencing data for 116 ccRCC single cells. Furthermore, we identified genes with significant differences between cell subpopulations and used these to distinguish different cells. Additionally, we performed dimensionality reduction analysis on the cells using t-distributed stochastic neighbor embedding (tSNE) and uniform manifold approximation and projection (UMAP) [30]. tSNE and UMAP have the advantage of visualizing high-dimensional data, and the latter can retain the characteristics of the original data to the greatest extent while significantly reducing the feature dimension [31, 32].

4.3. Utilization of the weighted gene co-expression network analysis (WGCNA) method for further screening of MAFGs

The weighted gene co-expression network analysis (WGCNA) method has the advantage of avoiding the removal of genes that exhibit little variation in expression yet are closely associated with certain traits [33].
Initially, we clustered the 498 TCGA-KIRC samples with metastasis information based on the similarity of their gene expression and excluded the outlier samples. For the remaining samples, we constructed a scale-free network and calculated an appropriate soft threshold, which serves as the criterion for network construction. After selecting the soft threshold, we adopted a one-step approach to construct the scale-free topology matrix. By cutting the gene dendrogram with a minimum module size of 10 and a merge cut height of 0.25, we obtained the module clustering tree. Subsequently, we linked the patients' clinical information with the module clustering tree of gene expression. Based on this correlation, we identified which genes within specific modules were related to the patients' survival information. Moreover, we calculated the gene significance (GS) value of specific genes associated with specific traits.

4.4. Identification of MAFGs in the ccRCC patient cohort

Because the differentially expressed genes related to metastasis had been detected through single-cell RNA sequencing (scRNA-seq) and ferroptosis genes are closely associated with tumor metastasis, we obtained MAFGs by taking the intersection of these two gene sets. Subsequently, we further screened the MAFGs in multiple ccRCC patient cohorts. First, we randomly divided the TCGA-KIRC cohort with clinical information into training and testing groups at a 2:1 ratio. Then, we conducted univariate logistic regression, LASSO regression (the least absolute shrinkage and selection operator method), and multivariate logistic regression analyses on the TCGA training dataset, aiming to identify the key ferroptosis genes related to metastasis. The LASSO regression model was fitted with the glmnet package to identify the prognostic hub genes among the marker genes identified through scRNA-seq.
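As a minimal sketch of the 2:1 random division described above, the following Python snippet shuffles a list of sample identifiers and assigns roughly two thirds to training and the remainder to testing. The function name `split_two_to_one`, the seed, and the placeholder identifiers are illustrative assumptions; the study performed this step in R, and its exact procedure and random seed are not reported.

```python
import random


def split_two_to_one(sample_ids, seed=0):
    """Randomly split sample IDs into training (about 2/3) and testing (about 1/3)."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * 2 / 3)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers standing in for the TCGA-KIRC samples
# with metastasis information (498 is the count given in Section 4.1).
samples = [f"sample_{i:03d}" for i in range(498)]
train, test = split_two_to_one(samples, seed=42)
print(len(train), len(test))
```

A purely random split like this does not guarantee the same proportion of metastatic cases in both groups; a stratified split would be one alternative if that balance matters.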
Moreover, the MAFG risk score was calculated as follows: MAFG risk score = Σ(βi × Expi), where βi is the coefficient of gene i derived from multivariate regression and indicates the weight of each included gene, and Expi is the expression level of gene i [9]. In the training dataset, we assessed the value of the MAFG risk score in predicting ccRCC metastasis using ROC curves and calibration curves, and we evaluated differences in various survival outcomes through Kaplan-Meier analysis. Furthermore, we validated the predictive value of the MAFG risk score in external datasets (GSE22541 and the International Cancer Genome Consortium (ICGC) datasets).

4.5. Construction of an accurate predictive model for monitoring ccRCC metastasis

We integrated the MAFG risk score with other clinical characteristics within the entire TCGA-KIRC cohort. Univariate and multivariate logistic regression methods were employed to evaluate the significant clinical variables. After excluding the missing and nonsensical variables, we established a comprehensive MAFG nomogram model using the Generalized Linear Model (GLM). The predictive performance of the MAFG nomogram score was assessed through ROC plots and calibration curves in both the TCGA-KIRC and ICGC cohorts. In addition, Kaplan-Meier analysis was used to assess the survival differences between the high and low MAFG nomogram score groups.

4.6. Functional pathway analysis between two groups based on MAFG nomogram score

The TCGA-KIRC cohort was divided into two groups with high and low MAFG nomogram score levels. GSEA was then carried out in R using the nomogram score as the phenotype. Enriched signaling pathways with a False Discovery Rate (FDR) < 0.05 were regarded as statistically significant.

4.7. Statistical analysis

LASSO regression and logistic regression analyses were carried out with the glmnet package.
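For illustration, the risk score defined in Section 4.4 and the AUC used to evaluate it can be sketched in Python as below. The coefficients are those reported in Section 2.3; the expression values, the labels, and the helper names `mafg_risk_score` and `auc` are hypothetical. The pairwise (rank-based) AUC is a standard equivalent formulation, not the R routine used in the study.

```python
# Coefficients of the three MAFGs as reported in Section 2.3.
COEFFICIENTS = {"DPP4": -0.248, "SLC1A5": 0.661, "AIFM2": 0.737}


def mafg_risk_score(expression):
    """Return sum(beta_i * Exp_i) over the three MAFGs for one patient."""
    return sum(beta * expression[gene] for gene, beta in COEFFICIENTS.items())


def auc(scores, labels):
    """AUC as the probability that a metastatic case (label 1) scores higher
    than a non-metastatic case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical expression values and metastasis labels (1 = metastasis).
patients = [
    ({"DPP4": 5.0, "SLC1A5": 2.0, "AIFM2": 1.0}, 0),
    ({"DPP4": 2.0, "SLC1A5": 4.0, "AIFM2": 3.0}, 1),
    ({"DPP4": 4.0, "SLC1A5": 3.0, "AIFM2": 2.0}, 0),
    ({"DPP4": 3.0, "SLC1A5": 3.5, "AIFM2": 1.5}, 1),
]
scores = [mafg_risk_score(expr) for expr, _ in patients]
labels = [label for _, label in patients]
print(scores, auc(scores, labels))
```

Because the AUC depends only on the ranking of the scores, it can be computed on the raw risk score without converting it to a probability; calibration measures such as the Brier score, by contrast, require predicted probabilities from the fitted model.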
Kaplan-Meier curves were constructed with the survival package, and calibration curves were built with the riskRegression package. The GLM was fitted with the rms package. Continuous variables were compared using Student's t-test, and categorical variables using the chi-square (χ2) test. The Wilcoxon rank-sum test was used to compare ranked data between two groups, and the Kruskal-Wallis test was applied for comparisons among three or more groups. All statistical analyses were performed in R (version 4.1.2), with the exception of Kaplan-Meier curves, which were plotted using GraphPad Prism 8. A P-value less than 0.05 was regarded as statistically significant.

Declarations

Author Contribution
Wenxing Yue: Data curation, Formal analysis, Methodology, Writing-original draft. Meiyuan Huang, Qian Che, Xiyun Quan, Manling Tang & Taoli Wang: Formal analysis, Methodology. Siwei Zhang: Conceptualization, Funding acquisition, Project administration, Supervision, Writing-review & editing.

Acknowledgement
We acknowledge the TCGA and GEO databases for providing their platforms and the contributors for uploading their meaningful datasets.

Funding
This study was supported by the Scientific Research Plan Project of Zhuzhou Central Hospital (202243).

Disclosure
TCGA, ICGC and GEO are public databases. Ethical approval was obtained for the patients included in these databases. Users can download the relevant data free of charge for research and publish related articles. Our study is based on open-source data, so there are no ethical issues or conflicts of interest.
Data availability statement: TCGA, ICGC and GEO are public databases. The data used in our article are available from these databases.
Ethical approval: Ethical approval was obtained for the patients from TCGA and GEO included in our article.
Consent to Publish: applicable.
Consent to Participate: applicable.

References
1. Barata, P. C. & Rini, B. I. Treatment of renal cell carcinoma: Current status and future directions. CA Cancer J Clin 67, 507-524 (2017).
2. Cooley, L. S. et al. Experimental and computational modeling for signature and biomarker discovery of renal cell carcinoma progression. Mol Cancer 20, 136 (2021).
3. Carril-Ajuria, L., Santos, M., Roldán-Romero, J. M., Rodriguez-Antona, C. & de Velasco, G. Prognostic and Predictive Value of PBRM1 in Clear Cell Renal Cell Carcinoma. Cancers (Basel) 12 (2019).
4. Xue, D. et al. Circ-AKT3 inhibits clear cell renal cell carcinoma metastasis via altering miR-296-3p/E-cadherin signals. Mol Cancer 18, 151 (2019).
5. Stockwell, B. R. & Jiang, X. The Chemistry and Biology of Ferroptosis. Cell Chem Biol 27, 365-375 (2020).
6. Friedmann Angeli, J. P., Krysko, D. V. & Conrad, M. Ferroptosis at the crossroads of cancer-acquired drug resistance and immune evasion. Nat Rev Cancer 19, 405-414 (2019).
7. Lei, G., Zhuang, L. & Gan, B. Targeting ferroptosis as a vulnerability in cancer. Nat Rev Cancer 22, 381-396 (2022).
8. Zhang, J., Song, C., Tian, Y. & Yang, X. Single-Cell RNA Sequencing in Lung Cancer: Revealing Phenotype Shaping of Stromal Cells in the Microenvironment. Front Immunol 12, 802080 (2021).
9. Zhang, H. et al. N6-Methyladenosine-Related lncRNAs as potential biomarkers for predicting prognoses and immune responses in patients with cervical cancer. BMC Genom Data 23, 8 (2022).
10. Yao, X. et al. VHL Deficiency Drives Enhancer Activation of Oncogenes in Clear Cell Renal Cell Carcinoma. Cancer Discov 7, 1284-1305 (2017).
11. Turajlic, S. et al. Deterministic Evolutionary Trajectories Influence Primary Tumor Growth: TRACERx Renal. Cell 173, 595-610.e511 (2018).
12. Chen, W. et al. Targeting renal cell carcinoma with a HIF-2 antagonist. Nature 539, 112-117 (2016).
13. Harlander, S. et al. Combined mutation in Vhl, Trp53 and Rb1 causes clear cell renal cell carcinoma in mice. Nat Med 23, 869-877 (2017).
14. Smith, C. C. et al. Endogenous retroviral signatures predict immunotherapy response in clear cell renal cell carcinoma. J Clin Invest 128, 4804-4820 (2018).
15. Jonasch, E., Walker, C. L. & Rathmell, W. K. Clear cell renal cell carcinoma ontogeny and mechanisms of lethality. Nat Rev Nephrol 17, 245-261 (2021).
16. Qiu, L. et al. Pro-Angiogenic and Pro-Inflammatory Regulation by lncRNA MCM3AP-AS1-Mediated Upregulation of DPP4 in Clear Cell Renal Cell Carcinoma. Front Oncol 10, 705 (2020).
17. Kawakami, I. et al. Targeting of the glutamine transporter SLC1A5 induces cellular senescence in clear cell renal cell carcinoma. Biochem Biophys Res Commun 611, 99-106 (2022).
18. Liu, Y. et al. High expression of Solute Carrier Family 1, member 5 (SLC1A5) is associated with poor prognosis in clear-cell renal cell carcinoma. Sci Rep 5, 16954 (2015).
19. Guo, S. et al. AIFM2 promotes hepatocellular carcinoma metastasis by enhancing mitochondrial biogenesis through activation of SIRT1/PGC-1α signaling. Oncogenesis 12, 46 (2023).
20. Kraja, A. T. et al. Associations of Mitochondrial and Nuclear Mitochondrial Variants and Genes with Seven Metabolic Traits. Am J Hum Genet 104, 112-138 (2019).
21. Triska, P. et al. Landscape of Germline and Somatic Mitochondrial DNA Mutations in Pediatric Malignancies. Cancer Res 79, 1318-1330 (2019).
22. Li, S. et al. Mitochondrial Dysfunctions Contribute to Hypertrophic Cardiomyopathy in Patient iPSC-Derived Cardiomyocytes with MT-RNR2 Mutation. Stem Cell Reports 10, 808-821 (2018).
23. LaGory, E. L. et al. Suppression of PGC-1α Is Critical for Reprogramming Oxidative Metabolism in Renal Cell Carcinoma. Cell Rep 12, 116-127 (2015).
24. Kim, H. et al. Unsaturated Fatty Acids Stimulate Tumor Growth through Stabilization of β-Catenin. Cell Rep 13, 495-503 (2015).
25. Tang, X. et al. Cystine Deprivation Triggers Programmed Necrosis in VHL-Deficient Renal Cell Carcinomas. Cancer Res 76, 1892-1903 (2016).
26. Tan, S. K., Hougen, H. Y., Merchan, J. R., Gonzalgo, M. L. & Welford, S. M. Fatty acid metabolism reprogramming in ccRCC: mechanisms and potential targets. Nat Rev Urol 20, 48-60 (2023).
27. Yuan, L. et al. Co-expression network analysis identified six hub genes in association with progression and prognosis in human clear cell renal cell carcinoma (ccRCC). Genom Data 14, 132-140 (2017).
28. Zeng, J. H. et al. Prognosis of clear cell renal cell carcinoma (ccRCC) based on a six-lncRNA-based risk score: an investigation based on RNA-sequencing data. J Transl Med 17, 281 (2019).
29. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36, 411-420 (2018).
30. Pont, F., Tosolini, M. & Fournié, J. J. Single-Cell Signature Explorer for comprehensive visualization of single cell signatures across scRNA-seq datasets. Nucleic Acids Res 47, e133 (2019).
31. Kobak, D. & Berens, P. The art of using t-SNE for single-cell transcriptomics. Nat Commun 10, 5416 (2019).
32. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat Biotechnol (2018).
33. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).

Additional Declarations
No competing interests reported.
Supplementary Files SupplementaryTable1.csv SupplementaryTable6.csv SupplementaryTable3.csv SupplementaryTable5.xls SupplementaryTable4.csv SupplementaryTable2.csv Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7083430","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":534819509,"identity":"068a3ada-f4ec-4258-b3ae-fbd9549794dd","order_by":0,"name":"Wenxing Yue","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Wenxing","middleName":"","lastName":"Yue","suffix":""},{"id":534819510,"identity":"65b81fb8-f5e7-4231-87f6-26c0c0258319","order_by":1,"name":"Meiyuan Huang","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Meiyuan","middleName":"","lastName":"Huang","suffix":""},{"id":534819511,"identity":"7885f3f7-2118-456d-8fcf-d9af66e1e1da","order_by":2,"name":"Qian 
Che","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Qian","middleName":"","lastName":"Che","suffix":""},{"id":534819512,"identity":"a5942188-b554-4310-862f-1d28fc2de365","order_by":3,"name":"Manling Tang","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Manling","middleName":"","lastName":"Tang","suffix":""},{"id":534819513,"identity":"8c866ffd-1720-482d-8c67-2458d1cfcf10","order_by":4,"name":"Xiyun Quan","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Xiyun","middleName":"","lastName":"Quan","suffix":""},{"id":534819514,"identity":"d689406b-b4d9-4e99-80f4-9b1b31119a5c","order_by":5,"name":"Taoli Wang","email":"","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":false,"prefix":"","firstName":"Taoli","middleName":"","lastName":"Wang","suffix":""},{"id":534819515,"identity":"00d9b26d-49f9-49d8-a26c-212490dd76c5","order_by":6,"name":"Siwei Zhang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA30lEQVRIie3PsWoCQRCA4TkGdpvFa1d8iYEDTeAwrzLHwVYWlikCORAuTR7gHsNHOB1idWBrYXE2qQ/SWEgSU0fOtbPYD6abn2EAguAe4XlaSDFGrOuO0qlfwuD08K3MVtXc5X6XGCSmpknEdOuouLY90Xho+RlHsGOSlGoELR/LvuRxoRLiRiVRxSwz2g/AOLfrS0iMsllpcrRcn5NPBGvGHsm3fS1tVsgDSVT4JQWhMQICfsnfLxtGq0tYvZPL1dVftnJou5cffJL4qzue0mmsZdOb/KduWw+CIAgu+QVi7Er9YqVDUQAAAABJRU5ErkJggg==","orcid":"","institution":"Zhuzhou Central Hospital","correspondingAuthor":true,"prefix":"","firstName":"Siwei","middleName":"","lastName":"Zhang","suffix":""}],"badges":[],"createdAt":"2025-07-09 
11:38:20","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7083430/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7083430/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":94681443,"identity":"6c813bf6-5e33-47fd-8ba8-54f29273fc83","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":124737,"visible":true,"origin":"","legend":"","description":"","filename":"Manuscript.docx","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/03dfde518ec436b00585b9bc.docx"},{"id":94728336,"identity":"4361245c-f119-4824-8bea-9eec80b0d651","added_by":"auto","created_at":"2025-10-30 07:03:35","extension":"json","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":8639,"visible":true,"origin":"","legend":"","description":"","filename":"3c525f7da93f4c98b4fd3591367e9be6.json","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/25b1e0d3fa454875e07723d9.json"},{"id":94728447,"identity":"a85bbadb-8539-4de8-9f48-d25f53992fc0","added_by":"auto","created_at":"2025-10-30 07:03:49","extension":"csv","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":2078,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable1.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/6dda68c9829180c0e827cfcb.csv"},{"id":94728515,"identity":"0ed1512a-f0d2-4358-8756-8b3cc7ed6e1c","added_by":"auto","created_at":"2025-10-30 
07:03:57","extension":"csv","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":74066,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable2.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/230a375d7b3d7f4d8064c156.csv"},{"id":94728135,"identity":"f11299d5-114e-456f-be47-53fa17eaf96f","added_by":"auto","created_at":"2025-10-30 07:03:08","extension":"csv","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":728,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable3.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/3c5ab2022428aa4016769af9.csv"},{"id":94728397,"identity":"ca40f7a0-e256-4246-8c3d-4fb4f96639a1","added_by":"auto","created_at":"2025-10-30 07:03:44","extension":"csv","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":80456,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable4.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/38f2a883a8f3a4c2a23c8de3.csv"},{"id":94681456,"identity":"e4d9bed9-91f4-447c-9662-a21d271f3f63","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"xls","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":22016,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable5.xls","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/d9e992fa3b24390a4c76d0ab.xls"},{"id":94681459,"identity":"fd62d6d0-a3ea-4d3b-9411-0e4ee0d7eb75","added_by":"auto","created_at":"2025-10-29 
14:51:55","extension":"csv","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1493,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable6.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/aaa5d62d40644848d5d7f384.csv"},{"id":94681464,"identity":"ff309daf-cb1a-4978-b7e6-2b773c8fd777","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"xml","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":83201,"visible":true,"origin":"","legend":"","description":"","filename":"3c525f7da93f4c98b4fd3591367e9be61enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/2a90639258d0c72e2097d853.xml"},{"id":94681462,"identity":"d9d1ec6c-ddfa-4fdb-8cac-a6039fb7a2b5","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"pdf","order_by":16,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":574447,"visible":true,"origin":"","legend":"","description":"","filename":"Figure1.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/dd1a650773aaac16895b8d00.pdf"},{"id":94728127,"identity":"061070b1-0f81-4855-aeb0-44facf05c14b","added_by":"auto","created_at":"2025-10-30 07:03:07","extension":"pdf","order_by":17,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1722021,"visible":true,"origin":"","legend":"","description":"","filename":"Figure2.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/e12b1491cc2617ecc796f51d.pdf"},{"id":94681465,"identity":"9519900d-f219-4ef5-b74a-4b80d27e8173","added_by":"auto","created_at":"2025-10-29 
14:51:55","extension":"pdf","order_by":18,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":695561,"visible":true,"origin":"","legend":"","description":"","filename":"Figure3.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/7250dfe1eb7f61c364d45cd2.pdf"},{"id":94728749,"identity":"1f21ed66-3f65-4b91-8337-7b6f7bcaff8e","added_by":"auto","created_at":"2025-10-30 07:04:14","extension":"pdf","order_by":19,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":627719,"visible":true,"origin":"","legend":"","description":"","filename":"Figure4.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/dab8f034daf94e65ce4f0262.pdf"},{"id":94728777,"identity":"1024be9b-4cdf-4465-9204-da5bdef8113b","added_by":"auto","created_at":"2025-10-30 07:04:16","extension":"pdf","order_by":20,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":401316,"visible":true,"origin":"","legend":"","description":"","filename":"Figure5.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/c6ee00a20af491a7cdb14072.pdf"},{"id":94681471,"identity":"26fbaf8f-ce92-44fd-9b27-4381b3427ece","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"pdf","order_by":21,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":420366,"visible":true,"origin":"","legend":"","description":"","filename":"Figure6.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/34e8b7f63b61d66533516474.pdf"},{"id":94681469,"identity":"f0fa92c8-987b-4f06-a6bd-9706f568e139","added_by":"auto","created_at":"2025-10-29 
14:51:55","extension":"pdf","order_by":22,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":969091,"visible":true,"origin":"","legend":"","description":"","filename":"Figure7.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/4ec89d8f6bcf2381350eb4b8.pdf"},{"id":94728792,"identity":"eefe5320-5c40-4e80-80c6-7cb8abecd69b","added_by":"auto","created_at":"2025-10-30 07:04:17","extension":"xml","order_by":23,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":81415,"visible":true,"origin":"","legend":"","description":"","filename":"3c525f7da93f4c98b4fd3591367e9be61structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/1b1baaa3b1d16c8dd931348f.xml"},{"id":94681468,"identity":"9ea4527b-157e-4d52-a98e-94677b354160","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"html","order_by":24,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":91230,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/cc753d0098548c778f7e51f4.html"},{"id":94728424,"identity":"35f98257-5b1f-4c2c-8f86-41ff06569ffa","added_by":"auto","created_at":"2025-10-30 07:03:47","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":182793,"visible":true,"origin":"","legend":"\u003cp\u003ePreliminary screening of transfer-related ferroptosis genes in single-cell sequencing. (A) Quality control of single-cell sequencing data GSE73121, removing substandard cells. (B-C) Dimensionality reduction analysis of the selected single cells using tSNE and UMAP methods, successfully clustering the cells into two clusters. (D) Screening of differential genes between the primary and metastatic clusters (|log fold change (FC)| \u0026gt; 0.5, FDR \u0026lt; 0.05). 
(E) Intersection of ferroptosis genes with the previously screened transfer-related differential genes.\u003c/p\u003e","description":"","filename":"Binder11.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/7c419aafcd5c1e8366a9ec78.png"},{"id":94728333,"identity":"981eef51-5037-4bd7-a264-3d2cf47b45c6","added_by":"auto","created_at":"2025-10-30 07:03:34","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":604372,"visible":true,"origin":"","legend":"\u003cp\u003eWGCNA method screening and validation of transfer-related ferroptosis genes. (A) Heatmap showing the relationship between gene modules constructed by the WGCNA method and traits. (B) Correlation analysis between transfer and various gene modules.\u003c/p\u003e","description":"","filename":"Binder12.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/a9bd12fa4228979ff1fc72de.png"},{"id":94681446,"identity":"168d1f88-b5b7-4696-8c91-1a63fe367aa6","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":203678,"visible":true,"origin":"","legend":"\u003cp\u003eScreening and validation of MAFG risk score constructed in TCGA. (A) Univariate logistic regression screening of MAFGs. (B) Further analysis of MAFGs using the LASSO method. (D) Multivariate logistic regression screening to obtain the final 3 MAFGs. (E) ROC curve analysis of MAFG risk score predicting metastasis in TCGA training and testing cohorts. (F) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA training cohort. 
(G) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA testing cohort.\u003c/p\u003e","description":"","filename":"Binder13.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/b96fa9906ab362024fa162c5.png"},{"id":94728747,"identity":"a88b7e79-05e0-43ff-bc4c-63cd9c1799e3","added_by":"auto","created_at":"2025-10-30 07:04:14","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":185145,"visible":true,"origin":"","legend":"\u003cp\u003eFurther validation of the MAFG risk score's predictive ability for metastasis and survival prognosis. (A) ROC curve analysis of MAFG risk score predicting metastasis events in the GSE22541 cohort. (B) ROC curve analysis of MAFG risk score predicting metastasis events in the ICGC cohort. (C) Calibration curve analysis of MAFG risk score predicting metastasis in the GSE22541 cohort. (D) Calibration curve analysis of MAFG risk score predicting metastasis in the ICGC cohort. (E-G) Comparison of overall survival time, disease-specific survival time, and progression-free survival time between high and low MAFG risk score groups in the TCGA-KIRC cohort. (H) Kaplan-Meier survival curve of disease-free survival time for patients with high and low MAFG risk scores in the GSE22541 cohort. (I) Kaplan-Meier survival curve of overall survival time for patients with high and low MAFG risk scores in the the ICGC cohort.\u003c/p\u003e","description":"","filename":"Binder14.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/d26aef5d7a38c6b4c71d0c49.png"},{"id":94728278,"identity":"120efd27-facb-4c8f-802f-58edbd588bfa","added_by":"auto","created_at":"2025-10-30 07:03:27","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":116939,"visible":true,"origin":"","legend":"\u003cp\u003eConstructing the MAFG nomogram scoring model and validating its effectiveness. 
(A) MAFG risk score (MAFG-RS) and T stage results obtained through multivariate logistic regression screening. (B) Plotting the MAFG nomogram based on MAFG risk score (MAFG-RS) and T stage to predict metastasis and calculating MAFG nomogram scores. (C) Performing calibration curve detection of the MAFG nomogram score predicting metastasis performance in the TCGA training set and calculating AUC and probability calibration Brier scores. (D) Performing calibration curve detection of the MAFG nomogram score predicting metastasis performance in the TCGA testing set and calculating AUC and probability calibration Brier scores.\u003c/p\u003e","description":"","filename":"Binder15.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/27e9836d17ec2e6fa2c99e49.png"},{"id":94681454,"identity":"4b31256a-71a0-42f1-a344-5b26c5f6d733","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":131066,"visible":true,"origin":"","legend":"\u003cp\u003eFurther examining the predictive ability of the MAFG nomogram score for metastasis and survival prognosis. (A) ROC curve for predicting metastatic events using MAFG nomogram score in the ICGC cohort. (B) Performing calibration curve detection of MAFG nomogram score predicting metastasis performance in the ICGC cohort. (C-E) We compared the differences in overall survival, disease-free survival, and progression-free survival between patients with high and low MAFG nomogram scores in the TCGA-KIRC cohort. 
(F) Drawing KM survival curves of patients with high and low MAFG nomogram scores in the ICGC cohort.\u003c/p\u003e","description":"","filename":"Binder16.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/a494e2426129195fe70bcbe9.png"},{"id":94681453,"identity":"fa448ace-2cb9-419a-9a0a-8355bbf513ef","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":263753,"visible":true,"origin":"","legend":"\u003cp\u003eGSEA results showing significant differences in biological processes between high and low MAFG nomogram score levels.\u003c/p\u003e","description":"","filename":"Binder17.png","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/5890701cc7c971569e92728e.png"},{"id":99797099,"identity":"b59a74af-7d5a-4bec-9b66-9dd56d6e105f","added_by":"auto","created_at":"2026-01-08 13:44:34","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2637348,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/f9ce4b02-d08a-48f4-95d1-9d2fc1eea9c4.pdf"},{"id":94681440,"identity":"2aa1b766-c652-45b8-a5d2-781740abd4e5","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"csv","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":2078,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable1.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/96e3f882e7c9193d239d0a2b.csv"},{"id":94681444,"identity":"f1aac110-518a-4e23-ad7f-da02e54fe25f","added_by":"auto","created_at":"2025-10-29 
14:51:55","extension":"csv","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":1493,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable6.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/da1dca43865c24ed8af8d05a.csv"},{"id":94681441,"identity":"5992081e-a94a-4765-a125-980c99af40bd","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"csv","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":728,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable3.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/40f880f02f29fb18fb890d24.csv"},{"id":94728148,"identity":"96472cb3-357e-41b9-b516-09f94af35aeb","added_by":"auto","created_at":"2025-10-30 07:03:11","extension":"xls","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":22016,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable5.xls","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/417855e6f281aa19520e63df.xls"},{"id":94681449,"identity":"0d85d879-3e78-4cee-b94e-73a1c06e931a","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"csv","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":80456,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable4.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/f31bf66af2045981d90ddcb8.csv"},{"id":94681457,"identity":"f7ead52b-82ff-496c-a244-2277b82d13d3","added_by":"auto","created_at":"2025-10-29 14:51:55","extension":"csv","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":74066,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable2.csv","url":"https://assets-eu.researchsquare.com/files/rs-7083430/v1/e779fd9533c8769b588673dc.csv"}],"financialInterests":"No competing interests 
reported.","formattedTitle":"Ferroptosis-related gene signature predicts clinical metastasis of clear cell renal cell carcinoma based on single-cell RNA-seq","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003eRenal cell carcinoma (RCC) accounts for 2%-3% of all human malignancies, and its incidence has been increasing globally in recent years\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. The most common histopathological type is clear cell renal cell carcinoma (ccRCC), which constitutes 80%-90% of kidney cancer\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. Renal cancer is insensitive to radiotherapy and chemotherapy, lacking effective systemic treatment methods. Therefore, localized renal cancer has a good prognosis after surgical treatment, with a 5-year survival rate exceeding 70%. Nearly 20% of ccRCC cases progress to advanced stages or metastasize at diagnosis, with a 5-year overall survival (OS) rate of less than 25%\u003csup\u003e4\u003c/sup\u003e. Currently, the diagnosis of metastatic renal cancer still relies on imaging examination, lacking precise molecular biological markers that can react early to renal cancer metastasis. Therefore, it is possible that some patients, although diagnosed with localized renal cancer, have already developed micro-metastatic lesions that imaging cannot detect. These patients may be more prone to tumor metastasis recurrence after the primary lesion is surgically removed. Previous researchers have attempted to study several key biomarkers related to metastasis from a large number of transcriptomic profiles. 
Screening and identifying valuable metastasis-related genes can expand our comprehensive understanding of the genomic changes between primary and metastatic ccRCC. Furthermore, these perilous biomarkers offer additional avenues for refining strategies and effectively forecasting progression events. Consequently, there exists an imperative to investigate the molecular mechanisms behind ccRCC metastasis and to identify new therapeutic targets aimed at enhancing patient prognosis.\u003c/p\u003e\u003cp\u003eFerroptosis is a regulatory form of cell death, characterized by the accumulation of iron-dependent lipid peroxides to lethal levels. It represents a novel mode of cell death, distinct from apoptosis, necrosis, and autophagy. The interplay between ferroptosis and lipid metabolism plays a crucial role in tumor development, invasion, metastasis, drug resistance, and tumor immunity\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. Currently, inducing ferroptosis holds significant potential for cancer therapy as a novel therapeutic strategy to trigger cancer cell death, particularly in tumors that have developed resistance to conventional treatments, with notable efficacy\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. Ferroptosis-related genes are linked to tumor metastasis and prognosis. Nevertheless, the relationship between ferroptosis genes and the metastasis of clear cell renal carcinoma, along with its prognostic implications, remains to be fully clarified.\u003c/p\u003e\u003cp\u003eSingle-cell sequencing separates tissues into single cells for transcriptomic sequencing, revealing the gene expression status of individual cells in tissues or bodily fluids at the single-cell level. 
The emergence of single-cell sequencing (SCS) technology allows for precise cell-level studies of the genomics, transcriptomics, and epigenomics of tumor cells\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. This supports precise, individualized treatment of tumors. Therefore, we studied important marker genes in primary and metastatic ccRCC cell subpopulations through single-cell expression profiling.\u003c/p\u003e\u003cp\u003eIn the current study, we identified genes differentially expressed between primary and metastatic ccRCC using the single-cell sequencing dataset GSE73121. Additionally, we compiled transcriptomic and clinical data from roughly 700 patient tissue samples sourced from multiple databases, such as GSE22541, The Cancer Genome Atlas (TCGA), and the International Cancer Genome Consortium (ICGC). Our investigation entailed a comprehensive multi-omics analysis to identify metastasis-associated ferroptosis genes (MAFGs), with the goal of verifying the predictive accuracy of these biomarkers for ccRCC metastasis and tumor recurrence. This validation is anticipated to refine the selection of precise treatment regimens and improve the overall prognosis for patients.\u003c/p\u003e"},{"header":"2. Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1. Screening metastasis-related genes from single-cell sequencing data\u003c/h2\u003e\u003cp\u003eWe acquired 118 single-cell transcriptome profiles from GSE73121. After quality control based on the number of genes and sequencing counts for each cell (Fig.\u0026nbsp;1A), we retained 116 cells for further analysis. All cells within the GSE73121 dataset were classified as epithelial cells. 
Furthermore, we employed tSNE and UMAP to separate the cells into two distinct subgroups (Fig.\u0026nbsp;1B, C). These two subgroups correspond to primary and metastatic cells (Supplementary Table\u0026nbsp;1). Accordingly, we performed differential expression analysis using the DESeq2 package in R, identifying a total of 2383 metastasis-related differentially expressed genes with |log fold change (FC)| \u0026gt;\u0026thinsp;0.5 and adjusted P-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 (Fig.\u0026nbsp;1D, Supplementary Table\u0026nbsp;2). We then intersected the screened metastasis-related differentially expressed genes with the ferroptosis-related gene set, yielding 13 MAFGs (Fig.\u0026nbsp;1E).\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;1.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;1. Preliminary screening of metastasis-related ferroptosis genes in single-cell sequencing. (A) Quality control of single-cell sequencing data GSE73121, removing substandard cells. (B-C) Dimensionality reduction analysis of the selected single cells using tSNE and UMAP methods, grouping the cells into two clusters. (D) Screening of differentially expressed genes between the primary and metastatic clusters (|log fold change (FC)| \u0026gt;\u0026thinsp;0.5, FDR\u0026thinsp;\u0026lt;\u0026thinsp;0.05). (E) Intersection of ferroptosis genes with the previously screened metastasis-related differentially expressed genes.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2. WGCNA screening of metastasis-related differentially expressed ferroptosis genes\u003c/h2\u003e\u003cp\u003eUsing the weighted gene co-expression network analysis (WGCNA) package in R, we constructed a network based on KIRC expression data from TCGA and screened gene modules in conjunction with patient clinical survival information (Fig.\u0026nbsp;2A). 
Furthermore, we screened for metastasis-related gene modules, as shown in Fig.\u0026nbsp;2B, where the pink module is most closely related to metastasis. By integrating gene significance (GS value) and module membership (MM value), 9 out of the 13 MAFGs met the criteria for further analysis (Supplementary Table\u0026nbsp;3).\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;2.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;2. WGCNA screening and validation of metastasis-related ferroptosis genes. (A) Heatmap showing the relationship between gene modules constructed by the WGCNA method and traits. (B) Correlation analysis between metastasis and various gene modules.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3. Screening of metastasis-associated ferroptosis genes using LASSO method and logistic regression\u003c/h2\u003e\u003cp\u003eFirst, we collected the expression profiles of 9 MAFGs and integrated them with clinical information from the TCGA-Kidney Renal Clear Cell Carcinoma (KIRC) dataset into a table (Supplementary Table\u0026nbsp;4). The patient data from Supplementary Table\u0026nbsp;3 were randomly divided at a ratio of 2:1 into a training cohort and a testing cohort, with corresponding statistical information provided in Supplementary Table\u0026nbsp;5 (patients without metastasis information were not included in the statistical analysis). We then performed univariate logistic regression on the 9 genes in the training cohort using R, which retained 7 genes (Fig.\u0026nbsp;3A). Next, we applied the least absolute shrinkage and selection operator (LASSO) method and multivariate logistic regression analysis to the 7 genes (Fig.\u0026nbsp;3B-D). By combining the two methods, we ultimately obtained 3 MAFGs (DPP4, SLC1A5, AIFM2). 
Then, based on multivariate logistic regression, we established a MAFG risk score: MAFG-RS\u0026thinsp;=\u0026thinsp;Exp[DPP4] \u0026times; (-0.248)\u0026thinsp;+\u0026thinsp;Exp[SLC1A5] \u0026times; 0.661\u0026thinsp;+\u0026thinsp;Exp[AIFM2] \u0026times; 0.737\u003csup\u003e9\u003c/sup\u003e. The area under the receiver operating characteristic (ROC) curve (AUC) was 0.727 and 0.738 for predicting metastasis events in the training and testing cohorts, respectively (Fig.\u0026nbsp;3E). Additionally, calibration curves plotted using the calibrate function in R showed good agreement between predicted and observed metastasis (Fig.\u0026nbsp;3F-G), with mean absolute errors of 0.023 and 0.03, respectively.\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;3.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;3. Screening and validation of MAFG risk score constructed in TCGA. (A) Univariate logistic regression screening of MAFGs. (B-C) Further analysis of MAFGs using the LASSO method. (D) Multivariate logistic regression screening to obtain the final 3 MAFGs. (E) ROC curve analysis of MAFG risk score predicting metastasis in TCGA training and testing cohorts. (F) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA training cohort. (G) Calibration curve analysis of MAFG risk score predicting metastasis in the TCGA testing cohort.\u003c/p\u003e\u003cp\u003eFurthermore, in the independent GSE22541 and ICGC cohorts, the AUCs of the ROC curves for predicting metastasis events using the MAFG risk score were 0.729 and 0.666, respectively (Fig.\u0026nbsp;4A-B). Similarly, the calibration curves in Fig.\u0026nbsp;4C-D also demonstrated good predictive performance for patient metastasis, with mean absolute errors of 0.028 and 0.022, respectively. 
The Hosmer-Lemeshow test showed that the MAFG risk score (MAFG-RS) had a good fit in both the GSE22541 (X-squared\u0026thinsp;=\u0026thinsp;8.1585, df\u0026thinsp;=\u0026thinsp;8, p-value\u0026thinsp;=\u0026thinsp;0.4181) and ICGC (X-squared\u0026thinsp;=\u0026thinsp;4.8809, df\u0026thinsp;=\u0026thinsp;8, p-value\u0026thinsp;=\u0026thinsp;0.7702) cohorts. Additionally, the MAFG risk score could effectively distinguish patients with different survival prognoses in the TCGA, GSE22541, and ICGC cohorts (Fig.\u0026nbsp;4E-I): a high MAFG-RS was associated with more deaths and more recurrence/progression events.\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;4.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;4. Further validation of the MAFG risk score's predictive ability for metastasis and survival prognosis. (A) ROC curve analysis of MAFG risk score predicting metastasis events in the GSE22541 cohort. (B) ROC curve analysis of MAFG risk score predicting metastasis events in the ICGC cohort. (C) Calibration curve analysis of MAFG risk score predicting metastasis in the GSE22541 cohort. (D) Calibration curve analysis of MAFG risk score predicting metastasis in the ICGC cohort. (E-G) Comparison of overall survival time, disease-specific survival time, and progression-free survival time between high and low MAFG risk score groups in the TCGA-KIRC cohort. (H) Kaplan-Meier survival curve of disease-free survival time for patients with high and low MAFG risk scores in the GSE22541 cohort. (I) Kaplan-Meier survival curve of overall survival time for patients with high and low MAFG risk scores in the ICGC cohort.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4. Constructing the MAFG nomogram prediction model\u003c/h2\u003e\u003cp\u003eFurthermore, we combined the MAFG-RS with other clinical variables to construct a comprehensive model for monitoring ccRCC progression. 
Since most patients lacked AJCC-N stage information, we excluded it from the candidate factors for the model. We first performed univariate and multivariate logistic regression screening on candidate factors (age, gender, tumor grade, pathological stage, AJCC-T stage, and MAFG-RS) in the TCGA-KIRC training cohort, ultimately including AJCC-T stage and MAFG-RS in the prediction model (Supplementary Table\u0026nbsp;6, Fig.\u0026nbsp;5A). Using the rms and regplot packages in R, we fitted a generalized linear model (GLM) that includes AJCC-T stage and MAFG-RS to create the MAFG nomogram (Fig.\u0026nbsp;5B). We plotted calibration curves in the training cohort based on the MAFG nomogram score (riskRegression package in R) and calculated the AUC of the ROC curve and the Brier score for probability calibration. As shown in Fig.\u0026nbsp;5C, the predicted values and observed values were closely aligned, with AUC and Brier scores of 85.8% and 10.9%, respectively. Similarly, we plotted calibration curves in the TCGA-KIRC testing cohort and obtained AUC and Brier scores of 85.7% and 9.1%, respectively (Fig.\u0026nbsp;5D). In summary, the constructed metastasis-related MAFG nomogram prediction model can effectively predict patients' metastasis status.\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;5.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;5. Constructing the MAFG nomogram scoring model and validating its effectiveness. (A) MAFG risk score (MAFG-RS) and T stage results obtained through multivariate logistic regression screening. (B) MAFG nomogram based on MAFG risk score (MAFG-RS) and T stage, used to predict metastasis and calculate MAFG nomogram scores. (C) Calibration curve of the MAFG nomogram score for predicting metastasis in the TCGA training set, with AUC and Brier scores. 
(D) Calibration curve of the MAFG nomogram score for predicting metastasis in the TCGA testing set, with AUC and Brier scores.\u003c/p\u003e\u003cp\u003eAdditionally, in an independent ICGC cohort, the AUC of the ROC curve for the MAFG nomogram score predicting metastatic events was 0.949 (Fig.\u0026nbsp;6A). The calibration curve in Fig.\u0026nbsp;6B also showed that the MAFG nomogram score can effectively predict metastatic events, with a Brier score of 5.2% for predicting metastasis, and the p-value of the Unreliability test (U test) was 0.954. Similarly, the results of the Hosmer-Lemeshow test indicated that the MAFG nomogram score can effectively predict the occurrence of metastatic events in the ICGC cohort (X-squared\u0026thinsp;=\u0026thinsp;0.056828, df\u0026thinsp;=\u0026thinsp;8, p-value\u0026thinsp;=\u0026thinsp;1). Furthermore, in both TCGA and ICGC cohorts, patients with high MAFG nomogram scores tended to have higher risks of death or recurrence/progression (Fig.\u0026nbsp;6C-F). Comparison of the Kaplan-Meier (K-M) curves suggests that the MAFG nomogram score can better differentiate between patients with different prognoses.\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;6.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;6. Further examination of the predictive ability of the MAFG nomogram score for metastasis and survival prognosis. (A) ROC curve for predicting metastatic events using the MAFG nomogram score in the ICGC cohort. (B) Calibration curve of the MAFG nomogram score for predicting metastasis in the ICGC cohort. (C-E) Differences in overall survival, disease-free survival, and progression-free survival between patients with high and low MAFG nomogram scores in the TCGA-KIRC cohort. 
(F) Kaplan-Meier survival curves of patients with high and low MAFG nomogram scores in the ICGC cohort.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5. Gene set enrichment analysis\u003c/h2\u003e\u003cp\u003eUsing the MAFG nomogram score as a reference phenotype, we selected the transcriptomic data of patients from the total TCGA-KIRC cohort for gene set enrichment analysis. We observed that the adipocyte cytokine signaling pathway, glycosylphosphatidylinositol (GPI) anchor biosynthesis, phosphatidylinositol metabolism, fatty acid metabolism, renal cell carcinoma, and mTOR signaling pathway were upregulated in the high MAFG nomogram score group. However, α-linolenic acid metabolism, cytoplasmic DNA sensing pathway, and P53 signaling pathway were downregulated in the low MAFG nomogram score group (Fig.\u0026nbsp;7). Thus, changes in the adipocyte cytokine signaling pathway, mTOR signaling pathway, and P53 signaling pathway may be involved in the occurrence and development of ferroptosis-related renal cancer.\u003c/p\u003e\u003cp\u003e[insert Fig.\u0026nbsp;7.]\u003c/p\u003e\u003cp\u003eFigure\u0026nbsp;7. GSEA results showing significant differences in biological processes between high and low MAFG nomogram score levels.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Discussion","content":"\u003cp\u003eMalignant progression and high tumor recurrence rates make renal cell carcinoma the deadliest malignancy of the urinary system\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. Previous studies mainly focused on screening biomarkers that are differentially expressed between tumor and non-tumor tissues\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e. 
However, in bulk transcriptomic analyses of large cell populations, important genes may be overlooked\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Clinically, there may be some patients who, although diagnosed with localized renal cancer, actually have micro-metastases that are undetectable by imaging. This subset of patients may be more prone to tumor metastasis and recurrence after primary lesion resection. Therefore, studying the mechanisms that regulate RCC progression is an important basis for improving RCC treatment. Identifying more valuable metastasis-related genes can enhance our understanding of the genomic changes between primary and metastatic ccRCC. These key biomarkers can provide references for better treatment strategies or predicting cancer progression. Additionally, elucidating the potential mechanisms related to the metastasis and recurrence of renal cell carcinoma is particularly meaningful. Recent studies have shown that ferroptosis is closely related to the occurrence and development of various tumors\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eIn our study, we analyzed 116 high-quality single-cell RNA-seq profiles to illustrate the differential genomic features between primary and metastatic ccRCC, and then cross-referenced the resulting genes with a ferroptosis gene database to obtain 13 MAFGs. We selected ccRCC tissue sequencing data from TCGA for WGCNA, univariate logistic regression, LASSO, and multivariate logistic regression analysis, ultimately identifying three MAFGs: DPP4, SLC1A5, and AIFM2. 
We used internal and independent external cohorts to validate our MAFG signature and constructed an integrated MAFG nomogram model, which combines AJCC-T stage and MAFG-RS to predict tumor progression. Multi-omics analysis indicated that high MAFG nomogram risk scores are associated with high tumor mutational burden (TMB), which has been reported to be a risk factor for prognosis. These findings suggest that scRNA-seq combined with validation in patient cohorts is a powerful and sensitive strategy for obtaining important gene signatures with potential clinical value in ccRCC.\u003c/p\u003e\u003cp\u003eMany studies have examined differences in gene expression profiles and identified molecular biomarkers related to RCC, suggesting that predicting patient prognosis with molecular biomarkers has great prospects\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. The single-cell RNA sequencing (scRNA-seq) analysis was performed under stringent quality control conditions. Cells with a high proportion of mitochondrial gene reads (\u0026gt;\u0026thinsp;5%) were excluded, as they constitute a confounding variable in the statistical outcomes. Subsequently, t-distributed stochastic neighbor embedding (tSNE) and uniform manifold approximation and projection (UMAP) were carried out for nonlinear dimensionality reduction. This process categorized ccRCC cells into primary and metastatic subtypes based on the actual cell characteristics. On the basis of these findings, marker genes were screened between the two cell clusters to facilitate the subsequent analysis of MAFGs. It has been reported that some of the final three MAFGs play significant roles in the advancement of malignant tumors. 
For instance, research has demonstrated that DPP4 facilitates angiogenesis and inflammatory modulation in ccRCC, which indirectly supports its role in promoting cancer metastasis\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. Kawakami et al. showed that the downregulation of SLC1A5 significantly inhibits tumor growth, invasion, and migration\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. Furthermore, a high level of SLC1A5 expression is correlated with a poor prognosis\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. AIFM2 enhances mitochondrial biogenesis to promote hepatocellular carcinoma metastasis by activating the SIRT1/PGC-1α signaling pathway\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. Previous investigations have underlined the crucial roles of these genes in cancer metabolic regulation\u003csup\u003e\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. We observed differences in these three genes between primary and metastatic cell populations, which are associated with a high probability of tumor progression, providing a new direction for our subsequent research.\u003c/p\u003e\u003cp\u003eWe used the GSE22541 and ICGC cohorts as external datasets to further test our MAFG signature and confirmed the clinical value of MAFGs in predicting OS or DFS. Subsequent multivariate logistic regression analysis was performed to screen for clinically relevant and significant factors pertaining to the metastasis of renal cell carcinoma (RCC). 
The MAFG nomogram developed from this analysis exhibits enhanced predictive capability regarding the likelihood of patients experiencing metastasis. Nevertheless, the potential predictive value of MAFGs in the context of drug treatment remains unclear and is a valuable area for future research. To further explore the biological relevance of MAFGs, we carried out functional enrichment analyses, which highlighted several well-known biological pathways, such as the mTOR signaling pathway and the P53 signaling pathway, which are recognized as crucial signaling crosstalk pathways in the context of ccRCC. This approach aimed to elucidate the underlying mechanisms and potential associations of MAFGs within these pathways, thereby providing a more comprehensive understanding of their role in RCC biology and potentially uncovering novel therapeutic targets or biomarkers\u003csup\u003e\u003cspan additionalcitationids=\"CR24\" citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. Additionally, pathways such as the adipocyte cytokine signaling pathway, glycosylphosphatidylinositol (GPI) anchor biosynthesis, phosphatidylinositol metabolism, and fatty acid metabolism were also enriched\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e, suggesting that the metabolic aspects of ccRCC might be a promising research direction.\u003c/p\u003e\u003cp\u003eOne of the prominent strengths of our research lies in the integration of scRNA-seq with validation in patient cohorts. 
We undertook further analyses of both internal and external datasets to demonstrate the robustness of the MAFG signature that we had identified\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e,\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e. scRNA-seq offers distinct advantages in detecting potential hub markers that might be concealed within bulk sequencing data. Additionally, we incorporated multi-omics and large sample analyses to comprehensively characterize the MAFGs implicated in ccRCC metastasis. However, several limitations still exist and warrant further improvement. Firstly, the majority of the cells or tumor tissues were sourced from American or European populations, and it remains uncertain whether the identified MAFGs are applicable to Asian populations. Hence, it is essential to validate our findings in cohorts from local hospitals. Secondly, although this signature and nomogram have been validated in a large sample of the ccRCC population, further experimental studies are indispensable to elucidate the specific mechanisms through which MAFGs drive tumor occurrence and development.\u003c/p\u003e\u003cp\u003eIn conclusion, to our knowledge, this study represents the first attempt to screen marker genes based on scRNA-seq, TCGA data, and a ferroptosis gene library, and these findings have been validated in a substantial number of ccRCC samples. We not only delineated the genomic characteristics and heterogeneity between primary RCC and metastatic RCC but also identified three MAFGs (DPP4, SLC1A5, AIFM2), thereby providing a reliable signature for prognosis prediction and novel avenues for subsequent research. A deeper molecular understanding of these phenomena holds the potential to lead to the development of innovative anticancer therapies.\u003c/p\u003e"},{"header":"4. 
Materials and methods","content":"\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e4.1. Collection of cell samples and ccRCC cohort\u003c/h2\u003e\u003cp\u003eWe retrieved raw single-cell transcriptome sequencing data for 118 cells from the GSE73121 dataset within the Gene Expression Omnibus (GEO) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.ncbi.nlm.nih.gov/geo/\u003c/span\u003e\u003cspan address=\"https://www.ncbi.nlm.nih.gov/geo/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). After removing low-quality cells through a stringent filtering process, we retained 116 ccRCC single-cell profiles. The transcriptome data were then integrated into a matrix, and the DESeq2 package was employed for normalization and the screening of differentially expressed genes. In addition, we obtained expression profiles of 48 ccRCC tissue samples accompanied by clinical information from the GSE22541 dataset in the GEO database. Moreover, we downloaded the expression profiles of 607 Kidney Renal Clear Cell Carcinoma (KIRC) samples from the TCGA database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://portal.gdc.cancer.gov/\u003c/span\u003e\u003cspan address=\"https://portal.gdc.cancer.gov/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e), among which 498 samples had metastasis information. We also downloaded the expression profiles of 91 ccRCC patients from the International Cancer Genome Consortium (ICGC) database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://icgc.org/\u003c/span\u003e\u003cspan address=\"https://icgc.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\u003ch2\u003e4.2. 
Processing of single-cell RNA-seq data\u003c/h2\u003e\u003cp\u003eWe used GRCh38 as the reference genome and extracted transcriptome sequencing data of 118 ccRCC single cells. We utilized the Seurat package to generate objects and filter out poor-quality cells\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e. We then performed standard data preprocessing, excluding genes detected in fewer than 3 cells and ignoring cells with fewer than 200 detected genes, while restricting the proportion of mitochondrial genes to less than 10%. Ultimately, we obtained 116 ccRCC single-cell transcriptome sequencing data. Furthermore, we identified genes with significant differences between different cell subpopulations and used these to distinguish different cells. Additionally, we performed dimensionality reduction analysis on the cells using t-distributed stochastic neighbor embedding (tSNE) and uniform manifold approximation and projection (UMAP) clustering methods\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. tSNE and UMAP algorithms have the advantage of visualizing high-dimensional data, and the latter can retain the characteristics of the original data to the greatest extent while significantly reducing the feature dimension\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e,\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003e4.3. 
Utilization of the weighted gene co-expression network analysis (WGCNA) method for further screening of MAFGs\u003c/h2\u003e\u003cp\u003eThe Weighted Gene Co-expression Network Analysis (WGCNA) method has the advantage of avoiding the removal of genes that exhibit little variation in expression yet are closely associated with certain traits\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e. Initially, we clustered the samples based on the similarity of gene expression among the 498 samples with metastasis information from TCGA-KIRC and excluded the outlier samples. For the remaining samples, we constructed a scale-free network and calculated an appropriate soft-threshold, which serves as the classification criterion for network construction. After selecting the soft-threshold, we adopted a one-step approach to construct the scale-free topology matrix. By cutting the gene dendrogram with a minimum module size of 10 and a merge cut height of 0.25, we obtained the module clustering tree. Subsequently, we linked the patients' clinical information with the module clustering tree of gene expression. Based on this correlation, we were able to identify which genes within specific modules were related to the patients' specific survival information. Moreover, we could calculate the gene significance (GS) value of specific genes associated with specific traits.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003e4.4. Identification of MAFGs in the ccRCC patient cohort\u003c/h2\u003e\u003cp\u003eConsidering that the differentially expressed genes related to metastasis have been detected through single-cell RNA sequencing (scRNA-seq) and that ferroptosis genes are closely associated with tumor metastasis, we obtained MAFGs by taking the intersection of these two gene sets. Subsequently, we further identified MAFGs in multiple ccRCC patient cohorts. 
Firstly, we randomly divided the TCGA-KIRC cohort with clinical information into training and testing groups at a 2:1 ratio. Then, we conducted univariate logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and multivariate logistic regression analyses on the TCGA training dataset using the glmnet package, aiming to identify the key ferroptosis genes related to metastasis among the marker genes identified through scRNA-seq. Moreover, the MAFG risk score was calculated as follows: MAFG risk score\u0026thinsp;=\u0026thinsp;Σ(βi \u0026times; Expi), where Expi is the expression level of gene i and βi is the coefficient derived from multivariate regression, indicating the weight of each included gene\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e. In the training dataset, we assessed the value of the MAFG risk score in predicting ccRCC metastasis by using ROC curves and calibration curves, and we evaluated differences in various survival outcomes through Kaplan-Meier analysis. Furthermore, we validated the predictive value of the MAFG risk score in external datasets (GSE22541 and the International Cancer Genome Consortium (ICGC) datasets).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e4.5. Construction of an accurate predictive model for monitoring ccRCC metastasis\u003c/h2\u003e\u003cp\u003eWe integrated the MAFG risk score with other clinical characteristics within the entire TCGA-KIRC cohort. Univariate and multivariate logistic regression methods were employed to evaluate the significant clinical variables. After excluding variables with missing data and non-significant variables, we established a comprehensive MAFG nomogram model using the Generalized Linear Model (GLM). 
The practical predictive significance of the MAFG nomogram score was appraised through the ROC plots and calibration curves in both the TCGA-KIRC and ICGC cohorts. In addition, Kaplan-Meier analysis was utilized to assess the survival differences between the high and low MAFG nomogram score groups.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\u003ch2\u003e4.6. Functional pathway analysis between two groups based on MAFG nomogram score\u003c/h2\u003e\u003cp\u003eThe TCGA-KIRC cohort was divided into two groups with high and low MAFG nomogram score levels. GSEA was further carried out using the nomogram score as a phenotype in the R software. The enriched signaling pathways with a False Discovery Rate (FDR)\u0026thinsp;\u0026lt;\u0026thinsp;0.05 were regarded as statistically significant.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\u003ch2\u003e4.7. Statistical analysis\u003c/h2\u003e\u003cp\u003eLASSO regression and logistic regression analyses were carried out by means of the glmnet package. Kaplan-Meier curves were constructed with the utilization of the survival package, and calibration curves were built using the riskRegression package. The GLM was established via the rms package. For continuous variables, comparison was performed using Student's t-test, whereas for categorical variables, the chi-square (χ2) test was employed. The Wilcoxon rank-sum test was utilized to compare ranked data between two groups, and the Kruskal-Wallis test was applied for comparisons among three or more groups. All statistical analyses were executed in R software (version 4.1.2), with the exception of Kaplan-Meier curves, which were plotted using GraphPad Prism 8 software. 
A P-value less than 0.05 was regarded as statistically significant.\u003c/p\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eWenxing Yue: Data curation, Formal analysis, Methodology, Writing-original draft. Meiyuan Huang, Qian Che, Xiyun Quan, Manling Tang \u0026amp; Taoli Wang: Formal analysis, Methodology. Siwei Zhang: Conceptualization, Funding acquisition, Project administration, Supervision, Writing-review \u0026amp; editing.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eTCGA and GEO are public databases. Ethical approval was obtained for the patients included in these databases. Users can download the data free of charge for research and publish related articles. Our study is based on open-source data, so there are no ethical issues or conflicts of interest. We acknowledge the TCGA and GEO databases for providing their platforms and the contributors for uploading their meaningful datasets.\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the Scientific Research Plan Project of Zhuzhou Central Hospital (202243).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDisclosure\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTCGA, ICGC and GEO are public databases. Ethical approval was obtained for the patients included in these databases. Users can download the data free of charge for research and publish related articles. Our study is based on open-source data, so there are no ethical issues or conflicts of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement: \u003c/strong\u003eTCGA, ICGC and GEO are public databases. 
The data used in our article are available from these databases.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical approval: \u003c/strong\u003eEthical approval was obtained for the patients from TCGA and GEO included in our article.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to Publish:\u003c/strong\u003e applicable. \u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to Participate:\u003c/strong\u003e applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eBarata, P. C. \u0026amp; Rini, B. I. Treatment of renal cell carcinoma: Current status and future directions. \u003cem\u003eCA Cancer J Clin\u003c/em\u003e \u003cstrong\u003e67\u003c/strong\u003e, 507-524 (2017).\u003c/li\u003e\n\u003cli\u003eCooley, L. S. et al. Experimental and computational modeling for signature and biomarker discovery of renal cell carcinoma progression. \u003cem\u003eMol Cancer\u003c/em\u003e \u003cstrong\u003e20\u003c/strong\u003e, 136 (2021).\u003c/li\u003e\n\u003cli\u003eCarril-Ajuria, L., Santos, M., Rold\u0026aacute;n-Romero, J. M., Rodriguez-Antona, C. \u0026amp; de Velasco, G. Prognostic and Predictive Value of PBRM1 in Clear Cell Renal Cell Carcinoma. \u003cem\u003eCancers (Basel)\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e (2019).\u003c/li\u003e\n\u003cli\u003eXue, D. et al. Circ-AKT3 inhibits clear cell renal cell carcinoma metastasis via altering miR-296-3p/E-cadherin signals. \u003cem\u003eMol Cancer\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 151 (2019).\u003c/li\u003e\n\u003cli\u003eStockwell, B. R. \u0026amp; Jiang, X. The Chemistry and Biology of Ferroptosis. 
\u003cem\u003eCell Chem Biol\u003c/em\u003e \u003cstrong\u003e27\u003c/strong\u003e, 365-375 (2020).\u003c/li\u003e\n\u003cli\u003eFriedmann Angeli, J. P., Krysko, D. V. \u0026amp; Conrad, M. Ferroptosis at the crossroads of cancer-acquired drug resistance and immune evasion. \u003cem\u003eNat Rev Cancer\u003c/em\u003e \u003cstrong\u003e19\u003c/strong\u003e, 405-414 (2019).\u003c/li\u003e\n\u003cli\u003eLei, G., Zhuang, L. \u0026amp; Gan, B. Targeting ferroptosis as a vulnerability in cancer. \u003cem\u003eNat Rev Cancer\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, 381-396 (2022).\u003c/li\u003e\n\u003cli\u003eZhang, J., Song, C., Tian, Y. \u0026amp; Yang, X. Single-Cell RNA Sequencing in Lung Cancer: Revealing Phenotype Shaping of Stromal Cells in the Microenvironment. \u003cem\u003eFront Immunol\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 802080 (2021).\u003c/li\u003e\n\u003cli\u003eZhang, H. et al. N6-Methyladenosine-Related lncRNAs as potential biomarkers for predicting prognoses and immune responses in patients with cervical cancer. \u003cem\u003eBMC Genom Data\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 8 (2022).\u003c/li\u003e\n\u003cli\u003eYao, X. et al. VHL Deficiency Drives Enhancer Activation of Oncogenes in Clear Cell Renal Cell Carcinoma. \u003cem\u003eCancer Discov\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 1284-1305 (2017).\u003c/li\u003e\n\u003cli\u003eTurajlic, S. et al. Deterministic Evolutionary Trajectories Influence Primary Tumor Growth: TRACERx Renal. \u003cem\u003eCell\u003c/em\u003e \u003cstrong\u003e173\u003c/strong\u003e, 595-610.e511 (2018).\u003c/li\u003e\n\u003cli\u003eChen, W. et al. Targeting renal cell carcinoma with a HIF-2 antagonist. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e539\u003c/strong\u003e, 112-117 (2016).\u003c/li\u003e\n\u003cli\u003eHarlander, S. et al. Combined mutation in Vhl, Trp53 and Rb1 causes clear cell renal cell carcinoma in mice. 
\u003cem\u003eNat Med\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 869-877 (2017).\u003c/li\u003e\n\u003cli\u003eSmith, C. C. et al. Endogenous retroviral signatures predict immunotherapy response in clear cell renal cell carcinoma. \u003cem\u003eJ Clin Invest\u003c/em\u003e \u003cstrong\u003e128\u003c/strong\u003e, 4804-4820 (2018).\u003c/li\u003e\n\u003cli\u003eJonasch, E., Walker, C. L. \u0026amp; Rathmell, W. K. Clear cell renal cell carcinoma ontogeny and mechanisms of lethality. \u003cem\u003eNat Rev Nephrol\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 245-261 (2021).\u003c/li\u003e\n\u003cli\u003eQiu, L. et al. Pro-Angiogenic and Pro-Inflammatory Regulation by lncRNA MCM3AP-AS1-Mediated Upregulation of DPP4 in Clear Cell Renal Cell Carcinoma. \u003cem\u003eFront Oncol\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 705 (2020).\u003c/li\u003e\n\u003cli\u003eKawakami, I. et al. Targeting of the glutamine transporter SLC1A5 induces cellular senescence in clear cell renal cell carcinoma. \u003cem\u003eBiochem Biophys Res Commun\u003c/em\u003e \u003cstrong\u003e611\u003c/strong\u003e, 99-106 (2022).\u003c/li\u003e\n\u003cli\u003eLiu, Y. et al. High expression of Solute Carrier Family 1, member 5 (SLC1A5) is associated with poor prognosis in clear-cell renal cell carcinoma. \u003cem\u003eSci Rep\u003c/em\u003e \u003cstrong\u003e5\u003c/strong\u003e, 16954 (2015).\u003c/li\u003e\n\u003cli\u003eGuo, S. et al. AIFM2 promotes hepatocellular carcinoma metastasis by enhancing mitochondrial biogenesis through activation of SIRT1/PGC-1\u0026alpha; signaling. \u003cem\u003eOncogenesis\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 46 (2023).\u003c/li\u003e\n\u003cli\u003eKraja, A. T. et al. Associations of Mitochondrial and Nuclear Mitochondrial Variants and Genes with Seven Metabolic Traits. \u003cem\u003eAm J Hum Genet\u003c/em\u003e \u003cstrong\u003e104\u003c/strong\u003e, 112-138 (2019).\u003c/li\u003e\n\u003cli\u003eTriska, P. 
et al. Landscape of Germline and Somatic Mitochondrial DNA Mutations in Pediatric Malignancies. \u003cem\u003eCancer Res\u003c/em\u003e \u003cstrong\u003e79\u003c/strong\u003e, 1318-1330 (2019).\u003c/li\u003e\n\u003cli\u003eLi, S. et al. Mitochondrial Dysfunctions Contribute to Hypertrophic Cardiomyopathy in Patient iPSC-Derived Cardiomyocytes with MT-RNR2 Mutation. \u003cem\u003eStem Cell Reports\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 808-821 (2018).\u003c/li\u003e\n\u003cli\u003eLaGory, E. L. et al. Suppression of PGC-1\u0026alpha; Is Critical for Reprogramming Oxidative Metabolism in Renal Cell Carcinoma. \u003cem\u003eCell Rep\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 116-127 (2015).\u003c/li\u003e\n\u003cli\u003eKim, H. et al. Unsaturated Fatty Acids Stimulate Tumor Growth through Stabilization of \u0026beta;-Catenin. \u003cem\u003eCell Rep\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 495-503 (2015).\u003c/li\u003e\n\u003cli\u003eTang, X. et al. Cystine Deprivation Triggers Programmed Necrosis in VHL-Deficient Renal Cell Carcinomas. \u003cem\u003eCancer Res\u003c/em\u003e \u003cstrong\u003e76\u003c/strong\u003e, 1892-1903 (2016).\u003c/li\u003e\n\u003cli\u003eTan, S. K., Hougen, H. Y., Merchan, J. R., Gonzalgo, M. L. \u0026amp; Welford, S. M. Fatty acid metabolism reprogramming in ccRCC: mechanisms and potential targets. \u003cem\u003eNat Rev Urol\u003c/em\u003e \u003cstrong\u003e20\u003c/strong\u003e, 48-60 (2023).\u003c/li\u003e\n\u003cli\u003eYuan, L. et al. Co-expression network analysis identified six hub genes in association with progression and prognosis in human clear cell renal cell carcinoma (ccRCC). \u003cem\u003eGenom Data\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 132-140 (2017).\u003c/li\u003e\n\u003cli\u003eZeng, J. H. et al. Prognosis of clear cell renal cell carcinoma (ccRCC) based on a six-lncRNA-based risk score: an investigation based on RNA-sequencing data. 
\u003cem\u003eJ Transl Med\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 281 (2019).\u003c/li\u003e\n\u003cli\u003eButler, A., Hoffman, P., Smibert, P., Papalexi, E. \u0026amp; Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. \u003cem\u003eNat Biotechnol\u003c/em\u003e \u003cstrong\u003e36\u003c/strong\u003e, 411-420 (2018).\u003c/li\u003e\n\u003cli\u003ePont, F., Tosolini, M. \u0026amp; Fourni\u0026eacute;, J. J. Single-Cell Signature Explorer for comprehensive visualization of single cell signatures across scRNA-seq datasets. \u003cem\u003eNucleic Acids Res\u003c/em\u003e \u003cstrong\u003e47\u003c/strong\u003e, e133 (2019).\u003c/li\u003e\n\u003cli\u003eKobak, D. \u0026amp; Berens, P. The art of using t-SNE for single-cell transcriptomics. \u003cem\u003eNat Commun\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 5416 (2019).\u003c/li\u003e\n\u003cli\u003eBecht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. \u003cem\u003eNat Biotechnol\u003c/em\u003e (2018).\u003c/li\u003e\n\u003cli\u003eLangfelder, P. \u0026amp; Horvath, S. WGCNA: an R package for weighted correlation network analysis. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, 559 (2008). 
\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"ccRCC, single-cell RNA-seq, metastasis, ferroptosis, LASSO cox regression","lastPublishedDoi":"10.21203/rs.3.rs-7083430/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7083430/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"metastasis of clear cell renal cell carcinoma (ccRCC) negatively affects patient survival. Meanwhile, ferroptosis genes have a certain relationship with cancer metastasis. Here, we conducted a screening of metastasis genes using single-cell sequencing of clear cell renal cell carcinoma GSE73121, and intersected it with the ferroptosis gene database to obtain 13 metastasis-related ferroptosis genes. Next, we performed gene set enrichment analysis (WGCNA) on the patient data of ccRCC in TCGA, resulting in 9 metastasis-related ferroptosis genes. Furthermore, we conducted univariate logistic analysis, lasso analysis, and multivariate logistic analysis on these 9 genes, ultimately identifying 3 key metastasis-related ferroptosis genes (MAFGs). 
A risk score (RS) for predicting metastasis was constructed based on three MAFGs (DPP4, SLC1A5, and AIFM2). The results showed good outcomes in ROC, calibration curves, and goodness-of-fit tests. Additionally, the risk score (RS) was well validated in clear cell renal cell carcinoma data from GEO22541 and ICGC. It also demonstrated good predictive effects on various survival times of patients. By combining clinical information of the patients, we constructed a nomogram score that includes RS. The nomogram score better predicts patient prognosis (AUC of 0.858). A higher MAFGs nomogram score is associated with fatty acid metabolism, MTOR signaling pathway, and P53 signaling pathway in ccRCC. In summary, we constructed a robust MAFGs using various sequencing data and validated the model in multiple patient cohort databases, which is of significant value for prognostic stratification and screening treatment of metastatic ccRCC.","manuscriptTitle":"Ferroptosis-related gene signature predicts clinical metastasis of clear cell renal cell carcinoma based on single-cell RNA-seq","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-29 14:51:50","doi":"10.21203/rs.3.rs-7083430/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"9fc3278d-6d8b-4b3f-8189-4e68b4dde65b","owner":[],"postedDate":"October 29th, 
2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-01-07T09:40:10+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-29 14:51:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7083430","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7083430","identity":"rs-7083430","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
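
As an illustration of the Methods above, the MAFG risk score (the sum of each gene's regression coefficient multiplied by its expression level) and the 2:1 random split into training and testing groups could be sketched in Python as follows. This is a minimal sketch, not the authors' code: the analyses in the paper were run in R, the coefficient values below are hypothetical placeholders rather than the fitted values, and the function names and the fixed random seed are assumptions made for illustration.

```python
import random

# HYPOTHETICAL coefficients for illustration only. The fitted multivariate
# logistic regression coefficients from the paper are not reproduced here.
EXAMPLE_COEFFICIENTS = {"DPP4": -0.5, "SLC1A5": 0.8, "AIFM2": 0.3}


def risk_score(expression, coefficients):
    """Return sum(beta_i * Exp_i) over the genes listed in `coefficients`."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def split_two_to_one(sample_ids, seed=0):
    """Randomly split sample IDs into training and testing sets at about 2:1.

    The seed is an assumption made so that the split is reproducible.
    """
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Made-up expression values for one sample.
    sample = {"DPP4": 2.0, "SLC1A5": 1.0, "AIFM2": 4.0}
    print(risk_score(sample, EXAMPLE_COEFFICIENTS))

    train_ids, test_ids = split_two_to_one(range(9))
    print(len(train_ids), len(test_ids))
```

Keeping the coefficients in a dictionary keyed by gene symbol makes the weighting explicit and raises a `KeyError` if a sample lacks one of the signature genes, rather than silently scoring it on fewer genes.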
