Crosstalk between mitochondrial and lysosomal co-regulators defines clinical outcomes of breast cancer by integrating multi-omics and machine learning

Research Square | Research Article

Huilin Chen, Zhenghui Wang, Jiale Shi, Jinghui Peng

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4176718/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted. Version 1 posted.

Abstract

Background
The impact of mitochondrial and lysosomal co-dysfunction on breast cancer (BC) patient outcomes is unclear. The objective of this study is to develop a predictive machine learning (ML) model utilizing mitochondrial and lysosomal co-regulators in order to enhance the prognosis for individuals with BC.

Methods
Differences and correlations of mitochondrial and lysosome-related genes were screened and validated.
WGCNA and univariate Cox regression were employed to identify prognostic mitochondrial and lysosomal co-regulators. ML was then used to select from these regulators the mitochondrial and lysosome-related model signature genes (mlMSGs) and to construct models. The association between immune features and the mlMSGs score was investigated using scRNA-seq. Finally, the expression and function of the key gene SHMT2 were confirmed through in vitro experiments.

Results
According to the C-index, the CoxBoost + survival-SVM model was identified as the most suitable for predicting outcomes in BC patients. Patients were then stratified into high- and low-risk groups based on the model, which demonstrated strong prognostic accuracy. Overall immune cell infiltration was decreased in the high-risk group, and B-cell mlMSGs activity in particular was diminished in high-risk patients. Additionally, SHMT2 promoted the proliferation, migration, and invasion of BC cells.

Conclusion
This study shows that the ML model accurately predicts the prognosis of BC patients. Analysis based on the model identified decreased B-cell immune infiltration and reduced mlMSGs activity as significant factors influencing patient prognosis. These results may offer novel approaches for early intervention and prognostic forecasting in BC.

Keywords: breast cancer; mitochondrial and lysosomal dysfunction; machine learning; sc-RNA and immunotherapy

Introduction
Despite recent significant advances in breast cancer therapy, numerous challenges remain regarding its therapeutic efficacy. Inadequate treatment approaches contribute to poor prognoses, while excessive interventions may compromise quality of life and cause psychological distress (1).
Given the intricate nature of breast cancer, characterized by complex genetic, epigenetic, and morphological variations within and between tumors (2), precision medicine has gained significant prominence in its treatment. The identification of biomarkers is pivotal for the practical implementation of precision medicine in clinical settings (3). Biomarkers should demonstrate consistent expression patterns within individual tumor tissues and across different tumor tissues. Consequently, a multigene signature may be an effective approach to address inherent heterogeneity. Progress in bioinformatics has facilitated the development of numerous prognostic models (4, 5). However, the clinical application of these prognostic models is limited by insufficient data utilization, inappropriate machine-learning techniques, and the absence of high-quality and rigorously validated cohorts (6, 7). Consequently, there is an urgent need for enhanced prognostic models that can accurately predict survival outcomes and identify patients with poor prognoses, thereby enabling personalized treatment interventions.

Mitochondria and lysosomes are dynamic organelles crucial for numerous essential cellular processes. Research has shown that mitochondrial dysfunction in tumor cells increases glycolysis, reduces oxidative phosphorylation (OXPHOS) and apoptosis, and enhances sensitivity to radiation (8). Furthermore, the resistance of triple-negative breast cancer (TNBC) to CDK4/6 inhibition can elevate lysosome biomass (9). Additionally, cancer patients with elevated levels of cathepsin D (CTSD), a lysosomal protease, have a poorer prognosis (10). A growing body of evidence highlights a strong correlation between mitochondria and lysosomes. Various lysosomal degradation deficiencies are potential causes of mitochondrial protein accumulation within the lysosomes (11).
Additionally, the interaction between mitochondria and lysosomes regulates the dynamics of mitochondrial calcium ions (12). Overall, lysosomes and mitochondria are extensively involved in cancer biogenesis and development. However, few studies have examined whether breast cancer prognosis can be predicted from mitochondria and lysosomes. Prognostic stratification based on mitochondrial and lysosome-related model signature genes could therefore serve as a valuable tool in guiding the clinical management of patients with breast cancer, ultimately enhancing treatment outcomes for those at high risk. This study formulated a risk stratification using mlMSGs and validated it across four distinct public datasets encompassing 4,897 patients with breast cancer. It aimed to assess prognosis, investigate immune correlations, and predict drug susceptibility, and may assist in optimizing precision treatment and improving the prognosis of patients with breast cancer.

Results

Exploration of novel interconnectivity between mitochondria and lysosome-related genes (MLRGs)

Figure 1 depicts the methodology employed in this study. We used the TCGA database to analyze the differential expression of mitochondrial and lysosome-related genes in BRCA samples and adjacent normal tissues. The limma R package was used to identify differentially expressed genes (DEGs) with logFC > 1 and p-value < 0.05. Specifically, we identified MLRGs, including 76 mitochondria-related and 91 lysosome-related differential genes. Volcano plots highlight the top 20 differentially expressed genes (Figs. 2A-B). Subsequently, we identified 94 MLRGs with copy number variation (CNV). The incidence of CNV gains was higher in PIGR, CD34, RAB7B, FMOD, PRELP, COA6, FCER1A, FLAD1, and S100A7 (Fig. 2C), and CNVs were primarily concentrated on chromosomes 1, 8, and 16 (Fig. 2D).
We conducted differential expression analysis of the 94 CNV MLRGs in BRCA samples and adjacent normal tissues to further investigate the relationship between CNV and mRNA expression. COA6, FLAD1, TDRKH, TMEM79, LAMTOR2, and COX6C exhibited both CNV gain and higher mRNA expression levels in the BRCA samples (Fig. 2E). In the single nucleotide polymorphism (SNP) analysis, 203 samples with MLRG mutations were identified among 968 breast cancer samples. Among the mutated genes, APOB (4%), LRP1 (3%), AHNAK (3%), and ANK2 (3%) exhibited the highest mutation frequencies (Fig. 2F). Notably, significant co-occurrence was observed among genes with a higher mutation frequency, specifically APOB, ANK2, and LRP1 (Fig. 2G). Furthermore, correlation analysis showed that MLRGs were strongly correlated with one another, and protein-protein interaction network analysis revealed significant interactions between MLRGs (Figs. 2H-I).

Identification of co-regulators of mitochondria and lysosomes

WGCNA was used to identify co-regulated gene sets associated with mitochondria and lysosomes. Initially, we assessed the mitochondrial and lysosomal activity in the samples using ssGSEA (Additional Figs. 1B-C). With the soft-threshold power set to 3, the network adhered more closely to a power-law (scale-free) distribution and the mean connectivity stabilized, supporting the suitability of the data for subsequent analysis (Fig. 3A). Module merging yielded a total of 22 modules, with the minimum module size set to 50, deepSplit set to 2, and a similarity threshold of less than 0.25 (Fig. 3B). We identified significant (p < 0.05) genes in the mitochondria and lysosome modules and analyzed their relationship with prognosis in multiple datasets using univariate Cox regression. A total of 43 genes were significantly correlated with prognosis in the TCGA, GSE20685, and GSE42568 datasets (Figs.
3C–F, Additional Fig. 1D). Additionally, most of the 43 genes were linked to prognosis in GSE96058 by univariate Cox analysis (Fig. 3G). CELSR2, SIAH2, BTG2, LEF1, RBBP8, and AGBL2 were identified as common protective factors across the four datasets, whereas SLC38A7, SHMT2, and CISD1 were recognized as common detrimental factors among the 10 genes with the highest hazard ratios in the four datasets (Fig. 3G).

Construction of prognostic models of mlMSGs

The prognostic model was constructed from the set of 43 genes, with the TCGA dataset serving as the training set and the GEO datasets as the test sets. Based on the TCGA training set, models were developed using 99 algorithm combinations, and the predictive power of each model was assessed by calculating the C-index across all cohorts. Among the 99 models, the CoxBoost + survival-SVM combination had the highest average C-index and was chosen as the final model (Fig. 4A). Specifically, 30 mlMSGs were screened from the 43 genes using the CoxBoost algorithm (Additional Table S4), and the survival-SVM algorithm was then employed to develop the final prognostic model using these 30 model genes. Risk scores for each sample across all cohorts revealed that patients with high risk scores experienced unfavorable clinical outcomes in the TCGA, GEO, and meta-cohorts (Fig. 4B). To assess the predictive accuracy of the model, ROC analysis was performed at 1, 3, and 5 years. The corresponding AUC values were 0.738, 0.746, and 0.738 for the TCGA-BRCA dataset; 0.781, 0.806, and 0.754 for GSE20685; 0.653, 0.696, and 0.783 for GSE42568; 0.715, 0.696, and 0.647 for GSE96058; and 0.724, 0.716, and 0.68 for the meta-dataset (Fig. 4C). These results underscored the prognostic significance of the mlMSGs model. We also calculated the C-index of different clinical features related to prognosis in various datasets.
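As a rough illustration of the C-index used above to rank models and clinical features, the following Python sketch computes Harrell's concordance index for right-censored survival data. The function name, the toy data, and the convention that a higher risk score implies shorter survival are assumptions made for illustration; this is not the authors' R pipeline.

```python
def harrell_c_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : model risk score; higher is assumed to mean worse prognosis
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair (i, j) is comparable when i had the event strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk, earlier event: concordant
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, one censored.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
print(harrell_c_index(toy_times, toy_events, toy_risks))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the model with the highest average C-index across cohorts was preferred.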
We found that the constructed prognostic model had a better C-index than the other clinical features (Fig. 4D). Additionally, an extensive review of the pertinent literature from the past five years was conducted. Subsequently, 18 published gene signatures associated with diverse biological processes, including exosome (13, 14), TP53 mutation (15), necroptosis (16), depression (17), pyroptosis (18), autophagy (7, 19), immune (20–22), angiogenesis (23), cuproptosis (18, 20), tumor microenvironment (TME) (24), methylation (25), natural killer cell (26), and lipid metabolism (27), were incorporated for comparative analysis. The mlMSG prognostic model exhibited a superior C-index compared to nearly all of these models in the TCGA and GEO datasets (Fig. 4E). Our findings indicate that the risk scores of the mlMSG prognostic model outperformed all clinical features in terms of the C-index.

Heterogeneity of mlMSGs activity in different cell types

Initially, we categorized the samples in the GSE161529 dataset as normal (BM) or tumor (GM). Then, we grouped the cells into epithelial/cancer, immune, fibroblast, and endothelial categories (Fig. 5A). We assessed mlMSGs activity levels in different cells within the BM and GM groups (Fig. 5B). The analysis revealed increased mlMSGs activity in all four cell types in the GM group (Fig. 5C). Additionally, we subdivided immune cells into T cells, B cells, MNPS, NK cells, Treg cells, and plasma cells to examine the distribution of mlMSGs activity (Figs. 5D–E). We observed a notable increase in the mlMSGs activity of various immune cell types within the GM group, with NK cells exhibiting the most significant enhancement (Fig. 5F). These results suggested that hyperactivity of mlMSGs might promote the development of breast cancer. To investigate the association between risk scores and mlMSGs activity, we computed the risk scores for the samples in the GSE176078 dataset.
We stratified them into high- and low-risk groups based on the median value. Subsequently, the mlMSGs activity levels of various cell types, including epithelial/cancer, immune, fibroblast, and endothelial cells, as well as T cells, B cells, MNPS, NK cells, and plasma cells, were determined within the high- and low-risk groups using the same methodology as above (Figs. 5G, J, M). The mlMSGs activity was positively correlated with risk score across various cell types, including epithelial/cancer, immune, fibroblast, and endothelial cells, as well as T cells, MNPS, NK cells, and plasma cells (Figs. 5I, L, and O), while exhibiting a negative correlation in B cells (Fig. 5O). These findings suggest that suppressed B-cell mlMSGs activity diminishes the efficacy of antitumor treatments in high-risk patients, potentially contributing to the unfavorable prognosis observed in this patient population.

Evaluation of mutation status in high- and low-risk groups

To comprehensively examine the correlation between risk scores and BRCA mutations, we analyzed somatic mutations in the TCGA cohort and compared the high- and low-risk groups. The frequently mutated genes are shown in Figs. 6A-B; the high-risk group displayed a higher frequency of mutations than the low-risk group. OGT, EPHA3, CELSR2, and SHMT2 had higher mutation frequencies in the low-risk group (Additional Fig. 4A), while CELSR2, C11orf24, and NDRG1 had higher mutation frequencies in the high-risk group (Additional Fig. 4B). CELSR2, OGT, EPHA3, RBBP3, and SHMT2 showed higher mutation frequencies in the TCGA cohort samples (Additional Fig. 4C), with widespread co-occurrence of mlMSGs mutations in the TCGA cohort (Additional Fig. 4D). Additionally, we observed significant co-occurrence in both the high- and low-risk groups, with co-occurrence being more pronounced in the low-risk group than in the high-risk group (Figs. 6C-D).
When the mutation frequencies of genes with a minimum of five mutations were compared between the high- and low-risk groups, the high-risk group had a higher frequency of gene mutations than the low-risk group (Fig. 6E). Tumor mutational burden (TMB) was significantly higher in the high-risk group than in the low-risk group (Additional Fig. 4E). Survival analysis suggested that the prognosis of patients was better in the low-risk group than in the high-risk group (Fig. 6F).

Immune landscape related to risk score

The mlMSGs were commonly linked to immune cells and were most closely associated with mast cells (Additional Figs. 4F and H). A comprehensive investigation of the TME using the Immuno-Oncology Biology Research (IOBR) R package and the ImmuCellAI algorithm revealed that patients in the low-risk group exhibited markedly higher levels of immune cell infiltration than those in the high-risk group, signifying immune activation (Figs. 7A-B). Moreover, a comparative analysis of immune function between the high- and low-risk groups demonstrated that most immune functions, such as immune checkpoints (ICB) and HLA family genes, were higher in the low-risk group than in the high-risk group, providing additional evidence of the hyperimmune state within the low-risk group (Figs. 7C-E). According to the CIBERSORT analysis, the high-risk group had increased infiltration of M0 and M2 macrophages and T follicular helper cells, while infiltration of NK cells, B cells, and CD8+ T cells was reduced (Fig. 7F). These findings suggest that the low-risk group might display a favorable prognosis, in line with the survival analysis outcomes above. Subsequently, the tracking tumor immunophenotype (TIP) was calculated to investigate the potential biological mechanisms linked to the mlMSGs.
As hypothesized, the low-risk group exhibited a predominance of step 4 (recruitment of tumor immune infiltrating cells) and step 5 (immune cell infiltration), in line with our earlier findings (Fig. 7G). The low-risk group exhibited significantly higher stromal, immune, and ESTIMATE scores (p < 0.001), suggesting a heightened level of overall immunity and immunogenicity within the TME of this group (Fig. 7H).

Role of the risk score in BRCA treatment

To evaluate the value of the risk scores for predicting chemotherapy response, we used the oncoPredict package to calculate the IC50 values of various drugs. The low-risk cohort was more susceptible to drugs than the high-risk cohort (Fig. 8A). Individuals with a low risk score demonstrated heightened sensitivity to dactinomycin, whereas the high-risk group displayed increased sensitivity to bortezomib, docetaxel, afatinib, and erlotinib (Figs. 8A and 8G). These findings imply that the risk score has potential as a biomarker for predicting drug sensitivity. To comprehensively evaluate the predictive capacity of risk scores for immunotherapy sensitivity, we examined the influence of risk scores on immunotherapy efficacy using TIDE (Figs. 8B–D) and ImmuCellAI (Fig. 8E). Our analysis revealed that the high-risk group was more sensitive to immunotherapy than the low-risk group. Differential analysis of the IPS response suggested that patients in the low-risk group were more sensitive to immune checkpoint inhibitors (ICIs) (Fig. 8F).

Comprehensive analysis of SHMT2 in breast cancer samples

A comparative analysis of model genes in the TCGA database revealed significant differences in expression levels between normal and tumor tissues, with SHMT2 exhibiting notably higher expression in tumor tissues (Fig. 9A). Subsequent cell experiments confirmed the upregulation of SHMT2 in tumor cells (Fig. 9B).
The pan-cancer analysis also demonstrated a consistent elevation of SHMT2 in various tumors, including breast cancer, indicating a potentially pivotal role of SHMT2 in tumorigenesis and progression (Fig. 9C). The pan-cancer analysis of SHMT2 also revealed significant correlations between SHMT2 and immune infiltration (Fig. 9D), immune checkpoints (Additional Fig. 5D), HRD scores (Additional Fig. 5A), stemness scores (Additional Fig. 5B), immunoinflammatory pathways (Additional Fig. 5C), and TMB (Additional Fig. 4G) in BRCA. We further examined the distinction between normal and tumor cells within the scRNA-seq dataset GSE161529, which revealed that the model genes differed significantly at the single-cell level (Fig. 9E). SHMT2 was prominently upregulated in tumor cells compared to normal cells across epithelial/cancer, immune, fibroblast, and endothelial cell types (Fig. 9G). Furthermore, analysis of the relationship between SHMT2 expression and prognosis demonstrated that lower SHMT2 expression was correlated with a more favorable prognosis (Fig. 9F).

Roles of SHMT2 in breast cancer cells

We first knocked down SHMT2 to construct cell lines with low SHMT2 expression. The CCK8 assay then showed that the proliferation of SHMT2-knockdown MCF-7 and HCC-1806 cells was significantly reduced compared with control cells (Fig. 10A-B). In colony formation assays, the colony area of the two cell lines was significantly reduced after SHMT2 knockdown (Fig. 10C-D). Wound healing and transwell assays further showed that knocking down SHMT2 inhibited the migration and invasion ability of MCF-7 and HCC-1806 breast cancer cells (Fig. 10E-F).

Methods

Transcriptome data acquisition and processing

The transcriptome data, mutation data, and clinical data of BRCA were acquired from The Cancer Genome Atlas (TCGA) database, comprising a total of 1168 samples.
Among these, 111 samples were classified as normal, while 1057 samples were categorized as tumor samples. Additionally, the expression profiles GSE42568, GSE20685, and GSE96058 were downloaded from the Gene Expression Omnibus (GEO) database. Additional Table S1 details the characteristics of samples derived from the different cohorts (TCGA, GSE20685, GSE42568, GSE96058). To account for discrepancies between the TCGA and GEO expression profiles, the "sva" package was employed for batch adjustment.

ScRNA-seq data acquisition and processing

The scRNA-seq data were obtained from the GSE161529 and GSE176078 datasets. Quality control of the scRNA-seq data was conducted using the "Seurat" and "SingleR" R packages. Cells in which mitochondrial and ribosomal genes each accounted for less than 15% of expression were retained, provided that the number of detected genes ranged from 200 to 10,000; genes were retained if they were expressed in at least three cells. The remaining cells were normalized using a linear regression model with the "Log-normalization" technique. Additionally, the "FindVariableFeatures" function was employed to identify 2000 hypervariable genes. The data were scaled using the "ScaleData" function, and the top 15 principal components (PCs) were then used for t-distributed stochastic neighbor embedding (t-SNE) analysis to identify significant clusters.

Acquisition of the mitochondria and lysosome related genes

Mitochondria-related genes were derived from the intersection of the MitoProteome and MitoCarta3.0 databases (Additional Fig. 1A), while lysosome-related genes were retrieved from the Gene Ontology (GO) database, resulting in a total of 872 lysosome-related genes. These genes are listed in Additional Table S2.

AUCell

The "AUCell" R package was utilized to compute activity scores for mitochondria and lysosomes in each cell lineage.
The gene expression of each cell was ranked, and the area under the curve (AUC) value of the mitochondria and lysosome-related genes was computed from this ranking, enabling estimation of the proportion of the gene set that was highly expressed. Subsequently, the cells were stratified into high- and low-AUC groups using the median score. The "FindAllMarkers" function was employed to analyze the differences between the high and low groups.

Weighted co-expression network (WGCNA) analysis

The "WGCNA" R package was employed to examine the interconnectivity among distinct gene sets and the correlation between the phenotype and the various gene sets. The gene co-expression network was constructed using weighted expression correlation. Hierarchical cluster analysis was then conducted based on the weighted correlation, resulting in the identification of diverse gene modules. Next, the phenotypic traits under investigation were introduced into the weighted analysis, in which the correlation and reliability of all genes within each gene module with the phenotypic traits were computed. The core module, deemed the most pertinent and significant, was subsequently identified. Genes from the blue and green modules were then designated as hub mitochondria/lysosome regulators.

Establishment of prognostic signature derived from machine learning

To assess the association between MLRGs and prognosis, we utilized the TCGA cohort as the training set and the GEO cohorts as the testing sets. We employed ten machine learning algorithms, namely CoxBoost, stepwise Cox, Lasso, Ridge, elastic net (Enet), survival support vector machines (survival-SVMs), generalized boosted regression models (GBMs), supervised principal components (SuperPC), partial least squares regression for Cox (plsRcox), and random survival forest (RSF), to construct a prognostic model that is both accurate and comprehensive.
The construction of the prognostic model proceeded as follows. First, prognostic mlMSGs were chosen through univariate Cox analysis in both the TCGA and GEO cohorts. Subsequently, 101 algorithms, resulting from pairwise combinations of the 10 machine learning algorithms, were employed to develop the most precise and comprehensive model with the highest C-index performance. Lastly, the C-index of each validation cohort was computed, and the optimal model was determined based on the highest average C-index value.

Superiority and validity evaluation of the prognostic model

ROC curves were created to evaluate the prediction accuracy of the model at 1, 3, and 5 years. The risk scores of the samples in the training and validation sets were computed, and the samples were then categorized into high- and low-risk groups based on their risk scores. The Kaplan-Meier method was employed to generate survival curves, and statistical significance was determined through log-rank tests. Furthermore, a total of 18 published prognostic models for breast cancer were collected, and the C-index of each model within each cohort was calculated for comparison with the predictive capability of the signature. To evaluate the clinical utility of the model, we analyzed the clinical data of samples in the TCGA and GEO cohorts, including Age, ER, PR, Grade, HER2, Stage, N, and Lymph node status, and compared the prognostic predictive capabilities of these clinical variables with those of the risk scores, as measured by the C-index.

Mutational landscape analysis

Mutation Annotation Format (MAF) data were processed using the "maftools" R package to illustrate the mutation landscape of the high- and low-risk groups. Subsequently, we examined the associations between gene mutations through co-occurrence analysis.
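The risk stratification and Kaplan-Meier estimation described above under "Superiority and validity evaluation of the prognostic model" can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function names and toy data are hypothetical, the median cutoff follows the stratification reported in the Results, and assigning scores equal to the median to the low-risk group is an assumption.

```python
from statistics import median


def split_by_median(scores):
    """Label each sample 'high' or 'low' risk relative to the median score.

    Assumption: scores exactly equal to the median are labeled 'low'.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical toy data for illustration only.
labels = split_by_median([0.2, 0.9, 0.5, 0.7])
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(labels)
print(curve)
```

In practice one curve would be estimated per risk group and the two compared with a log-rank test, which this sketch omits.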
Analysis of immune-omics molecular characterization

The infiltration of various immune cells was compared between the high- and low-risk groups using the IOBR package; in particular, the distribution of M0 and M2 macrophages between these groups was examined. ImmuCellAI was also used to compare immune infiltration between the high- and low-risk groups. The Wilcoxon test was used to compare immune function between the high- and low-risk groups, as well as the differential expression of ICDs and HLA family genes. Additionally, the anti-cancer immune status of tumor tissue and normal tissue was inferred by analyzing the seven steps of the Cancer-Immunity Cycle: the release of cancer cell antigens (Step 1), the presentation of cancer antigens (Step 2), the priming and activation of immune responses (Step 3), the migration of immune cells to tumor sites (Step 4), the infiltration of immune cells into tumors (Step 5), the recognition of cancer cells by T cells (Step 6), and the subsequent elimination of cancer cells (Step 7). The relative abundance of stromal cells, immune cells, and tumor cells was assessed in the high- and low-risk groups using the "estimate" R package. The immune checkpoint status and immune function of the two groups were analyzed and visualized using the "ggplot2" R package.

Drug sensitivity analysis

The R package "oncoPredict" was utilized to predict drug sensitivity from gene expression levels, including the calculation of half-maximal inhibitory concentrations (IC50) for chemotherapeutic drugs. TIDE and ImmuCellAI analyses were performed to assess the immunotherapy sensitivity of the high- and low-risk groups.
Cell culture

HBL-100 human normal breast epithelial cells and the MDA-MB-231, HCC1806, MCF-7, and BT-474 human breast cancer cell lines were provided by the Cell Resource Center of Shanghai Life Sciences Institute. The cell lines were cultured in DMEM or RPMI-1640 (Gibco BRL, USA) supplemented with 10% fetal bovine serum (FBS) (Gibco BRL, USA) and grown at 37°C with 5% CO2.

Quantitative real-time PCR (qRT-PCR)

Total RNA was isolated from breast cancer cells using TRIzol reagent (Invitrogen, 15596018) following the manufacturer's instructions. The concentration and quality of the total RNA were assessed using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Subsequently, the total RNA was reverse-transcribed using a PrimeScript RT reagent kit (Takara Bio Inc., Japan). The resulting cDNA was combined with SYBR Green qPCR Master Mix (Takara Bio), primers, and diethyl pyrocarbonate (DEPC)-treated water (Beyotime Institute of Biotechnology, China) to achieve a final volume of 10 µL. The experiment was conducted in 96-well plates using the LightCycler 480® Real-Time PCR System (Roche Diagnostics, Switzerland). Each reaction was replicated at least three times. The mRNA expression levels were normalized to those of glyceraldehyde 3-phosphate dehydrogenase (GAPDH). Nonspecific amplification was monitored through the analysis of melting curves. The primer sequences are listed in Additional Table S3.

Small-interfering RNA (siRNA) transfection

A non-specific scramble siRNA was used as the control, and the experimental siRNA (RiboBio, Guangzhou, China) was transfected into breast cancer cells at 80% confluency using Lipofectamine 3000 (Invitrogen, CA, USA) as per the manufacturer's instructions. Following transfection, the cells were incubated for 48 hours in a culture incubator.
Colony formation

A total of 1×10³ transfected cells were cultured in 6-well plates for two weeks. After colonies had formed, the cells were rinsed and fixed with 4% paraformaldehyde (PFA) solution for 15 minutes, and then stained with crystal violet (Solarbio, China) for 20 minutes.

Pan-cancer analysis

The online bioinformatics analysis tool sxdyc.com was utilized to investigate the differential expression of SHMT2 in normal and cancerous tissues, as well as its associations with stemness score, immune infiltration, immune inflammation, ICB, and TMB.

CCK8 assay

Cells were plated in 96-well plates at a density of 2×10³ and allowed to incubate overnight. The cells were then incubated for varying durations (1, 2, 3, 4, 5, or 6 days) at 37°C and 5% CO2. Following this incubation period, 10 µL of the CCK8 labeling reagent (0.5 mg/mL, Dojindo, Japan) was added to each well, and the cells were incubated for an additional 2 h at 37°C and 5% CO2. The absorbance at 450 nm was then measured using a microplate reader (Thermo Scientific, Shanghai, China).

Migration and invasion experiments

Chambers with a pore diameter of 8 µm (CORNING) were employed in the cell migration and invasion assays, either without or with Matrigel (CORNING), respectively. Approximately 5×10⁴ MCF-7 cells were seeded into the upper chamber and incubated in serum-free media containing varying peptide concentrations or an Akt inhibitor. DMEM with 10% fetal bovine serum was added to the lower chamber. Following a 48-hour incubation period at 37°C with 95% air and 5% CO2, the membranes were fixed with 4% paraformaldehyde solution for 20 minutes and then stained with crystal violet solution for 15 minutes. The cells remaining on the upper surface of the membranes were gently wiped off using a cotton swab.
The average numbers of migrated and invaded cells were counted in five randomly selected fields under a light microscope at 100× magnification.

Statistical analysis
R (4.3.1) and Strawberry Perl (v5.32.1) were used to generate all the results and figures. The specific research methods and R packages used are described above. Data from three independent experiments are presented as mean ± standard deviation (SD). Student's t-tests were used for comparisons between the high and low groups (*P < 0.05, **P < 0.01, ***P < 0.001).

Discussion
The American Joint Committee on Cancer (AJCC) staging systems are commonly employed in clinical practice to manage various aspects of BRCA treatment, including decision-making and monitoring strategies (28). However, this approach may be insufficient and may lead to overtreatment or undertreatment because of the inherent heterogeneity of BRCA (29, 30). At the same time, existing methods for predicting prognosis and drug sensitivity in patients with breast cancer are inadequate (31). Progress in molecular biology and immunology has expanded the range of treatment options for breast cancer, necessitating improved personalized evaluation techniques to guide clinical decision-making. For example, a 21-gene test in premenopausal women enables the assessment of prognostic risk, potentially sparing certain patients from chemotherapy (32). Hence, the search for dependable and broadly applicable biomarkers for precise prognosis prediction in patients with breast cancer holds a pivotal position in clinical diagnosis and treatment.

Dysfunction of mitochondria and lysosomes, which are crucial organelles within breast cancer cells, has a substantial impact on the onset and progression of breast cancer (8-10). The impact of mitochondrial and lysosomal genes on breast cancer treatment remains unclear.
Our study was conducted to elucidate the significance of these genes and to assess the correlation between breast cancer-associated mitochondrial and lysosomal genes and prognosis, the TME, and drug effectiveness. We employed WGCNA and univariate Cox regression to identify the MLRGs associated with prognosis in patients with breast cancer. Subsequently, we fitted 101 machine-learning models to the gene expression profiles of a training dataset and validated the results using three independent datasets. The CoxBoost + survival-SVM model was chosen as the optimal approach for further analysis. Integrating algorithms such as artificial intelligence with extensive biological data is currently an important approach to investigating the correlation between diseases and genes. The strength of our approach lies in using machine learning to identify the most suitable prognostic model for BRCA and to streamline the model effectively (33, 34). Nonetheless, overfitting is an important issue that should not be disregarded during model construction. To mitigate overfitting to the training set and ensure the generalizability of the model, we used the average C-index across multiple validation cohorts as the ranking criterion. We then compared our optimal model against previously published models and against the predictive performance of various clinical features, which demonstrated the superiority of our model (33-35).

The TME has an important influence on breast cancer cells. Varied immune cell infiltration can influence the prognosis of patients with breast cancer (36). Our findings indicated a significantly higher level of immune infiltration in the low-risk group than in the high-risk group. Previous research has documented that M0 and M2 macrophages are strongly associated with a poor prognosis in breast cancer (37).
Our findings indicate a positive correlation between M0 and M2 macrophage infiltration and risk scores. T cells, B cells, and NK cells are immune cells with anti-tumor properties, and increased mitochondrial lysosome activity bolsters their anti-tumor efficacy (38-40). Our study revealed diminished infiltration of these cells in high-risk patients. Nevertheless, scRNA-seq analysis revealed a notable increase in mitochondrial lysosome activity among immune cells, with the exception of B cells. This finding suggests that B cells may play a pivotal role in the development of anti-tumor immune deficiency in high-risk patients. Enhancing B cell infiltration and mitochondrial lysosome activity may therefore be an important immunotherapy strategy for improving the prognosis of high-risk patients.

Patients with elevated TMB levels may respond better to immunotherapy (41). We therefore examined TMB differences between the high- and low-risk groups and found a positive correlation between TMB levels and risk scores. Patients with elevated TMB had a poorer prognosis, potentially attributable to the limited use of immunotherapy in breast cancer treatment. Hence, we additionally assessed the responsiveness of the high- and low-risk groups to immunotherapy. The TIDE score was lower in the high-risk group than in the low-risk group, indicating that the high-risk group was more sensitive to immunotherapy. Similar outcomes were observed using ImmuCellAI. We also evaluated the sensitivity of the high- and low-risk groups to commonly used clinical chemotherapy drugs.
Our findings indicate that patients classified as low-risk were sensitive to dactinomycin, whereas those classified as high-risk were sensitive to docetaxel, afatinib, osimertinib, savolitinib, and erlotinib. Afatinib, osimertinib, savolitinib, and erlotinib are tyrosine kinase inhibitors that have significantly impacted the treatment of various tumors (42-45). Anastrozole, which inhibits the estrogen signaling pathway and reduces the stimulatory effects of estrogen on cancer cells, is also commonly used in breast cancer treatment (46). Additionally, drugs such as lapatinib target HER2 receptors and other related tyrosine kinases and have shown therapeutic benefits for HER2-positive breast cancer patients (47-49). Furthermore, combining tyrosine kinase inhibitors with certain chemotherapy and immunotherapy drugs has been shown to enhance efficacy and potentially delay drug resistance (50, 51). Thus, administering tyrosine kinase inhibitors in conjunction with standard therapy may improve the prognosis of high-risk patients.

We compared the HR values of the model genes across datasets and found that SHMT2 and CISD1 were significantly correlated with prognosis in the TCGA and GEO datasets. SHMT2 has a higher mutation frequency in breast cancer; therefore, we selected SHMT2 for subsequent analyses. We validated SHMT2 mRNA expression in various breast cancer cell lines and found it to be upregulated across them. Moreover, a comprehensive pan-cancer analysis demonstrated differential SHMT2 expression in most cancer types, including BLCA, CESC, and COAD. Previous studies have shown that increased SHMT2 expression causes mitochondrial dysfunction, contributing to cell survival (52, 53).
Cell-based experiments revealed that downregulating SHMT2 expression reduces the cellular activity and the invasion and migration capabilities of BRCA cells. These findings underscore the relevance of SHMT2 to the development of anti-breast cancer strategies, in line with previous studies.

Conclusion
Our study demonstrated that the developed model could predict the outcomes of patients with breast cancer, thereby facilitating the formulation of customized treatment approaches for individuals with varying risk profiles. Furthermore, we substantiated the involvement of SHMT2 in BRCA through cellular experiments, which aids in identifying potential genes suitable for personalized precision breast cancer therapy.

Abbreviations
ML, machine learning; mlMSGs, mitochondrial and lysosome-related model signature genes; OXPHOS, oxidative phosphorylation; TNBC, triple negative breast cancer; CTSD, cathepsin D; DEGs, differentially expressed genes; MLRGs, mitochondria and lysosome-related genes; CNV, copy number variation; SNPs, single nucleotide polymorphisms; TME, tumor microenvironment; TMB, tumor mutational burden; IOBR, ImmunoOncology Biology Research; IBC, immune checkpoints; TIP, tumor immunophenotype; ICIs, immune checkpoint inhibitors; TCGA, The Cancer Genome Atlas; GEO, the Gene Expression Omnibus; PCs, principal components; t-SNE, t-distributed stochastic neighbor embedding; GO, Gene Ontology; AUC, area under the curve; WGCNA, weighted gene co-expression network analysis; Enet, elastic net; survival-SVMs, survival support vector machines; GBMs, generalized boosted regression models; SuperPC, supervised principal components; plsRcox, partial least squares regression for Cox; MAF, Mutation Annotation Format; IC50, half-maximal inhibitory concentration; FBS, fetal bovine serum; qRT-PCR, quantitative real-time PCR; DEPC, diethyl pyrocarbonate; GAPDH, glyceraldehyde 3-phosphate dehydrogenase; siRNA, small-interfering RNA; PFA, paraformaldehyde; SD, standard deviation; AJCC, The
American Joint Committee on Cancer

Declarations

Author contributions
The study was proposed and designed by JHP. JHP and HLC were primarily responsible for drafting the manuscript. Analyses were conducted by JHP, HLC, and ZHW. Experiments were carried out by ZHW and JLS. The manuscript was revised by JHP, HLC, and ZHW. All authors have reviewed and approved the final version of the manuscript.

Declaration of competing interest
The authors assert that they do not have any competing financial interests or personal relationships that could be perceived as influencing the work presented in this paper.

Acknowledgments
This study was supported by the Jiangsu Provincial People's Hospital youth talent project (Project No. PY2022023).

Consent for publication
Not applicable.

Availability of data and materials
All data generated or analyzed during this study are included in this article. Further enquiries can be directed to the corresponding author.

Contributor Information
Peng Jinghui, Email: [email protected]
Shi Jiale, Email: [email protected]

References
1. Strobl S, Korkmaz B, Devyatko Y, Schuetz M, Exner R, Dubsky PC, et al. Adjuvant Bisphosphonates and Breast Cancer Survival. Annu Rev Med. 2016;67:1-10.
2. Asleh K, Negri GL, Spencer Miko SE, Colborne S, Hughes CS, Wang XQ, et al. Proteomic analysis of archival breast cancer clinical specimens identifies biological subtypes with distinct survival outcomes. Nat Commun. 2022;13(1):896.
3. Matos Do Canto L, Marian C, Varghese RS, Ahn J, Da Cunha PA, Willey S, et al. Metabolomic profiling of breast tumors using ductal fluid. Int J Oncol. 2016;49(6):2245-54.
4. Xie J, Yang Y, Gao Y, He J. Cuproptosis: mechanisms and links with cancers. Mol Cancer. 2023;22(1):46.
5. Margolin AA, Bilal E, Huang E, Norman TC, Ottestad L, Mecham BH, et al. Systematic analysis of challenge-driven improvements in molecular prognostic models for breast cancer. Sci Transl Med. 2013;5(181):181re1.
6. Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 2018;19(6):1236-46.
7. Kim CK, Choi JW, Jiao Z, Wang D, Wu J, Yi TY, et al. An automated COVID-19 triage pipeline using artificial intelligence based on chest radiographs and clinical data. NPJ Digit Med. 2022;5(1):5.
8. Bao X, Zhang J, Huang G, Yan J, Xu C, Dou Z, et al. The crosstalk between HIFs and mitochondrial dysfunctions in cancer development. Cell Death Dis. 2021;12(2):215.
9. Fassl A, Brain C, Abu-Remaileh M, Stukan I, Butter D, Stepien P, et al. Increased lysosomal biomass is responsible for the resistance of triple-negative breast cancers to CDK4/6 inhibition. Sci Adv. 2020;6(25):eabb2210.
10. Ketterer S, Mitschke J, Ketscher A, Schlimpert M, Reichardt W, Baeuerle N, et al. Cathepsin D deficiency in mammary epithelium transiently stalls breast cancer by interference with mTORC1 signaling. Nat Commun. 2020;11(1):5133.
11. Jin S. Autophagy, mitochondrial quality control, and oncogenesis. Autophagy. 2006;2(2):80-4.
12. Peng W, Wong YC, Krainc D. Mitochondria-lysosome contacts regulate mitochondrial Ca(2+) dynamics via lysosomal TRPML1. Proc Natl Acad Sci U S A. 2020;117(32):19266-75.
13. Qiu P, Guo Q, Yao Q, Chen J, Lin J. Characterization of Exosome-Related Gene Risk Model to Evaluate the Tumor Immune Microenvironment and Predict Prognosis in Triple-Negative Breast Cancer. Front Immunol. 2021;12:736030.
14. Qiu P, Guo Q, Pan K, Chen J, Lin J. A pyroptosis-associated gene risk model for predicting the prognosis of triple-negative breast cancer. Front Oncol. 2022;12:890242.
15. Jiang M, Wu X, Bao S, Wang X, Qu F, Liu Q, et al. Immunometabolism characteristics and a potential prognostic risk model associated with TP53 mutations in breast cancer. Front Immunol. 2022;13:946468.
16. Pu S, Zhou Y, Xie P, Gao X, Liu Y, Ren Y, et al. Identification of necroptosis-related subtypes and prognosis model in triple negative breast cancer. Front Immunol. 2022;13:964118.
17. Wang X, Wang N, Zhong LLD, Su K, Wang S, Zheng Y, et al. Development and Validation of a Risk Prediction Model for Breast Cancer Prognosis Based on Depression-Related Genes. Front Oncol. 2022;12:879563.
18. Zhou Z, Deng J, Pan T, Zhu Z, Zhou X, Lv C, et al. Prognostic Significance of Cuproptosis-Related Gene Signatures in Breast Cancer Based on Transcriptomic Data Analysis. Cancers (Basel). 2022;14(23).
19. Li X, Cao Y, Yu X, Jin F, Li Y. A novel autophagy-related genes prognostic risk model and validation of autophagy-related oncogene VPS35 in breast cancer. Cancer Cell Int. 2021;21(1):265.
20. Li L, Li L, Liu M, Li Y, Sun Q. Novel immune-related prognostic model and nomogram for breast cancer based on ssGSEA. Front Genet. 2022;13:957675.
21. Lu X, Gou Z, Yu L, Bu H. A novel risk model based on immune response predicts clinical outcomes and characterizes immunophenotypes in triple-negative breast cancer. Am J Cancer Res. 2022;12(8):3913-31.
22. Chen L, Dong Y, Pan Y, Zhang Y, Liu P, Wang J, et al. Identification and development of an independent immune-related genes prognostic model for breast cancer. BMC Cancer. 2021;21(1):329.
23. Tao D, Wang Y, Zhang X, Wang C, Yang D, Chen J, et al. Identification of Angiogenesis-Related Prognostic Biomarkers Associated With Immune Cell Infiltration in Breast Cancer. Front Cell Dev Biol. 2022;10:853324.
24. Geng S, Fu Y, Fu S, Wu K. A tumor microenvironment-related risk model for predicting the prognosis and tumor immunity of breast cancer patients. Front Immunol. 2022;13:927565.
25. Feng L, Jin F. Screening of differentially methylated genes in breast cancer and risk model construction based on TCGA database. Oncol Lett. 2018;16(5):6407-16.
26. Liu Z, Ding M, Qiu P, Pan K, Guo Q. Natural killer cell-related prognostic risk model predicts prognosis and treatment outcomes in triple-negative breast cancer. Front Immunol. 2023;14:1200282.
27. Ye Z, Zou S, Niu Z, Xu Z, Hu Y. A Novel Risk Model Based on Lipid Metabolism-Associated Genes Predicts Prognosis and Indicates Immune Microenvironment in Breast Cancer. Front Cell Dev Biol. 2021;9:691676.
28. Ginter PS, Idress R, D'Alfonso TM, Fineberg S, Jaffer S, Sattar AK, et al. Histologic grading of breast carcinoma: a multi-institution study of interobserver variation using virtual microscopy. Mod Pathol. 2021;34(4):701-9.
29. Zhang L, Dong D, Li H, Tian J, Ouyang F, Mo X, et al. Development and validation of a magnetic resonance imaging-based model for the prediction of distant metastasis before initial treatment of nasopharyngeal carcinoma: A retrospective cohort study. EBioMedicine. 2019;40:327-35.
30. Wong KY, Fan C, Tanioka M, Parker JS, Nobel AB, Zeng D, et al. I-Boost: an integrative boosting approach for predicting survival time with multiple genomics platforms. Genome Biol. 2019;20(1):52.
31. Kong J, Lee H, Kim D, Han SK, Ha D, Shin K, et al. Network-based machine learning in colorectal and bladder organoid models predicts anti-cancer drug efficacy in patients. Nat Commun. 2020;11(1):5485.
32. Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Adjuvant Chemotherapy Guided by a 21-Gene Expression Assay in Breast Cancer. N Engl J Med. 2018;379(2):111-21.
33. Liu P, Deng X, Zhou H, Xie J, Kong Y, Zou Y, et al. Multi-omics analyses unravel DNA damage repair-related clusters in breast cancer with experimental validation. Front Immunol. 2023;14:1297180.
34. Chu G, Ji X, Wang Y, Niu H. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids. 2023;33:110-26.
35. Liu Z, Liu L, Weng S, Guo C, Dang Q, Xu H, et al. Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in colorectal cancer. Nat Commun. 2022;13(1):816.
36. Adams S, Gray RJ, Demaria S, Goldstein L, Perez EA, Shulman LN, et al. Prognostic value of tumor-infiltrating lymphocytes in triple-negative breast cancers from two phase III randomized adjuvant breast cancer trials: ECOG 2197 and ECOG 1199. J Clin Oncol. 2014;32(27):2959-66.
37. Ali HR, Chlon L, Pharoah PD, Markowetz F, Caldas C. Patterns of Immune Infiltration in Breast Cancer and Their Clinical Implications: A Gene-Expression-Based Retrospective Study. PLoS Med. 2016;13(12):e1002194.
38. Li X, Lu M, Yuan M, Ye J, Zhang W, Xu L, et al. CXCL10-armed oncolytic adenovirus promotes tumor-infiltrating T-cell chemotaxis to enhance anti-PD-1 therapy. Oncoimmunology. 2022;11(1):2118210.
39. Pena-Romero AC, Orenes-Pinero E. Dual Effect of Immune Cells within Tumour Microenvironment: Pro- and Anti-Tumour Effects and Their Triggers. Cancers (Basel). 2022;14(7).
40. Largeot A, Pagano G, Gonder S, Moussay E, Paggetti J. The B-side of Cancer Immunity: The Underrated Tune. Cells. 2019;8(5).
41. Allgauer M, Budczies J, Christopoulos P, Endris V, Lier A, Rempel E, et al. Implementing tumor mutational burden (TMB) analysis in routine diagnostics-a primer for molecular pathologists and clinicians. Transl Lung Cancer Res. 2018;7(6):703-15.
42. Lai E, Puzzoni M, Ziranu P, Pretta A, Impera V, Mariani S, et al. New therapeutic targets in pancreatic cancer. Cancer Treat Rev. 2019;81:101926.
43. Wu S, Luo M, To KKW, Zhang J, Su C, Zhang H, et al. Intercellular transfer of exosomal wild type EGFR triggers osimertinib resistance in non-small cell lung cancer. Mol Cancer. 2021;20(1):17.
44. Choueiri TK, Heng DYC, Lee JL, Cancel M, Verheijen RB, Mellemgaard A, et al. Efficacy of Savolitinib vs Sunitinib in Patients With MET-Driven Papillary Renal Cell Carcinoma: The SAVOIR Phase 3 Randomized Clinical Trial. JAMA Oncol. 2020;6(8):1247-55.
45. Xie G, Zhu A, Gu X. Converged DNA Damage Response Renders Human Hepatocellular Carcinoma Sensitive to CDK7 Inhibition. Cancers (Basel). 2022;14(7).
46. Tyutyunyk-Massey L, Gewirtz DA. Roles of autophagy in breast cancer treatment: Target, bystander or benefactor. Semin Cancer Biol. 2020;66:155-62.
47. Fakhri S, Moradi SZ, Farzaei MH, Bishayee A. Modulation of dysregulated cancer metabolism by plant secondary metabolites: A mechanistic review. Semin Cancer Biol. 2022;80:276-305.
48. Gaynor N, Crown J, Collins DM. Immune checkpoint inhibitors: Key trials and an emerging role in breast cancer. Semin Cancer Biol. 2022;79:44-57.
49. Biancolella M, Testa B, Baghernajad Salehi L, D'Apice MR, Novelli G. Genetics and Genomics of Breast Cancer: update and translational perspectives. Semin Cancer Biol. 2021;72:27-35.
50. Kok PS, Cho D, Yoon WH, Ritchie G, Marschner I, Lord S, et al. Validation of Progression-Free Survival Rate at 6 Months and Objective Response for Estimating Overall Survival in Immune Checkpoint Inhibitor Trials: A Systematic Review and Meta-analysis. JAMA Netw Open. 2020;3(9):e2011809.
51. Zhu R, Li L, Nguyen B, Seo J, Wu M, Seale T, et al. FLT3 tyrosine kinase inhibitors synergize with BCL-2 inhibition to eliminate FLT3/ITD acute leukemia cells through BIM activation. Signal Transduct Target Ther. 2021;6(1):186.
52. Zhang Y, Liu Z, Wang X, Jian H, Xiao H, Wen T. SHMT2 promotes cell viability and inhibits ROS-dependent, mitochondrial-mediated apoptosis via the intrinsic signaling pathway in bladder cancer cells. Cancer Gene Ther. 2022;29(10):1514-27.
53. Ron-Harel N, Santos D, Ghergurovich JM, Sage PT, Reddy A, Lovitch SB, et al. Mitochondrial Biogenesis and Proteome Remodeling Promote One-Carbon Metabolism for T Cell Activation. Cell Metab. 2016;24(1):104-17.

Supplementary Files
Additionalfile1.tif
Additionalfile2.tif
Additionalfile3.tif
Additionalfile4.tif
Additionalfile5.tif
Additionaltable1.docx
Additioanaltable2.xlsx
Additionaltable3.docx
Additioanaltable4.xlsx
Figure 1. The methodology employed in this study, depicted using Figdraw.

Figure 2. Novel interconnectivity between mitochondria and lysosome related genes. A Volcano plots demonstrate top 20 differentially expressed genes related to lysosome; B Volcano plots demonstrate top 20 differentially expressed genes related to mitochondria; C Differential mitochondrial and lysosomal associated genes with copy number mutations; D The chromosomal location of the CNV genes; E Differential expression of the 94 CNV genes related to mitochondria and lysosomes between BRCA samples and adjacent normal tissues. F SNPs analysis of differential mitochondrial lysosomal related genes. G Analysis of mutation co-occurrence in the top 20 genes with mutation frequency. H-I Correlation heat map of MLRGs (H); Protein network interaction map of MLRGs (I).

Figure 3. Identification of co-regulators of mitochondria and lysosomes. A-C The WGCNA analysis. The genes related to lysosome and mitochondria were extracted. D-G Significant prognostic genes in the TCGA, GSE20685, GSE42568, GSE96058.

Figure 4. Construction of prognostic models. A 101 different machine learning algorithms were used to create prognostic models, and each model's C-index was calculated for each data set. B TCGA, GSE20685, GSE42568, GSE96058 survival curves by LMRGs. C ROC curves of 1-year, 3-year, and 5-year in TCGA, GSE20685, GSE42568, GSE96058. D C-index comparison of different clinical features and risk scores. E C-index analysis of LMRGs and 18 published signatures in TCGA, GSE20685, GSE42568, GSE96058 and MetaCohort.

Figure 5. Heterogeneity of mlMSGs scores in different cell types. A-I Analysis of GSE161529 dataset: A All cells were divided into Epithelial/Cancer, Immune, Fibroblast and Endothelial types. B Distribution of LMRGs activity in all cells; C Differential activity of LMRGs in normal cells and tumor cells; D The immune cells were further divided into T cells, MNPS, NK, B cells, Treg cells and Plasma cells; E Distribution of LMRGs activity in immune cells; F Differential activity of LMRGs in normal immune cells and tumor-infiltrating immune cells (TICs); G The NK cells were further divided into GNLY+NK, XCL1+NK, NKG7+NK; H Distribution of LMRGs activity in NK cells; I Differential activity of LMRGs in normal NK cells and tumor-infiltrating NK cells; J-R Analysis of GSE176078 dataset: J All cells were divided into Epithelial/Cancer, Immune, Fibroblast and Endothelial types. K Distribution of LMRGs activity in all cells; L Differential activity of LMRGs in normal cells and tumor cells; M The immune cells were further divided into T cells, MNPS, NK, B cells; N Distribution of LMRGs activity in immune cells; O Differential activity of LMRGs in normal immune cells and tumor-infiltrating immune cells (TICs); P The NK cells were further divided into GNLY+NK, XCL1+NK, NKG7+NK; Q Distribution of LMRGs activity in NK cells; R Differential activity of LMRGs in normal NK cells and tumor-infiltrating NK cells.

Figure 6. Evaluation of mutation status in high- and low-risk groups. A-B Genes with the top 20 mutation frequencies in the high-low risk group; C-D Significant instances of co-occurrence and exclusion in the high and low risk groups; E Comparison of mutation frequency of genes with a minimum number of mutations of 5 in high and low risk groups; G Correlation analysis between TMB and prognosis.

Figure 7. Immune landscape related to risk score. A Tumor immunoinfiltration analysis of 22 kinds of immune cells in high and low risk group; B Differential analysis of immune scores in high and low risk groups; C Analysis of differences in immune function in high and low risk groups; D-E Differential analysis of immune checkpoint and HLA family related genes in high and low risk groups; F Correlation analysis between M0 Macrophages, M2 Macrophages, CD8+ T cells, B cells, NK cells and T cells follicular helper and risk score; G Differences in the degree of activation between high-risk and low-risk groups at each step of the TIP; H Analysis of differences in TME-related scores (ESTIMATEScore, ImmuneScore, StromalScore) among high and low risk groups.

Figure 8. Role of risk score in BRCA treatment. A IC50 value of sensitive drugs in high and low risk group; B-D Response to immunotherapy in high and low risk group; E The ImmuCellAI algorithm predicts response to immunotherapy in high-low risk groups. F Differences in IPS reactivity between high and low risk groups; G Differences in IC50 values of drugs such as Dactinomycin, Bortezomib, Docetaxel, Afatinib and Erlotinib in high and low risk groups.

Figure 9. Comprehensive analysis of SHMT2 in BC samples. A Difference of SHMT2 expression between normal and tumor tissues in TCGA; B Difference of SHMT2 expression between normal epithelial cells (HBL-100) and breast cancer cells (BT-474, ZR-75-1, MCF-7, HCC1806, MDA-MB-231); C Pan-cancer differential analysis of SHMT2 between normal tissue and tumor tissue at different sites; D The correlation between SHMT2 expression and immunity in different tumor tissues. E Model gene differential expression in normal and tumor cells by scRNA analysis; F Survival curves of groups with high and low SHMT2 expression in different datasets; G SHMT2 differential expression in Epithelial/Cancer, Immune, Fibroblast and Endothelial cells between normal and tumor tissues in GSE161529.

Figure 10. Roles of SHMT2 in breast cancer. A-B The impact of SHMT2 on the proliferation of MCF-7 and HCC-1806 cell lines was confirmed through the use of a CCK8 assay; C-D The impact of SHMT2 on the proliferation of MCF-7 and HCC-1806 cells was confirmed through colony formation; E The effect of SHMT2 on the migration of MCF-7 and HCC-1806 cells was verified by wound healing assay; F The effects of SHMT2 on the migration and invasion of MCF-7 and HCC-1806 cells were investigated through transwell experiments.
17:52:03","extension":"tif","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":771356,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile1.tif","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/d33962fd4473e036bfb36ba8.tif"},{"id":54993904,"identity":"addbc78a-8e2a-44c0-a274-2050064a008f","added_by":"auto","created_at":"2024-04-19 17:44:03","extension":"tif","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":2665924,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile2.tif","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/020f0aa713844139d0e83074.tif"},{"id":54996520,"identity":"c08d4f54-fe55-4ee1-98af-e5ae24c8b6b3","added_by":"auto","created_at":"2024-04-19 18:00:03","extension":"tif","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":3248852,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile3.tif","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/7bb354cb5885a1149cfb8a20.tif"},{"id":54996521,"identity":"291ae887-7e52-4b32-9ee4-b97e6b5a0bc8","added_by":"auto","created_at":"2024-04-19 18:00:03","extension":"tif","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":6147652,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile4.tif","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/41fc480dc229b4dc75f4932e.tif"},{"id":54994916,"identity":"08677372-33b4-4bb8-82fe-ece6db79d4fb","added_by":"auto","created_at":"2024-04-19 
17:52:03","extension":"tif","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":6069308,"visible":true,"origin":"","legend":"","description":"","filename":"Additionalfile5.tif","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/d0dcb7b25a93b659c8f84e13.tif"},{"id":54994911,"identity":"1f8c7077-782e-4f50-969d-9e9c366fb4ed","added_by":"auto","created_at":"2024-04-19 17:52:03","extension":"docx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":16635,"visible":true,"origin":"","legend":"","description":"","filename":"Additionaltable1.docx","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/a15bbd7f400c1f9ea93114cf.docx"},{"id":54994913,"identity":"d6e7021e-e5cb-4fe0-ae75-7a5df4e2a425","added_by":"auto","created_at":"2024-04-19 17:52:03","extension":"xlsx","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":30692,"visible":true,"origin":"","legend":"","description":"","filename":"Additioanaltable2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/ef127f9e99cb36a8fdf3d072.xlsx"},{"id":54993910,"identity":"e404b081-f80d-40c1-b2aa-1279b847ac9a","added_by":"auto","created_at":"2024-04-19 17:44:03","extension":"docx","order_by":8,"title":"","display":"","copyAsset":false,"role":"supplement","size":15776,"visible":true,"origin":"","legend":"","description":"","filename":"Additionaltable3.docx","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/b4c2c49ab524b210f5868a52.docx"},{"id":54994914,"identity":"527991c1-b3e0-4c17-8307-8c8967e0fa48","added_by":"auto","created_at":"2024-04-19 
17:52:03","extension":"xlsx","order_by":9,"title":"","display":"","copyAsset":false,"role":"supplement","size":8986,"visible":true,"origin":"","legend":"","description":"","filename":"Additioanaltable4.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4176718/v1/2d95bf8812ddb8e218dcb07b.xlsx"}],"financialInterests":"","formattedTitle":"Crosstalk between mitochondrial and lysosomal co-regulators defines clinical outcomes of breast cancer by integrating multi-omics and machine learning","fulltext":[{"header":"Introduction","content":"\u003cp\u003eDespite recent significant advances in breast cancer therapy, there are numerous challenges regarding its therapeutic efficacy. Inadequate treatment approaches contribute to poor prognoses, while excessive interventions may compromise quality of life and engender psychological distress (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e). Given the intricate nature of breast cancer, characterized by complex genetic, epigenetic, and morphological variations within and between tumors (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e), precision medicine has gained significant prominence in its treatment. The discernment of biomarkers is pivotal for facilitating the practical implementation of precision medicine in clinical settings(\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e). Biomarkers should demonstrate consistent expression patterns within individual tumor tissues and across different tumor tissues. Consequently, a multigene signature may be an effective approach to address inherent heterogeneity. Progress in bioinformatics has facilitated the development of numerous prognostic models (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) (\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e). 
However, the clinical application of the prognostic models is limited due to insufficient data utilization, inappropriate machine-learning techniques, and the absence of high-quality and rigorously validated cohorts (\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e). Consequently, there is an urgent need for enhanced prognostic models that can accurately predict survival outcomes and identify patients with poor prognoses, thereby enabling personalized treatment interventions.\u003c/p\u003e \u003cp\u003eMitochondria and lysosomes are dynamic organelles crucial for numerous essential cellular processes. Research has disclosed that mitochondrial dysfunction in tumor cells increases glycolysis, reduces oxidative phosphorylation (OXPHOS) and apoptosis, and enhances sensitivity to radiation (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e). Furthermore, the resistance of triple negative breast cancer (TNBC) to CDK4/6 inhibition can elevate lysosome biomass(\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e). Additionally, cancer patients with elevated Cathepsin D (CTSD) levels, a lysosomal protease, have a poorer prognosis (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e). A growing body of evidence in the literature highlights a strong correlation between mitochondria and lysosomes. Various lysosomal degradation deficiencies are potential causes of mitochondrial protein accumulation within the lysosomes (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e). Additionally, the interaction between the mitochondria and lysosomes regulates the dynamics of mitochondrial calcium ions (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e). Overall, lysosomes and mitochondria are extensively involved in cancer biogenesis and development. 
However, few studies have examined whether breast cancer prognosis can be predicted from mitochondrial and lysosomal features. Consequently, prognostic stratification based on mitochondrial and lysosome-related model signature genes could potentially serve as a valuable tool in guiding the clinical management of patients with breast cancer, ultimately enhancing treatment outcomes for those at high risk.\u003c/p\u003e \u003cp\u003eThis study developed and validated a risk stratification model using mlMSGs derived from four distinct public datasets encompassing 4,897 patients with breast cancer. This study aimed to assess the prognosis, investigate immune correlations, and predict drug susceptibility. This study may assist in optimizing precision treatment and improving the prognosis of patients with breast cancer.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eExploration of novel interconnectivity between mitochondria and lysosome-related genes (MLRGs)\u003c/h2\u003e \u003cp\u003eFigure 1 depicts the methodology employed in this study. We used the TCGA database to analyze the differential expression of mitochondrial and lysosome-related genes in BRCA samples and adjacent normal tissues. The limma R package was used to identify differentially expressed genes (DEGs) with logFC \u0026gt; 1 and p-value \u0026lt; 0.05. Specifically, we identified MLRGs, including 76 mitochondria-related and 91 lysosome-related differential genes. Volcano plots highlight the top 20 differentially expressed genes (Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eA-B). Subsequently, we identified 94 copy number variation (CNV) genes among MLRGs. 
The incidence of CNV gains was higher in PIGR, CD34, RAB7B, FMOD, PRELP, COA6, FCER1A, FLAD1, and S100A7 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eC), and was primarily concentrated on chromosomes 1, 8, and 16 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). We conducted differential expression analysis of the 94 CNV MLRGs in BRCA samples and adjacent normal tissues to further investigate the relationship between CNV and mRNA expression. Our findings revealed that COA6, FLAD1, TDRKH, TMEM79, LAMTOR2, and COX6C exhibited CNV gain and higher mRNA expression levels in the BRCA samples (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eE).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn the single nucleotide polymorphism (SNP) analysis, 203 samples with MLRG mutations were selected from a larger pool of 968 breast cancer samples. Among the mutated genes, APOB (4%), LRP1 (3%), AHNAK (3%), and ANK2 (3%) exhibited the highest frequency of mutations (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eF). Notably, significant co-occurrence was observed among genes with a higher mutation frequency, specifically APOB, ANK2, and LRP1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eG). Furthermore, correlation analysis showed that the MLRGs were strongly correlated with one another, and protein-protein interaction network analysis revealed significant interactions among them (Figs.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e2\u003c/span\u003eH-I).\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eIdentification of co-regulators of mitochondria and lysosomes\u003c/h3\u003e\n\u003cp\u003eWGCNA analysis identified co-regulated gene sets associated with mitochondria and lysosomes. 
Initially, we assessed the mitochondrial and lysosomal activity in the samples using ssGSEA (Additional Figs.\u0026nbsp;1B-C). Subsequently, with the soft-threshold power set to 3, the network more closely followed a power-law (scale-free) distribution and the mean connectivity stabilized, supporting the suitability of the data for subsequent analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Module merging yielded a total of 22 modules, with the minimum module size set to 50, deepSplit set to 2, and a similarity threshold of less than 0.25 (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). We identified significant (p \u0026lt; 0.05) genes in the mitochondria and lysosome modules and analyzed their relationship with prognosis in multiple datasets using univariate Cox regression. A total of 43 genes were significantly correlated with prognosis in the TCGA, GSE20685, and GSE42568 datasets (Figs.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003eC–F, Additional Fig.\u0026nbsp;1D). Additionally, most of the 43 genes were linked to prognosis in GSE96058 by univariate Cox analysis (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003eG). 
CELSR2, SIAH2, BTG2, LEF1, RBBP8, and AGBL2 were identified as the prevalent protective factors across the four datasets, whereas SLC38A7, SHMT2, and CISD1 were recognized as common detrimental factors among the top 10 highest hazard ratio values in the four datasets (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003eG).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eConstruction of prognostic models of mlMSGs\u003c/h2\u003e \u003cp\u003eThe prognostic model was constructed by integrating sets of 43 genes into the framework, with the TCGA dataset serving as the training set and GEO database data as the test set. Based on TCGA training set, a consistency model was developed using 99 algorithm combinations, and the predictive power of each model was assessed by calculating the C-index across all cohorts. Among the 99 models, the CoxBoost + survival-SVM algorithm had the highest average C-index and was chosen as the final model (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). Furthermore, 30 mlMSGs were screened from 43 genes using the CoxBoost algorithm (Additional table \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003e). Subsequently, the survival-SVM algorithm was employed to develop the final prognostic model using the 30 model genes.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eRisk scores for each sample across all cohorts revealed that patients with high-risk scores experienced unfavorable clinical outcomes in TCGA, GEO, and meta-cohorts (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). To substantiate the prognostic model's superiority, the AUC values of the TCGA-BRCA dataset were 0.738, 0.746, and 0.738, respectively. The AUC values for the GSE20685 dataset were 0.781, 0.806, and 0.754. 
The AUC values for the GSE42568 dataset were 0.653, 0.696, and 0.783. The AUC values for the GSE96058 dataset were 0.715, 0.696, and 0.647. The AUC values for the meta-dataset were 0.724, 0.716, and 0.68 (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eC). These results underscored the prognostic significance of the mlMSGs model. We also calculated the C-index of different clinical features related to prognosis in various datasets. We found that the constructed prognostic model had a better C-index than other clinical features (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). Additionally, an extensive review of the pertinent literature from the past five years was conducted. Subsequently, 18 signature genes associated with diverse biological processes, including exosome (\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e, \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e), TP53 mutation (\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e), necroptosis (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e), depression (\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e), pyroptosis (\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e), autophagy (\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e), immune (\u003cspan additionalcitationids=\"CR21\" citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e–\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e), angiogenesis (\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e), cuproptosis (\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR20\"
class=\"CitationRef\"\u003e20\u003c/span\u003e), tumor microenvironment (TME)(\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e), methylation (\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e), natural killer cell (\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e), lipid metabolism(\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e), were incorporated for comparative analysis. The mlMSG prognostic model exhibited superior C-index performance compared to nearly all models present in TCGA and GEO datasets (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eE). Our findings indicate that the risk scores of the mlMSG prognostic model exhibited superior performance regarding the C-index compared to all clinical features.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eHeterogeneity of mlMSGs activity in different cell types\u003c/h2\u003e \u003cp\u003eInitially, we categorized the samples in the GSE161529 dataset as normal (BM) or tumor (GM). Then, we grouped the cells into epithelial/cancer, immune, fibroblast, and endothelial categories (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eA). We assessed mlMSGs activity levels in different cells within the BM and GM groups (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eB). The analysis revealed increased mlMSGs activity in all four cell types in the GM group (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eC). Additionally, we subdivided immune cells into T and B cells, MNPS, NK, Treg cells, and plasma cells to examine mlMSGs activity distribution (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eD–E). 
Our study revealed a notable increase in mlMSGs activity of various immune cell types within the GM group, with NK cells exhibiting the most significant enhancement (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eF). These results suggested that hyperactivity of mlMSGs might promote the development of breast cancer.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo investigate the association between risk scores and mlMSGs activity, we computed the risk scores for the samples in the GSE176078 dataset. We stratified them into high- and low-risk groups based on the median value. Subsequently, the mlMSGs activity levels of various cell types, including epithelial/cancer, immune, fibroblast, endothelial, T and B cells, MNPS, NK, plasma cells were determined using the same methodology as above within the high- and low-risk groups (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eG, J, M). The mlMSGs activity was positively correlated with risk score across various cell types, including epithelial/cancer, immune, fibroblast, and endothelial cells, as well as T cells, MNPS, NK cells, and plasma cells (Figs.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eI, L and O), while exhibiting a negative correlation with B cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eO). 
The findings suggest that suppression of B-cell mlMSGs activity may diminish the efficacy of antitumor treatments in high-risk patients, potentially contributing to the unfavorable prognosis observed in this patient population.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eEvaluation of mutation status in high- and low-risk groups\u003c/h2\u003e \u003cp\u003eTo examine the relationship between risk scores and mutations in BRCA, we analyzed somatic mutations in the TCGA cohort and compared the high- and low-risk groups. Among the frequently mutated genes, the high-risk group displayed a higher frequency of mutations than the low-risk group (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003eA-B). OGT, EPHA3, CELSR2, and SHMT2 had higher mutation frequencies in the low-risk group (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eA), while CELSR2, C11orf24, and NDRG1 had higher mutation frequencies in the high-risk group (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). CELSR2, OGT, EPHA3, RBBP3, and SHMT2 showed higher mutation frequencies in the TCGA cohort samples (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eC), with widespread mlMSGs mutation co-occurrence in the TCGA cohort (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eD). Additionally, we observed significant co-occurrence in the high- and low-risk groups, with co-occurrence being more pronounced in the low-risk group than in the high-risk group (Figs.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003eC-D). 
Among genes with a minimum of five mutations, the high-risk group had a higher frequency of gene mutations than the low-risk group (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003eE). Tumor mutational burden (TMB) was significantly higher in the high-risk group than in the low-risk group (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eE). Survival analysis suggested that the prognosis of the patients was better in the low-risk group than in the high-risk group (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003eF).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eImmune landscape related to risk score\u003c/h2\u003e \u003cp\u003eMast cells were closely associated with mlMSGs, commonly linked to immune cells (Additional Figs.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eF and H). A comprehensive investigation of the TME using the Immuno-Oncology Biology Research (IOBR) R package and the ImmuCellAI algorithm revealed that patients in the low-risk score group exhibited markedly higher levels of immune cell infiltration than those in the high-risk score group, signifying immune activation (Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eA-B). Moreover, a comparative analysis of immune function between the high- and low-risk groups demonstrated that most immune functions in the low-risk group surpassed those in the high-risk group, such as immune checkpoints (IBC) and HLA family genes, thereby providing additional evidence of the hyperimmune state within the low-risk group (Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eC-E). 
According to the CIBERSORT algorithm, the high-risk group had increased M0 and M2 macrophage and T follicular helper cell infiltration, while NK cell, B cell, and CD8+ T cell infiltration was reduced (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eF). These findings suggest that the low-risk group might display a favorable prognosis, aligning with previous survival analysis outcomes. Subsequently, the tracking tumor immunophenotype (TIP) score was calculated to investigate the potential biological mechanisms linked to the mlMSGs. As hypothesized, the low-risk group exhibited a predominance of step 4 (tumor immune infiltrating cell recruitment) and step 5 (immune cell infiltration), aligning with our earlier findings (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eG). The low-risk group exhibited significantly higher stromal, immune, and ESTIMATE scores (p \u0026lt; 0.001), suggesting a heightened level of overall immunity and immunogenicity within the TME of this particular group (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003eH).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eRole of the risk score in BRCA treatment\u003c/h2\u003e \u003cp\u003eWe utilized the oncoPredict package to calculate the IC\u003csub\u003e50\u003c/sub\u003e values of various drugs to evaluate the prognostic significance of the risk scores for chemotherapy response. Our findings revealed that the low-risk cohort was more susceptible to drugs than the high-risk cohort (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eA). 
Individuals with a low-risk score demonstrated heightened sensitivity to dactinomycin, whereas the high-risk group displayed increased sensitivity to bortezomib, docetaxel, afatinib, and erlotinib (Figs.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eA and \u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eG). These findings imply that risk score has potential as a biomarker for predicting drug sensitivity. We examined the influence of risk scores on immunotherapy efficacy using TIDE (Figs.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eB–D) and ImmuCellAI (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eE) to comprehensively evaluate the predictive capacity of risk scores for immunotherapy sensitivity. Our analysis revealed that the high-risk group was more sensitive to immunotherapy than the low-risk group. Differential analysis of the IPS response suggested that patients in the low-risk group were more sensitive to immune checkpoint inhibitors (ICIs) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003eF).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eComprehensive analysis of SHMT2 in breast cancer samples\u003c/h2\u003e \u003cp\u003eA comparative analysis of model genes in TCGA database revealed significant differences in expression levels between normal and tumor tissues, with SHMT2 exhibiting notably higher expression in tumor tissues (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eA). Subsequent cell experiments confirmed the upregulation of SHMT2 in tumor cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eB). 
The pan-cancer analysis also demonstrated a consistent elevation of SHMT2 in various tumors, such as breast cancer, indicating a potentially pivotal role of SHMT2 in tumorigenesis and progression (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eC). The pan-cancer analysis conducted on SHMT2 also revealed a significant correlation between SHMT2 and immune infiltration (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eD), immune checkpoints (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eD), HRD scores (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eA), stemness scores (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eB), immunoinflammatory pathways (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003eC), and TMB (Additional Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003eG) in BRCA. Our study delved deeper into the distinction between normal and tumor cells within the scRNA dataset GSE161529, revealing that model genes exhibited significant differences at the single-cell level (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eE). SHMT2 was prominently upregulated in tumor cells compared to normal cells across epithelial/cancer, immune, fibroblast, and endothelial cell types (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eG). 
Furthermore, our investigation into the relationship between SHMT2 expression and prognosis demonstrated that lower SHMT2 expression was correlated with a more favorable prognosis (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e9\u003c/span\u003eF).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eRoles of SHMT2 in breast cancer cells\u003c/h2\u003e \u003cp\u003eWe first knocked down SHMT2 to construct cell lines with low SHMT2 expression. A CCK8 assay then showed that the proliferation of SHMT2-knockdown MCF-7 and HCC-1806 cells was significantly reduced compared with control cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e10\u003c/span\u003eA-B). In colony formation assays, the colony area of the two cell lines was significantly reduced after SHMT2 knockdown (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e10\u003c/span\u003eC-D). 
Wound healing and transwell assays further showed that SHMT2 knockdown inhibited the migration and invasion of MCF-7 and HCC-1806 breast cancer cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e10\u003c/span\u003eE-F).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003cdiv id=\"Sec13\" class=\"Section3\"\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e\u003c/h2\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003cdiv id=\"Sec23\" class=\"Section3\"\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec24\" class=\"Section2\"\u003e \u003cdiv id=\"Sec25\" class=\"Section3\"\u003e \u003c/div\u003e \u003cdiv id=\"Sec26\" class=\"Section3\"\u003e \u003c/div\u003e \u003cdiv id=\"Sec27\" class=\"Section3\"\u003e \u003c/div\u003e \u003c/div\u003e "},{"header":"Methods","content":"\u003ch2\u003eTranscriptome Data acquisition and processing\u003c/h2\u003e\u003cp\u003eThe transcriptome data, mutation data, and clinical data of BRCA were acquired from The Cancer Genome Atlas (TCGA) database, comprising a total of 1168 samples. Among these, 111 samples were classified as normal, while 1057 samples were categorized as tumour samples. Additionally, the expression profiles GSE42568, GSE20685 and GSE96058 were downloaded from the Gene Expression Omnibus (GEO) database. Additional table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e details the characteristics of samples derived from the different cohorts (TCGA, GSE20685, GSE42568, GSE96058). 
To account for any discrepancies between the TCGA and GEO expression profiles, the \"sva\" package was employed for batch adjustment.\u003c/p\u003e\u003cb\u003eScRNA-seq data acquisition and processing\u003c/b\u003e\u003cp\u003eThe scRNA-seq data was obtained from the GSE161529 and GSE176078 database. Quality control of the scRNA-seq data was conducted using the \"seurat\" and \"singleR\" R packages. Cells with less than 15% expression of both mitochondrial and ribosomal genes were retained, along with genes whose expression levels ranged from 200 to 10,000 and were expressed in at least three cells. The remaining cells were normalized using a linear regression model with the \"Log-normalisation\" technique. Additionally, the \"FindVariableFeatures\" function was employed to identify 2000 hypervariable genes. The data was scaled using the \"ScaleData\" function, followed by the identification of the top 15 principal components (PCs) through t-distributed stochastic neighbor embedding (t-SNE) analysis to identify significant clusters.\u003c/p\u003e\u003ch2\u003eAcquisition of the mitochondria and lysosome related genes\u003c/h2\u003e\u003cp\u003eMitochondria-related genes were derived from the intersection of MitoProteome and MitoCarta3.0 databases (Additional Fig.\u0026nbsp;1A), while the genes related to lysosomes were retrieved from the Gene Ontology (GO) database, resulting in a total of 872 lysosome-related genes. These genes are listed in \u003cb\u003eAdditional Table \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e\u003c/b\u003e.\u003c/p\u003e\u003ch2\u003eAUCell\u003c/h2\u003e\u003cp\u003eThe \"AUCell\" R package was utilized to compute activity scores for mitochondria and lysosomes in each cell lineage. The gene expression of each cell was subsequently ranked based on the area under the curve (AUC) value of mitochondria and lysosome related genes, enabling estimation of the proportion of highly expressed gene sets. 
Subsequently, the cells were stratified into high- and low-AUC groups using the median score. The \"FindAllMarkers\" function was employed to analyze the differences between these high and low groups.\u003c/p\u003e\u003ch2\u003eWeighted gene co-expression network analysis (WGCNA)\u003c/h2\u003e\u003cp\u003eThe \"WGCNA\" R package was employed to examine the interconnectivity among distinct gene sets and the correlation between the phenotype and various gene sets. The gene co-expression network was constructed from weighted expression correlations. Hierarchical cluster analysis based on the weighted correlation was then conducted to identify distinct gene modules. Next, the phenotypic traits under investigation were introduced, and the correlation between each gene module and each phenotypic trait, together with its reliability, was computed. The core module, deemed the most pertinent and significant, was subsequently identified. Genes originating from the blue and green modules were then designated as hub mitochondria/lysosome regulators.\u003c/p\u003e\u003ch2\u003eEstablishment of prognostic signature derived from machine learning\u003c/h2\u003e\u003cp\u003eTo assess the association between MLRGs and prognosis, we utilized the TCGA cohort as the training set and the GEO cohorts as the testing sets. We employed ten machine learning algorithms, namely CoxBoost, stepwise Cox, Lasso, Ridge, elastic net (Enet), survival support vector machines (survival-SVMs), generalized boosted regression models (GBMs), supervised principal components (SuperPC), partial least squares regression for Cox (plsRcox), and random survival forest (RSF), to construct a prognostic model that is both accurate and comprehensive. The construction process of the prognostic model proceeded as follows: initially, prognostic mlMSGs were chosen through univariate Cox analysis in both the TCGA and GEO cohorts. 
Subsequently, 101 algorithms, resulting from pairwise combinations of 10 machine learning algorithms, were employed to develop the most precise and comprehensive model with the highest C-index performance. Lastly, the C-index of each validation cohort was computed, and the optimal model was determined based on the highest average C-index value.\u003c/p\u003e\u003ch2\u003eSuperiority and validity evaluation of the prognostic model\u003c/h2\u003e\u003cp\u003eROC curves were created to evaluate the prediction accuracy of the model over 1,3, and 5 years. The risk scores of the samples in the training and validation sets were computed, and subsequently, these samples were categorized into high- and low-risk groups based on their respective risk scores. The Kaplan-Meier method was employed to generate survival curves, and the statistical significance was determined through log-rank tests. Furthermore, a total of 18 prognostic models pertaining to breast cancer were obtained, and the C-index of each model within each cohort was calculated to assess the prognostic predictive capability of the entire signature. To evaluate the clinical utility of the model, we conducted a comprehensive analysis of clinical data obtained from samples in the TCGA and GEO cohorts. This analysis encompassed various factors such as Age, ER, PR, Grade, HER2, Stage, N, and Lymph node status. Furthermore, we compared the prognostic predictive capabilities of these clinical variables with those of risk scores, as measured by the C-index.\u003c/p\u003e\u003ch2\u003eMutational landscape analysis\u003c/h2\u003e\u003cp\u003eThe Mutation Annotation Format (MAF) was acquired using the \"maftools\" R package to illustrate the mutation landscape between high- and low-risk groups. 
Subsequently, we examined the associations of gene mutations through co-occurrences.\u003c/p\u003e\u003ch2\u003eAnalysis of immune-omics molecular characterization\u003c/h2\u003e\u003cp\u003eThe infiltration of various immune cells was compared between the high- and low-risk groups in the immune-related database, utilizing the IOBR package. Specifically, the distribution of M0 and M2 macrophages between these groups was examined. ImmuCellAI was used to compare the differences in immune infiltration between the high- and low-risk groups. The Wilcoxon test was used to compare the differences in immune function between the two groups, and to further compare the differential expression of ICDs and HLA family genes. Additionally, the anti-cancer immune status of tumor tissue and normal tissue was inferred by analyzing the seven stages of the Cancer-Immunity Cycle: the release of cancer cell antigens (Step 1), the presentation of cancer antigens (Step 2), the priming and activation of immune responses (Step 3), the migration of immune cells to tumor sites (Step 4), the infiltration of immune cells into tumors (Step 5), the recognition of cancer cells by T cells (Step 6), and the subsequent elimination of cancer cells (Step 7). The relative abundance of stromal cells, immune cells, and tumor cells was assessed in high- and low-risk groups using the \"estimate\" R package. Additionally, the immune checkpoint conditions and immune function of these two groups were analyzed and visualized using the \"ggplot2\" R package.\u003c/p\u003e\u003ch2\u003eDrug sensitivity analysis\u003c/h2\u003e\u003cp\u003eThe R package \"oncoPredict\" was utilized to predict drug sensitivity by analyzing gene expression levels. 
The half-maximal inhibitory concentrations (IC50) of chemotherapeutic drugs were calculated using the aforementioned \"oncoPredict\" R package. TIDE and ImmuCellAI analyses were performed to assess the immunotherapy sensitivity of the high- and low-risk groups.\u003c/p\u003e\u003ch2\u003eCell culture\u003c/h2\u003e\u003cp\u003eHBL-100 human normal breast epithelial cells and the MDA-MB-231, HCC1806, MCF-7, and BT-474 human breast cancer cell lines were provided by the Cell Resource Center of Shanghai Life Sciences Institute. The cells were cultured in DMEM or RPMI-1640 (Gibco BRL, USA) supplemented with 10% fetal bovine serum (FBS) (Gibco BRL, USA) and grown at 37°C with 5% CO2.\u003c/p\u003e\u003ch2\u003eQuantitative real-time PCR (qRT-PCR)\u003c/h2\u003e\u003cp\u003eTotal RNA was isolated from breast cancer cells using TRIzol reagent (Invitrogen, 15596018) following the manufacturer's instructions. The concentration and quality of the total RNA were assessed using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Subsequently, the total RNA was reverse-transcribed using a PrimeScript RT reagent kit (Takara Bio Inc., Japan). The resulting cDNA was combined with SYBR Green qPCR Master Mix (Takara Bio), primers, and Diethyl Pyrocarbonate (DEPC; Beyotime Institute of Biotechnology, China) water to achieve a final volume of 10 µL. The experiment was conducted in 96-well plates utilizing the LightCycler 480® Real Time PCR System (Roche Diagnostics, Switzerland). Each reaction was replicated at least three times. The mRNA expression levels were normalized to the levels of glyceraldehyde 3-phosphate dehydrogenase (GAPDH). Nonspecific amplification was monitored by melting curve analysis. 
The primer sequences are listed in \u003cb\u003eAdditional table \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e\u003c/b\u003e.\u003c/p\u003e\u003ch2\u003eSmall-interfering RNA (siRNA) transfection\u003c/h2\u003e\u003cp\u003eA non-specific scramble siRNA served as the control. The experimental siRNA (RiboBio, Guangzhou, China) was transfected into breast cancer cells at 80% confluency using Lipofectamine 3000 (Invitrogen, CA, USA) as per the manufacturer's instructions. Following transfection, the cells were incubated for 48 hours in a culture incubator.\u003c/p\u003e\u003ch2\u003eColony formation\u003c/h2\u003e\u003cp\u003eA total of 1×10\u003csup\u003e3\u003c/sup\u003e cells were transfected and subsequently cultured in 6-well plates for two weeks. Once colonies had formed, the cells were rinsed and fixed with a 4% paraformaldehyde (PFA) solution for 15 minutes. The cells were then stained with crystal violet (Solarbio, China) for 20 minutes.\u003c/p\u003e\u003ch2\u003ePan-cancer analysis\u003c/h2\u003e\u003cp\u003eThe biogenic analysis tool sxdyc.com was utilized to investigate the differential expression of SHMT2 in normal and cancerous tissues, as well as its associations with stemness score, immune infiltration, immune inflammation, ICB, and TMB.\u003c/p\u003e\u003ch2\u003eCCK8 assay\u003c/h2\u003e\u003cp\u003eCells were initially plated onto 96-well plates at a density of 2×10\u003csup\u003e3\u003c/sup\u003e cells and allowed to incubate overnight. Subsequently, the cells were incubated for varying durations (1, 2, 3, 4, 5, 6 days) at 37°C and 5% CO2. Following this incubation period, 10 µL of the CCK8 labeling reagent (0.5 mg/mL, Dojindo, Japan) was added to each well, and the cells were incubated for an additional 2 h at 37°C and 5% CO2. 
The absorbance at 450 nm was then measured using a microplate reader (Thermo Scientific, Shanghai, China).\u003c/p\u003e\u003ch2\u003eMigration and invasion experiments\u003c/h2\u003e\u003cp\u003eChambers with a pore size of 8 µm (CORNING) were employed in the cell migration and invasion assay, either with or without matrigel (CORNING). Approximately 5×10\u003csup\u003e4\u003c/sup\u003e MCF-7 cells were seeded into the upper chamber and incubated in serum-free media containing varying peptide concentrations or an Akt inhibitor. Subsequently, DMEM with 10% fetal bovine serum was added to the lower chamber. Following a 48-hour incubation period at 37°C with 95% air and 5% CO2, the membranes were fixed using a 4% paraformaldehyde solution for 20 minutes. Subsequently, the membranes were stained using a crystal violet solution for 15 minutes. The cells remaining in the upper chambers were gently wiped away using a cotton swab. The average counts of migrated and invaded cells were assessed in five randomly selected fields under a light microscope at a magnification of 100x.\u003c/p\u003e\u003ch2\u003eStatistical analysis\u003c/h2\u003e\u003cp\u003eR (4.3.1) and Strawberry Perl (v5.32.1) were used to generate all the results and figures. The specific research methods and R packages used are described above. Data from three independent experiments are presented as mean ± standard deviation (SD). Student’s t-tests were utilized for comparisons between the high and low groups (*P \u0026lt; 0.05, **P \u0026lt; 0.01, ***P \u0026lt; 0.001).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe American Joint Committee on Cancer (AJCC) staging systems are commonly employed in clinical practice to manage various aspects of BRCA treatment, including decision-making and monitoring strategies (\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e). 
However, this approach may be insufficient and may lead to overtreatment or undertreatment because of the inherent heterogeneity of BRCA (\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e) (\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e). In addition, existing methods for predicting prognosis and drug sensitivity in patients with breast cancer are inadequate (\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e). Progress in molecular biology and immunology has expanded the range of treatment options for breast cancer, necessitating improved personalized evaluation techniques to guide clinical decision-making. For example, implementing a 21-gene test in premenopausal women enables the assessment of prognostic risk, potentially sparing certain patients from undergoing chemotherapy (\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e). Hence, the investigation of dependable and universally efficacious biomarkers for precise prognosis prediction in patients with breast cancer holds a pivotal position in clinical diagnosis and treatment. Dysfunction of mitochondria and lysosomes, which are crucial organelles within breast cancer cells, has a substantial impact on the onset and progression of breast cancer (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e) (\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e) (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e). However, the impact of mitochondrial and lysosomal genes on breast cancer treatment remains unclear. 
Our study was conducted to elucidate the significance of these genes and to assess the correlation between breast cancer-associated mitochondrial and lysosomal genes and prognosis, TME, and drug effectiveness.\u003c/p\u003e \u003cp\u003eThis study employed WGCNA and univariate Cox regression to discern the MLRGs implicated in the prognosis of patients with breast cancer. Subsequently, we fitted 101 machine-learning model combinations to gene expression profiles from the training dataset and validated the results using three independent datasets. The CoxBoost\u0026thinsp;+\u0026thinsp;survival-SVM model was chosen as the optimal approach for further analysis. Currently, integrating algorithms, such as artificial intelligence, with extensive biological data is a significant approach to investigating the correlation between diseases and genes. Its strength lies in using machine learning to identify the most suitable prognostic model for BRCA and streamline the model effectively (\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e, \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e). Nonetheless, overfitting is an important issue and should not be disregarded during the model construction phase. We employed the average C-index of multiple validation cohorts as the ranking criterion to mitigate overfitting to the training set and ensure the generalizability of the model. Our optimal model was then compared against previously published models and against the predictive effectiveness of various clinical features, thereby demonstrating the superiority of our model (\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e) (\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e) (\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e).\u003c/p\u003e \u003cp\u003eThe TME has a significant impact on breast cancer cells. 
Varied immune cell infiltration can influence the prognosis of patients with breast cancer (\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e). Our findings indicated a significantly higher level of immune infiltration in the low-risk group than in the high-risk group. Previous research has documented that M0 and M2 macrophages are strongly associated with a poor prognosis in breast cancer (\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e). Our study findings indicate a positive correlation between M0 and M2 macrophage infiltration and risk scores. T cells, B cells, and NK cells are immune cells with anti-tumor properties, and the augmentation of mitochondrial lysosome activity serves to bolster their anti-tumor efficacy(\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e) (\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e). Our study revealed diminished infiltration of these cells in high-risk patients. Nevertheless, sc-RNA analysis revealed a notable increase in mitochondrial lysosome activity among immune cells, with the exception of B cells. This finding suggests that B cells may play a pivotal role in the development of anti-tumor immune deficiency in high-risk patients. Enhancing B cell infiltration and mitochondrial lysosome activity emerges as a crucial immunotherapy strategy to enhance the prognosis of high-risk patients.\u003c/p\u003e \u003cp\u003ePatients with elevated TMB levels may exhibit an enhanced responsiveness to immunotherapy (\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e). Consequently, we examined TMB disparities between cohorts categorized as high- and low-risk, revealing a positive correlation between TMB levels and risk scores. 
Patients with elevated TMB demonstrated poorer prognosis, potentially attributable to the limited utilization of immunotherapy in breast cancer treatment. Hence, an additional assessment was conducted to examine the responsiveness of patients classified into high- and low-risk groups toward immunotherapy. The TIDE score was lower in the high-risk group than in the low-risk group, reinforcing that the high-risk group displayed a heightened sensitivity to immunotherapy. Similar outcomes were observed using ImmuCellAI for analysis.\u003c/p\u003e \u003cp\u003eWe analyzed the sensitivity to chemotherapy drugs in the high- and low-risk groups. We evaluated the effectiveness of the commonly used clinical chemotherapy drugs in these groups. Our findings indicate that patients classified as low-risk demonstrated sensitivity to dactinomycin, whereas those classified as high-risk demonstrated sensitivity to docetaxel, afatinib, osimertinib, savolitinib, and erlotinib. Afatinib, osimertinib, savolitinib, and erlotinib are tyrosine kinase inhibitors that have significantly impacted the treatment of various tumors(\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e) (\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e) (\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e, \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e). Anastrozole, which inhibits the estrogen signaling pathway and reduces the stimulatory effects of estrogen on cancer cells, is also commonly used in breast cancer treatment (\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e). 
Additionally, drugs such as lapatinib target HER2 receptors and other related tyrosine kinases, demonstrating therapeutic benefits for HER2-positive breast cancer patients (\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e) (\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e) (\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e). Furthermore, combining tyrosine kinase inhibitors with certain chemotherapy and immunotherapy drugs has been shown to enhance efficacy and potentially delay the onset of drug resistance (\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e) (\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e). Thus, the administration of tyrosine kinase inhibitors in conjunction with standard therapy may improve the prognosis of high-risk patients.\u003c/p\u003e \u003cp\u003eWe compared the HR values of the model genes across datasets and found that SHMT2 and CISD1 were significantly correlated with prognosis in the TCGA and GEO datasets. SHMT2 has a higher mutation frequency in breast cancer; therefore, we selected SHMT2 for subsequent analyses. We validated the SHMT2 mRNA expression in various breast cancer cell lines. Our findings revealed upregulation of SHMT2 mRNA expression across different breast cancer cell lines. Moreover, a comprehensive pan-cancer analysis demonstrated differential SHMT2 expression in most cancer types, including BLCA, CESC, and COAD. Previous studies have shown that increased SHMT2 expression causes mitochondrial dysfunction, contributing to cell survival (\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e) (\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e). Cell-based experiments revealed that downregulating SHMT2 expression reduces the proliferation, invasion, and migration of BRCA cells. 
These findings underscore the significance of SHMT2 as a potential target for anti-breast cancer strategies, aligning with the outcomes of previous studies.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eOur study demonstrated that the developed model could predict the outcomes of patients with breast cancer, thereby facilitating the formulation of customized treatment approaches for individuals with varying risk profiles. Furthermore, we substantiated the involvement of SHMT2 in BRCA through cellular experiments, thereby aiding in identifying potential genes suitable for personalized precision breast cancer therapy.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eML, machine learning; mlMSGs, mitochondrial and lysosome-related model signature genes; OXPHOS, oxidative phosphorylation; TNBC, triple negative breast cancer; CTSD, Cathepsin D; DEGs, differentially expressed genes; MLRGs, mitochondria and lysosome-related genes; CNV, copy number variation; SNPs, single nucleotide polymorphisms; TME, tumor microenvironment; TMB, tumor mutational burden; IOBR, ImmunoOncology Biology Research; IBC, immune checkpoints; TIP, tumor immunophenotype; ICIs, immune checkpoint inhibitors; TCGA, The Cancer Genome Atlas; GEO, the Gene Expression Omnibus; PCs, principal components; t-SNE, t-distributed stochastic neighbor embedding; GO, Gene Ontology; AUC, area under the curve; WGCNA, weighted gene co-expression network analysis; Enet, elastic net; survival-SVMs, survival support vector machines; GBMs, generalized boosted regression models; SuperPC, supervised principal components; plsRcox, partial least squares regression for Cox; MAF, Mutation Annotation Format; IC50, half-maximal inhibitory concentrations; FBS, fetal bovine serum; qRT-PCR, quantitative real-time PCR; DEPC, diethyl pyrocarbonate; GAPDH, glyceraldehyde 3-phosphate dehydrogenase; siRNA, small-interfering RNA; PFA, paraformaldehyde; SD, standard deviation; AJCC, The American Joint Committee on 
Cancer\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study was proposed and designed by JHP. JHP and HLC were primarily responsible for drafting the manuscript. Analyses were conducted by JHP, HLC, and ZHW. Experiments were carried out by ZHW and JLS. The manuscript was revised by JHP, HLC, and ZHW. All authors have reviewed and approved the final version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDeclaration of competing interest\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors assert that they do not have any competing financial interests or personal relationships that could be perceived as influencing the work presented in this paper.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by the Jiangsu Provincial People\u0026apos;s Hospital youth talent project (Project No. PY2022023).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll data generated or analyzed during this study are included in this article. Further enquiries can be directed to the corresponding author.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContributor Information\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePeng Jinghui, Email:
[email protected]\u003c/p\u003e\n\u003cp\u003eShi jiale, Email:
[email protected]\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eStrobl S, Korkmaz B, Devyatko Y, Schuetz M, Exner R, Dubsky PC, et al. Adjuvant Bisphosphonates and Breast Cancer Survival. Annu Rev Med. 2016;67:1-10.\u003c/li\u003e\n\u003cli\u003eAsleh K, Negri GL, Spencer Miko SE, Colborne S, Hughes CS, Wang XQ, et al. Proteomic analysis of archival breast cancer clinical specimens identifies biological subtypes with distinct survival outcomes. Nat Commun. 2022;13(1):896.\u003c/li\u003e\n\u003cli\u003eMatos Do Canto L, Marian C, Varghese RS, Ahn J, Da Cunha PA, Willey S, et al. Metabolomic profiling of breast tumors using ductal fluid. Int J Oncol. 2016;49(6):2245-54.\u003c/li\u003e\n\u003cli\u003eXie J, Yang Y, Gao Y, He J. Cuproptosis: mechanisms and links with cancers. Mol Cancer. 2023;22(1):46.\u003c/li\u003e\n\u003cli\u003eMargolin AA, Bilal E, Huang E, Norman TC, Ottestad L, Mecham BH, et al. Systematic analysis of challenge-driven improvements in molecular prognostic models for breast cancer. Sci Transl Med. 2013;5(181):181re1.\u003c/li\u003e\n\u003cli\u003eMiotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 2018;19(6):1236-46.\u003c/li\u003e\n\u003cli\u003eKim CK, Choi JW, Jiao Z, Wang D, Wu J, Yi TY, et al. An automated COVID-19 triage pipeline using artificial intelligence based on chest radiographs and clinical data. NPJ Digit Med. 2022;5(1):5.\u003c/li\u003e\n\u003cli\u003eBao X, Zhang J, Huang G, Yan J, Xu C, Dou Z, et al. The crosstalk between HIFs and mitochondrial dysfunctions in cancer development. Cell Death Dis. 2021;12(2):215.\u003c/li\u003e\n\u003cli\u003eFassl A, Brain C, Abu-Remaileh M, Stukan I, Butter D, Stepien P, et al. Increased lysosomal biomass is responsible for the resistance of triple-negative breast cancers to CDK4/6 inhibition. Sci Adv. 
2020;6(25):eabb2210.\u003c/li\u003e\n\u003cli\u003eKetterer S, Mitschke J, Ketscher A, Schlimpert M, Reichardt W, Baeuerle N, et al. Cathepsin D deficiency in mammary epithelium transiently stalls breast cancer by interference with mTORC1 signaling. Nat Commun. 2020;11(1):5133.\u003c/li\u003e\n\u003cli\u003eJin S. Autophagy, mitochondrial quality control, and oncogenesis. Autophagy. 2006;2(2):80-4.\u003c/li\u003e\n\u003cli\u003ePeng W, Wong YC, Krainc D. Mitochondria-lysosome contacts regulate mitochondrial Ca(2+) dynamics via lysosomal TRPML1. Proc Natl Acad Sci U S A. 2020;117(32):19266-75.\u003c/li\u003e\n\u003cli\u003eQiu P, Guo Q, Yao Q, Chen J, Lin J. Characterization of Exosome-Related Gene Risk Model to Evaluate the Tumor Immune Microenvironment and Predict Prognosis in Triple-Negative Breast Cancer. Front Immunol. 2021;12:736030.\u003c/li\u003e\n\u003cli\u003eQiu P, Guo Q, Pan K, Chen J, Lin J. A pyroptosis-associated gene risk model for predicting the prognosis of triple-negative breast cancer. Front Oncol. 2022;12:890242.\u003c/li\u003e\n\u003cli\u003eJiang M, Wu X, Bao S, Wang X, Qu F, Liu Q, et al. Immunometabolism characteristics and a potential prognostic risk model associated with TP53 mutations in breast cancer. Front Immunol. 2022;13:946468.\u003c/li\u003e\n\u003cli\u003ePu S, Zhou Y, Xie P, Gao X, Liu Y, Ren Y, et al. Identification of necroptosis-related subtypes and prognosis model in triple negative breast cancer. Front Immunol. 2022;13:964118.\u003c/li\u003e\n\u003cli\u003eWang X, Wang N, Zhong LLD, Su K, Wang S, Zheng Y, et al. Development and Validation of a Risk Prediction Model for Breast Cancer Prognosis Based on Depression-Related Genes. Front Oncol. 2022;12:879563.\u003c/li\u003e\n\u003cli\u003eZhou Z, Deng J, Pan T, Zhu Z, Zhou X, Lv C, et al. Prognostic Significance of Cuproptosis-Related Gene Signatures in Breast Cancer Based on Transcriptomic Data Analysis. Cancers (Basel). 
2022;14(23).\u003c/li\u003e\n\u003cli\u003eLi X, Cao Y, Yu X, Jin F, Li Y. A novel autophagy-related genes prognostic risk model and validation of autophagy-related oncogene VPS35 in breast cancer. Cancer Cell Int. 2021;21(1):265.\u003c/li\u003e\n\u003cli\u003eLi L, Li L, Liu M, Li Y, Sun Q. Novel immune-related prognostic model and nomogram for breast cancer based on ssGSEA. Front Genet. 2022;13:957675.\u003c/li\u003e\n\u003cli\u003eLu X, Gou Z, Yu L, Bu H. A novel risk model based on immune response predicts clinical outcomes and characterizes immunophenotypes in triple-negative breast cancer. Am J Cancer Res. 2022;12(8):3913-31.\u003c/li\u003e\n\u003cli\u003eChen L, Dong Y, Pan Y, Zhang Y, Liu P, Wang J, et al. Identification and development of an independent immune-related genes prognostic model for breast cancer. BMC Cancer. 2021;21(1):329.\u003c/li\u003e\n\u003cli\u003eTao D, Wang Y, Zhang X, Wang C, Yang D, Chen J, et al. Identification of Angiogenesis-Related Prognostic Biomarkers Associated With Immune Cell Infiltration in Breast Cancer. Front Cell Dev Biol. 2022;10:853324.\u003c/li\u003e\n\u003cli\u003eGeng S, Fu Y, Fu S, Wu K. A tumor microenvironment-related risk model for predicting the prognosis and tumor immunity of breast cancer patients. Front Immunol. 2022;13:927565.\u003c/li\u003e\n\u003cli\u003eFeng L, Jin F. Screening of differentially methylated genes in breast cancer and risk model construction based on TCGA database. Oncol Lett. 2018;16(5):6407-16.\u003c/li\u003e\n\u003cli\u003eLiu Z, Ding M, Qiu P, Pan K, Guo Q. Natural killer cell-related prognostic risk model predicts prognosis and treatment outcomes in triple-negative breast cancer. Front Immunol. 2023;14:1200282.\u003c/li\u003e\n\u003cli\u003eYe Z, Zou S, Niu Z, Xu Z, Hu Y. A Novel Risk Model Based on Lipid Metabolism-Associated Genes Predicts Prognosis and Indicates Immune Microenvironment in Breast Cancer. Front Cell Dev Biol. 
2021;9:691676.\u003c/li\u003e\n\u003cli\u003eGinter PS, Idress R, D\u0026apos;Alfonso TM, Fineberg S, Jaffer S, Sattar AK, et al. Histologic grading of breast carcinoma: a multi-institution study of interobserver variation using virtual microscopy. Mod Pathol. 2021;34(4):701-9.\u003c/li\u003e\n\u003cli\u003eZhang L, Dong D, Li H, Tian J, Ouyang F, Mo X, et al. Development and validation of a magnetic resonance imaging-based model for the prediction of distant metastasis before initial treatment of nasopharyngeal carcinoma: A retrospective cohort study. EBioMedicine. 2019;40:327-35.\u003c/li\u003e\n\u003cli\u003eWong KY, Fan C, Tanioka M, Parker JS, Nobel AB, Zeng D, et al. I-Boost: an integrative boosting approach for predicting survival time with multiple genomics platforms. Genome Biol. 2019;20(1):52.\u003c/li\u003e\n\u003cli\u003eKong J, Lee H, Kim D, Han SK, Ha D, Shin K, et al. Network-based machine learning in colorectal and bladder organoid models predicts anti-cancer drug efficacy in patients. Nat Commun. 2020;11(1):5485.\u003c/li\u003e\n\u003cli\u003eSparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Adjuvant Chemotherapy Guided by a 21-Gene Expression Assay in Breast Cancer. N Engl J Med. 2018;379(2):111-21.\u003c/li\u003e\n\u003cli\u003eLiu P, Deng X, Zhou H, Xie J, Kong Y, Zou Y, et al. Multi-omics analyses unravel DNA damage repair-related clusters in breast cancer with experimental validation. Front Immunol. 2023;14:1297180.\u003c/li\u003e\n\u003cli\u003eChu G, Ji X, Wang Y, Niu H. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids. 2023;33:110-26.\u003c/li\u003e\n\u003cli\u003eLiu Z, Liu L, Weng S, Guo C, Dang Q, Xu H, et al. Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in colorectal cancer. Nat Commun. 
2022;13(1):816.\u003c/li\u003e\n\u003cli\u003eAdams S, Gray RJ, Demaria S, Goldstein L, Perez EA, Shulman LN, et al. Prognostic value of tumor-infiltrating lymphocytes in triple-negative breast cancers from two phase III randomized adjuvant breast cancer trials: ECOG 2197 and ECOG 1199. J Clin Oncol. 2014;32(27):2959-66.\u003c/li\u003e\n\u003cli\u003eAli HR, Chlon L, Pharoah PD, Markowetz F, Caldas C. Patterns of Immune Infiltration in Breast Cancer and Their Clinical Implications: A Gene-Expression-Based Retrospective Study. PLoS Med. 2016;13(12):e1002194.\u003c/li\u003e\n\u003cli\u003eLi X, Lu M, Yuan M, Ye J, Zhang W, Xu L, et al. CXCL10-armed oncolytic adenovirus promotes tumor-infiltrating T-cell chemotaxis to enhance anti-PD-1 therapy. Oncoimmunology. 2022;11(1):2118210.\u003c/li\u003e\n\u003cli\u003ePena-Romero AC, Orenes-Pinero E. Dual Effect of Immune Cells within Tumour Microenvironment: Pro- and Anti-Tumour Effects and Their Triggers. Cancers (Basel). 2022;14(7).\u003c/li\u003e\n\u003cli\u003eLargeot A, Pagano G, Gonder S, Moussay E, Paggetti J. The B-side of Cancer Immunity: The Underrated Tune. Cells. 2019;8(5).\u003c/li\u003e\n\u003cli\u003eAllgauer M, Budczies J, Christopoulos P, Endris V, Lier A, Rempel E, et al. Implementing tumor mutational burden (TMB) analysis in routine diagnostics-a primer for molecular pathologists and clinicians. Transl Lung Cancer Res. 2018;7(6):703-15.\u003c/li\u003e\n\u003cli\u003eLai E, Puzzoni M, Ziranu P, Pretta A, Impera V, Mariani S, et al. New therapeutic targets in pancreatic cancer. Cancer Treat Rev. 2019;81:101926.\u003c/li\u003e\n\u003cli\u003eWu S, Luo M, To KKW, Zhang J, Su C, Zhang H, et al. Intercellular transfer of exosomal wild type EGFR triggers osimertinib resistance in non-small cell lung cancer. Mol Cancer. 2021;20(1):17.\u003c/li\u003e\n\u003cli\u003eChoueiri TK, Heng DYC, Lee JL, Cancel M, Verheijen RB, Mellemgaard A, et al. 
Efficacy of Savolitinib vs Sunitinib in Patients With MET-Driven Papillary Renal Cell Carcinoma: The SAVOIR Phase 3 Randomized Clinical Trial. JAMA Oncol. 2020;6(8):1247-55.\u003c/li\u003e\n\u003cli\u003eXie G, Zhu A, Gu X. Converged DNA Damage Response Renders Human Hepatocellular Carcinoma Sensitive to CDK7 Inhibition. Cancers (Basel). 2022;14(7).\u003c/li\u003e\n\u003cli\u003eTyutyunyk-Massey L, Gewirtz DA. Roles of autophagy in breast cancer treatment: Target, bystander or benefactor. Semin Cancer Biol. 2020;66:155-62.\u003c/li\u003e\n\u003cli\u003eFakhri S, Moradi SZ, Farzaei MH, Bishayee A. Modulation of dysregulated cancer metabolism by plant secondary metabolites: A mechanistic review. Semin Cancer Biol. 2022;80:276-305.\u003c/li\u003e\n\u003cli\u003eGaynor N, Crown J, Collins DM. Immune checkpoint inhibitors: Key trials and an emerging role in breast cancer. Semin Cancer Biol. 2022;79:44-57.\u003c/li\u003e\n\u003cli\u003eBiancolella M, Testa B, Baghernajad Salehi L, D\u0026apos;Apice MR, Novelli G. Genetics and Genomics of Breast Cancer: update and translational perspectives. Semin Cancer Biol. 2021;72:27-35.\u003c/li\u003e\n\u003cli\u003eKok PS, Cho D, Yoon WH, Ritchie G, Marschner I, Lord S, et al. Validation of Progression-Free Survival Rate at 6 Months and Objective Response for Estimating Overall Survival in Immune Checkpoint Inhibitor Trials: A Systematic Review and Meta-analysis. JAMA Netw Open. 2020;3(9):e2011809.\u003c/li\u003e\n\u003cli\u003eZhu R, Li L, Nguyen B, Seo J, Wu M, Seale T, et al. FLT3 tyrosine kinase inhibitors synergize with BCL-2 inhibition to eliminate FLT3/ITD acute leukemia cells through BIM activation. Signal Transduct Target Ther. 2021;6(1):186.\u003c/li\u003e\n\u003cli\u003eZhang Y, Liu Z, Wang X, Jian H, Xiao H, Wen T. SHMT2 promotes cell viability and inhibits ROS-dependent, mitochondrial-mediated apoptosis via the intrinsic signaling pathway in bladder cancer cells. Cancer Gene Ther. 
2022;29(10):1514-27.\u003c/li\u003e\n\u003cli\u003eRon-Harel N, Santos D, Ghergurovich JM, Sage PT, Reddy A, Lovitch SB, et al. Mitochondrial Biogenesis and Proteome Remodeling Promote One-Carbon Metabolism for T Cell Activation. Cell Metab. 2016;24(1):104-17.\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Breast cancer, mitochondrial and lysosomal dysfunction, machine learning, sc-RNA, and immunotherapy","lastPublishedDoi":"10.21203/rs.3.rs-4176718/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4176718/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe impact of mitochondrial and lysosomal co-dysfunction on breast cancer patient outcomes is unclear. The objective of this study is to develop a predictive machine learning (ML) model utilizing mitochondrial and lysosomal co-regulators in order to enhance the prognosis for individuals with BC.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDifferences and correlations of mitochondrial and lysosome related genes were screened and validated. WGCNA and univariate Cox regression were employed to identify prognostic mitochondrial and lysosomal co-regulators. ML was utilized to further selected these regulators as mitochondrial and lysosome-related model signature genes (mlMSGs)and constructed models. The association between the immune and mlMSGs score was investigated through scRNA-seq. Finally, the expression and function of the key gene SHMT2 were confirmed through in vitro experiments.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAccording to the C-index, the coxboost+ Survivor-SVM model was identified as the most suitable for predicting outcomes in BC patients. 
Subsequently, patients were stratified into high and low risk groups based on the model, which demonstrated strong prognostic accuracy. While the overall immunoinfiltration of immune cells was decreased in the high-risk group, it was specifically noted that B cell mlMSGs activity remained diminished in high-risk patients. Additionally, the study found that SHMT2 promoted the proliferation, migration, and invasion of BC cells.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusion\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study shows that the ML model accurately predicts the prognosis of BC patients. Analysis conducted through the model has identified decreased B-cell immune infiltration and reduced mlMSGs activity as significant factors influencing patient prognosis. These results may offer novel approaches for early intervention and prognostic forecasting in BC.\u003c/p\u003e","manuscriptTitle":"Crosstalk between mitochondrial and lysosomal co-regulators defines clinical outcomes of breast cancer by integrating multi-omics and machine learning","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-04-19 17:43:58","doi":"10.21203/rs.3.rs-4176718/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"0f4b6a04-42d8-43c4-acb0-7215cff03e4d","owner":[],"postedDate":"April 19th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-05-20T18:23:42+00:00","versionOfRecord":[],"versionCreatedAt":"2024-04-19 17:43:58","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4176718","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4176718","identity":"rs-4176718","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}