Development of a Prognostic Model for HR-positive HER2-negative and Node-negative Breast Cancer: Integrating Clinical and Transcriptional Biomarkers

Research Article (Research Square preprint)

Xiaoxi Chen, Hongjin Liu, Min Gao, Jingming Ye

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4394836/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Purpose
In this study, a prognostic model was constructed for HR-positive HER2-negative (HR+/HER2–) and node-negative breast cancer by integrating clinical and transcriptional biomarkers, with a particular focus on exploring both main effects and gene-gene (G × G) interactions.

Methods
Univariate and multivariate Cox regression were used to analyze three independent trans-ethnic cohorts with a total of 2180 samples.
Independent prognostic factors were used to construct a prediction model. The Model was validated by ROC curves, calibration curves and decision curve analysis (DCA). The molecular basis of the Model was illustrated by integrating bulk-tumor and single-cell RNAseq datasets.

Results
Our findings revealed that a combination of clinical and transcriptional factors can improve the accuracy of prognostic models for HR+/HER2– and node-negative breast cancer. The Model achieved satisfactory discrimination, with the area under the curve (AUC) ranging from 0.65 (Metabric, 10-year survival) to 0.88 (GSE96058, 3-year survival).

Conclusion
This research provides a powerful tool for predicting outcomes in HR+/HER2– and node-negative breast cancer, offering initial insights into the molecular mechanisms that can guide future investigations.

Keywords: Breast cancer, Hormone receptor-positive, Prognostic prediction, Interaction, Nomogram

Introduction

Breast cancer (BC) is the most prevalent cancer among women worldwide, with an annual incidence increase of 0.5% [1]. HR-positive HER2-negative (HR+/HER2–) disease is the most prevalent subtype, accounting for about 70% of new breast cancer cases [2]. Patients with HR+/HER2– breast cancer generally have a good prognosis. However, approximately one-third of these patients will experience late relapses, even decades after the initial diagnosis [3]. It is essential to predict which patients will have recurrences within 5 years of surgery so that they can be selected to receive adjuvant chemotherapy. For HR+/HER2– breast cancers, quantitative real-time reverse transcription PCR (qRT-PCR)-based multigene assays (e.g., 21-gene or 70-gene assays) can be used to guide adjuvant chemotherapy decisions [4, 5]. In addition, it has recently been shown that, for most cancers, combining clinical and transcriptomic data can improve survival prediction [6].
Also, predictors based on gene-gene (G × G) interactions have the potential to improve prediction accuracy and offer crucial insights into the molecular mechanisms underlying complex disorders [7, 8]. G × G interactions can be used as a basis for the construction of prognostic models [9, 10]. In this study, independent risk factors were selected through univariate and multivariate Cox regression analyses to construct the prognostic model.

Materials and methods

Study population and data collection

To construct the prognostic model, we first collected the expression profiles and clinical information of 2180 patients with HR+/HER2– and node-negative breast cancer from three datasets: GSE96058 (RNAseq, n = 1429) [11], TCGA (RNAseq, n = 182) [12] and Metabric (microarray, n = 569) [13]. The TCGA and Metabric datasets, including both clinical and gene expression data, were downloaded from the cBioPortal for Cancer Genomics [14]. Patients with complete transcriptomic data and information on overall survival were retained. All gene expression values were log-transformed and standardized before entering the association analyses. Two 10× Genomics-based single-cell expression profiling datasets of HR+/HER2– breast cancer (GSE245601, GSE176078) were used to study cellular dynamics and cell-cell interactions [15, 16]. In our subsequent analysis, there were a total of 2180 HR+/HER2– and node-negative breast cancer patients with 13409 genes, whose clinical characteristics are summarized in Table 1.

Table 1. Clinical characteristics of patients with HR+/HER2– and node-negative BC in each dataset.
| Characteristics | GSE96058 | Metabric | TCGA | Overall |
|---|---|---|---|---|
| Biomarker screening | Discovery | | Validation | |
| Model construction | Training | Testing 1 | Testing 2 | |
| Number of samples | 1429 | 569 | 183 | 2181 |
| Age (years): Mean (SD) | 63.9 (12.1) | 60.7 (11.8) | 61.7 (12.6) | 62.9 (12.2) |
| Menopausal state: Post | 0 (0%) | 447 (78.6%) | 138 (75.4%) | 585 (26.8%) |
| Menopausal state: Pre | 0 (0%) | 122 (21.4%) | 37 (20.2%) | 159 (7.3%) |
| Menopausal state: Unknown | 1429 (100%) | 0 (0%) | 8 (4.4%) | 1437 (65.9%) |
| ER status: Negative | 11 (0.8%) | 9 (1.6%) | 5 (2.7%) | 25 (1.1%) |
| ER status: Positive | 1417 (99.2%) | 560 (98.4%) | 178 (97.3%) | 2155 (98.8%) |
| ER status: Unknown | 1 (0.1%) | 0 (0%) | 0 (0%) | 1 (0.0%) |
| PR status: Negative | 81 (5.7%) | 159 (27.9%) | 25 (13.7%) | 265 (12.2%) |
| PR status: Positive | 1287 (90.1%) | 410 (72.1%) | 158 (86.3%) | 1855 (85.1%) |
| PR status: Unknown | 61 (4.3%) | 0 (0%) | 0 (0%) | 61 (2.8%) |
| Histologic subtype: IDC (a) | 0 (0%) | 403 (70.8%) | 121 (66.1%) | 524 (24.0%) |
| Histologic subtype: ILC (b) | 0 (0%) | 47 (8.3%) | 37 (20.2%) | 84 (3.9%) |
| Histologic subtype: IMMC (c) | 0 (0%) | 15 (2.6%) | 9 (4.9%) | 24 (1.1%) |
| Histologic subtype: MDLC (d) | 0 (0%) | 97 (17.0%) | 0 (0%) | 97 (4.4%) |
| Histologic subtype: Other | 0 (0%) | 7 (1.2%) | 16 (8.7%) | 23 (1.1%) |
| Histologic subtype: Unknown | 1429 (100%) | 0 (0%) | 0 (0%) | 1429 (65.5%) |
| Histologic grade: G1 | 328 (23.0%) | 78 (13.7%) | 0 (0%) | 406 (18.6%) |
| Histologic grade: G2 | 770 (53.9%) | 283 (49.7%) | 0 (0%) | 1053 (48.3%) |
| Histologic grade: G3 | 322 (22.5%) | 181 (31.8%) | 0 (0%) | 503 (23.1%) |
| Histologic grade: Unknown | 9 (0.6%) | 27 (4.7%) | 183 (100%) | 219 (10.0%) |
| Tumor size: Mean (SD) | 17.1 (9.30) | 21.9 (11.0) | NA (NA) | 18.5 (10.1) |
| Tumor size: Unknown | 6 (0.4%) | 0 (0%) | 183 (100%) | 189 (8.7%) |
| T stage: I | 1083 (75.8%) | 368 (64.7%) | 72 (39.3%) | 1523 (69.8%) |
| T stage: II | 329 (23.0%) | 195 (34.3%) | 94 (51.4%) | 618 (28.3%) |
| T stage: III | 11 (0.8%) | 6 (1.1%) | 15 (8.2%) | 32 (1.5%) |
| T stage: IV | 0 (0%) | 0 (0%) | 2 (1.1%) | 2 (0.1%) |
| T stage: Unknown | 6 (0.4%) | 0 (0%) | 0 (0%) | 6 (0.3%) |

(a) IDC: invasive ductal carcinoma; (b) ILC: invasive lobular carcinoma; (c) IMMC: invasive mixed mucinous breast carcinoma; (d) MDLC: mixed invasive ductal and lobular carcinoma.

Gene main effects, interactions and Model construction

In total, 13409 mRNAs from the GSE96058, TCGA, and Metabric cohorts were included.
First, to determine their independent prognostic significance, the clinical variables age, menopausal status, histologic subtype, histologic grade, T stage, and tumor size were included in the Cox proportional hazards (Cox-ph) model. Only age was a clinical main effect with independent prognostic significance. Individual genes were then entered into a multivariate Cox-ph model (Model 1), with age as a covariate, to assess their independent prognostic significance.

\(\text{Model 1}: h(t) = h_0(t)\exp\left(\beta_{gene_i} \times gene_i + \beta_{age} \times age\right)\)

To assess the prognostic significance of gene-gene interactions, we selected 761 cancer genes defined by OncoKB [17, 18] that were shared by all three cohorts in this study. Model 2 was used to determine the prognostic significance of G × G interactions.

\(\text{Model 2}: h(t) = h_0(t)\exp\left(\beta_{gene_i} \times gene_i + \beta_{gene_j} \times gene_j + \beta_{ij} \times gene_i \times gene_j + \beta_{age} \times age\right)\)

Candidate genes and interactions were selected by scanning the common genes and the cancer genes, respectively, and were then validated in an independent dataset. Specifically, in the GSE96058 dataset, the false discovery rate was controlled at the 5% level (q-FDR ≤ 5%) for the selection of important genes and interactions by fitting Model 1 and Model 2, respectively. The selected genes and interactions were then validated in the TCGA cohort; only biomarkers with P ≤ 0.05 and the same direction of effect as in the discovery step were retained as candidate biomarkers for the next modelling stage. In the GSE96058 training cohort, the candidate genes and interactions from the screening phase were entered into forward stepwise Cox regression, using the likelihood ratio test with P-entry ≤ 0.05 and P-removal > 0.05, to obtain the final multivariable Cox model.
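The two-stage screening rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is assumed here, and the function names (`benjamini_hochberg`, `screen_candidates`) and the input layout (a dict mapping each biomarker to its Cox coefficient and P value) are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the original order (assumed FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def screen_candidates(discovery, validation, fdr=0.05, alpha=0.05):
    """Keep biomarkers with q <= fdr in discovery that replicate in validation.

    discovery / validation: dict mapping name -> (beta, p_value), where beta is
    the Cox coefficient of the gene (Model 1) or interaction term (Model 2).
    """
    names = list(discovery)
    q_values = benjamini_hochberg([discovery[n][1] for n in names])
    kept = []
    for name, q in zip(names, q_values):
        if q > fdr or name not in validation:
            continue
        beta_disc = discovery[name][0]
        beta_val, p_val = validation[name]
        # Replication requires P <= alpha and the same sign of effect.
        if p_val <= alpha and beta_disc * beta_val > 0:
            kept.append(name)
    return kept
```

For example, a biomarker that passes the discovery FDR threshold but has an opposite-sign coefficient in the validation cohort is dropped even if its validation P value is below 0.05.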
For validation, the area under the receiver operating characteristic curve (AUC) and the concordance index (C-index) in an internal cohort (TCGA) and an external cohort (Metabric) were used to assess the discriminatory performance of the obtained model.

Bioinformatics

Differentially expressed genes (DEGs) were identified using the R package ‘limma’ [19]. The Metascape web tool (https://metascape.org) was used to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis [20]. The R package ‘Seurat’ (v5.0.1) was used to process the single-cell transcriptome profiles of HR+/HER2– breast cancer cases [21, 22]. Briefly, genes expressed in fewer than three cells and cells expressing fewer than 300 genes were excluded from further analysis. Before integration, each expression matrix underwent quality control independently. The R package ‘Harmony’ was used to integrate the samples and remove batch effects [23]. Cell types were annotated with scType [23]. ‘AUCell’ was used to assess the activity of gene sets [24]. ‘Scissor’ was used to analyze the correlation between cell identity and model risk group [25].

Statistics

Continuous variables were presented as mean ± standard deviation, while categorical variables were described in terms of frequency (n) and proportion (%). The Cox-ph model was used to screen for prognostically significant main effects of genes and G × G interactions. Differences in survival were assessed using Kaplan-Meier (K-M) analysis and the log-rank test. The prediction model's accuracy and discriminative ability were assessed using ROC curves with corresponding AUCs, C-indexes, and decision curves [26].
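To make the C-index concrete, the sketch below computes Harrell's concordance index for right-censored data in plain Python. It is an illustration only: the text does not state which estimator or R function was used, the function name `concordance_index` is hypothetical, and tied event times are simply skipped for brevity.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index; a higher score is expected to mean shorter survival.

    times:  follow-up time for each patient
    events: 1 if death was observed, 0 if the patient was censored
    scores: model risk score for each patient
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.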
Decision curve analysis (DCA) was used to assess the clinical value of the predictive model [27]; it computes the clinical net benefit (NB) of assigning interventions based on the model. Clinical value can also be expressed as the net reduction (NR) in interventions, that is, the number of unnecessary interventions avoided under the guidance of a prediction model when compared with the “intervention for all” strategy [28]. Statistical analyses were performed using R (version 4.3.2). A two-sided P value less than or equal to 0.05 was regarded as statistically significant unless otherwise stated.

Results

Construction of the Model

In the first step, in the GSE96058 cohort, 114 main-effect genes and 307 G × G interaction pairs were identified as having a potential association with overall survival. Among them, 3 main-effect genes (P ≤ 0.05) and 16 G × G interaction pairs were validated as candidate transcriptional predictors in the TCGA cohort. Then, from these candidate transcriptional predictors, a final Cox model was constructed in the GSE96058 cohort through forward stepwise regression; it included 2 main-effect genes and 5 G × G interaction pairs.
Through the coefficient estimates therein, the final Cox model for the combined clinical and transcriptional predictors was defined as:

\(\text{Model} = 0.0786 \times \text{AGE} + 0.9677 \times \text{Transcriptional\_Score}\)

\(\begin{aligned} \text{Transcriptional\_Score} = {} & -0.2227 \times \text{C1RL} - 0.1707 \times \text{CELSR1} \\ & - 0.1892 \times \text{MDC1} + 0.1129 \times \text{NFE2L2} - 0.2765 \times \text{MDC1} \times \text{NFE2L2} \\ & - 0.2212 \times \text{ALB} + 0.0137 \times \text{YAP1} - 0.2746 \times \text{ALB} \times \text{YAP1} \\ & - 0.0561 \times \text{FNBP1} - 0.1409 \times \text{TBL1XR1} + 0.2159 \times \text{FNBP1} \times \text{TBL1XR1} \\ & + 0.0602 \times \text{CDH4} - 0.1751 \times \text{GATA6} - 0.1501 \times \text{CDH4} \times \text{GATA6} \\ & + 0.1503 \times \text{FURIN} + 0.0481 \times \text{PAX3} - 0.1611 \times \text{FURIN} \times \text{PAX3} \end{aligned}\)

The prognostic significance of the Model

A predictor score threshold (6.422) was established with the ‘surv_cutpoint()’ function of the ‘survminer’ package [29] in the training cohort to classify patients into low- and high-risk groups and thereby demonstrate the clinical utility of the Model. Patients with a high risk score had a poorer prognosis in the K-M survival analysis. In both the training and testing sets, the predictor score demonstrated a sufficient capacity for discrimination. Compared with the low-risk groups, the high-risk groups in the training cohort (GSE96058), the internal testing cohort (TCGA), and the external testing cohort (Metabric) were associated with poorer survival, with elevated hazard ratios (HRs) (GSE96058: HR = 8.88, 95% CI: 6.16–12.81, \(P = 2.46 \times 10^{-45}\); TCGA: HR = 6.12, 95% CI: 1.54–24.6, \(P = 3.37 \times 10^{-3}\); Metabric: HR = 2.25, 95% CI: 1.49–3.40, \(P = 7.37 \times 10^{-5}\)) (Fig. 1a-c).
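As a worked illustration, the score defined above can be evaluated for a single patient as follows. The coefficients and the 6.422 threshold are taken from the text; everything else is an assumption made for this sketch: the function names are hypothetical, expression inputs are assumed to be the log-transformed, standardized values described in the Methods, age is assumed to be in years, and scores strictly above the threshold are assumed to be labeled high-risk.

```python
# Main-effect coefficients of the Transcriptional_Score (from the text).
MAIN_COEF = {
    "C1RL": -0.2227, "CELSR1": -0.1707,
    "MDC1": -0.1892, "NFE2L2": 0.1129,
    "ALB": -0.2212, "YAP1": 0.0137,
    "FNBP1": -0.0561, "TBL1XR1": -0.1409,
    "CDH4": 0.0602, "GATA6": -0.1751,
    "FURIN": 0.1503, "PAX3": 0.0481,
}
# G x G interaction coefficients (from the text).
PAIR_COEF = {
    ("MDC1", "NFE2L2"): -0.2765,
    ("ALB", "YAP1"): -0.2746,
    ("FNBP1", "TBL1XR1"): 0.2159,
    ("CDH4", "GATA6"): -0.1501,
    ("FURIN", "PAX3"): -0.1611,
}
CUTOFF = 6.422  # threshold reported for the training cohort


def transcriptional_score(expr):
    """expr: dict of gene -> standardized log expression (assumed input scale)."""
    score = sum(coef * expr[gene] for gene, coef in MAIN_COEF.items())
    score += sum(coef * expr[a] * expr[b] for (a, b), coef in PAIR_COEF.items())
    return score


def model_score(age, expr):
    """Combined score: 0.0786 x AGE + 0.9677 x Transcriptional_Score."""
    return 0.0786 * age + 0.9677 * transcriptional_score(expr)


def risk_group(score, cutoff=CUTOFF):
    # Assumption: scores strictly above the cutoff are called high-risk.
    return "high" if score > cutoff else "low"
```

For a patient whose twelve genes all sit at the cohort mean (standardized value 0), the score reduces to 0.0786 × age, so under these assumptions age alone would cross the threshold only above roughly 82 years.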
The Model accurately predicted 36-month and 60-month survival in the GSE96058 training set and the TCGA internal testing set (36-month AUC = 0.875 and 0.831; 60-month AUC = 0.786 and 0.829, respectively) (Fig. 1d, e), and retained predictive ability in the Metabric external test set (120-month AUC = 0.645; 240-month AUC = 0.665) (Fig. 1f). The calibration curves of the prediction Model showed good agreement between predicted and observed survival (Fig. 1g-i). Decision curves showed that at 36 and 60 months of survival, the Model provided a greater net benefit than the basic model. Additionally, the C-index of the Model was 0.805 in the GSE96058 cohort, 0.818 in the TCGA cohort, and 0.649 in the Metabric cohort, with a combined C-index of 0.7302 (95% CI: 0.7047–0.7558) (Fig. 1j). The DCA curve revealed that the Model can yield more clinical net benefit (NB) than the all-or-none intervention strategies. Specifically, using 36-month survival as the endpoint and an appropriate threshold probability (e.g., Pt = 0.15), the Model identified 11 true positive patients per 1,000 who should receive intervention, compared with only 4.7 per 1,000 for the basic model (NB Model = 0.0110 vs NB Basic = 0.0047) (Fig. 2a). Furthermore, compared with the intervention-for-all strategy, the Model showed a higher net reduction (NR) than the basic model (NR Model = 79.4% vs NR Basic = 75.8%). That is, the Model reduced the number of unnecessary clinical interventions by 79.4% without missing treatment for any patient truly at high risk of mortality, compared with 75.8% for the basic model (Fig. 2b). As the decision curve indicates, the Model has an obvious net benefit for almost all threshold probabilities. For 36- and 60-month survival, the Model had the best average NB and NR (36-month: NB = 0.0103, NR = 76.9%; 60-month: NB = 0.0302, NR = 54.1%), suggesting consistent utility and applicability for clinical implementation (Fig. 2a-d).
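The NB and NR quantities reported above can be illustrated with the standard decision-curve formulas. The sketch below is a simplification and not the authors' code: it assumes a binary outcome with no censoring (a survival-time analysis would need Kaplan-Meier-based estimates), uses the formulas from the decision curve analysis literature cited in the Methods, and the function names are hypothetical.

```python
def net_benefit(outcomes, risks, pt):
    """Net benefit of intervening on patients whose predicted risk is >= pt.

    outcomes: 1 if the event (e.g., death by 36 months) occurred, else 0
    risks:    predicted event probabilities from the model
    pt:       threshold probability
    """
    n = len(outcomes)
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r >= pt and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * pt / (1 - pt)


def net_benefit_treat_all(outcomes, pt):
    """Net benefit of the 'intervention for all' strategy."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


def net_reduction(outcomes, risks, pt):
    """Net reduction in interventions per patient versus intervening on all."""
    gain = net_benefit(outcomes, risks, pt) - net_benefit_treat_all(outcomes, pt)
    return gain / (pt / (1 - pt))
```

Under these formulas, a difference in NB translates into a difference in NR after division by the threshold odds. With the reported values at Pt = 0.15, (0.0110 − 0.0047) / (0.15 / 0.85) ≈ 0.036, which agrees with the reported gap between 79.4% and 75.8%.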
To visualize the Model, a nomogram was constructed, thereby providing a personalized tool to predict individual prognosis (Fig. 2e).

Molecular underpinnings of the Model

To further investigate the molecular mechanisms associated with risk scores, patients in the GSE96058 cohort were divided into high- and low-risk groups. The analysis of DEGs showed that, in the high-risk group, 17 genes were up-regulated (FGF14, CHGB, CARTPT, SERPINI1, PTPRN2, FGF10, FURIN, PCSK1, CPA6, SLC5A8, CST5, BPIFB2, CLEC3A, PIP, CPB1, DCD and TAT) and 28 genes were down-regulated (VTCN1, UBD, TPSB2, SHISA2, SFRP2, SFRP1, OGN, MUCL1, MMP7, MAOB, KRT7, KRT5, KRT17, KRT15, KRT14, JCHAIN, IGLL5, F2RL2, CPXM1, COL3A1, COL17A1, COL14A1, CLIC6, CILP, CCL21, CCL19, APOD and ADH1B). A functional enrichment analysis based on the DEGs showed that, in the high-risk group, the top enriched term for up-regulated genes was extracellular matrix (M5885: NABA_MATRISOME_ASSOCIATED) and the top enriched term for down-regulated genes was intermediate filament organization (GO:0045109). Five genes were enriched in M5885 (CST5, FGF10, FGF14, SERPINI1, and CLEC3A) and nine in GO:0045109 (KRT5, KRT7, KRT14, KRT15, KRT17, COL3A1, COL14A1, COL17A1 and SFRP2) (Fig. 3a, b). For further insight, two scRNA-seq datasets of breast cancer (GSE245601 and GSE176078) were used for downstream integrative analysis. Fourteen clusters were initially found following cell cycle normalization (Fig. 3c). Cell clusters were annotated according to cell localization, the marker database and known cell markers from other studies, and included mesenchymal, cytotoxic T, naive memory B, luminal, mature luminal, perivascular, dendritic, cycling epithelial, epithelial, endothelial, macrophage and myoepithelial cells. The up- and down-regulated genes were collated to characterize the Model high- and low-risk groups.
The up-regulated genes were mainly enriched in epithelial, mature luminal, luminal, and naive memory B cells, whereas the down-regulated genes were mainly enriched in myoepithelial cells (Fig. 4a, b). The cell types expressing each of the enriched genes in M5885 (CST5, FGF10, FGF14, SERPINI1, and CLEC3A) and GO:0045109 (KRT5, KRT7, KRT14, KRT15, KRT17, COL3A1, COL14A1, COL17A1 and SFRP2) were then examined individually. Among the up-regulated genes, FGF10 and FGF14 were expressed in mesenchymal and epithelial cells, CST5 and CLEC3A mainly in epithelial and mature luminal cells, and SERPINI1 mainly in perivascular cells and in a subset of epithelial and mature luminal cells (Fig. 5). Among the down-regulated genes, KRT5, KRT7, KRT14, KRT17 and COL17A1 were expressed mainly in myoepithelial cells, KRT7 and KRT15 also in a subset of luminal cells, and COL3A1, COL14A1 and SFRP2 mainly in mesenchymal cells (Fig. 6).

Discussion

Multiple mechanisms drive prognostic heterogeneity in HR+/HER2– breast cancer patients. Patients at high risk of death may need intensive surveillance and adjuvant therapy. Here, a prognostic model was constructed using G × G interactions, which increase the number of common candidates with predictive significance in small cohorts. Robustness of the Model was ensured through a multi-stage screening process, and the coefficients were determined separately from RNAseq data. This study demonstrated that the Model can classify patients with HR+/HER2–, node-negative invasive breast cancer into low- and high-risk groups. Overall, we present a two-stage synthesis of gene expression data from multiple centers and propose a prognostic scoring method that combines main effects of genes and G × G interactions. This prognostic model was independently confirmed in an HR+/HER2– breast cancer validation cohort. It can effectively distinguish the survival outcomes of patients and significantly improve the prediction accuracy of their prognosis.
G × G interactions provide clues for understanding the biological mechanisms of diseases [30]. It has been shown that G × G interactions can improve the accuracy of prediction models [31, 32]. However, if the effects of interactions are weak or significant interactions are rare, interactions may not significantly improve prediction, although they may still optimize statistical modeling [33]. This study demonstrated that biomarkers with G × G interactions substantially improved the accuracy of prognostic prediction in early stage HR+/HER2– breast cancer, possibly due to increased efficacy.

The biological functions of the biomarkers in the Model are briefly summarized here. Among the genes with significant main effects, CELSR1 has been implicated in breast cancer development [34], and C1RL is a negative biomarker for breast cancer prognosis [35]. Among the gene pairs with significant interactions, MDC1 is a novel estrogen receptor coregulator in invasive breast cancer and inhibits breast cancer by enhancing estrogen receptor-mediated transactivation [36, 37]. NFE2L2 depletion in metastatic cancer cells impairs primary tumor development and formation of lung metastases [38]. ALB encodes the most abundant protein in human blood. Hoogenboezem et al. [39] showed that in the tumor microenvironment, ALB is rapidly taken up by the tumor to compensate for the relative lack of amino acids, thus meeting the high metabolic requirements of rapid tumor proliferation and leading to a reduction in serum ALB levels. In ER-positive patients, elevated YAP1 mRNA levels corresponded to better prognosis [40]. High FNBP1 expression is significantly correlated with favorable survival outcomes [41]. TBLR1/TBL1XR1 acts as an ER co-repressor and inhibits ER-mediated transcriptional activation in breast cell lines, and nuclear TBLR1 overexpression increases migration and invasion of breast cancer cells [42].
CDH4 encodes a Ca2+-dependent intercellular adhesion glycoprotein, and hypermethylation of CDH4 is an independent risk factor for the development of breast cancer [43]. GATA6 is elevated in breast cancer and its expression level positively correlates with metastasis, leading to reduced overall survival [44]. Down-regulation of furin in breast cancer exerts antiproliferative effects by inhibiting IGF-1R maturation [45]. PAX3 is considered a key factor in normal development and tumorigenesis, and acts as an oncogene in breast tumorigenesis [46–48].

Based on the scRNA-seq results, FGF10 was expressed in mesenchymal and epithelial cells. Abolhassani et al. showed that FGF10 plays an important role in epithelial-mesenchymal transition (EMT), which plays a key role in cancer cell invasion and metastasis [49]. FGF10 stimulates FGFR2b to promote receptor cycling, leading to increased breast cancer migration [50]. The lncRNA FGF14-AS2 was down-regulated in breast cancer tissue, and patients with lower FGF14-AS2 expression had more advanced clinical stage [51]. FGF14-AS2 significantly affects breast cancer cell migration, invasion and tumor metastasis [52]. CST5 encodes an inhibitor of several cysteine proteases of the cathepsin family. CST5 was shown to mediate mesenchymal-epithelial transition (MET) [53, 54]. Elevated CLEC3A expression may be associated with the metastatic potential of breast IDC, and CLEC3A knockdown inhibits breast cancer cell growth and metastasis [55]. SERPINI1 is a key regulator of EMT. Down-regulation of SERPINI1 expression in cells results in reverse-EMT changes in protein levels and cell morphology [56]. KRT5 is strongly expressed in the basal layer and is mainly localized to the ductal myoepithelium in normal breast tissue. In basal-like carcinoma, KRT5 indicates poor prognosis [57, 58]. Basal epithelial breast cancer cells expressing KRT14 led collective invasion [59].
In epithelial ovarian cancer, deletion of KRT14 completely eliminates invasive ability [60]. The expression of KRT17 was significantly lower in breast cancer tissues than in normal tissues, and its reduced expression was significantly associated with poor prognosis [61]. KRT5, KRT14, and KRT17 were also highly expressed in triple-negative breast cancers [62]. COL17A1 plays a key regulatory role in the clonal expansion of multilayered intraepithelial transformed cells [63]. High expression of the COL17A1 gene is associated with prolonged survival in patients with invasive breast cancer [64]. KRT7 regulated EMT and cell-matrix adhesion in ovarian cancer [65]. KRT15 expression was positively associated with overall survival in breast cancer patients [66]. Lower KRT15 expression was significantly associated with a worse prognostic outcome [67]. COL3A1 encodes the α1 chain of type III collagen, which is a crucial component of the ECM and important for the tumor microenvironment [68]. COL3A1 expression was elevated in TNBC tissues and cells and was negatively correlated with overall survival, and silencing of COL3A1 exerted antitumor effects [69]. COL14A1 is predictive of brain metastasis in breast cancer [70]. In patients with massive lymph node involvement, the expression level of COL14A1 is high in metastatic tissues [71]. SFRP2 (secreted frizzled-related protein 2) was upregulated in breast cancer patients of all stages compared to healthy individuals [72]. SFRP2 mediated angiogenic responses by stimulating NFAT (nuclear factor of activated T-cells) in human breast carcinoma. Migration of endothelial cells and breast cancer cells can be inhibited by inhibiting SFRP2-stimulated tube formation in vitro [73].

Based on three public breast cancer transcriptomic datasets, a predictive prognostic model was constructed. This model divided patients into low- and high-risk groups to demonstrate its clinical utility.
Patients in the high-risk group had worse overall survival. DEG analysis between the low- and high-risk groups revealed that the top enriched terms were extracellular matrix and intermediate filament organization, respectively. Two single-cell RNAseq datasets of primary breast cancer were used to illustrate the expression of every gene in the top enriched terms in specific cell types.

However, our study does have certain limitations. Firstly, heterogeneity exists among cohorts from different sequencing or microarray platforms. To address this, we applied a standard normal transformation to unify the data, which has been effective to a certain extent. Secondly, some well-recognized prognostic factors are missing in several cohorts. We believe that with more comprehensive clinical data, there is significant potential for improvement in the prognostic prediction model. Finally, due to potential population heterogeneity or the limited sample size of individual datasets, the accuracy improvements are not consistent across the external validation datasets.

Conclusions

The prognostic model incorporating transcriptional biomarkers with both main effects and G × G interactions has high predictive accuracy for survival in early stage HR+/HER2– breast cancer.

Declarations

Funding: The Beijing Medical Award Foundation (Grant number: YXJL-2016-0040-0065).
Competing Interests: The authors have no relevant financial or non-financial interests to disclose.
Author Contributions: XXC and JMY made substantial contributions to conception of the study. XXC, MG and HJL made significant contributions to the analysis and interpretation of data. XXC, MG, HJL and JMY made significant contributions to the drafting or revising of the manuscript.
Data Availability: GEO Database: https://www.ncbi.nlm.nih.gov/geo/; TCGA Database: https://portal.gdc.cancer.gov/; Metabric Database: https://www.cbioportal.org/datasets
Ethics approval and consent to participate: Not applicable.
Consent to publish: Not applicable.

References

1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71:209–49. doi:10.3322/caac.21660.
2. American Cancer Society. Breast Cancer Facts & Figures 2022–2024. Atlanta: American Cancer Society. (2024)
3. Bense RD, Qiu SQ, de Vries EGE, Schröder CP, Fehrmann RSN. Considering the biology of late recurrences in selecting patients for extended endocrine therapy in breast cancer. Cancer Treat Rev. 2018;70:118–26. doi:10.1016/j.ctrv.2018.07.015.
4. Li G, Hu J, Hu G. Biomarker Studies in Early Detection and Prognosis of Breast Cancer. Adv Exp Med Biol. 2017;1026:27–39. doi:10.1007/978-981-10-6020-5_2.
5. Mamounas EP, Tang G, Fisher B, Paik S, Shak S, Costantino JP, Watson D, Geyer CE Jr., Wickerham DL, Wolmark N. Association between the 21-gene recurrence score assay and risk of locoregional recurrence in node-negative, estrogen receptor-positive breast cancer: results from NSABP B-14 and NSABP B-20. J Clin Oncol. 2010;28:1677–83. doi:10.1200/jco.2009.23.7610.
6. Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer. 2022;22:1045. doi:10.1186/s12885-022-10117-1.
7. Li J, Li X, Zhang S, Snyder M. Gene-Environment Interaction in the Era of Precision Medicine. Cell. 2019;177:38–44. doi:10.1016/j.cell.2019.03.004.
8. Zhang R, Lai L, He J, Chen C, You D, Duan W, Dong X, Zhu Y, Lin L, Shen S, Guo Y, Su L, Shafer A, Moran S, Fleischer T, Bjaanaes MM, Karlsson A, Planck M, Staaf J, Helland A, Esteller M, Wei Y, Chen F, Christiani DC. EGLN2 DNA methylation and expression interact with HIF1A to affect survival of early-stage NSCLC. Epigenetics. 2019;14:118–29. doi:10.1080/15592294.2019.1573066.
9. Ji H, Wang F, Liu Z, Li Y, Sun H, Xiao A, Zhang H, You C, Hu S, Liu Y. COVPRIG robustly predicts the overall survival of IDH wild-type glioblastoma and highlights METTL1(+) neural-progenitor-like tumor cell in driving unfavorable outcome. J Transl Med. 2023;21:533. doi:10.1186/s12967-023-04382-2.
10. Chen J, Shen S, Li Y, Fan J, Xiong S, Xu J, Zhu C, Lin L, Dong X, Duan W, Zhao Y, Qian X, Liu Z, Wei Y, Christiani DC, Zhang R, Chen F. APOLLO: An accurate and independently validated prediction model of lower-grade gliomas overall survival and a comparative study of model performance. eBioMedicine. 2022;79:104007. doi:10.1016/j.ebiom.2022.104007.
11. Brueffer C, Vallon-Christersson J, Grabau D, Ehinger A, Hakkinen J, Hegardt C, Malina J, Chen Y, Bendahl PO, Manjer J, Malmberg M, Larsson C, Loman N, Ryden L, Borg A, Saal LH. Clinical Value of RNA Sequencing-Based Classifiers for Prediction of the Five Conventional Breast Cancer Biomarkers: A Report From the Population-Based Multicenter Sweden Cancerome Analysis Network-Breast Initiative. JCO Precis Oncol. 2018;2. doi:10.1200/PO.17.00135.
12. Liu J, Lichtenberg T, Hoadley KA, Poisson LM, Lazar AJ, Cherniack AD, Kovatich AJ, Benz CC, Levine DA, Lee AV, Omberg L, Wolf DM, Shriver CD, Thorsson V, Cancer Genome Atlas Research N, Hu H. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018;173:400–416.e11. doi:10.1016/j.cell.2018.02.052.
13. Rueda OM, Sammut SJ, Seoane JA, Chin SF, Caswell-Jin JL, Callari M, Batra R, Pereira B, Bruna A, Ali HR, Provenzano E, Liu B, Parisien M, Gillett C, McKinney S, Green AR, Murphy L, Purushotham A, Ellis IO, Pharoah PD, Rueda C, Aparicio S, Caldas C, Curtis C. Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature. 2019;567:399–404. doi:10.1038/s41586-019-1007-8.
14. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E, Antipin Y, Reva B, Goldberg AP, Sander C, Schultz N. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401–4. doi:10.1158/2159-8290.CD-12-0095.
15. Kim H, Whitman AA, Wisniewska K, Kakati RT, Garcia-Recio S, Calhoun BC, Franco HL, Perou CM, Spanheimer PM. Tamoxifen Response at Single-Cell Resolution in Estrogen Receptor-Positive Primary Human Breast Tumors. Clin Cancer Res. 2023;29:4894–907. doi:10.1158/1078-0432.Ccr-23-1248.
16. Wu SZ, Al-Eryani G, Roden DL, Junankar S, Harvey K, Andersson A, Thennavan A, Wang C, Torpy JR, Bartonicek N, Wang T, Larsson L, Kaczorowski D, Weisenfeld NI, Uytingco CR, Chew JG, Bent ZW, Chan CL, Gnanasambandapillai V, Dutertre CA, Gluch L, Hui MN, Beith J, Parker A, Robbins E, Segara D, Cooper C, Mak C, Chan B, Warrier S, Ginhoux F, Millar E, Powell JE, Williams SR, Liu XS, O'Toole S, Lim E, Lundeberg J, Perou CM, Swarbrick A. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet. 2021;53:1334–47. doi:10.1038/s41588-021-00911-1.
17. Suehnholz SP, Nissan MH, Zhang H, Kundra R, Nandakumar S, Lu C, Carrero S, Dhaneshwar A, Fernandez N, Xu BW, Arcila ME, Zehir A, Syed A, Brannon AR, Rudolph JE, Paraiso E, Sabbatini PJ, Levine RL, Dogan A, Gao J, Ladanyi M, Drilon A, Berger MF, Solit DB, Schultz N, Chakravarty D. Quantifying the Expanding Landscape of Clinical Actionability for Patients with Cancer. Cancer Discov. 2024;14:49–65. doi:10.1158/2159-8290.Cd-23-0467.
Chakravarty D, Gao J, Phillips S, Kundra R, Zhang H, Wang J, Rudolph JE, Yaeger R, Soumerai T, Nissan MH, Chang MT, Chandarlapaty S, Traina TA, Paik PK, Ho AL, Hantash FM, Grupe A, Baxi SS, Callahan MK, Snyder A, Chi P, Danila DC, Gounder M, Harding JJ, Hellmann MD, Iyer G, Janjigian YY, Kaley T, Levine DA, Lowery M, Omuro A, Postow MA, Rathkopf D, Shoushtari AN, Shukla N, Voss MH, Paraiso E, Zehir A, Berger MF, Taylor BS, Saltz LB, Riely GJ, Ladanyi M, Hyman DM, Baselga J, Sabbatini P, Solit DB, Schultz N. OncoKB: A Precision Oncology Knowledge Base. JCO Precis Oncol. 2017;1–16. 10.1200/po.17.00011 . Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. 10.1093/nar/gkv007 . Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, Benner C, Chanda SK. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523. 10.1038/s41467-019-09234-6 . Heumos L, Schaar AC, Lance C, Litinetskaya A, Drost F, Zappia L, Lucken MD, Strobl DC, Henao J, Curion F, Single-cell Best Practices Consortium, Schiller HB, Theis FJ. Best practices for single-cell analysis across modalities. Nat Rev Genet. 2023;24:550–72. 10.1038/s41576-023-00586-w . Hao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, Hoffman P, Stoeckius M, Papalexi E, Mimitou EP, Jain J, Srivastava A, Stuart T, Fleming LM, Yeung B, Rogers AJ, McElrath JM, Blish CA, Gottardo R, Smibert P, Satija R. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. 10.1016/j.cell.2021.04.048 . Choi JH, In Kim H, Woo HG. scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data. BMC Bioinformatics. 2020;21:342. 10.1186/s12859-020-03700-5 . 
Aibar S, Gonzalez-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine JC, Geurts P, Aerts J, van den Oord J, Atak ZK, Wouters J, Aerts S. SCENIC: single-cell regulatory network inference and clustering. Nat Methods. 2017;14:1083–6. 10.1038/nmeth.4463 . Sun D, Guan X, Moran AE, Wu LY, Qian DZ, Schedin P, Dai MS, Danilov AV, Alumkal JJ, Adey AC, Spellman PT, Xia Z. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat Biotechnol. 2022;40:527–38. 10.1038/s41587-021-01091-3 . Zheng Y, Heagerty PJ. Semiparametric estimation of time-dependent ROC curves for longitudinal marker data. Biostatistics. 2004;5:615–32. 10.1093/biostatistics/kxh013 . Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565–74. 10.1177/0272989x06295361 . Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18. 10.1186/s41512-019-0064-7 . Therneau TM. A Package for Survival Analysis in R. 2020. Cordell HJ. Detecting gene–gene interactions that underlie human diseases. Nat Rev Genet. 2009;10:392–404. 10.1038/nrg2579 . Pharoah PD, Antoniou AC, Easton DF, Ponder BA. Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med. 2008;358:2796–803. 10.1056/NEJMsa0708739 . Khoury MJ, Yang Q, Gwinn M, Little J, Dana Flanders W. An epidemiologic assessment of genomic profiling for measuring susceptibility to common diseases and targeting interventions. Genet Med. 2004;6:38–47. 10.1097/01.gim.0000105751.71430.79 . Aschard H, Chen J, Cornelis MC, Chibnik LB, Karlson EW, Kraft P. Inclusion of gene-gene and gene-environment interactions unlikely to dramatically improve risk prediction for complex diseases. Am J Hum Genet. 2012;90:962–72. 10.1016/j.ajhg.2012.04.017 . 
Liao S, Desouki MM, Gaile DP, Shepherd L, Nowak NJ, Conroy J, Barry WT, Geradts J. Differential copy number aberrations in novel candidate genes associated with progression from in situ to invasive ductal carcinoma of the breast. Genes Chromosomes Cancer. 2012;51:1067–78. 10.1002/gcc.21991 . Ponten F, Jirstrom K, Uhlen M. The Human Protein Atlas–a tool for pathology. J Pathol. 2008;216:387–93. 10.1002/path.2440 . Zou R, Zhong X, Wang C, Sun H, Wang S, Lin L, Sun S, Tong C, Luo H, Gao P, Li Y, Zhou T, Li D, Cao L, Zhao Y. MDC1 Enhances Estrogen Receptor-mediated Transactivation and Contributes to Breast Cancer Suppression. Int J Biol Sci. 2015;11:992–1005. 10.7150/ijbs.10918 . Sottnik JL, Bordeaux EK, Mehrotra S, Ferrara SE, Goodspeed AE, Costello JC, Sikora MJ. Mediator of DNA Damage Checkpoint 1 (MDC1) Is a Novel Estrogen Receptor Coregulator in Invasive Lobular Carcinoma of the Breast. Mol Cancer Res. 2021;19:1270–82. 10.1158/1541-7786.MCR-21-0025 . Wolowczyk C, Neckmann U, Aure MR, Hall M, Johannessen B, Zhao S, Skotheim RI, Andersen SB, Zwiggelaar R, Steigedal TS, Lingjaerde OC, Sahlberg KK, Almaas E, Bjorkoy G. NRF2 drives an oxidative stress response predictive of breast cancer. Free Radic Biol Med. 2022;184:170–84. 10.1016/j.freeradbiomed.2022.03.029 . Hoogenboezem EN, Duvall CL. Harnessing albumin as a carrier for cancer therapies. Adv Drug Deliv Rev. 2018;130:73–89. 10.1016/j.addr.2018.07.011 . Park I, Lee Y, Kim JH, Bae SJ, Ahn SG, Jeong J, Cha YJ. YAP1 Expression in HR + HER2- Breast Cancer: 21-Gene Recurrence Score Analysis and Public Dataset Validation. Cancers (Basel). 2023;15. 10.3390/cancers15205034 . Wang Z, Tian Z, Song X, Zhang J. Membrane tension sensing molecule-FNBP1 is a prognostic biomarker related to immune infiltration in BRCA, LUAD and STAD. BMC Immunol. 2022;23:1. 10.1186/s12865-021-00475-z . Wu X, Zhan Y, Li X, Wei J, Santiago L, Daniels G, Deng F, Zhong X, Chiriboga L, Basch R, Xiong S, Dong Y, Zhang X, Lee P. 
Nuclear TBLR1 as an ER corepressor promotes cell proliferation, migration and invasion in breast and ovarian cancer. Am J Cancer Res. 2016;6:2351–60. Zhang N, Li L, Long Z, Du J, Li S, Yin H, Xie K, Wu Z, Chen Y, Volontovich D, Cheng H, Wang F. Are dietary factors involved in the association of CDH4 methylation and breast cancer risk? Br J Nutr. 2022;127:1868–77. 10.1017/s0007114521002804 . Song Y, Tian T, Fu X, Wang W, Li S, Shi T, Suo A, Ruan Z, Guo H, Yao Y. GATA6 is overexpressed in breast cancer and promotes breast cancer cell epithelial-mesenchymal transition by upregulating slug expression. Exp Mol Pathol. 2015;99:617–27. 10.1016/j.yexmp.2015.10.005 . Farhat D, Léon S, Ghayad SE, Gadot N, Icard P, Le Romancer M, Hussein N, Lincet H. Lipoic acid decreases breast cancer cell proliferation by inhibiting IGF-1R via furin downregulation. Br J Cancer. 2020;122:885–94. 10.1038/s41416-020-0729-6 . Kong HK, Yoon S, Park JH. The regulatory mechanism of the LY6K gene expression in human breast cancer cells. J Biol Chem. 2012;287:38889–900. 10.1074/jbc.M112.394270 . Hsieh MJ, Yao YL, Lai IL, Yang WM. Transcriptional repression activity of PAX3 is modulated by competition between corepressor KAP1 and heterochromatin protein 1. Biochem Biophys Res Commun. 2006;349:573–81. 10.1016/j.bbrc.2006.08.064 . Kong HK, Park SJ, Kim YS, Kim KM, Lee HW, Kang HG, Woo YM, Park EY, Ko JY, Suzuki H, Chun KH, Song E, Jang KY, Park JH. Epigenetic activation of LY6K predicts the presence of metastasis and poor prognosis in breast carcinoma. Oncotarget. 2016;7:55677–89. 10.18632/oncotarget.10972 . Abolhassani A, Riazi GH, Azizi E, Amanpour S, Muhammadnejad S, Haddadi M, Zekri A, Shirkoohi R. FGF10: Type III Epithelial Mesenchymal Transition and Invasion in Breast Cancer Cell Lines. J Cancer. 2014;5:537–47. 10.7150/jca.7797 . Clayton NS, Grose RP. Emerging Roles of Fibroblast Growth Factor 10 in Cancer. Front Genet. 2018;9:499. 10.3389/fgene.2018.00499 . 
Yang F, Liu YH, Dong SY, Ma RM, Bhandari A, Zhang XH, Wang OC. A novel long non-coding RNA FGF14-AS2 is correlated with progression and prognosis in breast cancer. Biochem Biophys Res Commun. 2016;470:479–83. 10.1016/j.bbrc.2016.01.147 . Jin Y, Zhang M, Duan R, Yang J, Yang Y, Wang J, Jiang C, Yao B, Li L, Yuan H, Zha X, Ma C. Long noncoding RNA FGF14-AS2 inhibits breast cancer metastasis by regulating the miR-370-3p/FGF14 axis. Cell Death Discov. 2020;6:103. 10.1038/s41420-020-00334-7 . Alvarez-Díaz S, Valle N, García JM, Peña C, Freije JM, Quesada V, Astudillo A, Bonilla F, López-Otín C, Muñoz A. Cystatin D is a candidate tumor suppressor gene induced by vitamin D in human colon cancer cells. J Clin Invest. 2009;119:2343–58. 10.1172/jci37205 . Hünten S, Hermeking H. p53 directly activates cystatin D/CST5 to mediate mesenchymal-epithelial transition: a possible link to tumor suppression by vitamin D3. Oncotarget. 2015;6:15842–56. 10.18632/oncotarget.4683 . Ni J, Peng Y, Yang FL, Xi X, Huang XW, He C. Overexpression of CLEC3A promotes tumor progression and poor prognosis in breast invasive ductal cancer. Onco Targets Ther. 2018;11:3303–12. 10.2147/OTT.S161311 . Matsuda Y, Miura K, Yamane J, Shima H, Fujibuchi W, Ishida K, Fujishima F, Ohnuma S, Sasaki H, Nagao M, Tanaka N, Satoh K, Naitoh T, Unno M. SERPINI1 regulates epithelial-mesenchymal transition in an orthotopic implantation model of colorectal cancer. Cancer Sci. 2016;107:619–28. 10.1111/cas.12909 . van de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, Torhorst J, Sauter G, Zuber M, Kochli OR, Mross F, Dieterich H, Seitz R, Ross D, Botstein D, Brown P. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol. 2002;161:1991–6. 10.1016/S0002-9440(10)64476-8 . Wang CC, Bajikar SS, Jamal L, Atkins KA, Janes KA. 
A time- and matrix-dependent TGFBR3-JUND-KRT5 regulatory circuit in single breast epithelial cells and basal-like premalignancies. Nat Cell Biol. 2014;16:345–56. 10.1038/ncb2930 . Hanley CJ, Henriet E, Sirka OK, Thomas GJ, Ewald AJ. Tumor-Resident Stromal Cells Promote Breast Cancer Invasion through Regulation of the Basal Phenotype. Mol Cancer Res. 2020;18:1615–22. 10.1158/1541-7786.Mcr-20-0334 . Bilandzic M, Rainczuk A, Green E, Fairweather N, Jobling TW, Plebanski M, Stephens AN. Keratin-14 (KRT14) Positive Leader Cells Mediate Mesothelial Clearance and Invasion by Ovarian Cancer Cells. Cancers (Basel). 2019;11. 10.3390/cancers11091228 . Tang S, Liu W, Yong L, Liu D, Lin X, Huang Y, Wang H, Cai F. Reduced Expression of KRT17 Predicts Poor Prognosis in HER2(high) Breast Cancer. Biomolecules. 2022;12. 10.3390/biom12091183 . Jinesh GG, Flores ER, Brohl AS. Chromosome 19 miRNA cluster and CEBPB expression specifically mark and potentially drive triple negative breast cancers. PLoS ONE. 2018;13:e0206008. 10.1371/journal.pone.0206008 . Kozawa K, Sekai M, Ohba K, Ito S, Sako H, Maruyama T, Kakeno M, Shirai T, Kuromiya K, Kamasaki T, Kohashi K, Tanaka S, Ishikawa S, Sato N, Asano S, Suzuki H, Tanimura N, Mukai Y, Gotoh N, Tanino M, Tanaka S, Natsuga K, Soga T, Nakamura T, Yabuta Y, Saitou M, Ito T, Matsuura K, Tsunoda M, Kikumori T, Iida T, Mizutani Y, Miyai Y, Kaibuchi K, Enomoto A, Fujita Y. The CD44/COL17A1 pathway promotes the formation of multilayered, transformed epithelia. Curr Biol. 2021;31:3086–3097.e7. 10.1016/j.cub.2021.04.078 . Yodsurang V, Tanikawa C, Miyamoto T, Lo PHY, Hirata M, Matsuda K. Identification of a novel p53 target, COL17A1, that inhibits breast cancer cell migration and invasion. Oncotarget. 2017;8:55790–803. 10.18632/oncotarget.18433 . An Q, Liu T, Wang MY, Yang YJ, Zhang ZD, Liu ZJ, Yang B. KRT7 promotes epithelial–mesenchymal transition in ovarian cancer via the TGF–beta/Smad2/3 signaling pathway. Oncol Rep. 2021;45:481–92. 
10.3892/or.2020.7886 . Zhang Z, Wang H, Jin Y, Zhou J, Chu C, Tang F, Zou L, Zou Q. KRT15 in early breast cancer screening and correlation with HER2 positivity, pathological grade and N stage. Biomark Med. 2023;17:553–62. 10.2217/bmm-2023-0130 . Zhong P, Shu R, Wu H, Liu Z, Shen X, Hu Y. Low KRT15 expression is associated with poor prognosis in patients with breast invasive carcinoma. Exp Ther Med. 2021;21:305. 10.3892/etm.2021.9736 . Kuivaniemi H, Tromp G. Type III collagen (COL3A1): Gene and protein structure, tissue distribution, and associated diseases. Gene. 2019;707:151–71. 10.1016/j.gene.2019.05.003 . Yang F, Lin L, Li X, Wen R, Zhang X. Silencing of COL3A1 represses proliferation, migration, invasion, and immune escape of triple negative breast cancer cells via down-regulating PD-L1 expression. Cell Biol Int. 2022;46:1959–69. 10.1002/cbin.11875 . Zeng C, Lin M, Jin Y, Zhang J. Identification of Key Genes Associated with Brain Metastasis from Breast Cancer: A Bioinformatics Analysis. Med Sci Monit. 2022;28:e935071. 10.12659/MSM.935071 . Goto R, Nakamura Y, Takami T, Sanke T, Tozuka Z. Quantitative LC-MS/MS Analysis of Proteins Involved in Metastasis of Breast Cancer. PLoS ONE. 2015;10:e0130760. 10.1371/journal.pone.0130760 . Wu ZH, Zhang YJ, Yue JX, Zhou T. Comprehensive Analysis of the Expression and Prognosis for SFRPs in Breast Carcinoma. Cell Transpl. 2020;29:963689720962479. 10.1177/0963689720962479 . Siamakpour-Reihani S, Caster J, Bandhu Nepal D, Courtwright A, Hilliard E, Usary J, Ketelsen D, Darr D, Shen XJ, Patterson C, Klauber-Demore N. The role of calcineurin/NFAT in SFRP2 induced angiogenesis–a rationale for breast cancer treatment with the calcineurin inhibitor tacrolimus. PLoS ONE. 2011;6:e20412. 10.1371/journal.pone.0020412 . Additional Declarations No competing interests reported. 
{"props":{"pageProps":{"initialData":{"identity":"rs-4394836","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":305903460,"identity":"0858bb28-ac85-4d78-abbb-2f68213d4157","order_by":0,"name":"Xiaoxi Chen","email":"","orcid":"","institution":"Peking University First Hospital","correspondingAuthor":false,"prefix":"","firstName":"Xiaoxi","middleName":"","lastName":"Chen","suffix":""},{"id":305903463,"identity":"4db5d108-b4c0-456d-82c1-8d1b1358dff9","order_by":1,"name":"Hongjin Liu","email":"","orcid":"","institution":"Peking University First Hospital","correspondingAuthor":false,"prefix":"","firstName":"Hongjin","middleName":"","lastName":"Liu","suffix":""},{"id":305903464,"identity":"fea06c04-55f8-4e39-bc19-cff7a48c6c2c","order_by":2,"name":"Min Gao","email":"","orcid":"","institution":"Peking University First 
Hospital","correspondingAuthor":false,"prefix":"","firstName":"Min","middleName":"","lastName":"Gao","suffix":""},{"id":305903465,"identity":"1e451bbf-048c-445c-9658-6493d6bd08fb","order_by":3,"name":"Jingming Ye","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAvklEQVRIiWNgGAWjYFCCBCCuqJHjZ2Y+/IAELWeOGUu2s6UZEK+FsYU5ccN5HgUJojTotiewSXxsYEvcfJiHwYChxiaaoBazMw/YJGfukDHedpj3wAOGY2m5DQS13Ehgk+Y9wya77TBfggFjw2EitfxtY2bc3MxjIEG8FsY2ZsUNzERrOfOA2bIHGMgSh4GBnECUX44nMN74AYrK/sOHH3yosSGshYGB/wsiOhIIKwcD5g9EKhwFo2AUjIKRCgA6ZEADCc+22wAAAABJRU5ErkJggg==","orcid":"","institution":"Peking University First Hospital","correspondingAuthor":true,"prefix":"","firstName":"Jingming","middleName":"","lastName":"Ye","suffix":""}],"badges":[],"createdAt":"2024-05-09 11:30:33","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4394836/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4394836/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":57110871,"identity":"a7dac584-5128-4913-a0d1-29ba9c396705","added_by":"auto","created_at":"2024-05-24 20:06:28","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":236651,"visible":true,"origin":"","legend":"\u003cp\u003eConstruction and evaluation of the Model. Patients in the three cohorts were divided into high- and low-risk groups based on the predictor score threshold (6.422) defined in the training set GSE96058. Differences in survival between high- and low-risk patients were presented in (a) GSE96058, (b) TCGA and (c) Metabric cohorts. Time-dependent ROC and AUC of the Model were showed in (d) GSE96058, (e) TCGA and (f) Metabric cohorts, respectively. (g-i) Calibration curves for the Model in the three datasets. 
The x-axis represents the nomogram-predicted probability of survival, and the y-axis represents the actual probability of survival estimated by Kaplan-Meier analysis. (j) Pooled AUC\u003csub\u003e36-month\u003c/sub\u003e, AUC\u003csub\u003e60-month\u003c/sub\u003e and C-index of the Model across the three independent cohorts are shown.\u003c/p\u003e","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/a2a0c6f2f88ab7d3c259eb74.png"},{"id":57110875,"identity":"6ff43d57-ac4b-4908-9829-88e6f8758611","added_by":"auto","created_at":"2024-05-24 20:06:28","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":209720,"visible":true,"origin":"","legend":"\u003cp\u003eDecision curve analysis and nomogram for the Model. Decision curves compare the Model with the basic model of clinical predictors in terms of net benefit (NB) and net reduction (NR) in unnecessary interventions at 36 months (a-b) and 60 months (c-d), respectively. (e) The nomogram of the Model. Each predictor is assigned a score based on its value on the nomogram; the scores of all predictors are summed and located on the total points axis at the bottom of the nomogram to estimate the patient's 3- and 5-year survival.\u003c/p\u003e","description":"","filename":"Figure2.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/d1f54238fbd95fc58edbf9e9.png"},{"id":57110872,"identity":"4c7b6fb7-605a-4a13-b864-ad1fccf70487","added_by":"auto","created_at":"2024-05-24 20:06:28","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":384124,"visible":true,"origin":"","legend":"\u003cp\u003eModel-based group transcriptome characteristics. (a) Volcano plot of DEGs and (b) functional enrichment analysis. Preprocessing of single-cell datasets: (c) after batch-effect correction. 
(d) Cell types and (e) cell type–specific marker genes.\u003c/p\u003e","description":"","filename":"Figure3.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/a24f98606abdb5e9653cd12d.png"},{"id":57111409,"identity":"630e26b6-7732-40f0-a081-8300d789d098","added_by":"auto","created_at":"2024-05-24 20:22:28","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":181869,"visible":true,"origin":"","legend":"\u003cp\u003eBreast cancer cells associated with poor outcome. Genes with |log2FC| \u0026gt; 1 and p-value \u0026lt; 0.05 were chosen as DEGs. Gene activity of (a) Model_up and (b) Model_down in each cell type.\u003c/p\u003e","description":"","filename":"Figure4.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/117e8ac0ee53481e182cbfce.png"},{"id":57111227,"identity":"3adc9f03-d060-48d7-8829-bd8076f6fa0a","added_by":"auto","created_at":"2024-05-24 20:14:28","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":130529,"visible":true,"origin":"","legend":"\u003cp\u003eCell types expressing the genes up-regulated in the high-risk group. (a-b) FGF10 and FGF14 were expressed in mesenchymal and epithelial cells, (d-e) CST5 and CLEC3A mainly in epithelial and mature luminal cells, and (c) SERPINI1 mainly in perivascular cells and in some epithelial and mature luminal cells.\u003c/p\u003e","description":"","filename":"Figure5.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/de4d0117c923442369eace70.png"},{"id":57110876,"identity":"1c39a204-3887-4feb-805e-5f4f952d75a0","added_by":"auto","created_at":"2024-05-24 20:06:28","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":236657,"visible":true,"origin":"","legend":"\u003cp\u003eCell types expressing the genes down-regulated in the high-risk group. 
(a-e)KRT5, KRT7, KRT14, KRT17 and COL17A1 were expressed mainly in myoepithelial cells, genes (e-f)KRT7 and KRT15 also in partial luminal cells, and genes (g-j)COL3A1, COL14A1 and SFRP2 mainly in mesenchymal cells.\u003c/p\u003e","description":"","filename":"Figure6.png","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/b0acd85e614e7de6c4de559c.png"},{"id":63627284,"identity":"b03684e4-f6c5-4e3d-a11a-9d6bac3aee5d","added_by":"auto","created_at":"2024-08-30 10:01:41","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1840827,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4394836/v1/1dfe993e-2270-480d-976f-f2b3e1ef933f.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Development of a Prognostic Model for HR-positive HER2-negative and Node-negative Breast Cancer: Integrating Clinical and Transcriptional Biomarkers","fulltext":[{"header":"Introduction","content":"\u003cp\u003eBreast cancer (BC) is the most prevalent cancer among women worldwide, with an annual incidence increase of 0.5% [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The HR-positive HER2-negative (HR+/HER2\u0026ndash;) is the most prevalent subtype, accounting for about 70% of new breast cancer cases [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. Patients with HR+/HER2\u0026ndash; breast cancer generally have a good prognosis. However, approximately one-third of these patients will experience late relapses, even decades after the initial diagnosis [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
It is essential to predict which patients will have recurrences within 5 years of surgery so that they can be selected to receive adjuvant chemotherapy.\u003c/p\u003e \u003cp\u003eFor HR+/HER2\u0026ndash; breast cancers, quantitative real-time reverse transcription PCR (qRT-PCR)-based multigene assays (e.g., 21-gene or 70-gene assays) can be used to guide adjuvant chemotherapy decisions [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. In addition, it has recently been shown that for most cancers, by combining clinical and transcriptomic data, it is possible to improve survival prediction [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Also, predictors of gene-gene (G \u0026times; G) interactions have the potential to improve prediction accuracy and offer crucial insights into the molecular mechanisms underlying complex disorders [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e, \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. G \u0026times; G interactions can be used as a basis for the construction of prognostic models [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. 
Independent risk factors were selected through univariate and multivariate Cox regression analyses to construct the prognostic model.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStudy population and data collection\u003c/h2\u003e \u003cp\u003eTo construct the prognostic model, we first collected the expression profiles and clinical information of 2180 patients with HR+/HER2\u0026ndash; and node-negative breast cancer from three datasets: GSE96058 (RNAseq, n\u0026thinsp;=\u0026thinsp;1429) [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e], TCGA (RNAseq, n\u0026thinsp;=\u0026thinsp;182) [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e] and Metabric (microarray, n\u0026thinsp;=\u0026thinsp;569) [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. The TCGA and Metabric datasets, including both clinical and gene expression data, were downloaded from the cBioPortal for Cancer Genomics [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Patients with complete transcriptomic data and information on overall survival were retained. All gene expression values were log-transformed and standardized before entering the association analyses. Two 10x Genomics-based HR+/HER2\u0026ndash; breast cancer single-cell expression datasets (GSE245601, GSE176078) were used to study cellular dynamics and how the cell populations interact with each other [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. 
The subsequent analysis included a total of 2180 HR+/HER2\u0026ndash; and node-negative breast cancer patients and 13409 genes; the patients' clinical characteristics are summarized in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eClinical characteristics of patients with HR+/HER2\u0026ndash; and node-negative BC in each dataset.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eBiomarker screening\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDiscovery\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eValidation\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c5\" morerows=\"2\" rowspan=\"3\"\u003e \u003cp\u003eOverall\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModel construction\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTraining\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003eTesting 1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTesting 2\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCharacteristics\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGSE96058\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eMetabric\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTCGA\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNumber of samples\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1429\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e569\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e183\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2181\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAge(years)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMean (SD)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e63.9 (12.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e60.7 (11.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e61.7 (12.6)\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c5\"\u003e \u003cp\u003e62.9 (12.2)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMenopausal State\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePost\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e447 (78.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e138 (75.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e585 (26.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePre\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e122 (21.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e37 (20.2%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e159 (7.3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1429 (100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e8 (4.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c5\"\u003e \u003cp\u003e1437 (65.9%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eER status\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNegative\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e11 (0.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e9 (1.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e5 (2.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e25 (1.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePositive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1417 (99.2%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e560 (98.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e178 (97.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2155 (98.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1 (0.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e 
\u003cp\u003e1 (0.0%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePR Status\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNegative\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e81 (5.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e159 (27.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e25 (13.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e265 (12.2%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePositive\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1287 (90.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e410 (72.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e158 (86.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1855 (85.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e61 (4.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e61 
(2.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHistologic Subtype\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIDC\u003csup\u003ea\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e403 (70.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e121 (66.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e524 (24.0%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eILC\u003csup\u003eb\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e47 (8.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e37 (20.2%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e84 (3.9%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIMMC\u003csup\u003ec\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e15 (2.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e9 (4.9%)\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e24 (1.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMDLC\u003csup\u003ed\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e97 (17.0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e97 (4.4%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e7 (1.2%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e16 (8.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e23 (1.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1429 (100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1429 (65.5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHistologic Grade\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eG1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e328 (23.0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e78 (13.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e406 (18.6%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eG2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e770 (53.9%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e283 (49.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1053 (48.3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eG3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e322 (22.5%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e181 (31.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e503 (23.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e9 (0.6%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e27 (4.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c4\"\u003e \u003cp\u003e183 (100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e219 (10.0%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTumor Size\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMean (SD)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e17.1 (9.30)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e21.9 (11.0)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eNA (NA)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e18.5 (10.1)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (0.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e183 (100%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e189 (8.7%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eT Stage\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c5\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e1083 (75.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e368 (64.7%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e72 (39.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1523 (69.8%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eII\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e329 (23.0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e195 (34.3%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e94 (51.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e618 (28.3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIII\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e11 (0.8%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e6 (1.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e15 (8.2%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e32 (1.5%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eIV\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e2 
(1.1%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e2 (0.1%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUnknown\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e6 (0.4%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e0 (0%)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e6 (0.3%)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003ctfoot\u003e \u003ctr\u003e\u003ctd colspan=\"5\"\u003e\u003csup\u003ea\u003c/sup\u003eIDC: invasive ductal carcinoma; \u003csup\u003eb\u003c/sup\u003eILC: invasive lobular carcinoma; \u003csup\u003ec\u003c/sup\u003eIMMC: invasive mixed mucinous breast carcinoma; \u003csup\u003ed\u003c/sup\u003eMDLC: mixed invasive ductal and lobular carcinoma\u003c/td\u003e\u003c/tr\u003e \u003c/tfoot\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eGene main effects, interactions and Model construction\u003c/h2\u003e \u003cp\u003eIn total, 13409 mRNAs from the GSE96058, TCGA, and Metabric cohorts were included. First, to determine their independent prognostic significance, the clinical variables age, menopausal status, histologic subtype, histologic grade, T stage, and tumor size were included in the Cox-ph model. Of these, only age had independent prognostic significance. 
Each gene was then fitted individually in a multivariate Cox-ph model (Model 1), with age as a covariate, to assess its independent prognostic significance.\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\({\\text{Model 1}}:\\;h\\left( t \\right)={h_0}\\left( t \\right)\\exp\\left( {\\beta _{{\\text{gene}}_i}} \\times {\\text{gene}}_i+{\\beta _{\\text{age}}} \\times {\\text{age}} \\right)\\)\u003c/span\u003e \u003c/span\u003e \u003c/p\u003e \u003cp\u003eTo assess the prognostic significance of gene-gene interactions, we selected 761 cancer genes defined by OncoKB [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e] that were shared by all three cohorts in this study. Model 2 was used to determine the prognostic significance of G \u0026times; G interactions.\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\({\\text{Model 2}}:\\;h\\left( t \\right)={h_0}\\left( t \\right)\\exp\\left( {\\beta _{{\\text{gene}}_i}} \\times {\\text{gene}}_i+{\\beta _{{\\text{gene}}_j}} \\times {\\text{gene}}_j+{\\beta _{ij}} \\times {\\text{gene}}_i \\times {\\text{gene}}_j+{\\beta _{\\text{age}}} \\times {\\text{age}} \\right)\\)\u003c/span\u003e \u003c/span\u003eCandidate main effects were screened across all common genes and candidate interactions across the cancer genes; both were then validated in an independent dataset. Specifically, for the GSE96058 dataset, the false discovery rate was controlled at the 5% level (q-FDR\u0026thinsp;\u0026le;\u0026thinsp;5%) for the selection of important genes and interactions by fitting Model 1 and Model 2, respectively. 
These selected genes or interactions were then validated in the TCGA cohort; only biomarkers with P\u0026thinsp;\u0026le;\u0026thinsp;0.05 and the same direction of effect as in the discovery step were retained as candidate biomarkers for the next modelling stage.\u003c/p\u003e \u003cp\u003eIn the GSE96058 training cohort, the candidate genes and interactions identified in the previous screening phase were entered into a forward stepwise Cox regression, using the likelihood ratio test with P\u003csub\u003eentry\u003c/sub\u003e\u0026le;0.05 and P\u003csub\u003eremoval\u003c/sub\u003e\u0026gt;0.05, to obtain a final multivariable Cox model. For validation, the area under the receiver operating characteristic curve (AUC) and the concordance index (C-index) in an internal cohort (TCGA) and an external cohort (Metabric) were used to assess the discriminatory performance of the resulting model.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eBioinformatics\u003c/h2\u003e \u003cp\u003eDifferentially expressed genes (DEGs) were identified using the R package \u0026lsquo;limma\u0026rsquo; [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. 
The Metascape web tool (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://metascape.org\u003c/span\u003e\u003cspan address=\"https://metascape.org\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) was used to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe R package \u0026lsquo;Seurat\u0026rsquo; (v5.0.1) was used to process the single-cell transcriptome profiles of HR+/HER2\u0026ndash; breast cancer cases [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e, \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. Briefly, genes expressed in fewer than three cells and cells expressing fewer than 300 genes were excluded from further analysis. Before integration, each expression matrix underwent independent quality control. The R package \u0026lsquo;Harmony\u0026rsquo; was used to integrate the samples and correct for batch effects between them [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. Cell types were annotated with scType [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. \u0026lsquo;AUCell\u0026rsquo; was used to assess the activity of gene sets [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. 
\u0026lsquo;Scissor\u0026rsquo; was used to analyze the correlation between cell identity and model risk group [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e].\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eStatistics\u003c/h2\u003e \u003cp\u003eContinuous variables were presented as mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation, while categorical variables were described in terms of frequency (n) and proportion (%). The Cox-ph model was used to screen for prognostically significant main effects of genes and G \u0026times; G interactions. Differences in survival were assessed using Kaplan-Meier (K-M) analysis and the log-rank test. The prediction model's accuracy and discriminative ability were assessed using ROC curves with the corresponding AUCs, C-indexes, and decision curves [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. Decision curve analysis (DCA) assesses the clinical value of a predictive model [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e] by computing the clinical net benefit (\u003cem\u003eNB\u003c/em\u003e) of assigning interventions based on the model. Clinical value can also be expressed as the net reduction (\u003cem\u003eNR\u003c/em\u003e) of interventions, that is, the number of unnecessary interventions avoided under the guidance of a prediction model when compared with the \u0026ldquo;intervention for all\u0026rdquo; strategy [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eStatistical analyses were performed using R (version 4.3.2). 
A two-sided P value less than or equal to 0.05 was regarded as statistically significant unless otherwise stated.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eConstruction of the Model\u003c/h2\u003e \u003cp\u003eIn the first step, in the GSE96058 cohort, 114 main effector genes and 307 pairs of G \u0026times; G interaction genes were identified as having a potential association with overall survival. Among them, 3 main effector genes (P\u0026thinsp;\u0026le;\u0026thinsp;0.05) and 16 pairs of G \u0026times; G interaction genes were validated as candidate transcriptional predictors in the TCGA cohort. Then, from these candidate transcriptional predictors, a final Cox model was constructed in the GSE96058 cohort using a forward stepwise regression strategy; it included 2 main effector genes and 5 pairs of G \u0026times; G interaction genes. Using its coefficient estimates, the final Cox model for the combined clinical and transcriptional predictors was defined as:\u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\({\\text{Model}}=0.0786 \\times {\\text{AGE}}+0.9677 \\times {\\text{Transcriptional\\_Score}}\\,\\)\u003c/span\u003e \u003c/span\u003e \u003c/p\u003e \u003cp\u003e \u003cspan class=\"InlineEquation\"\u003e \u003cspan class=\"mathinline\"\u003e\\(\\begin{gathered} {\\text{Transcriptional\\_Score}}= - 0.2227 \\times {\\text{C}}1{\\text{RL}} - 0.1707 \\times {\\text{CELSR}}1 - 0.1892 \\times {\\text{MDC}}1+0.1129 \\times {\\text{NFE}}2{\\text{L}}2 \\hfill \\\\ - 0.2765 \\times {\\text{MDC}}1 \\times {\\text{NFE}}2{\\text{L}}2 - 0.2212 \\times {\\text{ALB}}+0.0137 \\times {\\text{YAP}}1 - 0.2746 \\times {\\text{ALB}} \\times {\\text{YAP}}1 - 0.0561 \\times {\\text{FNBP}}1 \\hfill \\\\ - 0.1409 \\times {\\text{TBL}}1{\\text{XR}}1+0.2159 \\times {\\text{FNBP}}1 \\times {\\text{TBL}}1{\\text{XR}}1+0.0602 \\times 
{\\text{CDH}}4 - 0.1751 \\times {\\text{GATA}}6 \\hfill \\\\ - 0.1501 \\times {\\text{CDH}}4 \\times {\\text{GATA}}6+0.1503 \\times {\\text{FURIN}}+0.0481 \\times {\\text{PAX}}3 - 0.1611 \\times {\\text{FURIN}} \\times {\\text{PAX}}3 \\hfill \\\\ \\end{gathered}\\)\u003c/span\u003e \u003c/span\u003e \u003cb\u003eThe prognostic significance of the Model\u003c/b\u003e \u003c/p\u003e \u003cp\u003eA predictor score threshold (6.422) was established by function \u0026lsquo;surv_cutpoint()\u0026rsquo; of \u0026lsquo;survminer\u0026rsquo; package [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e] in the training cohort to classify patients into low- and high-risk groups in order to demonstrate the clinical utility of the Model. Patients with a high risk score had a poorer prognosis in the K-M survival analysis of the Model. In both the training and testing sets, the predictor score demonstrated a sufficient capacity for discrimination. Compared to low-risk groups, high-risk groups in the training cohort (GSE96058), the internal testing cohort (TCGA), and the external testing cohort (Metabric) were associated with poorer survival, exhibiting larger hazard ratios (HRs) (HR\u003csub\u003eGSE96058\u003c/sub\u003e=8.88, 95% CI: 6.16\u0026ndash;12.81, P\u0026thinsp;=\u0026thinsp;2.46\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;45\u003c/sup\u003e; HR\u003csub\u003eTCGA\u003c/sub\u003e =6.12, 95% CI: 1.54\u0026ndash;24.6, P\u0026thinsp;=\u0026thinsp;3.37\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e; HR\u003csub\u003eMetabric\u003c/sub\u003e =2.25, 95% CI: 1.49\u0026ndash;3.40, P\u0026thinsp;=\u0026thinsp;7.37\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;5\u003c/sup\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ea-c).\u003c/p\u003e \u003cp\u003eThe Model accurately predicted 36-month and 60-month survival 
(AUC\u003csub\u003e36\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.875 and 0.831; AUC\u003csub\u003e60\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.786 and 0.829) for the GSE96058 training set and TCGA internal testing set (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ed, e), and retained moderate discrimination in the Metabric external test set (AUC\u003csub\u003e120\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.645, AUC\u003csub\u003e240\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.665) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ef). The calibration curves of the Model showed good agreement between predicted and observed survival (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eg-i). Decision curves showed that at 36 and 60 months of survival, the Model provided a greater net benefit than the standard model. Additionally, the C-index of the Model was 0.805 in the GSE96058 cohort, 0.818 in the TCGA cohort, and 0.649 in the Metabric cohort, with a combined C-index of 0.7302 (95% CI: 0.7047\u0026ndash;0.7558) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003ej).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe DCA curves showed that the Model yields a greater clinical net benefit (NB) than either the intervention-for-all or the intervention-for-none strategy. 
Specifically, using 36-month survival as the endpoint and an appropriate threshold probability (e.g., Pt\u0026thinsp;=\u0026thinsp;0.15), the Model identified 11 true-positive patients per 1,000 who should receive intervention, compared with only 4.7 for the basic model (NB\u003csub\u003eModel\u003c/sub\u003e=0.0110 vs NB\u003csub\u003eBasic\u003c/sub\u003e=0.0047) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea). Furthermore, relative to the intervention-for-all strategy, the Model showed a higher net reduction (NR) than the basic model (NR\u003csub\u003eModel\u003c/sub\u003e=79.4% vs NR\u003csub\u003eBasic\u003c/sub\u003e=75.8%). In other words, the Model reduced the number of unnecessary clinical interventions by 79.4% without missing treatment for any patient truly at high risk of mortality, compared with 75.8% for the basic model (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb). As the decision curves indicate, the Model has a clear net benefit at almost all threshold probabilities. For 36- and 60-month survival, the Model had the best average NB and NR (NB\u003csub\u003e36\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.0103, NR\u003csub\u003e36\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=76.9% and NB\u003csub\u003e60\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=0.0302, NR\u003csub\u003e60\u0026thinsp;\u0026minus;\u0026thinsp;month\u003c/sub\u003e=54.1%), suggesting consistent utility and applicability for clinical implementation (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea-d). 
To visualize the Model, a nomogram was constructed, providing a personalized tool for predicting individual prognosis (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ee).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eMolecular underpinnings of the Model\u003c/h2\u003e \u003cp\u003eTo further investigate the molecular mechanisms associated with risk scores, patients in the GSE96058 cohort were divided into high- and low-risk groups. The analysis of the DEGs showed that 17 genes were up-regulated (FGF14, CHGB, CARTPT, SERPINI1, PTPRN2, FGF10, FURIN, PCSK1, CPA6, SLC5A8, CST5, BPIFB2, CLEC3A, PIP, CPB1, DCD and TAT) and 28 genes were down-regulated (VTCN1, UBD, TPSB2, SHISA2, SFRP2, SFRP1, OGN, MUCL1, MMP7, MAOB, KRT7, KRT5, KRT17, KRT15, KRT14, JCHAIN, IGLL5, F2RL2, CPXM1, COL3A1, COL17A1, COL14A1, CLIC6, CILP, CCL21, CCL19, APOD and ADH1B) in the high-risk group. Functional enrichment analysis of the DEGs showed that, in the high-risk group, the top enriched term for the up-regulated genes was extracellular matrix (M5885: NABA_MATRISOME_ASSOCIATED) and the top enriched term for the down-regulated genes was intermediate filament organization (GO: 0045109). Five genes were enriched in M5885 (CST5, FGF10, FGF14, SERPINI1, and CLEC3A) and nine in GO: 0045109 (KRT5, KRT7, KRT14, KRT15, KRT17, COL3A1, COL14A1, COL17A1 and SFRP2) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea, b).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFor further insight, two scRNA-seq datasets (GSE245601 and GSE176078) of breast cancer were used for downstream integrative analysis. Fourteen clusters were initially found following cell cycle normalization (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ec). 
The cell clusters were annotated according to cell localization, the marker database and known cell markers from other studies as mesenchymal, cytotoxic T, naive memory B, luminal, mature luminal, perivascular, dendritic, cycling epithelial, epithelial, endothelial, macrophage and myoepithelial cells.\u003c/p\u003e \u003cp\u003eThe up- and down-regulated genes were collated to characterize the Model high- and low-risk groups. The up-regulated genes were mainly enriched in epithelial, mature luminal, luminal, and naive memory B cells, whereas the down-regulated genes were mainly enriched in myoepithelial cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ea, b).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe cell types in which the enriched genes in M5885 (CST5, FGF10, FGF14, SERPINI1, and CLEC3A) and GO: 0045109 (KRT5, KRT7, KRT14, KRT15, KRT17, COL3A1, COL14A1, COL17A1 and SFRP2) were individually expressed were then illustrated. Among the up-regulated genes, FGF10 and FGF14 were expressed in mesenchymal and epithelial cells, CST5 and CLEC3A mainly in epithelial and mature luminal cells, and SERPINI1 mainly in perivascular cells and in a subset of epithelial and mature luminal cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). Among the down-regulated genes, KRT5, KRT7, KRT14, KRT17 and COL17A1 were expressed mainly in myoepithelial cells, KRT7 and KRT15 also in a subset of luminal cells, and COL3A1, COL14A1 and SFRP2 mainly in mesenchymal cells (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eMultiple mechanisms underlie prognostic heterogeneity in HR+/HER2\u0026ndash; breast cancer patients. Patients at high risk of death may need intensive surveillance and adjuvant therapy. 
Here, a prognostic model was constructed using G \u0026times; G interactions, which increase the number of common candidates with predictive significance in small cohorts. Robustness of the Model was pursued through a multi-stage screening process, and the coefficients were determined separately from RNAseq data. In summary, this study demonstrated that the Model can classify patients with HR+/HER2\u0026ndash;, node-negative invasive breast cancer into low- and high-risk groups.\u003c/p\u003e \u003cp\u003eOverall, we present a two-stage synthesis of gene expression data from multiple centers and propose a prognostic scoring method that combines main effects of genes and G \u0026times; G interactions. This prognostic model was independently confirmed in an HR+/HER2\u0026ndash; breast cancer validation cohort. It can effectively distinguish the survival outcomes of patients and significantly improve the prediction accuracy of their prognosis.\u003c/p\u003e \u003cp\u003eG \u0026times; G interactions provide clues for understanding the biological mechanisms of disease [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. It has been shown that G \u0026times; G interactions can improve the accuracy of prediction models [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. However, if the effects of interactions are weak or significant interactions are rare, interactions may not significantly improve prediction but may still optimize statistical modeling [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. 
This study demonstrated that biomarkers with G \u0026times; G interactions significantly improved the accuracy of prognostic prediction in early-stage HR+/HER2\u0026ndash; breast cancer, possibly due to increased efficacy.\u003c/p\u003e \u003cp\u003eThe biological functions of the biomarkers in the Model are briefly summarized here. Among the genes with significant main effects, CELSR1 was implicated in breast cancer development [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]; C1RL is a negative biomarker for breast cancer prognosis [\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. Among the pairs of genes with significant interactions, MDC1 is a novel estrogen receptor coregulator in invasive lobular carcinoma of the breast, and inhibits breast cancer by enhancing estrogen receptor-mediated transactivation [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e]. NFE2L2 depletion in metastatic cancer cells impairs primary tumor development and formation of lung metastases [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. ALB encodes the most abundant protein in human blood. Hoogenboezem and Duvall [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e] showed that in the tumor microenvironment, ALB is rapidly taken up by the tumor to compensate for the relative lack of amino acids, thus meeting the high metabolic requirements of rapid tumor proliferation and leading to a reduction in serum ALB levels. In ER-positive patients, elevated YAP1 mRNA levels corresponded to better prognosis [\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. High FNBP1 expression is significantly correlated with favorable survival outcomes [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]. 
TBLR1/TBL1XR1 acts as an ER co-repressor and inhibits ER-mediated transcriptional activation in breast cell lines, and nuclear TBLR1 overexpression increases migration and invasion of breast cancer cells [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. CDH4 encodes a Ca2+-dependent intercellular adhesion glycoprotein, and hypermethylation of CDH4 is an independent risk factor for the development of breast cancer [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e]. GATA6 is elevated in breast cancer and its expression level positively correlates with metastasis, leading to reduced overall survival [\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. Down-regulation of furin in breast cancer exerts antiproliferative effects by inhibiting IGF-1R maturation [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e]. PAX3 is considered a key factor in normal development and tumorigenesis, and the PAX3 gene acts as an oncogene in breast tumorigenesis [\u003cspan additionalcitationids=\"CR47\" citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eBased on the scRNA-seq results, FGF10 was expressed in mesenchymal and epithelial cells. Abolhassani et al. showed that FGF10 plays an important role in epithelial-mesenchymal transition (EMT), which is central to cancer cell invasion and metastasis [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. FGF10 stimulates FGFR2b to promote receptor cycling, leading to increased breast cancer cell migration [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. 
The lncRNA FGF14-AS2 was down-regulated in breast cancer tissue, and patients with lower FGF14-AS2 expression had a more advanced clinical stage [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e]. FGF14-AS2 significantly affects breast cancer cell migration, invasion and tumor metastasis [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eCST5 encodes an inhibitor of several cysteine proteases of the cathepsin family. CST5 was shown to mediate mesenchymal-epithelial transition (MET) [\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e]. Elevated CLEC3A expression may be associated with the metastatic potential of breast invasive ductal carcinoma (IDC), and CLEC3A knockdown inhibits breast cancer cell growth and metastasis [\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e]. SERPINI1 is a key regulator of EMT. Down-regulation of SERPINI1 expression in cells results in reverse-EMT changes in protein levels and cell morphology [\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eKRT5 is strongly expressed in the basal layer and is mainly localized to the ductal myoepithelium in normal breast tissue. In basal-like carcinoma, KRT5 indicates poor prognosis [\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e, \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e]. Breast cancer cells expressing the basal epithelial gene KRT14 led collective invasion [\u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e]. In epithelial ovarian cancer, deletion of KRT14 completely eliminates the invasive ability of the cancer cells [\u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e]. 
The expression of KRT17 was significantly lower in breast cancer tissues than in normal tissues, and its reduced expression was significantly associated with poor prognosis [\u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e61\u003c/span\u003e]. KRT5, KRT14, and KRT17 were also highly expressed in triple-negative breast cancers (TNBC) [\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e]. COL17A1 plays a key regulatory role in the clonal expansion of multilayered intraepithelial transformed cells [\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e]. High expression of the COL17A1 gene is associated with prolonged survival in patients with invasive breast cancer [\u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e]. KRT7 regulated EMT and cell‑matrix adhesion in ovarian cancer [\u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e65\u003c/span\u003e]. KRT15 expression was positively associated with overall survival in breast cancer patients [\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e]. Lower KRT15 expression was significantly associated with a worse prognostic outcome [\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e]. COL3A1 encodes the α1 chain of type III collagen, which is a crucial component of the extracellular matrix (ECM) and important for the tumor microenvironment [\u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e]. COL3A1 expression was elevated in TNBC tissues and cells and was negatively correlated with overall survival, and silencing of COL3A1 exerted antitumor effects [\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e]. COL14A1 is predictive of brain metastasis in breast cancer [\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e]. 
In patients with massive lymph node involvement, the expression level of COL14A1 is high in metastatic tissues [\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e]. SFRP2 (secreted frizzled-related protein 2) was upregulated in breast cancer patients of all stages compared to healthy individuals [\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e]. SFRP2 mediated angiogenic responses by stimulating NFAT (nuclear factor of activated T-cells) in human breast carcinoma. Migration of endothelial cells and breast cancer cells can be inhibited by inhibiting SFRP2-stimulated tube formation in vitro [\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eBased on three public breast cancer transcriptomic datasets, a prognostic prediction model was constructed. For clinical utility, this model divided patients into low- and high-risk groups. Patients in the high-risk group had worse overall survival. Analysis of differentially expressed genes (DEGs) between the low- and high-risk groups revealed that the top enriched signaling pathways were extracellular matrix and intermediate filament organization, respectively. Two single-cell RNAseq datasets of primary breast cancer were used to illustrate the expression of every gene in the top enriched signaling pathways in specific cell types.\u003c/p\u003e \u003cp\u003eHowever, our study does have certain limitations. First, heterogeneity exists among cohorts from different sequencing or microarray platforms. To address this, we applied a standard normal transformation to unify the data, which was effective to a certain extent. Second, some well-recognized prognostic factors are missing in several cohorts. We believe that with more comprehensive clinical data, there is significant potential for improvement in the prognostic prediction model. 
Finally, due to potential population heterogeneity or the limited sample size of individual datasets, the accuracy improvements are not consistent across the external validation datasets.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eThe prognostic model incorporating transcriptional biomarkers with both main effects and G \u0026times; G interactions has high predictive accuracy for survival in early-stage HR+/HER2\u0026ndash; breast cancer.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFunding:\u003c/strong\u003e The Beijing Medical Award Foundation (Grant number: YXJL-2016-0040-0065).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests:\u0026nbsp;\u003c/strong\u003eThe authors have no relevant financial or non-financial interests to disclose.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions:\u003c/strong\u003e XXC and JMY made substantial contributions to the conception of the study. XXC, MG and HJL made significant contributions to the analysis and interpretation of data. XXC, MG, HJL and JMY made significant contributions to the drafting or revising of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability:\u003c/strong\u003e\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eGEO Database: https://www.ncbi.nlm.nih.gov/geo/\u003c/p\u003e\n\u003cp\u003eTCGA Database: https://portal.gdc.cancer.gov/\u003c/p\u003e\n\u003cp\u003eMetabric Database: https://www.cbioportal.org/datasets\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate:\u003c/strong\u003e Not applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to publish:\u003c/strong\u003e Not applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eSung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. 
Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021;71:209\u0026ndash;49. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3322/caac.21660\u003c/span\u003e\u003cspan address=\"10.3322/caac.21660\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAmerican Cancer Society. Breast Cancer Facts \u0026amp; Figures 2022\u0026ndash;2024. Atlanta: American Cancer Society; 2024.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBense RD, Qiu SQ, de Vries EGE, Schr\u0026ouml;der CP, Fehrmann RSN. Considering the biology of late recurrences in selecting patients for extended endocrine therapy in breast cancer. Cancer Treat Rev. 2018;70:118\u0026ndash;26. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ctrv.2018.07.015\u003c/span\u003e\u003cspan address=\"10.1016/j.ctrv.2018.07.015\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi G, Hu J, Hu G. Biomarker Studies in Early Detection and Prognosis of Breast Cancer. Adv Exp Med Biol. 2017;1026:27\u0026ndash;39. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1007/978-981-10-6020-5_2\u003c/span\u003e\u003cspan address=\"10.1007/978-981-10-6020-5_2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMamounas EP, Tang G, Fisher B, Paik S, Shak S, Costantino JP, Watson D, Geyer CE Jr., Wickerham DL, Wolmark N. Association between the 21-gene recurrence score assay and risk of locoregional recurrence in node-negative, estrogen receptor-positive breast cancer: results from NSABP B-14 and NSABP B-20. J Clin Oncol. 2010;28:1677\u0026ndash;83. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1200/jco.2009.23.7610\u003c/span\u003e\u003cspan address=\"10.1200/jco.2009.23.7610\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer. 2022;22:1045. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12885-022-10117-1\u003c/span\u003e\u003cspan address=\"10.1186/s12885-022-10117-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi J, Li X, Zhang S, Snyder M. Gene-Environment Interaction in the Era of Precision Medicine. Cell. 2019;177:38\u0026ndash;44. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cell.2019.03.004\u003c/span\u003e\u003cspan address=\"10.1016/j.cell.2019.03.004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang R, Lai L, He J, Chen C, You D, Duan W, Dong X, Zhu Y, Lin L, Shen S, Guo Y, Su L, Shafer A, Moran S, Fleischer T, Bjaanaes MM, Karlsson A, Planck M, Staaf J, Helland A, Esteller M, Wei Y, Chen F, Christiani DC. EGLN2 DNA methylation and expression interact with HIF1A to affect survival of early-stage NSCLC. Epigenetics. 2019;14:118\u0026ndash;29. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1080/15592294.2019.1573066\u003c/span\u003e\u003cspan address=\"10.1080/15592294.2019.1573066\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJi H, Wang F, Liu Z, Li Y, Sun H, Xiao A, Zhang H, You C, Hu S, Liu Y. COVPRIG robustly predicts the overall survival of IDH wild-type glioblastoma and highlights METTL1(+) neural-progenitor-like tumor cell in driving unfavorable outcome. J Transl Med. 2023;21:533. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12967-023-04382-2\u003c/span\u003e\u003cspan address=\"10.1186/s12967-023-04382-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen J, Shen S, Li Y, Fan J, Xiong S, Xu J, Zhu C, Lin L, Dong X, Duan W, Zhao Y, Qian X, Liu Z, Wei Y, Christiani DC, Zhang R, Chen F. APOLLO: An accurate and independently validated prediction model of lower-grade gliomas overall survival and a comparative study of model performance. eBioMedicine. 2022;79:104007. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ebiom.2022.104007\u003c/span\u003e\u003cspan address=\"10.1016/j.ebiom.2022.104007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrueffer C, Vallon-Christersson J, Grabau D, Ehinger A, Hakkinen J, Hegardt C, Malina J, Chen Y, Bendahl PO, Manjer J, Malmberg M, Larsson C, Loman N, Ryden L, Borg A, Saal LH. Clinical Value of RNA Sequencing-Based Classifiers for Prediction of the Five Conventional Breast Cancer Biomarkers: A Report From the Population-Based Multicenter Sweden Cancerome Analysis Network-Breast Initiative. JCO Precis Oncol. 2018;2. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1200/PO.17.00135\u003c/span\u003e\u003cspan address=\"10.1200/PO.17.00135\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu J, Lichtenberg T, Hoadley KA, Poisson LM, Lazar AJ, Cherniack AD, Kovatich AJ, Benz CC, Levine DA, Lee AV, Omberg L, Wolf DM, Shriver CD, Thorsson V, Cancer Genome Atlas Research Network, Hu H. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell. 2018;173:400\u0026ndash;416.e11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cell.2018.02.052\u003c/span\u003e\u003cspan address=\"10.1016/j.cell.2018.02.052\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRueda OM, Sammut SJ, Seoane JA, Chin SF, Caswell-Jin JL, Callari M, Batra R, Pereira B, Bruna A, Ali HR, Provenzano E, Liu B, Parisien M, Gillett C, McKinney S, Green AR, Murphy L, Purushotham A, Ellis IO, Pharoah PD, Rueda C, Aparicio S, Caldas C, Curtis C. Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature. 2019;567:399\u0026ndash;404. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41586-019-1007-8\u003c/span\u003e\u003cspan address=\"10.1038/s41586-019-1007-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, Jacobsen A, Byrne CJ, Heuer ML, Larsson E, Antipin Y, Reva B, Goldberg AP, Sander C, Schultz N. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401\u0026ndash;4. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/2159-8290.CD-12-0095\u003c/span\u003e\u003cspan address=\"10.1158/2159-8290.CD-12-0095\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim H, Whitman AA, Wisniewska K, Kakati RT, Garcia-Recio S, Calhoun BC, Franco HL, Perou CM, Spanheimer PM. Tamoxifen Response at Single-Cell Resolution in Estrogen Receptor-Positive Primary Human Breast Tumors. Clin Cancer Res. 2023;29:4894\u0026ndash;907. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/1078-0432.Ccr-23-1248\u003c/span\u003e\u003cspan address=\"10.1158/1078-0432.Ccr-23-1248\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu SZ, Al-Eryani G, Roden DL, Junankar S, Harvey K, Andersson A, Thennavan A, Wang C, Torpy JR, Bartonicek N, Wang T, Larsson L, Kaczorowski D, Weisenfeld NI, Uytingco CR, Chew JG, Bent ZW, Chan CL, Gnanasambandapillai V, Dutertre CA, Gluch L, Hui MN, Beith J, Parker A, Robbins E, Segara D, Cooper C, Mak C, Chan B, Warrier S, Ginhoux F, Millar E, Powell JE, Williams SR, Liu XS, O'Toole S, Lim E, Lundeberg J, Perou CM, Swarbrick A. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet. 2021;53:1334\u0026ndash;47. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41588-021-00911-1\u003c/span\u003e\u003cspan address=\"10.1038/s41588-021-00911-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSuehnholz SP, Nissan MH, Zhang H, Kundra R, Nandakumar S, Lu C, Carrero S, Dhaneshwar A, Fernandez N, Xu BW, Arcila ME, Zehir A, Syed A, Brannon AR, Rudolph JE, Paraiso E, Sabbatini PJ, Levine RL, Dogan A, Gao J, Ladanyi M, Drilon A, Berger MF, Solit DB, Schultz N, Chakravarty D. Quantifying the Expanding Landscape of Clinical Actionability for Patients with Cancer. Cancer Discov. 2024;14:49\u0026ndash;65. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/2159-8290.Cd-23-0467\u003c/span\u003e\u003cspan address=\"10.1158/2159-8290.Cd-23-0467\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChakravarty D, Gao J, Phillips S, Kundra R, Zhang H, Wang J, Rudolph JE, Yaeger R, Soumerai T, Nissan MH, Chang MT, Chandarlapaty S, Traina TA, Paik PK, Ho AL, Hantash FM, Grupe A, Baxi SS, Callahan MK, Snyder A, Chi P, Danila DC, Gounder M, Harding JJ, Hellmann MD, Iyer G, Janjigian YY, Kaley T, Levine DA, Lowery M, Omuro A, Postow MA, Rathkopf D, Shoushtari AN, Shukla N, Voss MH, Paraiso E, Zehir A, Berger MF, Taylor BS, Saltz LB, Riely GJ, Ladanyi M, Hyman DM, Baselga J, Sabbatini P, Solit DB, Schultz N. OncoKB: A Precision Oncology Knowledge Base. JCO Precision Oncol. 2017;1\u0026ndash;16. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1200/po.17.00011\u003c/span\u003e\u003cspan address=\"10.1200/po.17.00011\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRitchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. 
limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/nar/gkv007\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkv007\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, Benner C, Chanda SK. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41467-019-09234-6\u003c/span\u003e\u003cspan address=\"10.1038/s41467-019-09234-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeumos L, Schaar AC, Lance C, Litinetskaya A, Drost F, Zappia L, Lucken MD, Strobl DC, Henao J, Curion F, Single-cell Best Practices Consortium, Schiller HB, Theis FJ. Best practices for single-cell analysis across modalities. Nat Rev Genet. 2023;24:550\u0026ndash;72. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41576-023-00586-w\u003c/span\u003e\u003cspan address=\"10.1038/s41576-023-00586-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby C, Zager M, Hoffman P, Stoeckius M, Papalexi E, Mimitou EP, Jain J, Srivastava A, Stuart T, Fleming LM, Yeung B, Rogers AJ, McElrath JM, Blish CA, Gottardo R, Smibert P, Satija R. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573\u0026ndash;3587.e29. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cell.2021.04.048\u003c/span\u003e\u003cspan address=\"10.1016/j.cell.2021.04.048\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChoi JH, Kim HI, Woo HG. scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data. BMC Bioinformatics. 2020;21:342. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12859-020-03700-5\u003c/span\u003e\u003cspan address=\"10.1186/s12859-020-03700-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAibar S, Gonzalez-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine JC, Geurts P, Aerts J, van den Oord J, Atak ZK, Wouters J, Aerts S. SCENIC: single-cell regulatory network inference and clustering. Nat Methods. 2017;14:1083\u0026ndash;6. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nmeth.4463\u003c/span\u003e\u003cspan address=\"10.1038/nmeth.4463\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSun D, Guan X, Moran AE, Wu LY, Qian DZ, Schedin P, Dai MS, Danilov AV, Alumkal JJ, Adey AC, Spellman PT, Xia Z. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat Biotechnol. 2022;40:527\u0026ndash;38. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41587-021-01091-3\u003c/span\u003e\u003cspan address=\"10.1038/s41587-021-01091-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZheng Y, Heagerty PJ. 
Semiparametric estimation of time-dependent ROC curves for longitudinal marker data. Biostatistics. 2004;5:615\u0026ndash;32. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/biostatistics/kxh013\u003c/span\u003e\u003cspan address=\"10.1093/biostatistics/kxh013\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565\u0026ndash;74. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/0272989x06295361\u003c/span\u003e\u003cspan address=\"10.1177/0272989x06295361\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagn Progn Res. 2019;3:18. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s41512-019-0064-7\u003c/span\u003e\u003cspan address=\"10.1186/s41512-019-0064-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTherneau TM. (2020) A Package for Survival Analysis in R.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCordell HJ. Detecting gene\u0026ndash;gene interactions that underlie human diseases. Nat Rev Genet. 2009;10:392\u0026ndash;404. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/nrg2579\u003c/span\u003e\u003cspan address=\"10.1038/nrg2579\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePharoah PD, Antoniou AC, Easton DF, Ponder BA. 
Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med. 2008;358:2796\u0026ndash;803. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1056/NEJMsa0708739\u003c/span\u003e\u003cspan address=\"10.1056/NEJMsa0708739\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhoury MJ, Yang Q, Gwinn M, Little J, Dana Flanders W. An epidemiologic assessment of genomic profiling for measuring susceptibility to common diseases and targeting interventions. Genet Med. 2004;6:38\u0026ndash;47. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1097/01.gim.0000105751.71430.79\u003c/span\u003e\u003cspan address=\"10.1097/01.gim.0000105751.71430.79\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAschard H, Chen J, Cornelis MC, Chibnik LB, Karlson EW, Kraft P. Inclusion of gene-gene and gene-environment interactions unlikely to dramatically improve risk prediction for complex diseases. Am J Hum Genet. 2012;90:962\u0026ndash;72. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ajhg.2012.04.017\u003c/span\u003e\u003cspan address=\"10.1016/j.ajhg.2012.04.017\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiao S, Desouki MM, Gaile DP, Shepherd L, Nowak NJ, Conroy J, Barry WT, Geradts J. Differential copy number aberrations in novel candidate genes associated with progression from in situ to invasive ductal carcinoma of the breast. Genes Chromosomes Cancer. 2012;51:1067\u0026ndash;78. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/gcc.21991\u003c/span\u003e\u003cspan address=\"10.1002/gcc.21991\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePonten F, Jirstrom K, Uhlen M. The Human Protein Atlas\u0026ndash;a tool for pathology. J Pathol. 2008;216:387\u0026ndash;93. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/path.2440\u003c/span\u003e\u003cspan address=\"10.1002/path.2440\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZou R, Zhong X, Wang C, Sun H, Wang S, Lin L, Sun S, Tong C, Luo H, Gao P, Li Y, Zhou T, Li D, Cao L, Zhao Y. MDC1 Enhances Estrogen Receptor-mediated Transactivation and Contributes to Breast Cancer Suppression. Int J Biol Sci. 2015;11:992\u0026ndash;1005. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.7150/ijbs.10918\u003c/span\u003e\u003cspan address=\"10.7150/ijbs.10918\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSottnik JL, Bordeaux EK, Mehrotra S, Ferrara SE, Goodspeed AE, Costello JC, Sikora MJ. Mediator of DNA Damage Checkpoint 1 (MDC1) Is a Novel Estrogen Receptor Coregulator in Invasive Lobular Carcinoma of the Breast. Mol Cancer Res. 2021;19:1270\u0026ndash;82. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/1541-7786.MCR-21-0025\u003c/span\u003e\u003cspan address=\"10.1158/1541-7786.MCR-21-0025\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWolowczyk C, Neckmann U, Aure MR, Hall M, Johannessen B, Zhao S, Skotheim RI, Andersen SB, Zwiggelaar R, Steigedal TS, Lingjaerde OC, Sahlberg KK, Almaas E, Bjorkoy G. NRF2 drives an oxidative stress response predictive of breast cancer. Free Radic Biol Med. 2022;184:170\u0026ndash;84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.freeradbiomed.2022.03.029\u003c/span\u003e\u003cspan address=\"10.1016/j.freeradbiomed.2022.03.029\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHoogenboezem EN, Duvall CL. Harnessing albumin as a carrier for cancer therapies. Adv Drug Deliv Rev. 2018;130:73\u0026ndash;89. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.addr.2018.07.011\u003c/span\u003e\u003cspan address=\"10.1016/j.addr.2018.07.011\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePark I, Lee Y, Kim JH, Bae SJ, Ahn SG, Jeong J, Cha YJ. YAP1 Expression in HR\u0026thinsp;+\u0026thinsp;HER2- Breast Cancer: 21-Gene Recurrence Score Analysis and Public Dataset Validation. Cancers (Basel). 2023;15. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/cancers15205034\u003c/span\u003e\u003cspan address=\"10.3390/cancers15205034\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang Z, Tian Z, Song X, Zhang J. 
Membrane tension sensing molecule-FNBP1 is a prognostic biomarker related to immune infiltration in BRCA, LUAD and STAD. BMC Immunol. 2022;23:1. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1186/s12865-021-00475-z\u003c/span\u003e\u003cspan address=\"10.1186/s12865-021-00475-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu X, Zhan Y, Li X, Wei J, Santiago L, Daniels G, Deng F, Zhong X, Chiriboga L, Basch R, Xiong S, Dong Y, Zhang X, Lee P. Nuclear TBLR1 as an ER corepressor promotes cell proliferation, migration and invasion in breast and ovarian cancer. Am J Cancer Res. 2016;6:2351\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang N, Li L, Long Z, Du J, Li S, Yin H, Xie K, Wu Z, Chen Y, Volontovich D, Cheng H, Wang F. Are dietary factors involved in the association of CDH4 methylation and breast cancer risk? Br J Nutr. 2022;127:1868\u0026ndash;77. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1017/s0007114521002804\u003c/span\u003e\u003cspan address=\"10.1017/s0007114521002804\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong Y, Tian T, Fu X, Wang W, Li S, Shi T, Suo A, Ruan Z, Guo H, Yao Y. GATA6 is overexpressed in breast cancer and promotes breast cancer cell epithelial-mesenchymal transition by upregulating slug expression. Exp Mol Pathol. 2015;99:617\u0026ndash;27. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.yexmp.2015.10.005\u003c/span\u003e\u003cspan address=\"10.1016/j.yexmp.2015.10.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFarhat D, L\u0026eacute;on S, Ghayad SE, Gadot N, Icard P, Le Romancer M, Hussein N, Lincet H. 
Lipoic acid decreases breast cancer cell proliferation by inhibiting IGF-1R via furin downregulation. Br J Cancer. 2020;122:885\u0026ndash;94. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41416-020-0729-6\u003c/span\u003e\u003cspan address=\"10.1038/s41416-020-0729-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKong HK, Yoon S, Park JH. The regulatory mechanism of the LY6K gene expression in human breast cancer cells. J Biol Chem. 2012;287:38889\u0026ndash;900. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1074/jbc.M112.394270\u003c/span\u003e\u003cspan address=\"10.1074/jbc.M112.394270\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHsieh MJ, Yao YL, Lai IL, Yang WM. Transcriptional repression activity of PAX3 is modulated by competition between corepressor KAP1 and heterochromatin protein 1. Biochem Biophys Res Commun. 2006;349:573\u0026ndash;81. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.bbrc.2006.08.064\u003c/span\u003e\u003cspan address=\"10.1016/j.bbrc.2006.08.064\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKong HK, Park SJ, Kim YS, Kim KM, Lee HW, Kang HG, Woo YM, Park EY, Ko JY, Suzuki H, Chun KH, Song E, Jang KY, Park JH. Epigenetic activation of LY6K predicts the presence of metastasis and poor prognosis in breast carcinoma. Oncotarget. 2016;7:55677\u0026ndash;89. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.18632/oncotarget.10972\u003c/span\u003e\u003cspan address=\"10.18632/oncotarget.10972\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbolhassani A, Riazi GH, Azizi E, Amanpour S, Muhammadnejad S, Haddadi M, Zekri A, Shirkoohi R. FGF10: Type III Epithelial Mesenchymal Transition and Invasion in Breast Cancer Cell Lines. J Cancer. 2014;5:537\u0026ndash;47. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.7150/jca.7797\u003c/span\u003e\u003cspan address=\"10.7150/jca.7797\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eClayton NS, Grose RP. Emerging Roles of Fibroblast Growth Factor 10 in Cancer. Front Genet. 2018;9:499. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fgene.2018.00499\u003c/span\u003e\u003cspan address=\"10.3389/fgene.2018.00499\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang F, Liu YH, Dong SY, Ma RM, Bhandari A, Zhang XH, Wang OC. A novel long non-coding RNA FGF14-AS2 is correlated with progression and prognosis in breast cancer. Biochem Biophys Res Commun. 2016;470:479\u0026ndash;83. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.bbrc.2016.01.147\u003c/span\u003e\u003cspan address=\"10.1016/j.bbrc.2016.01.147\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJin Y, Zhang M, Duan R, Yang J, Yang Y, Wang J, Jiang C, Yao B, Li L, Yuan H, Zha X, Ma C. Long noncoding RNA FGF14-AS2 inhibits breast cancer metastasis by regulating the miR-370-3p/FGF14 axis. Cell Death Discov. 2020;6:103. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41420-020-00334-7\u003c/span\u003e\u003cspan address=\"10.1038/s41420-020-00334-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlvarez-D\u0026iacute;az S, Valle N, Garc\u0026iacute;a JM, Pe\u0026ntilde;a C, Freije JM, Quesada V, Astudillo A, Bonilla F, L\u0026oacute;pez-Ot\u0026iacute;n C, Mu\u0026ntilde;oz A. Cystatin D is a candidate tumor suppressor gene induced by vitamin D in human colon cancer cells. J Clin Invest. 2009;119:2343\u0026ndash;58. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1172/jci37205\u003c/span\u003e\u003cspan address=\"10.1172/jci37205\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eH\u0026uuml;nten S, Hermeking H. p53 directly activates cystatin D/CST5 to mediate mesenchymal-epithelial transition: a possible link to tumor suppression by vitamin D3. Oncotarget. 2015;6:15842\u0026ndash;56. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.18632/oncotarget.4683\u003c/span\u003e\u003cspan address=\"10.18632/oncotarget.4683\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNi J, Peng Y, Yang FL, Xi X, Huang XW, He C. Overexpression of CLEC3A promotes tumor progression and poor prognosis in breast invasive ductal cancer. Onco Targets Ther. 2018;11:3303\u0026ndash;12. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.2147/OTT.S161311\u003c/span\u003e\u003cspan address=\"10.2147/OTT.S161311\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMatsuda Y, Miura K, Yamane J, Shima H, Fujibuchi W, Ishida K, Fujishima F, Ohnuma S, Sasaki H, Nagao M, Tanaka N, Satoh K, Naitoh T, Unno M. SERPINI1 regulates epithelial-mesenchymal transition in an orthotopic implantation model of colorectal cancer. Cancer Sci. 2016;107:619\u0026ndash;28. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1111/cas.12909\u003c/span\u003e\u003cspan address=\"10.1111/cas.12909\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003evan de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, Torhorst J, Sauter G, Zuber M, Kochli OR, Mross F, Dieterich H, Seitz R, Ross D, Botstein D, Brown P. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol. 2002;161:1991\u0026ndash;6. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/S0002-9440(10)64476-8\u003c/span\u003e\u003cspan address=\"10.1016/S0002-9440(10)64476-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang CC, Bajikar SS, Jamal L, Atkins KA, Janes KA. A time- and matrix-dependent TGFBR3-JUND-KRT5 regulatory circuit in single breast epithelial cells and basal-like premalignancies. Nat Cell Biol. 2014;16:345\u0026ndash;56. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/ncb2930\u003c/span\u003e\u003cspan address=\"10.1038/ncb2930\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHanley CJ, Henriet E, Sirka OK, Thomas GJ, Ewald AJ. Tumor-Resident Stromal Cells Promote Breast Cancer Invasion through Regulation of the Basal Phenotype. Mol Cancer Res. 2020;18:1615\u0026ndash;22. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1158/1541-7786.Mcr-20-0334\u003c/span\u003e\u003cspan address=\"10.1158/1541-7786.Mcr-20-0334\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBilandzic M, Rainczuk A, Green E, Fairweather N, Jobling TW, Plebanski M, Stephens AN. Keratin-14 (KRT14) Positive Leader Cells Mediate Mesothelial Clearance and Invasion by Ovarian Cancer Cells. Cancers (Basel). 2019;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/cancers11091228\u003c/span\u003e\u003cspan address=\"10.3390/cancers11091228\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTang S, Liu W, Yong L, Liu D, Lin X, Huang Y, Wang H, Cai F. Reduced Expression of KRT17 Predicts Poor Prognosis in HER2(high) Breast Cancer. Biomolecules. 2022;12. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3390/biom12091183\u003c/span\u003e\u003cspan address=\"10.3390/biom12091183\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJinesh GG, Flores ER, Brohl AS. Chromosome 19 miRNA cluster and CEBPB expression specifically mark and potentially drive triple negative breast cancers. PLoS ONE. 2018;13:e0206008. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0206008\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0206008\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKozawa K, Sekai M, Ohba K, Ito S, Sako H, Maruyama T, Kakeno M, Shirai T, Kuromiya K, Kamasaki T, Kohashi K, Tanaka S, Ishikawa S, Sato N, Asano S, Suzuki H, Tanimura N, Mukai Y, Gotoh N, Tanino M, Tanaka S, Natsuga K, Soga T, Nakamura T, Yabuta Y, Saitou M, Ito T, Matsuura K, Tsunoda M, Kikumori T, Iida T, Mizutani Y, Miyai Y, Kaibuchi K, Enomoto A, Fujita Y. The CD44/COL17A1 pathway promotes the formation of multilayered, transformed epithelia. Curr Biol. 2021;31:3086\u0026ndash;97.e7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.cub.2021.04.078\u003c/span\u003e\u003cspan address=\"10.1016/j.cub.2021.04.078\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYodsurang V, Tanikawa C, Miyamoto T, Lo PHY, Hirata M, Matsuda K. Identification of a novel p53 target, COL17A1, that inhibits breast cancer cell migration and invasion. Oncotarget. 2017;8:55790\u0026ndash;803. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.18632/oncotarget.18433\u003c/span\u003e\u003cspan address=\"10.18632/oncotarget.18433\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAn Q, Liu T, Wang MY, Yang YJ, Zhang ZD, Liu ZJ, Yang B. KRT7 promotes epithelial\u0026ndash;mesenchymal transition in ovarian cancer via the TGF\u0026ndash;beta/Smad2/3 signaling pathway. Oncol Rep. 2021;45:481\u0026ndash;92. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3892/or.2020.7886\u003c/span\u003e\u003cspan address=\"10.3892/or.2020.7886\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Z, Wang H, Jin Y, Zhou J, Chu C, Tang F, Zou L, Zou Q. KRT15 in early breast cancer screening and correlation with HER2 positivity, pathological grade and N stage. Biomark Med. 2023;17:553\u0026ndash;62. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.2217/bmm-2023-0130\u003c/span\u003e\u003cspan address=\"10.2217/bmm-2023-0130\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhong P, Shu R, Wu H, Liu Z, Shen X, Hu Y. Low KRT15 expression is associated with poor prognosis in patients with breast invasive carcinoma. Exp Ther Med. 2021;21:305. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3892/etm.2021.9736\u003c/span\u003e\u003cspan address=\"10.3892/etm.2021.9736\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKuivaniemi H, Tromp G. Type III collagen (COL3A1): Gene and protein structure, tissue distribution, and associated diseases. Gene. 2019;707:151\u0026ndash;71. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.gene.2019.05.003\u003c/span\u003e\u003cspan address=\"10.1016/j.gene.2019.05.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang F, Lin L, Li X, Wen R, Zhang X. Silencing of COL3A1 represses proliferation, migration, invasion, and immune escape of triple negative breast cancer cells via down-regulating PD-L1 expression. Cell Biol Int. 2022;46:1959\u0026ndash;69. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1002/cbin.11875\u003c/span\u003e\u003cspan address=\"10.1002/cbin.11875\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZeng C, Lin M, Jin Y, Zhang J. Identification of Key Genes Associated with Brain Metastasis from Breast Cancer: A Bioinformatics Analysis. Med Sci Monit. 2022;28:e935071. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.12659/MSM.935071\u003c/span\u003e\u003cspan address=\"10.12659/MSM.935071\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoto R, Nakamura Y, Takami T, Sanke T, Tozuka Z. Quantitative LC-MS/MS Analysis of Proteins Involved in Metastasis of Breast Cancer. PLoS ONE. 2015;10:e0130760. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0130760\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0130760\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu ZH, Zhang YJ, Yue JX, Zhou T. Comprehensive Analysis of the Expression and Prognosis for SFRPs in Breast Carcinoma. Cell Transpl. 2020;29:963689720962479. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1177/0963689720962479\u003c/span\u003e\u003cspan address=\"10.1177/0963689720962479\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSiamakpour-Reihani S, Caster J, Bandhu Nepal D, Courtwright A, Hilliard E, Usary J, Ketelsen D, Darr D, Shen XJ, Patterson C, Klauber-Demore N. The role of calcineurin/NFAT in SFRP2 induced angiogenesis\u0026ndash;a rationale for breast cancer treatment with the calcineurin inhibitor tacrolimus. 
PLoS ONE. 2011;6:e20412. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0020412\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0020412\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Breast cancer, Hormone receptor-positive, Prognostic prediction, Interaction, Nomogram","lastPublishedDoi":"10.21203/rs.3.rs-4394836/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4394836/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003ePurpose\u003c/h2\u003e \u003cp\u003eIn this study, a prognostic model was constructed for HR-positive HER2-negative (HR+/HER2\u0026ndash;) and node-negative breast cancer by integrating clinical and transcriptional biomarkers, with a particular focus on exploring both main effects and gene-gene (G \u0026times; G) interactions.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eUnivariate and multivariate Cox regression were used to analyze three independent trans-ethnic cohorts with a total of 2180 samples. Independent prognostic factors were used to construct a prediction model. The Model was validated by ROC curves, calibration curve and decision curve analysis (DCA).The molecular basis of the Model was illustrated by integrating bulk-tumor and single-cell RNAseq datasets.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eOur findings revealed that a combination of clinical and transcriptional factors can improve the accuracy of prognostic models for HR+/HER2\u0026ndash; and node-negative breast cancer. 
The Model achieved satisfactory discrimination, with the area under the curve (AUC) ranging from 0.65 (Metabric, 10-year survival) to 0.88 (GSE96058, 3-year survival).\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eThis research provides a powerful tool for predicting outcomes in HR+/HER2\u0026ndash; and node-negative breast cancer, offering initial insights into the molecular mechanisms that can guide future investigations.\u003c/p\u003e","manuscriptTitle":"Development of a Prognostic Model for HR-positive HER2-negative and Node-negative Breast Cancer: Integrating Clinical and Transcriptional Biomarkers","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-05-24 20:06:23","doi":"10.21203/rs.3.rs-4394836/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"bf58f5aa-3c9f-4585-bf59-261ff4d30c78","owner":[],"postedDate":"May 24th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-08-30T09:53:32+00:00","versionOfRecord":[],"versionCreatedAt":"2024-05-24 20:06:23","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4394836","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4394836","identity":"rs-4394836","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}