Transcript and protein signatures derived from shared molecular interactions across cancers are associated with mortality

doi:10.21203/rs.3.rs-3994390/v1

2024 · doi:10.21203/rs.3.rs-3994390/v1

preprint OA: closed


Research Square · Research Article

Yelin Zhao, Xinxiu Li, Joseph Loscalzo, Martin Smelik, Oleg Sysoev, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-3994390/v1

This work is licensed under a CC BY 4.0 License.

Status: Published. Journal publication published 11 May, 2024, in Journal of Translational Medicine. Preprint version 1.

Abstract

Background

Characterization of shared cancer mechanisms has been proposed to improve therapy strategies and prognosis. Here, we aimed to identify shared cell-cell interactions (CCIs) within the tumor microenvironment across multiple solid cancers and assess their association with cancer mortality.
Methods

CCIs of each cancer were identified by NicheNet analysis of single-cell RNA sequencing data from breast, colon, liver, lung, and ovarian cancers. These CCIs were used to construct a shared multi-cellular tumor model (shMCTM) representing common CCIs across cancers. A gene signature was identified from the shMCTM and tested at the mRNA and protein levels in two large independent cohorts: The Cancer Genome Atlas (TCGA, 9,185 tumor samples and 727 controls across 22 cancers) and UK biobank (UKBB, 10,384 cancer patients and 5,063 controls with proteomics data across 17 cancers). Cox proportional hazards models were used to evaluate the association of the signature with 10-year all-cause mortality, including sex-specific analysis.

Results

A shMCTM was derived from five individual cancers. A shared gene signature was extracted from this shMCTM and the most prominent regulatory cell type, matrix cancer-associated fibroblast (mCAF). The signature exhibited significant expression changes in multiple cancers compared to controls at both mRNA and protein levels in two independent cohorts. Importantly, it was significantly associated with mortality in cancer patients in both cohorts. The highest hazard ratios were observed for brain cancer in TCGA (HR [95%CI] = 6.90[4.64–10.25]) and ovarian cancer in UKBB (5.53[2.08–14.67]). Sex-specific analysis revealed distinct risks, with a higher mortality risk associated with the protein signature score in males (2.41[1.97–2.96]) compared to females (1.84[1.44–2.37]).

Conclusion

We identified a gene signature from a comprehensive shMCTM representing common CCIs across different cancers and revealed the regulatory role of mCAF in the tumor microenvironment. The pathogenic relevance of the gene signature was supported by differential expression and association with mortality at both mRNA and protein levels in two independent cohorts.
Keywords: Cell-cell interactions, Cancer-associated fibroblast, Single-cell RNA sequencing, Prioritization, Pan-cancer, Mortality

Introduction

According to the WHO, global cancer cases will increase by more than 75% by 2050, significantly increasing mortality 1. This increase involves highly diverse cancers in both women and men. Could this indicate that, despite this diversity, there are shared mechanisms across cancers? If so, are those mechanisms important for pathogenesis and mortality? Previous studies of highly diverse complex diseases have shown shared mechanisms despite great complexity and heterogeneity. In support of their pathogenic importance, those mechanisms are highly interconnected and enriched for disease-associated genetic variants, so that their combined effects are large 2. The existence of shared genes across cancers is supported by a previous study of deconvoluted bulk RNA sequencing data from 20 solid tumor types, which found converging molecular interactions between cancer and stromal cells in the tumor microenvironment (TME) 3. Another study focused on cell-cell interactions (CCIs) between fibroblast subtypes and tumor cells in six different cancers and found associations with the response to immunotherapy 4. These pioneering studies supported the idea that there may be shared ligand-target interactions between specific cell types across cancers. If so, such interactions could have important implications: since carcinogenesis involves multiple cell types, and not only malignant ones, CCIs could constitute a higher-order representation of the complex and heterogeneous changes in all those cell types 5. This implication leads to an unanswered question: Are there shared CCIs when all cell types in different tumors are analyzed? If so, would it be possible to systematically organize those into one comprehensive model, which could be used to prioritize the most important interactions?
Previous single-cell RNA sequencing (scRNA-seq) studies of inflammatory diseases have used such CCIs to construct multicellular models. In those models, the upstream regulatory (UR) ligands could be ranked and prioritized based on the relative numbers of cell types and downstream target (DS) genes. The disease relevance of the models and URs was validated by functional studies 6–8. Here, we translated the same principles to construct multicellular tumor models (MCTMs) of different cancers based on scRNA-seq data. We next hypothesized that those MCTMs could be used to construct a shared MCTM (shMCTM) from which a shared gene signature could be prioritized. This did result in the identification of an shMCTM and a gene signature, whose pathogenic relevance was validated by differential mRNA and protein expression, as well as association with mortality in independent data from The Cancer Genome Atlas (TCGA, 9,185 tumor tissues, 727 control tissues from cancers of 22 different tissue origins) and from the UK Biobank cohort (UKBB, 10,384 cancer patients, 5,063 controls with proteomics data of cancers from 17 different tissue origins).

Methods

Data source

ScRNA-seq

ScRNA-seq count matrix files of five common cancers were downloaded from Gene Expression Omnibus (GEO) 14 or ArrayExpress 15: breast cancer (ER-positive breast cancer, GSE161529 9), colon cancer (colorectal cancer, GSE144735 10), liver cancer (intrahepatic cholangiocarcinoma, GSE138709 11), lung cancer (lung adenocarcinoma, GSE123902 12), and ovarian cancer (E-MTAB-8107 13) (Additional file 1 S1). These cancer datasets were selected because they are among the most prevalent solid tumors, and high-quality scRNA-seq data of untreated primary tumors and control samples (adjacent normal tissue, except for the breast cancer dataset, where normal tissues were from mammary gland cells of non-breast cancer patients) were available. All retrieved scRNA-seq studies were performed using 10× Genomics’ scRNA-seq technology.
UK biobank

Proteomic data from the UKBB cohort include the plasma proteome of 54,306 unique UKBB participants from the UK Biobank Pharma Proteomics Project 16. The expression of 2,911 plasma proteins (the second release) was measured using the antibody-based Proximity Extension Assay by Olink and was provided as Normalized Protein Expression (NPX) 16. NPX is a relative quantification unit related to protein concentration; it was background-corrected, log2-transformed, and normalized within all samples 17. The identification of cancer cases and healthy controls was performed using the International Classification of Diseases (ICD) coding system, specifically ICD-9 and ICD-10 (Additional file 1 S2). A detailed description of the UKBB proteomic data and the sample size is provided in the Supplemental Methods and Additional file 1 S3. Proteins with more than 20% missing data were removed, and those with less than 20% missing data were imputed using the K-nearest neighbor (KNN) method with 10 nearest neighbors.

TCGA and GTEx

The mRNA transcripts per million (TPM) expression profiles of tissues from 22 organ sites were obtained from TCGA and GTEx through UCSC Xena 18. The gene expression was transformed to log2(TPM + 0.001) for both TCGA and GTEx samples by the same RNA-seq pipeline. The normal samples of GTEx were combined with the normal samples in TCGA according to organ site; the merging strategy and sample size for each dataset are listed in Additional file 1 S4.

Ethics

The UKBB study received ethical approval from the National Information Governance Board for Health and Social Care and the National Health Service Northwest Multi-Center Research Ethics Committee, and all participants provided written consent. This research has been conducted under approved application number 102162. All participants in TCGA gave consent, and the data are openly accessible to researchers.
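As a rough illustration of the UKBB preprocessing described above (dropping proteins with more than 20% missing values, then KNN imputation with 10 neighbors), the following Python sketch may help. It is not the authors' code: the function name `filter_and_impute`, the distance measure (root-mean-square difference over proteins observed in both samples), and the handling of a protein at exactly 20% missingness (kept here) are all assumptions made for the example.

```python
import math

def filter_and_impute(rows, max_missing=0.20, k=10):
    """rows: list of samples, each a list of NPX values (None = missing).
    Returns (indices of kept proteins, imputed matrix).
    Hypothetical helper; thresholds follow the text, details are assumed."""
    n = len(rows)
    n_cols = len(rows[0])
    # Keep proteins (columns) whose missing fraction does not exceed the limit.
    keep = [j for j in range(n_cols)
            if sum(r[j] is None for r in rows) / n <= max_missing]
    data = [[r[j] for j in keep] for r in rows]

    def distance(a, b):
        # RMS difference over proteins observed in both samples (assumption).
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other samples in which this protein was observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(data)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            # Impute with the mean of the k nearest donors, if any exist.
            imputed[i][j] = sum(nearest) / len(nearest) if nearest else None
    return keep, imputed
```

In practice an established implementation would normally be preferred; the sketch only makes the two-step logic (filter by missingness, then borrow values from similar samples) explicit.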
ScRNA-seq processing

The downloaded count matrices of each cancer scRNA-seq data set were processed and quality controlled using the R package Seurat v4.0.4 20. For each sample, the low-quality cells were filtered out based on mitochondrial RNA percentage, the range of read counts, and gene coverage (Supplement Methods). Each cancer was analyzed independently. Single-cell profiles from different samples within the same cancer were integrated using the Seurat 4 anchor-based integration function IntegrateData. Cell clusters were identified using the default FindClusters function. Cell types were annotated by known cell type markers detailed in Additional file 2. Model-based Analysis of Single-cell Transcriptomics (MAST) 21 was used for the identification of differentially expressed genes (DEGs) between tumor tissues and normal tissues within the same cell type. DEGs with adjusted p-value 0.25 were used for downstream analysis. Fibroblasts from each cancer were extracted and integrated into one dataset using the Seurat IntegrateData function to adjust for differences between cancers. The clustering resolution was 0.1 with seed 42. Marker genes of each cluster were identified using the FindAllMarkers function with default settings.

Construction of MCTM and shMCTM

To infer cell-cell interactions (CCIs) of all cell type pairs, the R package NicheNet (v1.1.0) was applied 22. This analysis was performed separately for each cancer. In brief, the cell type and DEG list for each cell type served as input for NicheNet. CCIs were then identified between each pair of cell types using the default analysis setup. For each identified interaction, potential ligands and target genes in the source cell type were determined using the predict_ligand_activities and get_weighted_ligand_target_links functions with default settings. The predicted interactions for all cell type pairs in each cancer were used to construct a MCTM.
To identify common CCIs across all five cancers, a shMCTM was created as follows:

1) URs found in all cancers were identified.
2) For each cell type, the log2FC of each UR from step 1 was compared among all five cancers. A UR was considered a shared UR (shUR) if it exhibited the same direction of expression change in one cell type in at least four cancers. The cell type of this UR was recorded and used for shMCTM construction.
3) Subsequently, DSs of shURs were identified; we defined shared DSs (shDSs) using the same criteria applied for identifying shURs.
4) The genes (shURs and shDSs) and their corresponding cell types (shMCTM cell types) were used for constructing the shMCTM.

Prioritization of shURs

To systematically prioritize shURs, shURs were clustered based on the number of interactions with each downstream cell type in the shMCTM. Euclidean distance was used for clustering, and clusters were cut into two main subclusters according to the dendrogram. The cluster with the larger number of interactions in all cell types was considered the top cluster, and shURs in this cluster were considered top shURs.

Genome-wide association studies (GWAS) gene enrichment analyses and disease relevance

GWAS gene enrichment analysis (Fisher’s exact test, two-sided) of shMCTM genes was performed for each cancer separately. All DEGs within that cancer were used as the background. The GWAS-associated genes were downloaded from DisGeNET in November 2021 23. The “diseaseName” and GWAS genes for each cancer are listed in Additional file 2 S2. The disease relevance was computed using the DisGeNET “disgenet2r” R package version 0.99.2. To perform disease enrichment of genes included in the shMCTM, the default settings of the disease_enrichment function were used. The p-values resulting from the multiple Fisher tests were corrected for multiple testing using the False Discovery Rate (FDR) method.
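Two of the rule-based steps above can be sketched in a few lines of Python: the shUR direction criterion (same direction of change in one cell type in at least four of five cancers) and a two-sided Fisher's exact test on a 2×2 enrichment table. This is an illustrative sketch, not the authors' R implementation. The function names, the input layout, and the gene and cancer labels in the usage note are hypothetical. The odds ratio returned is the simple sample odds ratio, which can differ from the conditional estimate that some statistical packages report.

```python
from math import comb

def shared_urs(log2fc, min_cancers=4):
    """log2fc: {(ur, cell_type): {cancer: log2FC}}, assumed to be restricted
    beforehand to URs found in all cancers (step 1).
    Returns {(ur, cell_type): 'up' or 'down'} for URs meeting the criterion."""
    shared = {}
    for key, by_cancer in log2fc.items():
        up = sum(v > 0 for v in by_cancer.values())
        down = sum(v < 0 for v in by_cancer.values())
        if up >= min_cancers:
            shared[key] = "up"
        elif down >= min_cancers:
            shared[key] = "down"
    return shared

def fisher_two_sided(a, b, c, d):
    """2x2 table [[a, b], [c, d]], e.g. rows = in shMCTM / background only,
    columns = GWAS gene / not GWAS gene.
    Returns (sample odds ratio, two-sided p-value)."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-7))
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, min(1.0, p)
```

For example, a hypothetical entry `("UR_A", "fibroblast")` with positive log2FC in four cancers and a negative value in the fifth would be returned as `"up"`, while an entry that is up in three cancers and down in two would be dropped.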
KEGG enrichment

KEGG enrichment was performed using the R clusterProfiler package (v3.18.1) 24. The KEGG enrichment for marker genes of each cancer-associated fibroblast (CAF) subcluster was performed using the function enrichKEGG. The function compareCluster was used for plotting the top KEGG terms of shDSs of the shURs expressed in fibroblasts.

Definition of all-cause mortality and survival time

All-cause mortality was defined as death from any cause during the observation period (10 years after cancer diagnosis). The survival time was defined as the period from initial cancer diagnosis until the date of death from any cause, loss to follow-up, or the end of the follow-up period (30 November 2022 in UKBB) 25.

Gene set and protein set scoring

The gene score of signature genes was calculated for cancer patients in TCGA. The default “gsva” method in the GSVA R package was used for calculating these scores 26. The corresponding protein score was calculated for the UKBB cancer patients using the average NPX of proteins encoded by signature genes. The gene score and protein score were divided into high and low groups using their average value as the cutoff.

Statistics

Differential expression of mRNAs and proteins was tested between tumor tissue vs. normal tissue or cancer patient vs. healthy control in TCGA or UKBB, respectively. The differential expression of each mRNA or protein was assessed for each individual cancer using the two-sided Wilcoxon test, and the difference in expression was presented as log2FC. The survival analysis was performed in all cancer patients pooled together and in each cancer individually. The Cox proportional hazards model was used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) for the associations of each mRNA or protein (and the mRNA signature score or protein signature score) with the 10-year mortality of patients diagnosed with cancer. This association was also tested in each sex subgroup.
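The protein score and the high/low grouping described under "Gene set and protein set scoring" reduce to two averages, which the following Python sketch makes explicit. It is a minimal illustration under stated assumptions: the function names and input layout are hypothetical, a patient's score is averaged over whichever signature proteins are present for that patient, and a score exactly equal to the cutoff is assigned to the low group. The GSVA-based mRNA score is not reproduced here.

```python
from statistics import mean

def protein_scores(npx, signature):
    """npx: {patient_id: {protein: NPX}}; signature: list of protein names.
    Returns {patient_id: mean NPX over the signature proteins available}."""
    return {
        pid: mean(values[p] for p in signature if p in values)
        for pid, values in npx.items()
    }

def split_high_low(scores):
    """Divide patients into 'high' and 'low' groups using the average
    score across all patients as the cutoff (ties go to 'low', an assumption)."""
    cutoff = mean(list(scores.values()))
    return {pid: ("high" if s > cutoff else "low")
            for pid, s in scores.items()}
```

The resulting group labels are the kind of binary covariate that could then be entered into a Cox model or used to stratify Kaplan-Meier curves.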
The Cox models were adjusted for basic confounding factors when appropriate (UKBB: sex, age at diagnosis, time difference from diagnosis to sampling, and cancer type; TCGA: sex, age at diagnosis, and cancer type). Sex was excluded from the model when performing survival analysis in each sex subgroup, and cancer type was excluded from the model when testing in each individual cancer. Cancers with fewer than 20 death events were excluded when testing the association in each individual cancer. Kaplan-Meier survival curves were plotted for the combination of signature score level (high or low) and sex (female or male) using the ggsurvplot function and compared using the two-sided log-rank test. All statistical analyses were performed using R (version 4.0.4). The FDR method was applied for multiple comparisons, and an adjusted p-value < 0.05 indicated a significant difference.

Results

Overall design

Our hypotheses were that 1) there were shared cell-cell interactions (CCIs) across cancers and that 2) these interactions were important for pathogenesis and mortality. To test the first hypothesis, we analyzed single-cell datasets from different cancers and compared the CCIs between them. This resulted in a shMCTM that represented shared cellular interactions across different cancers, from which we identified a gene signature (Fig. 1A and B). For the second hypothesis, we assessed the signature at both mRNA and protein levels, subsequently referred to as the mRNA signature and protein signature, in two extensive independent cohorts (TCGA and UKBB). We first compared the expression differences of signature mRNAs between tumor and control. Next, we tested the association of the mRNA/protein signatures with 10-year all-cause mortality in cancer patients (Fig. 1C).

Analyses of scRNA-seq data from five different cancers show shared differentially expressed genes

Given the importance of multiple local cell types other than tumor cells (e.g. stromal cells and immune cells), we analyzed scRNA-seq data from five common cancers, namely breast, colon, liver, lung, and ovarian cancers. Following quality control procedures, a total of 281,302 cells were analyzed and clustered (Fig. 2A). For each cancer, 12–15 distinct cell types were identified, with the expression of known cell type marker genes illustrated in Fig. 2B. The proportions of cell types in the tumor microenvironment differed greatly between the five cancers (Fig. 2C). Epithelial cells and fibroblasts predominated in breast, ovary, colon, and liver cancers, while immune cells were more prevalent in lung cancer. Across all five cancers, the proportion of epithelial cells increased in tumor tissue. Liver cancer exhibited a significant increase in epithelial cells but a decreased proportion of immune cells. These changes in cellular proportions were associated with thousands of DEGs between tumor and normal tissue, which also varied greatly between cell types and cancers (Additional file 3). Nevertheless, we identified 1,153 DEGs that were shared across these five cancers (Fig. 2D). This led us to ask if these DEGs were associated with shared interactions between the cell types in the cancers.

Multi-cellular tumor models show dispersion of pathogenic mechanisms

To search for shared interactions, we first constructed MCTMs of each of the five cancers. The MCTMs showed directed molecular interactions between URs in any cell type and DSs in other cell types (Additional figure S1A). The median (range) number of URs per cancer was 203 (155–232), with 74 URs found in all five cancers (Additional figure S1B). The median (range) number of DSs per cancer was 1,641 (1,279–2,135), with 577 shared across all cancers (Additional figure S1C). Rather than a hierarchical organization in which most interactions originated from cancer cells, the interactions formed highly interconnected networks (Additional figure S2 and Additional file 4).
Most cell types in the MCTMs were enriched with cancer-related traits identified by GWAS (Additional figure S3). This suggested that pathogenic mechanisms were distributed across cell types rather than originating solely from cancer cells.

Construction of a shared MCTM

To identify potential shared interactions across cancers, we explored the possibility of constructing a shMCTM from the five MCTMs. To characterize interactions, we identified URs and DSs that were shared across the MCTMs (shURs and shDSs). The criteria for shURs and shDSs were that they should 1) be URs or DSs in all five cancer MCTMs, and 2) have the same direction of expression change in the same cell type in at least four cancers (Fig. 1B). A total of 117 shMCTM genes (30 shURs and 98 shDSs) located in shMCTM cell types (fibroblasts, cancer cells, macrophages, endothelial cells, pericytes and T cells) were identified and used to construct the shMCTM (Fig. 3A and Additional file 5). In support of the pathogenic relevance of the shMCTM, the shMCTM genes (shURs and shDSs) exhibited enrichment for GWAS-associated genes in the five studied cancers, with odds ratios ranging from 2.51 to 3.81 (adjusted p-value < 0.05, except for 0.06 in ovarian cancer, Additional file 5). Additionally, the shMCTM genes were found to be associated with malignant and fibrotic diseases, as indicated by the DisGeNET database (Fig. 3B). These observations underscored the pathogenic importance not only of malignant cells but also of fibroblasts relative to other cell types in the TME.

Prioritization of a shared gene signature based on the shMCTM

To prioritize a shared gene signature based on the shMCTM, we focused on the shURs that regulated the largest number of shDSs. Briefly, we clustered the shURs based on their total number of interactions towards each downstream cell type in the shMCTM.
This identified two main clusters, of which the one with the most interactions included eight shURs (COL1A1, FN1, SPP1, COL4A1, COL18A1, PLAU, CLEC11A, and MDK) (Fig. 3C). Interestingly, seven out of eight shURs were more highly expressed in fibroblasts compared to other cell types in the shMCTM (Fig. 3D). This led us to subtype fibroblasts to search for more genes to include in the gene signature.

Prioritization of genes for the gene signature based on a subtype of CAF

To search for and prioritize subtypes of fibroblasts, fibroblasts from the five cancers were re-integrated into one dataset. A total of 36,601 fibroblast cells were clustered into seven subpopulations (Fig. 4A, B and C), of which four clusters (subclusters 0, 4, 5 and 6) were mainly enriched in tumor tissues, whereas subclusters 1, 2, and 3 were mainly present in normal tissues. All seven subclusters expressed canonical fibroblast markers such as ACTA2 (α-SMA), while each subcluster displayed distinct transcriptomic markers (Fig. 4D and Additional Figure S4) and highly diverse functions (Additional Note 1 and Additional Figure S5). Instead of being dispersed across different fibroblast subtypes, most shURs and shDSs were highly expressed in CAF_C0 (Fig. 4E). CAF_C0 represented the largest CAF cluster and exhibited characteristics consistent with previously reported matrix CAFs (mCAF) 4, 27, showing elevated expression of extracellular matrix (ECM) remodeling genes. Therefore, we subsequently refer to it as mCAF (Fig. 4D and F). In further support of the importance of mCAF, its shURs regulated shDSs in all other cell types. As commented in the Discussion, KEGG pathway analysis of those shDSs in epithelial cells revealed a wide variety of pathways relevant for malignant transformation (Fig. 4G, Additional Figure S5 and Supplement Methods). Therefore, we hypothesized that genes in mCAF could be relevant to add to the shared gene signature.
For this purpose, we prioritized genes 1) with the top 10 highest log2FC between mCAF and other CAF clusters and 2) that were DEGs between tumor and normal in mCAF. This analysis resulted in eight candidate biomarkers, in addition to the eight shURs, namely MMP11, CTHRC1, COL1A2, COL3A1, SPARC, COL5A2, POSTN and COL11A1. In total, 16 signature genes were identified (eight shURs and eight mCAF marker genes).

The general pathogenic relevance of the 16 signature genes and their mRNAs and protein products is supported by analyses of two large cohorts

To assess the general pathogenic relevance of these 16 signature genes, we hypothesized that the signature at both mRNA and protein levels (subsequently referred to as the mRNA signature and protein signature) should 1) be differentially expressed in tumor tissue/cancer plasma compared to normal tissue/healthy plasma and 2) be associated with the outcome of cancer patients, namely all-cause mortality within ten years. Differential expression of the mRNA signature in tumor tissue vs. normal tissue was tested in bulk RNA sequencing data of tissue samples in TCGA (9,185 patients and 727 controls from 22 cancers). The protein signature was tested using the plasma proteomics data from the UKBB (10,384 patients and 5,063 controls from 19 cancers; 12 proteins were detected) (Additional file 1 S2 to S4). These signature mRNAs/proteins were evaluated for each cancer type in both cohorts. We found that they were generally significantly differentially expressed in all cancer types at both the tissue mRNA and plasma protein levels. Several mRNAs/proteins showed similar expression changes in tumors from both cohorts, for example CTHRC1, MDK and SPP1, while some had more variation (e.g., COL18A1) (Fig. 5 and Additional file 6 S1 and S2). Nevertheless, the similar differential expression patterns of these signature mRNAs/proteins across different cancers suggested that this signature could represent molecular mechanisms of clinical importance.
To examine this, we next analyzed whether the signature was associated with perhaps the most important clinical trait, mortality.

Signature genes in tumor tissues were associated with mortality in multiple cancers

The association of these signature mRNAs in tumor tissues with 10-year all-cause mortality after cancer diagnosis was evaluated using the Cox proportional hazards model in cancer patients from TCGA. The signature mRNAs showed significant associations with mortality in all cancer patients, with HRs ranging from 1.06 to 1.2. The mRNA signature score was associated with a higher risk of death (HR[95%CI] = 1.69[1.55–1.85]) than any single mRNA (Fig. 6A and Additional file 6 S3). Similar results were found in each sex subgroup (Fig. 6A and B). When looking at each individual cancer, the mRNA signature score was associated with mortality in 11 cancers. Particularly strong associations were found in cancers of the brain (HR[95%CI] = 6.9[4.64–10.25]), mesothelioma (HR[95%CI] = 3.13[1.87–5.24]) and uterus (HR[95%CI] = 3.02[1.61–5.66]) (Fig. 6C and Additional file 6 S4).

Signature proteins in plasma were associated with mortality in multiple cancers

The association of 12 signature proteins with survival was analyzed in plasma from cancer patients from the UKBB. Eight plasma proteins were associated with mortality, with COL18A1 showing the highest HR in all cancer patients (HR[95%CI] = 1.72[1.92–2.50]). Compared to each individual protein, the protein score of these nine proteins was associated with a greater risk of death in all cancer patients (HR[95%CI] = 2.16[1.84–2.53]) (Fig. 7A and Additional file 6 S5). In the sex subgroups, more proteins were associated with mortality in males than in females (8 vs. 4 proteins), while a higher protein score correlated with a higher risk of death in both female and male cancer patients (Fig. 7A and B). Notably, females overall showed a lower risk of death than males (Fig. 7B).
The protein score of these nine proteins was associated with mortality in nine cancer types. The HR ranged from 1.47 to 5.53, with the highest HR found for ovarian cancer (HR[95%CI] = 5.53[2.08–14.67]), followed by prostate cancer (4.63[2.80–7.68]) and lymphoma (HR[95%CI] = 4.62[2.43–8.8]) (Fig. 7C and Additional file 6 S6).

Discussion

Despite the great complexity and heterogeneity of cancers, this study showed molecular changes that were shared across multiple cancers. The pathogenic and clinical importance of those changes was supported by enrichment of GWAS genes and association with mortality. The study was based on scRNA-seq, which allows the characterization of molecular changes in all cell types in a tumor. This may be advantageous because increasing evidence points to the pathogenic importance of multiple cell types in the TME 28, 29. This complexity raises the problem of how best to systematically organize and prioritize mechanisms across cancers. Previous scRNA-seq studies of complex diseases, which are also multicellular, have shown that these problems can be addressed by constructing multicellular network models based on connecting URs in any cell type with their DSs in other cell types, and prioritizing the URs with the largest effects on DSs 6, 7. We applied these principles to scRNA-seq data from five cancers. In summary, we found that despite great cellular and molecular differences among the analyzed cancers, their MCTMs showed overarching similarities. These included pathogenic URs and DSs being dispersed across cell types, rather than only originating from cancer cells. A similar organization was found in the shMCTM, which showed a higher-order representation of the complex changes. In support of a shared multicellular pathogenesis across cancers, the shMCTM was enriched for GWAS genes and pathways associated with malignant transformation.
Since shURs regulated the shDSs, the shURs would have a superior role relative to shDSs. The shURs that regulated more shDSs and cells were prioritized and considered as signature genes that could have important pathogenic roles. Notably, these prioritized shURs exhibited elevated expression levels in fibroblasts compared to other cell types in the shMCTM. This agreed with the previous finding of a hierarchy of cell-cell interactions dominated by fibroblast-to-macrophage interactions in breast cancer 30. Moreover, we found that CAFs ranked above multiple other cell types in this hierarchy in five different tumors, supporting the crucial role of CAFs in the TME and tumor progression 4, 31, 32. This led us to subtype CAF cells into clusters, of which four were more common in cancer than in normal tissues. We found that most shURs and shDSs were mainly expressed in the largest cluster (CAF_C0). This cluster is in agreement with previously reported mCAF, which shows high expression of ECM remodeling genes and pro-angiogenic effects in the TME 4, 27. Interestingly, shURs located in mCAF regulated shDSs in all other cell types. KEGG pathway analysis of those shDSs revealed a wide variety of pathways related to cancer, vascular function, coagulation, immunity, and metabolism. In support of a direct tumorigenic role of the fibroblast shURs, their shDSs in epithelial cells were enriched for cancer-related pathways, namely proteoglycan and AGE-RAGE signaling, as well as pathways associated with many specific cancers. This finding suggested a key regulatory role of mCAF, which was mainly associated with ECM according to KEGG pathway enrichment analysis. Therefore, we hypothesized that mCAF could be used to add genes to the shared gene signature. This resulted in a gene signature with eight genes from mCAF and eight shURs. Recently, CCIs and shared mechanisms have been discussed for their potential use in relation to clinical outcomes in cancer 29, 33.
In this study, we hypothesized that this signature was associated with the mortality of cancer patients and tested the hypothesis in two independent cohorts (TCGA and UKBB). The expression of signature mRNAs and proteins showed significant differences between tumor and normal samples in both cohorts, underscoring their pathogenic relevance. Additionally, our analysis revealed that each individual signature mRNA/protein was correlated with all-cause mortality in cancer patients from both cohorts. When evaluating the overall associations of the mRNA and protein signature scores within specific cancer types, we observed moderate to high associations with mortality in both datasets. The signature genes that belong to the collagen family (e.g. COL18A1 and COL4A1) showed the highest association with increased risk of death. This is in line with previous findings implicating members of the collagen family as prognostic markers for cancers 34–36. Moreover, CTHRC1 also exhibited a high association with death risk at both mRNA and protein levels. This agrees with previous findings showing its association with tumor progression, metastasis and prognosis in several cancer types 34, 37–39. While both mRNA and protein scores were linked to all-cause mortality, the association differed between TCGA and UKBB. The association of the signature score with mortality was similar between females and males in TCGA, but the score was associated with a notably greater risk of death in males than in females in the UKBB dataset. Furthermore, the cancers with the highest associations in TCGA were brain cancer, mesothelioma and uterine cancer, while the highest associations in UKBB were for ovarian cancer, prostate cancer and lymphoma, indicating differences between tissue mRNA and blood proteins. Nevertheless, the consistent significant association of both mRNA and protein scores with mortality underscores the pathogenic relevance of the signature.
Nevertheless, this study has limitations. Our analysis is limited to mRNAs and proteins, while multiple other types of molecules have been shown to play important pathogenic roles. Another limitation is that the scRNA-seq data were derived from a small number of patients with solid tumors. However, the relevance of the signature genes was supported by analyses of their associations with mortality in multiple other cancers, including non-solid tumors such as leukemia, in two independent cohorts. We propose that further studies are warranted to examine the signature genes in other cancers, as well as their associations with disease-relevant traits. In conclusion, our findings support the pathogenic and clinical relevance of molecular interactions that are shared across cancers. We have made the methods and data underlying this study freely available for basic and translational studies (https://github.com/SDTC-CPMed/shMCTM_cancer_mortality).
Abbreviations
CAF: Cancer-associated fibroblast; CCI: Cell-cell interaction; CI: Confidence interval; DEG: Differentially expressed gene; DS: Downstream target; ECM: Extracellular matrix; FDR: False discovery rate; GEO: Gene Expression Omnibus; GTEx: Genotype-Tissue Expression; GWAS: Genome-wide association study; HR: Hazard ratio; ICD: International Classification of Diseases; KNN: K-nearest neighbor; Log2FC: Log2-transformed fold change; MAST: Model-based Analysis of Single-cell Transcriptomics; MCTM: Multicellular tumor model; NPX: Normalized Protein Expression; scRNA-seq: Single-cell RNA sequencing; shMCTM: Shared multicellular tumor model; shDS: Shared downstream target gene; shUR: Shared upstream regulator gene; TCGA: The Cancer Genome Atlas; TME: Tumor microenvironment; TPM: Transcripts per million; UKBB: UK Biobank; UMAP: Uniform manifold approximation and projection; UR: Upstream regulator.
Declarations
Ethics approval and consent to participate
UK Biobank has approval from the North West Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB). This approval means that researchers do not require separate ethical clearance and can operate under the RTB approval (there are certain exceptions to this which are set out in the Access Procedures, such as re-contact applications).
Consent for publication
Not applicable.
Availability of data and materials
The scRNA-seq data used in this study are publicly available on GEO, with accession numbers GSE161529, GSE144735, GSE138709 and GSE123902, and on ArrayExpress, with accession number E-MTAB-8107. The metadata of all the scRNA-seq datasets, the URs and DSs and their interactions in each dataset, and the code generated during this study are publicly available at https://github.com/SDTC-CPMed/shMCTM_cancer_mortality.
Competing interests
MB is the scientific founder of Mavatar, Inc. JL is co-scientific founder of Scipher Medicine, Inc. The other authors declare that they have no competing interests.
Funding
This work was supported by the Swedish Cancer Society and the Swedish Research Council.
Authors' contributions
YZ had a primary role in the analyses, interpretation of the data, and manuscript writing. XL, YW, MS and OS contributed to those analyses, and JL contributed translational expertise. DA, FM and MB supervised these studies. All authors contributed to the writing of the manuscript. The authors read and approved the final manuscript. DA, FM and MB contributed equally to this work.
Acknowledgements
Not applicable.
References
1. Global cancer burden growing, amidst mounting need for services. https://www.who.int/news/item/01-02-2024-global-cancer-burden-growing--amidst-mounting-need-for-services (2024).
2. Barrenas, F. et al. Highly interconnected genes in disease-specific networks are enriched for disease-associated polymorphisms. Genome Biol 13, R46, doi:10.1186/gb-2012-13-6-r46 (2012).
3. Ghoshdastider, U. et al. Pan-Cancer Analysis of Ligand–Receptor Cross-talk in the Tumor Microenvironment. Cancer Research 81, 1802-1812, doi:10.1158/0008-5472.Can-20-2352 (2021).
4. Ma, C. et al. Pan-cancer spatially resolved single-cell analysis reveals the crosstalk between cancer-associated fibroblasts and tumor microenvironment. Molecular Cancer 22, 170, doi:10.1186/s12943-023-01876-x (2023).
5. Shalek, A. K. & Benson, M. Single-cell analyses to tailor treatments. Sci Transl Med 9, doi:10.1126/scitranslmed.aan4730 (2017).
6. Gawel, D. R. et al. A validated single-cell-based strategy to identify diagnostic and therapeutic targets in complex diseases. Genome Med 11, 47, doi:10.1186/s13073-019-0657-3 (2019).
7. Li, X. et al. A dynamic single cell-based framework for digital twins to prioritize disease genes and drug targets. Genome Medicine 14, 48, doi:10.1186/s13073-022-01048-4 (2022).
8. Lilja, S. et al. Multi-organ single-cell analysis reveals an on/off switch system with potential for personalized treatment of immunological diseases. Cell Rep Med 4, 100956, doi:10.1016/j.xcrm.2023.100956 (2023).
9. Pal, B. et al. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J, e107333, doi:10.15252/embj.2020107333 (2021).
10. Lee, H. O. et al. Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat Genet 52, 594-603, doi:10.1038/s41588-020-0636-z (2020).
11. Zhang, M. et al. Single-cell transcriptomic architecture and intercellular crosstalk of human intrahepatic cholangiocarcinoma. J Hepatol 73, 1118-1130, doi:10.1016/j.jhep.2020.05.039 (2020).
12. Laughney, A. M. et al. Regenerative lineages and immune-mediated pruning in lung cancer metastasis. Nat Med 26, 259-269, doi:10.1038/s41591-019-0750-6 (2020).
13. Qian, J. et al. A pan-cancer blueprint of the heterogeneous tumor microenvironment revealed by single-cell profiling. Cell Res 30, 745-762, doi:10.1038/s41422-020-0355-0 (2020).
14. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/.
15. ArrayExpress. https://www.ebi.ac.uk/arrayexpress/.
16. Sun, B. B. et al. Plasma proteomic associations with genetics and health in the UK Biobank. Nature 622, 329-338, doi:10.1038/s41586-023-06592-6 (2023).
17. UKB - Olink Explore - Data Normalization Strategy. https://biobank.ctsu.ox.ac.uk/crystal/ukb/docs/Olink_1536_B0_to_B7_FAQ.pdf.
18. UCSC Xena: Cohort: TCGA TARGET GTEx. https://xenabrowser.net/datapages/.
19. Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol 38, 675-678, doi:10.1038/s41587-020-0546-8 (2020).
20. Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902 e1821, doi:10.1016/j.cell.2019.05.031 (2019).
21. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biology 16, 278, doi:10.1186/s13059-015-0844-5 (2015).
22. Browaeys, R., Saelens, W. & Saeys, Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 17, 159-162, doi:10.1038/s41592-019-0667-5 (2020).
23. Pinero, J. et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res 48, D845-D855, doi:10.1093/nar/gkz1021 (2020).
24. Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation 2, doi:10.1016/j.xinn.2021.100141 (2021).
25. Biobank, U. UK biobank data providers and dates of data availability. https://biobank.ndph.ox.ac.uk/showcase/exinfo.cgi?src=Data_providers_and_dates (2023).
26. Hänzelmann, S., Castelo, R. & Guinney, J. GSVA: gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics 14, 7, doi:10.1186/1471-2105-14-7 (2013).
27. Liu, C. et al. Single-cell dissection of cellular and molecular features underlying human cervical squamous cell carcinoma initiation and progression. Science Advances 9, eadd8977, doi:10.1126/sciadv.add8977 (2023).
28. Peltanova, B., Raudenska, M. & Masarik, M. Effect of tumor microenvironment on pathogenesis of the head and neck squamous cell carcinoma: a systematic review. Molecular Cancer 18, 63, doi:10.1186/s12943-019-0983-5 (2019).
29. Chen, L.-x. et al. Cell–cell communications shape tumor microenvironment and predict clinical outcomes in clear cell renal carcinoma. Journal of Translational Medicine 21, 113, doi:10.1186/s12967-022-03858-x (2023).
30. Mayer, S. et al. The tumor microenvironment shows a hierarchy of cell-cell interactions dominated by fibroblasts. Nature Communications 14, 5810, doi:10.1038/s41467-023-41518-w (2023).
31. Pelon, F. et al. Cancer-associated fibroblast heterogeneity in axillary lymph nodes drives metastases in breast cancer through complementary mechanisms. Nature Communications 11, 404, doi:10.1038/s41467-019-14134-w (2020).
32. Czekay, R. P., Cheon, D. J., Samarakoon, R., Kutz, S. M. & Higgins, P. J. Cancer-Associated Fibroblasts: Mechanisms of Tumor Progression and Novel Therapeutic Targets. Cancers (Basel) 14, doi:10.3390/cancers14051231 (2022).
33. Weiss, F., Lauffenburger, D. & Friedl, P. Towards targeting of shared mechanisms of cancer metastasis and therapy resistance. Nat Rev Cancer 22, 157-173, doi:10.1038/s41568-021-00427-0 (2022).
34. Ni, S. et al. CTHRC1 overexpression predicts poor survival and enhances epithelial-mesenchymal transition in colorectal cancer. Cancer Med 7, 5643-5654, doi:10.1002/cam4.1807 (2018).
35. Li, X., Li, Z., Gu, S. & Zhao, X. A pan-cancer analysis of collagen VI family on prognosis, tumor microenvironment, and its potential therapeutic effect. BMC Bioinformatics 23, 390, doi:10.1186/s12859-022-04951-0 (2022).
36. Necula, L. et al. Collagen Family as Promising Biomarkers and Therapeutic Targets in Cancer. Int J Mol Sci 23, doi:10.3390/ijms232012415 (2022).
37. Sial, N. et al. CTHRC1 expression is a novel shared diagnostic and prognostic biomarker of survival in six different human cancer subtypes. Sci Rep 11, 19873, doi:10.1038/s41598-021-99321-w (2021).
38. Li, Y. et al. Single-cell landscape reveals active cell subtypes and their interaction in the tumor microenvironment of gastric cancer. Theranostics 12, 3818-3833, doi:10.7150/thno.71833 (2022).
39. Chen, Y. et al. High CTHRC1 expression may be closely associated with angiogenesis and indicates poor prognosis in lung adenocarcinoma patients. Cancer Cell International 19, 318, doi:10.1186/s12935-019-1041-5 (2019).
Supplementary Files
Additional file 1.xlsx. Description of scRNA, UKBB and TCGA datasets.
Additional file 2.xlsx. Known cell type marker genes and GWAS genes.
Additional file 3.xlsx. DEGs between tumor vs. normal per cell type per cancer.
Additional file 4.xlsx. MCTMs for each cancer.
Additional file 5.xlsx. shMCTM and GWAS enrichment.
Additional file 6.xlsx. Original results for TCGA and UKBB figures.
Additional Figures and Notes (AdditionalFiguresandNotes.docx).
Additional Methods (SupplementMethods.docx).
Journal editorial history: First submitted to journal 26 Feb, 2024; Editor assigned by journal 28 Feb, 2024; Reviewers invited by journal 04 Mar, 2024; Reviewers agreed at journal 04 Mar, 2024; Journal publication published 11 May, 2024 (Journal of Translational Medicine).
Authors and affiliations: Yelin Zhao (Karolinska Institutet); Xinxiu Li (Karolinska Institutet); Joseph Loscalzo (Harvard Medical School Department of Medicine); Martin Smelik (Karolinska Institutet); Oleg Sysoev (Linkopings universitet); Yunzhang Wang (Karolinska Institutet, Department of Clinical Sciences, Danderyd Hospital); Firoj Mahmud AKM (Karolinska Institutet); Dina Mansour Aly (Karolinska Institutet); Mikael Benson (Karolinska Institutet, corresponding author).
Preprint DOI: https://doi.org/10.21203/rs.3.rs-3994390/v1. Published version: https://doi.org/10.1186/s12967-024-05268-7 (published 2024-05-11).
B5, top shURs and the top marker genes of the predominant cluster were combined to a gene signature with the concordant mRNA and protein signatures. \u003cstrong\u003eC)\u003c/strong\u003e The pathogenic relevance of the mRNA/protein signatures were tested in The Cancer Genome Atlas (TCGA) and UK biobank (UKBB). C1, description of the two testing cohorts. C2, mRNA/protein expression differences between tumor vs. normal in both cohorts. C3, the associations of mRNA/protein signatures with 10-year all-cause mortality in cancer patients from both cohorts.\u003c/p\u003e","description":"","filename":"floatimage1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/3f330cefe05d7d68d8e5cdce.jpg"},{"id":52106007,"identity":"afb86498-0c95-4d3a-a516-5be1dc83a705","added_by":"auto","created_at":"2024-03-06 19:35:15","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":762083,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCellular and molecular heterogeneity of cancer. A) \u003c/strong\u003eClustering of each cancer, colored by cell type. \u003cstrong\u003eB)\u003c/strong\u003e Expression of known marker genes of each cell type in each cancer. \u003cstrong\u003eC)\u003c/strong\u003e Proportion of cell types in each cancer. \u003cstrong\u003eD)\u003c/strong\u003e The overlap of DEGs of each cancer.\u003c/p\u003e","description":"","filename":"floatimage2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/a72a79808a181299f7595b09.jpg"},{"id":52105266,"identity":"b05e3f4d-1dc4-4fff-b9bf-5069d1e97eaa","added_by":"auto","created_at":"2024-03-06 19:27:15","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":1199331,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAnalyses of a shMCTM. 
A)\u003c/strong\u003e A shMCTM that represented the shared CCIs across five cancers (the construction principles are outlined in Figure 1). Each edge shows the shUR and its predicted, directed interaction towards its downstream cell type; the thickness of each edge represents the number of shDSs; the color of the edges indicates the cellular origin of each interaction.\u003cstrong\u003e B)\u003c/strong\u003e The disease relevance of the shMCTM genes was supported by pathway analyses in DisGeNET. Size indicates the number of genes that were enriched in each term and color indicates the significance level after FDR adjustment. \u003cstrong\u003eC) \u003c/strong\u003eshURs were clustered based on their predicted downstream effects. Red spectra show the total number of interactions of each shUR towards its shDSs in each downstream cell type. White indicates no downstream genes. shURs with larger downstream effects were selected for a gene signature representing the shMCTM. \u003cstrong\u003eD) \u003c/strong\u003eExpression of prioritized shURs in cell types from tumor and normal tissues. 
The dot size indicates the percent expression in each cell type and the color scale indicates the expression level.\u003c/p\u003e","description":"","filename":"floatimage3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/4732243ce374fb62957458c0.jpg"},{"id":52105270,"identity":"d8b6448a-b0d9-4752-aa25-9476fcd48e7b","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":2094556,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eClustering and gene expression of fibroblasts from five cancers.\u003c/strong\u003e \u003cstrong\u003eA-B) \u003c/strong\u003eFibroblasts from all cancers were integrated; the Uniform Manifold Approximation and Projection (UMAP) shows clusters in resolution 0.1, segregated by \u003cstrong\u003eA) \u003c/strong\u003ecancer type and \u003cstrong\u003eB) \u003c/strong\u003etissue type. \u003cstrong\u003eC)\u003c/strong\u003e Proportions of each cluster in each cancer. \u003cstrong\u003eD)\u003c/strong\u003e Heatmap showing the top marker genes for each cluster. \u003cstrong\u003eE) \u003c/strong\u003eScaled expression of fibroblast shURs and shDSs in each subcluster. The fibroblast shURs and shDSs were those that had higher expression in fibroblast compared to other cell types (log2FC \u0026gt; 0.25, adjusted p-value \u0026lt; 0.05); color scale shows the expression level while the size of dots represents the percent of cells in this cluster that expressed this gene. \u003cstrong\u003eF) \u003c/strong\u003eKEGG enrichment of CAF_C0 marker genes\u003cstrong\u003e. G) \u003c/strong\u003eKEGG enrichment of shDSs of fibroblast shURs. The size of each dot represents the ratio of genes mapped to each term. The number on the x-axis indicates the number of shDSs in each cell type. 
T, tumor tissue; N, adjacent normal tissue.\u003c/p\u003e","description":"","filename":"floatimage4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/0f91d848230d2b744936e3f6.jpg"},{"id":52105269,"identity":"fa187308-de1a-4a73-a215-3a532d0b5a01","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":1228003,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eThe expression difference of mRNA and protein signatures in two large cohorts.\u003c/strong\u003e \u003cstrong\u003eA) \u003c/strong\u003eThe expression difference of each signature mRNA between tumor sample and normal samples in TCGA. \u003cstrong\u003eB) \u003c/strong\u003eThe expression difference of each signature protein in plasma between cancer patients and healthy controls in UKBB. Blue represents lower expression in tumor and red represents higher expression in tumor, filled dot means statistically significant (adjusted p-value \u0026lt; 0.05) while open circle means not statistically significant (adjusted p-value \u0026gt; 0.05). Blank area in the UKBB panel means the protein is not detected.\u003c/p\u003e","description":"","filename":"floatimage5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/e5bc3b5b9fda311500eb0acc.jpg"},{"id":52105274,"identity":"548413eb-51d4-406b-8881-5a38a80bd078","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":1672881,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAnalysis of the risk association of signature genes with 10-year mortality in TCGA cancer patients.\u003c/strong\u003e \u003cstrong\u003eA)\u003c/strong\u003e Cox regression of each signature mRNAs and the mRNA score with mortality in all cancer patients, or each sex subgroup from the TCGA cohort. 
\u003cstrong\u003eB)\u003c/strong\u003e Cancer patients were divided into low and high score groups based on the average score, and the Kaplan-Meier curve shows each sex and gene score combination. \u003cstrong\u003eC)\u003c/strong\u003e Cox regression of the gene score with mortality in each cancer type. Note: * FDR adjusted p-value \u0026lt; 0.05, ** \u0026lt; 0.01, *** \u0026lt; 0.001, **** \u0026lt; 0.0001.\u003c/p\u003e","description":"","filename":"floatimage6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/8b0cada8f908b47078650428.jpg"},{"id":52106620,"identity":"dc7bfc3b-b703-4df8-9f79-411528c915ad","added_by":"auto","created_at":"2024-03-06 19:43:16","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":1425744,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAnalysis of the risk association of signature genes with 10-year mortality in UKBB cancer patients.\u003c/strong\u003e \u003cstrong\u003eA)\u003c/strong\u003e Cox regression of each signature protein and the protein score with mortality in all cancer patients, or each sex subgroup from the UKBB cohort. \u003cstrong\u003eB)\u003c/strong\u003e \u0026nbsp;Cancer patients were divided into low and high score groups based on the average score, and the Kaplan-Meier curve shows each sex and gene score combination. \u003cstrong\u003eC)\u003c/strong\u003eCox regression of the gene score with mortality in each cancer type. 
Note: * FDR adjusted p-value \u0026lt; 0.05, ** \u0026lt; 0.01, *** \u0026lt; 0.001, **** \u0026lt; 0.0001.\u003c/p\u003e","description":"","filename":"floatimage7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/151b013ece1b47d98dbcb21c.jpg"},{"id":56488567,"identity":"c71da94d-e6a2-42b7-a1f4-74d66c4b1377","added_by":"auto","created_at":"2024-05-14 21:32:40","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2063231,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/4bb2b608-90ac-4536-9fe9-5ee047035ff9.pdf"},{"id":52105267,"identity":"f1752284-0139-4878-8f67-a0e393635a77","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":36740,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 1.xlsx. Description of scRNA, UKBB and TCGA datasets.\u003c/p\u003e","description":"","filename":"Additionalfile1.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/c145f8db1b2cd0cf1d2e0265.xlsx"},{"id":52105265,"identity":"9fbb3b38-992e-42d6-a222-d3f3aa9edac7","added_by":"auto","created_at":"2024-03-06 19:27:15","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":69521,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 2.xlsx. Known cell type marker genes and GWAS genes.\u003c/p\u003e","description":"","filename":"Additionalfile2.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/62448783a036773aaa3d295a.xlsx"},{"id":52105271,"identity":"e0d6e717-c6cd-4d28-92bd-984415f05efd","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":4043747,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 3.xlsx. 
DEGs between tumor vs. normal per cell type per cancer.\u003c/p\u003e","description":"","filename":"Additionalfile3.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/cfcd1456d446f2ce8e5b1801.xlsx"},{"id":52105277,"identity":"07c04c0b-f0dc-4d26-9b12-c5849fc7f297","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":1577028,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 4.xlsx. MCTMs for each cancer.\u003c/p\u003e","description":"","filename":"Additionalfile4.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/a7003b07c36780ffdb278b91.xlsx"},{"id":52106008,"identity":"0a6c0a47-f7b3-4e4c-852f-f632145b2967","added_by":"auto","created_at":"2024-03-06 19:35:16","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":50534,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 5.xlsx. shMCTM and GWAS enrichment.\u003c/p\u003e","description":"","filename":"Additionalfile5.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/433697de485ac77ac7e70f2e.xlsx"},{"id":52105278,"identity":"f92c97f2-fb16-449f-89b8-55f6ebd0ee4b","added_by":"auto","created_at":"2024-03-06 19:27:17","extension":"xlsx","order_by":6,"title":"","display":"","copyAsset":false,"role":"supplement","size":103931,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional file 6.xlsx. 
Original results for TCGA and UKBB figures.\u003c/p\u003e","description":"","filename":"Additionalfile6.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/b34e1e7bd44b0e0f5e56e864.xlsx"},{"id":52105275,"identity":"f9fdc871-3e5c-4eb6-b26f-6407821ffeb4","added_by":"auto","created_at":"2024-03-06 19:27:16","extension":"docx","order_by":7,"title":"","display":"","copyAsset":false,"role":"supplement","size":2936622,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional Figures and Notes.\u003c/p\u003e","description":"","filename":"AdditionalFiguresandNotes.docx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/c6ca5eebe7551b8eab188798.docx"},{"id":52106010,"identity":"daeb3bbd-8461-4a76-8ee9-54cde0e974cb","added_by":"auto","created_at":"2024-03-06 19:35:16","extension":"docx","order_by":8,"title":"","display":"","copyAsset":false,"role":"supplement","size":125128,"visible":true,"origin":"","legend":"\u003cp\u003eAdditional Methods.\u003c/p\u003e","description":"","filename":"SupplementMethods.docx","url":"https://assets-eu.researchsquare.com/files/rs-3994390/v1/058ea712824b7b5a994a0eeb.docx"}],"financialInterests":"","formattedTitle":"Transcript and protein signatures derived from shared molecular interactions across cancers are associated with mortality","fulltext":[{"header":"Introduction","content":"\u003cp\u003eAccording to the WHO global cancer cases will increase by more than 75% by 2050, significantly increasing mortality\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e. This increase involves highly diverse cancers in both women and men. Could this indicate that, despite this diversity, there are shared mechanisms across cancers? If so, are those mechanisms important for pathogenesis and mortality?\u003c/p\u003e \u003cp\u003ePrevious studies of highly diverse complex disease have shown shared mechanisms despite great complexity and heterogeneity. 
In support of their pathogenic importance those mechanisms are highly interconnected, and enriched for disease-associated genetic variants, so that their combined effects are large \u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. The existence of shared genes across cancers is supported by a previous study of deconvoluted bulk RNA sequencing data from 20 solid tumor types, which found converging molecular interactions between cancer and stromal cells in the tumor microenvironment (TME) \u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. Another study focused cell-cell interactions (CCIs) between fibroblast subtypes and tumor cells in six different cancers and found associations with the response to immunotherapy \u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThese pioneering studies supported the idea that there may be shared ligand-target interactions between specific cell types across cancers. If so, such interactions could have important implications: Since carcinogenesis involves multiple cell types, and not only malignant ones, CCIs could constitute a higher order representation of the complex and heterogeneous changes in all those cell types \u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. This implication leads to an unanswered question: Are there shared CCIs when all cell types in different tumors are analyzed? If so, would it be possible to systematically organize those into one comprehensive model, which could be used to prioritize the most important interactions? Previous single-cell RNA sequencing (scRNA-seq) studies of inflammatory diseases have used such CCIs to construct multicellular models. 
In those models, the upstream regulatory (UR) ligands could be ranked and prioritized based on the relative numbers of cell types and downstream target (DS) genes. The disease relevance of the models and URs was validated by functional studies \u003csup\u003e\u003cspan additionalcitationids=\"CR7\" citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. Here, we applied the same principles to construct multicellular tumor models (MCTMs) of different cancers based on scRNA-seq data. We next hypothesized that those MCTMs could be used to construct a shared MCTM (shMCTM) from which a shared gene signature could be prioritized. This resulted in the identification of an shMCTM and a gene signature, whose pathogenic relevance was validated by differential mRNA and protein expression, as well as association with mortality in independent data from The Cancer Genome Atlas (TCGA, 9,185 tumor tissues, 727 control tissues from cancers of 22 different tissue origins) and from the UK Biobank cohort (UKBB, 10,384 cancer patients, 5,063 controls with proteomics data of cancers from 17 different tissue origins).\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eData source\u003c/h2\u003e \u003cp\u003e \u003cb\u003eScRNA-seq\u003c/b\u003e ScRNA-seq count matrix files of five common cancers, namely breast cancer (ER-positive breast cancer, GSE161529 \u003csup\u003e9\u003c/sup\u003e), colon cancer (colorectal cancer, GSE144735 \u003csup\u003e10\u003c/sup\u003e), liver cancer (intrahepatic cholangiocarcinoma, GSE138709 \u003csup\u003e11\u003c/sup\u003e), lung cancer (lung adenocarcinoma, GSE123902 \u003csup\u003e12\u003c/sup\u003e), and ovarian cancer (E-MTAB-8107 \u003csup\u003e13\u003c/sup\u003e), were obtained from Gene Expression Omnibus (GEO) \u003csup\u003e\u003cspan citationid=\"CR14\" 
class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e or ArrayExpress \u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e (Additional file 1 S1). These cancers were selected because they are among the most prevalent solid tumors, and high-quality scRNA-seq data of untreated primary tumors and control samples (adjacent normal tissue, except for the breast cancer dataset, where normal tissues were from mammary gland cells of non-breast cancer patients) were available. All retrieved scRNA-seq studies were performed using 10\u0026times; Genomics\u0026rsquo; scRNA-seq technology.\u003c/p\u003e \u003cp\u003e \u003cb\u003eUK Biobank\u003c/b\u003e Proteomic data from the UKBB cohort include the plasma proteome of 54,306 unique UKBB participants from the UK Biobank Pharma Proteomics Project \u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. The expression of 2,911 plasma proteins (the second release) was measured using the antibody-based Proximity Extension Assay by Olink and was provided as Normalized Protein Expression (NPX) \u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. NPX is a relative quantification unit related to protein concentration; it was background-corrected, log2-transformed, and normalized within all samples \u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. The identification of cancer cases and healthy controls was performed using the International Classification of Diseases (ICD) coding system, specifically ICD-9 and ICD-10 (Additional file 1 S2). A detailed description of the UKBB proteomic data and the sample size is provided in the Supplemental Methods and Additional file 1 S3. 
Proteins with more than 20% missing data were removed, and those with less than 20% missing data were imputed using the K-nearest neighbor (KNN) method with 10 nearest neighbors.\u003c/p\u003e \u003cp\u003e \u003cb\u003eTCGA and GTEx\u003c/b\u003e The mRNA transcripts per million (TPM) expression profiles of tissues from 22 organ sites were obtained from TCGA and GTEx through UCSC Xena \u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. The gene expression was transformed to log2(TPM\u0026thinsp;+\u0026thinsp;0.001) for both TCGA and GTEx samples by the same RNA-seq pipeline. Normal samples from GTEx were combined with normal samples from TCGA according to organ site; the merging strategy and sample size for each dataset are listed in Additional file 1 S4.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eEthics\u003c/h2\u003e \u003cp\u003eThe UKBB study received ethical approval from the National Information Governance Board for Health and Social Care and the National Health Service Northwest Multi-Center Research Ethics Committee, and all participants provided written consent. This research has been conducted under approved application number 102162. All participants in TCGA provided consent, and the data are openly accessible to researchers.\u003c/p\u003e \u003cdiv id=\"Sec5\" class=\"Section3\"\u003e \u003ch2\u003eScRNA-seq processing\u003c/h2\u003e \u003cp\u003eThe downloaded count matrices of each cancer scRNA-seq data set were processed and quality controlled using the R package Seurat v4.0.4 \u003csup\u003e20\u003c/sup\u003e. For each sample, the low-quality cells were filtered out based on mitochondrial RNA percentage, the range of read counts, and gene coverage (Supplement Methods). Each cancer was analyzed independently. 
Single-cell profiles from different samples within the same cancer were integrated using the Seurat v4 anchor-based integration function \u003cem\u003eIntegrateData\u003c/em\u003e. Cell clusters were identified using the default \u003cem\u003eFindClusters\u003c/em\u003e function. Cell types were annotated by known cell type markers detailed in Additional file 2. Model-based Analysis of Single-cell Transcriptomics (MAST) \u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e was used for the identification of differentially expressed genes (DEGs) between tumor tissues and normal tissues within the same cell type. DEGs with adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 and absolute log2 transformed fold change (log2FC)\u0026thinsp;\u0026gt;\u0026thinsp;0.25 were used for downstream analysis.\u003c/p\u003e \u003cp\u003eFibroblasts from each cancer were extracted and integrated into one dataset using the Seurat \u003cem\u003eIntegrateData\u003c/em\u003e function to adjust for differences between cancers. The clustering resolution was 0.1 with seed 42. Marker genes of each cluster were identified using the \u003cem\u003eFindAllMarkers\u003c/em\u003e function with default settings.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section3\"\u003e \u003ch2\u003eConstruction of MCTM and shMCTM\u003c/h2\u003e \u003cp\u003eTo infer cell-cell interactions (CCIs) of all cell type pairs, the R package NicheNet (v1.1.0) was applied \u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. This analysis was performed separately for each cancer. In brief, the cell type labels and the DEG list for each cell type served as input for NicheNet. CCIs were then identified between each pair of cell types using the default analysis setup. 
For each identified interaction, potential ligands and target genes in the source cell type were determined using the \u003cem\u003epredict_ligand_activities\u003c/em\u003e and \u003cem\u003eget_weighted_ligand_target_links\u003c/em\u003e functions with default settings. The predicted interactions for all cell type pairs in each cancer were used to construct a MCTM.\u003c/p\u003e \u003cp\u003eTo identify common CCIs across all five cancers, a shMCTM was created as follows: 1) URs found in all cancers were identified. 2) For each cell type, the log2FC of each UR from step 1 was compared among all five cancers. A UR was considered a shared UR (shUR) if it exhibited the same direction of expression change in one cell type in at least four cancers. The cell type of this UR was recorded and used for shMCTM construction. 3) Subsequently, DSs of shURs were identified; we defined shared DSs (shDSs) using the same criteria applied for identifying shURs. 4) The genes (shURs and shDSs) and their corresponding cell types (shMCTM cell types) were used for constructing the shMCTM.\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003ePrioritization of shURs\u003c/h2\u003e \u003cp\u003eTo systematically prioritize shURs, shURs were clustered based on the number of interactions with each downstream cell type in the shMCTM. Euclidean distance was used for clustering and clusters were cut into two main subclusters according to the dendrogram. 
The cluster with a larger number of interactions in all cell types was considered the top cluster, and shURs in this cluster were considered top shURs.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eGenome-wide association studies (GWAS) gene enrichment analyses and disease relevance\u003c/h2\u003e \u003cp\u003eGWAS gene enrichment analysis (Fisher\u0026rsquo;s exact test, two-sided) of shMCTM genes was performed for each cancer separately. All DEGs within that cancer were used as a background. The GWAS-associated genes were downloaded from DisGeNET in November 2021 \u003csup\u003e23\u003c/sup\u003e. The \u0026ldquo;diseaseName\u0026rdquo; and GWAS genes for each cancer are listed in Additional file 2 S2.\u003c/p\u003e \u003cp\u003eThe disease relevance was computed using the DisGeNET \u0026ldquo;disgenet2r\u0026rdquo; R package version 0.99.2. To perform disease enrichment of genes included in the shMCTM, the default settings of the \u003cem\u003edisease_enrichment\u003c/em\u003e function were used. The p-values resulting from the multiple Fisher tests were corrected for multiple testing using the False Discovery Rate (FDR) method.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eKEGG enrichment\u003c/h2\u003e \u003cp\u003eKEGG enrichment was performed using the R clusterProfiler package (v3.18.1) \u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. The KEGG enrichment for marker genes of each cancer-associated fibroblast (CAF) subcluster was performed using the function \u003cem\u003eenrichKEGG\u003c/em\u003e. 
The function \u003cem\u003ecompareCluster\u003c/em\u003e was used to plot the top KEGG terms of shDSs of shURs expressed in fibroblasts.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eDefinition of all-cause mortality and survival time\u003c/h2\u003e \u003cp\u003eAll-cause mortality was defined as death from any cause during the observation period (10 years after cancer diagnosis). The survival time was defined as the period from initial cancer diagnosis until the date of death from any cause, loss to follow-up or the end of the follow-up period (30 November 2022 in UKBB) \u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eGene set and protein set scoring\u003c/h2\u003e \u003cp\u003eThe gene score of signature genes was calculated for cancer patients in TCGA. The default \u0026ldquo;gsva\u0026rdquo; method in the GSVA R package was used for calculating these scores \u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e. The corresponding protein score was calculated for the UKBB cancer patients using the average NPX of proteins encoded by signature genes. The gene score and protein score were divided into high and low groups using their average value as cutoff.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eStatistics\u003c/h2\u003e \u003cp\u003eDifferential expression of mRNAs and proteins was tested between tumor tissue vs. normal tissue or cancer patient vs. healthy control in TCGA or UKBB, respectively. 
The differential expression of each mRNA or protein was assessed for each individual cancer using the two-sided Wilcoxon test, and the difference of expression was presented as log2FC.\u003c/p\u003e \u003cp\u003eThe survival analysis was performed in all cancer patients pooled together and in each cancer individually. The Cox proportional hazards model was used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) for the associations of each mRNA or protein (and the mRNA signature score or protein signature score) with the 10-year mortality of patients diagnosed with cancer. This association was also tested in each sex subgroup. The Cox models were adjusted for basic confounding factors when appropriate (UKBB: sex, age at diagnosis, time difference from diagnosis to sampling, and cancer type; TCGA: sex, age at diagnosis, and cancer type). Sex was excluded from the model when performing survival analysis in each sex subgroup, and cancer type was excluded from the model when testing in each individual cancer. Cancers with fewer than 20 death events were excluded when testing the association in each individual cancer. Kaplan-Meier survival curves were plotted for the combination of signature score level (high or low) and sex (female or male) using the \u003cem\u003eggsurvplot\u003c/em\u003e function and compared using the two-sided log-rank test. All statistical analyses were performed using R (version 4.0.4). The FDR method was applied for multiple comparisons, and an adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 indicated a significant difference.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eOverall design\u003c/h2\u003e \u003cp\u003eOur hypotheses were that 1) there were shared cell-cell interactions (CCIs) across cancers and that 2) these interactions were important for pathogenesis and mortality. 
To test the first hypothesis, we analyzed single-cell datasets from different cancers and compared the CCIs between them. This resulted in an shMCTM that represented shared cellular interactions across different cancers, from which we identified a gene signature (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA and B). For the second hypothesis, we assessed the signature at both mRNA and protein levels, subsequently referred to as the mRNA signature and protein signature, in two extensive independent cohorts (TCGA and UKBB). We first compared the expression differences of signature mRNAs between tumor and control. Next, we tested the association of the mRNA/protein signatures with 10-year all-cause mortality in cancer patients (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eAnalyses of scRNA-seq data from five different cancers show shared differentially expressed genes\u003c/h2\u003e \u003cp\u003eGiven the importance of multiple local cell types other than tumor cells (e.g. stromal cells and immune cells), we analyzed scRNA-seq data from five common cancers, namely breast, colon, liver, lung, and ovarian cancers. Following quality control procedures, a total of 281,302 cells were analyzed and clustered (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). For each cancer, 12\u0026ndash;15 distinct cell types were identified, with the expression of known cell type marker genes illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB. The proportions of cell types in the tumor microenvironment differed greatly between the five cancers (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eC). 
Epithelial cells and fibroblasts predominated in breast, ovary, colon, and liver cancers, while immune cells were more prevalent in lung cancer. Across all five cancers, the proportion of epithelial cells increased in tumor tissue. Liver cancer exhibited a significant increase in epithelial cells but a decreased proportion of immune cells. These changes in cellular proportions were associated with thousands of DEGs between tumor and normal tissue, which also varied greatly between cell types and cancers (Additional file 3). Nevertheless, we identified 1,153 DEGs that were shared across these five cancers (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eD). This led us to ask if these DEGs were associated with shared interactions between the cell types in the cancers.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec16\" class=\"Section2\"\u003e \u003ch2\u003eMulti-cellular tumor models show dispersion of pathogenic mechanisms\u003c/h2\u003e \u003cp\u003eTo search for shared interactions, we first constructed MCTMs of each of the five cancers. The MCTMs showed directed molecular interactions between URs in any cell type and DSs in other cell types (Additional figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eA). The median (range) number of URs per cancer was 203 (155\u0026ndash;232), with 74 URs found in all five cancers (Additional figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eB). The median (range) number of DSs per cancer was 1,641 (1,279\u0026ndash;2,135), with 577 shared across all cancers (Additional figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003eC). 
Rather than a hierarchical organization in which most interactions originated from cancer cells, the interactions formed highly interconnected networks (Additional figure \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e and Additional file 4). Most cell types in the MCTMs were enriched for cancer-related traits identified by GWAS (Additional figure \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e). This suggested that pathogenic mechanisms were distributed across cell types rather than originating solely from cancer cells.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec17\" class=\"Section2\"\u003e \u003ch2\u003eConstruction of a shared MCTM\u003c/h2\u003e \u003cp\u003eTo identify potential shared interactions across cancers, we explored the possibility of constructing an shMCTM from the five MCTMs. To characterize interactions, we identified URs and DSs that were shared across the MCTMs (shURs and shDSs). The criteria for shURs and shDSs were that they should 1) be URs or DSs in all five cancer MCTMs, and 2) have the same direction of expression change in the same cell type in at least four cancers (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB). A total of 117 shMCTM genes (30 shURs and 98 shDSs) located in shMCTM cell types (fibroblasts, cancer cells, macrophages, endothelial cells, pericytes and T cells) were identified and used to construct the shMCTM (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA and Additional file 5).\u003c/p\u003e \u003cp\u003eIn support of the pathogenic relevance of the shMCTM, the shMCTM genes (shURs and shDSs) exhibited enrichment for GWAS-associated genes in the five studied cancers, with odds ratios ranging from 2.51 to 3.81 (adjusted p-value\u0026thinsp;\u0026lt;\u0026thinsp;0.05, except for 0.06 in ovarian cancer, Additional file 5). 
Additionally, the shMCTM genes were found to be associated with malignant and fibrotic diseases, as indicated by the DisGeNET database (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). These observations underscored the pathogenic importance not only of malignant cells but also of fibroblasts relative to other cell types in the TME.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec18\" class=\"Section2\"\u003e \u003ch2\u003ePrioritization of a shared gene signature based on the shMCTM\u003c/h2\u003e \u003cp\u003eTo prioritize a shared gene signature based on the shMCTM, we focused on the shURs that regulated the largest number of shDSs. Briefly, we clustered the shURs based on their total number of interactions towards each downstream cell type in the shMCTM. This identified two main clusters, of which the one with the most interactions included eight shURs (\u003cem\u003eCOL1A1\u003c/em\u003e, \u003cem\u003eFN1\u003c/em\u003e, \u003cem\u003eSPP1\u003c/em\u003e, \u003cem\u003eCOL4A1\u003c/em\u003e, \u003cem\u003eCOL18A1\u003c/em\u003e, \u003cem\u003ePLAU\u003c/em\u003e, \u003cem\u003eCLEC11A\u003c/em\u003e, and \u003cem\u003eMDK\u003c/em\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eC). Interestingly, seven out of eight shURs were more highly expressed in fibroblasts compared to other cell types in the shMCTM (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eD). This led us to subtype fibroblasts to search for more genes to include in the gene signature.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec19\" class=\"Section2\"\u003e \u003ch2\u003ePrioritization of genes for the gene signature based on a subtype of CAF\u003c/h2\u003e \u003cp\u003eTo search for and prioritize subtypes of fibroblasts, fibroblasts from the five cancers were re-integrated into one dataset. 
A total of 36,601 fibroblast cells were clustered into seven subpopulations (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA, B and C), of which four clusters (subclusters 0, 4, 5 and 6) were mainly enriched in tumor tissues, whereas subclusters 1, 2, and 3 were mainly present in normal tissues. All seven subclusters expressed canonical fibroblast markers such as \u003cem\u003eACTA2\u003c/em\u003e (\u003cem\u003ea-SMA\u003c/em\u003e), while each subcluster displayed distinct transcriptomic markers (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD and Additional Figure \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003e) and highly diverse functions (Additional Note 1 and Additional Figure \u003cspan refid=\"MOESM5\" class=\"InternalRef\"\u003eS5\u003c/span\u003e). Instead of being dispersed across different fibroblast subtypes, most shURs and shDSs were highly expressed in CAF_C0 (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eE). The CAF_C0 represented the largest CAF cluster and exhibited characteristics consistent with previously reported matrix CAFs (mCAF) \u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e, showing elevated expression of extracellular matrix (ECM) remodeling genes. Therefore, we subsequently refer to it as mCAF in the following context (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eD and F). In further support of the importance of mCAF, its shURs regulated shDSs in all other cell types. 
As commented in the discussion, KEGG pathway analysis of those shDSs in epithelial cells revealed a wide variety of pathways relevant for malignant transformation (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eG, Additional Figure \u003cspan refid=\"MOESM5\" class=\"InternalRef\"\u003eS5\u003c/span\u003e and Supplement Methods).\u003c/p\u003e \u003cp\u003eTherefore, we hypothesized that genes in mCAF could be relevant to add to the shared gene signature. For this purpose, we prioritized genes 1) with the top 10 highest log2FC between mCAF and other CAF clusters and 2) that were DEGs between tumor and normal in mCAF. This analysis resulted in eight candidate biomarkers, in addition to the eight shURs, namely \u003cem\u003eMMP11\u003c/em\u003e, \u003cem\u003eCTHRC1\u003c/em\u003e, \u003cem\u003eCOL1A2\u003c/em\u003e, \u003cem\u003eCOL3A1\u003c/em\u003e, \u003cem\u003eSPARC\u003c/em\u003e, \u003cem\u003eCOL5A2\u003c/em\u003e, \u003cem\u003ePOSTN\u003c/em\u003e and \u003cem\u003eCOL11A1\u003c/em\u003e. In total, 16 signature genes were identified (eight shURs and eight mCAF marker genes).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cb\u003eThe general pathogenic relevance of the 16 signature genes and their mRNAs and protein products is supported by analyses of two large cohorts\u003c/b\u003e \u003c/p\u003e \u003cp\u003eTo assess the general pathogenic relevance of these 16 signature genes, we hypothesized that the signature at both mRNA and protein levels (subsequently referred to as the mRNA signature and protein signature) 1) should be differentially expressed in tumor tissue/cancer plasma compared to normal tissue/healthy plasma and 2) associated with outcome of cancer patients \u0026ndash; all-cause mortality in ten years.\u003c/p\u003e \u003cp\u003eDifferential expression in tumor tissue vs. 
normal tissue of the mRNA signature was tested in bulk RNA sequencing data of tissue samples in TCGA (9,185 patients and 727 controls from 22 cancers). The protein signatures were tested using the plasma proteomics data from the UKBB (10,384 patients and 5,063 controls from 19 cancers; 12 proteins were detected) (Additional file 1 S2 to 4). These signature mRNAs/proteins were evaluated for each cancer type in both cohorts. We found that they were generally significantly differentially expressed in all cancer types at both the tissue mRNA and plasma protein levels. Several mRNAs/proteins showed similar expression changes in tumors from both cohorts, for example CTHRC1, MDK and SPP1, while some had more variation (e.g., COL18A1) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e and Additional file 6 S1 and S2). Nevertheless, the similar differential expression patterns of these signature mRNAs/proteins across different cancers suggested that this signature could represent molecular mechanisms of clinical importance. To examine this, we next analyzed whether the signature was associated with perhaps the most important clinical trait: mortality.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003eSignature genes in tumor tissues were associated with mortality in multiple cancers\u003c/h2\u003e \u003cp\u003eThe association of these signature mRNAs in tumor tissues with 10-year all-cause mortality after cancer diagnosis was evaluated using the Cox proportional hazards model in cancer patients from TCGA. The signature mRNAs showed significant associations with mortality in all cancer patients, with HRs ranging from 1.06 to 1.2. 
The mRNA signature score was associated with a higher risk of death (HR[95%CI]\u0026thinsp;=\u0026thinsp;1.69[1.55\u0026ndash;1.85]) than any single mRNA (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA and Additional file 6 S3). Similar results were found in each sex subgroup (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA and B). When looking at each individual cancer, the mRNA signature score was associated with mortality in 11 cancers. Particularly strong associations were found in cancers of the brain (HR[95%CI]\u0026thinsp;=\u0026thinsp;6.9[4.64\u0026ndash;10.25]), mesothelioma (HR[95%CI]\u0026thinsp;=\u0026thinsp;3.13[1.87\u0026ndash;5.24]) and uterus (HR[95%CI]\u0026thinsp;=\u0026thinsp;3.02[1.61\u0026ndash;5.66]) (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eC and Additional file 6 S4).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003eSignature proteins in plasma were associated with mortality in multiple cancers\u003c/h2\u003e \u003cp\u003eThe association of 12 signature proteins with survival was analyzed in plasma from cancer patients from the UKBB. Eight plasma proteins were associated with mortality, with COL18A1 showing the highest HR in all cancer patients (HR[95%CI]\u0026thinsp;=\u0026thinsp;1.72[1.92\u0026ndash;2.50]). Compared to each individual protein, the protein score of these nine proteins was associated with a greater risk of death in all cancer patients (HR[95%CI]\u0026thinsp;=\u0026thinsp;2.16[1.84,2.53]) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA and Additional file 6 S5). In female and male subgroups, more proteins were associated with mortality in males compared to females (8 vs. 
4 proteins), while a higher protein score correlated with higher risk of death in both female and male cancer patients (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eA and B). Notably, females, overall, showed lower risk of death compared to males (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eB). The protein score of these nine proteins was associated with mortality in nine cancer types. The HR ranged from 1.47 to 5.53, with the highest HR for the death risk being found for ovarian cancer (HR[95%CI]\u0026thinsp;=\u0026thinsp;5.53 [2.08\u0026ndash;14.67]) followed by prostate cancer (4.63[2.80\u0026ndash;7.68]) and lymphoma (HR[95%CI]\u0026thinsp;=\u0026thinsp;4.62[2.43\u0026ndash;8.8]) (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003eC and Additional file 6 S6).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eDespite the great complexity and heterogeneity of cancers this study showed molecular changes that were shared across multiple cancers. The pathogenic and clinical importance of those changes was supported by enrichment of GWAS genes and association with mortality.\u003c/p\u003e \u003cp\u003eThe study was based on scRNA-seq, which allows the characterization of molecular changes in all cell types in a tumor. This may be advantageous because increasing evidence points to the pathogenic importance of multiple cell types in the TME \u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e,\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e\u003c/sup\u003e. 
This complexity leads to the problems of how best to organize systematically and prioritize mechanisms across cancers.\u003c/p\u003e \u003cp\u003ePrevious scRNA-seq studies of complex diseases, which also are multicellular, have shown that these problems can be addressed by constructing multicellular network models based on connecting URs in any cell type with their DSs in other cell types, and prioritizing the URs with the largest effects on DSs \u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. We applied these principles to scRNA-seq data from five cancers. In summary, we found that despite great cellular and molecular differences among the analyzed cancers, their MCTMs showed overarching similarities. These included pathogenic URs and DSs being dispersed across cell types, rather than only originating from cancer cells. A similar organization was found in the shMCTM, which showed a higher-order representation of the complex changes. In support of a shared multicellular pathogenesis across cancers, the shMCTM was enriched for GWAS genes and pathways associated with malignant transformation. Since shURs regulated the shDSs, the shURs would have a superior role relative to shDSs. The shURs that regulated more shDSs and cells were prioritized and considered as signature genes that could have important pathogenic roles.\u003c/p\u003e \u003cp\u003eNotably, these prioritized shURs exhibited elevated expression levels in fibroblasts compared to other cell types in the shMCTM. This agreed with the previous finding of a hierarchy of cell-cell interactions dominated by fibroblasts to macrophages in breast cancer \u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. 
Moreover, we found that CAFs ranked above multiple other cell types in the interaction hierarchy in five different tumors, supporting the crucial role of CAFs in the TME and tumor progression \u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e,\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e. This led us to subtype CAFs into clusters, of which four were more common in cancer than in normal tissues. We found that most shURs and shDSs were mainly expressed in the largest cluster (CAF_0). This cluster agrees with the previously reported mCAF, which shows high expression of ECM remodeling genes and pro-angiogenic effects in the TME \u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. Interestingly, shURs located in mCAF regulated shDSs in all other cell types. KEGG pathway analysis of those shDSs revealed a wide variety of pathways related to cancer, vascular function, coagulation, immunity, and metabolism. In support of a direct tumorigenic role of the fibroblast shURs, their shDSs in epithelial cells were enriched for cancer-related pathways, namely proteoglycan and AGE-RAGE signaling, as well as pathways associated with many specific cancers. This finding suggested a key regulatory role of mCAF, which was mainly associated with the ECM according to KEGG pathway enrichment analysis. Therefore, we hypothesized that mCAF could be used to add genes to the shared gene signature. 
This resulted in a gene signature with eight genes from mCAF and eight shURs.\u003c/p\u003e \u003cp\u003eRecently, CCIs and shared mechanisms have been discussed for their potential relevance to clinical outcomes in cancer \u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e,\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e. In this study, we hypothesized that this signature was associated with the mortality of cancer patients and tested the hypothesis in two independent cohorts (TCGA and UKBB). The expression of signature mRNAs and proteins showed significant differences between tumor and normal samples in both cohorts, underscoring their pathogenic relevance. Additionally, our analysis revealed that each individual signature mRNA/protein was correlated with all-cause mortality in cancer patients from both cohorts. When evaluating the overall associations of the mRNA and protein signature scores within specific cancer types, we observed moderate to high associations with mortality in both datasets. The signature genes that belong to the collagen family (e.g., COL18A1 and COL4A1) showed the highest association with increased risk of death. This is in line with previous findings implicating members of the collagen family as prognostic markers for cancers \u003csup\u003e\u003cspan additionalcitationids=\"CR35\" citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e\u003c/sup\u003e. Moreover, CTHRC1 also exhibited a high association with death risk at both the mRNA and protein levels. 
This agrees with previous findings showing its association with tumor progression, metastasis and prognosis in several cancer types \u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e,\u003cspan additionalcitationids=\"CR38\" citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eWhile both mRNA and protein scores were linked to all-cause mortality, the association differed between TCGA and UKBB. The association of the signature score with mortality was similar between females and males in TCGA, whereas in the UKBB dataset the score was associated with a greater risk of death in males than in females. Furthermore, the cancers with the highest associations in TCGA were brain cancer, mesothelioma and uterine cancer, while the highest associations in UKBB were for ovarian cancer, prostate cancer and lymphoma, indicating differences between tissue mRNA and blood proteins. Nevertheless, the consistent significant association of both mRNA and protein scores with mortality underscores the pathogenic relevance of the signature.\u003c/p\u003e \u003cp\u003eThis study has several limitations. Our analysis is limited to mRNAs and proteins, while multiple other types of molecules have been shown to play important pathogenic roles. Another limitation is that the scRNA-seq data were derived from a small number of patients with solid tumors. However, the relevance of the signature genes was supported in both independent cohorts by their associations with mortality in multiple other cancers, including non-solid tumors such as leukemia. 
Further studies are warranted to examine the signature genes in other cancers, as well as their associations with disease-relevant traits.\u003c/p\u003e \u003cp\u003eIn conclusion, our findings support the pathogenic and clinical relevance of molecular interactions that are shared across cancers. We have made the methods and data underlying this study freely available for basic and translational studies (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/SDTC-CPMed/shMCTM_cancer_mortality\u003c/span\u003e\u003cspan address=\"https://github.com/SDTC-CPMed/shMCTM_cancer_mortality\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eCAF: Cancer-associated fibroblast; CCI: Cell-cell interaction; CI: Confidence interval; DEG: Differentially expressed gene; DS: Downstream target; ECM: Extracellular matrix; FDR: False discovery rate; GEO: Gene Expression Omnibus; GTEx: Genotype-Tissue Expression; GWAS: Genome-wide association studies; HR: Hazard ratio; ICD: International Classification of Diseases; KNN: K-nearest neighbor; Log2FC: Log2 transformed fold change; MAST: Model-based Analysis of Single-cell Transcriptomics; MCTM: Multicellular tumor model; NPX: Normalized Protein Expression; scRNA-seq: Single-cell RNA sequencing; shMCTM: Shared multicellular tumor model; shDS: Shared downstream target gene; shUR: Shared upstream regulator gene; TCGA: The Cancer Genome Atlas; TME: Tumor microenvironment; TPM: Transcripts per million; UKBB: UK Biobank; UMAP: Uniform manifold approximation and projection; UR: Upstream regulator.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eUK Biobank has approval from the Northwest Multi-centre Research Ethics Committee (MREC) as a Research Tissue Bank (RTB) 
approval. This approval means that researchers do not require separate ethical clearance and can operate under the RTB approval (there are certain exceptions to this which are set out in the Access Procedures, such as re-contact applications).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe scRNA-seq data used in this study are publicly available on GEO under accession numbers GSE161529, GSE144735, GSE138709 and GSE123902, and on ArrayExpress under accession number E-MTAB-8107. The metadata of all the scRNA-seq datasets, the URs and DSs with their interactions in each dataset, and the code generated during this study are publicly available at https://github.com/SDTC-CPMed/shMCTM_cancer_mortality.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eMB is the scientific founder of Mavatar, Inc. JL is co-scientific founder of Scipher Medicine, Inc. The other authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the Swedish Cancer Society and the Swedish Research Council.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors\u0026apos; contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eYZ had a primary role in analyses, interpretation of the data, as well as manuscript writing. XL, YW, MS and OS contributed to those analyses, and JL contributed translational expertise. DA, FM and MB supervised these studies. All authors contributed to the writing of the manuscript. The authors read and approved the final manuscript. 
DA, FM and MB contributed equally to this work.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eGlobal cancer burden growing, amidst mounting need for services. https://www.who.int/news/item/01-02-2024-global-cancer-burden-growing--amidst-mounting-need-for-services. (2024).\u003c/li\u003e\n\u003cli\u003eBarrenas, F.\u003cem\u003e et al.\u003c/em\u003e Highly interconnected genes in disease-specific networks are enriched for disease-associated polymorphisms. \u003cem\u003eGenome Biol\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, R46, doi:10.1186/gb-2012-13-6-r46 (2012).\u003c/li\u003e\n\u003cli\u003eGhoshdastider, U.\u003cem\u003e et al.\u003c/em\u003e Pan-Cancer Analysis of Ligand\u0026ndash;Receptor Cross-talk in the Tumor Microenvironment. \u003cem\u003eCancer Research\u003c/em\u003e \u003cstrong\u003e81\u003c/strong\u003e, 1802-1812, doi:10.1158/0008-5472.Can-20-2352 (2021).\u003c/li\u003e\n\u003cli\u003eMa, C.\u003cem\u003e et al.\u003c/em\u003e Pan-cancer spatially resolved single-cell analysis reveals the crosstalk between cancer-associated fibroblasts and tumor microenvironment. \u003cem\u003eMolecular Cancer\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, 170, doi:10.1186/s12943-023-01876-x (2023).\u003c/li\u003e\n\u003cli\u003eShalek, A. K. \u0026amp; Benson, M. Single-cell analyses to tailor treatments. \u003cem\u003eSci Transl Med\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, doi:10.1126/scitranslmed.aan4730 (2017).\u003c/li\u003e\n\u003cli\u003eGawel, D. R.\u003cem\u003e et al.\u003c/em\u003e A validated single-cell-based strategy to identify diagnostic and therapeutic targets in complex diseases. 
\u003cem\u003eGenome Med\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 47, doi:10.1186/s13073-019-0657-3 (2019).\u003c/li\u003e\n\u003cli\u003eLi, X.\u003cem\u003e et al.\u003c/em\u003e A dynamic single cell-based framework for digital twins to prioritize disease genes and drug targets. \u003cem\u003eGenome Medicine\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 48, doi:10.1186/s13073-022-01048-4 (2022).\u003c/li\u003e\n\u003cli\u003eLilja, S.\u003cem\u003e et al.\u003c/em\u003e Multi-organ single-cell analysis reveals an on/off switch system with potential for personalized treatment of immunological diseases. \u003cem\u003eCell Rep Med\u003c/em\u003e \u003cstrong\u003e4\u003c/strong\u003e, 100956, doi:10.1016/j.xcrm.2023.100956 (2023).\u003c/li\u003e\n\u003cli\u003ePal, B.\u003cem\u003e et al.\u003c/em\u003e A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. \u003cem\u003eEMBO J\u003c/em\u003e, e107333, doi:10.15252/embj.2020107333 (2021).\u003c/li\u003e\n\u003cli\u003eLee, H. O.\u003cem\u003e et al.\u003c/em\u003e Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. \u003cem\u003eNat Genet\u003c/em\u003e \u003cstrong\u003e52\u003c/strong\u003e, 594-603, doi:10.1038/s41588-020-0636-z (2020).\u003c/li\u003e\n\u003cli\u003eZhang, M.\u003cem\u003e et al.\u003c/em\u003e Single-cell transcriptomic architecture and intercellular crosstalk of human intrahepatic cholangiocarcinoma. \u003cem\u003eJ Hepatol\u003c/em\u003e \u003cstrong\u003e73\u003c/strong\u003e, 1118-1130, doi:10.1016/j.jhep.2020.05.039 (2020).\u003c/li\u003e\n\u003cli\u003eLaughney, A. M.\u003cem\u003e et al.\u003c/em\u003e Regenerative lineages and immune-mediated pruning in lung cancer metastasis. 
\u003cem\u003eNat Med\u003c/em\u003e \u003cstrong\u003e26\u003c/strong\u003e, 259-269, doi:10.1038/s41591-019-0750-6 (2020).\u003c/li\u003e\n\u003cli\u003eQian, J.\u003cem\u003e et al.\u003c/em\u003e A pan-cancer blueprint of the heterogeneous tumor microenvironment revealed by single-cell profiling. \u003cem\u003eCell Res\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 745-762, doi:10.1038/s41422-020-0355-0 (2020).\u003c/li\u003e\n\u003cli\u003eGene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/.\u003c/li\u003e\n\u003cli\u003eArrayExpress. https://www.ebi.ac.uk/arrayexpress/.\u003c/li\u003e\n\u003cli\u003eSun, B. B.\u003cem\u003e et al.\u003c/em\u003e Plasma proteomic associations with genetics and health in the UK Biobank. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e622\u003c/strong\u003e, 329-338, doi:10.1038/s41586-023-06592-6 (2023).\u003c/li\u003e\n\u003cli\u003eUKB - Olink Explore - Data Normalization Strategy. https://biobank.ctsu.ox.ac.uk/crystal/ukb/docs/Olink_1536_B0_to_B7_FAQ.pdf.\u003c/li\u003e\n\u003cli\u003eUCSC Xena: Cohort: TCGA TARGET GTEx. https://xenabrowser.net/datapages/.\u003c/li\u003e\n\u003cli\u003eGoldman, M. J.\u003cem\u003e et al.\u003c/em\u003e Visualizing and interpreting cancer genomics data via the Xena platform. \u003cem\u003eNat Biotechnol\u003c/em\u003e \u003cstrong\u003e38\u003c/strong\u003e, 675-678, doi:10.1038/s41587-020-0546-8 (2020).\u003c/li\u003e\n\u003cli\u003eStuart, T.\u003cem\u003e et al.\u003c/em\u003e Comprehensive Integration of Single-Cell Data. \u003cem\u003eCell\u003c/em\u003e \u003cstrong\u003e177\u003c/strong\u003e, 1888-1902 e1821, doi:10.1016/j.cell.2019.05.031 (2019).\u003c/li\u003e\n\u003cli\u003eFinak, G.\u003cem\u003e et al.\u003c/em\u003e MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. 
\u003cem\u003eGenome Biology\u003c/em\u003e \u003cstrong\u003e16\u003c/strong\u003e, 278, doi:10.1186/s13059-015-0844-5 (2015).\u003c/li\u003e\n\u003cli\u003eBrowaeys, R., Saelens, W. \u0026amp; Saeys, Y. NicheNet: modeling intercellular communication by linking ligands to target genes. \u003cem\u003eNat Methods\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 159-162, doi:10.1038/s41592-019-0667-5 (2020).\u003c/li\u003e\n\u003cli\u003ePinero, J.\u003cem\u003e et al.\u003c/em\u003e The DisGeNET knowledge platform for disease genomics: 2019 update. \u003cem\u003eNucleic Acids Res\u003c/em\u003e \u003cstrong\u003e48\u003c/strong\u003e, D845-D855, doi:10.1093/nar/gkz1021 (2020).\u003c/li\u003e\n\u003cli\u003eWu, T.\u003cem\u003e et al.\u003c/em\u003e clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. \u003cem\u003eThe Innovation\u003c/em\u003e \u003cstrong\u003e2\u003c/strong\u003e, doi:10.1016/j.xinn.2021.100141 (2021).\u003c/li\u003e\n\u003cli\u003eBiobank, U. UK biobank data providers and dates of data availability. https://biobank.ndph.ox.ac.uk/showcase/exinfo.cgi?src=Data_providers_and_dates. (2023).\u003c/li\u003e\n\u003cli\u003eH\u0026auml;nzelmann, S., Castelo, R. \u0026amp; Guinney, J. GSVA: gene set variation analysis for microarray and RNA-Seq data. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 7, doi:10.1186/1471-2105-14-7 (2013).\u003c/li\u003e\n\u003cli\u003eLiu, C.\u003cem\u003e et al.\u003c/em\u003e Single-cell dissection of cellular and molecular features underlying human cervical squamous cell carcinoma initiation and progression. \u003cem\u003eScience Advances\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, eadd8977, doi:10.1126/sciadv.add8977 (2023).\u003c/li\u003e\n\u003cli\u003ePeltanova, B., Raudenska, M. \u0026amp; Masarik, M. Effect of tumor microenvironment on pathogenesis of the head and neck squamous cell carcinoma: a systematic review. 
\u003cem\u003eMolecular Cancer\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 63, doi:10.1186/s12943-019-0983-5 (2019).\u003c/li\u003e\n\u003cli\u003eChen, L.-x.\u003cem\u003e et al.\u003c/em\u003e Cell\u0026ndash;cell communications shape tumor microenvironment and predict clinical outcomes in clear cell renal carcinoma. \u003cem\u003eJournal of Translational Medicine\u003c/em\u003e \u003cstrong\u003e21\u003c/strong\u003e, 113, doi:10.1186/s12967-022-03858-x (2023).\u003c/li\u003e\n\u003cli\u003eMayer, S.\u003cem\u003e et al.\u003c/em\u003e The tumor microenvironment shows a hierarchy of cell-cell interactions dominated by fibroblasts. \u003cem\u003eNature Communications\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 5810, doi:10.1038/s41467-023-41518-w (2023).\u003c/li\u003e\n\u003cli\u003ePelon, F.\u003cem\u003e et al.\u003c/em\u003e Cancer-associated fibroblast heterogeneity in axillary lymph nodes drives metastases in breast cancer through complementary mechanisms. \u003cem\u003eNature Communications\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 404, doi:10.1038/s41467-019-14134-w (2020).\u003c/li\u003e\n\u003cli\u003eCzekay, R. P., Cheon, D. J., Samarakoon, R., Kutz, S. M. \u0026amp; Higgins, P. J. Cancer-Associated Fibroblasts: Mechanisms of Tumor Progression and Novel Therapeutic Targets. \u003cem\u003eCancers (Basel)\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, doi:10.3390/cancers14051231 (2022).\u003c/li\u003e\n\u003cli\u003eWeiss, F., Lauffenburger, D. \u0026amp; Friedl, P. Towards targeting of shared mechanisms of cancer metastasis and therapy resistance. \u003cem\u003eNat Rev Cancer\u003c/em\u003e \u003cstrong\u003e22\u003c/strong\u003e, 157-173, doi:10.1038/s41568-021-00427-0 (2022).\u003c/li\u003e\n\u003cli\u003eNi, S.\u003cem\u003e et al.\u003c/em\u003e CTHRC1 overexpression predicts poor survival and enhances epithelial-mesenchymal transition in colorectal cancer. 
\u003cem\u003eCancer Med\u003c/em\u003e \u003cstrong\u003e7\u003c/strong\u003e, 5643-5654, doi:10.1002/cam4.1807 (2018).\u003c/li\u003e\n\u003cli\u003eLi, X., Li, Z., Gu, S. \u0026amp; Zhao, X. A pan-cancer analysis of collagen VI family on prognosis, tumor microenvironment, and its potential therapeutic effect. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 390, doi:10.1186/s12859-022-04951-0 (2022).\u003c/li\u003e\n\u003cli\u003eNecula, L.\u003cem\u003e et al.\u003c/em\u003e Collagen Family as Promising Biomarkers and Therapeutic Targets in Cancer. \u003cem\u003eInt J Mol Sci\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, doi:10.3390/ijms232012415 (2022).\u003c/li\u003e\n\u003cli\u003eSial, N.\u003cem\u003e et al.\u003c/em\u003e CTHRC1 expression is a novel shared diagnostic and prognostic biomarker of survival in six different human cancer subtypes. \u003cem\u003eSci Rep\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 19873, doi:10.1038/s41598-021-99321-w (2021).\u003c/li\u003e\n\u003cli\u003eLi, Y.\u003cem\u003e et al.\u003c/em\u003e Single-cell landscape reveals active cell subtypes and their interaction in the tumor microenvironment of gastric cancer. \u003cem\u003eTheranostics\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 3818-3833, doi:10.7150/thno.71833 (2022).\u003c/li\u003e\n\u003cli\u003eChen, Y.\u003cem\u003e et al.\u003c/em\u003e High CTHRC1 expression may be closely associated with angiogenesis and indicates poor prognosis in lung adenocarcinoma patients. 
\u003cem\u003eCancer Cell International\u003c/em\u003e \u003cstrong\u003e19\u003c/strong\u003e, 318, doi:10.1186/s12935-019-1041-5 (2019).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":true,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"journal-of-translational-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jtrm","sideBox":"Learn more about [Journal of Translational Medicine](http://translational-medicine.biomedcentral.com)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/jtrm/default.aspx","title":"Journal of Translational Medicine","twitterHandle":"@BioMedCentral","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Cell-cell interactions, Cancer-associated fibroblast, Single-cell RNA sequencing, Prioritization, Pan-cancer, Mortality","lastPublishedDoi":"10.21203/rs.3.rs-3994390/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3994390/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eCharacterization of shared cancer mechanisms have been proposed to improve therapy strategies and prognosis. 
Here, we aimed to identify shared cell-cell interactions (CCIs) within the tumor microenvironment across multiple solid cancers and assess their association with cancer mortality.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003eCCIs of each cancer were identified by NicheNet analysis of single-cell RNA sequencing data from breast, colon, liver, lung, and ovarian cancers. These CCIs were used to construct a shared multi-cellular tumor model (shMCTM) representing common CCIs across cancers. A gene signature was identified from the shMCTM and tested at the mRNA and protein levels in two large independent cohorts: The Cancer Genome Atlas (TCGA, 9,185 tumor samples and 727 controls across 22 cancers) and UK Biobank (UKBB, 10,384 cancer patients and 5,063 controls with proteomics data across 17 cancers). Cox proportional hazards models were used to evaluate the association of the signature with 10-year all-cause mortality, including sex-specific analysis.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eA shMCTM was derived from five individual cancers. A shared gene signature was extracted from this shMCTM and the most prominent regulatory cell type, matrix cancer-associated fibroblast (mCAF). The signature exhibited significant expression changes in multiple cancers compared to controls at both mRNA and protein levels in two independent cohorts. Importantly, it was significantly associated with mortality in cancer patients in both cohorts. The highest hazard ratios were observed for brain cancer in TCGA (HR [95%CI]\u0026thinsp;=\u0026thinsp;6.90[4.64\u0026ndash;10.25]) and ovarian cancer in UKBB (5.53[2.08\u0026ndash;14.67]). 
Sex-specific analysis revealed distinct risks, with a higher mortality risk associated with the protein signature score in males (2.41[1.97\u0026ndash;2.96]) compared to females (1.84[1.44\u0026ndash;2.37]).\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e \u003cp\u003eWe identified a gene signature from a comprehensive shMCTM representing common CCIs across different cancers and revealed the regulatory role of mCAF in the tumor microenvironment. The pathogenic relevance of the gene signature was supported by differential expression and association with mortality on both mRNA and protein levels in two independent cohorts.\u003c/p\u003e","manuscriptTitle":"Transcript and protein signatures derived from shared molecular interactions across cancers are associated with mortality","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-03-06 19:27:10","doi":"10.21203/rs.3.rs-3994390/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"","date":"2024-03-04T11:30:21+00:00","index":0,"fulltext":""},{"type":"reviewersInvited","content":"","date":"2024-03-04T11:05:16+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2024-02-28T08:36:14+00:00","index":"","fulltext":""},{"type":"submitted","content":"Journal of Translational Medicine","date":"2024-02-26T12:11:45+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"journal-of-translational-medicine","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"jtrm","sideBox":"Learn more about [Journal of Translational Medicine](http://translational-medicine.biomedcentral.com)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/jtrm/default.aspx","title":"Journal of Translational Medicine","twitterHandle":"@BioMedCentral","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"BMC/SO 
AJ","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"3d97304b-12fe-4484-8612-4fef82ae2fae","owner":[],"postedDate":"March 6th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2024-05-14T21:29:51+00:00","versionOfRecord":{"articleIdentity":"rs-3994390","link":"https://doi.org/10.1186/s12967-024-05268-7","journal":{"identity":"journal-of-translational-medicine","isVorOnly":false,"title":"Journal of Translational Medicine"},"publishedOn":"2024-05-11 21:18:03","publishedOnDateReadable":"May 11th, 2024"},"versionCreatedAt":"2024-03-06 19:27:10","video":"","vorDoi":"10.1186/s12967-024-05268-7","vorDoiUrl":"https://doi.org/10.1186/s12967-024-05268-7","workflowStages":[]},"version":"v1","identity":"rs-3994390","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3994390","identity":"rs-3994390","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

